
Security data analytics and visualization with R

Data Analysis

Article from ADMIN 24/2014

By Russ McRee

Improve your analysis and visualization of security-related data with R, a scripting language for statistical data manipulation and analysis.

In this era of massive computing environments, cloud services, and global infrastructure, it is reasonable to call data "big," although this is the first and last time I'll do so in this article.

The issue of massive data volume driven by scale is not new; the problem space has simply evolved. Data challenges are more prevalent now because processing power and storage are easily attained commodities, so even a small business or a single user can generate significant data. Although it is only a subset of the larger sum, security data is no less daunting and, given my bias, in many ways more important to manage, process, maintain, and analyze.

In a quest to conduct better analysis in massively dynamic environments, I embraced R a few months ago and now live in a steady state of epiphany as I uncover new opportunities for awareness and visualization (see the "Coursera Data Science Specialization" box). I've read several books along the way, and one of the best and most inspirational is by Jay Jacobs and Bob Rudis [3] [4]. A few months on, my R skills have improved just enough to share some insight with you. Imagine me as somewhere between total noob R script kiddie and modestly creative practitioner.

Coursera Data Science Specialization

Some key principles I use in this article I learned from the "Principles of Analytic Graphics" lecture provided in the Johns Hopkins University Exploratory Data Analysis course, a part of Coursera's Data Science [1] specialization.

You can take each of the courses in this terrific specialization for free online (I highly recommend them as part of learning R);

...
