An Introduction to R

Dec. 14, 2012 | by Edward Farragher

“R is taking over the world.  It’s R.  There’s no debate anymore.”

These are the words of William Beckler, the “Director of Innovation” at lastminute.com, speaking at the Big Data Innovation in London earlier this year.

R is a programming language and environment in which statistical techniques can be applied to model a variety of data sets. It can be used in many fields, including digital marketing. It is constantly evolving because the R community contributes new packages, so new uses are being developed all the time.

R began as a research project by Robert Gentleman and Ross Ihaka in 1993 at the University of Auckland in New Zealand. It is now used by over 2 million analysts worldwide.

This blog post is designed to introduce you to the software, if you don’t already use it, and to help you get set up so you can add this powerful tool to your arsenal.

Practical Example

Here’s a quick example of how R can be used in digital marketing, using some constructed data. The full code entered into R, along with the R output for this example, is located here.

Suppose an online marketing campaign has recently been carried out for a website across a number of regions, and that its performance can be measured by looking at the revenue obtained. An analytics platform has been set up to track the following metrics:

  • Revenue.
  • Bounce rate.
  • Traffic.
  • Percentage of traffic from unique visits.
  • Average time on site.
  • Average number of pages viewed.
  • Percentage of traffic on mobile devices.

Suppose the campaign was run in 41 regions and all the metrics were successfully obtained, with the exception of one region where the revenue was not recorded due to a tracking error. For that region, only the following data is available:

  • Bounce rate: 15.3%
  • Traffic: 36,800
  • Percentage of traffic from unique visits: 81.5%
  • Average time on site: 9.1 minutes
  • Average number of pages viewed: 3.4
  • Percentage of traffic on mobile devices:  16.9%

By analysing the relationship between the variables in the regions where all the data was obtained, R can be used to produce a linear model that describes the revenue in terms of the other metrics. In other words, a formula for revenue can be produced in terms of bounce rate, traffic and the other metrics.

Visualising Data

R is useful for producing clear and concise visualisations. A pairwise scatterplot of the variables shows at a glance whether any of them are correlated. For example, it appears that as traffic increases, the percentage of unique visits also increases. It also shows that bounce rate and the percentage of visits made on mobile devices have little relationship with each other for this campaign.
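As a rough sketch of how such a plot can be produced, the code below builds a small simulated data set and passes it to pairs(). The data frame name campaign, its column names and all of the values are invented for illustration; they are not the data used in this post.

```r
# Simulated stand-in for the 40 regions with complete data.
# Column names and values are assumptions made for this sketch.
set.seed(1)
n <- 40
campaign <- data.frame(
  bounce_rate = runif(n, 10, 40),              # percent
  traffic     = round(runif(n, 5000, 60000)),  # visits
  minutes     = runif(n, 2, 12),               # average time on site
  mobile_pct  = runif(n, 5, 30)                # percent of traffic on mobile
)
# Make unique visits rise with traffic, as described in the text above.
campaign$unique_pct <- 60 + campaign$traffic / 2500 + rnorm(n, sd = 3)

# Pairwise scatterplot of every variable against every other variable.
pairs(campaign)

# The same relationships as numbers: a correlation matrix.
cor_matrix <- round(cor(campaign), 2)
print(cor_matrix)
```

The correlation matrix is a useful companion to the plot, since it puts a number on each relationship you can see by eye.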

Modelling Data

A variety of modelling techniques can be applied in R. A linear model can be formed by inputting just one line of code.
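To give a feel for that one line, here is a minimal sketch using simulated data. The data frame campaign, its column names and the numbers used to generate revenue are all assumptions made for illustration, not the data behind this post; only the call to lm() is the point.

```r
# Simulated stand-in for the 40 regions with complete data.
# Column names, coefficients and noise are invented for this sketch.
set.seed(1)
n <- 40
campaign <- data.frame(
  bounce_rate = runif(n, 10, 40),
  traffic     = round(runif(n, 5000, 60000)),
  unique_pct  = runif(n, 60, 95),
  minutes     = runif(n, 2, 12),
  pages       = runif(n, 1, 6),
  mobile_pct  = runif(n, 5, 30)
)
campaign$revenue <- 5000 - 100 * campaign$bounce_rate + 0.3 * campaign$traffic +
  400 * campaign$minutes + 800 * campaign$pages + rnorm(n, sd = 1500)

# The one line: model revenue in terms of every other column.
full_model <- lm(revenue ~ ., data = campaign)

# Coefficient estimates, standard errors and p-values.
print(summary(full_model))
```

The formula revenue ~ . is shorthand for "revenue explained by all the other columns in the data frame".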

In the example, a technique called backward elimination was applied. This involves starting with all the variables in the model and removing the statistically insignificant ones. The final model to estimate the revenue for a region was as follows (containing the variables Bounce Rate (x1), Traffic (x2), Minutes on Site (x3) and Page Views (x4)):

Entering the values for the missing region, we can estimate the revenue:

This model suggests that the revenue for the region was £22,100 for the campaign. This is just one example of the ways R can revolutionise work within the digital marketing industry and help other research and insight analysts.
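The elimination and prediction steps can be sketched in a few lines. This is only an illustration under stated assumptions: the data is simulated, the names campaign and missing_region are invented, and R's built-in step() function is used, which drops variables according to AIC rather than significance tests, so it may not match the exact procedure used for this post. Because the data is made up, the estimate it prints will not be the £22,100 quoted above.

```r
# Simulated stand-in for the 40 regions with complete data.
# Column names, coefficients and noise are invented for this sketch.
set.seed(42)
n <- 40
campaign <- data.frame(
  bounce_rate = runif(n, 10, 40),
  traffic     = round(runif(n, 5000, 60000)),
  unique_pct  = runif(n, 60, 95),
  minutes     = runif(n, 2, 12),
  pages       = runif(n, 1, 6),
  mobile_pct  = runif(n, 5, 30)
)
campaign$revenue <- 5000 - 100 * campaign$bounce_rate + 0.3 * campaign$traffic +
  400 * campaign$minutes + 800 * campaign$pages + rnorm(n, sd = 1500)

# Start from the model containing every metric.
full_model <- lm(revenue ~ ., data = campaign)

# Backward elimination by AIC (an assumption: the criterion used
# in the post is not stated).
reduced_model <- step(full_model, direction = "backward", trace = 0)
print(formula(reduced_model))

# The region with missing revenue, using the metric values listed earlier.
missing_region <- data.frame(
  bounce_rate = 15.3, traffic = 36800, unique_pct = 81.5,
  minutes = 9.1, pages = 3.4, mobile_pct = 16.9
)

# Estimated revenue for that region from the reduced model.
estimate <- predict(reduced_model, newdata = missing_region)
print(round(unname(estimate)))
```

predict() ignores any columns in missing_region that the reduced model no longer uses, so the same data frame works whichever variables survive the elimination.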

Please click here for the full annotated code script and output from R, along with additional graphs and a description of the assumptions made.

I hope this has been insightful and will encourage you to download and use R, and to participate in the expanding R community, if you do not already. R is a revolutionary piece of software: it’s free, constantly expanding and can handle data sets from small to large.

So if, as William Beckler says, R really is taking over the world, we should embrace it to carry out statistical analysis efficiently, accurately and for free.

Download R

You can download the software from the R Project website by selecting CRAN under the Download, Packages heading.

Further Resources

There are a vast number of guides and resources online on how to use R. An Introduction to R by W. N. Venables, D. M. Smith and the R Core Team is a more in-depth introduction to the programming language and software.

Once you have downloaded R, you can look up the documentation for a specific function by typing help(<function-name>) or ?<function-name> into the R console.

     
    Please note: the opinions expressed in this post represent the views of the individual, not necessarily those of iCrossing.
