Introduction
The R language for statistical computing was initially developed by Ross Ihaka and Robert Gentleman at the University of Auckland in the early 1990s. It is considered an open-source implementation of the S language, which was developed by John Chambers and colleagues at Bell Laboratories beginning in the mid-1970s. R is highly extensible and supports a wide variety of statistical techniques and visualization capabilities. Because it is open source and supported by a large number of contributors, it has emerged as one of the most powerful tools for big data analysis.
Like every programming language, R has its pros and cons.
Advantages
Open-source and hence free.
R has top-notch functionality for effective graphical representation of data.
It is easy to use
Unlike many other statistical software packages, R uses a command-line interface, which lets users type commands in the console or save them as scripts. This might be frustrating for beginners, but it makes work reproducible later.
Users can wrap their work in R scripts, which are easy to share with colleagues. Because of these advantages, R appeals to a large audience, both in academia and in business. Moreover, the ease of creating R packages to solve particular problems adds to its popularity. The active R community has created thousands of well-documented R packages for a broad range of applications. R packages exist for business analytics in the financial sector and health care, high-performance computing, distributed computing, statistics, and many more cutting-edge research areas.
Disadvantages
However, there are also some disadvantages. R seems relatively easy to learn in the beginning, but it is hard to truly master. Further, because R is command-based, many statisticians and other professionals without a computer science background find it inconvenient to use. This steep learning curve sometimes results in poorly written R code that is very hard to read and maintain. Such code may also run slowly when users are working with large data sets.
R-Environment setup
R is one of the world’s most widely used open-source programming tools and provides a software environment for graphical representation, information reporting, and statistical analytics. R is licensed under the GNU General Public License and is freely available for development and research purposes. Some vendors also offer commercially licensed tools and support for advanced usage and requirements. Pre-compiled binary versions of R are available for all the well-known operating systems, such as Windows, macOS, and Linux distributions including Fedora, Red Hat, and openSUSE. R itself can be downloaded from the Comprehensive R Archive Network (CRAN, cran.r-project.org), and RStudio, a popular IDE for R, can be downloaded from www.rstudio.org.
Programming With R
To begin with R, one needs to start with the R console, where all the action takes place. The following figure gives a look and feel of the R console.
The R console is an execution window where users can run R commands. Users type the required command at the command prompt, and upon pressing the Enter key, R interprets the command, executes it, and prints the result.
Example 01 |
Write the command to perform basic arithmetic in the R console.
Solution:
To calculate the sum of two numbers, say 5 and 7, the programmer types 5+7 at the command prompt of the console. R then interprets the expression, calculates the result, and prints it as a numerical value.
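A minimal sketch of this console session is shown here. The sum 5 + 7 comes from the example; the other operators are added only for illustration. Lines starting with `#>` show what the console prints.

```r
# Type an expression at the prompt and press Enter; R prints the result.
5 + 7
#> [1] 12

# The other basic arithmetic operators work the same way (illustrative extras).
7 - 5
#> [1] 2
5 * 7
#> [1] 35
7 / 5
#> [1] 1.4
```

The `[1]` in front of each result indicates that the printed value is the first element of the output vector.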
Example 02 |
Display a string on the R console.
Solution:
To display some text in the R console, programmers type the text within double quotes. R interprets the quoted text as a character string and simply prints it as the output.
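As a small sketch, the session below displays a string; the text "Hello, R!" is an arbitrary choice made for this illustration. Calling `print()` explicitly produces the same output as typing the quoted string on its own.

```r
# Typing a quoted string at the prompt echoes it back.
"Hello, R!"
#> [1] "Hello, R!"

# print() does the same thing explicitly, which is useful inside scripts.
print("Hello, R!")
#> [1] "Hello, R!"
```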
Example 03 |
Declare variables and retrieve their stored values in the R console.
Solution:
The first step in learning any programming language is variable declaration. In R, variables are used to store a value or an object. In the following example, two variables, namely age and name, are declared and initialized with the values 20 and "Rayan" respectively. These variables can later be used for data access and manipulation.
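A minimal sketch of such a session, using the variable names and values from the example, might look as follows. The assignment operator `<-` is the conventional choice in R; `=` also works at the top level.

```r
# Declare two variables and assign values to them.
age <- 20
name <- "Rayan"

# Typing a variable's name at the prompt retrieves its stored value.
age
#> [1] 20
name
#> [1] "Rayan"
```

Once assigned, the variables stay available for the rest of the session, so an expression such as `age + 1` can reuse the stored value.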