1.2 Need for R:
The following points describe why the R language should be used:
Free of charge: R is totally free. It is available under the terms of the
Free Software Foundation’s GNU General Public License in source code
form.
Open-source: R and most of its packages are fully open source.
Thousands of developers are constantly reviewing the source code of the
packages to check whether there are bugs to fix or things to improve.
Popular: R is very popular as a statistical programming language and
platform for data mining, analysis, and visualization.
Flexible: R is a dynamic scripting language. It is flexible enough to
allow programming styles in multiple paradigms, including functional
programming and object-oriented programming.
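Reproducible: When using software based on a graphical user interface,
you only need to choose from menus and click buttons, but it is hard to
reproduce exactly what you did. In R, an analysis is written as a script,
so it can be rerun to reproduce the same results.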
Rich resources: R has a huge, rapidly increasing number of online
resources. One type of resource is extension packages. There are, at the
time of writing this, more than 7,500 packages available at CRAN (short
for Comprehensive R Archive Network), a world-wide network of mirror
servers from which you can get identical, up-to-date R distributions and
packages.
Strong community: The R community consists not only of R developers
but also of R users, who form the majority, from a wide range of backgrounds
such as statistics, econometrics, finance, bioinformatics, mechanical
engineering, physics, medicine, and so on. A great number of R developers
actively contribute to open source projects or packages written in R.
Cutting-edge: Many R users are professional researchers in statistics,
econometrics, or other disciplines. Quite often, authors publish their new
papers along with a new package that includes the cutting-edge techniques
presented in the paper.
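
As a minimal sketch of the "Flexible" point above, the following base R snippet uses the language first in a functional style and then in an object-oriented (S3) style. The names values, new_circle, new_square, and area are hypothetical and chosen only for illustration; they do not come from any package.

```r
# Functional style: pass functions to other functions instead of writing loops.
values <- c(1, 2, 3, 4, 5)
squares <- sapply(values, function(x) x^2)                 # apply a function to each element
even_squares <- Filter(function(x) x %% 2 == 0, squares)   # keep elements matching a predicate
total <- Reduce(`+`, squares)                              # fold the vector into a single value

# Object-oriented style (S3): a generic function dispatches on the class attribute.
# The constructors below are hypothetical helpers for this example.
new_circle <- function(radius) structure(list(radius = radius), class = "circle")
new_square <- function(side) structure(list(side = side), class = "square")

area <- function(shape) UseMethod("area")           # generic
area.circle <- function(shape) pi * shape$radius^2  # method for class "circle"
area.square <- function(shape) shape$side^2         # method for class "square"

shapes <- list(new_circle(1), new_square(2))
areas <- sapply(shapes, area)                       # each call picks the matching method
```

Both halves run in a plain R session without extra packages. Because the whole example is a script, it also illustrates the "Reproducible" point: rerunning it gives the same values each time.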
1.2.1 R vs Python:
The R programming language and Python are both used extensively for data
science. Both are useful, open-source languages. For data analysis,
statistical computing, and machine learning, both are strong tools with
sizable communities and extensive libraries for data science tasks. A
comparison between R and Python is provided below:
S.No | R Programming | Python
1. | R is a statistical language used for the analysis and visual representation of data. | Python is a general-purpose language used in many programming domains, including data science, web development, software development, and gaming.
2. | Very popular in academia and research, finance, and data science. | Well-suited for many programming domains, including data science, web development, software development, and gaming.
3. | R has fewer libraries than Python, so they are easier to get to know. | Python has a lot of libraries. However, it can be complex to understand all of them.
4. | R's statistical packages are highly powerful. | Python's statistical packages are less powerful.
5. | R is generally used when the data analysis task requires standalone computation (analysis) and processing. | Python is mainly used when the data analysis needs to be integrated with web applications.
6. | A few IDEs for the R language are RStudio, StatET, etc. | There are many Python IDEs to choose from; a few of them are Jupyter Notebook, Spyder, PyCharm, etc.