Introduction To R

Data Science involves using tools like R, Python, SQL, Excel, SAS, statistics, experiments, Tableau, visualization, Spark and TensorFlow to solve problems using data in a scientific way. R was created in 1993 and is an environment for statistical computing and data analysis. It has a programming language and over 10,000 packages that allow for data manipulation, complex analysis, and visualization. R Studio provides a graphical user interface to make using R easier.

Uploaded by

Sanyam Agarwal

Big DATA

By: Amit Shankar


Data Science is not R.
Data Science is not Python.
Data Science is not SQL.
Data Science is not Excel.
Data Science is not SAS.
Data Science is not Statistics.
Data Science is not Experiments.
Data Science is not Tableau.
Data Science is not Visualisation.
Data Science is not Spark.
Data Science is not TensorFlow.

Data Science is the use of the above tools and techniques, and where required the invention of new ones, to solve a problem using “data” in a “scientific” way.
Data Science tools
Types of data analytics
Statistics Vs Machine learning
Introduction to R
History

● R was created by Ross Ihaka and Robert Gentleman at the University of Auckland.
● First appeared: August 1993
● R and its libraries implement a wide variety of statistical and graphical techniques, including linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, and others.
What is R
• R is an environment for data manipulation, statistical computing, data analysis and data visualization.
• Effective data handling and storage of output.
• Supports both simple and complex data analysis.
• Has its own programming language.
• Similar to the S language: R is an open-source implementation of S, and S-PLUS is a commercial one.
• More than 10,000 contributed packages.
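The points above can be illustrated with a short session. This is only a minimal sketch: it assumes the built-in `mtcars` data set, which the slides do not mention, and the variable names are chosen for illustration.

```r
# Data manipulation: keep the 4-cylinder cars and add a derived column.
four_cyl <- subset(mtcars, cyl == 4)
four_cyl$wt_kg <- four_cyl$wt * 1000 * 0.4536  # wt is recorded in units of 1000 lb

# Simple analysis: summary statistics for fuel economy.
mean_mpg <- mean(four_cyl$mpg)
mpg_summary <- summary(mtcars$mpg)

# More complex analysis: linear regression of fuel economy on weight.
fit <- lm(mpg ~ wt, data = mtcars)
fit_coefficients <- coef(fit)

# Visualization: scatter plot with the fitted regression line.
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lb)", ylab = "Miles per gallon")
abline(fit)
```

Typing the name of any of these objects at the console (for example `fit_coefficients`) prints its value.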
Why R?
• No cost
• Statistical computing environment
• Open source
• Easy language
• Code can be saved, stored and rerun
• Available for all major platforms (Windows, macOS, Linux)
• Built-in and contributed packages are available. Users can create their own packages
• Interpreted language, not compiled
• Error indication
• Graphics can be saved in different formats
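As an illustration of the last point, the following sketch writes the same histogram to a PDF file and, where the R build supports it, to a PNG file. The `mtcars` data set and the temporary file names are assumptions made for the example.

```r
# Use temporary files so that the sketch does not overwrite anything.
pdf_file <- tempfile(fileext = ".pdf")
png_file <- tempfile(fileext = ".png")

# PDF: width and height are given in inches.
pdf(pdf_file, width = 6, height = 4)
hist(mtcars$mpg, main = "Fuel economy", xlab = "Miles per gallon")
dev.off()  # closing the device finishes writing the file

# PNG: width and height are given in pixels by default.
# Not every R build includes PNG support, so check first.
if (capabilities("png")) {
  png(png_file, width = 600, height = 400)
  hist(mtcars$mpg, main = "Fuel economy", xlab = "Miles per gallon")
  dev.off()
}
```

The pattern is the same for other formats: open a graphics device, draw the plot, then call `dev.off()`.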
Library in R
• In R, a package is a collection of R functions, data and compiled code. The location where packages are stored is called the library.
• Packages that ship with R (for example MASS, mgcv, spatial)
• Contributed packages, installed separately
• Load a package and list its contents:
library(spatial)
library(help = spatial)
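The distinction between a package and the library can be checked from the console. This is a minimal sketch; it loads MASS only if that package is installed, since a minimal R installation may not include it.

```r
# Directories in which installed packages are stored (the library paths).
lib_dirs <- .libPaths()

# Names of all packages installed in those directories.
installed <- rownames(installed.packages())

# Attach a package so that its functions and data sets can be used by name.
if ("MASS" %in% installed) {
  library(MASS)
  mass_attached <- "package:MASS" %in% search()  # TRUE once attached
}
```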
Packages in R
• install.packages("rmeta")
• install.packages("xlsx")
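Installing a package downloads it, so it is usually done once per machine rather than in every script. One possible pattern is sketched below. The helper `install_if_missing` is hypothetical (it is not part of R), and the CRAN mirror URL is an assumption that can be replaced with any other mirror.

```r
# Hypothetical helper: install a package only when it is not already present.
install_if_missing <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  # Report (invisibly) whether the package is available afterwards.
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Example call, commented out because it may download from CRAN:
# install_if_missing("rmeta")
# library(rmeta)
```

Note that function names in R are case-sensitive, so `Install.packages()` with a capital I is not recognised.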
Help in R
• Menu
• Google (web search)
• ?mean
• help.search("data input")
• help() / help.start()
• find("lowess")
• apropos("lm")
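The help commands above can be tried as follows. This is a sketch: the calls that open help pages are wrapped in `interactive()` because they are mainly useful in a console session, and the variable names are chosen for illustration.

```r
if (interactive()) {
  help("mean")               # same as ?mean: help page for one function
  help.search("data input")  # search the installed help pages for a phrase
  help.start()               # open the HTML help system in a browser
}

# find() reports which attached package defines an object.
where_lowess <- find("lowess")

# apropos() lists the names of objects that contain the given text.
lm_names <- apropos("lm")
```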
Example and demonstration
• example(lm)
• demo(persp)
• demo(graphics)
Quit
• q()
Command line and Script
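• Command line: type a command at the prompt and press Enter
• Script: run the selected line(s) from the editor with Ctrl+R (R GUI on Windows; RStudio uses Ctrl+Enter)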
Introduction to RStudio
• Interface between R and the user
• Easy for beginners
• Helps with coding
• Suggestions (code completion)
• 4 windows: Script, Console, Environment, Output
