Unit - 5 R

Uploaded by

vsksai
Copyright
© © All Rights Reserved

Unit-5

Data Visualization using R (l2h): Reading and getting data into R (External
Data): XML files, Web Data, JSON files, Databases, Excel files. Working with R
Charts and Graphs: Histograms, Bar Charts, Line Graphs, Scatterplots, Pie
Charts.

1. Discuss data visualization using R.


Data visualization using R is a powerful tool for exploring, analyzing, and
communicating insights from your data. R offers a wide range of packages
and functions for creating various types of plots and charts, from simple
scatter plots to complex interactive visualizations. The main options are:
1. Base Graphics:
R's base graphics system provides functions for creating basic plots like
scatter plots, line plots, bar plots, histograms, and box plots. These functions
are built-in and come pre-installed with R.
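Example:
# Sample data (any numeric vectors work)
x <- rnorm(100)
y <- rnorm(100)
height <- c(3, 5, 2)
labels <- c("A", "B", "C")
# Scatter plot
plot(x, y)
# Line plot
plot(x, y, type = "l")
# Bar plot
barplot(height, names.arg = labels)
# Histogram
hist(x)
# Box plot
boxplot(x)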
2. ggplot2 Package:
ggplot2 is a popular package for creating elegant and customizable plots
based on the Grammar of Graphics. It provides a high-level interface for
building complex visualizations with ease.
Example:
library(ggplot2)
df <- data.frame(
  x = rnorm(100),                                   # 100 random points for x-axis
  y = rnorm(100),                                   # 100 random points for y-axis
  group = sample(c("A", "B"), 100, replace = TRUE)  # grouping column for the box plot
)
# Scatter plot
ggplot(data = df, aes(x = x, y = y)) + geom_point()
# Line plot
ggplot(data = df, aes(x = x, y = y)) + geom_line()
# Bar plot
ggplot(data = df, aes(x = x, y = y)) + geom_bar(stat = "identity")
# Histogram
ggplot(data = df, aes(x = x)) + geom_histogram()
# Box plot
ggplot(data = df, aes(x = factor(group), y = y)) + geom_boxplot()
3. plotly Package:
plotly is an interactive plotting library that lets you create interactive
and web-friendly visualizations. It supports various types of plots like scatter
plots, line plots, bar plots, and 3D plots.
Example:
install.packages("plotly")
library(plotly)

V.SAI KRISHNA M.SC.,M.TECH.,(PHD) –SLNDC - ANANTAPUR Page 1


df <- data.frame(
  x = rnorm(100),  # 100 random points for x-axis
  y = rnorm(100)   # 100 random points for y-axis
)

# Scatter plot
plot_ly(data = df, x = ~x, y = ~y, type = "scatter", mode = "markers")
# Line plot
plot_ly(data = df, x = ~x, y = ~y, type = "scatter", mode = "lines")
# Bar plot
plot_ly(data = df, x = ~x, y = ~y, type = "bar")
# 3D surface plot (volcano is a matrix dataset built into R)
plot_ly(z = ~volcano) %>% add_surface()

4. Interactive Dashboards with shiny:


shiny is an R package that lets you build interactive web applications
directly from R. You can create dynamic dashboards with interactive plots,
tables, and widgets that let users explore data interactively.
5. Specialized Packages:
R has many specialized packages for creating specific types of visualizations,
such as leaflet for interactive maps, gganimate for animated plots, and
networkD3 for visualizing networks and graphs. The ggplotly() function in the
plotly package converts ggplot2 plots into interactive plotly plots.
Data visualization in R offers a wide range of possibilities for creating
informative and visually appealing plots and charts. Depending on your data
and analysis goals, you can choose the appropriate package and functions to
visualize and communicate insights effectively.

2. Discuss reading and getting data into R (External Data).


Reading and getting external data into R is a crucial step in the data analysis
process. R provides various functions and packages to import data from
different sources such as text files, Excel files, databases, web APIs, and
more. Here's a discussion on how to read and get external data into R:
1. Reading Data from Text Files:
a. read.table() and read.csv():
Used for reading tabular data from text files with columns separated by
spaces or commas, respectively.
Syntax:
# For reading space-separated files
data <- read.table("file.txt", header = TRUE)
# For reading comma-separated files
data <- read.csv("file.csv", header = TRUE)
b. read.delim() and read.delim2():
Variants of read.table() for tab-separated files. read.delim() expects a period
as the decimal point, while read.delim2() expects a comma as the decimal point.
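As a minimal sketch of how the two tab-separated readers differ, the following writes two small temporary files and reads each one back. The file contents and the column names (item, price) are hypothetical sample data chosen only for illustration.

```r
# Hypothetical sample data written to temporary files for illustration
tab_file <- tempfile(fileext = ".txt")
writeLines(c("item\tprice", "pen\t1.5", "book\t12.25"), tab_file)

tab_file_eu <- tempfile(fileext = ".txt")
writeLines(c("item\tprice", "pen\t1,5", "book\t12,25"), tab_file_eu)

# read.delim(): tab-separated, "." as the decimal point
prices <- read.delim(tab_file)

# read.delim2(): tab-separated, "," as the decimal point
prices_eu <- read.delim2(tab_file_eu)

str(prices)
str(prices_eu)
```

Both calls should give a numeric price column. Reading the second file with read.delim() instead would leave price as text, because "1,5" is not a number under a period decimal convention.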
2. Reading Data from Excel Files:
a. readxl Package:
Used for reading data from Excel files.
Syntax:
# Install and load the readxl package
install.packages("readxl")
library(readxl)
# Read data from Excel file
data <- read_excel("file.xlsx", sheet = "Sheet1")
b. openxlsx Package:
Another package for reading data from Excel files.
Syntax:
# Install and load the openxlsx package
install.packages("openxlsx")
library(openxlsx)
# Read data from Excel file
data <- read.xlsx("file.xlsx", sheet = 1)
3. Reading Data from Databases:
a. DBI Package:
Used for connecting to databases and fetching data into R.
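Syntax:
# Install and load the DBI package (the SQLite driver comes from RSQLite)
install.packages(c("DBI", "RSQLite"))
library(DBI)
# Connect to database
con <- dbConnect(RSQLite::SQLite(), dbname = "database.db")
# Execute query and fetch data into data frame
# (my_table is a placeholder; TABLE itself is a reserved word in SQL)
data <- dbGetQuery(con, "SELECT * FROM my_table")
# Close the connection when done
dbDisconnect(con)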
b. Specific Database Connectors:
There are specific packages like RMySQL, ROracle, RPostgreSQL, etc., for
connecting to different types of databases.
4. Reading Data from Web APIs:
a. httr Package:
Used for making HTTP requests to web APIs and fetching data.
Syntax:
# Install and load the httr package
install.packages("httr")
library(httr)
# Make GET request to API
response <- GET("https://api.example.com/data")
# Parse JSON response
data <- content(response, "parsed")
5. Reading Data from Web Scraping:
a. rvest Package:
Used for web scraping and extracting data from HTML pages.
Syntax:
# Install and load the rvest package
install.packages("rvest")
library(rvest)
# Read HTML tables from a webpage (returns a list, one element per table)
webpage <- read_html("https://example.com")
data <- html_table(webpage)
6. Reading Data from APIs and Other Sources:
R has packages for accessing various APIs like twitteR for Twitter, ggmap for
Google Maps, RSocrata for Socrata Open Data API, etc.
Additionally, packages like quantmod and Quandl provide access to financial
and economic data.
By using these functions and packages, you can efficiently read external
data into R for further analysis and visualization, drawing on many
different sources.

3. What is the process for importing CSV files into R?


Importing CSV files into R is straightforward with the read.csv() function,
which is part of the utils package that ships with R and is loaded by default.
Here's a step-by-step process:
1. Ensure CSV File Accessibility:
Make sure the CSV file you want to import is accessible from your R
environment. You should know the path to the file or place it in your working
directory.
2. Set Working Directory (if necessary):
If your CSV file is not in the current working directory, you may need to set
the working directory using the setwd() function:
setwd("path/to/directory")
3. Use read.csv() Function:
Use the read.csv() function to import the CSV file into R. Provide the path to
the CSV file as the argument.
# Syntax
data <- read.csv("file.csv")
# Example
data <- read.csv("data.csv")

4. Additional Options:
header: Set to TRUE (the default for read.csv()) if the first row of the CSV
file contains column names, FALSE otherwise.
sep: Specify the delimiter used in the file, e.g., "," (comma), ";"
(semicolon), or "\t" (tab).
stringsAsFactors: Set to TRUE to convert strings to factors, FALSE otherwise
(FALSE is the default since R 4.0.0).
na.strings: Specify strings to be interpreted as missing values.
# Example with additional options
data <- read.csv("data.csv", header = TRUE, sep = ",",
                 stringsAsFactors = FALSE, na.strings = "")
5. Check Imported Data:
Verify that the data has been imported correctly by printing the first few rows
of the data frame:
head(data)
Example:
Suppose you have a CSV file named data.csv containing the following data:
Name,Age,Gender
John,25,Male
Emily,30,Female
David,22,Male
You can import this CSV file into R as follows:
data <- read.csv("data.csv", header = TRUE)
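The steps above can be tried without an existing file. The sketch below writes the sample rows to a temporary file and reads them back with a few of the options described. The fourth row ("Ravi", with a missing age) is a hypothetical addition, included only to show how na.strings behaves.

```r
# Write the sample rows to a temporary file
# (the "Ravi" row with a missing age is a hypothetical addition)
csv_file <- tempfile(fileext = ".csv")
writeLines(c(
  "Name,Age,Gender",
  "John,25,Male",
  "Emily,30,Female",
  "David,22,Male",
  "Ravi,,Male"
), csv_file)

students <- read.csv(csv_file, header = TRUE, stringsAsFactors = FALSE,
                     na.strings = "")

head(students)            # first rows of the data frame
str(students)             # column types
sum(is.na(students$Age))  # number of missing ages
```

Checking str() right after import is a cheap way to catch columns that were read as text when numbers were expected.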

4. What is the process for importing XML files into R?


Importing XML files into R involves using packages specifically designed for
XML parsing, such as XML or xml2. Here's a step-by-step process to import
XML files into R:

1. Install and Load XML Parsing Package:


Before importing XML files, you need to install and load a package that
provides XML parsing capabilities. Two commonly used packages are XML and
xml2. You can choose either based on your preference.
# Install and load the XML package
install.packages("XML")
library(XML)
# OR
# Install and load the xml2 package
install.packages("xml2")
library(xml2)
2. Parse the XML File:
Once you have the package loaded, you can parse the XML file using the
appropriate function provided by the package (xmlTreeParse() for the XML
package or read_xml() for the xml2 package).
Using XML Package:
# Parse XML file
doc <- xmlTreeParse("file.xml")
Using xml2 Package:
# Parse XML file
doc <- read_xml("file.xml")
3. Extract Data from XML Document:
After parsing the XML file, you can extract data from the XML document
using various functions provided by the package.
Using XML Package:
# Extract data from XML document
data <- xmlToList(doc)
Using xml2 Package:
# Extract data from XML document
data <- xml2::as_list(doc)
4. Convert Data to Suitable R Data Structure:
The extracted data may need further processing to convert it into a suitable R
data structure such as a data frame, list, or vector, depending on your
analysis needs.
Example:
Suppose you have an XML file named data.xml with the following content:

<students>
  <student>
    <name>John</name>
    <age>25</age>
    <gender>Male</gender>
  </student>
  <student>
    <name>Emily</name>
    <age>30</age>
    <gender>Female</gender>
  </student>
</students>

You can import and parse this XML file into R using the xml2 package as follows:
# Install and load the xml2 package
install.packages("xml2")
library(xml2)
# Parse XML file
doc <- read_xml("data.xml")
# Extract data from XML document
data <- xml2::as_list(doc)
This will create a nested list named data containing the extracted data from
the XML file.
By following these steps, you can import XML files into R and work with the
data for further analysis and manipulation.
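Step 4 (converting to a data frame) can be sketched with xml2's XPath helpers. This is one possible approach, not the only one. It assumes every student record has the same three child elements, and it passes the XML as a string rather than reading data.xml so the example is self-contained.

```r
library(xml2)

# The same XML as in the example, supplied as a string instead of a file
xml_string <- "<students>
  <student><name>John</name><age>25</age><gender>Male</gender></student>
  <student><name>Emily</name><age>30</age><gender>Female</gender></student>
</students>"
doc <- read_xml(xml_string)

# Select every <student> node, then pull one child field per column
student_nodes <- xml_find_all(doc, ".//student")
students <- data.frame(
  name   = xml_text(xml_find_first(student_nodes, "./name")),
  age    = as.integer(xml_text(xml_find_first(student_nodes, "./age"))),
  gender = xml_text(xml_find_first(student_nodes, "./gender")),
  stringsAsFactors = FALSE
)
print(students)
```

xml_text() always returns character values, so numeric fields such as age need an explicit conversion.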

5. What is the process for importing web data into R?


Importing web data into R typically involves using web scraping techniques or
accessing web APIs. Here's a general process for importing web data into R:
1. Identify the Data Source:
Determine the website or web service from which you want to import data.
This could be a webpage containing HTML tables, JSON data from a web API,
or other structured data sources.
2. Web Scraping:
If the data is available on a webpage, you can use web scraping techniques
to extract it. The rvest package is commonly used for web scraping in R.
Example using rvest:
# Install and load the rvest package
install.packages("rvest")
library(rvest)
# Read HTML content from the webpage
url <- "https://example.com"
webpage <- read_html(url)
# Extract data from HTML tables (returns a list, one element per table)
data <- html_table(webpage)
3. Accessing Web APIs:
Many websites and web services provide APIs that allow you to
programmatically access their data. You can use R packages like httr or
packages written for specific APIs.
Example using httr:
# Install and load the httr package
install.packages("httr")
library(httr)
# Make GET request to API
url <- "https://api.example.com/data"
response <- GET(url)
# Parse JSON response
data <- content(response, "parsed")
4. Authentication (if required):
If the web service requires authentication, you may need to include
authentication credentials in your request headers or parameters.
5. Data Processing:
Once you have obtained the web data, you may need to process it further to
convert it into a suitable R data structure such as data frames, lists, or
vectors.
Example:
Suppose you want to import JSON data from a web API. You can use the httr
package to make a GET request and parse the JSON response:
# Install and load the httr package
install.packages("httr")
library(httr)
# Make GET request to API
url <- "https://api.example.com/data"
response <- GET(url)
# Parse JSON response
data <- content(response, "parsed")
This will create an R object named data containing the imported web data,
which you can then work with for further analysis and manipulation.
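The scraping step can be tried offline before pointing it at a real site. In this sketch the HTML page is a hypothetical string standing in for a downloaded page. read_html() accepts literal HTML as well as a URL.

```r
library(rvest)

# A hypothetical HTML page held in a string, standing in for a downloaded page
html_string <- "<html><body><table>
  <tr><th>Name</th><th>Age</th></tr>
  <tr><td>John</td><td>25</td></tr>
  <tr><td>Emily</td><td>30</td></tr>
</table></body></html>"
page <- read_html(html_string)

# html_table() on a whole page returns a list with one element per <table>
tables <- html_table(page)
students <- as.data.frame(tables[[1]])
print(students)
```

Because html_table() returns a list, the table of interest has to be picked out by position, here with tables[[1]].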

6. What is the process for importing JSON files into R?
In R, you can import JSON files using the jsonlite package, which provides
functions for reading and writing JSON data. Here's a step-by-step guide on
how to import JSON files:
Install jsonlite Package: If you haven't installed the jsonlite package yet, you
can do so by running the following command in R:
install.packages("jsonlite")
Load jsonlite Package: After installing the package, you need to load it into
your R session:
library(jsonlite)
Read JSON File: Use the fromJSON() function to read the JSON file into R. You
need to specify the path to your JSON file as an argument.
data <- fromJSON("path/to/your/file.json")
Replace "path/to/your/file.json" with the actual path to your JSON file.
Verify Data: You can verify that the data has been imported correctly by
inspecting the structure of the data object using the str() function:
str(data)
This will give you an overview of the structure of the imported JSON data.
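fromJSON() also accepts JSON text directly, which makes it easy to see what structure comes back. The JSON below is hypothetical sample data. The sketch assumes an array of records with identical fields, which jsonlite simplifies to a data frame by default.

```r
library(jsonlite)

# Hypothetical JSON text: an array of records with the same fields
json_text <- '[
  {"name": "John",  "age": 25, "gender": "Male"},
  {"name": "Emily", "age": 30, "gender": "Female"}
]'

# An array of similar objects is simplified to a data frame by default
students <- fromJSON(json_text)
str(students)

# toJSON() goes the other way, from an R object back to JSON text
toJSON(students, pretty = TRUE)
```

Deeply nested JSON may come back as a list or as a data frame with list columns, so str() is worth running before any analysis.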

7. What is the process for importing data from databases into R?


Importing databases into R typically involves establishing a connection to a
database management system (DBMS) and then executing SQL queries to
retrieve data into R data structures. Here's a general process for importing
databases into R:
Install Necessary Packages: You may need to install packages specific to the
database management system you're using.
For example, if you're working with MySQL, you'll need the RMySQL
package.
install.packages("RMySQL")
If you're using a different DBMS, you'll need to install the corresponding
package. For PostgreSQL, there's RPostgreSQL, for SQLite, there's RSQLite,
and so on.
Load Required Packages: Load the package(s) you installed into your R
session.
library(RMySQL)
Replace RMySQL with the package you installed if you're using a different
DBMS.
Establish Database Connection: Use the appropriate function to establish a
connection to your database. For example, to connect to a MySQL database,
you'd call dbConnect() with the MySQL() driver from the RMySQL package.
con <- dbConnect(MySQL(), user = "username", password = "password",
                 dbname = "database_name", host = "localhost")
Replace "username", "password", "database_name", and "localhost" with
your actual MySQL credentials and database information.
Query Data: Once the connection is established, you can execute SQL
queries to retrieve data from the database into R data structures such as data
frames.
data <- dbGetQuery(con, "SELECT * FROM your_table")
Replace "SELECT * FROM your_table" with your SQL query to retrieve the
desired data from the database.

Close Connection: After you've finished working with the database, it's good
practice to close the connection.
dbDisconnect(con)
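The same connect, query, disconnect cycle can be practised without a database server. This sketch swaps MySQL for an in-memory SQLite database (it assumes the DBI and RSQLite packages are installed). The table name cars_table is hypothetical, and the data comes from R's built-in mtcars data set.

```r
library(DBI)

# In-memory SQLite database; needs the RSQLite package but no server
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Load a built-in data frame into a table so there is something to query
dbWriteTable(con, "cars_table", mtcars)

# List tables, then pull a filtered result into a data frame
dbListTables(con)
fast_cars <- dbGetQuery(con, "SELECT mpg, cyl, hp FROM cars_table WHERE hp > 200")
print(fast_cars)

dbDisconnect(con)
```

Filtering in SQL means only the matching rows are transferred into R, which matters when the table is much larger than memory.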

8. What is the process for importing Excel files into R?


Importing Excel files into R can be done using several packages, but one of
the most commonly used is the readxl package. Here's a step-by-step process
to import Excel files into R using readxl:
Install and Load the readxl Package: If you haven't already installed the
readxl package, you can do so by running:
install.packages("readxl")
Once installed, load the package:
library(readxl)
Specify the Excel File Path: Determine the path to your Excel file. You can
either specify the full path or place the file in your working directory and just
specify the file name.
file_path <- "path/to/your/file.xlsx"
Read the Excel File: Use the read_excel() function from the readxl package to
import the Excel file into R. Provide the file path or name as an argument.
data <- read_excel(file_path)
If your Excel file contains multiple sheets and you want to read a specific
sheet, you can specify the sheet name or index:
# Specify sheet name
data <- read_excel(file_path, sheet = "Sheet1")
# or specify sheet index
data <- read_excel(file_path, sheet = 1)
Verify Data: You can inspect the structure of the imported data using
functions like head() or str():
# View the first few rows of the data
head(data)
# View the structure of the data
str(data)
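To try these steps without preparing a workbook, the sketch below uses a sample file that is bundled with readxl. readxl_example() returns the path to that file, and excel_sheets() lists the sheet names so you can choose one before reading.

```r
library(readxl)

# readxl_example() returns the path to a sample workbook bundled with readxl
file_path <- readxl_example("datasets.xlsx")

# List the sheet names before choosing one
sheets <- excel_sheets(file_path)
print(sheets)

# Read only the first five data rows of the first sheet
first_rows <- read_excel(file_path, sheet = 1, n_max = 5)
print(first_rows)
```

Limiting the read with n_max is a quick way to preview a large sheet before importing all of it.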

9. Discuss working with R charts and graphs.


Working with charts and graphs in R is essential for data visualization and
analysis. R offers a wide range of packages and functions for creating various
types of charts and graphs, from basic plots to highly customized
visualizations. Popular options for data visualization in R include base
graphics, ggplot2, lattice, plotly, and ggvis. Here's a general overview of
working with R charts and graphs:
Basic Plots with Base R: Base R provides functions like plot(), hist(), boxplot(),
barplot(), etc., for creating basic plots and charts. These functions are
versatile and can handle simple visualizations quickly.
# Example of a scatter plot using base R
x <- rnorm(100)
y <- rnorm(100)
plot(x, y, main = "Scatter Plot", xlab = "X-axis", ylab = "Y-axis")
Advanced Plots with ggplot2: ggplot2 is a powerful and popular package for
creating sophisticated and highly customizable graphics. It follows a layered
grammar of graphics approach, allowing users to build complex visualizations
piece by piece.
# Example of a scatter plot using ggplot2
library(ggplot2)
df <- data.frame(x = rnorm(100), y = rnorm(100))
ggplot(data = df, aes(x = x, y = y)) + geom_point() +
  labs(title = "Scatter Plot", x = "X-axis", y = "Y-axis")

Interactive Visualizations: The plotly package and its ggplotly() function let
you create interactive plots from ggplot2 objects. These interactive plots can
be manipulated by users, providing a dynamic exploration of the data.
# Example of an interactive scatter plot using plotly
library(ggplot2)
library(plotly)
df <- data.frame(x = rnorm(100), y = rnorm(100))
p <- ggplot(data = df, aes(x = x, y = y)) + geom_point() +
  labs(title = "Interactive Scatter Plot", x = "X-axis", y = "Y-axis")
ggplotly(p)
Customizing Visualizations: R allows extensive customization of plots to meet
specific requirements. You can customize various aspects of a plot such as
axes, titles, colors, themes, legends, and annotations.
Saving Plots: Once you've created a plot, you can save it to various file
formats such as PNG, PDF, JPEG, etc., using functions like ggsave() (for
ggplot2 plots) or pdf(), png(), etc. (for base R plots).
Combining Multiple Plots: You can combine multiple plots into a single
display using functions like par() (for base R plots), grid.arrange() (from
the gridExtra package), or facets in ggplot2.
Exporting Plots: R allows you to export plots for use in reports, presentations,
or web applications. You can export plots directly from the R console or
integrate them into documents using tools like R Markdown.
Overall, working with charts and graphs in R offers great flexibility and power
for visualizing and interpreting data, making it an indispensable tool for data
analysis and exploration.
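The saving and combining steps can be sketched together in base R. This example opens a PDF graphics device, places two plots side by side with par(mfrow = ...), and closes the device to write the file. The output file name two_plots.pdf and the random data are hypothetical.

```r
set.seed(1)                            # reproducible random data
x <- rnorm(100)
y <- rnorm(100)

out_file <- file.path(tempdir(), "two_plots.pdf")

pdf(out_file, width = 8, height = 4)   # open a PDF graphics device
old_par <- par(mfrow = c(1, 2))        # 1 row, 2 columns of plots
hist(x, main = "Histogram of x", col = "skyblue")
plot(x, y, main = "Scatter Plot", pch = 16)
par(old_par)                           # restore previous layout settings
dev.off()                              # close the device to write the file

file.exists(out_file)
```

Nothing is written to disk until dev.off() is called, so a forgotten dev.off() is a common reason for empty or missing output files.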

10. What is a histogram? What is the procedure to create histograms in R?
A histogram is a graphical representation of the distribution of numerical
data. It consists of a series of contiguous rectangles (bars) where the area of
each bar corresponds to the frequency of data values within the interval it
represents. Histograms are commonly used to visualize the frequency
distribution of continuous variables.
Here's a general procedure to create histograms in R:
Load or Generate Data: First, you need to have data to plot. You can load
data from a file, import it from a database, or generate it within R.
Create Histogram: Use the hist() function in R to create a histogram. The
hist() function takes the data as its main argument and allows you to specify
additional parameters such as the number of bins, axis labels, title, color,
etc.
Customize Histogram: You can customize the appearance of the histogram
by adjusting parameters such as colors, bin width, axis labels, title, etc.
Display or Save the Plot: Once you've created the histogram, you can
display it in the R graphics window, save it as an image file, or embed it in a
document or presentation.
Here's an example of creating a histogram in R using randomly generated
data:

# Generate random data
data <- rnorm(1000)  # Generate 1000 normally distributed random numbers
# Create histogram
hist(data,
     breaks = 20,                        # Suggested number of bins
     main = "Histogram of Random Data",  # Title
     xlab = "Values",                    # X-axis label
     ylab = "Frequency",                 # Y-axis label
     col = "skyblue",                    # Bar color
     border = "black"                    # Border color
)

In this example:
We generate 1000 random numbers from a normal distribution using rnorm().
We create a histogram of the generated data using hist().
We specify parameters such as the number of bins (breaks), main title (main),
x-axis label (xlab), y-axis label (ylab), bar color (col), and border color
(border). When breaks is a single number, R treats it as a suggestion and may
choose a nearby number of bins that gives round break points.
You can further customize the histogram by adjusting these parameters or
by exploring additional options provided by the hist() function. Additionally,
you can combine histograms with other types of plots, customize axes, and add
annotations to make the plot more informative.
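hist() also returns the computed bins, which helps when you want to check what was drawn. The sketch below inspects that return value and then forces an exact bin width by passing a vector of break points. The seed and the 0.5 bin width are arbitrary choices for illustration.

```r
set.seed(42)                 # reproducible random data
data <- rnorm(1000)

# plot = FALSE computes the bins without drawing anything
h <- hist(data, breaks = 20, plot = FALSE)
length(h$counts)             # actual number of bins (may differ from 20)
sum(h$counts)                # every observation falls in exactly one bin

# To force exact bin edges, pass a vector of break points instead
edges <- seq(floor(min(data)), ceiling(max(data)), by = 0.5)
hist(data, breaks = edges, col = "skyblue", border = "black",
     main = "Histogram with Fixed Bin Width", xlab = "Values")
```

A vector of break points must cover the full range of the data, which is why the sequence runs from floor(min(data)) to ceiling(max(data)).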

11. What is a box plot? What is the procedure to create box plots in R?


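A box plot, also known as a box-and-whisker plot, is a graphical summary of a
data distribution through five key summary statistics: minimum, first quartile
(Q1), median (Q2), third quartile (Q3), and maximum. It provides a visual
representation of the central tendency, variability, and skewness of a dataset.
In R's boxplot(), the whiskers by default extend only to the most extreme
points within 1.5 times the interquartile range of the box; points beyond that
are drawn individually as outliers.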
Here's the general procedure to create box plots in R:
Load or Generate Data: As with histograms, you need to have
data available in R to create a box plot. This data could be loaded from a file,
imported from a database, or generated within R.
Create Box Plot: Use the boxplot() function in R to create a box plot. This
function takes the data as its main argument and allows you to specify
additional parameters such as labels, titles, color, etc.
Customize Box Plot: You can customize the appearance of the box plot by
adjusting parameters such as colors, labels, titles, etc.
Display or Save the Plot: Once you've created the box plot, you can display
it in the R graphics window, save it as an image file, or embed it in a document
or presentation.
Here's an example of creating a box plot in R using randomly generated data:

# Generate random data
data <- rnorm(100)  # Generate 100 normally distributed random numbers
# Create box plot
boxplot(data,
        main = "Box Plot of Random Data",  # Title
        ylab = "Values",                   # Y-axis label
        col = "skyblue",                   # Box color
        border = "black"                   # Border color
)

In this example:
We generate 100 random numbers from a normal distribution using rnorm().
We create a box plot of the generated data using boxplot().
We specify parameters such as the main title (main), y-axis label (ylab), box
color (col), and border color (border).
You can further customize the box plot by adjusting these parameters or
exploring additional options provided by the boxplot() function. Additionally,
you can combine box plots with other types of plots, customize axes, and add
annotations to make the plot more informative.
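The numbers behind a box plot can be inspected directly, which helps when interpreting the picture. This sketch adds one deliberately extreme value (6, a hypothetical outlier) to random data, prints the summary statistics, and then draws grouped box plots from the built-in mtcars data set using a formula.

```r
set.seed(7)                          # reproducible random data
data <- c(rnorm(100), 6)             # 6 is a deliberately extreme value

fivenum(data)                        # min, lower hinge, median, upper hinge, max

# boxplot.stats() returns the whisker ends and hinges plus flagged outliers
stats <- boxplot.stats(data)
stats$stats                          # lower whisker, Q1, median, Q3, upper whisker
stats$out                            # points drawn individually as outliers

# Grouped box plots use a formula: numeric variable ~ grouping variable
boxplot(mpg ~ cyl, data = mtcars,
        main = "Mileage by Number of Cylinders",
        xlab = "Cylinders", ylab = "Miles per gallon", col = "skyblue")
```

The formula form is usually the most convenient way to compare a distribution across categories in a single plot.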

12. What is a bar chart? What is the procedure to create bar charts in R?
A bar chart is a graphical representation of categorical data with rectangular
bars of lengths proportional to the values they represent. Bar charts are
useful for comparing the frequencies or proportions of different categories
within a dataset.

Here's a general procedure to create bar charts in R:
Load or Prepare Data: Ensure you have the data ready for plotting. If your
data is not in the desired format, you may need to reshape it using tools like
dplyr or tidyr to prepare it for plotting.
Create Bar Chart: Use functions such as barplot() from base R or geom_bar()
from the ggplot2 package to create a bar chart. These functions take the data
as input along with additional parameters for customization.
Customize Bar Chart: Adjust parameters such as colors, labels, titles, axes,
etc., to customize the appearance of the bar chart according to your
preferences.
Display or Save the Plot: Once you've created the bar chart, you can display
it in the R graphics window, save it as an image file, or embed it in a
document or presentation.
Here's an example of creating a bar chart in R using randomly generated
values for five categories:
# Define categories and generate a random value for each
categories <- c("A", "B", "C", "D", "E")
values <- sample(1:100, 5)
# Create bar chart using base R
barplot(values,
        names.arg = categories,             # Category labels
        main = "Bar Chart of Random Data",  # Title
        xlab = "Categories",                # X-axis label
        ylab = "Values",                    # Y-axis label
        col = "skyblue",                    # Bar color
        border = "black"                    # Border color
)
In this example:
We define five categories and generate a random value for each using
sample().
We create a bar chart of the generated data using barplot().
We specify parameters such as category labels (names.arg), main title
(main), axis labels (xlab and ylab), bar color (col), and border color (border).
You can further customize the bar chart by adjusting these parameters or
exploring additional options provided by the plotting functions. Additionally,
you can combine bar charts with other types of plots, customize axes, and add
annotations to make the plot more informative.
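Bar charts often start from raw categorical observations rather than ready-made totals. In this sketch, table() does the counting and the midpoints returned by barplot() are used to print each count above its bar. The grades vector is hypothetical sample data.

```r
# Hypothetical raw categorical observations
grades <- c("A", "B", "A", "C", "B", "A", "D", "B", "A", "C")

# table() counts how often each category occurs
counts <- table(grades)
print(counts)

# barplot() returns the x positions of the bar midpoints,
# which can be used to place the count above each bar
mids <- barplot(counts, main = "Grade Counts", xlab = "Grade",
                ylab = "Frequency", col = "skyblue",
                ylim = c(0, max(counts) + 1))
text(mids, as.vector(counts), labels = as.vector(counts), pos = 3)
```

The extra headroom in ylim keeps the label above the tallest bar from being clipped.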

13. What is a line graph? What is the procedure to create line graphs in R?
A line graph, also known as a line plot or line chart, is a graphical
representation of data points connected by straight line segments. Line
graphs are commonly used to display trends and patterns in time-series data
or to show relationships between continuous variables.
Here's a general procedure to create line graphs in R:
Load or Prepare Data: Ensure you have the data ready for plotting. If your
data is not in the desired format, you may need to preprocess or reshape it
using tools like dplyr or tidyr to prepare it for plotting.
Create Line Graph: Use functions such as plot() from base R or geom_line()
from the ggplot2 package to create a line graph. These functions take the
data as input along with additional parameters for customization.
Customize Line Graph: Adjust parameters such as colors, labels, titles, axes,
etc., to customize the appearance of the line graph according to your
preferences.
Display or Save the Plot: Once you've created the line graph, you can
display it in the R graphics window, save it as an image file, or embed it in a
document or presentation.
Here's an example of creating a line graph in R using randomly generated
time-series data:
# Generate random time-series data
time <- 1:10
values <- rnorm(10)
# Create line graph using base R
plot(time, values,
     type = "l",                          # Plot as lines
     main = "Line Graph of Random Data",  # Title
     xlab = "Time",                       # X-axis label
     ylab = "Values",                     # Y-axis label
     col = "skyblue",                     # Line color
     lwd = 2                              # Line width
)
In this example:
We generate random time-series data with 10 time points (time) and
corresponding values (values) using rnorm().
We create a line graph of the generated data using plot().
We specify parameters such as the plot type (type), main title (main), axis
labels (xlab and ylab), line color (col), and line width (lwd).
You can further customize the line graph by adjusting these parameters or
exploring additional options provided by the plotting functions. Additionally,
you can combine line graphs with other types of plots, customize axes, and add
annotations to make the plot more informative.
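Comparing two series on the same axes is a common next step. This sketch draws the first series with plot(), adds the second with lines(), and labels both with legend(). The two series are hypothetical random walks built with cumsum().

```r
set.seed(3)                          # reproducible random data
time <- 1:10
series_a <- cumsum(rnorm(10))        # hypothetical series A
series_b <- cumsum(rnorm(10))        # hypothetical series B

# Set the y-axis range up front so both series fit
y_range <- range(c(series_a, series_b))

plot(time, series_a, type = "o", col = "skyblue", lwd = 2,
     ylim = y_range, main = "Two Series Over Time",
     xlab = "Time", ylab = "Values")
lines(time, series_b, type = "o", col = "tomato", lwd = 2, lty = 2)
legend("topleft", legend = c("Series A", "Series B"),
       col = c("skyblue", "tomato"), lwd = 2, lty = c(1, 2))
```

plot() fixes the axis limits when it is first called, so without the explicit ylim any part of the second series outside the first series' range would be cut off.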

14. What is a scatter plot? What is the procedure to create scatter plots in R?
A scatter plot is a type of plot that displays values for two variables as points
on a two-dimensional coordinate system. Each point on the plot represents
the value of one variable corresponding to the value of the other variable.
Scatter plots are useful for visualizing relationships between two continuous
variables and identifying patterns, trends, or correlations in the data.
Here's a general procedure to create scatter plots in R:
Load or Prepare Data: Ensure you have the data ready for plotting. If your
data is not in the desired format, you may need to preprocess or reshape it
using tools like dplyr or tidyr to prepare it for plotting.
Create Scatter Plot: Use functions such as plot() from base R or geom_point()
from the ggplot2 package to create a scatter plot. These functions take the
data as input along with additional parameters for customization.
Customize Scatter Plot: Adjust parameters such as colors, labels, titles, axes,
etc., to customize the appearance of the scatter plot according to your
preferences.
Display or Save the Plot: Once you've created the scatter plot, you can
display it in the R graphics window, save it as an image file, or embed it in a
document or presentation.
Here's an example of creating a scatter plot in R using randomly generated
data:
# Generate random data for two variables
x <- rnorm(100)  # Generate 100 normally distributed random numbers for x
y <- rnorm(100)  # Generate 100 normally distributed random numbers for y
# Create scatter plot using base R
plot(x, y,
     main = "Scatter Plot of Random Data",  # Title
     xlab = "X-axis",                       # X-axis label
     ylab = "Y-axis",                       # Y-axis label
     col = "skyblue",                       # Point color
     pch = 16,                              # Point shape
     cex = 1.5                              # Point size
)
In this example:
We generate random data for two variables, x and y, using rnorm().
We create a scatter plot of the generated data using plot().
We specify parameters such as the main title (main), axis labels (xlab and
ylab), point color (col), point shape (pch), and point size (cex).
You can further customize the scatter plot by adjusting these parameters or
exploring additional options provided by the plotting functions. Additionally,
you can combine scatter plots with other types of plots, customize axes, and
add annotations to make the plot more informative.
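Since scatter plots are mainly used to judge relationships, it is often useful to overlay a fitted line and compute the correlation. In this sketch the linear relationship (slope 2 plus noise) is an assumption built into the simulated data, not something taken from a real data set.

```r
set.seed(10)                         # reproducible random data
x <- rnorm(100)
y <- 2 * x + rnorm(100, sd = 0.5)    # hypothetical linear relationship plus noise

plot(x, y, main = "Scatter Plot with Fitted Line",
     xlab = "X-axis", ylab = "Y-axis", col = "skyblue", pch = 16)

fit <- lm(y ~ x)                     # least-squares straight line
abline(fit, col = "black", lwd = 2)  # draw the fitted line on the plot

cor(x, y)                            # Pearson correlation coefficient
coef(fit)                            # intercept and slope
```

With independent x and y, as in the earlier example, the fitted line would be nearly flat and the correlation close to zero.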

15. What is a pie chart? What is the procedure to create pie charts in R?
A pie chart is a circular statistical graphic that is divided into slices to
illustrate numerical proportions. The size of each slice represents the
proportion of the whole that each category represents. Pie charts are
commonly used to visualize categorical data and to show the composition of a
whole.
Here's a general procedure to create pie charts in R:
Load or Prepare Data: Ensure you have the data ready for plotting. Pie charts
typically work best with categorical data, where you have categories and
corresponding frequencies or proportions.
Create Pie Chart: Use functions such as pie() from base R or geom_bar() with
coord_polar() from the ggplot2 package to create a pie chart. These functions
take the data as input along with additional parameters for customization.

Customize Pie Chart: Adjust parameters such as colors, labels, and titles to
customize the appearance of the pie chart according to your preferences.

Display or Save the Plot: Once you've created the pie chart, you can view it
in R's graphics window, save it as an image file, or embed it in a document or
presentation.
Here's an example of creating a pie chart in R using sample categorical
data:
# Define sample categories and their frequencies
categories <- c("Category A", "Category B", "Category C", "Category D")
frequencies <- c(20, 30, 15, 35)

# Create pie chart using base R
pie(frequencies,
    labels = categories,                 # Category labels
    main = "Pie Chart of Categories",    # Title
    col = rainbow(length(categories)))   # Slice colors

In this example:
We define sample categorical data with four categories (categories) and
corresponding frequencies (frequencies).
We create a pie chart of the generated data using pie().
We specify parameters such as category labels (labels), main title (main), and
slice colors (col).
You can further customize the pie chart by adjusting these parameters or
exploring additional options provided by the plotting functions. You can also
combine pie charts with other types of plots, customize labels, and add
annotations to create informative and visually appealing visualizations.
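One common label customization is to show each slice's share of the whole. The sketch below reuses the category names and frequencies from the example; the variable names percentages and slice_labels are introduced here for illustration only.

```r
categories  <- c("Category A", "Category B", "Category C", "Category D")
frequencies <- c(20, 30, 15, 35)

# Share of each category as a whole-number percentage
percentages <- round(100 * frequencies / sum(frequencies))

# Build labels such as "Category A (20%)"
slice_labels <- paste0(categories, " (", percentages, "%)")

pie(frequencies,
    labels = slice_labels,
    main = "Pie Chart of Categories",
    col = rainbow(length(categories)))
```

Because pie() scales the slices itself, the raw frequencies are still passed as the data and only the labels change.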

Programs :

To import XML files in R, you can use the xml2 or XML packages, both of which provide
functions for parsing XML data.
Here's how you can do it with both packages:
1. Using the xml2 Package
The xml2 package is modern, fast, and has a user-friendly API.
Install and Load the xml2 Package
If you don't have the package installed, you can install it using:
install.packages("xml2")
Then load the package:
library(xml2)
Import an XML File
To read an XML file, use the read_xml() function:
# Load the XML file
xml_file <- read_xml("path/to/your/file.xml")

# View the structure of the XML document
print(xml_file)
Extract Data
You can navigate through the XML tree using functions like xml_find_all() and xml_text():
# Extract all nodes of a certain type
nodes <- xml_find_all(xml_file, ".//node_name")
# Convert the nodes to text
values <- xml_text(nodes)
print(values)

2. Using the XML Package


The XML package is older but still widely used for working with XML in R.
Install and Load the XML Package
If you don't have the package installed, you can install it using:
install.packages("XML")
Then load the package:
library(XML)
Import an XML File
To read an XML file, use the xmlParse() function:
# Load the XML file
xml_file <- xmlParse("path/to/your/file.xml")

# Convert to an R list for easier data extraction
xml_list <- xmlToList(xml_file)

# View the contents of the resulting list
print(xml_list)
Extract Data
You can extract data directly from the parsed XML object:
# Extract specific elements using XPath
values <- xpathSApply(xml_file, "//node_name", xmlValue)

print(values)
Choosing Between xml2 and XML
• xml2: Recommended for most tasks due to its speed and simplicity.
• XML: Useful if you need legacy support or specific features not available in xml2.
Example
Assume you have an XML file named example.xml like this:
<root>
  <item>
    <name>Item 1</name>
    <price>10.99</price>
  </item>
  <item>
    <name>Item 2</name>
    <price>15.99</price>
  </item>
</root>
You can load and parse this file using the xml2 package like this:
library(xml2)

# Load the XML file
xml_file <- read_xml("example.xml")

# Extract item names
name_nodes <- xml_find_all(xml_file, ".//name")
item_names <- xml_text(name_nodes)

# Extract item prices
price_nodes <- xml_find_all(xml_file, ".//price")
item_prices <- as.numeric(xml_text(price_nodes))

# Combine into a data frame
df <- data.frame(Name = item_names, Price = item_prices)

print(df)
This will give you a data frame with the item names and prices from the XML file.
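Extracting names and prices separately keeps the rows aligned only if every item has both fields. As a hedged sketch of a more defensive variant, the code below walks the items one by one with xml_find_first(). The XML is supplied as a literal string rather than a file, and the third item without a price is a made-up case added to show the behaviour.

```r
library(xml2)

# Literal XML for illustration; the third item deliberately has no <price>
xml_input <- "<root>
  <item><name>Item 1</name><price>10.99</price></item>
  <item><name>Item 2</name><price>15.99</price></item>
  <item><name>Item 3</name></item>
</root>"

doc   <- read_xml(xml_input)
items <- xml_find_all(doc, ".//item")

# xml_find_first() returns one result per item, so a missing child becomes NA
df <- data.frame(
  Name  = xml_text(xml_find_first(items, "./name")),
  Price = as.numeric(xml_text(xml_find_first(items, "./price")))
)

print(df)
```

With this approach the data frame always has one row per item, and missing values show up as NA instead of shifting later rows.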

In R, you can read and import external data from a variety of file formats, including CSV, Excel,
JSON, XML, databases, and more. Below is a guide on how to read and get data from different
types of external files in R.
1. Reading CSV Files
CSV (Comma-Separated Values) files are one of the most common data formats.
Example:
# Base R method
data <- read.csv("path/to/your/file.csv")
# View the first few rows
head(data)
# Alternatively, use the readr package for faster performance
install.packages("readr")  # only needed once
library(readr)
data <- read_csv("path/to/your/file.csv")
# View the first few rows
head(data)
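To try the base R method without an existing file, you can write a small data frame to a temporary CSV file and read it back. This is only a sketch: the items data frame and its column names are hypothetical.

```r
# Hypothetical sample data for illustration
items <- data.frame(name  = c("Item 1", "Item 2"),
                    price = c(10.99, 15.99))

# Write to a temporary CSV file, then read it back
csv_path <- tempfile(fileext = ".csv")
write.csv(items, csv_path, row.names = FALSE)

data <- read.csv(csv_path, stringsAsFactors = FALSE)

str(data)                # column names and types
is.numeric(data$price)   # checks that price was parsed as a number
```

Setting row.names = FALSE stops write.csv() from adding an extra column of row numbers, which would otherwise appear as a column named X when the file is read back.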
2. Reading Excel Files
To read Excel files, you can use the readxl or openxlsx packages.
Example using readxl:
install.packages("readxl")
library(readxl)
# Read the first sheet of the Excel file
data <- read_excel("path/to/your/file.xlsx")
# Read a specific sheet by name or index
data <- read_excel("path/to/your/file.xlsx", sheet = "Sheet1")
# View the first few rows
head(data)
Example using openxlsx:
install.packages("openxlsx")
library(openxlsx)
# Read the first sheet of the Excel file
data <- read.xlsx("path/to/your/file.xlsx", sheet = 1)
# View the first few rows
head(data)
3. Reading JSON Files
JSON (JavaScript Object Notation) is a lightweight data-interchange format.
Example using jsonlite:
install.packages("jsonlite")
library(jsonlite)
# Read JSON data
data <- fromJSON("path/to/your/file.json")
# View the structure of the data
str(data)
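fromJSON() also accepts JSON text directly, which makes it easy to see how JSON maps to R objects. In this sketch the JSON string is made up for illustration; an array of objects with the same fields is converted to a data frame.

```r
library(jsonlite)

# Hypothetical JSON text: an array of objects with the same fields
json_text <- '[{"name": "Item 1", "price": 10.99},
               {"name": "Item 2", "price": 15.99}]'

# fromJSON() accepts a JSON string as well as a file path or URL
items <- fromJSON(json_text)

str(items)           # a data frame with columns name and price
mean(items$price)    # columns can be used like any other data frame column
```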
4. Reading XML Files
You can read XML files using either the xml2 or XML packages, as discussed above.
Example using xml2:
install.packages("xml2")
library(xml2)
# Load the XML file
xml_file <- read_xml("path/to/your/file.xml")
# Extract data from the XML
data <- xml_find_all(xml_file, ".//node_name")
values <- xml_text(data)
print(values)

5. Reading Data from Databases


To read data from databases, you can use the DBI package together with a database-specific
driver such as RMySQL, RSQLite, or odbc.
Example using DBI and RSQLite:
install.packages("DBI")
install.packages("RSQLite")
library(DBI)
library(RSQLite)
# Connect to an SQLite database
con <- dbConnect(RSQLite::SQLite(), "path/to/your/database.sqlite")
# List tables
tables <- dbListTables(con)
print(tables)
# Read a table into a data frame
data <- dbReadTable(con, "table_name")
# Close the connection
dbDisconnect(con)
# View the first few rows
head(data)
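dbReadTable() loads a whole table. When only some rows are needed, a query can be sent with dbGetQuery() instead. The sketch below uses an in-memory SQLite database so that it runs without any file; the items table and its contents are hypothetical.

```r
library(DBI)
library(RSQLite)

# In-memory database, so no file is needed
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical table for illustration
dbWriteTable(con, "items",
             data.frame(name  = c("Item 1", "Item 2"),
                        price = c(10.99, 15.99)))

# Parameterised query: the ? placeholder is filled from params
expensive <- dbGetQuery(con,
                        "SELECT name, price FROM items WHERE price > ?",
                        params = list(12))
print(expensive)

dbDisconnect(con)
```

Passing values through params rather than pasting them into the SQL string avoids quoting mistakes and SQL injection.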

6. Reading Data from Web APIs


You can fetch data from web APIs using httr and jsonlite.
Example:
install.packages("httr")
install.packages("jsonlite")
library(httr)
library(jsonlite)

# Fetch data from a web API
response <- GET("https://api.example.com/data")
# Stop with an error if the request failed
stop_for_status(response)
# Get the response body as text and parse the JSON
data <- fromJSON(content(response, as = "text", encoding = "UTF-8"))
# Convert to a data frame if needed (works when the JSON is tabular)
data_df <- as.data.frame(data)
# View the structure of the data
str(data_df)

7. Reading Text Files


For plain text files, you can use readLines() or read.table().
Example:
# Read all lines from a text file
lines <- readLines("path/to/your/file.txt")
# View the first few lines
head(lines)
# Alternatively, read a tab-separated file as a table
data <- read.table("path/to/your/file.txt", header = TRUE, sep = "\t")
# View the first few rows
head(data)

8. Reading SPSS, Stata, and SAS Files


For statistical software files, you can use the haven package.
Example:
install.packages("haven")
library(haven)
# Use the function that matches your file type
# Read SPSS file
data <- read_sav("path/to/your/file.sav")
# Read Stata file
data <- read_dta("path/to/your/file.dta")
# Read SAS file
data <- read_sas("path/to/your/file.sas7bdat")
# View the first few rows
head(data)
9. Reading HTML Tables
You can extract tables from HTML pages using the rvest package.
Example:
install.packages("rvest")
library(rvest)

# Read the HTML page
page <- read_html("https://example.com")
# Extract tables from the page
tables <- html_table(page)
# View the first table
print(tables[[1]])

Summary
• CSV: read.csv() or read_csv()
• Excel: read_excel() or read.xlsx()
• JSON: fromJSON()
• XML: read_xml() or xmlParse()
• Database: dbReadTable()
• Web APIs: GET() and fromJSON()
• Text: readLines() or read.table()
• SPSS/Stata/SAS: read_sav(), read_dta(), read_sas()
• HTML: html_table()

R is a powerful tool for creating various types of charts and visualizations. Here are some
examples of common chart types in R using the ggplot2 and base R plotting systems.

1. Basic Scatter Plot (Base R)


A scatter plot is used to display values for two variables.
# Sample data
x <- rnorm(100)
y <- rnorm(100)
# Basic scatter plot
plot(x, y, main="Scatter Plot", xlab="X Axis", ylab="Y Axis",
pch=19, col="blue")

2. Scatter Plot (ggplot2)


Using ggplot2 for more customized scatter plots.
library(ggplot2)

# Sample data
df <- data.frame(x = rnorm(100), y = rnorm(100))

# Scatter plot using ggplot2


ggplot(df, aes(x = x, y = y)) +
geom_point(color = 'blue') +
labs(title = "Scatter Plot", x = "X Axis", y = "Y Axis") +
theme_minimal()
3. Line Plot (Base R)
Line plots are useful for time series or trend data.
# Sample data
x <- 1:100
y <- cumsum(rnorm(100))

# Basic line plot


plot(x, y, type="l", main="Line Plot", xlab="Index",
ylab="Value", col="red")

4. Line Plot (ggplot2)
Creating a line plot using ggplot2.
# Sample data
df <- data.frame(x = 1:100, y = cumsum(rnorm(100)))

# Line plot using ggplot2


ggplot(df, aes(x = x, y = y)) +
geom_line(color = 'red') +
labs(title = "Line Plot", x = "Index", y = "Value") +
theme_minimal()
5. Bar Plot (Base R)
Bar plots are used for displaying categorical data.
# Sample data
categories <- c("A", "B", "C", "D")
values <- c(3, 7, 2, 5)

# Basic bar plot


barplot(values, names.arg=categories, main="Bar Plot",
col="darkgreen")
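In practice the bar heights are often not given directly but have to be counted from raw records. A minimal sketch, assuming a hypothetical vector of category labels named responses, uses table() to do the counting:

```r
# Hypothetical raw observations, one category label per record
responses <- c("A", "B", "A", "C", "B", "A", "D", "B", "B")

# table() counts how often each category occurs
counts <- table(responses)
print(counts)

# barplot() uses the table's names as the bar labels
barplot(counts,
        main = "Bar Plot of Counts",
        xlab = "Category", ylab = "Frequency",
        col = "darkgreen")
```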
6. Bar Plot (ggplot2)
Using ggplot2 to create a bar plot.
# Sample data
df <- data.frame(categories = c("A", "B", "C", "D"), values =
c(3, 7, 2, 5))

# Bar plot using ggplot2


ggplot(df, aes(x = categories, y = values)) +
geom_bar(stat="identity", fill="darkgreen") +
labs(title = "Bar Plot", x = "Category", y = "Values") +
theme_minimal()
7. Histogram (Base R)
Histograms are used to display the distribution of a continuous variable.
# Sample data
data <- rnorm(1000)

# Basic histogram
hist(data, main="Histogram", xlab="Values", col="lightblue",
border="black")
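The binning behind a histogram can be inspected without drawing anything by calling hist() with plot = FALSE. In this sketch the breaks value of 20 and the seed are arbitrary choices; R treats a single number for breaks as a suggestion and may pick a slightly different number of bins.

```r
set.seed(1)
data <- rnorm(1000)

# Compute the histogram without drawing it
h <- hist(data, breaks = 20, plot = FALSE)

h$breaks        # bin edges chosen by R
h$counts        # number of observations in each bin
sum(h$counts)   # equals length(data)
```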
8. Histogram (ggplot2)
Creating a histogram using ggplot2.
# Sample data
df <- data.frame(values = rnorm(1000))

# Histogram using ggplot2


ggplot(df, aes(x = values)) +
geom_histogram(binwidth=0.2, fill="lightblue", color="black") +
labs(title = "Histogram", x = "Values", y = "Frequency") +
theme_minimal()
9. Box Plot (Base R)
Box plots are used to show the distribution of a continuous variable and identify outliers.
# Sample data
data <- rnorm(100)

# Basic box plot
boxplot(data, main="Box Plot", ylab="Values", col="orange")
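The values that a box plot draws as separate outlier points can also be obtained as numbers with boxplot.stats(). The data below is hypothetical, with one extreme value added on purpose.

```r
# Hypothetical data with one extreme value
x <- c(1:10, 50)

stats <- boxplot.stats(x)
stats$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
stats$out     # values beyond the whiskers, drawn as individual points

boxplot(x, main = "Box Plot with an Outlier", ylab = "Values", col = "orange")
```

By default a point counts as an outlier when it lies more than 1.5 times the box length beyond a hinge, so here only the value 50 is flagged.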
10. Box Plot (ggplot2)
Creating a box plot with ggplot2.
# Sample data
df <- data.frame(values = rnorm(100))

# Box plot using ggplot2


ggplot(df, aes(y = values)) +
geom_boxplot(fill="orange") +
labs(title = "Box Plot", y = "Values") +
theme_minimal()
These examples should give you a good start with creating charts in R. You can further
customize these plots by adjusting aesthetics, adding titles and labels, and using different themes.
The pie chart, the one chart type from this unit not covered in the numbered examples above, is
shown below in both base R and ggplot2.
11. Pie Chart
Base R:
# Sample data
slices <- c(10, 20, 30, 40)
labels <- c("A", "B", "C", "D")

# Basic pie chart


pie(slices, labels=labels, main="Pie Chart",
col=rainbow(length(slices)))
ggplot2: Note that ggplot2 does not have a native pie chart function, but it can be created by
transforming a bar chart into a pie chart using coord_polar().
# Sample data
df <- data.frame(category = c("A", "B", "C", "D"), count = c(10,
20, 30, 40))

# Pie chart using ggplot2


ggplot(df, aes(x = "", y = count, fill = category)) +
  geom_bar(width = 1, stat = "identity") +
  coord_polar("y") +
  labs(title = "Pie Chart") +
  theme_void()  # Removes background, grid, and axes for a cleaner look

These examples cover basic usage of each plot type in both base R and ggplot2. You can
customize these plots further by modifying aesthetics, labels, themes, and other parameters.

V.SAI KRISHNA M.SC.,M.TECH.,(PHD) –SLNDC - ANANTAPUR Page 21
