0% found this document useful (0 votes)
26 views2 pages

Vba Exam

The document outlines a process for loading and merging stock data for IBM from two sources: a CSV file and Yahoo Finance using the quantmod package in R. It includes steps for data cleaning, checking for missing values, and saving the final merged dataset to a new CSV file. The final output is a cleaned dataset with correctly formatted date and numerical columns.

Uploaded by

Mariam mushtaq
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as TXT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
26 views2 pages

Vba Exam

The document outlines a process for loading and merging stock data for IBM from two sources: a CSV file and Yahoo Finance using the quantmod package in R. It includes steps for data cleaning, checking for missing values, and saving the final merged dataset to a new CSV file. The final output is a cleaned dataset with correctly formatted date and numerical columns.

Uploaded by

Mariam mushtaq
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as TXT, PDF, TXT or read online on Scribd

##---- PART 1 ----

##SET WORKING DIRECTORY


setwd("C:/Users/HP/Desktop")

##LOADING CSV INTO R


data <- read.csv("IBM Bloomberg Data.csv", header = TRUE, sep = ",")

##INSTALLING QUANTMOD
install.packages("quantmod")
library(quantmod)

# Define the stock symbol, start date, and end date


symbol <- "IBM"
start_date <- as.Date("2020-01-01")
end_date <- as.Date("2024-12-31")

# Download stock data using quantmod


getSymbols(symbol, src = "yahoo", from = start_date, to = end_date)

# Convert to data frame


ibm_data <- data.frame(Date = index(IBM), coredata(IBM))

# Save the data as a CSV file


write.csv(ibm_data, file = "IBM_Quantmod_Data.csv", row.names = FALSE)

##---- PART 2 ----

# Load necessary packages


install.packages("tidyverse")
install.packages("dplyr")
library(tidyverse)
library(dplyr)

# Read the CSV files


bloomberg_data <- read.csv("IBM Bloomberg Data.csv")
quantmod_data <- read.csv("IBM_Quantmod_Data.csv")

# Convert the 'Date' columns to Date type


bloomberg_data$Date <- as.Date(bloomberg_data$Date, format = "%d/%m/%Y")
quantmod_data$Date <- as.Date(quantmod_data$Date, format = "%Y-%m-%d")

# Extract the year from the 'Date' columns


bloomberg_data$Year <- format(bloomberg_data$Date, "%Y")
quantmod_data$Year <- format(quantmod_data$Date, "%Y")

# Merge the datasets by 'Year'


merged_data <- merge(bloomberg_data, quantmod_data, by = "Year", all = FALSE)

# Optionally, save the merged dataset to a new CSV file


write.csv(merged_data, "Merged_Data.csv", row.names = FALSE)

# Check for missing values in the entire dataset


missing_summary <- merged_data %>%
summarise(across(everything(), ~ sum(is.na(.))))

# Display columns with missing values, if any


missing_summary[, missing_summary > 0]
# Check if there are any missing values in the entire dataset
any(is.na(data)) # Returns TRUE if there are missing values, otherwise FALSE

# Count total missing values in the entire dataset


sum(is.na(data)) # Returns 0 if there are no missing values

## NO MISSING VALUES WERE FOUND

## ENSURING CORRECT FORMAT

# Read the CSV file


data <- read.csv("Merged_Data.csv", stringsAsFactors = FALSE)

# Check the first few rows and column names


head(data)
str(data)

# Convert date columns to Date format if they exist


if("Date.x" %in% colnames(data)) {
data$Date.x <- as.Date(data$Date.x, format = "%d/%m/%Y")
}
if("Date.y" %in% colnames(data)) {
data$Date.y <- as.Date(data$Date.y, format = "%d/%m/%Y")
}

# Convert numerical columns to numeric if they exist


num_cols <- c("PE_RATIO", "RETURN_ON_ASSET", "ESG_SCORE", "EQY_RAW_BETA",
"Closing_Price", "Volume")
num_cols <- intersect(num_cols, colnames(data)) # Ensure columns exist before
converting
data[num_cols] <- lapply(data[num_cols], as.numeric)

write.csv(merged_data, "Merged_Clean_Data.csv", row.names = FALSE)

You might also like