
Tribhuvan University
Institute of Science and Technology

A Final Year Internship Report


On
“Data Analyst”
At
Global Fast Enterprises Pvt. Ltd.

Submitted to:
Department of Computer Science and Information Technology
Himalaya College of Engineering
Chyasal, Lalitpur

In partial fulfillment of the requirements
for the Bachelor's Degree in Computer Science and Information Technology

Submitted by:
Unik Shyaula (T.U. Symbol No. 26565/077)

Date: 28th June, 2025


Acknowledgement
I would like to express my sincere gratitude to my supervisor and Head of the Department of CSIT,
Er. Himal Chand Thapa, for his guidance, advice, help, and supervision
throughout the internship period. As the internship progressed, it was thanks to his inputs
and expert comments that I was able to overcome the increasing hurdles.

I would also like to thank Er. Mahesh Kr. Yadav for having faith in me and guiding me
while I was an intern at Global Fast Enterprises Pvt. Ltd. Being a member of such a welcoming
and vibrant team was a wonderful experience for me.

I am very thankful to the entire Global Fast Enterprises team for providing the
internship opportunity with full support and collaboration.

I sincerely appreciate the assistance of all these people.

I also want to express my gratitude to the entire teaching team at the Department of
Computer Science and Information Technology for their ongoing support, encouragement,
and advice, all of which contributed to the smooth progress of my internship.

I am really appreciative of the department staff members and friends who assisted me in
completing this internship successfully.

Unik Shyaula

T.U Exam Roll No: 26565/077

Abstract
This report presents an overview of my 12-week Data Analytics internship at Global Fast
Enterprises Pvt. Ltd., where I gained hands-on experience in data preprocessing, analysis,
and visualization. Using tools such as Python (pandas, NumPy, matplotlib, seaborn), Excel,
and Power BI, I worked on real-world financial datasets to extract actionable insights. The
internship focused on key tasks like data cleaning, exploratory data analysis, linear
regression modeling, and dashboard development.

Throughout the internship, I enhanced my technical and analytical skills while learning to
communicate insights effectively within a collaborative, Agile-based work environment.
The experience helped me bridge academic learning with practical applications in the field
of business analytics. This report highlights my learning journey, major project
contributions, and the essential skills developed during this valuable professional
experience.

Keywords: Data Analytics, Python Programming, Power BI, Data Visualization, Statistical Analysis, Machine Learning, Data Cleaning, Exploratory Data Analysis (EDA)

LIST OF ABBREVIATIONS

BI Business Intelligence

CSV Comma-Separated Values

DAX Data Analysis Expressions

EDA Exploratory Data Analysis

JSON JavaScript Object Notation

KPI Key Performance Indicator

Table of Contents
Acknowledgement ................................................................................................................ i
Abstract ................................................................................................................................ii
LIST OF ABBREVIATIONS ........................................................................................... iii
LIST OF FIGURES ............................................................................................................ vi
LIST OF TABLES .............................................................................................................vii
Chapter 1: Introduction ........................................................................................................ 1
1.1 Introduction ................................................................................................................ 1
1.2 Problem Statement ..................................................................................................... 1
1.3 Objectives .................................................................................................................. 2
1.4 Report Organization ................................................................................................... 2
Chapter 2: Organization Details and Literature Review...................................................... 3
2.1 Introduction to Organization ...................................................................................... 3
2.1.1 Contact Information ............................................................................................ 3
2.2 Organizational Hierarchy ........................................................................................... 4
2.3 Working Domains of Organization ............................................................................ 5
2.4 Description of Intern Department/Unit ...................................................................... 5
2.5 Literature Review....................................................................................................... 6
Chapter 3: Internship Activities ........................................................................................... 7
3.1 Roles and Responsibilities ......................................................................................... 7
3.2 Weekly log ................................................................................................................. 7
3.3 Description of the Project(s) Involved During Internship ......................................... 9
3.4 Tasks / Activities Performed .................................................................................... 10
3.4.1 Data Collection and Import .............................................................................. 10
3.4.2 Data Cleaning and Preprocessing ..................................................................... 10
3.4.3 Feature Engineering .......................................................................................... 11
3.4.4 Exploratory Data Analysis (EDA) .................................................................... 12
3.4.5 Data Visualization ............................................................................................ 12
3.4.6 Outlier Detection and Removal ........................................................................ 15
3.4.7 Regression Modeling ........................................................................................ 16
3.4.8 Model Interpretation and Visualization ............................................................ 17
3.4.9 Modeling with Random Forest ......................................................................... 18
3.4.10 Power BI Dashboard Creation ........................................................................ 19
Chapter 4: Conclusion and Learning Outcomes ................................................................ 20
4.1 Conclusion ............................................................................................................... 20
4.2 Learning Outcome ................................................................................................... 20
References ......................................................................................................................... 21

LIST OF FIGURES
Figure 2.1: Organizational Hierarchy……………………………………………………...4
Figure 3.1: Data Collection and Import…………………………………………………..10
Figure 3.2: Column Renaming………………….…………………………………….......11
Figure 3.3: Drop and Fill N/A…………………………………………………………….11
Figure 3.4: Feature Engineering….……………………………………………………….11
Figure 3.5: Top Product Sales…………………………………………………………….12
Figure 3.6: Total Sales Per Month…………….………………………………….……….13
Figure 3.7: Heatmap Visualization……………………………………………………….13
Figure 3.8: Pair Plot of months………………..………………………………………….14
Figure 3.9: Removing Outliers………………...………………………………………….15
Figure 3.10: Box Plot after removing Outliers……………………………………………15
Figure 3.11: Linear Regression……………………...……………………………………16
Figure 3.12: Visualization Of Linear Regression…………………………………………17
Figure 3.13: Random Forest Regressor………..………………………………………….18
Figure 3.14: Visualization Of Random Forest Regression……………………………….18
Figure 3.15: Power BI Dashboard………………………………………………………...19

LIST OF TABLES
Table 2.1: Internship Details ................................................................................................ 5
Table 3.1: Weekly log .......................................................................................................... 7

Chapter 1: Introduction
1.1 Introduction
The internship provided a practical foundation in data analytics, with a strong focus on
working with real-world financial data and gaining hands-on experience in tools widely
used across the industry. It aimed to bridge the gap between theoretical knowledge and
professional application by guiding interns through end-to-end data analysis workflows.
Core tasks such as data wrangling, exploratory data analysis (EDA), visualization, and
linear regression modeling were emphasized to help participants develop problem-solving
skills relevant to business analytics.

Throughout the program, interns worked extensively with Python libraries such as pandas,
NumPy, matplotlib, and seaborn. These tools were used to clean, analyze, and visualize
monthly budget data, identify trends and outliers, and build predictive models. The
internship emphasized structured project execution—from data acquisition and
preprocessing to insight generation and report presentation—ensuring that participants not
only mastered technical concepts but also understood how to communicate their findings
effectively. This experience served as a strong introduction to real-world data analytics
projects in a corporate setting.

1.2 Problem Statement


Organizations often collect financial and sales data across multiple systems and formats,
which makes consistent, reliable analysis difficult. The internship equipped participants
with strategies to integrate these disparate data sources through standardization and
automation techniques. Interns learned how to use Python scripts to merge datasets from
different systems while maintaining data integrity.
Additionally, they practiced implementing validation checks and error-handling procedures
to ensure the reliability of processed data before analysis. These skills are crucial for
organizations dealing with complex, multi-source data environments where accuracy and
consistency directly impact decision-making.

1.3 Objectives
The objectives of this project are:

1. To clean and prepare budget data using Python and pandas.

2. To explore data patterns through visualizations and EDA.

3. To build and evaluate a linear regression model.

4. To analyze correlations and identify outliers in sales data.

1.4 Report Organization


The report is organized as follows:

Chapter 1 provides a brief description of the Data Analyst internship and the project on
which this report focuses. It gives an overview of the internship's introduction, problem
statement, and objectives.

Chapter 2 describes the organization, its hierarchy, its working domains, and the
internship department. It also presents a literature review of research related to the
internship's scope and project.

Chapter 3 describes the roles and responsibilities of the intern. It also contains the
weekly log, a description of the project involved, and the various activities performed
during the internship period.

Chapter 4 summarizes the internship and the outcomes and knowledge gained from it.

Chapter 2: Organization Details and Literature Review
2.1 Introduction to Organization
Established in 2078 B.S., Global Fast Enterprises Pvt. Ltd. is an emerging IT solutions
company driven by a team of passionate and dynamic professionals. With a strong
commitment to innovation and excellence, we focus on delivering strategic and customized
IT solutions to address a wide range of business challenges.

Currently, our team consists of 15 dedicated experts who bring a collaborative spirit and
technical expertise to every project. In just 4 years, we have expanded our footprint across
various sectors of the IT industry, offering high-quality services in mobile, web, desktop,
and network solutions. Our diverse portfolio reflects our adaptability and focus on client
satisfaction.

At Global Fast Enterprises, we specialize in Mobile Development, Software Solutions, IT
Consulting, Digital Transformation, and System Integration, empowering businesses to
stay ahead in the fast-paced digital world.

2.1.1 Contact Information

Organization: Global Fast Enterprises Pvt. Ltd.

Organization Type: Private Limited

Address: Madhyapur Thimi, Lokanthali, Bhaktapur

Telephone Number: +977-9768748616

Email: [email protected]

2.2 Organizational Hierarchy

Global Fast Enterprises Pvt. Ltd. operates with a well-defined hierarchical structure to
ensure efficient project execution and continuous innovation. At the top of the hierarchy is
the Chief Executive Officer (CEO), who provides strategic direction for the organization.
Supporting the CEO are key roles such as the Chief Technology Officer (CTO) and Project
Manager, who oversee technological advancement and project delivery respectively. The
organization is structured into several core departments led by specialized Team Leads in
Data, Mobile Development, and AI/ML.

A dedicated Quality Assurance (QA) team ensures that all applications meet high standards
of performance and reliability. Additionally, the company employs Human Resources and
Administrative staff to manage internal operations and employee welfare. Interns are also
integrated into the development teams across different departments, gaining practical
experience and contributing to real-world projects under the guidance of experienced
professionals.

Figure 2.1: Organizational Hierarchy

2.3 Working Domains of Organization

Global Fast Enterprises Pvt. Ltd. operates in multiple technology-driven domains, including
software development (mobile and web applications), data management and analytics, and
artificial intelligence/machine learning (AI/ML) solutions. The company emphasizes
quality assurance through a dedicated QA team, efficient project management, and
continuous innovation led by its CTO. Additionally, it focuses on human resources, internal
operations, and talent development by integrating interns into its core departments for
hands-on experience in real-world projects.

2.4 Description of Intern Department/Unit


I interned in the Data Analytics unit at Global Fast Enterprises Pvt. Ltd., a unit focused on
transforming raw data into business insights. My role involved data cleaning, analysis, and
dashboard development using Python and Power BI. The team followed agile workflows,
allowing me to contribute to real projects like financial and sales analytics while
collaborating with cross-functional teams. This experience provided hands-on exposure to
the end-to-end data pipeline and its impact on organizational decision-making.

Table 2.1: Internship Details


Position Data Analytics Intern
Project Supervisor Er. Himal Chand Thapa
Mentor Er. Mahesh Kr. Yadav
Start Date 17th March, 2025
End Date 16th June, 2025
Working Hours 10 AM to 6 PM
Working Days Monday to Friday
Internship Duration 3 months

2.5 Literature Review
Wickham (2014) established fundamental principles for organizing datasets through the
concept of "tidy data," which became instrumental in modern data analysis workflows.
These principles, particularly relevant for structured data formats like CSV and JSON,
were directly applied during the internship's data wrangling tasks to ensure efficient
processing and analysis. This systematic approach to data structuring proved invaluable
when working with the internship's financial and sales datasets.

McKinney (2017) revolutionized data manipulation through the development of pandas, a
Python library that became central to the internship's analytical work. His text Python for
Data Analysis provided essential methodologies for handling common data challenges,
particularly in managing missing values and outliers, which were frequent issues
encountered when processing the budget.xlsx and sales.xlsx datasets during the internship.

Healy (2018) offered practical guidance on creating effective data visualizations that
informed the internship's dashboard development. The principles outlined in this work
were implemented in both Power BI dashboards and Python-generated visualizations
(using matplotlib and seaborn), ensuring clear communication of key insights from the
analyzed datasets. These visualization techniques enhanced the presentation of KPIs such
as regional sales performance and budget utilization.

Chapter 3: Internship Activities
3.1 Roles and Responsibilities
As a Data Analytics intern at Global Fast Enterprises Pvt. Ltd., my primary responsibilities
included learning organizational workflows in data-driven decision making and mastering
the end-to-end data analysis process. I gained hands-on experience with key technologies
including Python (pandas, NumPy) and Power BI to clean, analyze, and visualize data from
various business domains. Working within an Agile framework, I participated in daily
stand-ups to report progress on tasks like data preprocessing, exploratory analysis, and
dashboard development, while collaborating with team members on projects such as sales
performance tracking and budget analysis.

A core focus of the internship was developing practical data solutions from raw data to
actionable insights. I designed interactive Power BI dashboards to communicate KPIs,
performed statistical analysis to identify trends, and documented my methodologies to
ensure reproducibility. Through projects like financial expenditure analysis and regional
sales pattern identification, I applied data wrangling techniques, created visualizations, and
presented findings to stakeholders, all while adhering to best practices in data validation
and quality assurance. This experience strengthened both my technical skills in data
analytics and my ability to work effectively in a professional team environment.

3.2 Weekly log


Table 3.1: Weekly log
Week 1
• Attended internship orientation and understood company objectives.
• Explored the provided sales dataset in Excel format.
• Loaded the dataset into Python using pandas.
• Identified the data structure, key columns, and total sales.

Week 2
• Renamed all column headers for uniformity and readability.
• Converted data types appropriately: strings for categorical data, numeric for monthly values.
• Removed duplicate entries and dropped irrelevant or fully empty columns.
• Handled missing data by filling in zeros for sales and dropping rows with missing key identifiers.

Week 3
• Applied the IQR method to detect and remove monthly outliers.
• Created boxplots (before and after cleaning) to visualize the effect of outlier removal.
• Finalized a cleaned dataset for further analysis.

Week 4
• Created a line chart showing the monthly sales trend (Jan–Dec 2016).
• Aggregated sales across all products to analyze total monthly performance.
• Developed bar plots showing total sales by category and the top 5 products.

Week 5
• Built a correlation matrix to analyze relationships between months.
• Visualized it using a heatmap and identified the strongest and weakest correlated month pairs.
• Created pair plots to examine relationships in selected months.
• Gained insights into seasonal trends and product performance variability.

Week 6
• Engineered a new feature, Total Sales (sum of monthly sales, Jan–Dec).
• Added this as a new column to the cleaned dataset.
• This served as the target variable for modeling and also supported summary analysis.

Week 7
• Built a basic linear regression model to predict Total Sales using limited monthly data.
• Evaluated performance.
• Identified that a simple model with few features had low predictive power due to insufficient training data.

Week 8
• Used all 12 months (Jan–Dec 2016) as input features to predict Total Sales.
• Trained a Random Forest Regressor, which significantly improved accuracy.
• Visualized actual vs. predicted total sales to evaluate the model's predictive performance.

Week 9
• Filtered and exported cleaned data for dashboarding in Power BI.
• Defined key metrics and charts to include in the dashboard.
• Structured data to allow slicing by Category, Subcategory, and Month.

Week 10
• Developed visuals in Power BI:
o Line chart: Monthly Sales Trend
o Bar charts: Sales by Category/Subcategory
o Cards: Highest, Lowest, and Average Sales
• Implemented filters and slicers for interactivity (Category, Month, etc.).

Week 11
• Removed the Grand Total from visuals to ensure clean chart outputs.
• Enhanced visuals by adjusting axis titles, fonts, labels, and layout.
• Tested filter combinations (multi-select) and interactions between visuals.

Week 12
• Compiled a final internship report summarizing:
o Data cleaning steps
o Visual analysis in Python and Power BI
o Predictive modeling (Random Forest)
o Key insights and business recommendations
• Presented the dashboard and findings to the mentor/supervisor.

3.3 Description of the Project(s) Involved During Internship


During my internship, I worked on a project titled “Budget Data Analysis and Forecasting
Using Python” under the Mathematical Modeling department. The objective was to clean,
explore, and analyze a real-world budget dataset containing monthly sales figures for
various product categories. I began by importing the data from an Excel file and conducted
data wrangling tasks such as renaming columns, converting data types, and handling
missing values. I created a new Total Sales column by aggregating monthly sales and used
summary statistics and visualizations to understand the distribution and relationships within
the data. Outliers were detected and removed using statistical techniques like the
Interquartile Range (IQR) method.

As the project progressed, I applied exploratory data analysis (EDA) techniques and
developed a simple linear regression model to predict total sales based on individual
monthly sales inputs. I split the data into training and testing sets, evaluated the model using
MAE and RMSE, and visualized predictions against actual values. Throughout the project,

I used Python tools including pandas, matplotlib, seaborn, and scikit-learn within a Jupyter
Notebook environment. The project helped me gain hands-on experience in data
preprocessing, visualization, statistical modeling, and interpretation—demonstrating how
mathematical and analytical techniques can be applied to business datasets for insight and
forecasting.

3.4 Tasks / Activities Performed

3.4.1 Data Collection and Import


The first step in the project involved collecting and importing the dataset into the Python
environment. The dataset was provided in Excel format and contained monthly sales data
for various product categories. Using the pandas library, the Excel file was read into a
DataFrame, allowing for structured access and manipulation of the data. During the import
process, the data was explored to understand its initial structure, which revealed that several
columns were unnamed or not formatted properly.

Figure 3.1: Data Collection and Import
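The import step can be sketched as below. The actual workbook name and column layout are not given in the report, so a small synthetic DataFrame (with assumed names) stands in for the Excel file to keep the inspection steps reproducible.

```python
import pandas as pd

# In the internship the data came from an Excel workbook, e.g.:
#   df = pd.read_excel("budget.xlsx")
# The frame below is a synthetic stand-in (names assumed) so the
# inspection steps can be reproduced without the original file.
df = pd.DataFrame({
    "Unnamed: 0": ["Chairs", "Desks", "Lamps"],
    "Unnamed: 1": ["Furniture", "Furniture", "Lighting"],
    "Jan, 2016": ["1200", "2500", None],
})

# First look at the structure: shape, headers, and dtypes.
print(df.shape)             # number of rows and columns
print(df.columns.tolist())  # default "Unnamed: N" headers need renaming
print(df.dtypes)            # sales arrive as object (string) type
```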

3.4.2 Data Cleaning and Preprocessing


Once the dataset was imported, a comprehensive data cleaning process was conducted. This
included renaming ambiguous or default column headers such as “Unnamed: 1” to
meaningful names like "Category" or month labels such as "Jan, 2016". Data types were
converted appropriately—for example, sales data was converted from object or string types
to numeric types for accurate calculations. Missing values were addressed either by
imputing zeros (where appropriate) or by removing rows or columns with excessive null
entries.

Figure 3.2: Column Renaming

Figure 3.3: Drop and Fill N/A
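A minimal sketch of the cleaning steps above, on a synthetic stand-in frame (the real headers and values are assumptions):

```python
import pandas as pd

# Synthetic stand-in for the imported sheet; real headers and values differ.
df = pd.DataFrame({
    "Unnamed: 0": ["Chairs", "Desks", None],
    "Unnamed: 1": ["Furniture", "Furniture", "Lighting"],
    "Unnamed: 2": ["1200", "2500", None],
})

# Rename default headers to meaningful names.
df = df.rename(columns={
    "Unnamed: 0": "Product",
    "Unnamed: 1": "Category",
    "Unnamed: 2": "Jan, 2016",
})

# Convert sales figures from strings to numeric values.
df["Jan, 2016"] = pd.to_numeric(df["Jan, 2016"], errors="coerce")

# Impute zeros for missing sales, then drop rows missing the key identifier.
df["Jan, 2016"] = df["Jan, 2016"].fillna(0)
df = df.dropna(subset=["Product"]).reset_index(drop=True)

print(df)
```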

3.4.3 Feature Engineering


After cleaning, the dataset was enhanced by introducing new features. One important feature
was the “Total Sales” column, which was created by summing the monthly sales values
across each row. This new feature allowed for more effective grouping, sorting, and
modeling later in the analysis. Additional lists, such as month_columns, were also defined
in the code to manage operations involving multiple months simultaneously and streamline
the analysis process.

Figure 3.4: Feature Engineering
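The Total Sales feature described above amounts to a row-wise sum over the month columns. The sketch below uses a hypothetical three-month slice; the internship used the full Jan–Dec range.

```python
import pandas as pd

# Hypothetical three-month slice; the real month_columns list spanned Jan-Dec.
month_columns = ["Jan, 2016", "Feb, 2016", "Mar, 2016"]

df = pd.DataFrame({
    "Product": ["Chairs", "Desks"],
    "Jan, 2016": [100, 200],
    "Feb, 2016": [150, 180],
    "Mar, 2016": [120, 220],
})

# Row-wise sum of the monthly columns becomes the "Total Sales" target.
df["Total Sales"] = df[month_columns].sum(axis=1)
print(df[["Product", "Total Sales"]])
```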

3.4.4 Exploratory Data Analysis (EDA)

Exploratory Data Analysis was a crucial phase, where the focus was on understanding
patterns and distributions within the dataset. Using descriptive statistics and data
visualization tools, the analysis uncovered trends such as seasonality in sales, high-performing
categories, and underperforming products. pandas functions like .describe() and
.groupby() provided quantitative insights, while visualizations gave a clearer view of how
the data was spread across different dimensions.
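The two pandas calls mentioned above can be illustrated on a small frame with assumed values:

```python
import pandas as pd

# Illustrative category/sales frame (values assumed for the example).
df = pd.DataFrame({
    "Category": ["Furniture", "Furniture", "Lighting", "Lighting"],
    "Total Sales": [370, 600, 150, 90],
})

# Summary statistics (count, mean, quartiles, etc.) of the target column.
print(df["Total Sales"].describe())

# Aggregate sales per category to spot high and low performers.
by_cat = df.groupby("Category")["Total Sales"].sum().sort_values(ascending=False)
print(by_cat)
```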

3.4.5 Data Visualization


A wide range of data visualization techniques was employed using matplotlib and seaborn.
Bar plots were created to compare total sales across products and categories. Boxplots and
violin plots were used to identify the presence of outliers in monthly sales. Correlation
heatmaps were used to assess the relationships between different months, while pair plots
helped to explore potential multivariate relationships.

Figure 3.5: Top Product Sales

Figure 3.6: Total Sales Per Month

Figure 3.7: Heatmap Visualization

Figure 3.8: Pair Plot of months
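A minimal matplotlib sketch of the product-sales bar chart (the seaborn calls used in the internship, such as sns.barplot, sns.heatmap, and sns.pairplot, follow the same pattern of preparing a frame and rendering it). Data values and the output file name are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display required
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative totals (values assumed).
df = pd.DataFrame({
    "Product": ["Chairs", "Desks", "Lamps"],
    "Total Sales": [370, 600, 150],
})

# Bar plot of total sales per product, sorted descending.
top = df.sort_values("Total Sales", ascending=False)
fig, ax = plt.subplots()
ax.bar(top["Product"], top["Total Sales"])
ax.set_title("Top Product Sales")
ax.set_ylabel("Sales")
fig.savefig("top_products.png")
plt.close(fig)
```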

3.4.6 Outlier Detection and Removal
Outliers were detected using the Interquartile Range (IQR) method. This statistical approach
involves calculating the first and third quartiles and identifying any data points that fall
significantly outside this range. The rows containing such extreme values were removed to
improve the robustness and accuracy of the analysis. This step was essential before moving
to modeling, as outliers can skew results and reduce the predictive power of regression
models.

Figure 3.9: Removing Outliers

Figure 3.10: Box Plot after removing Outliers
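The IQR fences described above can be sketched as follows, on a toy series with one deliberately extreme value (all numbers assumed):

```python
import pandas as pd

# Toy monthly sales series with one extreme value.
sales = pd.Series([100, 110, 105, 98, 102, 950])

# 1.5 * IQR fences, as used during the internship's cleaning step.
q1, q3 = sales.quantile(0.25), sales.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Keep only the observations inside the fences.
cleaned = sales[(sales >= lower) & (sales <= upper)]
print(cleaned.tolist())  # the 950 outlier is removed
```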

3.4.7 Regression Modeling
The cleaned dataset was used to build a multiple linear regression model aimed at predicting
Total Sales using all monthly sales values (January to December) as input features. The
scikit-learn library was used to split the dataset into training and testing sets. The
LinearRegression model was then fitted on the training data. After training, predictions were
made on the test set. The model’s performance was evaluated using metrics such as Mean
Absolute Error (MAE), Root Mean Squared Error (RMSE), and R² Score, which helped
assess how accurately the model predicted the total sales.

Figure 3.11: Linear Regression
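The train/test/evaluate workflow above can be sketched with scikit-learn on synthetic data (the real dataset is not reproduced in the report, so shapes and values here are assumptions; since Total Sales is the row sum of the months, the relation is exactly linear):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic monthly sales: 200 products x 12 months.
rng = np.random.default_rng(42)
X = rng.uniform(0, 1000, size=(200, 12))
y = X.sum(axis=1)  # Total Sales is the row sum, so the relation is linear

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)

# Evaluate with the same metrics used in the internship.
mae = mean_absolute_error(y_test, pred)
rmse = mean_squared_error(y_test, pred) ** 0.5  # root of the MSE
r2 = r2_score(y_test, pred)
print(f"MAE={mae:.4f}  RMSE={rmse:.4f}  R2={r2:.4f}")
```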

3.4.8 Model Interpretation and Visualization

To interpret the model’s performance, the actual versus predicted total sales were visualized
using a line plot. This comparison helped in identifying how closely the predictions matched
the actual values across test samples. Additionally, the model’s regression coefficients and
intercept were analyzed to understand the contribution of each month’s sales towards the
overall annual total. This analysis revealed the relative importance of different months in
influencing total product sales over the year.

Figure 3.12: Visualization Of Linear Regression
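The interpretation step above can be sketched as below: print the fitted coefficients and intercept, then plot actual against predicted totals. Data shapes are synthetic assumptions; with a pure row-sum target, each month's coefficient comes out close to 1.0.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression

# Small synthetic example: 50 products, 3 "months" for brevity.
rng = np.random.default_rng(1)
X = rng.uniform(0, 100, size=(50, 3))
y = X.sum(axis=1)

model = LinearRegression().fit(X, y)
pred = model.predict(X)

# Each coefficient is that month's contribution to the annual total.
print("coefficients:", np.round(model.coef_, 3))
print("intercept:", round(model.intercept_, 3))

# Actual vs predicted totals across samples, in the spirit of Figure 3.12.
fig, ax = plt.subplots()
ax.plot(y, label="Actual")
ax.plot(pred, linestyle="--", label="Predicted")
ax.set_title("Actual vs Predicted Total Sales")
ax.legend()
fig.savefig("lr_actual_vs_pred.png")
plt.close(fig)
```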

3.4.9 Modeling with Random Forest

To explore a more complex model, a Random Forest Regressor was implemented using all
monthly sales as input features. The model achieved an R² score of 0.89, showing that
total sales could be predicted effectively from monthly data, although it did not surpass
the linear model on this dataset. Feature importance also showed that certain months had
a stronger influence on total sales than others.

Figure 3.13: Random Forest Regressor

Figure 3.14: Visualization Of Random Forest Regression

In this analysis, Linear Regression proved to be the better-performing model. It
demonstrated significantly lower prediction error, as reflected in both the Mean Absolute
Error (MAE) and Root Mean Squared Error (RMSE), compared to the Random Forest
model. Additionally, Linear Regression achieved a very high R² score of 0.9998, indicating
that it was able to explain 99.98% of the variance in total sales—almost a perfect fit. On the
other hand, although Random Forest Regression is generally a more complex and powerful
model, it underperformed in this case. This is likely because the relationship between
monthly sales and total sales in the dataset is highly linear, which aligns well with the
assumptions of Linear Regression. Moreover, Random Forest may have struggled to
generalize effectively, possibly due to the limited sample size or the structure of the data,
leading to overfitting or suboptimal predictions.
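This comparison can be reproduced in miniature: on data where the target is an exact linear combination of the inputs, a Random Forest (which predicts by averaging training leaves and cannot extrapolate a linear trend) trails Linear Regression. Shapes and values here are synthetic assumptions, not the internship's dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
X = rng.uniform(0, 1000, size=(200, 12))  # synthetic 12-month inputs
y = X.sum(axis=1)                         # perfectly linear target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

lr = LinearRegression().fit(X_train, y_train)
rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

r2_lr = r2_score(y_test, lr.predict(X_test))
r2_rf = r2_score(y_test, rf.predict(X_test))
print(f"Linear Regression R2: {r2_lr:.4f}")
print(f"Random Forest R2:     {r2_rf:.4f}")

# feature_importances_ ranks which months drive Total Sales most.
print("importances sum:", round(rf.feature_importances_.sum(), 3))
```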

3.4.10 Power BI Dashboard Creation


The cleaned dataset was imported into Power BI, where interactive dashboards were
developed to effectively visualize key aspects of the sales data. These included monthly
sales trends, category and subcategory performance, and key performance indicators (KPIs)
using card visuals. To enhance user experience and data exploration, slicers and filters were
added, allowing users to interact with the visuals dynamically. This made it easier to drill
down into specific time periods, categories, or products.

Figure 3.15: PowerBI Dashboard

Chapter 4: Conclusion and Learning Outcomes
4.1 Conclusion
During this internship, I gained hands-on experience in real-world data analysis, modeling,
and visualization using Python. By working with a raw and unstructured budget dataset, I
learned how to handle typical data quality issues such as missing values, incorrect data
types, and ambiguous column names. I successfully performed data cleaning and
transformation, followed by exploratory data analysis (EDA) to extract meaningful patterns,
trends, and insights from the data. Visualizing the dataset using libraries like matplotlib and
seaborn helped in better understanding sales behavior across months, products, and
categories.

In addition to data analysis, I also developed a basic understanding of statistical modeling
through the implementation of a simple linear regression model. This process taught me
how to split datasets, train models, make predictions, and evaluate model performance using
MAE and RMSE metrics. I also learned how to interpret model results and visually compare
predicted outcomes with actual data. Overall, this internship strengthened my skills in
Python programming, data manipulation with pandas, and data-driven decision-making
through mathematical modeling. It gave me valuable exposure to how mathematical and
computational techniques can be applied in business and analytics scenarios.

4.2 Learning Outcome


Various learning outcomes were attained over the internship period. Some of them are listed
below:

⚫ Gained hands-on experience in data cleaning and preprocessing using Python (pandas).

⚫ Learned to handle missing values, incorrect data types, and outliers effectively.

⚫ Developed skills in exploratory data analysis (EDA) and feature engineering.

⚫ Created various data visualizations using matplotlib and seaborn.

⚫ Understood and implemented simple linear regression models using scikit-learn.

⚫ Improved ability to interpret statistical results and visualize model predictions.

References

Healy, K. (2018). Data visualization: A practical introduction. Princeton University Press. https://socviz.co/

McKinney, W. (2017). Python for data analysis (2nd ed.). O'Reilly Media. https://wesmckinney.com/book/

Wickham, H. (2014). Tidy data. Journal of Statistical Software, 59(10), 1-23. https://doi.org/10.18637/jss.v059.i10
