
GUJARAT TECHNOLOGICAL UNIVERSITY

INSTITUTE OF TECHNOLOGY AND RESEARCH (ITR), MEHSANA

Academic year
(2025-2026)
INTERNSHIP REPORT UNDER SUBJECT OF
SUMMER INTERNSHIP (3170001)
B.E. SEMESTER-VII
In partial fulfillment for the award of the degree of
BACHELOR OF ENGINEERING
in
COMPUTER ENGINEERING
Submitted by: Gupta Aryan P.

BrainyBeam Info Tech Pvt. Ltd.


GTU-INSTITUTE OF TECHNOLOGY AND RESEARCH (ITR)
Near Mevad Toll Booth, Ahmedabad-Mehsana Express Highway,
Mehsana - 384460, Gujarat, India.

CERTIFICATE

This is to certify that the internship report, based on the internship carried out at
BrainyBeam Info Tech Pvt. Ltd., has been carried out by Mr. Gupta Aryan P. under
the guidance of Mr. Sagar Jasani, in partial fulfillment of the requirements for the
degree of Bachelor of Engineering in Computer Engineering, 7th Semester, Gujarat
Technological University, Ahmedabad, during the academic year 2025–2026.

Prof. Mayur Chauhan                  Prof. Avani Raval
Internal Guide                       Head of the Department

Mr. Sagar Khatri
External Guide
Company Profile

Company Name: BrainyBeam Info Tech Pvt. Ltd.

Address: F6, Dhanlaxmi Chamber, Near Sarvoday Co-op Bank, Ashram Road,
Ahmedabad- 380014

Contact No: 9033237336

Email Id: [email protected]

Website: www.brainybeam.com

About Us:

BrainyBeam Info Tech Pvt. Ltd. is a technology company dedicated to empowering
businesses with innovative solutions in data analytics, artificial intelligence, and machine
learning. With a mission to drive actionable insights and enhance decision-making
processes, BrainyBeam harnesses the power of advanced technologies to solve complex
challenges.

Vision:

Our vision at BrainyBeam is to revolutionize the way businesses leverage data to achieve
their goals. We strive to be at the forefront of technological advancements, delivering
impactful solutions that drive growth, efficiency, and competitive advantage for our
clients.

Mission:

At BrainyBeam Info Tech Pvt. Ltd., our mission is to empower organizations with data-
driven intelligence that enables them to unlock new opportunities, optimize processes,
and stay ahead in today's dynamic market landscape. We are committed to delivering
tailored solutions that address the unique needs and objectives of each client.
COMPLETION CERTIFICATE
ACKNOWLEDGEMENT

First, I would like to thank Mr. Sagar Jasani, Project Manager of
BrainyBeam Info Tech Pvt. Ltd., for giving me the opportunity to do an
internship within the organization.

I would also like to thank all the people who worked along with me in the
organization for their patience and openness, which created an enjoyable
working environment.

I am highly grateful to the Principal, Dr. Chirag Vibhakar, for the facilities
provided to accomplish this internship.

I would like to thank my Head of the Department, Prof. Avani Raval, for the
constructive criticism throughout my internship.

I would like to thank Prof. Mayur Chauhan, my internship guide in the
Department of CSE, for their support and advice in completing the internship
at BrainyBeam Info Tech Pvt. Ltd.

It is indeed with a great sense of pleasure and an immense sense of gratitude
that I acknowledge the help of these individuals.

I am extremely grateful to my department staff members and friends who
helped me in the successful completion of this internship.
ABSTRACT
This project involves analyzing air quality data across different states and years to
understand trends and make predictions about air quality index (AQI) levels. The
analysis includes calculating sub-indices for various pollutants such as SO2, NO2, RSPM,
SPM, and PM2.5, and then aggregating these sub-indices to compute the overall AQI.
Visualizations, including heatmaps, are used to illustrate the data, and a machine
learning model is implemented to predict AQI based on the pollutant levels. The
project aims to provide actionable insights for environmental monitoring and
policymaking.

Key-Words:
 Air Quality Index (AQI)
 Pollutants
 Linear Regression
 Environmental Analysis
 Machine Learning
 Predictive Modeling
 Data Visualization

Modules:

1. Data Collection and Preprocessing: Handling missing values, feature engineering, data normalization.

2. Pollutant Sub-index Calculation: Functions to calculate SOi, NOi, RSPMi, SPMi, and PMi based on pollutant concentrations.

3. AQI Calculation: Aggregating sub-indices to calculate the overall AQI; categorizing AQI into ranges (Good, Moderate, Poor, etc.).

4. Visualization: Heatmaps to display AQI and pollutant levels by state and year; a correlation heatmap to show relationships between pollutants and AQI.

5. Machine Learning Model: Splitting data into training and testing sets, training a Linear Regression model to predict AQI, and evaluating model performance using R² and MSE (a minimal sketch follows this list).
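Since the project code itself is not reproduced in this report, the following is a minimal sketch of the sub-index-to-AQI pipeline described in modules 2, 3, and 5. The data, breakpoints, and sub-index formulas are simplified placeholders, not the project's exact values.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Hypothetical pollutant readings; the real project loads them from a CSV.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "so2": rng.uniform(5, 100, n),
    "no2": rng.uniform(10, 120, n),
    "rspm": rng.uniform(40, 250, n),
    "spm": rng.uniform(80, 400, n),
})

def so2_subindex(x):
    # Simplified piecewise sub-index; the real CPCB breakpoints differ.
    return x * 50 / 40 if x <= 40 else 50 + (x - 40) * 50 / 40

df["SOi"] = df["so2"].apply(so2_subindex)
# Placeholder linear sub-indices for the other pollutants (illustrative only).
df["NOi"] = df["no2"] * 1.25
df["RSPMi"] = df["rspm"]
df["SPMi"] = df["spm"] * 0.5

# Overall AQI taken here as the maximum of the sub-indices (CPCB convention).
df["AQI"] = df[["SOi", "NOi", "RSPMi", "SPMi"]].max(axis=1)

# Train/test split, Linear Regression, and evaluation with R² and MSE.
X = df[["SOi", "NOi", "RSPMi", "SPMi"]]
y = df["AQI"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)
print("R2:", r2_score(y_test, pred), "MSE:", mean_squared_error(y_test, pred))
```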
Technology:

Front-End: Matplotlib, Seaborn
Back-End: Python, Pandas, NumPy, Scikit-learn

DAY – 1 : BASIC INTRODUCTION AND DOMAIN KNOWLEDGE

Introduction to the Field:

An introduction to Data Science & Machine Learning, covering some important tools and concepts: Pandas, NumPy, Matplotlib, Seaborn, feature extraction, and algorithms.

What is Data Science and Machine Learning?

Data science is a field that studies data and how to extract meaning from it, whereas machine learning is a field devoted to understanding and building methods that utilize data to improve performance or inform predictions. Machine learning is a branch of artificial intelligence.

Applications of Data Science and Machine Learning:

 Android phones
 Android tablets
 Smart watches
 Home appliances
 Healthcare

Download and Install Anaconda:

Step 1: Go to https://www.anaconda.com/ in your web browser.

Step 2: Download the installer according to your OS.

Step 3: Double-click the downloaded "Anaconda3-2023.07-2-Windows-x86_64.exe" file.

Step 4: Click the "Next" and "Agree" buttons to proceed and let the setup complete. At last, click the Finish button.

Step 5: After the installation finishes, open Anaconda Navigator.

Step 6: Click on Jupyter Notebook to create and run new code.
DAY – 2 : Generate a 2D array using NumPy and then scale it by multiplying with a scalar.

Creating Arrays:

NumPy arrays are the core data structure for numerical computing. You can create arrays of
various dimensions using NumPy. The snippet below creates a 1D and a 2D array:
"Scaling a 2D Array with NumPy":

This code snippet first generates a random 2D array with dimensions 3x4 using NumPy's
random.rand() function. Then, it defines a scalar value (in this case, 5) to be used for scaling.
Finally, it scales the generated 2D array by multiplying each element with the scalar. The
original and scaled arrays are then printed to observe the transformation. This process effectively
scales each element in the array by a factor of the scalar value.
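A sketch consistent with that description (the variable names are assumptions):

```python
import numpy as np

# Generate a random 3x4 array of values in [0, 1).
arr = np.random.rand(3, 4)

scalar = 5  # scaling factor from the description

# Element-wise multiplication scales every entry by the scalar.
scaled = arr * scalar

print("Original:\n", arr)
print("Scaled:\n", scaled)
```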
DAY – 3 : Use NumPy arrays to create a priority
queue, optimizing element insertion and removal
with math functions.

Array Operations:

NumPy provides various mathematical operations that can be performed on arrays, such as
element-wise addition, subtraction, multiplication, and division. Here's an example of
performing arithmetic operations on arrays.
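For example, element-wise arithmetic on two small arrays (values are illustrative):

```python
import numpy as np

a = np.array([10, 20, 30, 40])
b = np.array([1, 2, 3, 4])

print(a + b)  # element-wise addition:       [11 22 33 44]
print(a - b)  # element-wise subtraction:    [ 9 18 27 36]
print(a * b)  # element-wise multiplication: [ 10  40  90 160]
print(a / b)  # element-wise division:       [10. 10. 10. 10.]
```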
Priority Queue:

This code defines a PriorityQueue class using NumPy arrays for efficient priority queue
operations. The insert() method appends elements and sorts the array, while remove() retrieves
and removes the highest priority element. It demonstrates inserting and removing elements from
the priority queue.
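The class definition itself is not reproduced here; the following is a sketch consistent with the description, assuming "highest priority" means the largest value:

```python
import numpy as np

class PriorityQueue:
    """Priority queue backed by a sorted NumPy array (illustrative sketch)."""

    def __init__(self):
        self.data = np.array([], dtype=float)

    def insert(self, value):
        # Append the new element, then keep the array sorted ascending.
        self.data = np.sort(np.append(self.data, value))

    def remove(self):
        # Highest-priority element = largest value (last after sorting).
        if self.data.size == 0:
            raise IndexError("remove from empty priority queue")
        value = self.data[-1]
        self.data = self.data[:-1]
        return value

pq = PriorityQueue()
for v in (3, 1, 4, 1, 5):
    pq.insert(v)
print(pq.remove())  # 5.0
print(pq.remove())  # 4.0
```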
DAY – 4 : Mastering Array Indexing and
Reshaping Techniques.

Array Indexing and Slicing:

NumPy allows you to access specific elements, rows, or columns of an array using indexing and
slicing. Here's an example of indexing and slicing a NumPy array:
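A short illustrative example:

```python
import numpy as np

arr = np.array([[10, 20, 30],
                [40, 50, 60],
                [70, 80, 90]])

print(arr[0, 2])     # single element: 30
print(arr[1])        # second row: [40 50 60]
print(arr[:, 1])     # second column: [20 50 80]
print(arr[0:2, 1:])  # sub-array: rows 0-1, columns 1-2
```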


Array Shape and Reshaping:

Array reshaping is a fundamental operation in NumPy that allows you to change the shape
(dimensions) of an array without changing its data.
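For example:

```python
import numpy as np

arr = np.arange(12)     # [0, 1, ..., 11]
m = arr.reshape(3, 4)   # same data viewed as 3 rows x 4 columns
print(m.shape)          # (3, 4)
print(m.reshape(2, 6))  # reshaped again to 2x6
print(m.flatten())      # back to 1D
```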
Pandas Dataframe Creation and Data Cleaning

Dataframe Creation:

This Python code utilizes pandas to create a DataFrame with sample data representing
individuals' names, ages, genders, and salaries. It then prints the first few rows of the
DataFrame and provides information about its structure and data types.
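A sketch with made-up sample data (the names and values are placeholders):

```python
import pandas as pd

# Hypothetical sample data; the original values are not shown in the report.
df = pd.DataFrame({
    "Name": ["Asha", "Bharat", "Chirag", "Divya"],
    "Age": [24, 31, 45, 29],
    "Gender": ["F", "M", "M", "F"],
    "Salary": [35000, 52000, 78000, 41000],
})

print(df.head())  # first few rows
df.info()         # structure, column dtypes, non-null counts
```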

Data Cleaning and Preprocessing:

This code manipulates a pandas DataFrame: a missing value in the "Age" column is handled, duplicate rows are removed, categorical "Gender" values are converted to numerical codes, the "Age" feature is normalized between 0 and 1, and the transformed "Gender" and normalized "Age" columns are printed.
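A sketch of those steps on hypothetical data (mean imputation for the missing value is an assumption):

```python
import pandas as pd

# Hypothetical data with a missing "Age" value and a duplicate row.
df = pd.DataFrame({
    "Name": ["Asha", "Bharat", "Bharat", "Divya"],
    "Age": [24.0, 31.0, 31.0, None],
    "Gender": ["F", "M", "M", "F"],
})

# Fill the missing "Age" with the column mean (imputation strategy assumed).
df["Age"] = df["Age"].fillna(df["Age"].mean())

# Remove duplicate rows.
df = df.drop_duplicates()

# Convert categorical "Gender" values to numerical codes.
df["Gender"] = df["Gender"].map({"F": 0, "M": 1})

# Min-max normalize "Age" to the range [0, 1].
df["Age"] = (df["Age"] - df["Age"].min()) / (df["Age"].max() - df["Age"].min())

print(df[["Gender", "Age"]])
```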
DAY – 5 : Applying Data Manipulation and Custom Functions.

Data Manipulation and Transformation:

The code snippet performs several DataFrame operations: it filters rows where "Age" is greater
than 30, computes a new column by squaring "Age", calculates the mean age by gender, and merges
two DataFrames based on "Name".
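A sketch of those four operations on hypothetical data (column and variable names are assumptions except where the text names them):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Asha", "Bharat", "Chirag", "Divya"],
    "Age": [24, 31, 45, 29],
    "Gender": ["F", "M", "M", "F"],
})

# Filter rows where "Age" is greater than 30.
older = df[df["Age"] > 30]

# New column computed by squaring "Age" (column name is an assumption).
df["Age_squared"] = df["Age"] ** 2

# Mean age by gender.
mean_age_by_gender = df.groupby("Gender")["Age"].mean()

# Merge with a second hypothetical DataFrame on "Name".
cities = pd.DataFrame({"Name": ["Asha", "Divya"], "City": ["Ahmedabad", "Mehsana"]})
merged = df.merge(cities, on="Name")

print(older, mean_age_by_gender, merged, sep="\n\n")
```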

Applying Custom Function:

The provided code defines a function salary_increase to increase salaries by 10%, applies this
function to create a new column 'New_Salary' in the DataFrame 'df', reshapes the DataFrame
using melt function to stack 'Age' and 'Salary' columns, drops duplicate rows from the
DataFrame, and resets the index.
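A sketch following that description (the DataFrame contents are placeholders; salary_increase, 'New_Salary', and the melt/drop/reset steps come from the text):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Asha", "Bharat"],
    "Age": [24, 31],
    "Salary": [35000.0, 52000.0],
})

def salary_increase(s):
    # Increase a salary by 10%.
    return s * 1.10

# Apply the custom function to create a new column.
df["New_Salary"] = df["Salary"].apply(salary_increase)

# Reshape: stack the "Age" and "Salary" columns into long format.
long_df = df.melt(id_vars="Name", value_vars=["Age", "Salary"])

# Drop duplicate rows and reset the index.
long_df = long_df.drop_duplicates().reset_index(drop=True)
print(long_df)
```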
DAY – 6 : Loading data and Data Preprocessing.

Loading CSV File:

The pd.read_csv() function is used to read the CSV file 'cyberbullying_tweets.csv' and load it
into a DataFrame called data. data.head() is used to display the first five rows of the
DataFrame, providing a glimpse of the data's structure and content. This helps in understanding
the columns and the kind of information present in the dataset.
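A sketch of the loading step, assuming 'cyberbullying_tweets.csv' is present in the working directory:

```python
import pandas as pd

# Load the dataset; the filename comes from the report's description.
data = pd.read_csv("cyberbullying_tweets.csv")

# Show the first five rows to inspect the columns and content.
print(data.head())
```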

Data Preprocessing:

This Python function `strip_all_entities` takes text input and preprocesses it by removing special
characters, URLs, non-ASCII characters, repeated characters, digits, and punctuation. It also
removes stopwords and returns the processed text. This function is useful for text cleaning and
normalization tasks, often used in natural language processing applications.
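The original function is not reproduced here; the following sketch applies the cleaning steps listed above (the inline stopword list is a stand-in for whatever list the original used, likely NLTK's):

```python
import re
import string

# Tiny stopword list for illustration; the original likely used NLTK's list.
STOPWORDS = {"a", "an", "the", "is", "are", "and", "or", "to", "of", "in"}

def strip_all_entities(text):
    """Clean and normalize raw tweet text (sketch of the described function)."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove URLs
    text = text.encode("ascii", "ignore").decode()      # drop non-ASCII chars
    text = re.sub(r"(.)\1{2,}", r"\1", text)            # collapse repeated chars
    text = re.sub(r"\d+", " ", text)                    # remove digits
    text = text.translate(str.maketrans("", "", string.punctuation))
    words = [w for w in text.split() if w not in STOPWORDS]
    return " ".join(words)

print(strip_all_entities("Check https://t.co/xyz!!! Sooo great 123 :)"))
```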
DAY – 7 : Filtering and Aggregating Data.

Filtered data:

In Task 1, we create a DataFrame df containing sample data with columns 'Name', 'Age', and
'Salary'. Using Pandas' filtering capabilities, we select rows where the 'Age' column is less than
40, resulting in a new DataFrame filtered_data containing individuals under 40 years old.

Aggregating data:

In Task 2, we calculate the average salary for individuals over 30 years old. First, we use boolean
indexing to filter the DataFrame df to include only rows where the 'Age' column is greater than
30. Then, we access the 'Salary' column and apply the mean() function to calculate the average
salary over 30, which is stored in the variable average_salary_over_30.
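A sketch of both tasks on hypothetical data (filtered_data and average_salary_over_30 are the variable names given in the text):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Asha", "Bharat", "Chirag", "Divya"],
    "Age": [24, 35, 45, 29],
    "Salary": [35000, 52000, 78000, 41000],
})

# Task 1: rows where "Age" is less than 40.
filtered_data = df[df["Age"] < 40]

# Task 2: average salary for individuals over 30 years old.
average_salary_over_30 = df[df["Age"] > 30]["Salary"].mean()

print(filtered_data)
print("Average salary (Age > 30):", average_salary_over_30)
```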
DAY 8 : Utilize Matplotlib to develop interactive
data visualizations.

Sine wave graph:

For the sine wave graph, we generate 1000 points between 0 and 50 on the x-axis and calculate
the corresponding sine values. Then, we plot these points using plt.plot(), showcasing the
sinusoidal pattern over a larger range of x-values, providing a comprehensive view of the sine
wave behavior.

Line graph:
For the line graph with markers, we create an array of x-values from 0 to 9 and generate random
y-values. We plot these points with markers ('o') using plt.plot(), demonstrating discrete data
points along the line plot. This visual representation emphasizes specific data points and
highlights the variability in the y-values across the limited x-axis range.
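A sketch of both plots as described:

```python
import matplotlib.pyplot as plt
import numpy as np

# Sine wave: 1000 points between 0 and 50.
x = np.linspace(0, 50, 1000)
plt.plot(x, np.sin(x))
plt.title("Sine wave")
plt.show()

# Line graph with markers: x from 0 to 9, random y-values.
x2 = np.arange(10)
y2 = np.random.rand(10)
plt.plot(x2, y2, marker="o")
plt.title("Line graph with markers")
plt.show()
```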
DAY 9 : Employ Matplotlib to create a variety of
visualizations, including bar and pie charts.

Pie Chart:

This code creates a pie chart in Matplotlib with specified sizes and labels for each slice, along
with percentage labels. It sets the starting angle and ensures the chart is circular. Finally, it
displays the pie chart with a title.
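A sketch with illustrative slice sizes and labels:

```python
import matplotlib.pyplot as plt

# Hypothetical slice sizes and labels.
sizes = [35, 25, 20, 20]
labels = ["A", "B", "C", "D"]

plt.pie(sizes, labels=labels, autopct="%1.1f%%", startangle=90)
plt.axis("equal")  # ensure the pie is drawn as a circle
plt.title("Pie chart")
plt.show()
```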

Bar Chart:

This code plots a bar chart showing the frequency of different ages in the dataset. It accesses the
"Age" column, counts occurrences of each age value, and then visualizes it as a bar plot with a
specified figure size.
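A sketch on a hypothetical "Age" column:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical dataset with an "Age" column.
df = pd.DataFrame({"Age": [22, 25, 22, 30, 25, 22, 35, 30]})

# Count occurrences of each age and plot them as bars.
df["Age"].value_counts().sort_index().plot(kind="bar", figsize=(8, 4))
plt.xlabel("Age")
plt.ylabel("Frequency")
plt.title("Age frequency")
plt.show()
```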
DAY 10 : Use Matplotlib to generate graphical representations such as box plots and histograms.

Box Plot:

The box plot visualizes the distribution of randomly generated normal data, where the central line
represents the median value, the box edges denote the first and third quartiles indicating the
interquartile range (IQR), and the whiskers extend to 1.5 times the IQR from the quartiles, with
outliers plotted individually.

Histogram:

The histogram displays the frequency distribution of another set of normal data, with the x-axis
representing data range divided into bins and the y-axis representing the frequency of
occurrences within each bin. Together, these visualizations provide insights into the distribution
and variability of the datasets, aiding in data analysis and interpretation.
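A sketch of both plots on randomly generated normal data:

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
data1 = rng.normal(loc=0, scale=1, size=500)
data2 = rng.normal(loc=5, scale=2, size=500)

# Box plot: median line, quartile box, 1.5*IQR whiskers, outliers as points.
plt.boxplot(data1)
plt.title("Box plot of normal data")
plt.show()

# Histogram: frequency of values within each bin.
plt.hist(data2, bins=30)
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.title("Histogram of normal data")
plt.show()
```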
DAY 11 : Explore heatmaps and scatter plots with
Matplotlib.

Heatmap:

The heatmap provides a visual representation of data density in a 2D space by partitioning the
area into hexagonal bins and coloring each bin based on the number of data points it contains.
The plt.hexbin() function creates this heatmap, utilizing parameters such as x and y for data
coordinates and gridsize to determine the number of hexagons. A colormap (cmap) is applied to
signify variations in density, and a colorbar is added to aid interpretation by illustrating the color
scale.
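A sketch using plt.hexbin() as described (the data and colormap are illustrative):

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=5000)
y = x + rng.normal(size=5000)

# Hexagonal-bin heatmap: color encodes the count of points per hexagon.
plt.hexbin(x, y, gridsize=30, cmap="viridis")
plt.colorbar(label="Count")
plt.title("Hexbin density heatmap")
plt.show()
```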
Scatter Plot:

The scatter plot portrays individual data points in a 2D space, with additional variables encoded
via marker size (s) and color (c). Utilizing plt.scatter(), the plot is generated using data
coordinates (x and y), with marker size (s) indicating an additional variable, while marker color
(c) represents another variable. Transparency (alpha) is adjusted to differentiate overlapping
points, and a colorbar is incorporated to elucidate the mapping between marker color and the
encoded variable (z), thereby enhancing comprehension of the plot's insights.
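A sketch matching that description (x, y, z, and the size values are randomly generated placeholders):

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
x = rng.random(100)
y = rng.random(100)
z = rng.random(100)            # variable encoded as marker color
sizes = 300 * rng.random(100)  # variable encoded as marker size

plt.scatter(x, y, s=sizes, c=z, alpha=0.6, cmap="plasma")
plt.colorbar(label="z")
plt.title("Scatter plot with size and color encoding")
plt.show()
```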
DAY 12 : Examine demographic shifts through the
analysis of population trends with Seaborn.

Population Trends Analysis with Seaborn:

This code employs Seaborn to visualize demographic shifts by plotting population trends over
time. Sample data representing population sizes for specific years is utilized to create a line
plot using Seaborn's lineplot() function. Each data point is marked with a circular marker ('o'),
and the plot is styled with a sky blue color. This visualization offers insights into long-term
demographic trends, aiding in understanding population dynamics over the specified time
period.
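A sketch with hypothetical population figures (the years and values are placeholders, not real demographic data):

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical population sizes (millions) for specific years.
years = [1990, 2000, 2010, 2020]
population = [850, 1050, 1230, 1390]

sns.lineplot(x=years, y=population, marker="o", color="skyblue")
plt.xlabel("Year")
plt.ylabel("Population (millions)")
plt.title("Population trends over time")
plt.show()
```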
DAY 13 : Create visual representations of data using
Seaborn, including bar charts and paired scatterplot
matrices.

Bar Chart in Seaborn:

The code generates a bar chart using Seaborn's `barplot()`. Sample data, with categories and
values, is visualized. The title is repositioned with `plt.title()` to prevent overlap, ensuring
better readability. This visual representation aids in analyzing categorical data distributions
effectively.
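A sketch with illustrative categories and values:

```python
import matplotlib.pyplot as plt
import seaborn as sns

categories = ["A", "B", "C", "D"]
values = [23, 45, 12, 36]

sns.barplot(x=categories, y=values)
plt.title("Bar chart of categories", pad=20)  # extra padding keeps the title clear of the bars
plt.show()
```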
Scatter Plot in Seaborn:

This code creates a figure with two subplots side by side, each showing a scatter plot. The scatter
plots compare "Age" with "Balance" and "CreditScore" respectively, with points colored by
"Exited" status. The sizes of the points vary according to specified ranges. Legends are added to
indicate the colors representing churn status. Finally, the plots are displayed using `plt.show()`.
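The underlying dataset is not included in this report; the sketch below substitutes randomly generated "Age", "Balance", "CreditScore", and "Exited" columns:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(7)
df = pd.DataFrame({
    "Age": rng.integers(18, 70, 200),
    "Balance": rng.uniform(0, 200000, 200),
    "CreditScore": rng.integers(300, 850, 200),
    "Exited": rng.integers(0, 2, 200),
})

# Two side-by-side scatter plots, colored by churn ("Exited") status.
fig, axes = plt.subplots(1, 2, figsize=(12, 5))
sns.scatterplot(data=df, x="Age", y="Balance", hue="Exited",
                size="Balance", sizes=(20, 200), ax=axes[0])
sns.scatterplot(data=df, x="Age", y="CreditScore", hue="Exited",
                size="CreditScore", sizes=(20, 200), ax=axes[1])
plt.show()
```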
DAY 14 : Create line graphs and swarm plots using Seaborn's
graphics.

Line Graph:

The line plot visualizes the relationship between 'X' and 'Y' variables, showing trends or patterns
in the dataset. It's effective for illustrating how the values of 'Y' change with respect to 'X',
providing insights into the data's behavior over the range of 'X' values.
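A sketch with illustrative 'X' and 'Y' values:

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.DataFrame({"X": range(10), "Y": [1, 3, 2, 5, 4, 6, 7, 6, 8, 9]})

sns.lineplot(data=df, x="X", y="Y")
plt.title("Line graph of Y versus X")
plt.show()
```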

Swarm Plot in Seaborn:


The swarm plot displays the distribution of 'Y' values for each category ('A' and 'B') along the x-
axis. It's particularly useful for categorical data visualization, showcasing the spread and density of
data points within each category. Unlike traditional scatter plots, swarm plots prevent overlapping
points, offering a clear representation of the data distribution and its variations across categories.
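A sketch on hypothetical two-category data:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(3)
df = pd.DataFrame({
    "Category": ["A"] * 30 + ["B"] * 30,
    "Y": np.concatenate([rng.normal(0, 1, 30), rng.normal(2, 1.5, 30)]),
})

# Swarm plot: points are shifted along the category axis so they never overlap.
sns.swarmplot(data=df, x="Category", y="Y")
plt.title("Swarm plot by category")
plt.show()
```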
DAY 15 : Use a Seaborn heatmap to demonstrate correlations between variables.

Heatmap for correlation:

This code generates a heatmap of the correlation matrix for the dataset using
Seaborn's heatmap function. The heatmap visualizes the correlations between
different features in the dataset. The values of the correlation coefficients are
annotated on the heatmap. The colormap "RdYlBu" is used to represent the
correlation values: with this colormap, blue marks strong positive correlations,
red marks strong negative correlations, and yellow represents values near zero.
The plot is displayed with a title, and the font sizes for the ticks and title are
adjusted for readability. Finally, plt.tight_layout() ensures that the plot
components are properly spaced.
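A sketch consistent with that description, on randomly generated features (fixing vmin/vmax at -1 and 1 is an added assumption to anchor the color scale):

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(5)
df = pd.DataFrame(rng.normal(size=(100, 4)), columns=["f1", "f2", "f3", "f4"])

# With RdYlBu, low (negative) correlations render red, high (positive) blue.
corr = df.corr()
sns.heatmap(corr, annot=True, cmap="RdYlBu", vmin=-1, vmax=1)
plt.title("Correlation heatmap", fontsize=14)
plt.xticks(fontsize=10)
plt.yticks(fontsize=10)
plt.tight_layout()
plt.show()
```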
This code creates a DataFrame with random values and generates a heatmap using Seaborn with the
'plasma' colormap. The heatmap visualizes the values in the DataFrame, showing patterns and
correlations between different features with different color tones.
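A sketch of that second heatmap:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(9)
df = pd.DataFrame(rng.random((6, 6)), columns=list("ABCDEF"))

# Heatmap of raw DataFrame values with the 'plasma' colormap.
sns.heatmap(df, cmap="plasma")
plt.title("Heatmap of random values ('plasma' colormap)")
plt.show()
```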
