DATA EXPLORATION AND VISUALIZATION
ASSIGNMENT 1
SUBMITTED BY,
PRADEEP K
2nd YEAR B.Tech (AI&DS)
712523243019
PROBLEM STATEMENT:
Visualizing Traffic Patterns: Analyzing Daily and
Hourly Trends
EXECUTION:
To execute the above visualization programs, follow
these steps in a Jupyter Notebook environment (like
Google Colab) or in any Python IDE that supports
inline plotting, such as JupyterLab.
1. Set Up the Environment:
o Ensure you have pandas, matplotlib, and
seaborn installed. If not, install them by
running the following in a notebook cell (in a
terminal, omit the leading !):
!pip install pandas matplotlib seaborn
2. Load and Run the Code:
o Copy and paste each of the program code
snippets into individual cells (if using
Jupyter/Colab) or a single Python script file.
Adjust any file paths and column names as
needed.
o For the path to traffic.csv, you might need to
upload the file or provide the correct path.
3. Viewing Outputs:
o If using Google Colab or Jupyter Notebook,
each visualization will display inline after
executing the cell.
o In an IDE like VS Code, %matplotlib inline
works only inside a notebook or IPython
session. In a plain Python script, call
plt.show() at the end of each program.
Example in Google Colab:
1. Start with these lines to upload your traffic.csv file:
from google.colab import files
uploaded = files.upload()
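Once the file is available, the data has to be read
into a DataFrame before any plot can be drawn. The
sketch below is one possible way to do that. The
helper name load_traffic and the column names
time_of_day and traffic_volume are assumptions, and
the small in-memory sample only stands in for the
real traffic.csv.

```python
import io

import pandas as pd


def load_traffic(path_or_buffer, required_columns=("time_of_day", "traffic_volume")):
    """Read the CSV and fail early if an expected column is missing.

    The required column names are assumptions; adjust them to match
    the actual header of traffic.csv.
    """
    data = pd.read_csv(path_or_buffer)
    missing = [col for col in required_columns if col not in data.columns]
    if missing:
        raise KeyError(f"Missing columns: {missing}")
    return data


# Made-up sample standing in for traffic.csv, for illustration only.
sample = io.StringIO("time_of_day,traffic_volume\n8,1200\n17,1500\n")
data = load_traffic(sample)
print(data.head())
```

With the real file, the call would be
load_traffic("traffic.csv"). In Colab, files
uploaded through files.upload() are normally also
written to the current working directory, so the
bare file name is usually enough there.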
EXPLANATION:
Histogram
Explanation:
1. plt.hist(data[y_column], bins=30) creates a
histogram with 30 bins (adjustable) for the
traffic_volume.
2. Color and edge styling improve readability,
while plt.xlabel and plt.ylabel label the axes.
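The two points above can be put together as a short
sketch. The data here is randomly generated and
only stands in for traffic.csv. The column name
traffic_volume, the colors, and the axis labels are
assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for traffic.csv; the values are made up.
rng = np.random.default_rng(0)
data = pd.DataFrame({"traffic_volume": rng.integers(0, 5000, size=500)})
y_column = "traffic_volume"  # assumed column name

plt.figure(figsize=(8, 5))
# 30 bins as in the explanation; color and edgecolor are styling choices.
counts, bin_edges, _ = plt.hist(
    data[y_column], bins=30, color="skyblue", edgecolor="black"
)
plt.xlabel("Traffic volume")
plt.ylabel("Frequency")
plt.title("Distribution of traffic volume")
```

In a notebook the figure appears after the cell
runs. In a plain script, add plt.show() as the last
line.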
Bar Plot
Explanation:
1. data.groupby(x_column)[y_column].sum()
groups the data by each unique category and
calculates the total traffic_volume for each.
2. The .plot(kind='bar') call creates a bar plot,
with a custom fill color (skyblue) and edge
color (black).
3. Axis labels and tick-label rotation are added to
make the chart more readable.
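A minimal sketch of these three steps follows. The
category column day_of_week and the sample values
are assumptions made for illustration, not taken
from the real dataset.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample rows; the real traffic.csv may use other column names.
data = pd.DataFrame({
    "day_of_week": ["Mon", "Mon", "Tue", "Tue", "Wed"],
    "traffic_volume": [120, 80, 200, 50, 90],
})
x_column = "day_of_week"     # assumed category column
y_column = "traffic_volume"  # assumed value column

# Step 1: total traffic volume per category.
totals = data.groupby(x_column)[y_column].sum()

# Step 2: bar plot drawn on an explicit Axes so it never reuses an old figure.
fig, ax = plt.subplots(figsize=(8, 5))
totals.plot(kind="bar", color="skyblue", edgecolor="black", ax=ax)

# Step 3: labels and rotated tick labels for readability.
ax.set_xlabel("Day of week")
ax.set_ylabel("Total traffic volume")
ax.tick_params(axis="x", rotation=45)
```

Using .mean() instead of .sum() in step 1 would show
the average volume per category, which may be more
useful when categories have different row counts.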
Scatter Plot
Explanation:
1. plt.scatter(data[x_column], data[y_column])
creates a scatter plot of traffic_volume over
time_of_day.
2. The color blue and transparency alpha=0.5
make overlapping points easier to see.
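As a rough sketch of the call described above, with
randomly generated values standing in for the real
data (the column names time_of_day and
traffic_volume are assumed):

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for traffic.csv; the values are made up.
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "time_of_day": rng.integers(0, 24, size=200),        # hour, 0 to 23
    "traffic_volume": rng.integers(100, 5000, size=200),
})
x_column = "time_of_day"     # assumed column name
y_column = "traffic_volume"  # assumed column name

plt.figure(figsize=(8, 5))
# alpha=0.5 makes each point half transparent, so dense regions look darker.
points = plt.scatter(data[x_column], data[y_column], color="blue", alpha=0.5)
plt.xlabel("Time of day (hour)")
plt.ylabel("Traffic volume")
plt.title("Traffic volume by time of day")
```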
Line Plot
Explanation:
1. plt.plot(data['time'], data[y_column]) plots
traffic volume data points connected by lines.
2. Markers (marker='o') show each point, while
linestyle='-' draws lines connecting them.
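The line plot can be sketched the same way. The
hourly values below are invented for illustration,
and the time column is assumed to hold the hour of
the day as an integer.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Invented hourly counts for one day; not real measurements.
volumes = [120, 90, 70, 60, 80, 200, 650, 1400, 1800, 1300, 1000, 950,
           1000, 980, 1020, 1200, 1600, 1900, 1500, 1000, 700, 500, 300, 180]
data = pd.DataFrame({"time": range(24), "traffic_volume": volumes})
y_column = "traffic_volume"  # assumed column name

plt.figure(figsize=(10, 5))
# marker="o" draws a dot at each observation; linestyle="-" joins them.
(line,) = plt.plot(data["time"], data[y_column], marker="o", linestyle="-")
plt.xlabel("Hour of day")
plt.ylabel("Traffic volume")
plt.title("Hourly traffic volume")
plt.grid(True)
```

If the time column in the real file is stored as
text, sorting the rows by time before plotting keeps
the line from jumping back and forth.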
Pie Chart
Explanation:
1. data.groupby(category_column)[value_column].sum()
groups and aggregates traffic_volume for each
category.
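A possible sketch of the aggregation followed by a
pie chart is given below. The explanation above only
covers the grouping step, so the plt.pie call and
its options are one common choice rather than the
original program. The category column vehicle_type
and the sample values are assumptions.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample rows; the real traffic.csv may use other column names.
data = pd.DataFrame({
    "vehicle_type": ["Car", "Car", "Bus", "Truck"],
    "traffic_volume": [100, 200, 120, 180],
})
category_column = "vehicle_type"  # hypothetical category column
value_column = "traffic_volume"   # assumed value column

# Total traffic volume per category.
shares = data.groupby(category_column)[value_column].sum()

plt.figure(figsize=(6, 6))
# autopct prints each slice's percentage of the total on the chart.
wedges, texts, autotexts = plt.pie(
    shares, labels=shares.index, autopct="%1.1f%%", startangle=90
)
plt.axis("equal")  # keep the pie circular
plt.title("Share of traffic volume by category")
```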