1. What is Pandas? Name the two main data structures in Pandas.
Pandas is an open-source Python library used for data manipulation and analysis. The two main
data structures are Series and DataFrame.
2. Differentiate between Series and DataFrame with an example.
A Series is a one-dimensional labeled array, while a DataFrame is a two-dimensional labeled data
structure. Example:
import pandas as pd
Series:
s = pd.Series([10, 20, 30])
DataFrame:
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
3. Write Python code to create a Series from a list, a dictionary, and a NumPy array.
Assuming import pandas as pd and import numpy as np:
From list:
pd.Series([1, 2, 3])
From dictionary (the keys become the index labels):
pd.Series({'a': 1, 'b': 2})
From NumPy array:
pd.Series(np.array([4, 5, 6]))
4. How can you change the index of a Pandas Series?
Pass the index parameter when creating the Series, or assign a new list of labels to .index:
s.index = ['a', 'b', 'c']
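Both approaches can be sketched as follows. The values and labels are illustrative assumptions; rename() is shown as a third option that returns a new Series instead of modifying the original.

```python
import pandas as pd

# Labels supplied at construction time via the index parameter.
s = pd.Series([10, 20, 30], index=['x', 'y', 'z'])

# Replace all labels in place; the new list must match the Series length.
s.index = ['a', 'b', 'c']

# rename() maps old labels to new ones and leaves s untouched.
renamed = s.rename({'a': 'first'})

print(s.index.tolist())        # ['a', 'b', 'c']
print(renamed.index.tolist())  # ['first', 'b', 'c']
```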
5. Explain the use of head() and tail() methods in Pandas.
head() returns the first n rows and tail() returns the last n rows of a DataFrame or Series; n defaults to 5.
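A short sketch of the default and an explicit row count, using a made-up ten-element Series:

```python
import pandas as pd

s = pd.Series(range(10))

first_default = s.head()  # no argument: first 5 values
last_three = s.tail(3)    # explicit n: last 3 values

print(first_default.tolist())  # [0, 1, 2, 3, 4]
print(last_three.tolist())     # [7, 8, 9]
```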
6. What are the key features of Pandas?
Key features include:
- Fast and efficient DataFrame object
- Tools for reading/writing data
- Handling missing data
- Data alignment and reshaping
7. Write a Python program to perform basic operations on Series (addition, subtraction).
s1 = pd.Series([1, 2, 3])
s2 = pd.Series([4, 5, 6])
print(s1 + s2)  # addition, element-wise by index label
print(s1 - s2)  # subtraction, element-wise by index label
8. What is the difference between iloc[] and loc[]?
iloc[] is for integer-location based indexing; loc[] is label-based indexing.
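The difference is easiest to see on a DataFrame whose index labels are not integers. The sketch below uses an assumed three-row frame; note that a loc[] slice includes its end label, while an iloc[] slice excludes its end position.

```python
import pandas as pd

df = pd.DataFrame({'score': [90, 85, 70]}, index=['a', 'b', 'c'])

by_label = df.loc['b', 'score']  # row labeled 'b', column labeled 'score'
by_position = df.iloc[1, 0]      # second row, first column

label_slice = df.loc['a':'b']    # end label is included: rows 'a' and 'b'
position_slice = df.iloc[0:1]    # end position is excluded: row 'a' only

print(by_label, by_position)                   # 85 85
print(len(label_slice), len(position_slice))   # 2 1
```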
9. How do you check for null values in a DataFrame?
Use df.isnull() to flag null values cell by cell; use df.isnull().sum() to count nulls per column.
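A minimal sketch on an assumed two-column DataFrame with a few missing entries:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, np.nan, 3], 'B': ['x', None, None]})

mask = df.isnull()                   # DataFrame of True/False, same shape as df
null_counts = df.isnull().sum()      # number of nulls in each column
any_null = df.isnull().values.any()  # True if at least one null exists anywhere

print(null_counts['A'], null_counts['B'])  # 1 2
print(any_null)                            # True
```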
10. Write a program to create a DataFrame from a dictionary of lists.
data = {'Name': ['A', 'B'], 'Marks': [90, 85]}
df = pd.DataFrame(data)
11. What are the different ways to read data into a DataFrame in Pandas?
Ways include: read_csv(), read_excel(), read_json(), read_sql(), read_html()
12. Explain the use of read_csv() and to_csv() with an example.
pd.read_csv('[Link]') reads a CSV file into a DataFrame.
df.to_csv('[Link]') writes a DataFrame to a CSV file.
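A round trip can be sketched without touching the disk by writing to an in-memory text buffer instead of a file path. The buffer and the sample data are assumptions made for illustration; with a real file, a path string takes the buffer's place.

```python
import io
import pandas as pd

df = pd.DataFrame({'Name': ['A', 'B'], 'Marks': [90, 85]})

# Write the DataFrame as CSV text; index=False omits the row labels.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Rewind the buffer and read the CSV text back into a new DataFrame.
buffer.seek(0)
restored = pd.read_csv(buffer)

print(restored['Marks'].tolist())  # [90, 85]
```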
13. How can you display only specific columns from a DataFrame?
Use column names: df[['col1', 'col2']]
14. What does the describe() function return in a DataFrame?
It returns summary statistics (count, mean, std, min, the 25%, 50%, and 75% percentiles, and max) for numeric columns by default.
15. Write a Python program to read a CSV file and display basic statistics.
df = pd.read_csv('[Link]')
print(df.describe())
16. How can you handle missing data in Pandas?
Use methods like fillna(), dropna(), or interpolate().
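The three methods behave differently on the same data. A small sketch on an assumed Series with one gap:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

filled = s.fillna(0)            # replace the gap with a fixed value
dropped = s.dropna()            # remove the entry that holds the gap
interpolated = s.interpolate()  # linear interpolation between neighbors

print(filled.tolist())        # [1.0, 0.0, 3.0]
print(dropped.tolist())       # [1.0, 3.0]
print(interpolated.tolist())  # [1.0, 2.0, 3.0]
```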
17. Write code to sort a DataFrame based on values of a particular column.
df.sort_values(by='column_name', ascending=True)
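As a sketch with assumed sample data: sort_values() returns a new, sorted DataFrame and leaves the original order unchanged unless the result is assigned back.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A', 'B', 'C'], 'Marks': [85, 90, 70]})

# Highest marks first; df itself keeps its original row order.
sorted_df = df.sort_values(by='Marks', ascending=False)

print(sorted_df['Name'].tolist())  # ['B', 'A', 'C']
print(df['Name'].tolist())         # ['A', 'B', 'C']
```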
18. What is the use of the groupby() function? Give an example.
groupby() is used for grouping data and applying aggregation.
Example: df.groupby('column').mean()
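A concrete sketch, assuming a hypothetical table of departments and salaries. Selecting the numeric column before aggregating avoids trying to average non-numeric columns.

```python
import pandas as pd

df = pd.DataFrame({
    'dept': ['x', 'y', 'x', 'y'],
    'salary': [10, 20, 30, 40],
})

# Split rows by dept, then average the salary column within each group.
means = df.groupby('dept')['salary'].mean()

print(means['x'], means['y'])  # 20.0 30.0
```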
19. Explain the difference between drop() and del.
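drop() removes rows or columns and returns a new object by default (the original is modified only if inplace=True is passed). del is a Python statement that deletes a single column in place, as in del df['col'].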
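The contrast can be sketched on an assumed three-column DataFrame:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]})

# drop() returns a new DataFrame; df still has all three columns afterwards.
without_b = df.drop(columns=['B'])

# del removes the column from df itself.
del df['C']

print(list(without_b.columns))  # ['A', 'C']
print(list(df.columns))         # ['A', 'B']
```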
20. How can you filter rows in a DataFrame using conditions?
Use boolean indexing: df[df['column'] > value]
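A sketch with assumed sample data. When combining conditions, each one needs its own parentheses, joined with & (and) or | (or) rather than the Python keywords and/or.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A', 'B', 'C'], 'Marks': [90, 85, 60]})

# Single condition: keep rows where Marks exceeds 80.
high = df[df['Marks'] > 80]

# Two conditions combined element-wise with &.
both = df[(df['Marks'] > 80) & (df['Name'] != 'A')]

print(high['Name'].tolist())  # ['A', 'B']
print(both['Name'].tolist())  # ['B']
```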
21. What is Matplotlib? Why is it used?
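Matplotlib is a Python library for creating static, interactive, and animated plots. It is used to visualize data, for example as line plots, bar charts, and pie charts, so that trends and comparisons are easier to see.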
22. Differentiate between plot() and bar() functions in Matplotlib.
plot() draws line plots, connecting the data points in order; bar() draws bar charts, with one bar per category.
23. Write a Python program to draw a line plot using Matplotlib.
import matplotlib.pyplot as plt
plt.plot([1, 2, 3], [4, 5, 6])
plt.show()
24. How do you add labels and title to a plot?
Use plt.xlabel(), plt.ylabel(), and plt.title().
25. What is the purpose of the legend() function?
legend() displays the labels for different plot elements.
26. Write a program to create a bar chart for student marks in 5 subjects.
subjects = ['Math', 'Sci', 'Eng', 'Hist', 'Geo']
marks = [90, 80, 85, 70, 75]
plt.bar(subjects, marks)
plt.show()
27. How can you plot multiple lines on the same graph?
Call plt.plot() multiple times before plt.show().
28. Explain the parameters of plt.plot().
x, y: data points; label: legend label; color: line color; linestyle: style of line etc.
29. What is the role of show() and savefig() in Matplotlib?
show() displays the plot; savefig() saves the plot to a file.
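The answers to questions 24 through 29 can be combined into one sketch: two lines on the same axes, with labels, a title, a legend, and a saved image. The data, label text, and the in-memory buffer are illustrative assumptions; the Agg backend is selected only so the sketch runs without a display, and in an interactive session plt.show() would follow.

```python
import io
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, assumed here so no display is needed
import matplotlib.pyplot as plt

x = [1, 2, 3]

# Each plot() call adds one line to the current axes.
plt.plot(x, [4, 5, 6], label='Series A', color='blue', linestyle='-')
plt.plot(x, [6, 5, 4], label='Series B', color='red', linestyle='--')

plt.xlabel('x')
plt.ylabel('y')
plt.title('Two lines on one graph')
plt.legend()  # builds the key from the label arguments above

# savefig() accepts a file path or a file-like object.
buffer = io.BytesIO()
plt.savefig(buffer, format='png')
```

Passing a path string such as a .png filename to savefig() writes the image to disk instead of the buffer.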
30. Write a program to create a pie chart showing percentage distribution of expenses.
expenses = [300, 200, 150]
labels = ['Rent', 'Food', 'Transport']
plt.pie(expenses, labels=labels, autopct='%1.1f%%')
plt.show()