Data Handling and CSV 2024 - 2025

The document provides a comprehensive overview of data handling using the pandas library in Python, including definitions of key terms like Series and DataFrame. It contains multiple code examples demonstrating the creation and manipulation of Series and DataFrames, as well as exercises for creating and analyzing data structures. Additionally, it covers various operations and functions related to pandas, along with tasks for practical implementation.

Uploaded by

steamcrew51

DATA HANDLING USING PANDAS

1. Define the terms: a. pandas b. series c. dataframe


2. Write the output:
a. import pandas as pd
s=pd.Series((1,2,3,4),('a','b','c','d'))
print(s)
b. import pandas as pd
s=pd.Series(data=range(0,10,1),index=range(20,30))
print(s)
print(s[0::2]**2)
s=s+2
print(s.head(3))
c. import pandas as pd
import numpy as np
a=pd.DataFrame([1,1,1,np.nan],columns=['one'],index=['a','b','c','d'])
print(a)
d. import pandas as pd
data=(1,2,3,4,5)
df=pd.DataFrame(data)
print(df)
e. import pandas as pd
s=pd.Series((98,23,10,91,39,43,74))
print(s)
print(s<50)
print(s[s<50])
f. import pandas as pd
sdf=pd.DataFrame(data=((1,2,3),(4,5,6),(7,8,9),(10,11,12)))
sdf.index=('a', 'b', 'c', 'd')
sdf.columns=('col1', 'col2', 'col3')
print(sdf)
g. import pandas as pd
c=pd.Series([4,10,12,26]*2)
print(c)
print(c.shape)
h.

i. import pandas as pd
values=['Thailand','Switzerland','France']
code=['TH','SW','FR']
name=['country']
cdf=pd.DataFrame(data=values, index=code, columns=name)
print(cdf)
j. import pandas as pd
s=pd.Series((2,4,5,8,9,12))
print(s.size,s.ndim)
3. Create a series from a tuple which stores the prices of 6 items in
a shop and display the last 5 rows.
4. Create a series from a list that contains the number of students
of the 6 classes XII G, XII H, XI E, XII P, XI F and XI H. Use class
names as indices. Display the rows except the last 3.
5. Create a series sales from the dictionary daily containing
{'sun':3400, 'mon':4100, 'tue':3900, 'wed':5000, 'thu':4500,
'fri':3800, 'sat':1200}. Display i) the series except top 2 rows, ii)
the labels, iii) last 4 rows.
6. Create a dataframe from a tuple that contains the following lists
- ['S1','Rani',90], ['S2','Ann',86], ['S3','Sam',830], ['S4','Manu',90]
with row and column labels.
7. Create the following dataframe:
       subject    teachers   school
I      IT         12         ABC
II     Maths      10         XYZ
III    Accounts   9          MNO
IV     English    11         PQR
V      Physics    8          UVW
a) H.
8. Write the code to create an empty series ser_A.
9. Write the code to create an empty dataframe valuedf.
10. Who developed pandas and in which year?
11. Who developed python and in which year?
12. Name the function that is used to multiply 2 series objects.
13. Name the one dimensional and 2 dimensional data structures in
pandas.
14. If the data in a series object is ________, index must be provided.
15. Write the import statement for pandas library.
16. Differentiate between list and series.
17. Differentiate between dataframe and series.
18. Create a series with 5 numbers and multiply each element with 6.
19. Create a dataframe from a dictionary of series with empname,
salary, department of 5 employees.
a. Display the 2nd and 4th rows.
b. Display empname and department.
c. Display empname from the 1st row to the 3rd row.
20. Create a dataframe Earth with the given data using a dictionary of
dictionaries.
        Component   Colour          Required for
A100    Oxygen      Colourless      Breathing
S008    Soil        Brown           Vegetation
W003    Water       Colourless      Drinking
P237    Plant       Green & Brown   Fruits & Vegetables
a. Change the colour of Soil to Reddish Brown.
b. Rename the column labels as X, Y, Z using a function.
c. Add a new column Rate with values A, C, B, C.
d. Interchange the row and column labels and display the
dataframe.
e. List the row labels.
f. List the column labels.
g. How many elements are there in the dataframe? Write the code to
display it.
h. Write the output: print(Earth.shape)
i. Write the output: print(Earth[['Component', 'Required for']])
j. Delete the last row.
21. Create a dataframe student with rollno, name, house colour and
save it as a file in the folder School in C drive without the row
labels.
22. Write the code to read the file game.csv into a dataframe gdf,
excluding 6th and 10th rows.
23. Write the output:
import pandas as pd
ad={'no':25,'label':'book'}
bd={'no':50,'label':'bag'}
cd={'no':75,'label':'stationery'}
df=pd.DataFrame({'x':ad,'y':bd,'z':cd})
print(df)
print(df.shape())
print(df.T)
print(df.size)
24. Write statements for the following:
a. To read 1st 6 rows of a file item.csv in D:\Shop folder into a
dataframe.
b. To save a dataframe xdf as file in C:\Files without the column
labels.
c. To read a file books.csv in D:\Library folder into a dataframe
without the column labels.
d. To save a dataframe tdf as file in the same folder as the python
program.
25. Write the output based on the dataframe Climate:

a. print(Climate.iloc[1:3,1:2])
b. print(Climate.head(-2))
c. print(Climate.sort_values(by='Rainfall'))
d. print(Climate.sort_index(ascending=False))
e. print(Climate.iloc[[2,4]])
f. print(Climate.loc[:, 'MaxTemp'])
g. print(Climate.loc[::3])
h. print(Climate[Climate['Year']>2019])
i. print(Climate.tail(-3))

### 1. Definitions

**a. pandas**: A Python library used for data manipulation and analysis.
It provides data structures like Series and DataFrame to efficiently
handle structured data.

**b. Series**: A one-dimensional array-like structure in pandas that can
hold data of any type (integers, strings, floating-point numbers, etc.)
and has an associated array of data labels called its index.

**c. DataFrame**: A two-dimensional, size-mutable, and potentially
heterogeneous tabular data structure with labeled axes (rows and
columns).
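
As a minimal sketch of these definitions, the following builds one Series and one DataFrame and inspects their dimensions. The variable names and sample values are illustrative assumptions, not part of the worksheet.

```python
import pandas as pd

# A Series: one-dimensional values plus an index of labels
marks = pd.Series([78, 91, 65], index=['ann', 'ben', 'cal'])

# A DataFrame: two-dimensional, and its columns may hold different types
table = pd.DataFrame({'name': ['Ann', 'Ben', 'Cal'],
                      'marks': [78, 91, 65]})

print(marks.ndim, table.ndim)   # number of axes of each structure
print(list(marks.index))        # the Series labels
print(list(table.columns))      # the DataFrame column labels
```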

### 2. Outputs

#### a.
```python
import pandas as pd
s = pd.Series((1, 2, 3, 4), index=('a', 'b', 'c', 'd'))
print(s)
```
Output:
```
a    1
b    2
c    3
d    4
dtype: int64
```

#### b.
```python
import pandas as pd
s = pd.Series(data=range(0, 10, 1), index=range(20, 30))
print(s)
print(s[0::2]**2)
s = s + 2
print(s.head(3))
```
Output:
```
20    0
21    1
22    2
23    3
24    4
25    5
26    6
27    7
28    8
29    9
dtype: int64
20     0
22     4
24    16
26    36
28    64
dtype: int64
20    2
21    3
22    4
dtype: int64
```
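
In the second `print`, `s[0::2]` is a slice, so pandas selects by position rather than by the labels 20 to 29. The sketch below makes the two kinds of selection explicit with `.iloc` and `.loc`, which the question itself does not use.

```python
import pandas as pd

s = pd.Series(data=range(0, 10, 1), index=range(20, 30))

by_position = s.iloc[0::2]   # every second element, counted from position 0
by_label = s.loc[20:24]      # labels 20 through 24, both ends included

print(by_position.equals(s[0::2]))  # compares with the plain slice
print(list(by_label.index))
```

Note that a label slice with `.loc` includes its end label, while a positional slice excludes its end position.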

#### c.
```python
import pandas as pd
import numpy as np
a = pd.DataFrame([1, 1, 1, np.nan], columns=['one'], index=['a', 'b', 'c',
'd'])
print(a)
```
Output:
```
   one
a  1.0
b  1.0
c  1.0
d  NaN
```
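
The column prints `1.0` rather than `1` because `np.nan` is a floating-point value, so pandas stores the whole column as `float64`. A small check with the same data:

```python
import pandas as pd
import numpy as np

a = pd.DataFrame([1, 1, 1, np.nan], columns=['one'], index=['a', 'b', 'c', 'd'])

print(a['one'].dtype)          # dtype chosen for the column
print(a['one'].isna().sum())   # number of missing values
print(a['one'].count())        # number of non-missing values
```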

#### d.
```python
import pandas as pd
data = (1, 2, 3, 4, 5)
df = pd.DataFrame(data)
print(df)
```
Output:
```
   0
0  1
1  2
2  3
3  4
4  5
```

#### e.
```python
import pandas as pd
s = pd.Series((98, 23, 10, 91, 39, 43, 74))
print(s)
print(s < 50)
print(s[s < 50])
```
Output:
```
0    98
1    23
2    10
3    91
4    39
5    43
6    74
dtype: int64
0    False
1     True
2     True
3    False
4     True
5     True
6    False
dtype: bool
1    23
2    10
4    39
5    43
dtype: int64
```
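
The comparison `s < 50` produces a Boolean Series, and indexing with it keeps only the rows marked `True`. As an illustrative extension (the combined condition is an assumption, not part of the question), masks can be joined with `&` and inverted with `~`:

```python
import pandas as pd

s = pd.Series((98, 23, 10, 91, 39, 43, 74))

mask = (s > 20) & (s < 50)   # each comparison needs its own parentheses
print(s[mask])               # values strictly between 20 and 50
print(s[~mask])              # the remaining values
print(mask.sum())            # True counts as 1, so this is the number of matches
```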

#### f.
```python
import pandas as pd
sdf = pd.DataFrame(data=((1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, 12)))
sdf.index = ('a', 'b', 'c', 'd')
sdf.columns = ('col1', 'col2', 'col3')
print(sdf)
```
Output:
```
   col1  col2  col3
a     1     2     3
b     4     5     6
c     7     8     9
d    10    11    12
```

#### g.
```python
import pandas as pd
c = pd.Series([4, 10, 12, 26] * 2)
print(c)
print(c.shape)
```
Output:
```
0     4
1    10
2    12
3    26
4     4
5    10
6    12
7    26
dtype: int64
(8,)
```

#### h.
```python
import pandas as pd
values = ['Thailand', 'Switzerland', 'France']
code = ['TH', 'SW', 'FR']
name = ['country']
cdf = pd.DataFrame(data=values, index=code, columns=name)
print(cdf)
```
Output:
```
        country
TH     Thailand
SW  Switzerland
FR       France
```

#### i.
```python
import pandas as pd
s = pd.Series((2, 4, 5, 8, 9, 12))
print(s.size, s.ndim)
```
Output:
```
6 1
```
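
`size` counts the elements and `ndim` counts the axes. The sketch below compares both with `shape` for the same Series and for a one-column DataFrame built from it. The `to_frame` call and the column name `value` are illustrative additions.

```python
import pandas as pd

s = pd.Series((2, 4, 5, 8, 9, 12))
frame = s.to_frame(name='value')   # hypothetical one-column DataFrame

print(s.size, s.ndim, s.shape)
print(frame.size, frame.ndim, frame.shape)
```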

### Additional Tasks

#### 3. Create a series from a tuple which stores the prices of 6 items in a shop and display the last 5 rows.
```python
import pandas as pd
prices = (100, 150, 200, 250, 300, 350)
price_series = pd.Series(prices)
print(price_series.tail(5))
```
Output:
```
1    150
2    200
3    250
4    300
5    350
dtype: int64
```

#### 4. Create a series from a list that contains the number of students of the 6 classes XII G, XII H, XI E, XII P, XI F, and XI H. Use class names as indices. Display the rows except the last 3.
```python
import pandas as pd
students = [30, 25, 28, 32, 20, 27]
classes = ['XII G', 'XII H', 'XI E', 'XII P', 'XI F', 'XI H']
students_series = pd.Series(students, index=classes)
print(students_series[:-3])
```
Output:
```
XII G    30
XII H    25
XI E     28
dtype: int64
```
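
The slice `[:-3]` is one way to drop the last three rows. As an alternative sketch, `head()` and `tail()` also accept a negative count, meaning all rows except the last (or first) n. The student numbers are the same sample values used in this answer.

```python
import pandas as pd

students = [30, 25, 28, 32, 20, 27]
classes = ['XII G', 'XII H', 'XI E', 'XII P', 'XI F', 'XI H']
students_series = pd.Series(students, index=classes)

all_but_last_3 = students_series.head(-3)    # same rows as students_series[:-3]
all_but_first_2 = students_series.tail(-2)   # same rows as students_series[2:]

print(list(all_but_last_3.index))
print(list(all_but_first_2.index))
```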

#### 5. Create a series sales from the dictionary daily containing {'sun': 3400, 'mon': 4100, 'tue': 3900, 'wed': 5000, 'thu': 4500, 'fri': 3800, 'sat': 1200}. Display i) the series except top 2 rows, ii) the labels, iii) last 4 rows.
```python
import pandas as pd
daily = {'sun': 3400, 'mon': 4100, 'tue': 3900, 'wed': 5000, 'thu': 4500,
'fri': 3800, 'sat': 1200}
sales = pd.Series(daily)
print(sales[2:])
print(sales.index)
print(sales[-4:])
```
Output:
```
tue    3900
wed    5000
thu    4500
fri    3800
sat    1200
dtype: int64
Index(['sun', 'mon', 'tue', 'wed', 'thu', 'fri', 'sat'], dtype='object')
wed    5000
thu    4500
fri    3800
sat    1200
dtype: int64
```
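
When a Series is built from a dictionary, the keys become the index labels, so rows can also be picked by label. A brief sketch with the same `daily` dictionary follows. The use of `.loc`, `.iloc` and `idxmax` is an addition, not part of the question.

```python
import pandas as pd

daily = {'sun': 3400, 'mon': 4100, 'tue': 3900, 'wed': 5000, 'thu': 4500,
         'fri': 3800, 'sat': 1200}
sales = pd.Series(daily)

midweek = sales.loc['tue':'thu']   # label slice, 'thu' is included
last_four = sales.iloc[-4:]        # explicit positional slice

print(midweek)
print(last_four.sum())             # total of the last 4 days
print(sales.idxmax())              # label of the highest value
```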

#### 6. Create a dataframe from a tuple that contains the following lists - ['S1', 'Rani', 90], ['S2', 'Ann', 86], ['S3', 'Sam', 830], ['S4', 'Manu', 90] with row and column labels.
```python
import pandas as pd
data = (['S1', 'Rani', 90], ['S2', 'Ann', 86], ['S3', 'Sam', 830], ['S4', 'Manu',
90])
df = pd.DataFrame(data, columns=['ID', 'Name', 'Score'], index=['row1',
'row2', 'row3', 'row4'])
print(df)
```
Output:
```
      ID  Name  Score
row1  S1  Rani     90
row2  S2   Ann     86
row3  S3   Sam    830
row4  S4  Manu     90
```

#### 7. Create the following dataframe:


```python
import pandas as pd
data = {'subject': ['IT', 'Maths', 'Accounts', 'English', 'Physics'],
'teachers': [12, 10, 9, 11, 8], 'school': ['ABC', 'XYZ', 'MNO', 'PQR', 'UVW']}
df = pd.DataFrame(data, index=['I', 'II', 'III', 'IV', 'V'])
print(df)
```
Output:
```
      subject  teachers  school
I          IT        12     ABC
II      Maths        10     XYZ
III  Accounts         9     MNO
IV    English        11     PQR
V     Physics         8     UVW
```

#### 8. Write the code to create an empty series ser_A.


```python
import pandas as pd
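# Explicit dtype: the default dtype of an empty Series differs across pandas versions
ser_A = pd.Series(dtype='float64')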
print(ser_A)
```
Output:
```
Series([], dtype: float64)
```
