Worksheet 2 from Chapter 2: DataFrame
Fill in the blanks:
1. The _________ function is used to display the top n rows of a DataFrame.
2. The _________ function is used to display the bottom n rows of a DataFrame.
3. _________ refers to passing True and False values as an index in DataFrames.
4. To pass index labels, the _________ keyword is used in the pd.DataFrame() function.
5. By default, the head()/tail() functions display ____ top/bottom rows.
Answers to the fill in the blanks (DataFrame functions):
1. head()
2. tail()
3. Boolean indexing
4. index
5. 5 (five)
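The answers above can be tried out on a small DataFrame. The following is only an illustrative sketch: the data, the row labels, and the variable names are made up for the example and do not come from the worksheet.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
data = {"Name": ["Asha", "Bina", "Chirag", "Dev", "Esha", "Farid"],
        "Fees": [1200, 1500, 1100, 1800, 1300, 1600]}

# The index keyword supplies the row labels.
df = pd.DataFrame(data, index=["r1", "r2", "r3", "r4", "r5", "r6"])

top_default = df.head()   # no argument: the default number of rows
top_two = df.head(2)      # first 2 rows
bottom_two = df.tail(2)   # last 2 rows

# Boolean indexing: one True/False value per row selects the rows to keep.
mask = [True, False, True, False, False, True]
picked = df[mask]

print(top_default)
print(top_two)
print(bottom_two)
print(picked)
```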
Consider this DataFrame for all the questions given below (columns: Name, City, Email, Fees):
MCQs
1. Choose the correct statement to rename the City column to Location using the rename() function:
a. df.rename(columns={'City':'Location'})
b. df.rename(columns={'City'='Location'})
c. df.rename('City'='Location')
d. df.rename(df.columns('City','Location'))
2. Which of the following statement(s) is/are correct with respect to using the df.columns property to rename columns?
1. All columns must be specified
2. Columns must be in the form of a list
3. Old column names are not required
4. Columns can be specified with column numbers
a. Only 1 is correct
b. 1, 2 and 3 are correct
c. 1 and 3 are correct
d. All of them are correct
3. The df.index property can be used to
a. rename rows
b. rename columns
c. rename both rows and columns
d. None of these
4. To display 2 rows from the top of the DataFrame, which of the following statements is correct?
a. df.head()=2
b. df.head(n=2)
c. df.head(range(2))
d. All of the above
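The renaming techniques asked about in these MCQs can be compared side by side. This is a minimal sketch: the column names follow the worksheet's DataFrame, but the row values, the new names, and the variable names are hypothetical.

```python
import pandas as pd

# Column names follow the worksheet; the row values are hypothetical.
df = pd.DataFrame({"Name": ["Asha", "Bina", "Chirag"],
                   "City": ["Surat", "Pune", "Delhi"],
                   "Email": ["asha@example.com", "bina@example.com",
                             "chirag@example.com"],
                   "Fees": [1200, 1500, 1100]})

# rename() takes a mapping of old name -> new name and returns a new DataFrame.
renamed = df.rename(columns={"City": "Location"})

# Assigning to df.columns replaces every column name at once,
# so the new sequence needs one entry per column.
df2 = df.copy()
df2.columns = ["Student", "Town", "Mail", "Amount"]

# Assigning to df.index relabels the rows.
df2.index = ["s1", "s2", "s3"]

print(renamed.columns.tolist())
print(df2.head(n=2))
```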
Error-based Questions – DataFrame functions
1. Find the errors in the given code fragments:
df.DataFrame({'S.NO':[1,2,3],'Name':['Sapan','Vivek','Vishal']})
df.rename[{'S.No':'SNO','Name':'Sname'}]
df.index=(1,2,3)
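Errors and corrections:
Line 1: The DataFrame must be created with pd.DataFrame() and assigned to the object df.
Correction: df = pd.DataFrame({'S.NO':[1,2,3],'Name':['Sapan','Vivek','Vishal']})
Line 2: The columns keyword is missing, and rename is a function, so the square brackets need to be replaced with parentheses. The key must also match the existing label 'S.NO' exactly, because column labels are case-sensitive.
Correction: df.rename(columns={'S.NO':'SNO','Name':'Sname'})
Line 3: The worksheet expects the index labels as a list rather than a tuple (pandas itself accepts either sequence).
Correction: df.index=[1,2,3]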
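Putting the three corrections for the first fragment together gives a runnable version. This is a sketch, not the worksheet's official answer: it assumes the intended column key is 'S.NO' (matching the DataFrame as created), and it assigns the result of rename() back to df, which is an addition made here because rename() returns a new DataFrame.

```python
import pandas as pd

# Line 1: build the DataFrame with pd.DataFrame() and keep it in df.
df = pd.DataFrame({"S.NO": [1, 2, 3], "Name": ["Sapan", "Vivek", "Vishal"]})

# Line 2: rename() is called with parentheses and the columns keyword.
# Assumption: the key is written as 'S.NO' to match the column created above.
# The result is assigned back because rename() returns a new DataFrame.
df = df.rename(columns={"S.NO": "SNO", "Name": "Sname"})

# Line 3: one label per row.
df.index = [1, 2, 3]

print(df)
```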
2.
df.DataFrame({'S.NO':[1,2,3],'Name':['Sapan','Vivek','Vishal']})
df.columns=('SNO','Name')
df.index=[1 to 3]
3.
df.DataFrame({'S.NO':[1,2,3],'Name':['Sapan','Vivek','Vishal']})
df.head(3)
df.tail(3)
Important DataFrame Questions
#1 Which of the following is the correct syntax to select or access columns from the
dataframe using column names?
a) df(col1,col2,…,coln)
b) df[[col1,col2,…,coln]]
c) df[col1,col2,…,coln]
d) df{col1:col2:…,:coln}
#2 Ms. Kavitha wants to print a single column from the dataframe, which of the
following is correct syntax for her?
a) df(col)
b) df<col>
c) df[col]
d) df{df:col}
#3 You cannot print a column using dot notation when the column name contains a space. (True/False)
#4 Observe the following dataframe code:
dt=({'Name':['Akshit','Bharat','Chetan','Dhaval','Gaurang'],
'InternalMarks':[18,19,20,18,19],
'AnnualExam':[76,78,80,76,73]})
df=pd.DataFrame(dt)
Which of the following code options will print the names and annual exam marks of the students?
Option 1: print(df[['Name','AnnualExam']])
Option 2: Using .loc:
print(df.loc[:, ['Name','AnnualExam']])
Option 3: Using column access:
print(df['Name'])
print(df['AnnualExam'])
Option 4: print(df.loc[:,df.columns!='InternalMarks'])
#5 What will be the output of the following code?
dt={'Name':['Akshit','Bharat','Chetan','Dhaval','Gaurang'],
'InternalMarks':[18,19,20,18,19],
'AnnualExam':[76,78,80,76,73]}
df=pd.DataFrame(dt)
print(df.iloc[0:2,0:2])
Answer:
     Name  InternalMarks
0  Akshit             18
1  Bharat             19
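The column-access forms from questions #1 to #5 can be checked against the same marks data. The sketch below reuses the worksheet's dictionary; the "Total Marks" column and the variable names are hypothetical, added only to illustrate a column name that contains a space.

```python
import pandas as pd

dt = {"Name": ["Akshit", "Bharat", "Chetan", "Dhaval", "Gaurang"],
      "InternalMarks": [18, 19, 20, 18, 19],
      "AnnualExam": [76, 78, 80, 76, 73]}
df = pd.DataFrame(dt)

single = df["Name"]                             # one column name -> Series
pair = df[["Name", "AnnualExam"]]               # list of names -> DataFrame
same_pair = df.loc[:, ["Name", "AnnualExam"]]   # same selection with .loc
block = df.iloc[0:2, 0:2]                       # first 2 rows, first 2 columns

# Hypothetical column whose name contains a space: bracket access works,
# while df.Total Marks would not even be valid Python syntax.
df["Total Marks"] = df["InternalMarks"] + df["AnnualExam"]
total = df["Total Marks"]

print(pair.equals(same_pair))
print(block)
print(total.tolist())
```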
Consider the following dataframe and do as directed:
import pandas as pd
d={'Mouse':[150,200,300,400],
'Keyboard':[180,200,190,300],
'Scanner':[200,280,330,450]}
df=pd.DataFrame(d,index=['Jan','Feb','March','April'])
A. Write code to access the data of the Mouse and Scanner columns.
print(df[['Mouse','Scanner']])
B. Write code to access the data of the Keyboard column using dot notation and the column name.
print(df.Keyboard)
C. Write code to access the data of the Scanner column using loc[].
print(df.loc[:,'Scanner'])
D. Write code to access the data of all columns where the Mouse value is more than 200.
print(df[df['Mouse']>200])
E. Write code to access the columns at positions 0 and 2.
print(df.iloc[:,[0,2]])
F. Write code to access the data of the Jan and March rows for the Scanner and Keyboard columns.
print(df.loc[['Jan','March'],['Scanner','Keyboard']])
Output-based questions:
Consider the above DataFrame and predict the output:
a) print(df.iloc[1,2]) → 280
b) print(df.loc['Feb','Scanner']) → 280
c) df1=df[df['Keyboard']>190]
print(df1[['Mouse','Scanner']]) →
       Mouse  Scanner
Feb      200      280
April    400      450
d) print(df.iat[2,1]) → 190
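The predicted outputs can be confirmed by running the statements on the same data. This is a small verification sketch; the variable names are chosen here for readability and are not part of the worksheet.

```python
import pandas as pd

d = {"Mouse": [150, 200, 300, 400],
     "Keyboard": [180, 200, 190, 300],
     "Scanner": [200, 280, 330, 450]}
df = pd.DataFrame(d, index=["Jan", "Feb", "March", "April"])

by_position = df.iloc[1, 2]            # row position 1 (Feb), column position 2 (Scanner)
by_label = df.loc["Feb", "Scanner"]    # the same cell, addressed by labels
df1 = df[df["Keyboard"] > 190]         # rows where Keyboard exceeds 190
subset = df1[["Mouse", "Scanner"]]
single_cell = df.iat[2, 1]             # row position 2 (March), column position 1 (Keyboard)

print(by_position, by_label, single_cell)
print(subset)
```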
What is the difference between loc[] and iloc[]? Explain with an example.
loc[] → Label-based indexing
It is used to access rows/columns using labels (names/index values).
Syntax:
df.loc[row_label , column_label]
import pandas as pd
df = pd.DataFrame({ 'Name':['Akshit','Bharat','Chetan'], 'Marks':[85,90,95]},
index=['a','b','c'])
print(df.loc['b', 'Name']) # Access row with label 'b' and column 'Name'
iloc[] → Integer-location based indexing
It is used to access rows/columns using their integer position (0,1,2,...).
Syntax:
df.iloc[row_index , column_index]
print(df.iloc[1, 0]) # Access row at position 1 and column at position 0
Feature                  loc[] (Label-based)    iloc[] (Integer-based)
Selection by             Row/Column labels      Row/Column positions
Example                  df.loc['b','Name']     df.iloc[1,0]
Inclusivity in slicing   End label included     End index excluded
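The slicing difference in the last row of the comparison is easiest to see in code. The sketch below reuses the small Name/Marks DataFrame from the explanation; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Akshit", "Bharat", "Chetan"],
                   "Marks": [85, 90, 95]},
                  index=["a", "b", "c"])

label_slice = df.loc["a":"b"]    # label slice: the end label 'b' is included
position_slice = df.iloc[0:1]    # position slice: the end position 1 is excluded

print(label_slice)
print(position_slice)
```

Here the loc[] slice keeps two rows while the iloc[] slice keeps one, even though both appear to stop at the second row.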
Deleting rows and columns in a DataFrame
1. Which of the following methods is used to delete a row/column from a DataFrame?
1. delete()
2. remove()
3. discard()
4. drop()
2. Consider the following DataFrame:
dt = {'Year1':[1200,1220,1500], 'Year2':[1800,1700,1400]}
df = pd.DataFrame(dt, index=['T1','T2','T3'])
Which of the following is correct to delete the T2 index record from the DataFrame?
1. df.drop(df.index['T2'])
2. df.drop(df.idx='T2')
3. df.drop(index='T2')
4. df.drop(idx='T2')
3. By default, a delete operation on a DataFrame returns a new DataFrame and leaves the original unchanged. Which parameter will you use to change the original DataFrame instead?
1. intact = True
2. inplace = True
3. update = True
4. replace = True
4. Which parameter is added to the drop() method to delete columns?
1. column = n (Where n is column number)
2. col = n (Where n is column number)
3. axis = 0
4. axis = 1
Answers:
1. Option 4: drop()
2. Option 3: df.drop(index='T2')
3. Option 2: inplace = True
4. Option 4: axis = 1
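The drop() answers can be tried on a small DataFrame. This is a hedged sketch: the Year1/Year2 values echo the question, but the three row labels T1 to T3 and the variable names are assumptions made for the example.

```python
import pandas as pd

# Hypothetical data in the spirit of the question above.
dt = {"Year1": [1200, 1220, 1500], "Year2": [1800, 1700, 1400]}
df = pd.DataFrame(dt, index=["T1", "T2", "T3"])

without_t2 = df.drop(index="T2")            # returns a new DataFrame; df is unchanged
without_year2 = df.drop("Year2", axis=1)    # axis=1 drops a column instead of a row

df.drop("T3", inplace=True)                 # modifies df itself and returns None

print(without_t2.index.tolist())
print(without_year2.columns.tolist())
print(df.index.tolist())
```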