Topic: Accessing Data from DataFrame
Example
import pandas as pd
data = { "Name":["Ravi","Vinay","Meera"],
"Age":[17,18,19],
"Marks":[98,95,92]
}
df = pd.DataFrame(data)
print(df)
* From a DataFrame object, you can select the rows and columns you need.
1. Selecting or accessing a single column
You can select a column from a DataFrame using the following methods:
Using square brackets
<DataFrame object> [<column name>]
Or
Using dot notation
<DataFrame object>.<Column name>
Example
import pandas as pd
data = { "Name":["Ravi","Vinay","Meera"],
"Age":[17,18,19],
"Marks":[98,95,92]
}
df = pd.DataFrame(data)
print(df)
print(df["Age"])
print(df.Age)
print(df.Marks)
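The two notations return the same object, which the short sketch below checks. It also shows a case where only square brackets work: a column whose name contains a space. The "Marks Obtained" column is hypothetical and is added here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ravi", "Vinay", "Meera"],
    "Age": [17, 18, 19],
    "Marks": [98, 95, 92],
})

# Both notations return the same pandas Series.
age_brackets = df["Age"]
age_dot = df.Age
print(type(age_brackets).__name__)
print(age_brackets.equals(age_dot))

# Hypothetical column with a space in its name (not part of the original data).
df["Marks Obtained"] = df["Marks"]
obtained = df["Marks Obtained"]   # square brackets work for any column name
# df.Marks Obtained               # not valid Python, so dot notation cannot be used here
print(obtained.tolist())
```

Square brackets are therefore the safer default; dot notation is a shortcut for simple column names.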
2. Selecting or accessing multiple columns: You can select multiple columns from a
DataFrame using the following methods:
Using square brackets: pass a list of column names inside the square brackets.
print(df[["Age", "Marks"]])
Using the loc indexer: the loc indexer selects rows and columns by label.
To select multiple columns, use df.loc[:, ["Age", "Marks"]].
Here, : specifies all rows and ["Age", "Marks"] specifies the columns.
print(df.loc[:, ["Age", "Marks"]])
Example
import pandas as pd
data = { "Name":["Ravi","Vinay","Meera"],
"Age":[17,18,19],
"Marks":[98,95,92]
}
df = pd.DataFrame(data)
print(df)
print(df[["Name","Age"]])
print(df.loc[:,['Age','Name']])
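Note: Dot notation (df.column_name) gives access to only one column at a time and cannot
be used to select multiple columns.
For example, df.Age.Marks raises an AttributeError, because df.Age is a Series that has no Marks attribute. Use square brackets or loc to select multiple columns.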
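The following sketch, using the same sample data, shows what each form of selection returns. A single column name gives a Series, a list of names gives a DataFrame, and the columns come back in the order given in the list.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ravi", "Vinay", "Meera"],
    "Age": [17, 18, 19],
    "Marks": [98, 95, 92],
})

one_col = df["Age"]                       # single name -> Series
one_col_frame = df[["Age"]]               # list with one name -> DataFrame
two_cols = df.loc[:, ["Marks", "Name"]]   # columns follow the order of the list

print(type(one_col).__name__)
print(type(one_col_frame).__name__)
print(list(two_cols.columns))
```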
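3. Subset from a DataFrame using rows/columns:
You can subset a DataFrame using row and column labels or integer positions.
Using loc (label-based):
print(df.loc[:, ["Age", "Marks"]])      # all rows, columns Age and Marks
print(df.loc[1:2, ["Age", "Marks"]])    # rows labelled 1 to 2 (both included), columns Age and Marks
Using iloc (integer location):
print(df.iloc[:, [1, 2]])      # all rows, columns at positions 1 and 2
print(df.iloc[0:3, [1, 2]])    # rows at positions 0 to 2, columns at positions 1 and 2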
Example
import pandas as pd
data = { "Name":["Ravi","Vinay","Meera"],
"Age":[17,18,19],
"Marks":[98,95,92]
}
df = pd.DataFrame(data)
print(df)
print(df.loc[:,['Name','Age']])
print(df.loc[1:2,['Name','Age']])
print(df.iloc[0:3,[0,2]])
Specific row and column using loc:
print(df.loc[1, "Age"])
Specific row and column using iloc:
print(df.iloc[1, 1])
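One difference between loc and iloc is easy to miss: a slice in loc includes its end label, while a slice in iloc stops before its end position, like ordinary Python slicing. A minimal sketch with the same sample data and the default integer index:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ravi", "Vinay", "Meera"],
    "Age": [17, 18, 19],
    "Marks": [98, 95, 92],
})

# loc slices by label and includes both ends: rows labelled 1 and 2.
by_label = df.loc[1:2, ["Name", "Age"]]

# iloc slices by position and excludes the end: only the row at position 1.
by_position = df.iloc[1:2, [0, 1]]

print(by_label)
print(by_position)

# A single row and a single column give one value.
cell_by_label = df.loc[1, "Age"]
cell_by_position = df.iloc[1, 1]
print(cell_by_label, cell_by_position)
```

With the default index the labels and positions are the same numbers, so the two cell lookups return the same value.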
4. Selecting individual values:
You can select a single value from a DataFrame using the following methods:
Using loc and iloc: you can access a specific cell value by specifying the row and
column label (loc) or the row and column position (iloc).
df.set_index("Name", inplace=True)  # Name becomes the row index; Age and Marks are now at column positions 0 and 1
print(df.loc["Vinay", "Age"])
print(df.iloc[1, 0])
Example
import pandas as pd
data = { "Name":["Ravi","Vinay","Meera"],
"Age":[17,18,19],
"Marks":[98,95,92]
}
df = pd.DataFrame(data)
print(df)
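df.set_index('Name', inplace=True)  # Name becomes the row index
print(df.loc['Meera','Age'])   # the old integer label 2 no longer exists; use the name
print(df.loc['Ravi','Age'])
print(df.iloc[0,1])            # only Age (position 0) and Marks (position 1) remain as columns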
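After set_index("Name"), the row labels are the names and the Name column is no longer counted among the columns. The sketch below illustrates the two mistakes this commonly causes: looking up an old integer label with loc, and using a column position that no longer exists with iloc. It uses the non-inplace form of set_index so that the original df stays unchanged; the variable name indexed is chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ravi", "Vinay", "Meera"],
    "Age": [17, 18, 19],
    "Marks": [98, 95, 92],
})

# Without inplace=True, set_index returns a new DataFrame.
indexed = df.set_index("Name")

print(indexed.loc["Ravi", "Age"])   # look up by row label and column label
print(indexed.iloc[0, 1])           # row position 0, column position 1 (Marks)

# The old integer label 2 is not a row label any more.
try:
    indexed.loc[2, "Age"]
    label_found = True
except KeyError:
    label_found = False

# Only two columns remain, so column position 2 is out of range.
try:
    indexed.iloc[0, 2]
    position_found = True
except IndexError:
    position_found = False

print(label_found, position_found)
```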
Using the at and iat attributes:
The at attribute accesses a single value by row and column label.
The iat attribute accesses a single value by integer position.
print(df.at["Vinay", "Age"])
print(df.iat[1, 0])
Example
import pandas as pd
data = { "Name":["Ravi","Vinay","Meera"],
"Age":[17,18,19],
"Marks":[98,95,92]
}
df = pd.DataFrame(data)
print(df)
df.set_index('Name',inplace=True)
print(df.at["Vinay","Marks"])
print(df.iat[1,0])
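For a single cell, at and iat return the same value as loc and iloc given the same label or position. The sketch below checks this on the sample data; it is a small illustration, not a performance comparison.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ravi", "Vinay", "Meera"],
    "Age": [17, 18, 19],
    "Marks": [98, 95, 92],
})
df.set_index("Name", inplace=True)

# Label-based access to one cell.
marks_at = df.at["Vinay", "Marks"]
marks_loc = df.loc["Vinay", "Marks"]

# Position-based access to one cell (row 1, column 0 is Vinay's Age).
age_iat = df.iat[1, 0]
age_iloc = df.iloc[1, 0]

print(marks_at, marks_loc)
print(age_iat, age_iloc)
```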
5. Selecting based on Boolean conditions:
Sometimes you need to select rows/columns from a DataFrame based on a condition.
When you compare a DataFrame (or one of its columns) with a value, pandas evaluates the comparison
for each element and gives you True or False for each one.
print(df["Age"] > 17)
print(df[df["Age"] > 17])
The second statement keeps only the rows where the Age column is greater than 17.
Example
import pandas as pd
data = { "Name":["Ravi","Vinay","Meera"],
"Age":[17,18,19],
"Marks":[98,95,92]
}
df = pd.DataFrame(data)
print(df)
print(df["Age"]>17)
print(df[df["Age"]>17])
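The comparison itself produces a Boolean Series (often called a mask), and that mask can be stored in a variable and reused. The sketch below shows the mask, the filtered rows, and the same mask passed to loc to pick specific columns at the same time. The variable names are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ravi", "Vinay", "Meera"],
    "Age": [17, 18, 19],
    "Marks": [98, 95, 92],
})

# Comparing a column with a value gives one True/False per row.
mask = df["Age"] > 17
print(mask.tolist())

# Passing the mask in square brackets keeps the rows marked True.
older = df[mask]
print(older)

# With loc, the mask selects rows and the list selects columns.
names_marks = df.loc[mask, ["Name", "Marks"]]
print(names_marks)

# Comparing several columns at once gives a DataFrame of True/False values.
above_90 = df[["Age", "Marks"]] > 90
print(above_90)
```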