PANDAS IMPORTANT FUNCTIONS
follow for more @Lovee Kumar
TOP 15 IMPORTANT PANDAS FUNCTIONS
pd.read_csv()
pandas.read_csv() is used to read a CSV (Comma
Separated Values) file and convert it into a pandas
DataFrame.
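As a minimal sketch, the snippet below reads CSV data into a DataFrame. The column names and values are made up for illustration, and an in-memory string stands in for a file; passing a file path instead should behave the same way.

```python
import io

import pandas as pd

# Hypothetical CSV content (illustrative values, not from a real dataset).
csv_text = """Customer_Age,Country,Revenue
25,Canada,950
34,France,1200
41,Canada,400
"""

# read_csv accepts a file path or any file-like object and returns a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)
print(df.columns.tolist())
```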
df.info()
df.info() is used to display a summary of a data frame,
including the data types and the number of non-null
values in each column.
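A small sketch of df.info() on an assumed two-column DataFrame with one missing value. Calling df.info() with no arguments prints the summary directly; here it is written to a buffer only so the text can be inspected afterwards.

```python
import io

import pandas as pd

# Hypothetical data; None becomes a missing value (NaN) in the numeric column.
df = pd.DataFrame(
    {
        "Customer_Age": [25, 34, None],
        "Country": ["Canada", "France", "Canada"],
    }
)

# info() prints to stdout by default; buf= redirects the summary to a buffer.
buffer = io.StringIO()
df.info(buf=buffer)
summary = buffer.getvalue()

print(summary)
```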
df.describe()
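df.describe() method in Pandas is used to generate
descriptive statistics of the columns of a DataFrame.
By default it summarizes the numeric columns with the
count, mean, standard deviation, minimum, maximum, and
the 25th, 50th (median), and 75th percentiles. This is
useful for getting a quick overview of the central
tendency and dispersion of the data. The range and
interquartile range can be derived from these values;
the mode is not part of the numeric summary.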
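The sketch below, using an assumed Revenue column, shows which statistics describe() returns and how the range and interquartile range could be derived from them.

```python
import pandas as pd

# Hypothetical numeric column.
df = pd.DataFrame({"Revenue": [100, 200, 300, 400, 500]})

# Rows of the result: count, mean, std, min, 25%, 50%, 75%, max.
stats = df.describe()
print(stats)

# Range and interquartile range are not reported directly,
# but they follow from the min/max and quartile rows.
value_range = stats.loc["max", "Revenue"] - stats.loc["min", "Revenue"]
iqr = stats.loc["75%", "Revenue"] - stats.loc["25%", "Revenue"]
print(value_range, iqr)
```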
df.assign()
df.assign() method in Pandas is used to add new columns
to a DataFrame. It returns a new DataFrame and leaves
the original unchanged.
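A minimal sketch of assign(), assuming hypothetical Unit_Price and Quantity columns. The new column is computed from the existing ones.

```python
import pandas as pd

# Hypothetical order data.
df = pd.DataFrame({"Unit_Price": [10.0, 20.0], "Quantity": [3, 5]})

# The keyword name becomes the new column name; the lambda receives the DataFrame.
df_with_revenue = df.assign(Revenue=lambda d: d["Unit_Price"] * d["Quantity"])

print(df_with_revenue)
print(df.columns.tolist())  # the original DataFrame is unchanged
```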
df.sample()
df.sample() function returns a random sample of rows
from a DataFrame. By default, it returns one random
row, but the number of rows can be specified as an
argument. For example, df.sample(5) returns 5 random
rows from the DataFrame. This function can be useful
for getting a quick look at the kinds of values in a
large dataset.
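The following sketch samples from an assumed 100-row DataFrame. The random_state argument is optional and is used here only to make the draw repeatable.

```python
import pandas as pd

# Hypothetical DataFrame with 100 rows.
df = pd.DataFrame({"Order_ID": range(1, 101)})

one_row = df.sample()                                # one random row (default)
five_rows = df.sample(5, random_state=42)            # five random rows
ten_percent = df.sample(frac=0.1, random_state=42)   # a fraction of the rows

print(len(one_row), len(five_rows), len(ten_percent))
```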
df.head()
df.head() function returns the first n (default 5) rows of
a DataFrame. The number of rows n can be specified as
an argument, for example df.head(10) returns the first
10 rows of the DataFrame.
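A short sketch of head() on an assumed 20-row DataFrame:

```python
import pandas as pd

# Hypothetical DataFrame with 20 rows.
df = pd.DataFrame({"Order_ID": range(1, 21)})

first_five = df.head()    # default: first 5 rows
first_ten = df.head(10)   # first 10 rows

print(first_five["Order_ID"].tolist())
print(len(first_ten))
```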
df.tail()
The df.tail() function returns the last n (default 5) rows
of a DataFrame. The number of rows n can be specified
as an argument, for example df.tail(10) returns the last
10 rows of the DataFrame.
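The same idea for tail(), again on an assumed 20-row DataFrame:

```python
import pandas as pd

# Hypothetical DataFrame with 20 rows.
df = pd.DataFrame({"Order_ID": range(1, 21)})

last_five = df.tail()    # default: last 5 rows
last_ten = df.tail(10)   # last 10 rows

print(last_five["Order_ID"].tolist())
print(len(last_ten))
```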
df.drop()
df.drop() method in pandas is used to remove rows or
columns from a DataFrame by label.
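A minimal sketch of drop(). The Customer_Age and Age_Group column names follow the slide's output note; the Revenue column and all values are assumptions added for illustration.

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame(
    {
        "Customer_Age": [25, 34, 41],
        "Age_Group": ["Youth", "Adults", "Adults"],
        "Revenue": [950, 1200, 400],
    }
)

# Drop columns by name; a new DataFrame is returned.
without_cols = df.drop(columns=["Customer_Age", "Age_Group"])

# Drop a row by its index label.
without_first_row = df.drop(index=0)

print(without_cols.columns.tolist())
print(without_first_row.index.tolist())
```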
OUTPUT: The Customer_Age and Age_Group columns are dropped from the DataFrame.
df.dropna()
df.dropna() method in pandas is used to remove rows
that contain missing values (i.e. NaN) from a DataFrame.
Columns can be dropped instead by passing axis=1.
This can be useful for cleaning data before analysis or
modeling.
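The sketch below applies dropna() to an assumed DataFrame with two missing values, showing the default row-wise behavior along with the axis and subset options.

```python
import pandas as pd

# Hypothetical data; None marks a missing value.
df = pd.DataFrame(
    {
        "Customer_Age": [25, None, 41],
        "Country": ["Canada", "France", None],
        "Revenue": [950, 1200, 400],
    }
)

rows_dropped = df.dropna()                        # drop rows with any missing value
cols_dropped = df.dropna(axis=1)                  # drop columns with any missing value
age_required = df.dropna(subset=["Customer_Age"])  # only check one column

print(rows_dropped.index.tolist())
print(cols_dropped.columns.tolist())
print(age_required.index.tolist())
```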
OUTPUT: the DataFrame before and after df.dropna()
df.query()
df.query() is used to filter rows of a DataFrame based
on a condition written as a string expression.
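A small sketch of query() with assumed column names. Inside the expression, column names are used directly and the @ prefix refers to a Python variable.

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame(
    {
        "Country": ["Canada", "France", "Canada", "France"],
        "Customer_Age": [25, 34, 41, 29],
        "Revenue": [950, 1200, 400, 300],
    }
)

# Single condition.
high_revenue = df.query("Revenue > 500")

# Combined condition that references a local variable with @.
min_age = 30
older_canadians = df.query("Country == 'Canada' and Customer_Age >= @min_age")

print(high_revenue["Revenue"].tolist())
print(older_canadians.index.tolist())
```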
df.sort_values()
df.sort_values() method in Pandas is used to sort the
rows of a DataFrame based on the values of columns.
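As an illustration, the sketch below sorts an assumed DataFrame first by one column and then by two columns with different sort directions.

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame(
    {
        "Country": ["Canada", "France", "Canada", "France"],
        "Revenue": [950, 1200, 400, 300],
    }
)

# Ascending sort on a single column (the default direction).
by_revenue = df.sort_values("Revenue")

# Sort by Country ascending, then Revenue descending within each country.
by_country_then_revenue = df.sort_values(
    ["Country", "Revenue"], ascending=[True, False]
)

print(by_revenue["Revenue"].tolist())
print(by_country_then_revenue["Revenue"].tolist())
```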
df.groupby().sum()
df.groupby() is used to group a DataFrame by one or
more columns. Applying an aggregation such as sum() to
the groups returns a new DataFrame that has the grouping
columns as the index and the other columns aggregated
per group.
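A minimal sketch of groupby() followed by sum(), using assumed Country and Revenue columns:

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame(
    {
        "Country": ["Canada", "France", "Canada", "France"],
        "Revenue": [950, 1200, 400, 300],
    }
)

# Group by Country and add up Revenue within each group.
totals = df.groupby("Country").sum()

print(totals)
print(totals.index.name)  # the grouping column becomes the index
```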
df.merge()
df.merge() is used to combine two DataFrames
into a single DataFrame. The function works by joining
the DataFrames on one or more common columns,
similar to a SQL JOIN operation.
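The sketch below joins two assumed tables on a shared Customer_ID column. The default is an inner join; how="left" keeps every row of the left DataFrame.

```python
import pandas as pd

# Hypothetical tables sharing a Customer_ID column.
orders = pd.DataFrame(
    {"Customer_ID": [1, 2, 2, 3], "Revenue": [950, 1200, 400, 300]}
)
customers = pd.DataFrame(
    {"Customer_ID": [1, 2, 4], "Country": ["Canada", "France", "Spain"]}
)

# Inner join: only Customer_IDs present in both tables are kept.
inner = orders.merge(customers, on="Customer_ID")

# Left join: all orders are kept; unmatched customers get NaN for Country.
left = orders.merge(customers, on="Customer_ID", how="left")

print(len(inner), len(left))
```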
df.rename()
df.rename() is used to rename one or more columns in
a data frame.
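A short sketch of rename(), mapping assumed abbreviated column names to longer ones:

```python
import pandas as pd

# Hypothetical data with abbreviated column names.
df = pd.DataFrame({"cust_age": [25, 34], "rev": [950, 1200]})

# The dict maps old column names to new ones; a new DataFrame is returned.
renamed = df.rename(columns={"cust_age": "Customer_Age", "rev": "Revenue"})

print(renamed.columns.tolist())
```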
df.to_csv()
df.to_csv() is used to save (export) a DataFrame to
a CSV (Comma Separated Values) file.
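As a sketch, the snippet below writes an assumed DataFrame to a CSV file in a temporary directory and reads it back. The file name is hypothetical, and index=False is an optional choice that leaves the row index out of the file.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data.
df = pd.DataFrame({"Customer_Age": [25, 34], "Revenue": [950, 1200]})

# Write to a file in a temporary directory, without the row index.
path = os.path.join(tempfile.mkdtemp(), "customers.csv")
df.to_csv(path, index=False)

# Reading the file back gives the same data.
round_trip = pd.read_csv(path)
print(round_trip)

# With no path, to_csv returns the CSV text as a string.
csv_text = df.to_csv(index=False)
print(csv_text)
```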