Pandas in Python — 30 Essential Interview Questions & Answers
Q1: What is Pandas in Python?
A: Pandas is an open-source Python library for data manipulation and analysis. It provides two
main labeled data structures, Series and DataFrame, for working with structured (tabular) data.
Q2: What is the difference between a Series and a DataFrame?
A:
A Series is a one-dimensional labeled array.
A DataFrame is a two-dimensional labeled data structure (like a table with rows and
columns).
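A minimal sketch of the difference, using made-up sample data (the variable names s and df are illustrative only):

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])          # 1-D: values plus labels
df = pd.DataFrame({"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]})   # 2-D: rows and columns

print(s.ndim, df.ndim)   # number of dimensions of each object
print(type(df["x"]))     # each DataFrame column is itself a Series
```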
Q3: How do you create a DataFrame from a dictionary in Pandas?
A: Use pd.DataFrame(dictionary_name), where the dictionary's keys are the column names and its
values are equal-length lists (or arrays) holding each column's data.
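For example, assuming a small hypothetical dictionary of names and ages:

```python
import pandas as pd

# Keys become column names; each list becomes that column's values.
data = {"name": ["Ana", "Ben", "Cy"], "age": [28, 35, 41]}
df = pd.DataFrame(data)

print(df)
print(df.index)  # no index was given, so pandas assigns a default one starting at 0
```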
Q4: What is the default index in a Pandas DataFrame?
A: The default index is a RangeIndex: integers starting from 0 and increasing by 1 for each row.
Q5: How can you check the first 5 rows of a DataFrame?
A: Use the .head() method. df.head() returns the first 5 rows by default; pass a number, as in
df.head(10), to change that.
Q6: How do you get a statistical summary of the data in Pandas?
A: Use df.describe(). For numeric columns it returns the count, mean, standard deviation, min,
max, and quartiles (25%, 50%, 75%).
Q7: How can you get the column names of a DataFrame?
A: Use df.columns, which returns an Index of column labels. Use df.columns.tolist() if you need
a plain Python list.
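The inspection calls from Q5 to Q7 can be sketched together. The six-row DataFrame below is made up for illustration:

```python
import pandas as pd

# Hypothetical data: six rows, so head() visibly returns fewer rows than the frame has.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy", "Dee", "Eli", "Fay"],
    "age": [20, 25, 30, 35, 40, 45],
})

first_five = df.head()            # first 5 rows by default
first_two = df.head(2)            # first 2 rows
stats = df.describe()             # summary of the numeric column(s) only
column_names = df.columns.tolist()

print(stats)
print(column_names)
```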
Q8: How do you select a single column from a DataFrame?
A: Use df['column_name'] to select a specific column.
Q9: How do you filter rows based on a condition?
A: Example: df[df['age'] > 30] filters rows where the age is greater than 30.
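A short sketch of boolean filtering with hypothetical data. The second filter shows one common way to combine conditions: use & (and) or | (or), with each condition wrapped in parentheses.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy"],
    "age": [25, 35, 45],
    "city": ["Pune", "Delhi", "Pune"],
})

over_30 = df[df["age"] > 30]   # the comparison yields a boolean mask, one value per row

# Combine conditions with & or |; parentheses around each condition are required.
pune_over_30 = df[(df["age"] > 30) & (df["city"] == "Pune")]

print(over_30)
print(pune_over_30)
```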
Q10: How do you check for missing values in a DataFrame?
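A: Use df.isnull() or df.isna() (they are equivalent). Both return a DataFrame of True/False
values; chain .sum(), as in df.isna().sum(), to count missing values per column.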
Q11: How can you fill missing values with a constant in Pandas?
A: Use df.fillna(0) to replace all missing values with 0. It returns a new DataFrame, so assign
the result (df = df.fillna(0)) to keep it.
Q12: How do you drop rows with missing data?
A: Use df.dropna() to remove rows with any missing value.
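The three missing-value answers (Q10 to Q12) can be sketched on one small DataFrame. The data is hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, 6.0]})

missing_per_column = df.isna().sum()   # count of missing values in each column
filled = df.fillna(0)                  # new DataFrame with NaN replaced by 0
dropped = df.dropna()                  # new DataFrame without rows containing any NaN

print(missing_per_column)
print(filled)
print(dropped)
```

Neither fillna() nor dropna() changes df here; each returns a new DataFrame.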
Q13: How can you rename columns in a DataFrame?
A: Use df.rename(columns={'old_name': 'new_name'}). It returns a new DataFrame, so assign the
result to keep the change.
Q14: How do you sort a DataFrame by a specific column?
A: Use df.sort_values(by='column_name'). The sort is ascending by default; pass ascending=False
for descending order.
Q15: How do you reset the index of a DataFrame?
A: Use df.reset_index(drop=True). With drop=True the old index is discarded; without it, the old
index is kept as a new column.
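A sketch chaining the rename, sort, and reset steps from Q13 to Q15, on a hypothetical one-column DataFrame:

```python
import pandas as pd

# Hypothetical data, deliberately out of order.
df = pd.DataFrame({"old_name": [3, 1, 2]})

renamed = df.rename(columns={"old_name": "new_name"})
sorted_df = renamed.sort_values(by="new_name")   # rows keep their original index labels
clean = sorted_df.reset_index(drop=True)         # relabel rows 0, 1, 2 and drop the old index

print(sorted_df)
print(clean)
```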
Q16: What is the use of .iloc[] in Pandas?
A: .iloc[] is used to access rows and columns by position (integer index).
Q17: What is the use of .loc[] in Pandas?
A: .loc[] is used to access rows and columns by label or by a boolean mask (a condition).
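A sketch contrasting .iloc[] and .loc[], assuming a hypothetical DataFrame with string row labels so that positions and labels differ:

```python
import pandas as pd

# Hypothetical data with string labels as the index.
df = pd.DataFrame({"age": [25, 35, 45]}, index=["x", "y", "z"])

first_row = df.iloc[0]                 # row at position 0
same_row = df.loc["x"]                 # row labelled "x"

by_position = df.iloc[0:2]             # position slice: end is excluded
by_label = df.loc["x":"z"]             # label slice: end is included

older = df.loc[df["age"] > 30, "age"]  # boolean mask for rows, label for the column

print(by_position)
print(by_label)
print(older)
```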
Q18: How do you add a new column to an existing DataFrame?
A: Use df['new_col'] = values to assign a new column.
Q19: How can you remove a column from a DataFrame?
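A: Use df.drop('column_name', axis=1), or equivalently df.drop(columns='column_name'). It
returns a new DataFrame and leaves df unchanged.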
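A sketch of adding and then removing a column (Q18 and Q19). The column names price, qty, and total are hypothetical:

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({"price": [10.0, 20.0], "qty": [2, 3]})

# Assignment adds the column to df in place.
df["total"] = df["price"] * df["qty"]

# drop() returns a new DataFrame; df itself still has the "qty" column.
without_qty = df.drop("qty", axis=1)

print(df)
print(without_qty)
```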
Q20: How can you group data in Pandas?
A: Use df.groupby('column') to group rows and perform aggregations.
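A sketch of grouping followed by aggregation, assuming hypothetical dept and salary columns:

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({
    "dept": ["A", "B", "A", "B"],
    "salary": [100, 200, 300, 400],
})

# One aggregation: mean salary per department.
mean_by_dept = df.groupby("dept")["salary"].mean()

# Several aggregations at once: one output column per function.
summary = df.groupby("dept")["salary"].agg(["min", "max"])

print(mean_by_dept)
print(summary)
```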
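Q21: What is the output of value_counts()?
A: On a Series, such as df['column'].value_counts(), it returns the frequency of each unique
value, sorted from most to least common. Called on a whole DataFrame, df.value_counts() counts
each unique combination of row values instead.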
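A sketch showing value_counts() on a single column and on a whole DataFrame. The data is hypothetical:

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({"city": ["Pune", "Delhi", "Pune"], "tier": [1, 1, 1]})

city_counts = df["city"].value_counts()   # frequency of each unique city
row_counts = df.value_counts()            # frequency of each unique (city, tier) row

print(city_counts)
print(row_counts)
```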
Q22: How do you convert a column to datetime in Pandas?
A: Use pd.to_datetime(df['column']). It returns a new Series, so assign it back:
df['column'] = pd.to_datetime(df['column']).
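A sketch of the conversion, assuming a hypothetical date column stored as ISO-formatted strings. Once converted, the .dt accessor exposes date parts:

```python
import pandas as pd

# Hypothetical data: dates stored as strings.
df = pd.DataFrame({"date": ["2024-01-15", "2024-02-20"]})

df["date"] = pd.to_datetime(df["date"])   # assign back to replace the string column
df["month"] = df["date"].dt.month         # date parts are available through .dt

print(df.dtypes)
print(df)
```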
Q23: How do you get the shape of a DataFrame?
A: Use df.shape, which returns a tuple (rows, columns).
Q24: How do you find unique values in a column?
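A: Use df['column'].unique(), which returns an array of the distinct values in order of first
appearance. Use df['column'].nunique() to get only their count.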
Q25: How do you apply a custom function to a column?
A: Use .apply(), like df['column'].apply(my_function).
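A sketch with a hypothetical custom function, to_fahrenheit, applied to each value of a column:

```python
import pandas as pd

def to_fahrenheit(celsius):
    """Hypothetical custom function: convert one Celsius value to Fahrenheit."""
    return celsius * 9 / 5 + 32

# Hypothetical data.
df = pd.DataFrame({"temp_c": [0.0, 100.0]})

# apply() calls the function once per value and collects the results in a new Series.
df["temp_f"] = df["temp_c"].apply(to_fahrenheit)

print(df)
```

For simple arithmetic like this, operating on the whole column directly (df["temp_c"] * 9 / 5 + 32) gives the same result and is usually faster; .apply() is most useful when the logic cannot be expressed that way.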
Q26: How can you concatenate two DataFrames?
A: Use pd.concat([df1, df2]).
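A sketch of pd.concat() on two hypothetical DataFrames with the same columns, showing the effect of ignore_index and axis:

```python
import pandas as pd

# Hypothetical data: two frames with identical columns.
df1 = pd.DataFrame({"id": [1, 2], "v": ["a", "b"]})
df2 = pd.DataFrame({"id": [3, 4], "v": ["c", "d"]})

stacked = pd.concat([df1, df2])                         # rows stacked; index labels repeat
stacked_clean = pd.concat([df1, df2], ignore_index=True)  # rows stacked; fresh 0..n-1 index
side_by_side = pd.concat([df1, df2], axis=1)            # columns placed next to each other

print(stacked)
print(stacked_clean)
print(side_by_side)
```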
Q27: How do you export a DataFrame to a CSV file?
A: Use df.to_csv('filename.csv', index=False).
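A sketch of the export. To avoid writing a file, this example omits the path, in which case to_csv() returns the CSV text as a string; passing a filename as in the answer above writes the same text to disk.

```python
import io
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

# index=False leaves the row index out of the output.
csv_text = df.to_csv(index=False)
print(csv_text)

# Reading the text back gives the same columns and values.
round_trip = pd.read_csv(io.StringIO(csv_text))
print(round_trip)
```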
Q28: What is the difference between .apply() and .map()?
A:
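.map() on a Series works element-wise and accepts a function, a dictionary, or another Series
as the mapping. (pandas 2.1 also added an element-wise DataFrame.map(), which replaces applymap().)
.apply() works on both Series and DataFrames. On a DataFrame it passes each column (axis=0, the
default) or each row (axis=1) to the function.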
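A sketch of the contrast, with hypothetical grade, math, and art columns: .map() translates individual values, while .apply() on a DataFrame receives a whole row or column at a time.

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({"grade": ["a", "b"], "math": [80, 90], "art": [70, 60]})

# Series.map with a dictionary: each value is looked up and replaced.
df["grade_label"] = df["grade"].map({"a": "excellent", "b": "good"})

scores = df[["math", "art"]]
row_totals = scores.apply(lambda row: row["math"] + row["art"], axis=1)  # one call per row
col_max = scores.apply(lambda col: col.max())                            # one call per column

print(df)
print(row_totals)
print(col_max)
```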
Q29: How can you merge two DataFrames?
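A: Use pd.merge(df1, df2, on='column'). By default this is an inner join; pass how='left',
how='right', or how='outer' for the other join types.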
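A sketch of an inner and a left merge on a shared key. The frames left and right and their columns are hypothetical:

```python
import pandas as pd

# Hypothetical data: ids 2 and 3 appear in both frames.
left = pd.DataFrame({"id": [1, 2, 3], "name": ["Ana", "Ben", "Cy"]})
right = pd.DataFrame({"id": [2, 3, 4], "score": [80, 90, 70]})

inner = pd.merge(left, right, on="id")                  # only ids present in both frames
left_join = pd.merge(left, right, on="id", how="left")  # every row of left; unmatched score is NaN

print(inner)
print(left_join)
```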
Q30: What is the difference between merge() and concat()?
A:
merge() is similar to SQL join operations and merges based on keys.
concat() joins DataFrames along a particular axis.