Pandas Introduction

Pandas is an open-source Python library used for data
manipulation and analysis. It consists of data structures and
functions for performing efficient operations on data. It is
well suited to working with tabular data such as spreadsheets
or SQL tables. It is widely used in data science because it
works well with other important libraries. It is built on top of
the NumPy library, which makes numerical data easier to
manipulate and analyze, and it uses many of the functionalities
NumPy provides. Pandas is commonly used together with
libraries such as:
• Matplotlib for plotting graphs
• SciPy for statistical analysis
• Scikit-learn for machine learning algorithms
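The relationship with NumPy can be illustrated with a minimal sketch. The variable names and sample values here are made up for illustration; the point is only that Series values can be viewed as a NumPy array and that NumPy functions operate on a Series directly.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values with string labels as the index.
prices = pd.Series([4.0, 9.0, 16.0], index=["a", "b", "c"])

# to_numpy() returns the values as a NumPy array.
values = prices.to_numpy()
print(type(values))

# NumPy universal functions accept a Series and return a Series,
# keeping the original index labels.
roots = np.sqrt(prices)
print(roots)
```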
Here are various tasks that we can do using Pandas:
• Data Cleaning, Merging and Joining: Clean and combine
data from multiple sources, handling inconsistencies and
duplicates.
• Handling Missing Data: Manage missing values (NaN) in
both floating and non-floating point data.
• Column Insertion and Deletion: Easily add, remove or
modify columns in a DataFrame.
• Group By Operations: Use "split-apply-combine" to
group and analyze data.
• Data Visualization: Create visualizations with Matplotlib
and Seaborn, integrated with Pandas.
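Several of the tasks listed above can be sketched in one short example. The store names, column names and the unit price of 2.5 are hypothetical and exist only to give the calls something to work on; this is one possible way to chain the steps, not a prescribed workflow.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, made up for illustration.
sales = pd.DataFrame({
    "store": ["A", "A", "B", "B", "B"],
    "units": [10, np.nan, 7, 7, 5],
    "note": ["ok"] * 5,
})
regions = pd.DataFrame({"store": ["A", "B"], "region": ["North", "South"]})

# Data cleaning: drop exact duplicate rows (the second B/7 row).
sales = sales.drop_duplicates()

# Handling missing data: replace NaN with 0.
sales["units"] = sales["units"].fillna(0)

# Column insertion: derive a new column (assumed unit price of 2.5).
sales["revenue"] = sales["units"] * 2.5

# Column deletion: remove a column that is no longer needed.
sales = sales.drop(columns=["note"])

# Merging: attach the region of each store.
merged = sales.merge(regions, on="store", how="left")

# Group by: split by region, sum the units, combine the results.
totals = merged.groupby("region")["units"].sum()
print(totals)
```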
To learn Pandas from basic to advanced, refer to the
Pandas tutorial.
Getting Started with Pandas
Let's see how to start working with the Python Pandas library:
Installing Pandas
The first step in working with Pandas is to check whether it is
installed on the system. If it is not, we need to install it
using the pip command.
pip install pandas
For more details, take a look at the article on installing
pandas.
Importing Pandas
After Pandas has been installed on the system, we need
to import the library. It is imported using:
import pandas as pd
Note: pd is just an alias for Pandas. It’s not required but using
it makes the code shorter when calling methods or
properties.
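A quick way to confirm that the installation and import worked is to print the installed version. This is a minimal check, and the version string printed depends on the environment.

```python
import pandas as pd

# Print the installed version to confirm that the import works.
print(pd.__version__)
```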
Data Structures in Pandas Library
Pandas provides two main data structures for manipulating
data, which are as follows:
1. Pandas Series
A Pandas Series is a one-dimensional labeled array capable of
holding data of any type (integer, string, float, Python objects
etc.). The axis labels are collectively called the index.
A Series is often created by loading data from existing
storage, which can be a SQL database, a CSV file or an Excel
file. It can also be created from lists, dictionaries, scalar
values, etc.
Example: Creating a series using the Pandas Library.
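import pandas as pd
import numpy as np

# Create an empty Series (explicit dtype avoids a warning on older pandas versions)
ser = pd.Series(dtype="object")
print("Pandas Series: ", ser)

# Build a NumPy array of characters
data = np.array(['g', 'e', 'e', 'k', 's'])

# Create a Series from the array
ser = pd.Series(data)
print("Pandas Series:\n", ser)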
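Output: the first print shows an empty Series. The second
shows the characters g, e, e, k, s labeled with the default
integer index 0 to 4.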
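A Series can also be built from a list, a dictionary or a scalar value, as mentioned earlier. The following is a small sketch with made-up values and labels, showing how the index is determined in each case.

```python
import pandas as pd

# From a list, with custom index labels.
from_list = pd.Series([10, 20, 30], index=["a", "b", "c"])

# From a dictionary: the keys become the index labels.
from_dict = pd.Series({"x": 1.5, "y": 2.5})

# From a scalar: the value is repeated for every index label.
from_scalar = pd.Series(7, index=["p", "q", "r"])

# Values are accessed by label.
print(from_list["b"])
print(from_dict.index.tolist())
print(from_scalar.tolist())
```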
2. Pandas DataFrame
A Pandas DataFrame is a two-dimensional data structure with
labeled axes (rows and columns). It is often created by loading
data from existing storage, which can be a SQL database, a
CSV file or an Excel file. It can also be created from lists,
dictionaries, a list of dictionaries etc.
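As a minimal sketch of the loading path, the snippet below reads CSV data with read_csv. The CSV content is hypothetical, and io.StringIO stands in for a real file on disk so that the example is self-contained.

```python
import io
import pandas as pd

# Hypothetical CSV content; io.StringIO stands in for a file on disk.
csv_text = "name,score\nAsha,91\nRavi,78\n"

# read_csv accepts a file path or a file-like object.
df = pd.read_csv(io.StringIO(csv_text))
print(df.shape)
print(df.columns.tolist())
```

With a real file, the same call would take the file path instead of the StringIO object.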
Example: Creating a DataFrame Using the Pandas Library
import pandas as pd

# Create an empty DataFrame
df = pd.DataFrame()
print(df)

lst = ['Geeks', 'For', 'Geeks', 'is', 'portal', 'for', 'Geeks']

# Create a single-column DataFrame from the list
df = pd.DataFrame(lst)
print(df)
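Output: the first print shows an empty DataFrame. The second
shows a single column labeled 0 holding the seven strings,
with rows indexed 0 to 6.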
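A DataFrame with named columns can be built from a dictionary or from a list of dictionaries. This is a small sketch; the names and ages are made up for illustration.

```python
import pandas as pd

# From a dictionary of lists: each key becomes a column.
from_dict = pd.DataFrame({"name": ["Asha", "Ravi"], "age": [31, 27]})

# From a list of dictionaries: each dictionary becomes a row.
# Keys missing from a row are filled with NaN.
from_records = pd.DataFrame([{"name": "Asha", "age": 31}, {"name": "Ravi"}])

print(from_dict.shape)
print(from_records["age"].isna().tolist())
```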
With Pandas, we have a flexible tool to handle data efficiently
whether we're cleaning, analyzing or visualizing it for our
next project.
