Python Pandas
Pandas is one of the most popular Python libraries. It is used for data analysis, and it provides
highly optimized performance.
Data analysis in pandas is built on two data structures:
1. Series
2. DataFrame
Series
A Series is a one-dimensional labeled array defined in pandas. It can store data of any type.
# Example program
import pandas as pd
# Creating a Series with data and an index
# (data and index are placeholders for your own values)
a=pd.Series(data,index=index)
***
data can be a scalar value (such as an integer or a string), a list, a dictionary, or a NumPy ndarray.
If no index is given, the default index is 0, 1, 2, ..., n-1,
where n is the length of data.
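The kinds of data listed above can be sketched as follows. This is a minimal illustration, not part of the original program; the variable names (from_dict, from_array, from_scalar) and the sample values are assumptions chosen for the example.

```python
import numpy as np
import pandas as pd

# From a dictionary: the keys become the index labels.
from_dict = pd.Series({'x': 1, 'y': 2, 'z': 3})

# From a NumPy ndarray: no index is given, so the default 0..n-1 is used.
from_array = pd.Series(np.array([10, 20, 30]))

# From a single scalar: the value is repeated for every index label.
from_scalar = pd.Series(5, index=['a', 'b', 'c'])

print(list(from_dict.index))    # ['x', 'y', 'z']
print(list(from_array.index))   # [0, 1, 2]
print(from_scalar.tolist())     # [5, 5, 5]
```

Note that a scalar needs an explicit index if it should be repeated, since pandas uses the index to decide how many entries to create.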
# Create a Series from a list of values
import pandas as pd
d=[2,5,6,3,7,9]
s=pd.Series(d)
# Define custom index labels
index=['a','b','c','d','e','f']
si=pd.Series(d,index=index)
Output
# Default index
0 2
1 5
2 6
3 3
4 7
5 9
# Data with defined index
a 2
b 5
c 6
d 3
e 7
f 9
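The index is what lets you look up a value by its label. A short sketch, assuming the same list and labels as the program above, using .loc for label-based lookup and .iloc for position-based lookup:

```python
import pandas as pd

d = [2, 5, 6, 3, 7, 9]
s = pd.Series(d)                                          # default index 0..5
si = pd.Series(d, index=['a', 'b', 'c', 'd', 'e', 'f'])   # custom labels

# Label-based lookup: with the default index, the labels are the integers 0..5.
print(s.loc[2])      # 6
print(si.loc['c'])   # 6

# Position-based lookup works the same way regardless of the labels.
print(si.iloc[2])    # 6
```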
DataFrame
A DataFrame is a two-dimensional data structure defined in pandas. It is a tabular
structure made up of rows and columns.
# Creating a DataFrame
import pandas as pd
# data is a placeholder for your own values
df=pd.DataFrame(data)
# This creates a DataFrame from data
data can be
1. Dictionaries
2. Series
3. NumPy ndarray
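As an illustration of the second option in the list above, here is a minimal sketch that builds a DataFrame from two Series. The names first and second and their values are assumptions made for this example.

```python
import pandas as pd

# Two Series that share the same index labels.
first = pd.Series([10, 20, 30], index=['p', 'q', 'r'])
second = pd.Series([40, 50, 60], index=['p', 'q', 'r'])

# Each dictionary key becomes a column name; the shared Series
# index becomes the row labels of the DataFrame.
df = pd.DataFrame({'First': first, 'Second': second})
print(df)
```

Because the rows are aligned by index label, Series with different labels would produce missing values (NaN) where a label has no match.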
# creating DataFrame using dictionary
import pandas as pd
D1={'p':10,'q':20,'r':30}
D2={'p':40,'q':50,'r':60}
data={'First':D1,'Second':D2}
df=pd.DataFrame(data)
Output
First Second
p 10 40
q 20 50
r 30 60
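DataFrame using NumPy array
When creating a DataFrame from array-like data, every column must have the
same number of rows. The example below uses nested lists laid out like 2 x 3
arrays, so each inner list is stored as a single cell.
E.g.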
import pandas as pd
n1=[[1,2,3],[4,5,6]]
n2=[[7,8,9],[6,3,4]]
data={'First':n1,'Second':n2}
df=pd.DataFrame(data)
Output
First Second
0 [1, 2, 3] [7, 8, 9]
1 [4, 5, 6] [6, 3, 4]
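The example above stores a whole list in each cell. By contrast, a two-dimensional NumPy array can be passed directly, so that each array element gets its own cell. This is a minimal sketch; the array values and the column names A, B, C are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# A 2 x 3 array: two rows, three columns.
arr = np.array([[1, 2, 3], [4, 5, 6]])

# Each array row becomes a DataFrame row; columns= supplies the column names.
df = pd.DataFrame(arr, columns=['A', 'B', 'C'])
print(df)
print(df.shape)   # (2, 3)
```

The number of names passed in columns must match the number of columns in the array, otherwise pandas raises an error.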