CHAPTER 6 DATA EXPLORING AND ANALYSIS
Creating a Series
Pandas provides a Series() constructor that is used to create a series
structure. A series structure of size n should have an index of length
n. By default, Pandas creates indices starting at 0 and ending with n-1.
A Pandas series can be created using the constructor pandas.Series
(data, index, dtype, copy), where data could be an array, constant,
list, etc. The series index should be unique and hashable with length n,
while dtype is a data type that could be explicitly declared or inferred
from the received data. Listing 6-1 creates a series with a default index
and with a set index.
Listing 6-1. Creating a Series
In [5]: import pandas as pd
import numpy as np
data = np.array(['O','S','S','A'])
S1 = pd.Series(data) # without adding index
S2 = pd.Series(data, index=[100,101,102,103]) # with adding index
print (S1)
print ("\n")
print (S2)
0    O
1    S
2    S
3    A
dtype: object
100    O
101    S
102    S
103    A
dtype: object
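As a small sketch of the index and dtype arguments described above (the variable names here are illustrative and do not come from the listing), the following compares a data type that Pandas infers with one that is declared explicitly:

```python
import pandas as pd

# dtype inferred from the data: a list of Python ints gives an integer dtype
inferred = pd.Series([1, 2, 3])

# dtype declared explicitly: the same values are stored as floats
declared = pd.Series([1, 2, 3], dtype="float64")

# a custom index must have the same length as the data
labeled = pd.Series([1, 2, 3], index=["x", "y", "z"])

print(inferred.dtype)
print(declared.dtype)        # float64
print(list(labeled.index))   # ['x', 'y', 'z']
```

Declaring the dtype is mostly useful when the inferred type would be too general, for example object for mixed input.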
In [40]: import pandas as pd
import numpy as np
my_series2 = np.random.randn(5, 10)
print ("\nmy_series2\n", my_series2)
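This creates a set of random values with 5 rows and 10 columns. Strictly
speaking, np.random.randn returns a NumPy array rather than a Pandas
series, and because the values are drawn at random, the printed output
differs on every run.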
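If an actual Pandas object is wanted, the array can be wrapped after it is generated. The following is a minimal sketch under stated assumptions: the seeded generator and the names rng, values, row_series, and frame are illustrative and are not part of the original listing.

```python
import numpy as np
import pandas as pd

# Assumption: a seeded generator, so reruns produce the same numbers
rng = np.random.default_rng(seed=0)
values = rng.standard_normal((5, 10))   # 5 rows, 10 columns

row_series = pd.Series(values[0])       # one row as a 10-element series
frame = pd.DataFrame(values)            # the whole array as a table

print(row_series.shape)   # (10,)
print(frame.shape)        # (5, 10)
```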
As mentioned earlier, you can create a series from a dictionary;
Listing 6-2 demonstrates how to create an index for a data series.
Listing 6-2. Creating an Indexed Series
In [6]: import pandas as pd
import numpy as np
data = {'X' : 0., 'Y' : 1., 'Z' : 2.}
SERIES1 = pd.Series(data)
print (SERIES1)
X    0.0
Y    1.0
Z    2.0
dtype: float64
In [7]: import pandas as pd
import numpy as np
data = {'X' : 0., 'Y' : 1., 'Z' : 2.}
SERIES1 = pd.Series(data, index=['Y','Z','W','X'])
print (SERIES1)
Y    1.0
Z    2.0
W    NaN
X    0.0
dtype: float64
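The NaN entry appears because the label W has no matching key in the dictionary. As a brief sketch (the names s, missing, and clean are illustrative), such gaps can be located with isna() and removed with dropna():

```python
import pandas as pd

data = {"X": 0.0, "Y": 1.0, "Z": 2.0}

# Labels are looked up in the dictionary; "W" has no entry, so it becomes NaN
s = pd.Series(data, index=["Y", "Z", "W", "X"])

# isna() flags the missing entries; dropna() returns a series without them
missing = s[s.isna()].index.tolist()
clean = s.dropna()

print(missing)   # ['W']
print(clean)
```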
If you create a series from a scalar value, as shown in Listing 6-3,
then an index is mandatory, and the scalar value will be repeated to match
the length of the given index.
Listing 6-3. Creating a Series Using a Scalar
In [9]: # Use scalar to create a series
import pandas as pd
import numpy as np
Series1 = pd.Series(7, index=[0, 1, 2, 3, 4])
print (Series1)
0    7
1    7
2    7
3    7
4    7
dtype: int64
Accessing Data from a Series with a Position
As with lists, you can access series data by position. The examples in
Listing 6-4 demonstrate different methods of accessing series data.
The first example retrieves the element at position 0. The second
example retrieves the elements at positions 0, 1, and 2. The third
example retrieves the last three elements, since the slice starts at
position -3 and moves forward through -2 and -1. The fourth and fifth
examples retrieve data using the series index labels.
Listing 6-4. Accessing a Data Series
In [18]: import pandas as pd
Series1 = pd.Series([1,2,3,4,5],index =
['a','b','c','d','e'])
print ("Example 1:Retrieve the first element")
print (Series1.iloc[0]) # .iloc makes the positional lookup explicit
print ("\nExample 2:Retrieve the first three element")
print (Series1[:3])
print ("\nExample 3:Retrieve the last three element")
print (Series1[-3:])
print ("\nExample 4:Retrieve a single element")
print (Series1['a'])
print ("\nExample 5:Retrieve multiple elements")
print (Series1[['a','c','d']])
Example 1:Retrieve the first element
1
Example 2:Retrieve the first three element
a    1
b    2
c    3
dtype: int64
Example 3:Retrieve the last three element
c    3
d    4
e    5
dtype: int64
Example 4:Retrieve a single element
1
Example 5:Retrieve multiple elements
a    1
c    3
d    4
dtype: int64
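Square brackets accept both positions and labels, which can be ambiguous. The sketch below uses the explicit accessors .iloc (position) and .loc (label) for the same five lookups; the variable names are illustrative only.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

first = s.iloc[0]                  # by position
first_three = s.iloc[:3]           # positional slice, end excluded as with lists
last_three = s.iloc[-3:]           # the last three elements
single = s.loc["a"]                # by label
several = s.loc[["a", "c", "d"]]   # by a list of labels

# Unlike positional slices, label slices include BOTH endpoints
label_slice = s.loc["b":"d"]

print(first, single)          # 1 1
print(label_slice.tolist())   # [2, 3, 4]
```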
Exploring and Analyzing a Series
Numerous statistical methods can be applied directly on a data series.
Listing 6-5 demonstrates the calculation of the mean, max, min, and
standard deviation of a data series. Also, the .describe() method can be
used to give a data description, including quantiles.
Listing 6-5. Analyzing Series Data
In [10]: import pandas as pd
import numpy as np
my_series1 = pd.Series([5, 6, 7, 8, 9, 10])
print ("my_series1\n", my_series1)
print ("\n Series Analysis\n ")
print ("Series mean value : ", my_series1.mean()) # find mean value in a series
print ("Series max value : ", my_series1.max()) # find max value in a series
print ("Series min value : ", my_series1.min()) # find min value in a series
print ("Series standard deviation value : ", my_series1.std()) # find standard deviation
my_series1
0     5
1     6
2     7
3     8
4     9
5    10
dtype: int64
Series Analysis
Series mean value : 7.5
Series max value : 10
Series min value : 5
Series standard deviation value : 1.8708286933869707
In [11]: my_series1.describe()
Out[11]:
count     6.000000
mean      7.500000
std       1.870829
min       5.000000
25%       6.250000
50%       7.500000
75%       8.750000
max      10.000000
dtype: float64
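The figures reported by describe() can also be requested one at a time. The following sketch (variable names are illustrative) reproduces the quartiles with quantile() and shows that std() uses the sample formula by default, which is why the value above is 1.87 and not the smaller population figure:

```python
import pandas as pd

s = pd.Series([5, 6, 7, 8, 9, 10])

# std() divides by n - 1 (ddof=1) by default; ddof=0 divides by n
sample_std = s.std()
population_std = s.std(ddof=0)

# quantile() returns the same quartiles that describe() reports
q25, q50, q75 = s.quantile([0.25, 0.5, 0.75])

print(round(sample_std, 6))   # 1.870829
print(q25, q50, q75)          # 6.25 7.5 8.75
```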
If you copy one series to another by reference, then any change made
through one name also appears under the other. After copying my_series1
to my_series_11, changing the indices of my_series_11 is reflected back
in my_series1, as shown in Listing 6-6.
Listing 6-6. Copying a Series to Another with a Reference
In [17]: my_series_11 = my_series1
print (my_series1)
my_series_11.index = ['A', 'B', 'C', 'D', 'E', 'F']
print (my_series_11)
print (my_series1)
0     5
1     6
2     7
3     8
4     9
5    10
dtype: int64
A     5
B     6
C     7
D     8
E     9
F    10
dtype: int64
A     5
B     6
C     7
D     8
E     9
F    10
dtype: int64
You can use the .copy() method to copy the data set without having a
reference to the original series. See Listing 6-7.
Listing 6-7. Copying Series Values to Another
In [21]: my_series_11 = my_series1.copy()
print (my_series1)
my_series_11.index = ['A', 'B', 'C', 'D', 'E', 'F']
print (my_series_11)
print (my_series1)
0     5
1     6
2     7
3     8
4     9
5    10
dtype: int64
A     5
B     6
C     7
D     8
E     9
F    10
dtype: int64
0     5
1     6
2     7
3     8
4     9
5    10
dtype: int64
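The difference between the two listings comes down to object identity. As a compact sketch (the names original, alias, and independent are illustrative), plain assignment creates a second name for the same object, whereas .copy() creates a new one:

```python
import pandas as pd

original = pd.Series([5, 6, 7, 8, 9, 10])

alias = original                # a second name for the same object
independent = original.copy()   # a new object with its own index

# Relabel through the alias; the copy was taken beforehand
alias.index = ["A", "B", "C", "D", "E", "F"]

print(alias is original)         # True
print(independent is original)   # False
print(list(original.index))      # ['A', 'B', 'C', 'D', 'E', 'F']
print(list(independent.index))   # [0, 1, 2, 3, 4, 5]
```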
Operations on a Series
Numerous operations can be implemented on series data. You can check
whether an index value is available in a series or not. Also, you can check
all series elements against a specific condition, such as whether each
value is less than 8 or not. In addition, you can perform math operations
on series data directly or via a defined function, as shown in Listing 6-8.
Listing 6-8. Operations on Series
In [23]: 'F' in my_series_11
Out[23]: True
In [27]: temp = my_series_11 < 8
temp
Out[27]:
A     True
B     True
C     True
D    False
E    False
F    False
dtype: bool
In [35]: len(my_series_11)
Out[35]: 6
In [28]: temp = my_series_11[my_series_11 < 8] * 2
temp
Out[28]:
A    10
B    12
C    14
dtype: int64
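Note that the in operator tests index labels, not values. The sketch below (variable names are illustrative) contrasts the two kinds of membership test and repeats the filter-then-multiply step on a standalone series:

```python
import pandas as pd

s = pd.Series([5, 6, 7, 8, 9, 10], index=["A", "B", "C", "D", "E", "F"])

label_present = "F" in s         # True: "F" is an index label
value_as_label = 10 in s         # False: 10 is a value, not a label
value_present = 10 in s.values   # True: search the values explicitly

mask = s < 8            # boolean series, one entry per element
doubled = s[mask] * 2   # keep the matching elements, then multiply each by 2

print(label_present, value_as_label, value_present)   # True False True
print(doubled.tolist())                               # [10, 12, 14]
```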
Define a function to add two series and call the function, like this:
In [37]: def AddSeries(x,y):
    for i in range (len(x)):
        # .iloc pairs elements by position, whatever the index labels are
        print (x.iloc[i] + y.iloc[i])
In [39]: print ("Add two series\n")
AddSeries (my_series_11, my_series1)
Add two series
10
12
14
16
18
20
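The loop above pairs elements by position. Series also support the + operator directly, but that operator aligns on index labels instead. As a hedged sketch with two small illustrative series (x and y are not the series from the listing), the difference looks like this:

```python
import pandas as pd

x = pd.Series([5, 6, 7], index=["A", "B", "C"])
y = pd.Series([5, 6, 7], index=["B", "C", "D"])

# + matches labels: only B and C exist in both, so A and D become NaN
aligned = x + y

# Adding the underlying arrays ignores labels and pairs elements by position
by_position = x.to_numpy() + y.to_numpy()

print(aligned)
print(by_position.tolist())   # [10, 12, 14]
```

Label alignment is usually what is wanted when two series describe the same entities; positional addition is appropriate only when the order of both series is known to match.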
You can visualize data series using the different plotting systems that
are covered in Chapter 7. However, Figure 6-1 demonstrates how to get
an at-a-glance idea of your series data and graphically explore it via visual
plotting diagrams. See Listing 6-9.
Listing 6-9. Visualizing Data Series
In [49]: import matplotlib.pyplot as plt
plt.plot(my_series2)
plt.ylabel ("index")
plt.show()
Figure 6-1. Line visualization (y-axis labeled "index")
In [54]: from numpy import *
import math
import matplotlib.pyplot as plt
t = linspace(0, 2*math.pi, 400)
a = sin(t)
b = cos(t)
c = a + b
In [50]: plt.plot(t, a, 'r') # plotting t, a separately
plt.plot(t, b, 'b') # plotting t, b separately
plt.plot(t, c, 'g') # plotting t, c separately
plt.show()
We can add multiple plots to the same canvas as shown in Figure 6-2.
Figure 6-2. Multiplots on the same canvas
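With three curves on one canvas, a legend makes it easier to tell them apart. The following is a minimal sketch of the same plot with labels, under two assumptions that are not in the original listing: it uses explicit numpy imports in place of the star import, and it selects the off-screen Agg backend so that it also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")   # assumption: render off-screen; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

t = np.linspace(0, 2 * np.pi, 400)

fig, ax = plt.subplots()
ax.plot(t, np.sin(t), "r", label="sin(t)")
ax.plot(t, np.cos(t), "b", label="cos(t)")
ax.plot(t, np.sin(t) + np.cos(t), "g", label="sin(t) + cos(t)")
ax.set_xlabel("t")
ax.legend()             # builds the legend from the label arguments

fig.canvas.draw()       # render the figure; plt.show() displays it interactively
```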
Data Frame Data Structures
As mentioned earlier, a data frame is a two-dimensional data structure
with heterogeneous data types, i.e., tabular data.