0% found this document useful (0 votes)
51 views35 pages

Handout Python

dsfsdf

Uploaded by

Farrel Gilbran
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
51 views35 pages

Handout Python

dsfsdf

Uploaded by

Farrel Gilbran
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 35

Handout Statistics Software Training:

Machine Learning Basics with Python


Novri Suhermi

1 Bahasa Pemrograman Python


1.1 Syntax Dasar Python
1.1.1 Hello Python!
In:
1 print ( ’ Hello Python ! ’)

Out:
1 Hello Python !

1.1.2 Operasi Aritmetika


In:
1 # P e n j u m l a h a n dan P e n g u r a n g a n
2 print (4+10)
3 print (5 -2)

Out:
1 14
2 3

In:
1 # P e r k a l i a n dan P e m b a g i a n
2 print (5*8)
3 print (9.0/2)

Out:
1 40
2 4.5

In:
1 # Operasi Pangkat
2 print (4**2)

Out:
1 16

In:
1 # Modulo
2 print (18%7)

Out:
1 4

1
1.2 Variable dan Type
1.2.1 Variable Assignment
In:
1 # Membuat v a r i a b e l
2 savings = 100
3
4 # M e n c e t a k v a r i a b e l " saving "
5 print ( savings )

Out:
1 100

1.2.2 Perhitungan dengan Variabel


In:
1 # Membuat v a r i a b e l " factor "
2 factor = 1.10
3
4 # M e n g h i t u n g " result "
5 result = savings * ( factor **7)
6
7 # M e n c e t a k " result "
8 print ( ’ result = ’+ str ( result ) )

Out:
1 result = 1 9 4 . 8 7 1 7 1 0 0 0 0 0 0 0 1 2

1.2.3 Tipe variabel lain


In:
1 # Membuat v a r i a b e l " desc "
2 desc = ’ compound interest ’
3
4 # Membuat v a r i a b e l " p r o f i t a b l e "
5 profitable = True

1.2.4 Operasi tipe variabel lain


In:
1 savings = 100
2 factor = 1.1
3 desc = " compound interest "
4
5 # Assign p e r k a l i a n dari factor dan savings ke v a r i a b e l year1
6 year1 = factor * savings
7
8 # M e n c e t a k tipe v a r i a b e l " year1 "
9 print ( type ( year1 ) )
10
11 # Assign p e n j u m l a h a n " desc " dan " desc " ke v a r i a b e l " d o u b l e d e s c "
12 doubledesc = desc + desc
13
14 # Mencetak variabel " doubledesc "
15 print ( doubledesc )

Out:
1 < class ’ float ’ >
2 compound i n t e r e s tc o m p o u n d interest

2
1.2.5 Konversi tipe variabel
In:
1 # M e n d e f i n i s i k a n v a r i a b e l savings dan result
2 savings = 100
3 result = 100 * 1.10 ** 7
4
5 # Print out
6 print ( " I started with $ " + str ( savings ) + " and now have $ " + str ( result ) + " . Awesome ! " )
7
8 # Mendefinisikan pi_string
9 pi_string = ’ 3.1415926 ’
10
11 # M e n k o n v e r s i p i _ s t r i n g ke dalam float
12 pi_float = float ( pi_string )

Out:
1 I started with $100 and now have $194 . 8 7 1 7 1 0 0 0 0 0 0 01 2 . Awesome !

1.3 List
1.3.1 Membuat sebuah list
In:
1 # v a r i a b e l area ( dalam m2 )
2 hall = 11.25
3 kit = 18.0
4 liv = 20.0
5 bed = 10.75
6 bath = 9.50
7
8 # Membuat list " areas "
9 areas = [ hall , kit , liv , bed , bath ]
10
11 # M e n c e t a k " areas "
12 print ( areas )

Out:
1 [11.25 , 18.0 , 20.0 , 10.75 , 9.5]

1.3.2 Membuat list dengan berbeda tipe variabel


In:
1 hall = 11.25
2 kit = 18.0
3 liv = 20.0
4 bed = 10.75
5 bath = 9.50
6
7 # List " areas " baru
8 areas = [ " hallway " , hall , " kitchen " , kit , " living room " , liv , " bedroom " , bed , " bathroom " , bath ]
9
10 # M e n c e t a k areas
11 print ( areas )

Out:
1 [ ’ hallway ’ , 11.25 , ’ kitchen ’ , 18.0 , ’ living room ’ , 20.0 , ’ bedroom ’ , 10.75 , ’ bathroom ’ , 9.5]

3
1.3.3 List of lists
In:
1 hall = 11.25
2 kit = 18.0
3 liv = 20.0
4 bed = 10.75
5 bath = 9.50
6
7 # list of lists " house "
8 house = [[ " hallway " , hall ] ,
9 [ " kitchen " , kit ] ,
10 [ " living room " , liv ] ,
11 [ " bedroom " , bed ] ,
12 [ " bathroom " , bath ]]
13
14 # M e n c e t a k " house "
15 print ( house )
16
17 # M e n c e t a k tipe " house "
18 print ( type ( house ) )

Out:
1 [[ ’ hallway ’ , 11.25] , [ ’ kitchen ’ , 18.0] , [ ’ living room ’ , 20.0] , [ ’ bedroom ’ , 10.75] , [ ’ bathroom ’ ,
9.5]]
2 < class ’ list ’ >

1.4 Subsetting Lists


1.4.1 Subset
In:
1 # list " areas "
2 areas = [ " hallway " , 11.25 , " kitchen " , 18.0 , " living room " , 20.0 , " bedroom " , 10.75 ,
3 " bathroom " , 9.50]
4
5 # M e n c e t a k elemen kedua dari list areas
6 print ( areas [1])
7
8 # M e n c e t a k elemen t e r a k h i r dari list areas
9 print ( areas [ -1])
10
11 # M e n c e t a k area dari " living room "
12 print ( areas [5])

Out:
1 11.25
2 9.5
3 20.0

1.4.2 Subset dan operasi subset


In:
1 # list " areas "
2 areas = [ " hallway " , 11.25 , " kitchen " , 18.0 , " living room " , 20.0 , " bedroom " , 10.75 ,
3 " bathroom " , 9.50]
4
5 # Membuat v a r i a b e l " e a t _ s l e e p _ a r e a "
6 eat_s leep_ar ea = areas [3] + areas [7]
7
8 # Mencetak variabel " eat_sleep_area "
9 print ( eat_s leep_are a )

Out:
1 28.75

4
1.4.3 Slicing di dalam list
In:
1 # list areas
2 areas = [ " hallway " , 11.25 , " kitchen " , 18.0 , " living room " , 20.0 , " bedroom " , 10.75 ,
3 " bathroom " , 9.50]
4
5 # M e n g g u n a k a n slicing untuk membuat list d o w n s t a i r s
6 downstairs = areas [0:6]
7
8 # M e n g g u n a k a n slicing untuk membuat list u p s t a i r s
9 upstairs = areas [6:10]
10
11 # M e n c e t a k d o w n s t a i r s dan u p s t a i r s
12 print ( downstairs )
13 print ( upstairs )

Out:
1 [ ’ hallway ’ , 11.25 , ’ kitchen ’ , 18.0 , ’ living room ’ , 20.0]
2 [ ’ bedroom ’ , 10.75 , ’ bathroom ’ , 9.5]

1.4.4 Slicing di dalam list (2)


In:
1 # list areas
2 areas = [ " hallway " , 11.25 , " kitchen " , 18.0 , " living room " , 20.0 , " bedroom " , 10.75 ,
3 " bathroom " , 9.50]
4
5 # A l t e r n a t i v e slicing
6 downstairs = areas [:6]
7 print ( downstairs )
8
9 # A l t e r n a t i v e slicing
10 upstairs = areas [6:]
11 print ( upstairs )

Out:
1 [ ’ hallway ’ , 11.25 , ’ kitchen ’ , 18.0 , ’ living room ’ , 20.0]
2 [ ’ bedroom ’ , 10.75 , ’ bathroom ’ , 9.5]

1.5 Manipulasi list


1.5.1 Mengganti elemen di dalam list
In:
1 # list " areas "
2 areas = [ " hallway " , 11.25 , " kitchen " , 18.0 , " living room " , 20.0 , " bedroom " , 10.75 ,
3 " bathroom " , 9.50]
4
5 # M e n g g a n t i area b a t h r o o m
6 areas [ -1] = 10.50
7
8 # M e n g g a n t i " living room " menjadi " chill zone "
9 areas [4] = " chill zone "
10
11 print ( areas )

Out:
1 [ ’ hallway ’ , 11.25 , ’ kitchen ’ , 18.0 , ’ chill zone ’ , 20.0 , ’ bedroom ’ , 10.75 , ’ bathroom ’ , 10.5]

5
1.5.2 Ekstensi list
In:
1 # list area
2 areas = [ " hallway " , 11.25 , " kitchen " , 18.0 , " chill zone " , 20.0 ,
3 " bedroom " , 10.75 , " bathroom " , 10.50]
4
5 # e k s t e n s i list baru : areas_1
6 areas_1 = areas + [ " poolhouse " , 24.5]
7
8 # e k s t e n s i llist baru : areas_2
9 areas_2 = areas_1 + [ " garage " , 15.45]
10
11 print ( areas )
12 print ( areas_1 )
13 print ( areas_2 )

Out:
1 [ ’ hallway ’ , 11.25 , ’ kitchen ’ , 18.0 , ’ chill zone ’ , 20.0 , ’ bedroom ’ , 10.75 , ’ bathroom ’ , 10.5]
2 [ ’ hallway ’ , 11.25 , ’ kitchen ’ , 18.0 , ’ chill zone ’ , 20.0 , ’ bedroom ’ , 10.75 , ’ bathroom ’ , 10.5 , ’
poolhouse ’ , 24.5]
3 [ ’ hallway ’ , 11.25 , ’ kitchen ’ , 18.0 , ’ chill zone ’ , 20.0 , ’ bedroom ’ , 10.75 , ’ bathroom ’ , 10.5 , ’
poolhouse ’ , 24.5 , ’ garage ’ , 15.45]

1.5.3 Manipulasi lain


In:
1 # Create list areas
2 areas = [11.25 , 18.0 , 20.0 , 10.75 , 9.50]
3
4 # Create a r e a s _ c o p y
5 areas_copy = areas [:]
6
7 # Change a r e a s _ c o p y
8 areas_copy [0] = 5.0
9
10 # Print areas
11 print ( areas )
12
13 # Print a r e a s _ c o p y
14 print ( areas_copy )

Out:
1 [11.25 , 18.0 , 20.0 , 10.75 , 9.5]
2 [5.0 , 18.0 , 20.0 , 10.75 , 9.5]

1.6 Dictionaries
1.6.1 Membuat dictionary
In:
1 # list c o u n t r i e s dan c a p i t a l s
2 countries = [ ’ spain ’ , ’ france ’ , ’ germany ’ , ’ norway ’]
3 capitals = [ ’ madrid ’ , ’ paris ’ , ’ berlin ’ , ’ oslo ’]
4
5 # membuat d i c t i o n a r y europe
6 europe = { ’ spain ’: ’ madrid ’ , ’ france ’: ’ paris ’ , ’ germany ’: ’ berlin ’ , ’ norway ’: ’ oslo ’}
7
8 # Print europe
9 print ( europe )

Out:
1 { ’ spain ’: ’ madrid ’ , ’ france ’: ’ paris ’ , ’ germany ’: ’ berlin ’ , ’ norway ’: ’ oslo ’}

6
1.6.2 Akses dictionary
In:
1 # d i c t i o n a r y europe
2 europe = { ’ spain ’: ’ madrid ’ , ’ france ’: ’ paris ’ , ’ germany ’: ’ berlin ’ , ’ norway ’: ’ oslo ’ }
3
4 # m e n c e t a k keys d i c t i o n a r y europe
5 print ( europe . keys () )
6
7 # m e n c e t a k value dari keys ’ norway ’
8 print ( europe [ ’ norway ’ ])

Out:
1 dict_keys ([ ’ spain ’ , ’ france ’ , ’ germany ’ , ’ norway ’])
2 oslo

1.6.3 Manipulasi dictionary


In:
1 # d i c t i o n a r y europe
2 europe = { ’ spain ’: ’ madrid ’ , ’ france ’: ’ paris ’ , ’ germany ’: ’ berlin ’ , ’ norway ’: ’ oslo ’ }
3
4 # m e n a m b a h k a n italy ke d i c t i o n a r y europe
5 europe [ ’ italy ’] = ’ rome ’
6
7 print ( ’ italy ’ in europe )
8
9 # m e n c e t a k value dari key ’ italy ’
10 print ( europe [ ’ italy ’ ])
11
12 # m e n a m b a h k a n poland ke d i c t i o n a r y europe
13 europe [ ’ poland ’] = ’ warsaw ’
14
15 print ( europe )

Out:
1 True
2 rome
3 { ’ spain ’: ’ madrid ’ , ’ france ’: ’ paris ’ , ’ germany ’: ’ berlin ’ , ’ norway ’: ’ oslo ’ , ’ italy ’: ’ rome ’ , ’
poland ’: ’ warsaw ’}

In:
1 # d i c t i o n a r y europe
2 europe = { ’ spain ’: ’ madrid ’ , ’ france ’: ’ paris ’ , ’ germany ’: ’ bonn ’ ,
3 ’ norway ’: ’ oslo ’ , ’ italy ’: ’ rome ’ , ’ poland ’: ’ warsaw ’ ,
4 ’ australia ’: ’ vienna ’ }
5
6 # Update capital of germany
7 europe . update ({ ’ germany ’: ’ berlin ’ })
8
9 # Remove a u s t r a l i a
10 del ( europe [ ’ australia ’ ])
11
12 # Print europe
13 print ( europe )

Out:
1 { ’ spain ’: ’ madrid ’ , ’ france ’: ’ paris ’ , ’ germany ’: ’ berlin ’ , ’ norway ’: ’ oslo ’ , ’ italy ’: ’ rome ’ , ’
poland ’: ’ warsaw ’}

7
1.6.4 Dictionaryception
In:
1 # D i c t i o n a r y of d i c t i o n a r i e s
2 europe = { ’ spain ’: { ’ capital ’: ’ madrid ’ , ’ population ’ :46.77 } ,
3 ’ france ’: { ’ capital ’: ’ paris ’ , ’ population ’ :66.03 } ,
4 ’ germany ’: { ’ capital ’: ’ berlin ’ , ’ population ’ :80.62 } ,
5 ’ norway ’: { ’ capital ’: ’ oslo ’ , ’ population ’ :5.084 } }
6
7
8 # m e n c e t a k ibu kota France
9 print ( europe [ ’ france ’ ][ ’ capital ’ ])
10
11 # membuat sub - d i c t i o n a r y data
12 data = { ’ capital ’: ’ rome ’ , ’ population ’ :59.83}
13
14 # m e n a m b a h k a n data ke d i c t i o n a r y europe di bawah key ’ italy ’
15 europe . update ({ ’ italy ’: data })
16
17 # Print europe
18 print ( europe )

Out:
1 paris
2 { ’ spain ’: { ’ capital ’: ’ madrid ’ , ’ population ’: 46.77} , ’ france ’: { ’ capital ’: ’ paris ’ , ’ population
’: 66.03} , ’ germany ’: { ’ capital ’: ’ berlin ’ , ’ population ’: 80.62} , ’ norway ’: { ’ capital ’: ’ oslo ’ ,
’ population ’: 5.084} , ’ italy ’: { ’ capital ’: ’ rome ’ , ’ population ’: 59.83}}

1.7 Fungsi (Function)


1.7.1 Beberapa contoh function
In:
1 # Membuat v a r i a b e l " var1 " dan " var2 "
2 var1 = [1 , 2 , 3 , 4]
3 var2 = True
4
5 # M e n c e t a k tipe " var1 " dengan fungsi " type "
6 print ( type ( var1 ) )
7
8 # M e n c e t a k panjang dari list " var1 " dengan fungsi " len "
9 print ( len ( var1 ) )
10
11 # K o n v e r s i " var2 " menjadi tipe " int " m e n g g u n a k a n fungsi " int "
12 out2 = int ( var2 )
13
14 # M e n c e t a k tipe dari " var2 "
15 print ( type ( var2 ) )

Out:
1 < class ’ list ’ >
2 4
3 < class ’ bool ’ >

1.7.2 Multiple arguments


In:
1 # Membuat list first dan second
2 first = [11.25 , 18.0 , 20.0]
3 second = [10.75 , 9.50]
4
5 # M e n g g a b u n g k a n list first dan second di dalam list full
6 full = first + second
7
8 # M e n g u r u t k a n list full dari tinggi ke rendah : f u l l _ s o r t e d
9 full_sorted = sorted ( full , reverse = True )
10
11 # Mencetak full_sorted
12 print ( full_sorted )

8
Out:
1 [20.0 , 18.0 , 11.25 , 10.75 , 9.5]

1.7.3 Membuat Fungsi Sederhana


In:
1 # Define shout with the parameter , word
2 def shout ( word ) :
3 """ Return a string with three e x c l a m a t i o n marks """
4 # C o n c a t e n a t e the strings : s h o u t _ w o r d
5 shout_word = word + ’ !!! ’
6 # Replace print with return
7 return ( shout_word )
8 # Pass ’ c o n g r a t u l a t i o n s ’ to shout : yell
9 yell = shout ( ’ c on gr a tu la ti o ns ’)
10 # Print yell
11 print ( yell )

Out:
1 c on gr at u la ti on s !!!

1.7.4 Fungsi dengan Multiple parameter


In:
1 # Define shout with p a r a m e t e r s word1 and word2
2 def shout ( word1 , word2 ) :
3 """ C o n c a t e n a t e strings with three e x c l a m a t i o n marks """
4 # C o n c a t e n a t e word1 with ’!!! ’: shout1
5 shout1 = word1 + ’ !!! ’
6
7 # C o n c a t e n a t e word2 with ’!!! ’: shout2
8 shout2 = word2 + ’ !!! ’
9
10 # C o n c a t e n a t e shout1 with shout2 : n e w _ s h o u t
11 new_shout = shout1 + shout2
12
13 # Return n e w _ s h o u t
14 return new_shout
15
16 # Pass ’ c o n g r a t u l a t i o n s ’ and ’ you ’ to shout () : yell
17 yell = shout ( ’ c on gr a tu la ti o ns ’ , ’ you ’)
18
19 # Print yell
20 print ( yell )

Out:
1 c on gr at u la ti on s !!! you !!!

1.8 Methods
1.8.1 Beberapa method dalam string
In:
1 # Membuat v a r i a b e l string room
2 room = " poolhouse "
3
4 # m e n g g u n a k a n method " upper () "
5 room_up = room . upper ()
6
7 # M e n c e t a k room dan room_up
8 print ( room )
9 print ( room_up )
10
11 # M e n g g u n a k a n method " count " untuk m e n g h i t u n g b a n y a k n y a k a r a k t e r " o " di dalam v a r i a b e l room
12 print ( room . count ( ’o ’) )

9
Out:
1 poolhouse
2 POOLHOUSE
3 3

1.8.2 Methods di dalam list


In:
1 # list areas
2 areas = [11.25 , 18.0 , 20.0 , 10.75 , 9.50]
3
4 # M e n c e t a k indeks dari element 20.0 dengan method " index () "
5 print ( areas . index (20.0) )
6
7 # M e n c e t a k berapa kali element 14.5 muncul dengan method " count () "
8 print ( areas . count (14.5) )

Out:
1 2
2 0

1.8.3 Methods di dalam list (2)


In:
1 # list areas
2 areas = [11.25 , 18.0 , 20.0 , 10.75 , 9.50]
3
4 # M e n g g u n a k a n method " append () "
5 areas . append (24.5)
6 areas . append (15.45)
7
8
9 # M e n c e t a k areas
10 print ( areas )
11
12 # M e n g g u n a k a n method " reverse () "
13 areas . reverse ()
14
15 # M e n c e t a k areas
16 print ( areas )

Out:
1 [11.25 , 18.0 , 20.0 , 10.75 , 9.5 , 24.5 , 15.45]
2 [15.45 , 24.5 , 9.5 , 10.75 , 20.0 , 18.0 , 11.25]

1.9 Packages
1.9.1 Import package
In:
1 # m e n d e f i n i s i k a n v a r i a b e l radius r
2 r = 0.43
3
4 # Import package " math "
5 import math
6
7 # Menghitung C
8 C = 2 * math . pi * r
9
10 # Menghitung A
11 A = math . pi * r **2
12
13 # M e n c e t a k hasil
14 print ( " Circumference : " + str ( C ) )
15 print ( " Area : " + str ( A ) )

10
Out:
1 Circumference : 2 . 7 0 1 7 6 9 6 8 2 0 8 7 2 2 2
2 Area : 0 . 5 8 0 8 8 0 4 8 1 6 4 8 7 5 2 7

1.9.2 Cara lain import package


In:
1 # v a r i a b e l radius r
2 r = 192500
3
4 # Import fungsi " radians " dalam package math
5 from math import radians
6
7 # m e n g h i t u n g v a r i a b e l " dist "
8 dist = r * radians (12)
9
10 # m e n c e t a k " dist "
11 print ( dist )

Out:
1 40317.10572106901

1.10 Numpy
1.10.1 Numpy array
In:
1 # list b a s e b a l l
2 baseball = [180 , 215 , 210 , 210 , 188 , 176 , 209 , 200]
3
4 # import package numpy sebagai np
5 import numpy as np
6
7 # membuat numpy array dari b a s e b a l l : n p _ b a s e b a l l
8 np_baseball = np . array ( baseball )
9
10 # m e n c e t a k tipe n p _ b a s e b a l l
11 print ( type ( np_baseball ) )

Out:
1 < class ’ numpy . ndarray ’ >

1.10.2 Operasi numpy


In:
1 # list height ( inch )
2 height = [74 , 74 , 72 , 72 , 73 , 69 , 69 , 71 , 76 , 71 , 73 , 73 , 74 , 74 , 69]
3
4 # numpy array n p _ h e i g h t
5 np_height = np . array ( height )
6
7 # mencetak np_height
8 print ( np_height )
9
10 # K o n v e r s i n p _ h e i g h t ( inch ) ke n p _ h e i g h t _ m ( meter )
11 np_height_m = np_height *0.0254
12
13 # Mencetak np_height_m
14 print ( np_height_m )

Out:
1 [74 74 72 72 73 69 69 71 76 71 73 73 74 74 69]
2 [1.8796 1.8796 1.8288 1.8288 1.8542 1.7526 1.7526 1.8034 1.9304 1.8034
3 1.8542 1.8542 1.8796 1.8796 1.7526]

11
1.10.3 Operasi numpy (2)
In:
1 # list weight
2 weight = [180 , 215 , 210 , 210 , 188 , 176 , 209 , 200 , 231 , 180 , 188 , 180 , 185 , 160 , 180]
3
4 # numpy array n p _ h e i g h t _ m
5 np_height_m = np . array ( height ) * 0.0254
6
7 # numpy array n p _ w e i g h t _ k g
8 np_weight_kg = np . array ( weight ) * 0.453592
9
10 # m e n g h i t u n g bmi ( body mass index )
11 bmi = np_weight_kg / np_height_m **2
12
13 # m e n c e t a k bmi
14 print ( bmi )

Out:
1 [23.11037639 27.60406069 28.48080465 28.48080465 24.80333518 25.99036864
2 30.86356276 27.89402921 28.11789135 25.10462629 24.80333518 23.7478741
3 23.75233129 20.54255679 26.58105883]

1.10.4 Operasi numpy (3)


In:
1 # m e n g h i t u n g bmi
2 np_height_m = np . array ( height ) * 0.0254
3 np_weight_kg = np . array ( weight ) * 0.453592
4 bmi = np_weight_kg / np_height_m ** 2
5
6 # mencari bmi < 21
7 light = bmi < 21
8
9 # m e n c e t a k light
10 print ( light )

Out:
1 [ False False False False False False False False False False False False
2 False True False ]

1.10.5 Subsetting Numpy Arrays


In:
1 # n p _ w e i g h t dan n p _ h e i g h t
2 np_weight = np . array ( weight )
3 np_height = np . array ( height )
4
5 # m e n c e t a k n p _ w e i g h t pada index 5
6 print ( np_weight [5])
7
8 # m e n c e t a k n p _ h e i g h t dari index 5 sampai 10
9 print ( np_height [8:11])

Out:
1 176
2 [76 71 73]

12
1.11 2D Numpy Arrays
1.11.1 Membuat numpy array 2 dimensi
In:
1 # list of list b a s e b a l l
2 baseball = [[180 , 78.4] ,
3 [215 , 102.7] ,
4 [210 , 98.5] ,
5 [188 , 75.2]]
6
7 # Import numpy
8 import numpy as np
9
10 # 2 D Numpy array n p _ b a s e b a l l
11 np_baseball = np . array ( baseball )
12
13 # m e n c e t a k tipe n p _ b a s e b a l l
14 print ( type ( np_baseball ) )
15
16 # m e n c e t a k shape n p _ b a s e b a l l
17 print ( np_baseball . shape )

Out:
1 < class ’ numpy . ndarray ’ >
2 (4 , 2)

1.11.2 Contoh lain 2D numpy array


In:
1 # list of list b a s e b a l l
2 baseball = [[73 , 188] , [73 , 180] , [74 , 185] , [74 , 160] , [69 , 180] , [70 , 185] , [73 , 189] , [75 ,
185] , [78 , 219] , [79 , 230]]
3
4 # Import numpy
5 import numpy as np
6
7 # 2 d numpy array n p _ b a s e b a l l
8 np_baseball = np . array ( baseball )
9
10 # m e n c e t a k shape n p _ b a s e b a l l
11 print ( np_baseball . shape )

Out:
1 (10 , 2)

1.11.3 Subsetting 2D Numpy Arrays


In:
1 # m e n c e t a k baris ke 5 n p _ b a s e b a l l
2 print ( np_baseball [4 ,:])
3
4 # akses seluruh baris pada kolom ke 2 n p _ b a s e b a l l : n p _ w e i g h t
5 np_weight = np_baseball [: ,1]
6
7 # m e n c e t a k height pada index ke 8
8 print ( np_baseball [8 ,0])

Out:
1 [ 69 180]
2 78

13
1.11.4 Aritmetika numpy
In:
1 # list of list b a s e b a l l
2 baseball = [[74.0 , 180.0 , 22.99] ,
3 [74.0 , 215.0 , 34.69] ,
4 [72.0 , 210.0 , 30.78] ,
5 [72.0 , 210.0 , 35.43] ,
6 [73.0 , 188.0 , 35.71] ,
7 [69.0 , 176.0 , 29.39] ,
8 [69.0 , 209.0 , 30.77] ,
9 [71.0 , 200.0 , 35.07] ,
10 [76.0 , 231.0 , 30.19] ,
11 [73.0 , 188.0 , 23.88]]
12
13 # 2 d numpy array update
14 update = np . array ([[1.2303559 , -11.16224898 , 1.] ,
15 [1.02614252 , 16.09732309 , 1.] ,
16 [0.99484223 , 8.14402711 , 1.] ,
17 [0.99484223 , 8.14402711 , 1.] ,
18 [1.15442283 , 5.08167641 , 1.] ,
19 [1.09349925 , 4.23890778 , 1.] ,
20 [1.09349925 , 4.23890778 , 1.] ,
21 [1.09349925 , 4.23890778 , 1.] ,
22 [0.82285669 , -17.78200035 , 1.] ,
23 [0.99484223 , 8.14402711 , 1.]])

In:
1 # numpy array n p _ b a s e b a l l
2 np_baseball = np . array ( baseball )
3
4 # m e n c e t a k p e n j u m a l a h a n n p _ b a s e b a l l + update
5 print ( np_baseball + update )
6
7 # membuat numpy array c o n v e r s i o n
8 conversion = np . array ([0.0254 , 0.453592 , 1])
9
10 # m e n c e t a k p e r k a l i a n n p _ b a s e b a l l dan c o n v e r s i o n
11 print ( np_baseball * conversion )

Out:
1 [[ 75.2303559 168.83775102 23.99 ]
2 [ 75.02614252 231.09732309 35.69 ]
3 [ 72.99484223 218.14402711 31.78 ]
4 [ 72.99484223 218.14402711 36.43 ]
5 [ 74.15442283 193.08167641 36.71 ]
6 [ 70.09349925 180.23890778 30.39 ]
7 [ 70.09349925 213.23890778 31.77 ]
8 [ 72.09349925 204.23890778 36.07 ]
9 [ 76.82285669 213.21799965 31.19 ]
10 [ 73.99484223 196.14402711 24.88 ]]
11 [[ 1.8796 81.64656 22.99 ]
12 [ 1.8796 97.52228 34.69 ]
13 [ 1.8288 95.25432 30.78 ]
14 [ 1.8288 95.25432 35.43 ]
15 [ 1.8542 85.275296 35.71 ]
16 [ 1.7526 79.832192 29.39 ]
17 [ 1.7526 94.800728 30.77 ]
18 [ 1.8034 90.7184 35.07 ]
19 [ 1.9304 104.779752 30.19 ]
20 [ 1.8542 85.275296 23.88 ]]

14
1.12 Statistik dasar dengan numpy
1.12.1 mean dan median
In:
1 # np_height
2 np_height = np_baseball [: ,0]
3
4 # mean n p _ h e i g h t
5 print ( np . mean ( np_height ) )
6
7 # median n p _ h e i g h t
8 print ( np . median ( np_height ) )

Out:
1 72.3
2 72.5

1.13 Statistik deskriptif numpy


In:
1 # mean height
2 avg = np . mean ( np_baseball [: ,0])
3 print ( " Average : " + str ( avg ) )
4 # median height
5 med = np . median ( np_baseball [: ,0])
6 print ( " Median : " + str ( med ) )
7 # s t a n d a r d deviasi
8 stddev = np . std ( np_baseball [: ,0])
9 print ( " Standard Deviation : " + str ( stddev ) )
10 # c o r e l a s i kolom 0 dan kolom 1
11 corr = np . corrcoef ( np_baseball [: ,0] , np_baseball [: ,1])
12 print ( " Correlation : " + str ( corr ) )

Out:
1 Average : 72.3
2 Median : 72.5
3 Standard Deviation : 2 . 0 9 9 9 9 9 9 9 9 9 9 9 9 9 9 6
4 Correlation : [[1. 0.38329645]
5 [0.38329645 1. ]]

1.13.1 Statistik deskriptif numpy (2)


In:
1 # heights & p o s i t i o n s
2 heights = [191 , 184 , 185 , 180 , 181 , 187 , 170 , 179 , 183 , 186 , 185 , 170 , 187 , 183 , 173 , 188 , 183 ,
180 , 188 , 175 , 193]
3 positions = [ ’ GK ’ , ’M ’ , ’A ’ , ’D ’ , ’M ’ , ’D ’ , ’M ’ , ’M ’ , ’M ’ , ’A ’ , ’M ’ , ’M ’ , ’A ’ , ’A ’ , ’A ’ , ’M ’ , ’D ’ ,
’A ’ , ’D ’ , ’M ’ , ’ GK ’]
4
5 # numpy arrays n p _ p o s i t i o n s & n p _ h e i g h t s
6 np_positions = np . array ( positions )
7 np_heights = np . array ( heights )
8
9 # gk_heights
10 gk_heights = np_heights [ np_positions = = ’ GK ’]
11
12 # other_heights
13 other_heights = np_heights [ np_positions ! = ’ GK ’]
14
15 # median height g o a l k e e p e r s
16 print ( " Median height of goalkeepers : " + str ( np . median ( gk_heights ) ) )
17
18 # median height o t h e r p l a y e r s
19 print ( " Median height of other players : " + str ( np . median ( other_heights ) ) )

15
Out:
1 Median height of goalkeepers : 192.0
2 Median height of other players : 183.0

1.14 Visualisasi dengan matplotlib


1.14.1 Line plot
In:
1 # List year
2 year = [1951 , 1952 , 1953 , 1954 , 1955 , 1956 , 1957 , 1958 , 1959 , 1960 , 1961 , 1962 , 1963 , 1964]
3
4 # List pop
5 pop = [2.57 , 2.62 , 2.67 , 2.71 , 2.76 , 2.81 , 2.86 , 2.92 , 2.97 , 3.03 , 3.08 , 3.14 , 3.2 , 3.26]
6
7 # Import m a t p l o t l i b . pyplot as plt
8 import matplotlib . pyplot as plt
9
10 # membuat line plot : year pada x - axis , pop pada y - axis
11 plt . plot ( year , pop )
12 plt . show ()

Out:
1 < matplotlib . figure . Figure at 0 x248f049fef0 >

1.14.2 Scatter Plot (1)


In:
1 # list gdp_cap
2 gdp_cap = [974.580 , 5937.029 , 6223.367 , 4797.231 , 12779.379 , 34435.367 , 36126.492 , 29796.048 ,
1391.253 ,
3 33692.605 , 1441.284 , 3822.137 , 7446.298 , 12569.851]
4
5 # list l i f e _ e x p
6 life_exp = [43.828 , 76.423 , 72.301 , 42.731 , 75.319 , 81.234 , 79.828 , 75.635 , 64.061 , 79.441 ,
56.728 , 65.554 , 74.852 , 50.728]
7
8 # membuat scatter plot
9 plt . scatter ( gdp_cap , life_exp )
10
11 # scale x axis ke dalam skala l o g a r i t m a
12 plt . xscale ( ’ log ’)
13
14 # m e n a m p i l k a n plot
15 plt . show ()

Out:

16
1.14.3 Scatter plot (2)
In:
1 # contoh scatter plot lain
2 plt . scatter ( pop , life_exp )
3
4 # m e n a m p i l k a n plot
5 plt . show ()

Out:

1.14.4 histogram (1)

1 # membuat h i s t o g r a m dari data l i f e _ e x p


2 plt . hist ( life_exp )
3
4 # menampilkan histogram
5 plt . show ()

17
1.14.5 histogram (2)
In:
1 # membuat h i s t o g r a m dengan 5 bins
2 plt . hist ( life_exp , bins = 5)
3
4 # m e n a m p i l k a n dan m e m b e r s i h k a n plot
5 plt . show ()
6 plt . clf ()
7
8 # membuat h i s t o g r a m dengan 20 bins
9 plt . hist ( life_exp , bins = 20)
10
11 # m e n a m p i l k a n h i s t o g r a m dan m e m b e r s i h k a n ( lagi )
12 plt . show ()
13 plt . clf ()

Out:

1.15 Logika Boolean, Conditional Statement, dan Looping


1.15.1 Equality
In:
1 # m e m b a n d i n g k a n 2 boolean
2 print ( True = = False )
3
4 # m e m b a n d i n g k a n integer
5 print ( -5*15 ! = 75)
6
7 # m e m b a n d i n g k a n string
8 print ( " pyscript " = = " PyScript " )
9
10 # m e m b a n d i n g k a n boolean dan integer
11 print ( True = = 1)
12 print ( True = = 0)
13 print ( False = = 0)
14 print ( False = = 1)

Out:
1 False
2 True
3 False
4 True
5 False
6 True
7 False

18
1.15.2 Greater dan less than
In:
1 # m e m b a n d i n g k a n 2 integer
2 x = -3 * 6
3 print ( x >= -10)
4
5 # m e m b a n d i n g k a n 2 string
6 y = " test "
7 print ( " test " <= y )
8
9 # m e m b a n d i n g k a n 2 boolean
10 print ( True > False )

Out:
1 False
2 True
3 True

1.15.3 and, or, not


In:
1 # v a r i a b e l m y _ k i t c h e n dan y o u r _ k i t c h e n
2 my_kitchen = 18.0
3 your_kitchen = 14.0
4
5 # m y _ k i t c h e n > 10 dan m y _ k i t c h e n < 18
6 print ( my_kitchen > 10 and my_kitchen < 18)
7
8 # m y _ k i t c h e n < 14 atau m y _ k i t c h e n > 17
9 print ( my_kitchen < 14 or my_kitchen > 17)
10
11 # 2 * my_kitchen < 3 * your_kitchen
12 print ( my_kitchen *2 < 3* your_kitchen )

Out:
1 False
2 True
3 True

1.15.4 if
In:
1 # v a r i a b e l room dan area
2 room = " kit "
3 area = 14.0
4
5 # s t a t e m e n t if untuk room
6 if room = = " kit " :
7 print ( " looking around in the kitchen . " )
8
9 # s t a t e m e n t if untuk area
10 if area > 15:
11 print ( " big place ! " )

Out:
1 looking around in the kitchen .

19
1.15.5 if else
In:
1 # if - else untuk room
2 if room = = " kit " :
3 print ( " looking around in the kitchen . " )
4 else :
5 print ( " looking around elsewhere . " )
6
7 # if - else untuk area
8 if area > 15 :
9 print ( " big place ! " )
10 else :
11 print ( " pretty small . " )

Out:
1 looking around in the kitchen .
2 pretty small .

1.15.6 elif
In:
1 # if - elif - else untuk room
2 if room = = " kit " :
3 print ( " looking around in the kitchen . " )
4 elif room = = " bed " :
5 print ( " looking around in the bedroom . " )
6 else :
7 print ( " looking around elsewhere . " )
8
9 # if - elif - else untuk area
10 if area > 15 :
11 print ( " big place ! " )
12 elif area > 10:
13 print ( " medium size , nice ! " )
14 else :
15 print ( " pretty small . " )

Out:
1 looking around in the kitchen .
2 medium size , nice !

1.15.7 while
In:
1 # v a r i a b e l offset
2 offset = 4
3
4 # while loop
5 while offset ! = 0 :
6 print ( " correcting ... " )
7 offset = offset - 1
8 print ( offset )

Out:
1 correcting ...
2 3
3 correcting ...
4 2
5 correcting ...
6 1
7 correcting ...
8 0

20
In:
1 # v a r i a b e l offset
2 offset = -4
3
4 # while loop
5 while offset ! = 0 :
6 print ( " correcting ... " )
7 if offset > 0 :
8 offset = offset - 1
9 else :
10 offset = offset + 1
11 print ( offset )

Out:
1 correcting ...
2 -3
3 correcting ...
4 -2
5 correcting ...
6 -1
7 correcting ...
8 0

1.15.8 for
In:
1 # list areas
2 areas = [11.25 , 18.0 , 20.0 , 10.75 , 9.50]
3
4 # for loop
5 for area in areas :
6 print ( area )

Out:
1 11.25
2 18.0
3 20.0
4 10.75
5 9.5

In:
1 # list areas
2 areas = [11.25 , 18.0 , 20.0 , 10.75 , 9.50]
3
4 # for loop m e n g g u n a k a n e n u m e r a t e ()
5 for index , a in enumerate ( areas ) :
6 print ( " room " + str ( index ) + " : " + str ( a ) )

Out:
1 room 0: 11.25
2 room 1: 18.0
3 room 2: 20.0
4 room 3: 10.75
5 room 4: 9.5

In:
1 # lit areas
2 areas = [11.25 , 18.0 , 20.0 , 10.75 , 9.50]
3
4 # for loop
5 for index , area in enumerate ( areas ) :
6 print ( " room " + str ( index +1) + " : " + str ( area ) )

Out:
1 room 1: 11.25
2 room 2: 18.0
3 room 3: 20.0
4 room 4: 10.75
5 room 5: 9.5

21
In:
1 # list of list house
2 house = [[ " hallway " , 11.25] ,
3 [ " kitchen " , 18.0] ,
4 [ " living room " , 20.0] ,
5 [ " bedroom " , 10.75] ,
6 [ " bathroom " , 9.50]]
7
8 # for loop
9 for x in house :
10 print ( " the " + str ( x [0]) + " is " + str ( x [1]) + " sqm " )

Out:
1 the hallway is 11.25 sqm
2 the kitchen is 18.0 sqm
3 the living room is 20.0 sqm
4 the bedroom is 10.75 sqm
5 the bathroom is 9.5 sqm

In:
1 # d i c t i o n a r y europe
2 europe = { ’ spain ’: ’ madrid ’ , ’ france ’: ’ paris ’ , ’ germany ’: ’ bonn ’ ,
3 ’ norway ’: ’ oslo ’ , ’ italy ’: ’ rome ’ , ’ poland ’: ’ warsaw ’ , ’ australia ’: ’ vienna ’ }
4
5 # Iterasi europe
6 for country , city in europe . items () :
7 print ( " the capital of " + country + " is " + city )

Out:
1 the capital of spain is madrid
2 the capital of france is paris
3 the capital of germany is bonn
4 the capital of norway is oslo
5 the capital of italy is rome
6 the capital of poland is warsaw
7 the capital of australia is vienna

1.16 Pandas
1.16.1 Dictionary to DataFrame
In:
1 # Pre - defined lists
2 names = [ ’ United States ’ , ’ Australia ’ , ’ Japan ’ , ’ India ’ , ’ Russia ’ , ’ Morocco ’ , ’ Egypt ’]
3 dr = [ True , False , False , False , True , True , True ]
4 cpc = [809 , 731 , 588 , 18 , 200 , 70 , 45]
5
6 # Import pandas as pd
7 import pandas as pd
8
9 # membuat d i c t i o n a r y my_dict
10 my_dict = { ’ country ’: names , ’ drives_right ’: dr , ’ cars_per_cap ’: cpc }
11
12 # membuat D a t a F r a m e cars dari my_dict : cars
13 cars = pd . DataFrame ( my_dict )
14
15 # Print cars
16 print ( cars )

Out:
1 cars_per_cap country drives_right
2 0 809 United States True
3 1 731 Australia False
4 2 588 Japan False
5 3 18 India False
6 4 200 Russia True
7 5 70 Morocco True
8 6 45 Egypt True

22
In:
1 # row_labels
2 row_labels = [ ’ US ’ , ’ AUS ’ , ’ JAP ’ , ’ IN ’ , ’ RU ’ , ’ MOR ’ , ’ EG ’]
3
4 # m e n j a d i k a n r o w _ l a b e l s sebagai index D a t a F r a m e cars
5 cars . index = row_labels
6
7 # Print cars
8 print ( cars )

Out:
1 cars_per_cap country drives_right
2 US 809 United States True
3 AUS 731 Australia False
4 JAP 588 Japan False
5 IN 18 India False
6 RU 200 Russia True
7 MOR 70 Morocco True
8 EG 45 Egypt True

1.16.2 CSV to DataFrame (1)


In:
1 # Import pandas as pd
2 import pandas as pd
3
4 # Import data cars . csv
5 cars = pd . read_csv ( " cars . csv " )
6
7 # m e n c e t a k cars
8 print ( cars )

Out:
1 Unnamed : 0 cars_per_cap country drives_right
2 0 US 809 United States True
3 1 AUS 731 Australia False
4 2 JAP 588 Japan False
5 3 IN 18 India False
6 4 RU 200 Russia True
7 5 MOR 70 Morocco True
8 6 EG 45 Egypt True

1.16.3 CSV to DataFrame (2)


In:
1 # m e m p e r b a i k i import csv dengan m e l i b a t k a n i n d e x _ c o l
2 cars = pd . read_csv ( ’ cars . csv ’ , index_col = 0)
3
4 # m e n c e t a k cars
5 print ( cars )

Out:
1 cars_per_cap country drives_right
2 US 809 United States True
3 AUS 731 Australia False
4 JAP 588 Japan False
5 IN 18 India False
6 RU 200 Russia True
7 MOR 70 Morocco True
8 EG 45 Egypt True

23
1.16.4 Square brackets
In:
1 # Print out country column as Pandas Series
2 print ( cars [ ’ country ’ ])

Out:
1 US United States
2 AUS Australia
3 JAP Japan
4 IN India
5 RU Russia
6 MOR Morocco
7 EG Egypt
8 Name : country , dtype : object

In:
1 # Print out country column as Pandas D a t a F r a m e
2 print ( cars [[ ’ country ’ ]])

Out:
1 country
2 US United States
3 AUS Australia
4 JAP Japan
5 IN India
6 RU Russia
7 MOR Morocco
8 EG Egypt

In:
1 # Print out D a t a F r a m e with country and d r i v e s _ r i g h t columns
2 print ( cars [[ ’ country ’ , ’ drives_right ’ ]])

Out:
1 country drives_right
2 US United States True
3 AUS Australia False
4 JAP Japan False
5 IN India False
6 RU Russia True
7 MOR Morocco True
8 EG Egypt True

In:
1 # Print out first 3 o b s e r v a t i o n s
2 print ( cars . iloc [:3])

Out:
1 cars_per_cap country drives_right
2 US 809 United States True
3 AUS 731 Australia False
4 JAP 588 Japan False

In:
1 # Print out fourth , fifth and sixth o b s e r v a t i o n
2 print ( cars . iloc [3:6])

Out:
1 cars_per_cap country drives_right
2 IN 18 India False
3 RU 200 Russia True
4 MOR 70 Morocco True

24
1.16.5 loc (1)
In:
1 # m e n c e t a k data cars dari Japan
2 print ( cars . loc [ ’ JAP ’ ])
3
4 # m e n c e t a k data cars dari A u s t r a l i a dan Egypt
5 print ( cars . loc [[ ’ AUS ’ , ’ EG ’ ]])

Out:
1 ars_per_cap 588
2 country Japan
3 drives_right False
4 Name : JAP , dtype : object
5 cars_per_cap country drives_right
6 AUS 731 Australia False
7 EG 45 Egypt True

1.16.6 loc (2)


In:
1 # M e n c e t a k d r i v e s _ r i g h t untuk negara Morocco
2 print ( cars . loc [ ’ MOR ’ , ’ drives_right ’ ])
3
4 # M e n c e t a k sub - D a t a F r a m e
5 print ( cars . loc [[ ’ RU ’ , ’ MOR ’] ,[ ’ country ’ , ’ drives_right ’ ]])

Out:
1 True
2 country drives_right
3 RU Russia True
4 MOR Morocco True

1.16.7 loc (3)


In:
1 # m e n c e t a k kolom d r i v e s _ r i g h t sebagai Series
2 print ( cars . loc [: , ’ drives_right ’ ])

Out:
1 US True
2 AUS False
3 JAP False
4 IN False
5 RU True
6 MOR True
7 EG True
8 Name : drives_right , dtype : bool

In:
1 # m e n c e t a k kolom d r i v e s _ r i g h t sebagai D a t a F r a m e
2 print ( cars . loc [: ,[ ’ drives_right ’ ]])

Out:
1 drives_right
2 US True
3 AUS False
4 JAP False
5 IN False
6 RU True
7 MOR True
8 EG True

25
In:
1 # m e n c e t a k kolom c a r s _ p e r _ c a p dan d r i v e s _ r i g h t sebagai D a t a F r a m e
2 print ( cars . loc [: ,[ ’ cars_per_cap ’ , ’ drives_right ’ ]])

Out:
1 cars_per_cap drives_right
2 US 809 True
3 AUS 731 False
4 JAP 588 False
5 IN 18 False
6 RU 200 True
7 MOR 70 True
8 EG 45 True

1.16.8 iloc
In:
1 # m e n c e t a k o b s e r v a s i indeks 2
2 print ( cars . iloc [2])

Out:
1 cars_per_cap 588
2 country Japan
3 drives_right False
4 Name : JAP , dtype : object

1.16.9 loop
In:
1 # Import cars data
2 import pandas as pd
3 cars = pd . read_csv ( ’ cars . csv ’ , index_col = 0)
4
5 # Adapt for loop
6 for lab , row in cars . iterrows () :
7 print ( lab + " : " + str ( row [ ’ cars_per_cap ’ ]) )

Out:
1 US : 809
2 AUS : 731
3 JAP : 588
4 IN : 18
5 RU : 200
6 MOR : 70
7 EG : 45

In:
1 # Import cars data
2 import pandas as pd
3 cars = pd . read_csv ( ’ cars . csv ’ , index_col = 0)
4
5 # Code for loop that adds COUNTRY column
6 for lab , row in cars . iterrows () :
7 cars . loc [ lab , " COUNTRY " ] = row [ ’ country ’ ]. upper ()
8
9
10 # Print cars
11 print ( cars )

26
Out:
1 cars_per_cap country drives_right COUNTRY
2 US 809 United States True UNITED STATES
3 AUS 731 Australia False AUSTRALIA
4 JAP 588 Japan False JAPAN
5 IN 18 India False INDIA
6 RU 200 Russia True RUSSIA
7 MOR 70 Morocco True MOROCCO
8 EG 45 Egypt True EGYPT

In:
1 # Import cars data
2 import pandas as pd
3 cars = pd . read_csv ( ’ cars . csv ’ , index_col = 0)
4
5 # M e n g g u n a k a n . apply ( str . upper )
6 cars [ " COUNTRY " ] = cars [ ’ country ’ ]. apply ( str . upper )

27
2 Implementasi Python untuk Klasifikasi
2.1 Import libraries
In:
1 % pylab inline
2 import pandas as pd
3 import numpy as np
4 import matplotlib . pyplot as plt
5 import seaborn as sns
6 from sklearn . preprocessing import LabelEncoder
7 from sklearn . mo de l _s el ec t io n import GridSearchCV , t r a i n _ t e s t _ sp l i t
8 from sklearn . tree import D e c i s i o n T r e e C l a s s i f i e r
9 from sklearn . ensemble import RandomForestClassifier , BaggingClassifier , \
10 AdaBoostClassifier , G r a d i e n t B o o s t i n g C l a s s i f i e r
11 pylab . rcParams [ ’ figure . figsize ’] = (10 , 7)
12 import warnings
13 warnings . filterw arnings ( ’ ignore ’)

2.2 Load dataset


In:
1 data = pd . read_csv ( ’ dataset / loan_dataset . csv ’ , index_col = ’ Loan_ID ’)
2 data . head ()

In:
1 data . shape

Out:
1 (480 , 12)

In:
1 data . info ()

Out:
1 < class ’ pandas . core . frame . DataFrame ’ >
2 Index : 480 entries , LP001003 to LP002990
3 Data columns ( total 12 columns ) :
4 Gender 480 non - null object
5 Married 480 non - null object
6 Dependents 480 non - null object
7 Education 480 non - null object
8 Self_Employed 480 non - null object
9 A pp li ca n tI nc om e 480 non - null int64
10 CoapplicantIncome 480 non - null float64
11 LoanAmount 480 non - null int64
12 Loan_Amount_Term 480 non - null int64
13 Credi t_Histo ry 480 non - null int64
14 Property_Area 480 non - null object
15 Loan_Status 480 non - null object
16 dtypes : float64 (1) , int64 (4) , object (7)
17 memory usage : 48.8+ KB

2.3 Eksplorasi Data


2.3.1 Distribusi Kelas
In:
1 data [ ’ Loan_Status ’ ]. value_counts ()

Out:
1 Y 332
2 N 148
3 Name : Loan_Status , dtype : int64

28
In:
1 data [ ’ Loan_Status ’ ]. value_counts ( normalize = True )

Out:
1 Y 0.691667
2 N 0.308333
3 Name : Loan_Status , dtype : float64

In:
1 sns . countplot ( x = data . Loan_Status )
2 plt . show ()

Out:

2.3.2 Matrix Plot


In:
1 sns . pairplot ( data = data [[ ’ A pp l ic an tI n co me ’ , ’ C o a p p l i c a n t I n c o m e ’ , ’ LoanAmount ’ , ’ Loan_Status ’ ]] ,
hue = ’ Loan_Status ’)
2 plt . show ()

Out:

29
2.4 Feature Engineering
In:
1 data [ ’ TotalIncome ’] = data [ ’ A pp li c an tI nc o me ’ ]+ data [ ’ C o a p p l i c a n t I n c o m e ’]
2 data [ ’ Loan / Income ’] = data [ ’ LoanAmount ’ ]/ data [ ’ TotalIncome ’]
3 data [ ’ Loan / Term ’] = data [ ’ LoanAmount ’ ]/ data [ ’ L o a n _ A m o u n t _ T e r m ’]
4 data [ ’ R epayment Ratio ’] = ( data [ ’ Loan / Term ’ ]*1000) / data [ ’ TotalIncome ’]

2.5 Encode Variabel Kategorik


In:
1 var_kategori = [ ’ Gender ’ , ’ Married ’ , ’ Education ’ , ’ Self_Employed ’ , ’ Dependents ’ , ’ Cre dit_Hist ory ’ , ’
Loan_Status ’ , ’ Property_Area ’]
2 for feature in var_kategori :
3 if feature in data . columns . values :
4 data [ feature ] = LabelEncoder () . fit_transform ( data [ feature ])

2.6 Train-Test Split


In:
1 y = data [ ’ Loan_Status ’]
2 x = data . drop ([ ’ Loan_Status ’] , axis = 1)
3 x_train , x_test , y_train , y_test = t r a i n _ t e s t _ s p l i t (x , y , test_size = 0.25 , random_state = 123)

2.7 Training Model dan Prediksi


2.7.1 Decision Tree Classifier
In:
1 DecisionTree = D e c i s i o n T r e e C l a s s i f i e r ( random_state = 123)
2 DecisionTree . fit ( x_train , y_train )

Out:
1 D e c i s i o n T r e e C l a s s i f i e r ( class_weight = None , criterion = ’ gini ’ , max_depth = None ,
2 max_features = None , m ax_leaf_ nodes = None ,
3 m i n _ i m p u r i t y _ d e c r e a s e = 0.0 , m i n _ i m p u r i t y _ s p l i t = None ,
4 m i n _ s a m p l e s _ l e af = 1 , m i n _ s a m p l e s _ s p l i t = 2 ,
5 m i n _ w e i g h t _ f r a c t i o n _ l e a f = 0.0 , presort = False , random_state = 123 ,
6 splitter = ’ best ’)

In:
1 DecisionTree . score ( x_train , y_train )

Out:
1 1.0

In:
1 DecisionTree . score ( x_test , y_test )

Out:
1 0.6916666666666667

2.7.2 Bagging
In:
1 Bagging = B a g g i n g C l a s s i f i e r ( random_state = 123)
2 Bagging . fit ( x_train , y_train )

30
Out:
1 B a g g i n g C l a s s i f i e r ( base_es timator = None , bootstrap = True ,
2 b o o t s t r a p _ f e a t u r e s = False , max_features = 1.0 , max_samples = 1.0 ,
3 n_estimators = 10 , n_jobs = 1 , oob_score = False , random_state = 123 ,
4 verbose = 0 , warm_start = False )

In:
1 Bagging . score ( x_train , y_train )

Out:
1 0.9888888888888889

In:
1 Bagging . score ( x_test , y_test )

Out:
1 0.7666666666666667

2.7.3 Random Forest


In:
1 RF = R a n d o m F o r e s t C l a s s i f i e r ( random_state = 123)
2 RF . fit ( x_train , y_train )

Out:
1 R a n d o m F o r e s t C l a s s i f i e r ( bootstrap = True , class_weight = None , criterion = ’ gini ’ ,
2 max_depth = None , max_features = ’ auto ’ , max_ leaf_no des = None ,
3 m i n _ i m p u r i t y _ d e c r e a s e = 0.0 , m i n _ i m p u r i t y _ s p l i t = None ,
4 m i n _ s a m p l e s _ l e af = 1 , m i n _ s a m p l e s _ s p l i t = 2 ,
5 m i n _ w e i g h t _ f r a c t i o n _ l e a f = 0.0 , n_estimators = 10 , n_jobs = 1 ,
6 oob_score = False , random_state = 123 , verbose = 0 , warm_start = False )

In:
1 RF . score ( x_train , y_train )

Out:
1 0.9972222222222222

In:
1 RF . score ( x_test , y_test )

Out:
1 0.7916666666666666

2.7.4 Gradient Boosting


In:
1 GB = G r a d i e n t B o o s t i n g C l a s s i f i e r ( random_state = 123)
2 GB . fit ( x_train , y_train )

Out:
1 G r a d i e n t B o o s t i n g C l a s s i f i e r ( criterion = ’ friedman_mse ’ , init = None ,
2 learning_rate = 0.1 , loss = ’ deviance ’ , max_depth = 3 ,
3 max_features = None , m ax_leaf_ nodes = None ,
4 m i n _ i m p u r i t y _ d e c r e a s e = 0.0 , m i n _ i m p u r i t y _ s p l i t = None ,
5 min_samples_leaf =1, min_samples_split =2,
6 m i n _ w e i g h t _ f r a c t i o n _ l e a f = 0.0 , n_estimators = 100 ,
7 presort = ’ auto ’ , random_state = 123 , subsample = 1.0 , verbose = 0 ,
8 warm_start = False )

31
In:
1 GB . score ( x_train , y_train )

Out:
1 0.9416666666666667

In:
1 GB . score ( x_test , y_test )

Out:
1 0.7833333333333333

2.7.5 Adaptive Boosting


In:
1 AB = A d a B o o s t C l a s s i f i e r ( random_state = 123)
2 AB . fit ( x_train , y_train )

Out:
1 A d a B o o s t C l a s s i f i e r ( algorithm = ’ SAMME .R ’ , b ase_esti mator = None ,
2 learning_rate = 1.0 , n_estimators = 50 , random_state = 123)

In:
1 AB . score ( x_train , y_train )

Out:
1 0.8861111111111111

In:
1 AB . score ( x_test , y_test )

Out:
1 0.7583333333333333

2.8 Model Tuning


2.8.1 Decision Tree
In:
1 def GridSearch (x , y , model , parameters ) :
2 clf = GridSearchCV ( model , parameters , scoring = ’ accuracy ’ , n_jobs = -1 , cv = 5 , verbose = 1)
3 clf . fit (x , y )
4 print ( " Best Score : " + str ( clf . best_score_ ) )
5 print ( " Best Params : " + str ( clf . best_params_ ) )
6 return ( clf )

In:
1 ListParams = {
2 ’ criterion ’: [ ’ gini ’ , ’ entropy ’] ,
3 ’ splitter ’: [ ’ best ’ , ’ random ’] ,
4 ’ max_features ’: [ ’ auto ’ , ’ sqrt ’ , ’ log2 ’ , None ] ,
5 ’ max_depth ’ :[3 ,6 ,9] ,
6 ’ class_weight ’ :[ ’ balanced ’ , None ]
7 }
8
9 B e s t D e c i s i o n T r e e = GridSearch ( x_train , y_train , D e c i s i o n T r e e C l a s s i f i e r ( random_state = 123) ,
ListParams )

32
Out:
1 Fitting 5 folds for each of 96 candidates , totalling 480 fits
2 [ Parallel ( n_jobs = -1) ]: Done 34 tasks | elapsed : 4.7 s
3 Best Score : 0 . 8 2 7 7 7 7 7 7 7 7 7 7 7 7 7 7
4 Best Params : { ’ class_weight ’: None , ’ criterion ’: ’ entropy ’ , ’ max_depth ’: 3 , ’ max_features ’: None ,
’ splitter ’: ’ random ’}
5 [ Parallel ( n_jobs = -1) ]: Done 480 out of 480 | elapsed : 5.1 s finished

In:
1 B e s t D e c i s i o n T r e e . score ( x_train , y_train )

Out:
1 0.8333333333333334

In:
1 B e s t D e c i s i o n T r e e . score ( x_test , y_test )

Out:
1 0.7833333333333333

2.8.2 Random Forest


In:
1 ListParams = {
2 ’ n_estimators ’: [50 , 75 , 100 , 200] ,
3 ’ max_depth ’ :[1 , 5 , 10 , 15 , 20 , 25 , 30] ,
4 ’ m i n _ s a m p l e s _ l e a f ’ : [1 , 2 , 4 , 6 , 8 , 10] ,
5 ’ max_features ’: [0.1 , ’ sqrt ’ , ’ log2 ’ , None ]
6 }
7
8 BestRF = GridSearch ( x_train , y_train , R a n d o m F o r e s t C l a s s i f i e r ( random_state = 123) , ListParams )

Out:
1 Fitting 5 folds for each of 672 candidates , totalling 3360 fits
2 [ Parallel ( n_jobs = -1) ]: Done 34 tasks | elapsed : 5.8 s
3 [ Parallel ( n_jobs = -1) ]: Done 184 tasks | elapsed : 10.3 s
4 [ Parallel ( n_jobs = -1) ]: Done 434 tasks | elapsed : 18.0 s
5 [ Parallel ( n_jobs = -1) ]: Done 784 tasks | elapsed : 29.7 s
6 [ Parallel ( n_jobs = -1) ]: Done 1234 tasks | elapsed : 46.4 s
7 [ Parallel ( n_jobs = -1) ]: Done 1784 tasks | elapsed : 1.2 min
8 [ Parallel ( n_jobs = -1) ]: Done 2434 tasks | elapsed : 1.7 min
9 [ Parallel ( n_jobs = -1) ]: Done 3184 tasks | elapsed : 2.2 min
10 [ Parallel ( n_jobs = -1) ]: Done 3360 out of 3360 | elapsed : 2.4 min finished
11 Best Score : 0 . 8 2 7 7 7 7 7 7 7 7 7 7 7 7 7 7
12 Best Params : { ’ max_depth ’: 5 , ’ max_features ’: None , ’ min_samples_leaf ’: 4 , ’ n_estimators ’: 200}

In:
1 BestRF . score ( x_train , y_train )

Out:
1 0.8444444444444444

In:
1 BestRF . score ( x_test , y_test )

Out:
1 0.7833333333333333

33
2.8.3 Adapative Boosting
In:
1 ListParams = {
2 ’ n_estimators ’: [50 , 75 , 100 , 150 , 200] ,
3 ’ learning_rate ’: [0.001 , 0.01 , 0.1 , 1] ,
4 ’ algorithm ’ : [ ’ SAMME ’ , ’ SAMME . R ’]
5 }
6
7 BestAB = GridSearch ( x_train , y_train , A d a B o o s t C l a s s i f i e r ( random_state = 123) , ListParams )

Out:
1 Fitting 5 folds for each of 40 candidates , totalling 200 fits
2 [ Parallel ( n_jobs = -1) ]: Done 34 tasks | elapsed : 6.5 s
3 [ Parallel ( n_jobs = -1) ]: Done 184 tasks | elapsed : 12.6 s
4 Best Score : 0 . 8 1 6 6 6 6 6 6 6 6 6 6 6 6 6 7
5 Best Params : { ’ algorithm ’: ’ SAMME ’ , ’ learning_rate ’: 0.001 , ’ n_estimators ’: 50}
6 [ Parallel ( n_jobs = -1) ]: Done 200 out of 200 | elapsed : 13.7 s finished

In:
1 BestAB . score ( x_train , y_train )

Out:
1 0.8166666666666667

In:
1 BestAB . score ( x_test , y_test )

Out:
1 0.7833333333333333

2.8.4 Gradient Boosting


In:
1 ListParams = {
2 ’ loss ’: [ ’ deviance ’ , ’ exponential ’] ,
3 ’ learning_rate ’: [0.001 , 0.01 , 0.1] ,
4 ’ n_estimators ’: [50 , 75 , 100 , 200] ,
5 ’ max_depth ’ :[3 , 5 , 7] ,
6 ’ subsample ’: [0.5 , 0.75 , 1] ,
7 ’ max_features ’: [0.1 , ’ sqrt ’ , ’ log2 ’ , None ]
8 }
9
10 BestGB = GridSearch ( x_train , y_train , G r a d i e n t B o o s t i n g C l a s s i f i e r ( random_state = 123) , ListParams )

Out:
1 Fitting 5 folds for each of 864 candidates , totalling 4320 fits
2 [ Parallel ( n_jobs = -1) ]: Done 34 tasks | elapsed : 5.8 s
3 [ Parallel ( n_jobs = -1) ]: Done 282 tasks | elapsed : 9.6 s
4 [ Parallel ( n_jobs = -1) ]: Done 673 tasks | elapsed : 26.5 s
5 [ Parallel ( n_jobs = -1) ]: Done 1023 tasks | elapsed : 36.8 s
6 [ Parallel ( n_jobs = -1) ]: Done 1473 tasks | elapsed : 57.3 s
7 [ Parallel ( n_jobs = -1) ]: Done 2266 tasks | elapsed : 1.4 min
8 [ Parallel ( n_jobs = -1) ]: Done 3269 tasks | elapsed : 2.0 min
9 [ Parallel ( n_jobs = -1) ]: Done 4320 out of 4320 | elapsed : 2.7 min finished
10 Best Score : 0 . 8 1 9 4 4 4 4 4 4 4 4 4 4 4 4 4
11 Best Params : { ’ learning_rate ’: 0.01 , ’ loss ’: ’ deviance ’ , ’ max_depth ’: 3 , ’ max_features ’: None , ’
n_estimators ’: 200 , ’ subsample ’: 0.5}

In:
1 BestGB . score ( x_train , y_train )

Out:
1 0.9138888888888889

34
In:
1 BestGB . score ( x_test , y_test )

Out:
1 0.8

35

You might also like