Data and Dataset
With Introduction to Python Programming
Catherine Olivia Sereati
Tepercaya Kualitas Lulusannya
Review: Machine Learning Process
• Data collection
• Data cleansing
• Feature extraction and selection
• Model training
• Model evaluation
• Model deployment and integration
• Feedback and iteration
What is data?
Data is information, such as facts and numbers, used to analyze something or
make decisions.
Data can come in many forms, such as text, images, audio, or video, and is
processed by AI algorithms to identify patterns and relationships that can
be used to make predictions, categorize information, or solve problems.
Data is essential to the development of AI, as it allows machines to learn
from experience and improve their performance over time.
Data is the first element of AI: there is no AI without data.
Example data in Machine Learning :
What is a Dataset?
• A dataset is a collection of data that is organized in a
specific way for a specific purpose.
• It can be in the form of a spreadsheet, a table, a
database or any other structured format.
• Datasets can be used for a variety of purposes,
including statistical analysis, research, machine
learning, and more.
• They can be small or large, and can contain many types
of data, such as numerical, categorical, or textual data.
• Examples of datasets include weather data, census
data, customer data, and scientific research data.
Datasets are an essential tool for analyzing and
understanding complex information.
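As a minimal sketch of the idea above, a small dataset can be written as a table in which each row is one record and each column holds one type of data. The name `weather_dataset` and all values in it are invented for illustration; they do not come from a real source.

```python
# A tiny, made-up weather dataset: each dict is one row, each key is one column.
weather_dataset = [
    {"city": "Jakarta", "temperature_c": 31.5, "condition": "sunny"},
    {"city": "Bandung", "temperature_c": 24.0, "condition": "rainy"},
    {"city": "Surabaya", "temperature_c": 33.2, "condition": "sunny"},
]

# Numerical column: values we can do arithmetic on.
temperatures = [row["temperature_c"] for row in weather_dataset]
average_temperature = sum(temperatures) / len(temperatures)

# Categorical column: values drawn from a small set of labels.
conditions = {row["condition"] for row in weather_dataset}

print(len(weather_dataset), "rows")
print("average temperature:", round(average_temperature, 1))
print("conditions:", sorted(conditions))
```

Here "city" is textual data, "temperature_c" is numerical, and "condition" is categorical.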
Data in Artificial Intelligence:
• Training data
  • Enables machine learning algorithms to learn patterns, correlations, and
  relationships within the data
  • Improves their ability to make accurate predictions or classifications when
  presented with new, unseen data
• Testing data
  • Evaluates the performance of the trained machine learning model
  • Supports error analysis
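One common way to obtain both kinds of data is to split a single dataset into two parts. The sketch below is only an illustration: the `examples` list is invented, and the 80/20 ratio is an assumed, commonly used choice rather than something prescribed here.

```python
import random

# Hypothetical labeled examples: (feature, label) pairs invented for illustration.
examples = [(x, x % 2) for x in range(10)]

# Shuffle a copy with a fixed seed so the split is reproducible.
rng = random.Random(42)
shuffled = examples[:]
rng.shuffle(shuffled)

# Keep 80% for training and hold out 20% for testing (assumed ratio).
split_index = int(len(shuffled) * 0.8)
training_data = shuffled[:split_index]
testing_data = shuffled[split_index:]

print(len(training_data), "training examples")
print(len(testing_data), "testing examples")
```

The model learns only from `training_data`; `testing_data` stays unseen until evaluation, which is what makes the measured performance meaningful.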
Example of dataset
How to Acquire Data (1):
• Manual labeling: for example, tagging each image as "CAT" or "Not CAT"
• Observing behaviour (phenomenon)
How to Acquire Data (2):
• Download from a website or obtain through a partnership, for example kaggle.com
Popular Datasets
• Iris dataset: flower species classification
• MNIST dataset: handwritten digit recognition
• CIFAR-10 dataset: colour image classification
(https://medium.com/analytics-vidhya/exploration-of-iris-dataset-using-scikit-learn-part-1-8ac5604937f8)
Source: Wikipedia
Introduction to Python Programming
History of Python
Founder: Guido van Rossum
Degree: Master of Mathematics, Master of Computer Science
When and Where: Created in Amsterdam during Christmas in 1989.
Meaning of Name: The founder was a big fan of Monty Python's Flying Circus.
Origin: Influenced by Modula-3, Python is a descendant of ABC that would appeal to Unix/C hackers.
What is Python?
• Python is a programming language.
• Python is a general-purpose, high-level programming language.
• Python applies to programming in many fields:
• web development (server-side),
• software development,
• mathematics,
• system scripting.
Python Application Domains
Python has a rich set of third-party libraries, so it can be used in many
fields:
AI: machine learning libraries such as TensorFlow, PyTorch, and scikit-learn
Data science: NumPy, pandas, Matplotlib
Software programming: general-purpose programming language
Web development: web application frameworks such as Django, TurboGears, and CherryPy
Web crawling: gathering data from websites, for example with Scrapy, a Python
framework for large-scale web crawls
Cloud computing: OpenStack, a free and open-source cloud operating system
Python Compared with Other Programming Languages
• Python's syntax is generally simpler than that of Java and C
• Python comes with a large, free standard library, whereas the C standard
library is much smaller and some Java and C libraries are commercial (paid)
IDEs
• IDE: Integrated Development Environment
• Increases programmer productivity by integrating common
programming activities into a single piece of software
• We can edit source code, run the program, and debug it
Common Python IDEs:
• PyCharm
• Jupyter Notebook
• Spyder
• Google Colab (runs in the browser, no download needed)
Common Python Data Types
• Number
• String
• List
• Tuple
• Dictionary
• Set
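As a quick illustration of the six types listed above, the sketch below creates one value of each. The variable names and values are invented for this example.

```python
number = 42                       # Number (an int; 3.14 would be a float)
text = "hello"                    # String
fruits = ["apple", "banana"]      # List
point = (3, 4)                    # Tuple
ages = {"Ana": 20, "Budi": 22}    # Dictionary
unique_ids = {1, 2, 3}            # Set

# Print the type name of each value next to the value itself.
for value in (number, text, fruits, point, ages, unique_ids):
    print(type(value).__name__, value)
```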
Python data types are classified as follows:
• Sequence: Subscripts (indexes) can be used to access
elements. Slices such as [start:stop:step] can also be used.
• Non-sequence: Positional subscripts (indexes) cannot be used to
access elements.
• Mutable: Values can be modified.
• Immutable: Values cannot be modified.
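The classification above can be checked directly in code. The following is a small sketch using a list (a sequence whose values can be modified), a tuple (a sequence whose values cannot be modified), and a set (no positional indexes). All names are invented for illustration.

```python
letters = ["a", "b", "c", "d", "e"]   # list: indexable, values can be modified

first = letters[0]                    # index access
middle = letters[1:4]                 # slice [start:stop]
every_other = letters[::2]            # slice with a step of 2

letters[0] = "z"                      # allowed: a list can be modified

frozen = ("a", "b", "c")              # tuple: indexable, values cannot be modified
try:
    frozen[0] = "z"
    tuple_changed = True
except TypeError:
    tuple_changed = False             # assignment to a tuple element is rejected

tags = {"x", "y"}                     # set: no positional indexes
try:
    tags[0]
    set_indexable = True
except TypeError:
    set_indexable = False             # a set cannot be accessed by index

print(first, middle, every_other)
print(letters, tuple_changed, set_indexable)
```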
Number: Basic Operations
Addition (+)
Subtraction (-)
Multiplication (*)
Division (/)
Modulo (%) and floor division (//)
Power (**)
If the operation is performed on numbers of different
types (such as int and float), the result takes the more
general type (here, float).
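The operators above can be tried on two small integers. This is only an illustrative sketch; the values 7 and 2 are arbitrary.

```python
a, b = 7, 2

addition = a + b          # 9
subtraction = a - b       # 5
multiplication = a * b    # 14
division = a / b          # 3.5 (the / operator always returns a float)
remainder = a % b         # 1
floor_division = a // b   # 3
power = a ** b            # 49

# Mixing int and float gives a float result.
mixed = 3 + 0.5           # 3.5

print(addition, subtraction, multiplication, division)
print(remainder, floor_division, power, mixed, type(mixed).__name__)
```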
Math operation
String
• In Python, a string is a sequence of characters.
• The number of characters is the length of the string.
• Python does not have a character data type. A single character is
a string of length 1.
• To declare a string, enclose the content in
• single quotation marks ('...'),
• double quotation marks ("..."), or
• three consecutive quotation marks ('''...''' or """..."""), which also allow the string to span multiple lines.
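The three ways of declaring a string can be compared side by side. The sketch below uses invented variable names and shows that a single character is simply a string of length 1.

```python
single = 'hello'                  # single quotation marks
double = "hello"                  # double quotation marks
multi_line = """first line
second line"""                    # triple quotation marks can span lines

same_value = single == double     # both quoting styles give the same string
length = len(single)              # number of characters
char = single[0]                  # one character is still a str

print(same_value, length)
print(type(char).__name__, len(char))
print(multi_line.splitlines())
```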
String Operator (1)
• +: Two strings are concatenated.
• Example: a = "hello"; b = "world" => a + b = 'helloworld'
• *: A string is repeated the given number of times to produce a new string.
• Example: "a" * 2 => 'aa'
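The two operators can be run as a short script. This sketch reuses the values from the examples above; `with_space` is an extra, invented variable that shows concatenation does not add a separator on its own.

```python
a = "hello"
b = "world"

joined = a + b               # 'helloworld'
with_space = a + " " + b     # 'hello world': the space must be added explicitly
repeated = "a" * 2           # 'aa'

print(joined)
print(with_space)
print(repeated)
```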
String Operator (2)
String Operator (3)
String Operator (4)
List
• A list is a sequence in which elements can be of any data type and
elements can be added or deleted at any time.
• In a list, elements are enclosed by a pair of square brackets and are
separated by commas (,). You can create a list in either of the
following ways:
• List = list((obj1, obj2, …)) (list() takes a single iterable, such as a tuple)
• List = [obj1, obj2, …]
• Operator:
• +: Combines lists, for example, the result of [1, 2] + [2, 3] is [1, 2, 2, 3].
• *: Repeats a list a given number of times to obtain a new list, for example, the result of [1, 2] * 2 is [1, 2, 1, 2].
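A short sketch of list creation and the two list operators follows. Note that the built-in `list()` accepts a single iterable (here a tuple), not several separate arguments. Variable names are invented for illustration.

```python
from_literal = [1, "two", 3.0]            # square brackets, mixed element types
from_iterable = list((1, "two", 3.0))     # list() built from one iterable (a tuple)

combined = [1, 2] + [2, 3]                # [1, 2, 2, 3]
repeated = [1, 2] * 2                     # [1, 2, 1, 2]

print(from_literal == from_iterable)
print(combined)
print(repeated)
```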
List Operator (1)
List Operator (2)
Example: List Operations
animals = ['cat', 'dog', 'monkey']
# list.append(obj): Add a new object to the end of the list.
animals.append('fish')  # Append an element.
print(animals)  # Output: ['cat', 'dog', 'monkey', 'fish']
# list.remove(obj): Remove the first match for a value in the list.
animals.remove('fish')  # Delete element fish.
print(animals)  # Output: ['cat', 'dog', 'monkey']
# list.insert(index, obj): Insert an object at the given position (index) in the list.
animals.insert(1, 'fish')  # Insert element fish at index 1.
print(animals)  # Output: ['cat', 'fish', 'dog', 'monkey']
Next
• Conditional and looping statements
• Python’s machine learning libraries
Thank you