
Module 12 Python

Uploaded by

Anjum Verma
Copyright
© © All Rights Reserved

Lesson 1 of 3

Python is an interpreted, object-oriented, high-level programming language.
In other words, Python was designed so that a program should read as simply as
a spoken language.
In fact, in 1999, Guido van Rossum, the creator of Python, set the following goals
for it:
• An easy and intuitive language that is just as powerful as those offered by
major competitors.
• An open-source language, so anyone can contribute to its development.
• Code that is as understandable as plain English.
• A language suitable for everyday tasks and short development times.
Now, more than two decades later, it looks like van Rossum achieved what he
wanted: "A language for everyone."
Why Python?
Why is it that everyone wants to use Python? Well, there are many reasons for
using Python, some of which are:
Easy to learn
Python takes less time to learn and to write code in.
Easy to use
Python has a simple syntax, which makes it easy to write new software.
Easy to understand
Python code is easy to read and modify, whether it is your own or someone else's.
Great community and excellent libraries
Python has excellent community support for developing code. It also has a vast
number of third-party libraries contributed by people around the globe.
Some Programming Languages for Data Science

Python is the most popular language used in data science, but it is not the only
one. Here are some of the others.

1. R

It is best suited for data mining and statistical analysis, which makes it a
natural fit for data science.

2. Julia

It is widely used for big-data analysis by developers at BlackRock, Apple,
Oracle, and Google.

3. JavaScript

Its ability to specify page behavior makes it great for data visualization, which
is a core component of data science.

4. C/C++

These are veteran languages in many domains and are used to build many Python
libraries. They are not very user-friendly, but they are among the fastest
languages available.

5. SQL

It is the language of relational databases. It may not be "the" language for model
building and analysis, but it is a must for a data scientist.

6. MATLAB

It is a very powerful tool for mathematical and statistical computing. It allows
you to implement algorithms and create user interfaces, and its built-in graphics
make it especially easy to create data plots and visualizations.

7. Scala

It is ideal when working with high-volume data sets.

Then there are Java, SAS, Swift, and others. So should you learn all these
languages? Not really, as Python suffices. Sometimes, you may also need SQL and
NoSQL.
Why Python in Data Science?
Given the languages above, why do you need Python? Compared to other
languages, Python offers the following advantages.
Simplicity

Python is closer to the English language than most other programming
languages. For example, you use the print function to print something, the len
function to find the length of a dataset, and the sum function to total all values.
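As a small illustration of how closely these function names follow plain English, here is a minimal sketch. The list daily_sales and its values are made up for this example and are not part of the lesson.

```python
# Hypothetical sample data: four daily sales figures.
daily_sales = [120, 85, 150, 95]

print(daily_sales)        # print the data itself
print(len(daily_sales))   # length of the dataset: 4
print(sum(daily_sales))   # total of all values: 450
```

Each line reads almost like the sentence that describes it: print the sales, find their length, find their sum.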
Scalable

Python can handle both small and massive datasets, and it lets you break down
complex or overwhelming tasks. It also keeps growing as a language, offering you
more and more possibilities.
Libraries
Vast Community

The large and active community makes it easy for learners to find solutions to
their coding problems. If you are stuck somewhere or need to troubleshoot, help
is just a click or a Google search away.

Every stage covered

Python takes care of all the processes of machine learning and data science, such
as:
• Data collection and cleaning
• Data exploration
• Data modeling (algorithm building)
• Data visualization

Other advantages

The other advantages of Python are:
• It is excellent for automation.
• It supports test-driven development.
• It has web development frameworks and is ideal for cloud computing.
• It works well with hardware and sensors, and so on.

Installation and Setup

Lesson 2 of 3
In this lesson, you will understand the entire process of installing and setting
up Python.
Steps to Download Python from python.org
Introduction
Here are the steps to download the latest version of Python from python.org.
Step 1
Open up any browser

After opening any browser (for example, Safari), type python.org. It takes you to
the official site of Python.

Step 2
Click on Downloads to download the latest version of Python for your OS

Note that this download is very lightweight: it includes few libraries, and you
have to install additional libraries as and when you need them.
Next steps
After downloading, run the installation file to install the latest version of
Python on your system.
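Once the installation finishes, one way to confirm that it worked is to ask Python for its own version. The following is a minimal sketch that uses the standard sys module, which the lesson itself does not cover. The version string it prints will differ from machine to machine.

```python
import sys

# Version of the interpreter that is running this code, as plain text.
installed_version = sys.version.split()[0]
print("Python version:", installed_version)

# This course uses Python 3, so the major version number should be 3.
print("Is Python 3:", sys.version_info.major == 3)
```

If the second line prints False, the command is probably starting an older Python that was already on the system.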
Installing Anaconda for Python
Jupyter Notebook with Anaconda is a popular choice for writing Python code. The
steps for installing Anaconda for Python are:
Step 1
Open up any browser and type anaconda.org

Anaconda comes with hundreds of pre-installed libraries and occupies around 1 GB
of space when it installs.
Step 2
Click on the Download Anaconda tab

It takes you to another page. By default, it shows the Mac version. For other
versions, click on their OS icons.
Step 3
Download the Graphical Installer

It takes you to the "Thank you" page while it downloads.

Step 4
Run the installer from the folder where it is stored

It pops up the installation wizard window.

Step 5
Follow the steps given in the installation wizard

Clicking on the final Continue completes the installation.

Hello World
Lesson 3 of 3
Now you will learn how to write code in Python.
To begin, open the Anaconda Navigator, either from your applications or by using
regular search. Its icon is a green circle.
The screenshot below shows the navigator. It has a lot of options. These may look
slightly different on your machine based on your installation. However, to write
code, you need the Jupyter Notebook icon. On the right-hand side of the
screenshot, you can see tabs such as Home, Environments, Learning, and Community.
Click on various parts of the screen to know more.
Steps to Write the First Application
Step 1
Launch Jupyter Notebook

It takes you to the home folder.

Step 2
Create a new folder to store all the code you will write in this course

For example, create a folder on the Desktop named Python.

Step 3
Create a subfolder to store the code of this program

For example, create a subfolder Basics under Python.

Step 4
Open the subfolder (Basics), click on New and choose Python 3

This action opens a new tab in your browser: the Jupyter Notebook. You will spend
most of your learning time within such a Notebook.
Step 5
Change the file name

Click on the file's name, "Untitled," which is available at the top right, and
type "first_code" in the popup window.
Step 6
Type the code

Type print("Hello world") in the cell (the horizontal grey box) of the Notebook.
Step 7
Execute/run the code

Press Shift + Enter to execute/run the code.
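For reference, the contents of the cell from Step 6 can be sketched as follows. The second half, which stores the text under the name greeting first, is an addition for illustration and is not part of the original steps.

```python
# The exact line from Step 6: print() displays the text given in quotes.
print("Hello world")

# The same text stored under a hypothetical name first, then printed.
greeting = "Hello world"
print(greeting)
```

Both print calls display the same text in the output area below the cell.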


What is Code and Data?
Lesson 1 of 4
Now let us learn about code and data. You may have heard programmers use phrases
like
• "I am writing the source code."
• "Data was uploaded."
• "The code failed."
• "Data was breached."
• "The application could not find the database."
So what are code and data?
To understand this, take the analogy of a cooking recipe, say, baking a cake.
Here you have a screenshot of a recipe for a "simple white cake" by Scott Osman
on allrecipes.com. Notice that the recipe has ingredients, which you can
consider the initial data. These ingredients, or data, will be processed through
multiple steps, and you will have your cake, which is the final expected output.
But where is your code? You have probably heard people define code as "a set of
instructions given to the computer."
So, to bake a cake, you need directions that tell you what to do with the
ingredients. Thus, if you have collected the ingredients in the right quantities and
followed the directions correctly, you can expect a wonderful white cake as the
output.
Here, ingredients are analogous to data and directions to instructions. So, to
write computer code, you first get the data and then operate on it to get
some output.
Example of Coding
Let us say you are calculating the amount of tax to be paid by individual filers.
For example, you may have data like this:
income = 12000
tax = 0.12  # that is, 12%
Now you need to calculate the tax payable based on the income. The tax
percentage also varies, as there are multiple slabs for filing the tax.
So your code may look like this:
# Pick the tax rate for the slab that the income falls into.
if income <= 0:
    tax_payable = 0
elif income > 0 and income < 9950:
    tax_payable = income * 0.10
elif income >= 9950 and income < 40525:
    tax_payable = income * 0.12
elif income >= 40525 and income < 86375:
    tax_payable = income * 0.22
elif income >= 86375 and income < 164925:
    tax_payable = income * 0.24
elif income >= 164925 and income < 209425:
    tax_payable = income * 0.32
elif income >= 209425 and income < 523600:
    tax_payable = income * 0.35
elif income >= 523600:
    tax_payable = income * 0.37
and so on. In the above code, you are calculating tax_payable as
tax_payable = income * tax
So, if income and tax are the data, tax_payable = income * tax is the code.
Since there are different slabs for paying tax, you must use certain conditions:
the percentage of tax changes based on your income slab. Thus the code size
increases, but the number of data points stays the same.
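The same slab logic can also be written with the slabs kept as data and a short loop as the code. This is only a sketch: the names SLABS, TOP_RATE, and tax_payable_for are hypothetical, and it keeps the lesson's simplification of applying one flat rate to the whole income.

```python
# Upper limit of each slab (exclusive) paired with the flat rate from the text.
SLABS = [
    (9950, 0.10),
    (40525, 0.12),
    (86375, 0.22),
    (164925, 0.24),
    (209425, 0.32),
    (523600, 0.35),
]
TOP_RATE = 0.37  # rate for incomes of 523600 and above


def tax_payable_for(income):
    """Return income multiplied by the rate of the slab it falls into."""
    if income <= 0:
        return 0.0
    for upper_limit, rate in SLABS:
        if income < upper_limit:
            return income * rate
    return income * TOP_RATE


print(f"{tax_payable_for(12000):.2f}")  # 1440.00
```

Here the slab limits and rates are the data, and the function is the code that operates on them. Adding a slab means adding one line of data rather than another condition.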
Running/Executing the Code
Now, you can write this code in any environment, IDE, or editor. The choice may
bring some dependencies of its own, but it does not have a huge impact on the
result. Let us see what happens when we execute the code in different languages,
environments, editors, and IDEs.
Anaconda's Jupyter Notebook - Python code

Here you have the tax code written in Anaconda's Jupyter Notebook and the
output when you run the code with an input income of 12,000.

macOS command prompt - Python code

You get the same results if you copy the same Python code to a text file and
execute it using the macOS command prompt or terminal.

macOS command prompt - C code

In the above cases, the code was written in Python but executed in different
environments. Now, what if you wanted to write code to find the tax using C, C++,
Java, or JavaScript? You can do that too.
For example, here you have the code for tax written in C. The format looks
different because it is a different language, but the structure is similar. Notice
that it also has data and code. When you run this code, you get the same result.
So code and data are closely related: eggs are the data, and the process of
beating them is the code. Or your photo is the data, and the filter applied
to it for enhancement is the code running in the background.
In the above code for tax, income = 12000 and tax = 0.12 are the data. The code
is the conditions and calculations applied to the income and the tax percentage,
along with the instructions to get the final payable amount.
Creating Data
Lesson 2 of 4
Creating data is simple in Python. For example, to create a string named "jedi,"
use:
jedi = "Luke Skywalker"
To execute this line of code in Jupyter, press Shift+Enter. Once you run it, the
system creates a space in memory to store this text. Also, whenever you refer to
jedi in the code, you get its value.

Python stores jedi as text since its value is a name. If it were a number, Python
would store it as a number. Numbers can be either decimal fractions or whole
numbers, and we will look at the difference later.
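Referring to a variable after creating it, and seeing how Python stores it, can be sketched like this. The built-in type function is not introduced in the lesson and is used here only to show the kind of data stored.

```python
jedi = "Luke Skywalker"   # text, stored as a string
price_apple = 1.5         # decimal fraction, stored as a float
number_of_apples = 10     # whole number, stored as an integer

print(jedi)                             # Luke Skywalker
print(type(jedi).__name__)              # str
print(type(price_apple).__name__)       # float
print(type(number_of_apples).__name__)  # int
```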
However, what is essential is that you get storage space for this data internally.
For example, let us say you are out to buy apples. What is the price of each
apple? To record this, create some data called price_apple. For example,
price_apple = 1.5
In computer terminology, such data items are called variables because their values
may vary over time. In earlier languages like C, you could have data that varied
over time (variables) and data that stayed constant. The programming community
eventually adopted the term "variable" for any data value.

Thus, you declare variables like


price_apple = 1.5
parking_fees = 0.5
number_of_apples = 10

and so on.
The Python interpreter will now create space for storing these in memory. It
stores all variables in bytes (collections of bits, that is, 1s and 0s).
Each variable is allocated a certain amount of storage in memory. Here, memory
always means the RAM and not the hard drive. So here are some sample location
spaces.
These are not the actual memory locations because, in Python, you cannot know
the exact memory area used, and neither do you need to.
All of this is managed internally. What is essential is the way you create data.
That is all you need to learn.

So you have the price of each apple here. Then you have the parking fees and
the number of apples you wish to buy.
Now when you print the parking fees, you get its value.
A simple calculation would tell you how much you need to pay for the apples. You
could calculate this using some mathematical operators like addition and
multiplication.
Using Data with Code
Lesson 3 of 4
Let us learn to manipulate data using some arithmetic operators. The following
are some of the arithmetic operators supported in Python.
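As a quick sketch, Python's standard arithmetic operators can be tried on two sample numbers. The values 7 and 2 are arbitrary and chosen only for illustration.

```python
a = 7
b = 2

print(a + b)   # addition: 9
print(a - b)   # subtraction: 5
print(a * b)   # multiplication: 14
print(a / b)   # division: 3.5
print(a // b)  # floor division (whole-number part): 3
print(a % b)   # modulus (remainder): 1
print(a ** b)  # exponent (power): 49
```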
Here is an example of using arithmetic operators to manipulate numeric data. We
have created three variables: price_apple, number_of_apples, and parking_fees.
Note the use of underscores instead of spaces in the variable names. You will
understand why when you go through the rules for naming variables.
To calculate the expenditure, we use the total variable. We use the
multiplication operator (*) to multiply and the addition operator (+) to add.
Finally, we print the total.
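The calculation described above can be sketched as follows, using the values given earlier in the lesson for the three variables.

```python
price_apple = 1.5
number_of_apples = 10
parking_fees = 0.5

# Cost of all the apples plus the parking fees.
total = price_apple * number_of_apples + parking_fees
print(total)  # 15.5
```

Multiplication is carried out before addition, so no parentheses are needed here.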

Now, what if you wanted to calculate the total price of getting some plums and
apples? Well, you could write the following code.
Here you create two more variables, price_plums and number_of_plums.
Also, notice that the total price increases, as it is now the sum of the prices
of apples, plums, and parking fees.
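A minimal sketch of this extended calculation is shown below. The lesson does not state the price or number of plums, so the values 2.0 and 4 are assumptions made only for this example.

```python
price_apple = 1.5
number_of_apples = 10
parking_fees = 0.5

# Assumed values for illustration; the lesson does not give them.
price_plums = 2.0
number_of_plums = 4

# Apples, plus plums, plus parking.
total = (price_apple * number_of_apples
         + price_plums * number_of_plums
         + parking_fees)
print(total)  # 23.5
```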
Next, assume that you and your partner share all the expenses. So, to find each
share, the total has to be divided by 2.
You can make the output of the print function more user-friendly by passing it
multiple arguments.

Here you have a string followed by a variable with division operator (/) and
another string at the end.
Notice how each of these is separated using commas, and when you run it, you
see a neat output.

If you put the "total/2" operation within the string, it would not be considered an
operation but a part of the string.
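Both cases can be sketched side by side. The wording of the messages is made up for this example, and total reuses the apples-plus-parking figure from the earlier calculation.

```python
total = 15.5  # apples plus parking fees, from the earlier example

# Three arguments separated by commas: a string, a calculation, a string.
print("Each of us pays", total / 2, "dollars")  # Each of us pays 7.75 dollars

# Inside the quotes, total/2 is just text, so nothing is calculated.
print("Each of us pays total/2 dollars")        # Each of us pays total/2 dollars
```

The print function puts a single space between the arguments, which is why the first output reads as a normal sentence.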

Next, what if you do not want so many commas? Well, you can use an f-string. It is
also the recommended way of embedding expressions (variables or calculations)
in a string.
You begin by adding the letter 'f' at the start of the string, before the opening
quote, and then enclose the variables or calculations within curly braces.
f-strings are remarkably useful, and you will come across them quite often.
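A short sketch of the same message written with an f-string follows. As before, the wording of the message is made up for this example.

```python
total = 15.5  # apples plus parking fees, from the earlier example

# The expression inside the curly braces is calculated and placed in the text.
message = f"Each of us pays {total / 2} dollars"
print(message)  # Each of us pays 7.75 dollars
```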
Syntax and Naming Conventions
Lesson 4 of 4
Now let us learn about the syntax and naming conventions for variables.
A variable can have a short name like x and y or a more descriptive name like
age, price_apple, and max_temperature.
Naming Conventions for Variables
Here are a few naming conventions for variables.
• A variable name must start with a letter (a-z, A-Z) or the underscore
character (_).
• A variable name cannot start with a number or any other symbol.
• A variable name can only contain alphanumeric characters (a-z, A-Z, 0-9)
and underscores. So the following variable names are allowed:
  o price_apple = 1.5
  o price_apple_1 = 1.5
  o apple = 1.5
  o PriceApple = 1.5
  o PriceApple2 = 3.0
• But the following variable names will throw an error:
  o 1_price_apple = 1.5
  o price-apple = 1.5
  o price apple = 1.5
  o apples_&_plums = 22
• A variable name is case-sensitive. So jedi, Jedi, and JEDI are all different
variable names in Python.
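These rules can be checked without triggering an error by using the string method isidentifier, which is not mentioned in the lesson. It is only a rough check, since it also returns True for reserved words such as if, which still cannot be used as variable names. The values Luke, Leia, and Yoda are made up for the example.

```python
# Names taken from the two lists above.
allowed = ["price_apple", "price_apple_1", "apple", "PriceApple", "PriceApple2"]
not_allowed = ["1_price_apple", "price-apple", "price apple", "apples_&_plums"]

for name in allowed + not_allowed:
    # isidentifier() is True when the text follows Python's naming rules.
    print(name, "->", name.isidentifier())

# Names are case-sensitive, so these are three separate variables.
jedi = "Luke"
Jedi = "Leia"
JEDI = "Yoda"
print(jedi, Jedi, JEDI)  # Luke Leia Yoda
```

The loop prints True for every name in the first list and False for every name in the second.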
However, unofficial guidelines also exist among programmers. Sometimes they
are documented, and they differ from firm to firm or team to team.
In addition, you have the PEP 8 guidelines, which are not official rules but
recommendations. Here are a few such guidelines.
Names to avoid or retain

Avoid names that are too general or too wordy. Instead, strike a good balance
between the two.
Examples of bad naming

• data_structure
• my_list
• info_map
• dictionary_for_the_purpose_of_storing_data_representing_word_definitions

Examples of good naming

• user_profile
• menu_options
• word_definitions
Points to remember

• Do not name things "O" (uppercase o), "l" (lowercase L), or "I" (uppercase i),
because they are easily confused with the digits 0 and 1.
• When using CamelCase names, capitalize all letters of an abbreviation
(e.g., HTTPServer).
