Week 1

Basic Python-1

Uploaded by Shashank S

1. What do programmers do?

Programmers are essentially problem solvers. They take a real-world problem and design
a step-by-step, logical solution that a computer can execute. The medium for this solution is
code, written in a programming language.

Problems Solvable by Programming

A problem is a good candidate for a programming solution if it has the following characteristics:

●​ Repetitive: Involves tasks that need to be done over and over.
○​ Example: Calculating the final grade for 300 students in a class. Doing it by
hand is tedious and prone to error. A program can do it instantly and
accurately.
●​ Data-Intensive: Involves processing, searching, or organizing large amounts of
information.
○​ Example: Finding all customers in a database of millions who live in a
specific zip code and have purchased a certain product.
●​ Rule-Based: The steps to solve the problem can be clearly defined by a set of logical
rules and conditions.
○​ Example: A chess game. The rules for how each piece can move are
absolute and can be perfectly encoded in a program.

Algorithm-centric vs. Data-centric Solutions

When solving problems, programmers can take different approaches, often falling into two
broad categories:

●​ Algorithm-centric: This approach focuses on the process or the steps required to
transform an input into an output. The primary challenge is designing the most
efficient sequence of operations.
○​ Word Problem Example: You are given a shuffled deck of 52 cards and need
to find the Ace of Spades.
○​ Solution Focus: The core of the problem is the method you use to search.
Do you go through the cards one by one from the top? Do you split the deck
in half and have two people search? The efficiency of the how is the main
concern. Sorting a list, finding the shortest path on a map, and compressing a
file are classic algorithm-centric problems.
●​ Data-centric: This approach focuses on the structure and relationships within the
data itself. The primary challenge is modeling the data correctly so that complex
questions can be answered easily.
○​ Word Problem Example: You are managing a library. You need to track
books, members, and loans. You need to quickly answer questions like,
"Which books by Isaac Asimov are currently checked out by John Doe?" or
"Which members have overdue books?"
○​ Solution Focus: The core of the problem isn't a complex sequence of steps,
but rather how you organize the information. You would create clear
definitions for Book (with properties like title, author), Member (with name,
address), and Loan (linking a book to a member with a due date). The
solution revolves around the what (the data model), not the how. Building
social networks, banking systems, and e-commerce platforms are
data-centric.
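The library example can be sketched as a minimal data model in Python. All class and field names below are illustrative assumptions, not a prescribed schema:

```python
from dataclasses import dataclass
from datetime import date

# Illustrative data model for the library example; names are assumptions.
@dataclass
class Book:
    title: str
    author: str

@dataclass
class Member:
    name: str
    address: str

@dataclass
class Loan:
    book: Book
    member: Member
    due: date

# With the data modeled, complex questions reduce to simple filters.
loans = [
    Loan(Book("Foundation", "Isaac Asimov"), Member("John Doe", "12 Elm St"), date(2025, 1, 10)),
    Loan(Book("Dune", "Frank Herbert"), Member("Jane Roe", "9 Oak Ave"), date(2025, 2, 1)),
]
asimov_by_john = [loan.book.title for loan in loans
                  if loan.book.author == "Isaac Asimov" and loan.member.name == "John Doe"]
print(asimov_by_john)  # ['Foundation']
```

Notice that no clever algorithm is involved; once the data is organized, answering the question is a one-line filter.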

2. What is (and isn't) an algorithm?


An algorithm is a finite sequence of well-defined, unambiguous, computer-implementable
instructions to solve a class of problems or to perform a computation.

For a set of instructions to be considered an algorithm, it must have five key properties:
1.​ Finiteness: It must eventually terminate after a finite number of steps.
2.​ Definiteness: Each step must be precisely and unambiguously defined. There
should be no room for interpretation.
3.​ Input: It takes zero or more well-defined inputs.
4.​ Output: It produces one or more well-defined outputs.
5.​ Effectiveness: Each instruction must be basic enough that it can, in principle, be
carried out by a person using only a pencil and paper.
●​ Example of an Algorithm: A recipe for making a cup of tea.
○​ Input: Tea bag, water, kettle, cup.
○​ Steps:
■​ Fill the kettle with water.
■​ Boil the water in the kettle.
■​ Place the tea bag in the cup.
■​ Pour the boiled water from the kettle into the cup.
■​ Wait for 2 minutes.
■​ Remove the tea bag.
○​ Output: A cup of tea.​
This is an algorithm because it's finite (it ends), definite (each step is clear),
and effective.
●​ Example of what ISN'T an Algorithm: "Live a good life."
○​ This is not an algorithm because it fails the definiteness property. The steps
are not precisely defined. What does "be kind" mean in every possible
situation? What is the exact procedure for "finding happiness"? It is
ambiguous and cannot be translated into a concrete, executable process.
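In contrast to "Live a good life," the tea recipe can be written as a short function whose steps are finite, definite, and effective. A sketch, with each step reduced to a string for illustration:

```python
def make_tea():
    """Run the tea algorithm: a finite, ordered list of definite steps."""
    steps = [
        "Fill the kettle with water",
        "Boil the water in the kettle",
        "Place the tea bag in the cup",
        "Pour the boiled water from the kettle into the cup",
        "Wait for 2 minutes",
        "Remove the tea bag",
    ]
    for step in steps:   # finiteness: the loop visits each step exactly once
        print(step)
    return "A cup of tea"  # the well-defined output

print(make_tea())
```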
3. Algorithms for Prime numbers
Let's develop an algorithm by directly applying the definition of a prime number.

Definition: A prime number is a natural number greater than 1 that has no positive divisors
other than 1 and itself.

Problem: Write an algorithm that takes an integer n as input and outputs True if n is prime,
and False otherwise.

Algorithm 1: Direct Application of the Definition

To check if n is prime, we can check every number from 2 up to n-1 and see if any of them
divide n evenly. If we find even one such divisor, we know n is not prime. If we check all of
them and find no divisors, then n must be prime.

Here is the implementation in Python:

Python

def is_prime_v1(n):
    """Checks if a number is prime by testing divisibility from 2 to n-1."""
    if n <= 1:
        return False  # Primes must be greater than 1

    # Check for divisors from 2 up to n-1
    for i in range(2, n):
        if n % i == 0:
            # We found a divisor, so n is not prime
            return False

    # If the loop completes without finding any divisors, n is prime
    return True

# --- Examples ---
print(f"Is 29 prime? {is_prime_v1(29)}")  # Output: True
print(f"Is 30 prime? {is_prime_v1(30)}")  # Output: False
print(f"Is 2 prime? {is_prime_v1(2)}")    # Output: True

This algorithm works perfectly and is a direct translation of the definition. It demonstrates
how to turn a mathematical definition into a computational process. We can make it more
efficient, but its correctness comes from its faithful application of the core concept.
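One standard efficiency improvement, sketched here as an aside: if n = a * b, then at least one of a and b is at most the square root of n, so it suffices to test divisors up to that point.

```python
import math

def is_prime_v2(n):
    """Prime test that only checks odd divisors up to the square root of n."""
    if n <= 1:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False  # even numbers greater than 2 are never prime
    # Only odd candidates from 3 up to isqrt(n) need checking
    for i in range(3, math.isqrt(n) + 1, 2):
        if n % i == 0:
            return False
    return True

print(is_prime_v2(29))  # True
print(is_prime_v2(30))  # False
print(is_prime_v2(2))   # True
```

For n around one million this performs roughly a thousand divisions instead of a million, while remaining faithful to the same definition.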
4. Key Programming Trends
The world of programming is constantly evolving. Two major trends are the
"democratization" of the field and a shift in the most valuable skills.

Understanding the "democratization" of programming

This refers to the process of making programming more accessible to everyone, not just
those with a formal Computer Science degree. Several factors drive this:

●​ High-Level Languages: Languages like Python use syntax that is closer to human
language, abstracting away complex details like memory management. This lowers
the barrier to entry.
●​ Open-Source Libraries: Programmers no longer need to build everything from
scratch. Need to create a chart? There's a library for that (Matplotlib). Need to do
complex math? There's a library for that (NumPy). You can stand on the shoulders of
giants.
●​ GenAI and Low-Code Tools: AI assistants like Gemini and tools like Zapier allow
people to create functional applications and automate tasks by describing what they
want in plain English, further lowering the technical bar.

The Skills to Develop

As tools get better, the value is shifting from just being able to write code to higher-level
skills:

1.​ Problem Decomposition: The ability to take a large, vague problem and break it
down into smaller, manageable, and solvable pieces. This is the most crucial skill.
2.​ Systems Thinking: Understanding how your piece of code fits into the larger
system. It's not just about making your function work, but ensuring it works correctly
with the database, the user interface, and other services.
3.​ Critical Evaluation & Debugging: Especially with AI-generated code, you need the
skill to read, understand, and find flaws in code. You must be the expert who verifies
and fixes the AI's output.
4.​ Tool Proficiency: Mastering your tools—your code editor, version control (like Git),
and AI assistants—is essential for efficiency. The modern programmer orchestrates
tools rather than just typing characters.
5.​ Learning and Adaptation: The most important skill of all. The hot new technology of
today will be legacy tomorrow. You must be committed to continuous learning to stay
relevant.
5. How effective is GenAI at coding?
Generative AI is surprisingly effective at solving common coding problems, especially
those found in competitive programming contests. These contests provide an excellent
benchmark because the problems are well-defined, have clear correct answers, and are
ranked by difficulty.

Performance on Coding Contest Problems


●​ Google's AlphaCode (2022): This was a landmark achievement. AlphaCode
competed in contests on the Codeforces platform, which hosts thousands of human
programmers. It achieved an average rank placing it within the top 54% of human
competitors. This means it performed better than almost half of the human
participants who entered the contests.
●​ How it works: GenAI doesn't "think" about the problem like a human. It has been
trained on billions of lines of code from public repositories like GitHub. When it sees a
new problem, it performs sophisticated pattern matching to find similar problems it
has seen in its training data and adapts the solutions it learned from them.
●​ Strengths: It excels at standard algorithm problems (e.g., sorting, searching, basic
data structures) where many examples exist in its training data. It can generate
syntactically correct and often idiomatic code very quickly.
●​ Weaknesses: It struggles with problems that require a truly novel insight or a deep,
creative leap of logic not present in its training data. It can also produce code that is
subtly wrong or fails on "edge cases" (unusual inputs).

In short, GenAI is like a very knowledgeable student who has memorized every textbook but
sometimes lacks the deep understanding to solve a problem they've never seen before.

6. Refuting an algorithm
To refute an algorithm, you don't need to prove it's wrong for all inputs. You only need to find
one single counterexample.

A counterexample is a specific, valid input for which the algorithm produces an incorrect
output, fails to stop, or crashes.

Example Problem: A fellow programmer proposes an "efficient" algorithm to find the
maximum value in a list of positive numbers.
●​ Proposed Algorithm: "To find the largest number, you only need to compare the
very first and the very last numbers in the list. The greater of these two is the
maximum."
How to Refute It:

We need to find a list (a counterexample) where this logic fails.

1.​ Think about the algorithm's weakness: The algorithm completely ignores all the
numbers in the middle of the list.
2.​ Construct a counterexample: Create a list where the largest number is not at the
beginning or the end.
○​ Let's try the list: [10, 50, 20]
3.​ Test the algorithm with the counterexample:
○​ Input: [10, 50, 20]
○​ Algorithm's Steps:
■​ Get the first number: 10.
■​ Get the last number: 20.
■​ Compare them: 20 is greater than 10.
○​ Algorithm's Output: 20.
4.​ Compare with the correct output:
○​ The actual maximum value in [10, 50, 20] is 50.
5.​ Conclusion: The algorithm's output (20) does not match the correct output (50).
Therefore, the list [10, 50, 20] is a valid counterexample that refutes the proposed
algorithm.
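The refutation can be checked mechanically by coding the proposed rule and comparing it against Python's built-in max:

```python
def proposed_max(numbers):
    """The flawed algorithm: compare only the first and last elements."""
    return max(numbers[0], numbers[-1])

counterexample = [10, 50, 20]
print(proposed_max(counterexample))  # 20 (the algorithm's wrong answer)
print(max(counterexample))           # 50 (the correct maximum)
# A single input where the outputs disagree is enough to refute the algorithm.
assert proposed_max(counterexample) != max(counterexample)
```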

7. Why is GenAI so effective at coding?


The primary reason for GenAI's effectiveness is that it has been trained on an unimaginably
vast dataset of human-written code. The main source for this data is open-source software
repositories, most notably GitHub.

Use of Open-Source Software to Train Models


●​ The "What": When a Large Language Model (LLM) is trained for coding, it is fed the
source code of millions of projects. It "reads" everything: the code itself, the
comments written by developers, the associated documentation, and even
discussions about bugs and features.
●​ The "How": The model isn't just memorizing code. It's learning statistical
relationships. It learns that after for i in range(, the next word is often n):. It learns that
if the comments mention "sorting," the code is likely to contain list.sort() or sorted(). It
learns common patterns, standard function names, and the "shape" of solutions to
common problems (e.g., how to read a file, how to make a web request).
Misuse and Ethical Concerns

This training method is not without controversy, centering on software licenses.

●​ The Problem of Licensing: Open-source code is free to use, but not without rules.
Licenses like the GPL (GNU General Public License) have a "copyleft" provision: if
you use GPL-licensed code in your project, your project must also be open-sourced
under the GPL. In contrast, permissive licenses like MIT or Apache have very few
restrictions.
●​ The Misuse: AI models are trained on all of this code, regardless of license. When
the AI then generates code for a user, it might produce a snippet that is a derivative
of GPL-licensed code it saw during training. If a company uses this AI-generated
snippet in their proprietary, closed-source product, they could be in violation of the
original GPL license without even knowing it.
●​ The Debate: This has led to lawsuits and intense debate. Is an AI learning from code
the same as a human learning from it? Or is it a form of automated copyright and
license laundering? The legal and ethical frameworks are still struggling to catch up
with the technology.

In essence, GenAI is so effective because it stands on the shoulders of the entire
open-source community, but it sometimes does so without fully respecting the rules that
community established.

8. Assumptions in AI-generated code


A significant (and current) weakness of AI-generated code is that it often makes implicit
assumptions about the context and the inputs it will receive. It generates the most
"statistically likely" solution, which often corresponds to the simplest, most common "happy
path" scenario.

A Weakness: Assuming the Happy Path

Let's say you ask an AI: "Write a Python function to calculate the average of a list of
numbers."

The AI will likely generate something like this:

Python

# AI-generated code
def calculate_average(numbers):
    return sum(numbers) / len(numbers)

This code is clean, simple, and correct... for the "happy path." But it makes several
dangerous assumptions:
●​ It assumes the list numbers will never be empty. If you pass it an empty list [],
len(numbers) will be 0, and the code will crash with a ZeroDivisionError.
●​ It assumes the list will only contain numbers. What if someone passes [1, 2,
'apple']? The sum() function will fail.

The AI doesn't automatically consider these "edge cases" because it doesn't possess true
understanding. It simply reproduces the most common pattern associated with the request.

Asking Clarifying Questions

A professional developer's job is to think beyond the happy path. Before writing (or
accepting) this code, they would ask clarifying questions to understand the full requirements.
This is a critical skill when using AI. You must act as the human supervisor.

Asking Good Clarifying Questions

Instead of just accepting the AI's code, you should "interview" the stakeholder (or even
prompt the AI with these constraints). Good questions for the average-calculator problem
would be:

●​ "What is the expected output if the input list is empty? Should it be 0, an error, or
something else?"
●​ "What should the function do if the list contains non-numeric data? Should it ignore
them, or raise an error?"
●​ "Are the numbers always integers, or can they be floating-point numbers? Is there a
required precision for the result?"

By asking these questions, you move from a fragile, assumption-based solution to a robust,
well-defined one. This critical thinking is what currently separates a human developer from
an AI code generator.
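Once those questions are answered, the code can encode the decisions explicitly. The policy below (raise on an empty list, raise on non-numeric items) is one assumed set of answers, chosen purely for illustration:

```python
def calculate_average(numbers):
    """Average of a list of numbers.

    Assumed policy (one possible answer to the clarifying questions):
    an empty list raises ValueError; non-numeric items raise TypeError.
    """
    if not numbers:
        raise ValueError("cannot average an empty list")
    total = 0.0
    for x in numbers:
        # bool is a subclass of int, so exclude it explicitly
        if not isinstance(x, (int, float)) or isinstance(x, bool):
            raise TypeError(f"non-numeric item: {x!r}")
        total += x
    return total / len(numbers)

print(calculate_average([1, 2, 3, 4]))  # 2.5
```

The function is longer than the AI's one-liner, but every behavior on an edge case is now a deliberate choice rather than an accident.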

9. Key Takeaways up to this point


1.​ Programming is Problem Solving: The core job of a programmer is to create
precise, logical, and finite solutions (algorithms) to problems.
2.​ Algorithms are Recipes: An algorithm is a clear, unambiguous, step-by-step
procedure. To disprove one, you only need a single counterexample.
3.​ GenAI is a Powerful Pattern Matcher: Tools like Gemini and AlphaCode are
effective at coding because they've been trained on the vast library of human-written
open-source code. They excel at recognizing and reproducing common solutions.
4.​ GenAI Lacks True Understanding: The primary weakness of GenAI is its reliance
on assumptions. It often produces code for the "happy path" and fails to consider
edge cases (like empty inputs) unless specifically prompted.
5.​ The Modern Developer is a Critical Thinker: The value of a human programmer is
shifting. It's less about typing code quickly and more about decomposing problems,
asking smart clarifying questions to handle edge cases, and critically
evaluating/debugging all code, especially AI-generated code. Your job is to be the
expert supervisor of the tools you use.
10. Python Resources
Here is a curated list of resources for learning Python, suitable for various learning styles.
●​ Official Documentation (The Source of Truth)
○​ The Python Tutorial: (docs.python.org/3/tutorial/) - Maintained by the
creators of Python. It's concise, accurate, and the best place to start.
●​ Interactive Learning Platforms
○​ Real Python: (realpython.com) - Offers extremely high-quality, in-depth
tutorials on a huge range of topics, from beginner to advanced.
○​ freeCodeCamp: (freecodecamp.org) - Provides full-length, project-based
courses on Python for free, including data science and machine learning.
○​ Codecademy: (codecademy.com) - Excellent for absolute beginners, with an
interactive, in-browser editor that guides you through the first steps.
●​ Essential Books
○​ "Automate the Boring Stuff with Python" by Al Sweigart: A fantastic,
practical book for beginners that focuses on using Python to automate
real-world tasks. The full book is legally available to read for free online.
○​ "Python Crash Course" by Eric Matthes: A fast-paced, project-based
introduction. Part 1 covers the fundamentals, and Part 2 walks you through
three substantial projects.
●​ Video-Based Learning
○​ Corey Schafer's YouTube Channel: Widely regarded as one of the best
programming channels. His Python playlists are clear, thorough, and
professional.
●​ Practice and Challenges
○​ LeetCode: The standard for practicing algorithm and data structure problems,
essential for interview preparation.
○​ HackerRank: Offers guided tracks and challenges for different skill domains
within Python.
○​ Codewars: A fun, "kata"-based system where you solve problems to rank up,
with a focus on elegant and clever solutions.

11. The REPL


The REPL is one of a programmer's most fundamental tools. The name is an acronym for
Read-Evaluate-Print Loop, which describes exactly what it does:
1.​ Read: It reads the line of code you type.
2.​ Evaluate: It executes that code.
3.​ Print: It prints the result or output of the code.
4.​ Loop: It then waits for you to type another line of code.
It's an interactive command-line environment for the Python language. Think of it as a
playground or a scratchpad. It's perfect for quickly testing small snippets of code, checking
the behavior of a function, or performing simple calculations without the overhead of creating
a .py file.

Using the REPL


1.​ Open your terminal or command prompt.
2.​ Type python or python3 and press Enter.
3.​ You will see a prompt, usually >>>. You can now type Python code.

Example Playground Session

Bash

$ python3
Python 3.11.4 (main, Jun 7 2023, 10:13:09) [Clang 14.0.3 (clang-1403.0.22.14.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> # This is the REPL prompt. Let's try some code.
>>>
>>> 2 + 2
4
>>>
>>> message = "Hello, playground!"
>>> print(message)
Hello, playground!
>>>
>>> len(message)
18
>>>
>>> # You can test how functions work
>>> message.upper()
'HELLO, PLAYGROUND!'
>>>
>>> # To exit the REPL, you can type exit() or press Ctrl-D
>>> exit()
$
12. Basic Arithmetic Operators
Python provides a standard set of arithmetic operators to perform calculations, just like a
regular calculator. Here's a look at the most common ones through simple expressions.
●​ + (Addition): Adds two numbers.
○​ 10 + 5 # Result: 15
●​ - (Subtraction): Subtracts the right number from the left.
○​ 10 - 5 # Result: 5
●​ * (Multiplication): Multiplies two numbers.
○​ 10 * 5 # Result: 50
●​ / (True Division): Divides the left number by the right. The result is always a
floating-point number (a number with a decimal).
○​ 10 / 3 # Result: 3.3333333333333335
●​ // (Floor Division): Divides and then rounds the result down to the nearest whole
number (integer).
○​ 10 // 3 # Result: 3
○​ 11 // 3 # Result: 3
○​ -10 // 3 # Result: -4 (floor division always rounds toward negative infinity)
●​ % (Modulo): Returns the remainder of a division. This is incredibly useful for
checking if a number is even or odd.
○​ 10 % 3 # 10 divided by 3 is 3 with a remainder of 1. Result: 1
○​ 12 % 2 # 12 is even, so the remainder is 0. Result: 0
○​ 13 % 2 # 13 is odd, so the remainder is 1. Result: 1
●​ ** (Exponentiation): Raises the left number to the power of the right.
○​ 2 ** 3 # Same as 2 * 2 * 2. Result: 8
○​ 5 ** 2 # Same as 5 * 5. Result: 25
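The even/odd check mentioned under the modulo operator takes one line:

```python
def is_even(n):
    """A number is even exactly when dividing by 2 leaves no remainder."""
    return n % 2 == 0

print(is_even(12))       # True
print(is_even(13))       # False
print(10 // 3, 10 % 3)   # 3 1  (quotient and remainder of 10 divided by 3)
```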

13. Complex Expressions


When an expression contains multiple operators, Python doesn't simply evaluate it from left
to right. It follows a well-defined order of operations, which ensures that expressions are
evaluated consistently. This order is often remembered by the acronym PEMDAS or
BODMAS.

The Order of Operations (Precedence)


1.​ Parentheses ()
2.​ Exponents **
3.​ Multiplication *, Division /, //, % (evaluated left-to-right)
4.​ Addition +, Subtraction - (evaluated left-to-right)

The official Python documentation has a complete table of operator precedence, which is the
ultimate source of truth for any complex case.
Comprehending a Complex Expression: Step-by-Step

Let's analyze the expression: 5 + 2 * 3**2 - (7-1)

1.​ Parentheses (): First, evaluate anything inside parentheses.
○​ (7-1) becomes 6.
○​ The expression is now: 5 + 2 * 3**2 - 6
2.​ Exponents **: Next, handle any exponentiation.
○​ 3**2 becomes 9.
○​ The expression is now: 5 + 2 * 9 - 6
3.​ Multiplication * and Division /: Now, perform multiplication and division from left to
right.
○​ 2 * 9 becomes 18.
○​ The expression is now: 5 + 18 - 6
4.​ Addition + and Subtraction -: Finally, perform addition and subtraction from left to
right.
○​ 5 + 18 becomes 23.
○​ The expression is now: 23 - 6
○​ 23 - 6 becomes 17.

Final Result: 17

Rule of Thumb: When in doubt, use parentheses (). They make the order of operations
explicit and your code easier for others (and your future self) to read. The expression (5 + (2
* (3**2))) - (7-1) is much clearer, even though it yields the same result.
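Both forms can be checked directly in Python:

```python
# The implicit form relies on operator precedence...
implicit = 5 + 2 * 3**2 - (7 - 1)
# ...while the fully parenthesized form spells out the same order.
explicit = (5 + (2 * (3 ** 2))) - (7 - 1)
print(implicit)              # 17
print(implicit == explicit)  # True
```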

14. Limitations of Arithmetic Computation


While computers are amazing at math, their calculations are not perfect. This is because
computers have finite memory to store numbers. This fundamental constraint leads to two
major types of limitations.

1. Integer Size (Historically Important)

In many programming languages (like C or Java), an integer type has a fixed number of bits
(e.g., 32 or 64). This means there's a maximum number they can store. If you try to add 1 to
the maximum possible integer, it can "overflow" and wrap around to a very small negative
number, causing a serious bug.

●​ Python's Advantage: Python is special in this regard. Its standard integers (int) have
arbitrary precision. This means they can grow to use as much memory as needed
to store a number, so you won't experience overflow errors with standard integers in
Python.
2. Floating-Point Inaccuracy (The Real Problem)

This is a much more common and subtle limitation. Floating-point numbers (float) are used
to represent numbers with decimal points (e.g., 3.14159, 0.1). They are stored in a binary
format (IEEE 754 standard) that can't precisely represent all decimal fractions, just like we
can't write 1/3 perfectly as a finite decimal.

●​ The Classic Example: The number 0.1 is a simple, finite decimal for us, but it's an
infinitely repeating fraction in binary. The computer has to cut it off at some point,
leading to a tiny representation error.
●​ Demonstration in the REPL:

Python

>>> 0.1 + 0.2
0.30000000000000004

●​ As you can see, the result is not exactly 0.3. The tiny errors from representing 0.1
and 0.2 in binary were added together, resulting in a small but noticeable
discrepancy.

Why this matters: For most applications, this tiny error is irrelevant. But in scientific,
financial, or engineering calculations where high precision is critical, these small errors can
accumulate over millions of calculations and lead to significant and incorrect results. For
these cases, programmers use specialized tools like Python's Decimal module.
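As one concrete illustration of the Decimal module mentioned above, constructing Decimal values from strings keeps decimal fractions exact:

```python
from decimal import Decimal

# float arithmetic carries binary representation error...
print(0.1 + 0.2)                        # 0.30000000000000004
# ...while Decimal values built from strings stay exact in base 10.
print(Decimal("0.1") + Decimal("0.2"))  # 0.3
```

The trade-off is speed: Decimal arithmetic is implemented in software and is considerably slower than native floats, which is why it is reserved for cases (such as money) where exactness matters.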

15. Investigation: sys.float_info


The sys module in Python provides access to system-specific parameters and functions.
One of its useful attributes is sys.float_info, which is an object that gives you detailed
information about how floating-point numbers are represented on your specific machine. It
allows you to peek under the hood at the limitations we discussed.

Exploring in the REPL

Let's use the REPL to see what it contains.

Python

>>> import sys # First, we must import the 'sys' module to use it
>>>
>>> sys.float_info
sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308,
min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-308, dig=15, mant_dig=53,
epsilon=2.220446049250313e-16, radix=2, rounds=1)
Understanding Key Attributes from the Documentation

By looking at this output (and consulting the Python documentation), we can understand
some of the key limits of floating-point numbers on our system:

●​ max: 1.797...e+308
○​ This is the maximum representable float. Any number larger than this will
be considered "infinity."
○​ 1.79e+308 is scientific notation for 1.79 × 10^308. It's a massive number.
●​ min: 2.225...e-308
○​ This is the smallest positive normalized float. Even smaller "subnormal"
values can still be represented (down to about 5e-324), though with reduced precision.
●​ epsilon: 2.220...e-16
○​ This is arguably the most important attribute for understanding precision.
Epsilon is the smallest possible difference between 1.0 and the next
representable floating-point number.
○​ In other words, 1.0 + epsilon will result in a number slightly different from 1.0,
but 1.0 + (epsilon / 2) will still evaluate to 1.0 because the change is too small
to be registered. This number quantifies the precision limit of float arithmetic.
○​ Let's test this!

Python

>>> 1.0 + sys.float_info.epsilon
1.0000000000000002
>>>
>>> 1.0 + (sys.float_info.epsilon / 2)
1.0

This investigation confirms that our computer's arithmetic has finite precision and provides
us with the exact values that define its boundaries.
16. Binary digits (bits)
The most fundamental concept in all of computing is that all data—numbers, text, images,
videos—is stored as sequences of binary digits, or bits.
●​ A bit is the smallest unit of data and can only have one of two values: 0 or 1.

The Light Switch Analogy

Think of a bit as a single light switch. It can either be Off (0) or On (1). There are only two
possible states. This two-state system is easy to represent physically in computer hardware:

●​ Low voltage vs. High voltage


●​ No magnetic charge vs. Magnetic charge
●​ A microscopic pit vs. no pit on a CD/DVD

Representing More Data

With just one bit, you can only represent two things. To represent more complex information,
we string bits together. Each time we add a bit, we double the number of distinct
combinations we can make.

●​ 1 bit: 2 combinations
○​ 0
○​ 1
●​ 2 bits: 4 combinations (2 * 2 = 2^2)
○​ 00
○​ 01
○​ 10
○​ 11
●​ 3 bits: 8 combinations (2 * 2 * 2 = 2^3)
○​ 000
○​ 001
○​ 010
○​ 011
○​ 100
○​ 101
○​ 110
○​ 111
●​ 8 bits (a Byte): 256 combinations (2^8)

This is the foundation of all digital representation. The number 7 might be stored as
00000111, the letter 'A' might be stored as 01000001, and a single pixel in a black-and-white
image might be stored as 0 (white) or 1 (black). Everything boils down to these simple 0s
and 1s.
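Python can display these bit patterns directly, using format() for binary and ord() for character codes:

```python
# The number 7 in binary, padded to one byte (8 bits)
print(format(7, '08b'))         # 00000111
# The letter 'A' has code point 65, which is 01000001 in binary
print(format(ord('A'), '08b'))  # 01000001
# Each extra bit doubles the number of distinct combinations
for bits in range(1, 9):
    print(bits, "bits:", 2 ** bits, "combinations")
```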
17. Naming Things
This section connects the concept of bits to a practical problem: how many bits are needed
to uniquely identify a certain number of distinct items?

The Core Question

If you have N distinct items, what is the minimum number of bits required to give each item a
unique label?

The Rule

The number of bits needed is the smallest integer b such that 2^b >= N.

Word Problem Example


●​ Problem: You need to create a unique binary code for every letter in the lowercase
English alphabet (a-z). How many bits are required?
1.​ Count the distinct items (N): There are 26 letters in the alphabet, so N = 26.
2.​ Find the necessary power of 2: We need to find the smallest b where 2^b is at least
26.
○​ 2^1 = 2 (Not enough)
○​ 2^2 = 4 (Not enough)
○​ 2^3 = 8 (Not enough)
○​ 2^4 = 16 (Still not enough to label 26 unique letters)
○​ 2^5 = 32 (This is enough! We can assign 26 unique codes and will have 6
codes left over)
3.​ Answer: You need 5 bits.

With 5 bits, you could assign 'a' to 00000, 'b' to 00001, 'c' to 00010, and so on, up to 'z' which
would be 11001.
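The rule maps neatly onto Python's int.bit_length(): for N >= 1, the smallest b with 2^b >= N is (N - 1).bit_length(). The helper name below is a hypothetical choice for illustration:

```python
def bits_needed(n_items):
    """Smallest b such that 2**b >= n_items, for n_items >= 1."""
    return (n_items - 1).bit_length()

print(bits_needed(26))   # 5  (the lowercase alphabet)
print(bits_needed(256))  # 8  (one byte's worth of values)
print(bits_needed(2))    # 1
```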

Useful Powers of 2 for Programmers

Certain powers of 2 appear so frequently in computing that they are worth remembering.
They often define the limits or sizes of common data types.

●​ 2^8 = 256: The number of values that can be represented by a byte.


●​ 2^10 = 1,024: The number of bytes in a kibibyte (KiB), often loosely called a kilobyte (which formally means 1,000 bytes).
●​ 2^16 = 65,536: The number of values in a 16-bit unsigned integer (often called a
"short"). Also the number of ports available in TCP/IP networking.
●​ 2^32 = ~4.3 billion: The limit for 32-bit memory addressing, which is why 32-bit
systems couldn't use more than ~4GB of RAM.
●​ 2^64: An enormous number, the limit for modern 64-bit systems.
18. Introducing "our friend"
Let's critique some statements from a hypothetical fellow learner to clear up common but
important misconceptions about how computers work.

Friend's Statement 1: "Computers use binary because it's just simpler for them."
●​ Critique: This statement is true on the surface but misses the critical "why." It's not
about simplicity in a cognitive sense (like an easy math problem). The real reason is
physical reliability.
●​ Deeper Explanation: Electronic circuits are built from transistors, which act like tiny
switches. It is vastly more reliable and less prone to error to design a circuit that only
has to distinguish between two physical states (e.g., a low voltage for '0' and a high
voltage for '1') than it would be to design a circuit that has to reliably distinguish
between ten different voltage levels for a base-10 system. A slight fluctuation in
power could easily be misread in a 10-state system, but it's much less likely to cause
a high voltage to be mistaken for a low one. The choice of binary is a direct
consequence of engineering for robustness and fault tolerance in physical hardware.

Friend's Statement 2: "My computer has a 64-bit processor, so it can only handle numbers
up to 2^64 - 1."
●​ Critique: This is a very common and understandable confusion between the
computer's architecture and how a high-level language like Python works. The
statement is false for Python integers.
●​ Deeper Explanation:
○​ What "64-bit" really means: A 64-bit architecture means the processor's
internal data pathways (registers) and memory addresses are 64 bits wide.
This makes it very efficient at performing calculations on numbers that fit
within 64 bits and allows it to access a massive amount of RAM.
○​ Python's Abstraction: Python's int type is not a raw 64-bit integer. It is a
more complex object that can use as much memory as needed to represent a
number of any size. If you calculate a number larger than 2^64 - 1, Python will
automatically allocate more memory to store it. This is called
arbitrary-precision arithmetic. It might be slightly slower than a native 64-bit
operation, but it prevents overflow errors and makes the language much
easier to use.
○​ In other languages (like C): In a language like C, this statement would be
true. If you define a standard 64-bit integer (uint64_t) and add 1 to its
maximum value, it will overflow. This is a key difference that highlights the
abstractions high-level languages provide.
19. Limits to float computation
We've seen that floating-point computation has limitations, but it's crucial to understand that
the primary issue is limited precision, not just a limited range.

Precision vs. Range

●​ Range: The difference between the smallest and largest number you can represent.
As we saw in sys.float_info, the range of floats is enormous.
●​ Precision: The number of significant digits that can be represented accurately. This
is where the real limitation lies.

The "Gaps" Between Numbers

Think of the number line. In pure mathematics, there are an infinite number of values
between any two numbers, say 1.0 and 1.1. But a computer using floating-point numbers
cannot store an infinite number of values. It has a finite number of bits (usually 64) to
represent a number.

This means there are "gaps" between representable floating-point numbers. The size of
these gaps changes:
●​ The gaps are very small for numbers close to zero.
●​ The gaps get larger as the numbers get larger.
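These gaps can be measured directly. The sketch below uses math.ulp() (available since Python 3.9), which returns the distance from a float to the next representable float:

```python
import math

# math.ulp(x) returns the gap between x and the next representable float.
print(math.ulp(1.0))          # about 2.22e-16: gaps near 1.0 are tiny
print(math.ulp(1_000_000.0))  # about 1.16e-10: gaps grow with magnitude
print(math.ulp(1e16))         # 2.0: up here, not even every integer fits
```

Note the last line: above roughly 9 × 10^15, the gap between consecutive floats exceeds 1, so some whole numbers cannot be stored exactly.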

The Consequence: Rounding Errors

When the result of a calculation falls into one of these gaps, it must be rounded to the
nearest representable number. This introduces a tiny error.
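The classic demonstration of such a rounding error fits in two lines of the REPL: neither 0.1 nor 0.2 has an exact binary representation, so their sum lands in a gap and is rounded.

```python
# Neither 0.1 nor 0.2 is exactly representable in binary floating point,
# so their sum is rounded to the nearest representable float.
print(0.1 + 0.2)         # 0.30000000000000004
print(0.1 + 0.2 == 0.3)  # False
```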

Example of Accumulating Error

This might seem insignificant, but these small errors can accumulate in calculations that
involve many steps.

Imagine a simulation where you add a very small value (0.0000001) to a running total a
billion times.

Python

total = 0.0
increment = 0.0000001
iterations = 1_000_000_000

for _ in range(iterations):
    total += increment  # each addition is rounded; a billion iterations takes a while

# Mathematically, we expect the result to be 100.0
print(f"Expected result: {increment * iterations}")
print(f"Actual result: {total}")

Output:

Expected result: 100.0
Actual result: 100.00000014901161

Each addition introduced a minuscule rounding error. After a billion additions, those tiny
errors compounded into a noticeable discrepancy. This is a critical issue in fields like
scientific computing, financial modeling, and game physics, where small, repeated errors
can lead to completely wrong outcomes.
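When many small terms must be added, the standard library offers a remedy: math.fsum() performs compensated summation, tracking the low-order bits that naive addition discards. A small sketch at a more manageable scale:

```python
import math

values = [0.1] * 10

# Naive left-to-right addition accumulates rounding error...
print(sum(values))        # 0.9999999999999999

# ...while math.fsum tracks the lost low-order bits and returns
# the correctly rounded result.
print(math.fsum(values))  # 1.0
```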

20. Python objects


The phrase "everything in Python is an object" is one of the most fundamental concepts in
the language.

An object is a self-contained entity that consists of both data (attributes) and procedures
to manipulate that data (methods). It's a way of abstracting and modeling a real-world
entity inside a program.

Analogy: A Real-World Car

Think of a specific car, your car.

●​ It has data (attributes):
○​ color = 'Red'
○​ make = 'Toyota'
○​ current_speed = 60
○​ fuel_level = 0.75
●​ It has behaviors (methods):
○​ accelerate()
○​ brake()
○​ turn_on_headlights()

An object in Python bundles these two things together. The accelerate() method inherently
knows how to modify the current_speed attribute of that specific car object.
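The car analogy can be written directly as Python. This is a minimal sketch (the class name, attributes, and method signatures are illustrative, not from any library):

```python
class Car:
    """Bundles data (attributes) with behavior (methods)."""

    def __init__(self, color, make):
        self.color = color       # data
        self.make = make
        self.current_speed = 0

    def accelerate(self, amount):
        # The method inherently knows how to modify THIS car's attribute.
        self.current_speed += amount

    def brake(self):
        self.current_speed = 0


my_car = Car('Red', 'Toyota')
my_car.accelerate(60)
print(my_car.current_speed)  # 60
```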

Objects in Practice

You use objects constantly in Python, even with the simplest data types.

Let's see this in the REPL using the built-in type() function.

Python

>>> type(5)
<class 'int'>
This tells us that the number 5 is not just a value; it's an instance of the int class (an integer
object).

Python

>>> type("hello")
<class 'str'>

The text "hello" is an instance of the str class (a string object).

Because "hello" is an object, it comes with built-in methods to manipulate its data:

Python

>>> greeting = "hello"
>>>
>>> # .upper() is a method of the string object
>>> greeting.upper()
'HELLO'
>>>
>>> # .startswith() is another method
>>> greeting.startswith('h')
True

This object-oriented nature makes Python code more organized and intuitive, as it allows us
to model complex systems by thinking about the "things" (objects) involved and what they
can do.
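A handy way to discover what any object can do is the built-in dir() function, which lists an object's attributes and methods:

```python
# dir() lists the attributes and methods of any object.
# Filter out the underscore-prefixed "special" names for readability.
methods = [name for name in dir("hello") if not name.startswith('_')]
print(methods[:5])  # e.g. ['capitalize', 'casefold', 'center', 'count', 'encode']
```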

21. Textual data


Computers don't understand letters, punctuation, or emojis. At their core, they only
understand numbers (specifically, binary numbers). To work with textual data, computers
need a standardized system to translate characters into numbers. This system is called
character encoding.

Abstracting Text as str Objects

In Python, all textual data is represented by the str object. This object handles all the
complexities of character encoding for you, so you can think about text as "a sequence of
characters" rather than "a sequence of numbers."

The History of Encoding


1.​ ASCII (American Standard Code for Information Interchange): This was one of
the earliest widespread standards.
○​ It uses 7 bits, allowing it to represent 2^7 = 128 different characters.
○​ This was enough for all uppercase and lowercase English letters, numbers
(0-9), standard punctuation, and some non-printable control characters.
○​ Limitation: ASCII had no way to represent characters from other languages
like é, ü, Я, or 汉.
2.​ Unicode (The Modern Standard): Unicode was created to solve the limitations of
ASCII.
○​ Goal: To provide a unique number—called a code point—for every character
in every human language, plus a vast collection of symbols and emojis.
Unicode can represent over 149,000 characters.
○​ UTF-8: Unicode itself is a giant map of characters to code points. UTF-8 is
the most common encoding that translates those code points into binary. It's a
clever, variable-width encoding:
■​ For any character that is also in ASCII (like 'A', 'B', 'C'), UTF-8 uses
just 1 byte (8 bits), making it backward compatible.
■​ For other characters (like '€' or '😊'), it uses 2, 3, or even 4 bytes.

Python 3's str object is Unicode by default. This is a major advantage, as it allows your
programs to handle text from any language seamlessly without you needing to worry about
the underlying encoding in most cases.
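This variable-width behavior is easy to observe with str.encode(), which converts a string into its raw UTF-8 bytes:

```python
# str.encode() returns the UTF-8 byte representation of a string.
for ch in ['A', '€', '😊']:
    data = ch.encode('utf-8')
    print(ch, len(data), data)

# 'A' uses 1 byte (same as ASCII), '€' uses 3 bytes, '😊' uses 4 bytes.
```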

22. Strings
A Python str object is an immutable sequence of Unicode characters. The key idea is that
while we see letters, Python sees the underlying integer code points.

Python provides two built-in functions, ord() and chr(), that allow us to easily move between
the character representation and its integer code point, making this abstraction visible.
●​ ord(): Takes a single character (a string of length 1) and returns its integer Unicode
code point.
●​ chr(): Takes an integer and returns the corresponding Unicode character as a string.

Representing Letters as Integers: An Example

Let's explore this relationship in the REPL.

Python

>>> # Get the integer code point for the letter 'A'
>>> ord('A')
65
>>>
>>> # Get the integer code point for the letter 'B'
>>> ord('B')
66
>>>
>>> # Get the integer code point for a lowercase letter 'a'
>>> ord('a')
97
This shows that uppercase and lowercase letters have different numerical values. The letters
of the alphabet are represented by consecutive integers.

Now let's go the other way.

Python

>>> # What character corresponds to the number 65?
>>> chr(65)
'A'
>>>
>>> # We can even do arithmetic with the code points
>>> ord('A') + 1
66
>>> chr(ord('A') + 1)
'B'

This ability to translate characters to and from integers is the foundation of all text
processing on a computer. Every string operation, from sorting text alphabetically to
changing a letter's case, is ultimately performed by manipulating these underlying numbers.

Example with a non-ASCII character:

Python

>>> # Euro currency symbol
>>> ord('€')
8364
>>>
>>> chr(8364)
'€'

This demonstrates how Unicode, via the str object, handles a much wider range of
characters than simple ASCII.
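As a concrete illustration of how case changes reduce to code-point arithmetic, here is a hedged sketch of an uppercasing function for plain ASCII letters (the helper name is hypothetical; the real str.upper() performs full Unicode case mapping, which is far more involved):

```python
def to_upper_ascii(text):
    # Lowercase ASCII letters sit exactly 32 code points above uppercase:
    # ord('a') - ord('A') == 97 - 65 == 32.
    result = []
    for ch in text:
        if 'a' <= ch <= 'z':
            ch = chr(ord(ch) - 32)
        result.append(ch)
    return ''.join(result)


print(to_upper_ascii('hello, world!'))  # HELLO, WORLD!
```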

23. Summary
This section summarizes the key concepts from topics 11 through 22, covering the
fundamentals of working with data in Python.
●​ The REPL is your playground: The Read-Evaluate-Print Loop is an interactive tool
for instantly testing code snippets, exploring how functions work, and performing
quick calculations.
●​ Numbers and Arithmetic: Python provides standard arithmetic operators (+, -, *, /,
//, %, **) that follow the PEMDAS order of operations. Parentheses () should be used
to ensure clarity.
●​ Computation has limits: All computer arithmetic is limited by the finite memory used
to store numbers. The most important limitation in Python is the finite precision of
floating-point numbers (float), which cannot represent all decimal values perfectly.
This leads to small rounding errors that can accumulate in large calculations. The
sys.float_info object allows you to inspect these limits.
●​ Everything is Binary: All data on a computer is stored as sequences of 0s and 1s
(bits). The number of distinct items you can represent is determined by the number of
bits you use (2^b items for b bits).
●​ Python's Abstractions: Python abstracts away many of the computer's underlying
complexities.
○​ Its int objects have arbitrary precision, meaning they can grow as large as
needed, unlike integers in many other languages.
○​ Everything in Python is an object, which bundles data (attributes) and
behaviors (methods) together.
●​ Text is Numbers: Textual data is stored by mapping each character to a unique
integer code point. Python's str object uses the Unicode standard, allowing it to
represent characters from virtually all human languages. The ord() and chr()
functions let you convert between a character and its integer code point, revealing
this underlying representation.
