Lab 1 - Intro

The document explains the fundamentals of programming, focusing on how computers execute programs through machine language and the Central Processing Unit (CPU). It discusses the importance of algorithms and data structures, emphasizing the analysis of their efficiency using concepts like running time, space usage, and asymptotic analysis. Additionally, it introduces Big O notation and its variants to describe algorithm performance in terms of input size, highlighting the significance of algorithm design for efficiency.


Starting from basics

What is a program?
A program is simply a list of instructions meant to be followed mechanically by a computer.

What language does the computer understand?


A computer is built to carry out instructions that are written in a very simple type of language called machine
language.

What does machine language look like?


Well, it looks like this:
The computer can directly execute a program only if the program is expressed in machine language. It can execute programs written in other languages only if they are first translated into machine language.
But how does the computer execute the program?
First, the program is stored in the computer's main memory which also holds data that is being used or processed by
the program. Main memory consists of a sequence of locations. These locations are numbered, and the sequence
number of a location is called its address.

What does the memory look like?


The memory looks like this:

And we represent it like this:


We still haven't said how the computer executes the program.
The heart (or the brain, if you want) of the computer is a single component called the Central Processing Unit, or CPU, which does the actual computing. A CPU contains an Arithmetic Logic Unit, or ALU, which is the part of the processor that carries out operations such as addition and subtraction. The CPU also includes special-purpose registers. The most important of these is the program counter, or PC, which keeps track of where the CPU is in the program it is executing.

This is not an easy concept. A computer is a machine built of millions of tiny switches called transistors, which have the property that they can be wired together in such a way that an output from one switch can turn another switch on or off.

What does the CPU look like?

The CPU looks like this:


Let’s recap what we learned in one diagram
Intro
It is important that we understand the concepts surrounding the efficiency of algorithms before we begin building data
structures. A data structure built correctly and with an eye toward efficient use of both the CPU and memory is one that
can be reused effectively in many different applications.
Simply put, a data structure is a systematic way of organizing and accessing data, and an algorithm is a step-by-step
procedure for performing some task in a finite amount of time.
These concepts are central to computing, but to be able to classify some data structures and algorithms as “good,” we
must have precise ways of analyzing them.
How can we measure the goodness of Algorithms?
The primary analysis tool involves characterizing the running times of algorithms and data structure operations, with
space usage also being of interest.
Running time is a natural measure of “goodness,” since time is a precious resource—computer solutions should
run as fast as possible.
In general, the running time of an algorithm or data structure operation increases with the input size, although
it may also vary for different inputs of the same size.
Also, the running time is affected by the hardware environment (e.g., the processor, clock rate, memory, disk) and
software environment (e.g., the operating system, programming language) in which the algorithm is implemented
and executed.
All other factors being equal, the running time of the same algorithm on the same input data will be smaller if the
computer has, say, a much faster processor or if the implementation is done in a program compiled into native machine
code instead of an interpreted implementation run on a virtual machine.
Experimenting with the timing of an algorithm

Measuring elapsed time in this fashion provides a reasonable reflection of an algorithm’s efficiency; for
extremely quick operations, Java provides a method, nanoTime, that measures in nanoseconds rather than
milliseconds.
Because we are interested in the general dependence of running time on the size and structure of the input,
we should perform independent experiments on many different test inputs of various sizes.
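As an illustration (a sketch, not part of the original handout), such an experiment might look like the following, where a hypothetical doWork method stands in for the algorithm under test:

// A minimal sketch of a timing experiment, assuming a hypothetical doWork(n) method
// whose running time we want to measure for several input sizes.
public class TimingExperiment {

    // Hypothetical workload: replace with the algorithm under test.
    static long doWork(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) sum += i;
        return sum;
    }

    public static void main(String[] args) {
        for (int n = 1_000; n <= 1_000_000; n *= 10) {   // independent experiments on growing inputs
            long start = System.currentTimeMillis();     // time before the call
            doWork(n);
            long end = System.currentTimeMillis();       // time after the call
            System.out.println("n = " + n + ", elapsed = " + (end - start) + " ms");
            // For very fast operations, System.nanoTime() gives finer resolution.
        }
    }
}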
Let’s test something we found interesting in Strings
We’ve said that Strings are immutable and any time that we are adding something to a String, a new object is being
created. We’ve said that to counter this problem we use StringBuilder.
Now it’s time to understand why. See the following code fragments.
We run the experiment n times, where n is changing.
As we said, we experiment using System.currentTimeMillis() to measure the efficiency of both repeat1 and repeat2 for very large strings.
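The original code fragments are not reproduced here; the following is a minimal sketch, assuming repeat1 builds the string by repeated concatenation and repeat2 uses a StringBuilder, each composing a string of n copies of a character c:

/** repeat1 (sketch): builds the string by repeated concatenation; a new String object is created on every pass. */
public static String repeat1(char c, int n) {
    String answer = "";
    for (int j = 0; j < n; j++) {
        answer += c;                 // allocates a brand-new String each iteration
    }
    return answer;
}

/** repeat2 (sketch): builds the string with a StringBuilder, appending in place. */
public static String repeat2(char c, int n) {
    StringBuilder sb = new StringBuilder();
    for (int j = 0; j < n; j++) {
        sb.append(c);                // amortized constant-time append
    }
    return sb.toString();
}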
Moving Beyond Experimental Analysis
To analyze the running time of an algorithm without performing experiments, we perform an analysis directly on a
high-level description of the algorithm (either in the form of an actual code fragment, or language-independent
pseudocode).

We define a set of primitive operations such as the following:


• Assigning a value to a variable
• Following an object reference
• Performing an arithmetic operation (for example, adding two numbers)
• Comparing two numbers
• Accessing a single element of an array by index
• Calling a method
• Returning from a method
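As a small, hypothetical illustration of how such operations can be tallied, consider the following fragment; the counts in the comments are one reasonable way to do the bookkeeping:

// Hypothetical fragment annotated with a rough tally of primitive operations.
static boolean example(int[] data) {
    int first = data[0];        // 1 array access + 1 assignment
    int second = data[1];       // 1 array access + 1 assignment
    int sum = first + second;   // 1 arithmetic operation + 1 assignment
    return sum > 10;            // 1 comparison + 1 return
}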
Focusing on the Worst-Case Input
An algorithm may run faster on some inputs than it does on others of the same size.
Thus, we may wish to express the running time of an algorithm as the function of the input size obtained by taking the
average over all possible inputs of the same size. Unfortunately, such an average-case analysis is typically quite
challenging. It requires us to define a probability distribution on the set of inputs, which is often a difficult task.
Focusing on the Worst-Case Input
Making the standard of success for an algorithm to perform well in the worst case necessarily requires that it will do
well on every input.
That is, designing for the worst case leads to stronger algorithmic “muscles”.
1. The Constant Function
The simplest function we can think of is the constant function, that is, f(n) = c, for some fixed constant c, such as c = 5,
c = 27, or c = 210.
In other words, it does not matter what the value of n is; f(n) will always be equal to the constant value c.
As simple as it is, the constant function is useful in algorithm analysis because it characterizes the number of steps
needed to do a basic operation on a computer, like adding two numbers, assigning a value to a variable, or
comparing two numbers.
2. The Logarithm Function
One of the interesting and sometimes even surprising aspects of the analysis of data structures and algorithms is the
ubiquitous presence of the logarithm function, f(n) = logb n, for some constant b > 1. This function is defined as the
inverse of a power, as follows:
x = logb n if and only if b^x = n.
The most common base for the logarithm function in computer science is 2 as computers store integers in binary.
There are several data structures whose time-complexity analysis involves logarithmic functions, for example binary search trees, balanced search trees, and heaps.
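As an illustration (a sketch, not part of the original material), binary search on a sorted array discards half of the remaining range at every step, so it needs roughly log2(n) comparisons:

// Sketch: each pass halves the remaining range, so at most about log2(n) passes are needed.
static int binarySearch(int[] sorted, int target) {
    int low = 0, high = sorted.length - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;   // midpoint of the remaining range
        if (sorted[mid] == target) {
            return mid;                     // found it
        } else if (sorted[mid] < target) {
            low = mid + 1;                  // discard the lower half
        } else {
            high = mid - 1;                 // discard the upper half
        }
    }
    return -1;                              // not present
}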
3. The Linear Function
Another simple yet important function is the linear function, f(n) = n.
That is, given an input value n, the linear function f assigns the value n itself. This function arises in algorithm analysis
any time we have to do a single basic operation for each of n elements.
For example, comparing a number x to each element of an array of size n will require n comparisons.
The linear function also represents the best running time we can hope to achieve for any algorithm that processes each
of n objects that are not already in the computer’s memory, because reading in the n objects already requires n
operations.
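A minimal sketch of the comparison example above: checking whether x occurs in an array of size n performs one comparison per element, that is, n comparisons in the worst case.

// Sketch: one comparison per element, so n comparisons in the worst case.
static boolean contains(int[] data, int x) {
    for (int i = 0; i < data.length; i++) {
        if (data[i] == x) {
            return true;
        }
    }
    return false;
}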
4. The N-Log-N Function
The next function we discuss in this section is the n-log-n function, f(n) = n log n, that is, the function that assigns to an input n the value of n times the logarithm base-two of n.
We will see several important algorithms that exhibit a running time proportional to the n-log-n function. For example, the fastest possible algorithms for sorting n arbitrary values require time proportional to n log n.
5. The Quadratic Function
Another function that appears often in algorithm analysis is the quadratic function, f(n) = n^2 . That is, given an input
value n, the function f assigns the product of n with itself (in other words, “n squared”).
The main reason why the quadratic function appears in the analysis of algorithms is that there are many algorithms that
have nested loops, where the inner loop performs a linear number of operations and the outer loop is performed a
linear number of times.
Thus, in such cases, the algorithm performs n · n = n^2 operations.
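A minimal sketch of that nested-loop pattern (illustrative only):

// The outer loop runs n times and the inner loop does n operations on each pass,
// so the fragment performs n * n = n^2 operations in total.
static long nestedWork(int n) {
    long operations = 0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            operations++;           // one operation per inner iteration
        }
    }
    return operations;              // equals n * n
}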
One interesting curiosity
The quadratic function can also arise in the context of nested loops where the first iteration of a loop uses one
operation, the second uses two operations, the third uses three operations, and so on. That is, the number of operations
is: 1 + 2 + 3 + ··· + (n−2) + (n−1) + n.
In other words, this is the total number of operations that will be performed by the nested loop if the number of
operations performed inside the loop increases by one with each iteration of the outer loop. This quantity also has an
interesting history.
In 1787, a German schoolteacher decided to keep his 9- and 10-year-old pupils occupied by adding up the integers
from 1 to 100. But almost immediately one of the children claimed to have the answer! The teacher was suspicious, for
the student had only the answer on his slate. But the answer, 5050, was correct and the student, Carl Gauss, grew up
to be one of the greatest mathematicians of his time. We presume that young Gauss used the following identity: 1 + 2 + 3 + ··· + n = n(n+1)/2, which shows that the total number of operations above is proportional to n^2.
6. The Cubic Function and Other Polynomials
Continuing our discussion of functions that are powers of the input, we consider the cubic function, f(n) = n^3, which
assigns to an input value n the product of n with itself three times. The cubic function appears less frequently in the
context of algorithm analysis than the constant, linear, and quadratic functions previously mentioned, but it does
appear from time to time.
The linear, quadratic and cubic functions can each be viewed as being part of a larger class of functions, the
polynomials. A polynomial function has the form f(n) = a_0 + a_1·n + a_2·n^2 + ··· + a_d·n^d, where a_0, a_1, ..., a_d are constants, called the coefficients of the polynomial, and a_d ≠ 0.
Since the constant, linear, quadratic, and cubic functions are all polynomials, we could argue that we are presenting just four important classes of functions used in algorithm analysis.
6. Summations
A notation that appears again and again in the analysis of data structures and algorithms is the summation, which is defined as follows:
the sum of f(i) for i running from a to b, where a and b are integers with a ≤ b.
And this can also be rewritten as:
f(a) + f(a+1) + f(a+2) + ··· + f(b).
8. The Exponential Function
Another function used in the analysis of algorithms is the exponential function, f(n) = b^n, where b is a positive
constant, called the base, and the argument n is the exponent. That is, function f(n) assigns to the input argument n the
value obtained by multiplying the base b by itself n times.
If we have a loop that starts by performing one operation and then doubles the number of operations performed with
each iteration, then the number of operations performed in the nth iteration is 2^n.
Growth Rates
Ideally, we would like data structure operations to run in times proportional to the constant or logarithm function, and
we would like our algorithms to run in linear or n-log-n time. Algorithms with quadratic or cubic running times are
less practical, and algorithms with exponential running times are infeasible for all but the smallest sized inputs.
What is this Asymptotic Analysis
Asymptotic complexity, also known as asymptotic analysis, is a method of analyzing the performance of algorithms as
the input size approaches infinity. It provides a way to estimate the efficiency of an algorithm for large inputs, without
having to actually run the algorithm with large inputs.
In asymptotic complexity analysis, we typically focus on the worst-case time complexity of an algorithm, which gives
us an upper bound on the number of operations required to solve a problem as the size of the input grows.
The term "asymptotic" in the context of asymptotic complexity refers to the behavior of a function as the input size
approaches infinity or some other limit.
In mathematics, an asymptote is a line that a curve approaches more and more closely without ever touching it. In a similar way, asymptotic analysis describes the rate of growth that a function approaches as the input size becomes very large. For example, a running time that is O(n^2) is bounded above by a constant multiple of n^2 for all sufficiently large inputs, even if it never actually reaches that bound.
Big O notation -> Or the Order of Growth of a Function
The term "big O" comes from the mathematical concept of asymptotic analysis, which is the study of the behavior of
functions as their inputs become very large. In this context, the "big O" notation is used to describe the upper bound or
maximum growth rate of a function, as its input size increases towards infinity.
It is commonly used in computer science and other fields to analyze and compare the efficiency of different
algorithms.
In big O notation, we express the worst-case scenario of an algorithm's time complexity as a function of the input size.
This allows us to compare how well different algorithms scale as the size of the input increases.
For example, if we have an algorithm that takes O(n) time, where n is the size of the input, we say that it has linear time complexity. This means that as the input size grows, the running time of the algorithm grows at most in proportion to it. On the other hand, an algorithm with a time complexity of O(n^2) has quadratic time complexity, meaning that its running time can grow much faster as the input size increases, in the worst case roughly quadrupling whenever the input size doubles.
Some Properties of the Big-Oh (or Big O) Notation
The big-Oh notation allows us to ignore constant factors and lower-order terms and focus on the main components of a function that affect its growth.
For example, 5n^4 + 3n^3 + 2n^2 + 4n + 1 is O(n^4).
Thus, the highest-degree term in a polynomial is the term that determines the asymptotic growth rate of that polynomial.
Some Quick Examples
Let's see the growth rates of 1, log n, n, n log n, n^2, n^3, and 2^n.
The comparison shows the importance of good algorithm design, because an asymptotically slow algorithm is beaten in the long run by an asymptotically faster algorithm, even if the constant factor for the asymptotically faster algorithm is worse. Even if we achieve a dramatic speedup in hardware, we still cannot overcome the handicap of an asymptotically slow algorithm. The third table shows a computer 256 times faster than the previous one.
Big O, Big Omega and Big Theta
Big O, Big Omega, and Big Theta are all ways to describe how the performance of an algorithm changes as the size of
the problem (input) changes.
Think of an algorithm like a machine that can take something (the input) and do something with it to give you an
answer (the output). If you give the machine a small problem, it may be able to give you the answer quickly. But if you
give it a bigger problem, it may take longer to give you the answer.
• Big O is an upper bound. It tells you that, as the problem gets bigger, the time the machine needs grows no faster than a given rate; this is why it is most often used to describe the worst-case scenario.
• Big Omega is a lower bound. It tells you that, as the problem gets bigger, the time the machine needs grows at least as fast as a given rate; it is often paired with the best-case scenario.
• Big Theta is a tight bound. It tells you that, as the problem gets bigger, the time the machine needs grows at essentially a given rate, neither faster nor slower, so the upper and lower bounds coincide.
Some examples
We have this code segment:

We must determine the order of the method before we can determine the order of the code segment. Let’s suppose that
the purpose of the method is to print the sum of the integers from 1 to n each time it is called. We might be tempted to
create a brute force method such as the following:
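The original fragments are not reproduced here; the following sketch assumes the code segment is a loop that calls printsum on every pass, and that the brute-force printsum sums the integers from 1 to n with its own loop:

// Brute-force version (sketch): sums 1..n with a loop, so each call is O(n).
static void printsum(int n) {
    int sum = 0;
    for (int i = 1; i <= n; i++) {
        sum += i;                    // one addition per value of i
    }
    System.out.println(sum);         // O(1)
}

// The calling code segment (assumed): the O(n) method is the body of a loop
// that itself executes n times, giving O(n^2) overall.
static void codeSegment(int n) {
    for (int count = 1; count <= n; count++) {
        printsum(count);
    }
}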
Solution
What is the time complexity of this printsum method? Keep in mind that only executable statements
contribute to the time complexity, so in this case, all of the executable statements are O(1) except for the loop.
The loop, on the other hand, is O(n), and thus the method itself is O(n). Now, to compute the time complexity
of the original loop that called the method, we simply multiply the complexity of the method, which is the
body of the loop, by the number of times the loop will execute. Our result, then, is O(n^2) using this
implementation of the printsum method.

But is there any possibility to have better efficiency?


Yes.
However, we know from our earlier discussion that we do not have to use a loop to calculate the sum of the
numbers from 1 to n. Now let’s rewrite our printsum method and see what happens to our time complexity:
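A sketch of the rewritten method, assuming it uses the identity 1 + 2 + ··· + n = n(n+1)/2 discussed earlier:

// Closed-form version (sketch): no loop, so each call is O(1).
static void printsum(int n) {
    int sum = n * (n + 1) / 2;       // one arithmetic expression and one assignment
    System.out.println(sum);         // O(1)
}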

Now the time complexity of the printsum method is made up of an assignment statement, which is O(1), and a
print statement, which is also O(1). The result of this change is that the time complexity of the printsum
method is now O(1), which means that the loop that calls this method goes from O(n^2) to O(n). We know
from our earlier discussion that this is a very significant improvement. Once again, we see that there is a
difference between delivering correct results and doing so efficiently.
Finding the Maximum of an Array
As a classic example of an algorithm with a running time that grows proportional to n, we consider the goal of finding
the largest element of an array. A typical strategy is to loop through elements of the array while maintaining as a
variable the largest element seen thus far. Using the big-Oh notation, we can write the following mathematically
precise statement on the running time of algorithm arrayMax for any computer.
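The arrayMax listing itself is not reproduced above; the following reconstruction is a sketch laid out so that the line references in the analysis below (initialization on lines 3 and 4, return on line 8) match up, but the exact original code is an assumption.

/** Returns the maximum value of a nonempty array of numbers. */
public static double arrayMax(double[] data) {
    int n = data.length;             // initialization ("line 3")
    double currentMax = data[0];     // initialization ("line 4"): assume the first entry is the biggest so far
    for (int j = 1; j < n; j++) {    // the loop executes n - 1 times
        if (data[j] > currentMax) {  // a constant number of primitive operations per iteration
            currentMax = data[j];
        }
    }
    return currentMax;               // return statement ("line 8")
}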
Finding the Maximum of an Array – The Solution
The initialization at lines 3 and 4 and the return statement at line 8 require only a constant number of primitive
operations. Each iteration of the loop also requires only a constant number of primitive operations, and the loop
executes n − 1 times.

We conclude that the running time of algorithm arrayMax is O(n).


String and StringBuilder example complexity
We revisit the experimental study from the beginning of the lesson in which we examined two different
implementations for composing a long string. Our first algorithm was based on repeated use of the string
concatenation operator. The second was using StringBuilder.
In terms of efficiency, the problem with the string-concatenation approach is that the creation of a new string as a result of a concatenation requires time that is proportional to the length of the resulting string. The first time through this loop,
the result has length 1, the second time through the loop the result has length 2, and so on, until we reach the final
string of length n. Therefore, the overall time taken by this algorithm is proportional to 1+2+···+n, which we recognize
as the familiar O(n^2) summation. Therefore, the total time complexity of the repeat1 algorithm is O(n^2).
In contrast, the running times in that table for the repeat2 algorithm, which uses Java’s StringBuilder class,
demonstrate a trend of approximately doubling each time the problem size doubles. The StringBuilder class relies on an advanced technique that gives a worst-case running time of O(n).
