
Discrete Random Variables

Table of Contents

Random Variables
    Limitations of Simple Probability Models
    Discrete and Continuous Sets
    Definition of Random Variables
Distribution Functions
    Events Defined by Random Variables
    Probabilities Defined by Random Variables
    Probability Mass Function (PMF)
    Cumulative Distribution Function (CDF)
Derived Random Variables
Expectations
Variances
Well-Known Random Variables
    Bernoulli Random Variables
    Bernoulli Trials
    Geometric Distributions
    Binomial Distributions
    Negative Binomial Distributions
    Uniform Distributions
    Hypergeometric Distributions
    Poisson Distributions
    Truncated Geometric Distribution
    Expected Values of Well-Known Discrete Random Variables

Random Variables

Random variables are key to this course, and understanding them thoroughly is essential for everything that follows.

Limitations of Simple Probability Models

We use random variables to define the probability model associated with a random experiment in a more sophisticated manner. We already know how to define simple probability models, but if we look at these models a little more closely, we will notice some limitations. The limitations are not severe, but they do exist.

The limitations include:

 Regarding the elements of the sample space

o The element could be anything at all, including non-numeric objects

o Further processing is not possible due to the above point

 Regarding the probability model as a whole

o It is not compact

o The sample space and probabilities are defined separately

Random variables help us overcome these limitations.


Discrete and Continuous Sets

Random variables can be discrete or continuous. They can also be mixed, but we will not be studying mixed random variables here. Before we can define the two types of random variables, we need to see what discrete sets and continuous sets are.

In a discrete set, the number of elements is finite, or at least countable for an infinite

number of elements. For example, a random experiment that tests if a data packet

has been sent successfully or not has a finite number of elements in its sample

space. Another example could be the number of attempts needed to send a data

packet successfully. There are an infinite number of elements in the sample space

for this random experiment, but the elements are countable.

In a continuous set, the number of elements is infinite and uncountable. For example,

consider a line from 0 to 1, on which we have to choose a point. There are an infinite

number of points we can choose, and the values can get infinitely more precise,

making the set uncountable. A continuous set like this one is represented as S= [ 0 ,1 ]

, which signifies all the values between 0 and 1. There are a few other

representations, but we will look into those in the next chapter.

Obviously, the random variables that apply to each of these types of sets are called

discrete random variables and continuous random variables respectively.


Definition of Random Variables

The first limitation with simple probability models was the fact that the sample space

did not necessarily consist of numbers, which made further processing difficult. For

example, consider that our observation is the result of two consecutive coin tosses. This would force us to deal with ordered pairs such as (H, T), which is awkward to process. Thus, we need a process to convert these non-numeric elements into numbers.

A random variable is a real-valued function that converts each element of a sample space S into a real number. If ω is an outcome in the sample space S, then X is a function that, when given the argument ω, produces a numeric result x, i.e. X(ω) = x. This function is called a random variable. For simplicity, the function X(ω) is usually written simply as X.

The function X can be one-to-one, mapping each element in the sample space to

one value, or many-to-one, mapping several elements, or an event, to a single value.

Thus, X gives us a compact way of mentioning an event while simultaneously

capturing the uncertain variability (the fact that the value can change in an uncertain

manner) and allowing us to use numbers to represent events.

Procedure: Send 3 data packets

Observation: Number of successful deliveries

Sample Space: S= { FFF , FFD , FDF , FDD , DFF , DFD , DDF , DDD }

A random variable associated with an experiment depends on how we define the

function. Thus, we can have multiple random variables for a single experiment if we

define them in multiple different ways.


We need to define the function in a way such that each of the events is converted to

a real number. In this case, the events can be represented as

E_i = {i data packets are successfully sent}

Thus,

E_0 = {FFF}
E_1 = {FFD, FDF, DFF}
E_2 = {FDD, DFD, DDF}
E_3 = {DDD}
E = {E_0, E_1, E_2, E_3}

X(ω) = 0 for ω ∈ E_0
     = 1 for ω ∈ E_1
     = 2 for ω ∈ E_2
     = 3 for ω ∈ E_3

We can even represent X using a diagram.

Thus, P[X < 2] represents the probability of the event made up of E_0 and E_1, i.e. the outcomes FFF, FFD, FDF and DFF. Essentially, it means that ω is an element of S such that X(ω) < 2, i.e. {ω ∈ S : X(ω) < 2}.


P[X < 2] = P[E_0] + P[E_1] = (1 − p)^3 + 3p(1 − p)^2, where p is the probability that a single packet is delivered successfully.

If p = 1/2, P[X < 2] = 1/2.
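A minimal Python sketch of this calculation (an illustration, not part of the original notes), assuming each packet is delivered independently with probability p, with D for delivered and F for failed as above:

from itertools import product

p = 0.5  # per-packet delivery probability (the value used in the example above)

# Sample space S: every length-3 sequence of D (delivered) / F (failed)
S = [''.join(seq) for seq in product('DF', repeat=3)]

def prob(omega):
    # Packets are independent, so multiply the per-packet probabilities.
    result = 1.0
    for c in omega:
        result *= p if c == 'D' else (1 - p)
    return result

def X(omega):
    # Random variable: number of successful deliveries in the outcome.
    return omega.count('D')

# P[X < 2]: add up the probabilities of every outcome that X maps below 2.
print(sum(prob(w) for w in S if X(w) < 2))   # 0.5 when p = 0.5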

To complete the probability model, we still need to identify the set of values

associated with random variables. A discrete random variable is usually associated

with a finite or countable set of values, which is represented as S_X. For this example, S_X = {0, 1, 2, 3}.

Procedure: Rolling 2 dice

Observation: Sum of rolls

E = {E_2, E_3, ..., E_12}

E_i = {the sum of the rolls is i}, where i = 2, 3, ..., 12

X(x_1, x_2) = x_1 + x_2 = x

S_X = {2, 3, 4, ..., 12}
Distribution Functions

Events Defined by Random Variables

We have previously seen how taking any set of elements from the sample space

creates an event. Now let’s look at how random variables can be used to define

events.

Say X is a random variable. We know that X(ω) maps each element of S into an element of S_X, and the result of the function is a point on the number line. Every set of outcomes that X maps to a particular number is therefore an event. If x is a possible result of the function X, then X = x represents an event: more specifically, the event {ω ∈ S : X(ω) = x}. In other words, X = x is defined by the elements of the sample space that map to the number x. For example, if E_2 = {FDD, DFD, DDF}, then X = 2 for every outcome in E_2, so X = 2 defines that event.

Given some value x , we are interested in the probabilities P [ X=x ] , P [ X ≤ x ] and


P [ X > x ]. These three are said to define the distribution function.
Probabilities Defined by Random Variables

The probability that the random variable X has a value x is given by P [ X=x ] . This

means that when we ran the experiment, we got an outcome ω from S that is

mapped by the function X ( ω ) to the number x . Thus, not only is each value of S x

mapped from an element or event from S, the probabilities are mapped as well.

For the previous example where we send three data packets, the sample space is S = {FFF, FFD, FDF, DFF, FDD, DFD, DDF, DDD}, creating the set S_X = {0, 1, 2, 3}. Here, with p = 1/2, P[X = 0] = 1/8, P[X = 1] = 3/8, P[X = 2] = 3/8 and P[X = 3] = 1/8.

P [ X=x ] gives us the amount of probability assigned to the number x . Using this, we

have assigned a probability to every possible number. For example, P [ X=2.5 ] =0.

Probability Mass Function (PMF)

P [ X=x ] is called the probability mass function (PMF), and is denoted by P X ( x ). It is

one of the distribution functions associated with random variables. It represents the

probability of the event { X =x } occurring, where x is the value for some event for the

random variable X . It is called a distribution function, since it describes the

distribution of probability over the number line. It is presented as a list of all its

possible values.

P_X(x) = 1/8  for x = 0, 3
       = 3/8  for x = 1, 2
       = 0    otherwise
Notice that using this representation, we are able to represent the whole probability

model in a compact manner, and have the sample space and the probabilities

together. Thus, we have solved the second limitation of simple probability models.

There are three conditions associated with the PMF.

For x ∈ S_X, P_X(x) > 0

For x ∉ S_X, P_X(x) = 0

Σ_{x ∈ S_X} P_X(x) = 1

Finally, the reason this is called a probability mass function is that the probabilities can be thought of as masses placed at the corresponding points on the number line.


Example

Consider a gambling game. Say we need to pay $ 1 to play. The game is that three

coins are tossed one after another and the outcomes are seen. There are four

possible outcomes to this game.

 If the number of heads is 0 , we get nothing back, making the net gain −1.

 If the number of heads is 1, we get $ 2, making the net gain +1.

 If the number of heads is 2, we get $ 3, making the net gain +2.

 If the number of heads is 3, we get $ 4 , making the net gain +3 .

Given that the coins are biased and P [ T ] =0.85 , we need to develop the probability

model.

Let’s assume the random variable X ( ω )=net gain . Thus,


S= { TTT ,TTH , THT , HTT , THH , HTH , HHT , HHH } and S X ={−1 , 1, 2 , 3 }.

P_X(x) = 0.61  for x = −1
       = 0.32  for x = 1
       = 0.06  for x = 2
       = 0.01  for x = 3
       = 0     otherwise
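The same PMF can be built numerically; a short Python sketch, assuming three independent tosses with P[T] = 0.85 and the net-gain mapping listed above:

from itertools import product
from collections import defaultdict

p_tails = 0.85                          # P[T] for the biased coin
gain = {0: -1, 1: 1, 2: 2, 3: 3}        # net gain as a function of the number of heads

pmf = defaultdict(float)
for seq in product('HT', repeat=3):
    prob = 1.0
    for c in seq:
        prob *= p_tails if c == 'T' else (1 - p_tails)
    x = gain[seq.count('H')]            # X(omega) = net gain for this outcome
    pmf[x] += prob                      # several outcomes can map to the same x

# P_X: -1 -> 0.614, 1 -> 0.325, 2 -> 0.057, 3 -> 0.003 (before rounding)
print(dict(pmf))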

Cumulative Distribution Function (CDF)

The word ‘cumulative’ means starting from the very first possible instance up to the given instance, unless otherwise mentioned. Thus, the cumulative distribution function gives us the probability of the event {X ≤ x} = {ω ∈ S | X(ω) ≤ x} occurring. The CDF is denoted by F_X(x), where F_X(x) = P[X ≤ x].


The main difference between the CDF and the PMF is that the PMF is only concerned with the explicit values in S_X, while the CDF is defined for every real value of x. Thus, a value of x with x ∉ S_X can still have a non-zero probability under the CDF.

 If x is less than the least possible outcome in the given situation, meaning x ∈ (−∞, x_min), the probability will be 0.

 However, if the value of x is within the valid range of outcomes, meaning x ∈ [x_min, x_max], even if x itself is not one of the outcomes, it will have some probability depending on its value.

The exact value depends on where in the range x is. If x ∈ (x_1, x_2) such that x_1 ∈ S_X and x_2 ∈ S_X, meaning x is between two valid members of S_X, then P[X ≤ x] = P[X ≤ x_1], the CDF of the lower value.

 If x is larger than or equal to the greatest possible outcome, i.e. x ∈ [x_max, ∞), then P[X ≤ x] = 1, since the given x is at least as large as any possible value of X.

For the example of sending three data packets,

F_X(x) = 0    for x < 0
       = 1/8  for 0 ≤ x < 1
       = 4/8  for 1 ≤ x < 2
       = 7/8  for 2 ≤ x < 3
       = 1    for x ≥ 3
Comparing this diagram to the one for PMF, we notice that the value of P X ( x ) for each

value of x corresponds to the change in F X ( x ).

Thus, F_X(x) = P[X ≤ x] = Σ_{y ∈ S_X, y ≤ x} P_X(y).

Between any two consecutive values of S_X, the curve is flat. If x ∉ S_X, the curve is flat at x. If x ∈ S_X, there is a vertical jump at x, and the size of the jump is equal to P_X(x).

We want to find the probability P[x_1 < X ≤ x_2], where x_1 < x_2.

The event {X ≤ x_2} can be broken down into two disjoint events, {X ≤ x_1} ∪ {x_1 < X ≤ x_2}. Essentially, we are taking the two parts that make up the event {X ≤ x_2} separately. Thus,

P[X ≤ x_2] = P[X ≤ x_1] + P[x_1 < X ≤ x_2]

P[x_1 < X ≤ x_2] = F_X(x_2) − F_X(x_1)

which makes sense.
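A short sketch of this relationship in Python, using the three-packet PMF (1/8, 3/8, 3/8, 1/8) from earlier as the assumed example:

pmf = {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}

def cdf(x):
    # F_X(x) = P[X <= x]: sum the PMF over all values not exceeding x.
    return sum(p for v, p in pmf.items() if v <= x)

print(cdf(0.5))          # 0.125 -- flat between the jumps at 0 and 1
print(cdf(2))            # 0.875
print(cdf(2) - cdf(0))   # 0.75 = P[0 < X <= 2], i.e. F_X(x2) - F_X(x1)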


Derived Random Variables

A derived random variable is defined from another random variable. Thus, the derived

random variable converts every element x ∈ S X into an element y ∈ SY . Here, we say


y=G ( x ), or Y =G ( X ).

The derived random variable can be a one-to-one or a many-to-one function. The

function G : X →Y .

Consider a scenario where S_X = {−1, 1, 2, 3} and S_Y = {1, 4, 9}. Thus, the function G is such that Y = X^2. This is a many-to-one function. Here, to find P[Y = y], we need to sum P[X = x] over all x for which G(x) = y.

P_Y(y) = P[Y = y] = Σ_{x ∈ S_X : G(x) = y} P_X(x)

The same formula could be used for a one-to-one function, but of course in that

case we would just be summing a single value.

Thus, for

P_X(x) = 0.61  for x = −1
       = 0.32  for x = 1
       = 0.06  for x = 2
       = 0.01  for x = 3
       = 0     otherwise

we have

P_Y(y) = 0.93  for y = 1
       = 0.06  for y = 4
       = 0.01  for y = 9
       = 0     otherwise

From this, we can easily calculate F Y ( y ).
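A minimal sketch of deriving P_Y from P_X in Python for Y = X^2, using the PMF values above:

from collections import defaultdict

pmf_x = {-1: 0.61, 1: 0.32, 2: 0.06, 3: 0.01}

def derive(pmf, g):
    # P_Y(y) is the sum of P_X(x) over every x with g(x) == y.
    pmf_y = defaultdict(float)
    for x, p in pmf.items():
        pmf_y[g(x)] += p
    return dict(pmf_y)

print(derive(pmf_x, lambda x: x ** 2))   # {1: 0.93, 4: 0.06, 9: 0.01}, up to float noise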


Expectations

One of the criticisms we made of the simple probability model is that it does not allow for further processing. We mentioned that once the sample space has been converted to real numbers and a more sophisticated probability model, like the PMF, is being used, further processing becomes possible. One such kind of further processing is expected values, or expectations.

Say someone gambles in a game n times. One scenario is where n is very small, like 1

or 2 or 3. The net gain in this case depends entirely on luck. A lucky person may play

the game once and win a large amount, while an unlucky person could play a few

times and win nothing.

However, if n is very large, to the extent that n extends to ∞ , the net gain no longer

depends on luck, but rather on the laws of probability. There will be a certain number

of outcomes of each possibility, and the net gain can be found by summing all of

those outcomes.

The number of times each outcome appears is related to the PMF, i.e. to the probability. Over a large number of repetitions, the relative frequency of an outcome approaches its probability: an outcome with probability q will appear roughly n × q times. This will not be an exact count, but for large n the discrepancy is negligible.


From all of this, we will be able to create a table of sorts.

Net Gain       Approximate Count    Total Gain
−1             n × 0.61             −0.61 n
+1             n × 0.32             +0.32 n
+2             n × 0.06             +0.12 n
+3             n × 0.01             +0.03 n
Grand Total                         −0.14 n

The value of the grand total that we just found is an important quantity, which is

called the expected value. Regardless of our luck, if we play the game a large number

of times, on average, this will be the net gain, or rather net loss, of $−0.14 per game.

If this number was 0 , this would be a fair game, since on average a player would

neither lose nor win.

This is the strategy used by casinos to ensure that they do not lose money. They do

not bother about winning every single game, but the games are rigged so that, on

average, they profit.

The expected value is dependent on two things, the values of the random variable X

and the probabilities with which those values appear.

E[X] = Σ_{x ∈ S_X} x · P_X(x)

To avoid notational conflicts, the expected value may also be represented as μ or μ_X.

If we use the analogy of mass that we used with PMF, the expected value represents

the centre of mass.
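A two-line check of this formula in Python, using the net-gain PMF from the gambling example:

pmf = {-1: 0.61, 1: 0.32, 2: 0.06, 3: 0.01}

# E[X] = sum over S_X of x * P_X(x)
expected = sum(x * p for x, p in pmf.items())
print(expected)   # -0.14, the average net gain (a loss) per game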


Variances

Although expectations are generally a good summary of a random variable, there are still situations where the expectation alone does not tell the whole story. Variance measures how widely the values of a random variable spread around the expected value. The deviation of an actual value of X from the expected value can be positive or negative, so we average its square.

The variance of a random variable X is given by Var[X] = E[(X − E[X])^2]. However, computing this directly requires two passes over the values in S_X: one to find E[X] and another to average the squared deviations. Thus, a simpler alternative formula, Var[X] = E[X^2] − (E[X])^2, is often used.

Using the example from the previous table,

Var[X] = (−1 + 0.14)^2 · 0.61 + (1 + 0.14)^2 · 0.32 + (2 + 0.14)^2 · 0.06 + (3 + 0.14)^2 · 0.01 ≈ 1.24
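A quick Python sketch checking that the definition and the shortcut formula agree on the same PMF:

pmf = {-1: 0.61, 1: 0.32, 2: 0.06, 3: 0.01}

mean = sum(x * p for x, p in pmf.items())                  # E[X] = -0.14

# Definition: average squared deviation from the mean (needs E[X] first).
var_def = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Shortcut: E[X^2] - (E[X])^2, computable in a single pass over the PMF.
ex2 = sum(x ** 2 * p for x, p in pmf.items())
var_alt = ex2 - mean ** 2

print(var_def, var_alt)   # both roughly 1.2404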


Well-Known Random Variables

Finding the probability model for a random experiment in real life can be tedious in

some cases, even involving multiple random variables sometimes. However, most

random experiments actually follow the same general model. For example, tossing a

coin repeatedly until we get a head is the same as sending a data packet repeatedly

until it is successfully delivered. There are a few well known random variables, and we

should always try to fit a given random experiment into one of these random

variables first, before going through the manual process of defining the probability

model. We will generally find most random experiments can be fit into one of them.

Well-known random variables can be divided into two categories. Uniform random variables, Poisson random variables and hypergeometric random variables fall under one category, while Bernoulli random variables, geometric random variables, binomial random variables and negative binomial random variables, also known as Pascal random variables, fall into the other.


Bernoulli Random Variables

Bernoulli random variables are the simplest random variables, but they are also one

of the most important. They are associated with Bernoulli experiments. Bernoulli

experiments have two outcomes and the outcomes are classified as success/failure

or on/off. Essentially, the outcomes are opposites of each other. Tossing a coin falls

into this category, as does sending a packet of data. If an experiment has more than two outcomes, they can still be divided into two groups, one treated as success and the other as failure. For example, rolling a die and trying to get an even number is a Bernoulli experiment, since the outcomes can be divided into two such groups.

The Bernoulli random variables are defined such that any outcomes related to

successes are converted to 1, while outcomes related to failures are converted to 0.

Thus, S X ={ 0 , 1 }.

If we consider the total probability of all successful outcomes to be p, then

P_X(x) = 1 − p  for x = 0
       = p      for x = 1
       = 0      otherwise

It is also possible to write this PMF in a single line as P_X(x) = p^x (1 − p)^(1−x), x = 0, 1.
Similarly, the CDF can be expressed as

F_X(x) = 0      for x < 0
       = 1 − p  for 0 ≤ x < 1
       = 1      for x ≥ 1

In some advanced courses, all of this information is expressed simply as X ~ Ber(p), where ~ is used to express ‘distributed as’. In this case, instead of using the term ‘distribution function’, the PMF and CDF can be called a family of distributions. This is because the parameter p is not a specific value; each value of p gives a different member of the family.
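A small sketch of the Ber(p) family as Python functions of the parameter p (the function names are just illustrative):

def bernoulli_pmf(x, p):
    # P_X(x) = p^x (1 - p)^(1 - x) for x in {0, 1}, and 0 otherwise.
    return p ** x * (1 - p) ** (1 - x) if x in (0, 1) else 0.0

def bernoulli_cdf(x, p):
    # F_X(x): 0 below 0, then 1 - p on [0, 1), then 1 from 1 onwards.
    if x < 0:
        return 0.0
    return 1 - p if x < 1 else 1.0

print(bernoulli_pmf(1, 0.3), bernoulli_cdf(0.5, 0.3))   # 0.3 0.7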


Bernoulli Trials

Bernoulli experiments are actually building blocks for a wide range of random

variables. It is possible to combine two or more Bernoulli experiments to create a

complex experiment that represents another well-known random variable.

A Bernoulli trial is a sequence of Bernoulli experiments. A single Bernoulli experiment

is repeated multiple times. For example, tossing a coin repeatedly is a Bernoulli trial.

However, not all sequences of repeated Bernoulli experiments count as Bernoulli

trials. A Bernoulli trial must satisfy a few conditions:

 Individual Bernoulli experiments must be independent, meaning one

experiment cannot depend on another one. Independence is ensured by:

o Checking that the outcome of one experiment does not affect the

outcome of another

o The probability of success is a fixed number, i.e. p is constant

 The number of repetitions must be either

o Fixed, meaning the experiment is repeated n times exactly

o Dependent on a condition, such as a coin being tossed repeatedly until a head occurs
Example 1

Procedure: Keep sending data packets until a success occurs.

Sample Space: S= { D, FD , FFD , FFFD , … }

Here, say a single attempt has a probability of success p.

Let’s say we have a random variable X =number of attempts until first success . Thus,
S X ={ 1 , 2, 3 , … } .

Now we need to find P [ X=x ] for all values of x ≥ 1. Once we find this, the probability

model is complete.

Notice how we do not have just two outcomes here. This is because this is not a

single Bernoulli experiment, it is a sequence of Bernoulli experiments, a.k.a. a

Bernoulli trial.

Example 2

Procedure: Send 3 data packets and count the number of successes

Sample Space: S= { FFF ,… . , DDD }

Here, let X =number of successes . Thus, S X ={ 0 , 1 ,2 , 3 }. Here, we do not care about the

order of the successes. Finally, we need to calculate P [ X=x ] .


Example 3

Procedure: Keep sending data packets until 3 successes occur.

Observation: The number of packets sent.

Sample Space: { DDD , FDDD , DFDD , DDFD , … }

Here, X =number of packets sent and S X ={ 3 , 4 ,5 , … }. Thus, we need to find P [ X=x ] for
x= {3 , 4 ,5 , … }.

We can consider all three examples above together. We have a Bernoulli trial,

meaning the probability p is fixed and the number of repetitions is either fixed, or

conditional.

Example 1 (Geometric, X ~ geom(p)): the number of repetitions is conditional (until the first success); the random variable X is the number of repetitions; the parameter is the probability of success p.

Example 2 (Binomial, X ~ binom(n, p)): the number of repetitions is fixed at n; the random variable X is the number of successes; the parameters are the number of repetitions n and the probability of success p.

Example 3 (Negative Binomial / Pascal, X ~ pascal(k, p)): the number of repetitions is conditional (until k successes); the random variable X is the number of repetitions; the parameters are the required number of successes k and the probability of success p.

Thus,

 Geometric distribution deals with the number of trials required for a single

success
 Binomial distribution deals with the number of successes in a fixed number of attempts

 Negative binomial distribution deals with the number of attempts needed to reach a specified number of successes

Keep in mind that the three distributions given above are not single distributions, but rather families of distributions.

Geometric Distributions

Geometric distributions are connected to a family of random variables, called geometric random variables. They are closely related to Bernoulli random variables: a geometric random variable counts how many independent Bernoulli experiments are needed to obtain the first success.

In a geometric distribution, we repeat a certain Bernoulli experiment independently

until we get a success. The sample space is thus S= { D, FD , FFD , … }. We can define

the random variable as X = number of repetitions needed to get the first success.

As such, S X ={ 1 , 2, 3 , … } . This is of course a discrete set, since there are an infinite

number of elements, but the elements are countable.

Our goal is to find the PMF, i.e. the probability model, of a geometric distribution. Whenever we describe a probability model, we need to give a general solution. To do this, we start by calculating the probability of every value in S_X.


Since this is a sequence of Bernoulli experiments, there are only two possibilities.

The probability of a success can be denoted as p. Thus P[X = x] = p · (1 − p)^(x−1), since the first x − 1 attempts must fail and the x-th must succeed. Notice

that this pattern follows the pattern for a geometric series. This is exactly why this is

called a geometric distribution, and the random variables associated with it are called

a family of geometric random variables.

P_X(x) = p · (1 − p)^(x−1)  for x ≥ 1
       = 0                  otherwise

If p=0.5, the PMF graph would look like this:

The CDF will be given by F_X(x) = 1 − (1 − p)^x when x ≥ 1.

A simple variation of the geometric distribution is where we define the random

variable X as X =number of failures before the first success . This means S X ={ 0 , 1 ,2 , … }

and

P_X(x) = p · (1 − p)^x  for x ≥ 0
       = 0              otherwise.
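A sketch of the geometric PMF in Python, with a numerical check that summing it reproduces the CDF 1 − (1 − p)^x; the values p = 0.5 and x = 4 are arbitrary choices:

def geometric_pmf(x, p):
    # P[X = x] = p (1 - p)^(x - 1): x - 1 failures followed by one success.
    return p * (1 - p) ** (x - 1) if x >= 1 else 0.0

p, x = 0.5, 4
print(sum(geometric_pmf(k, p) for k in range(1, x + 1)))   # 0.9375, CDF by summation
print(1 - (1 - p) ** x)                                     # 0.9375, closed form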
Binomial Distributions

A binomial random variable defines the number of successes in n attempts. For n=3,
S= { FFF , FFD , FDF , … DDD }. Thus, S X ={ 0 , 1 ,2 , … n }.

X = number of successes in n attempts

Here, P[X = x] = C(n, x) · p^x · (1 − p)^(n−x), where C(n, x) denotes the number of ways to choose which x of the n attempts are successes. Thus,

P_X(x) = C(n, x) · p^x · (1 − p)^(n−x)  for x = 0, 1, ..., n
       = 0                              otherwise

Notice how this formula follows the pattern of the binomial formula. Thus, this is

called a binomial distribution and the random variables associated with it are called a

family of binomial random variables.

For n=4 , the PMF graph would look like this:

The CDF for binomial variables must be found by summing up the individual PMFs.

There is no general formula.
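Since there is no closed-form CDF, it is computed by summing PMF terms; a short Python sketch (the helper names are illustrative):

from math import comb

def binomial_pmf(x, n, p):
    # P[X = x] = C(n, x) p^x (1 - p)^(n - x) for x = 0, 1, ..., n.
    return comb(n, x) * p ** x * (1 - p) ** (n - x) if 0 <= x <= n else 0.0

def binomial_cdf(x, n, p):
    # F_X(x): no closed form, so sum the PMF over 0, 1, ..., floor(x).
    return sum(binomial_pmf(k, n, p) for k in range(int(x) + 1))

print(binomial_pmf(1, 3, 0.5))   # 0.375, the 3/8 from the packet example
print(binomial_cdf(1, 3, 0.5))   # 0.5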


Negative Binomial Distributions

In negative binomial distributions, X is defined as

X = number of repetitions needed to get k successes.

For k = 3, S = {DDD, FDDD, DFDD, ...} and S_X = {3, 4, 5, ...}.

Here, P[X = x] = C(x − 1, k − 1) · p^k · (1 − p)^(x−k). Notice the condition that the last delivery has to be a success, which is why the positions of the other k − 1 successes are chosen from the first x − 1 attempts. Thus,

P_X(x) = C(x − 1, k − 1) · p^k · (1 − p)^(x−k)  for x ≥ k
       = 0                                      otherwise
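A sketch of the Pascal PMF in Python; as a sanity check, for k = 1 it reduces to the geometric PMF:

from math import comb

def pascal_pmf(x, k, p):
    # P[X = x] = C(x-1, k-1) p^k (1-p)^(x-k): the x-th attempt is the k-th success.
    return comb(x - 1, k - 1) * p ** k * (1 - p) ** (x - k) if x >= k else 0.0

print(pascal_pmf(5, 3, 0.5))   # C(4, 2) * 0.5^5 = 0.1875
print(pascal_pmf(4, 1, 0.5))   # 0.0625, identical to the geometric PMF at x = 4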

Uniform Distributions

Uniform distributions are also a family of distributions, since they have parameters

and depending on the parameter, we will get a different distribution.

Say X is a discrete random variable and S X ={ x 1 , x 2 , … , x n }, where n is the number of

possible values of X .

We can say that X is a random variable for a uniform distribution if all of the possible

values of X are equally likely, meaning they have the same probability.

P [ X=x 1 ] =P [ X =x 2 ]=…=P [ X=x n ]


P_X(x) = 1/n  for x ∈ S_X
       = 0    otherwise
In the above case, we have considered that X can have any value at all. There is a

slightly different variant that assumes that the values of X are successive integers. It

can start from any integer number, but they should be within an interval, say [ l ,m ] ,

where l < m. Thus, S_X = {l, l + 1, l + 2, ..., m}. Since there are m − l + 1 values, the probability of each will be 1/(m − l + 1).

P_X(x) = 1/(m − l + 1)  for x ∈ S_X
       = 0              otherwise

F_X(x) = 0                        for x < l
       = (x − l + 1)/(m − l + 1)  for l ≤ x < m
       = 1                        for x ≥ m
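A sketch of the integer-uniform family in Python; the fair-die values l = 1, m = 6 are just an example:

def uniform_pmf(x, l, m):
    # Each of the m - l + 1 integers in [l, m] is equally likely.
    return 1 / (m - l + 1) if l <= x <= m else 0.0

def uniform_cdf(x, l, m):
    # F_X(x) = (floor(x) - l + 1) / (m - l + 1) inside the range.
    if x < l:
        return 0.0
    if x >= m:
        return 1.0
    return (int(x) - l + 1) / (m - l + 1)

print(uniform_pmf(3, 1, 6), uniform_cdf(4, 1, 6))   # 1/6 and 4/6 for a fair die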

Hypergeometric Distributions

Say we have a box with some ICs, a of which are in a good condition and b of which

are defective. We now pick the ICs from the box in two conditions, firstly, with

replacement, and secondly, without replacement. The probability of picking any

individual IC is the same.

Say in the first scenario we pick n ICs with replacement, putting each IC back after we

pick them. Here, let X =number of good ICs. Thus, S X ={ 0 , 1 ,2 , … , n }.

Here, the probability of picking a good IC is p = a/(a + b) and the probability of picking a defective IC is (1 − p) = b/(a + b). Since there are only two possible outcomes for each pick, each pick is a Bernoulli experiment, and the entire process of picking n ICs with replacement is a Bernoulli trial. More specifically, we are just counting the number of successes, so X ~ binom(n, p). As such,

P_X(x) = C(n, x) · (a/(a + b))^x · (b/(a + b))^(n−x)

Now consider the scenario where we pick n ICs, but without replacement. In this

scenario, every time we pick an item, the number of items remaining in the box

changes. Say X =number of good ICs, and we want to find P [ X=x ] .

This case is a hypergeometric distribution, and we will solve this by counting the

number of ways we can do the operations. First, consider how many ways we can pick n ICs from a set of (a + b) ICs. This is obviously C(a + b, n). Next, consider how many ways we can pick exactly x good and (n − x) defective ICs. This is C(a, x) × C(b, n − x). Thus,

P[X = x] = C(a, x) · C(b, n − x) / C(a + b, n)

P_X(x) = C(a, x) · C(b, n − x) / C(a + b, n)  for x ≤ a and (n − x) ≤ b
       = 0                                    otherwise
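A sketch of this counting argument in Python; the numbers of good and defective ICs below are arbitrary example values:

from math import comb

def hypergeometric_pmf(x, a, b, n):
    # Choose x of the a good ICs and n - x of the b defective ones,
    # out of all C(a + b, n) equally likely ways to pick n ICs.
    if 0 <= x <= a and 0 <= n - x <= b:
        return comb(a, x) * comb(b, n - x) / comb(a + b, n)
    return 0.0

# Example: 6 good and 4 defective ICs, pick 3 without replacement.
print(hypergeometric_pmf(2, a=6, b=4, n=3))   # C(6,2) * C(4,1) / C(10,3) = 0.5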

Poisson Distributions

Poisson distributions model the number of occurrences of some recurring event: for example, cars arriving at a destination, customers arriving at a shop, or packets arriving at a router. For such scenarios, the average arrival rate, i.e. the number of arrivals per unit time, will be given to us and is denoted by λ. The unit of time could be anything from a second to months or years, whatever is appropriate for the situation.

We need to define the time period T and the random variable X = number of events in time T. For example, we might want to know the number of earthquakes that might occur in the next two years. As such, S_X = {0, 1, 2, 3, ...}.

We are interested in calculating P[X = x], where x ≥ 0. However, we will not derive the formula for the PMF of Poisson distributions here, because the derivation is somewhat involved. The formula will also be provided during any examinations, so it is not necessary to memorise it.

Poisson distributions are actually approximations to binomial distributions. As we know, in binomial distributions, we need to calculate C(n, x). For a value of n like n = 10000 and a value of x like x = 2, C(n, x) can give us huge numbers, and the calculations become very cumbersome. Consider the case of a shop owner who knows that 4.5 customers arrive per hour at their shop. If X = number of customers that arrive in the next hour, we want to know P[X = x].

If we divide the one hour into seconds, then the average number of customers arriving per second is 4.5/3600. From this, we can take this value to be the probability of a customer arriving in a particular second, and the probability of a customer not arriving in that second to be 1 minus this value. Then, each second is a Bernoulli experiment, and repeating this experiment for 3600 × T seconds is a Bernoulli trial.


Say T = 2 hours. Thus, S_X = {0, 1, ..., 7200}, and, for example, P[X = 2] = C(7200, 2) · p^2 · (1 − p)^7198. This is possible to calculate, but cumbersome. Thus, we can make an approximation using a Poisson random variable.

P_X(x) = e^(−α) · α^x / x!  for x = 0, 1, 2, ...
       = 0                  otherwise

Thus, Poisson random variables are only defined for non-negative integer values of x. Here, α = λ × T. For the example above, α = 4.5 × 2 = 9.
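A Python sketch comparing the cumbersome binomial calculation for the shop example with its Poisson approximation (assuming λ = 4.5 per hour and T = 2 hours, so n = 7200 one-second trials and α = 9):

from math import comb, exp, factorial

lam, T = 4.5, 2            # arrival rate per hour, observation window in hours
n = 3600 * T               # one Bernoulli trial per second -> 7200 trials
p = lam / 3600             # probability of an arrival in any given second
alpha = lam * T            # Poisson parameter, alpha = lambda * T = 9

x = 2                      # probability of exactly 2 customers arriving
binomial = comb(n, x) * p ** x * (1 - p) ** (n - x)
poisson = exp(-alpha) * alpha ** x / factorial(x)

print(binomial, poisson)   # both roughly 0.0050, agreeing to several decimal places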

Example

We frequently experience misdialled calls or cross connections. Say in an office, the average number of misdialled calls per day is λ = 4. We want to find the probability of getting at least 2 wrong calls by the end of tomorrow, i.e. over T = 2 days.

α = 4 × 2 = 8

X = number of misdialled calls in time T

P[X ≥ 2] = 1 − P[X < 2]
         = 1 − P_X(0) − P_X(1)
         = 1 − e^(−8) · 8^0 / 0! − e^(−8) · 8^1 / 1!
         = 1 − e^(−8) − 8e^(−8)
         = 1 − 9e^(−8)
Truncated Geometric Distribution

When we calculated geometric random variables, we did not place an upper limit on
X . Thus, S X ={ 1 , 2, … , ∞ }. For example, if we are sending data packets and we are not

getting a message telling us a particular packet was delivered, we assume there was

an error and keep sending it.

However, how many times do we keep doing this? What if the device we are trying to

send the packet to is offline? If we just keep sending the packet, it will be a huge

waste of network resources. Thus, we need to put a limit on it. This limit is denoted

by R . The geometric random variable is said to be truncated. Thus, S X ={ 1 , 2, … , R }.

For the first attempt, P[D] = p and P[F] = 1 − p. For the second attempt, P[D] = p(1 − p) and P[F] = (1 − p)^2. For the third attempt, P[D] = p(1 − p)^2 and P[F] = (1 − p)^3. If we truncate the geometric distribution at R = 3, we cannot have any more attempts after this. Geometric distributions always end in a success, but here, we have a situation where we may have to end without any success.

Remember that our goal originally was to find the probability that x attempts are needed, not the probability that the packet is sent successfully. Up to x = 2, we have no problems, but after x = 3 we will not try again. As such, the probability that 3 attempts are needed, regardless of whether the last attempt succeeds or fails, is the same as the probability that the first 2 attempts fail.

P_X(x) = p · (1 − p)^(x−1)  for x = 1, 2, ..., R − 1
       = (1 − p)^(R−1)      for x = R
       = 0                  otherwise
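A sketch of the truncated PMF in Python; note that, unlike the ordinary geometric distribution, the probabilities now sum to 1 over a finite set (p = 0.5 and R = 3 are example values):

def truncated_geometric_pmf(x, p, R):
    # Attempts stop at R: the last value absorbs "the first R - 1 attempts all failed".
    if 1 <= x <= R - 1:
        return p * (1 - p) ** (x - 1)
    if x == R:
        return (1 - p) ** (R - 1)
    return 0.0

p, R = 0.5, 3
print([truncated_geometric_pmf(x, p, R) for x in range(1, R + 1)])      # [0.5, 0.25, 0.25]
print(sum(truncated_geometric_pmf(x, p, R) for x in range(1, R + 1)))   # 1.0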
Expected Values of Well-Known Discrete Random Variables

E[X] = Σ_{x ∈ S_X} x · P_X(x)

Var[X] = Σ_{x ∈ S_X} (x − E[X])^2 · P_X(x)

For Bernoulli random variables,

P_X(x) = 1 − p  for x = 0
       = p      for x = 1
       = 0      otherwise

E[X] = 0 · (1 − p) + 1 · p = p

For binomial random variables, which count the successes over n independent Bernoulli trials, E[X] = np, since the expectation of a sum is the sum of the individual expectations.

Similarly, for geometric random variables: E[X] = 1/p

For negative binomial random variables: E[X] = k/p

For Poisson random variables: E[X] = α = λ × T

We just need to know these values, nothing more.
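A numerical sanity check of these expectations in Python, summing x · P_X(x) over a long prefix of each support; the parameter values and the truncation points are arbitrary choices:

from math import comb, exp, factorial

p, k, alpha = 0.25, 3, 9
N = 10_000   # enough terms for the infinite geometric and Pascal sums to converge

geometric = sum(x * p * (1 - p) ** (x - 1) for x in range(1, N))
pascal = sum(x * comb(x - 1, k - 1) * p ** k * (1 - p) ** (x - k) for x in range(k, N))
poisson = sum(x * exp(-alpha) * alpha ** x / factorial(x) for x in range(100))

print(geometric, 1 / p)   # both approximately 4.0
print(pascal, k / p)      # both approximately 12.0
print(poisson, alpha)     # both approximately 9.0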
