Populations and samples
What counts as the population depends on the data or survey
Example
Population – survey CEOs of the world’s top 500
corporations
Parameters
Mean, μ
Standard deviation, σ
Sample – the population has too many individuals to survey them all
Choose a sample from the population
Conditions
Every individual in a population has a known
nonzero chance of being sampled
Equal chance for everyone
Selections have to be independent; choosing one individual does not
influence the choice of another
Have to be careful when defining a population
Book – each member of population has a number
Use a random number table to randomly select
individuals
Excel – the function is =rand( )
Distributed uniform (0, 1)
X ~UNIF(0, 1)
Select whole numbers between 0 and 1,000
=round(1000*rand(), 0)
The round function rounds a number to the given number of decimal places; 0 gives the nearest integer
Each time you change something in Excel, Excel
recalculates the random numbers
Use Copy and Paste Special (Values) to freeze the random
numbers and stop them from changing
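A rough Python counterpart of the spreadsheet steps above, as a sketch only. Here `random.random()` plays the role of `=rand()`. The helper names `draw_uniform` and `draw_integer_0_to_1000` are hypothetical, and the seed is an arbitrary choice. Unlike Excel, the values do not change unless the code is run again.

```python
import math
import random

def draw_uniform(rng):
    """Counterpart of =rand(): one value from UNIF(0, 1)."""
    return rng.random()

def draw_integer_0_to_1000(rng):
    """Counterpart of =round(1000*rand(), 0): a whole number from 0 to 1000."""
    # floor(x + 0.5) rounds halves up, as Excel's ROUND does for positive numbers.
    return math.floor(1000 * rng.random() + 0.5)

rng = random.Random(42)  # fixed seed so the run is repeatable (assumption)
uniform_draws = [draw_uniform(rng) for _ in range(5)]
integer_draws = [draw_integer_0_to_1000(rng) for _ in range(5)]
print(uniform_draws)
print(integer_draws)
```

Fixing the seed has the same purpose as Paste Special in Excel: the numbers stay the same from one run to the next.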
Trick – generate random numbers with any distribution
Example – generate normally distributed random
numbers
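Probability Density Function (PDF) – a function
that gives the relative likelihood of each value of a continuous random variable.
For a discrete random variable, the counterpart (the probability mass function)
associates each value with the probability that this value will occur.
Denoted as p(x) or f(x)
Cumulative Distribution Function (CDF) – integral (or running sum) of the
probability function up to x
Denoted by a capital letter, such as P(x) or F(x).
If you sum (or integrate) over all probabilities, then the total has to equal one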
A PDF and a CDF are shown below
Use UNIF to get a probability between 0 and 1
Find the inverse of the CDF, P(x), at that random number
To randomly create a normally distributed variable
with a given mean and standard deviation, the Excel function is
=norminv(rand(), mean, standard deviation)
Example
Find the random numbers for the distribution,
X_{i} ~ N(10, 25)
The notation is X_{i} ~ N(μ, σ^{2}), so the variance is 25 and the standard deviation is 5
The Excel function is =norminv(rand(), 10, 5)
Can use this method to find random numbers from any
distribution
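The inverse-CDF method described above can be sketched in Python. This is an illustration under assumptions, not the notes' own code: `statistics.NormalDist.inv_cdf` stands in for `norminv`, the helper names are hypothetical, and the exponential distribution with rate 2 is an extra example chosen only to show a second distribution.

```python
import math
import random
from statistics import NormalDist, fmean, stdev

def draw_normal(rng, mean, sd):
    """Counterpart of =norminv(rand(), mean, sd)."""
    u = rng.random()
    while u == 0.0:  # inv_cdf needs 0 < u < 1; random() can return exactly 0.0
        u = rng.random()
    return NormalDist(mean, sd).inv_cdf(u)

def draw_exponential(rng, rate):
    """Same method for an exponential distribution: invert F(x) = 1 - exp(-rate*x)."""
    u = rng.random()  # 0 <= u < 1, so 1 - u is never zero
    return -math.log(1.0 - u) / rate

rng = random.Random(0)  # arbitrary seed
normal_draws = [draw_normal(rng, 10, 5) for _ in range(10_000)]
expo_draws = [draw_exponential(rng, 2.0) for _ in range(10_000)]

# The sample mean and standard deviation should be close to 10 and 5.
print(round(fmean(normal_draws), 2), round(stdev(normal_draws), 2))
# The exponential sample mean should be close to 1 / rate = 0.5.
print(round(fmean(expo_draws), 2))
```

Only the inverse CDF changes from one distribution to the next; the uniform draw stays the same.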
Stratified Random Sampling
You take a sample and then divide the sample by
gender (male or female)
Then you divide by age, creating the four categories
0 – 30 years
31 – 40 years
41 – 60 years
> 60 years
You have a total of eight compartments (two genders × four age categories)
You randomly select individuals and fill the
compartments equally
Each compartment has 10 individuals
Unfortunately, males/females and age categories may
not be distributed evenly
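A minimal sketch of filling the eight compartments equally. The population is made up for illustration, the function names are hypothetical, and the age boundaries are treated as non-overlapping (age 40 falls in the 31 to 40 band), which is an assumption.

```python
import random

def age_band(age):
    """Map an age in years to one of the four age categories."""
    if age <= 30:
        return "0-30"
    if age <= 40:
        return "31-40"
    if age <= 60:
        return "41-60"
    return ">60"

def stratified_sample(population, per_compartment, rng):
    """Randomly pick the same number of individuals from each compartment."""
    compartments = {}
    for person in population:
        key = (person["gender"], age_band(person["age"]))
        compartments.setdefault(key, []).append(person)
    sample = {}
    for key, members in compartments.items():
        # A compartment with too few members cannot be filled completely.
        size = min(per_compartment, len(members))
        sample[key] = rng.sample(members, size)
    return sample

rng = random.Random(1)  # arbitrary seed
# Hypothetical population of 2,000 people with made-up genders and ages.
population = [
    {"id": i, "gender": rng.choice(["female", "male"]), "age": rng.randint(0, 90)}
    for i in range(2000)
]
sample = stratified_sample(population, 10, rng)
for key in sorted(sample):
    print(key, len(sample[key]))
```

The `min(...)` guard reflects the caveat above: when a gender and age combination is rare in the population, its compartment may end up with fewer than 10 individuals.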
Unbiasedness – on average, the mean of a sample will equal the true parameter
value
The notation is E(x̄) = μ
E stands for expected value
Precise – the study is repeatable; if we took
another sample, we would get similar results
Nonrandom samples – make our parameter estimates
biased
Some people in the population will never be selected;
they may be transient
Some people may not fill out the surveys
Some people may lie on surveys
Block Randomization
Use Table F and choose a block size of 2, 4, 6, 8, or 10
Example – testing effectiveness of a new drug
We have 8 patients, and choose block size 8
Four patients get the new drug, while four patients
get the placebo
Our study has 8 patients who have a unique number
between 1 and 8
Patients could be a biased sample; however, we are
testing the drug’s effectiveness
Then the 8 patients get the following treatments
Treatment: patients 2, 3, 8, 5
Placebo: patients 1, 4, 6, 7
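The assignment for one block can also be sketched in code. This swaps the printed random number table for a pseudorandom shuffle, which is a substitution and not the method in the notes. The function name `randomize_block` and the seed are hypothetical.

```python
import random

def randomize_block(patient_ids, rng):
    """Randomly assign half of one block to the treatment and half to the placebo."""
    shuffled = list(patient_ids)
    if len(shuffled) % 2 != 0:
        raise ValueError("block size must be even")
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "treatment": sorted(shuffled[:half]),
        "placebo": sorted(shuffled[half:]),
    }

rng = random.Random(2024)  # arbitrary seed
assignment = randomize_block(range(1, 9), rng)  # patients numbered 1 to 8
print(assignment)
```

Each run with a different seed gives a different split, but the block always ends up with four patients in each arm.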
Standard Error
Each time we take a sample, we get a different mean
Example
Sample 1: x̄ = 29.3
Sample 2: x̄ = 33.3
…
Sample 100: x̄ = 27.7
We do not want to keep taking samples to find the
variability in the mean
The standard error (SE) gives the variability in the
mean for repeated sampling
The formula is SE = σ / \sqrt{n}, where σ is the standard deviation and n is the sample size
As the sample size increases, the standard error
decreases
With an infinite sample size, the standard error is zero and we know the true
parameter for the mean
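A small simulation can illustrate the idea, assuming the usual formula for the standard error of the mean, SE = σ / √n. The population values (mean 30, standard deviation 10), the sample sizes, and the number of repeated samples are made up for illustration.

```python
import math
import random
from statistics import fmean, stdev

def standard_error(sd, n):
    """Standard error of the mean for a sample of size n."""
    return sd / math.sqrt(n)

MU, SIGMA = 30.0, 10.0   # hypothetical population mean and standard deviation
REPEATS = 1000           # number of repeated samples per sample size

rng = random.Random(7)   # arbitrary seed
results = {}
for n in (25, 100, 400):
    # Take many samples of size n and record the mean of each one.
    sample_means = [
        fmean(rng.gauss(MU, SIGMA) for _ in range(n)) for _ in range(REPEATS)
    ]
    # Compare the formula with the observed spread of the sample means.
    results[n] = (standard_error(SIGMA, n), stdev(sample_means))
    print(n, round(results[n][0], 3), round(results[n][1], 3))
```

The two columns should be close to each other, and both shrink as n grows, so one sample plus the formula replaces the repeated sampling.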
Binomial Distribution
We have two states
P is the probability that Event A happens
1 – P is the probability that Event A does not happen
The states or events are mutually exclusive
We sampled 80 people and 43 went to college
The mean (proportion) for people going to college (the event)
P = 43 / 80 = 0.5375
The probability for people who did not go to college
1 – P = (80 – 43) / 80 = 1 – 0.5375 = 0.4625
The variance is P(1 – P) = 0.5375 × 0.4625 ≈ 0.2486
The standard error is \sqrt{P(1 – P) / n} = \sqrt{0.2486 / 80} ≈ 0.0557
It is also possible to express the probabilities of events as
percentages.
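The college example can be checked with a few lines of Python. This is a sketch that assumes the standard formulas for a proportion, variance P(1 – P) and standard error √(P(1 – P) / n). The function name `proportion_summary` is hypothetical.

```python
import math

def proportion_summary(successes, n):
    """Return the proportion, its variance, and its standard error."""
    p = successes / n
    variance = p * (1 - p)          # variance of a single yes/no outcome
    se = math.sqrt(variance / n)    # standard error of the sample proportion
    return p, variance, se

p, variance, se = proportion_summary(43, 80)
print(f"P = {p:.4f}, 1 - P = {1 - p:.4f}")
print(f"variance = {variance:.4f}, standard error = {se:.4f}")
# The same quantities expressed in percents.
print(f"P = {100 * p:.2f}%, standard error = {100 * se:.2f}%")
```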
