Stratified Sampling - Proportional Allocation
Stratified Sampling Technique: Proportional Allocation
Done By:
Anna Serene Boby, 2048115
Introduction:
Sampling procedures are familiar to many of us. People have always been curious about information that can help improve the quality of life, and this curiosity has driven the development of techniques for collecting information on a large scale. This in turn led to sampling methods, which in many cases let us study a sample instead of carrying out a complete enumeration.
We use different sampling techniques in order to get the best possible estimate of the true population values. The simplest way to collect a sample is Simple Random Sampling (SRS), which consists of selecting units at random from the population. Although this method is easy, it has drawbacks: for example, a small sample can by chance under-represent an important subgroup. Stratified Random Sampling is another method, developed to improve how units are selected.
Stratified random sampling is a method of collecting a sample by first dividing the entire population into strata. The division is made so that the units within each stratum are similar to one another, while units in different strata differ. This works because the strata are formed on the basis of a particular characteristic. The technique resembles cluster sampling in that both partition the population into groups, but it differs in how the groups are formed and used: in stratified sampling units are drawn from every stratum, whereas in cluster sampling only some of the clusters are selected. Several allocations are possible in stratified random sampling: Equal, Proportional, Neyman and Cost-Optimum.
We will be focusing on Proportional Allocation. The other allocation techniques are based on other criteria. For instance, in Equal Allocation the same number of units is collected from each stratum, irrespective of the stratum size.
Equal Allocation is the simplest of these. In Proportional Allocation, the method considered here, the number of units drawn from each stratum is proportional to the size of that stratum. This is the sensible choice in many situations because each stratum is represented in the sample in the same proportion as in the population. The sample size for each stratum is obtained using the formulas below:
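N: the total population size
n: the total sample size
$n_1, n_2, \dots, n_L$ are the sample sizes collected from each stratum, whose sum is n.
$N_1, N_2, \dots, N_L$ are the sizes of the different strata, whose sum is N. L is the total number of strata. The sample size for each stratum is obtained as:
$$n_i = n \cdot \frac{N_i}{N}, \quad i = 1, 2, \dots, L$$
The variance of the stratified sample mean can be obtained as:
$$V(\bar{y}_{st}) = \left(\frac{1}{n} - \frac{1}{N}\right) \sum_{i=1}^{L} W_i S_i^2$$
Where $W_i = N_i / N$ is the weight of stratum i and $S_i^2 = \frac{1}{N_i - 1} \sum_{j=1}^{N_i} (y_{ij} - \bar{Y}_i)^2$ is the variance within stratum i.
The mean can be obtained as:
$$\bar{y}_{st} = \sum_{i=1}^{L} W_i \bar{y}_i = \frac{1}{N} \sum_{i=1}^{L} N_i \bar{y}_i$$
where $\bar{y}_i$ is the sample mean of stratum i.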
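The allocation rule and the estimators can be sketched in a few lines of Python. This is only an illustrative sketch, not the analysis used in this post: the function names, the largest-remainder rounding and the stratum sizes in the usage lines are assumptions. The variance uses the general stratified form, the sum of W_i^2 (1/n_i - 1/N_i) s_i^2, which reduces to the proportional-allocation formula when n_i = n N_i / N holds exactly and still behaves sensibly after rounding.

```python
from statistics import mean, variance


def proportional_allocation(stratum_sizes, n):
    """Return n_i = n * N_i / N per stratum, rounded so the sizes sum to n."""
    N = sum(stratum_sizes.values())
    exact = {s: n * N_s / N for s, N_s in stratum_sizes.items()}
    alloc = {s: int(x) for s, x in exact.items()}  # round down first
    # Hand out the units lost to rounding, largest fractional part first
    # (an assumed tie-breaking rule; other rounding schemes are possible).
    leftover = n - sum(alloc.values())
    by_fraction = sorted(exact, key=lambda s: exact[s] - alloc[s], reverse=True)
    for s in by_fraction[:leftover]:
        alloc[s] += 1
    return alloc


def stratified_estimate(stratum_sizes, samples):
    """Return (stratified mean, estimated variance of that mean).

    Each stratum sample needs at least two observations so that a
    within-stratum sample variance can be computed.
    """
    N = sum(stratum_sizes.values())
    est_mean = 0.0
    est_var = 0.0
    for s, y in samples.items():
        N_s, n_s = stratum_sizes[s], len(y)
        W_s = N_s / N  # stratum weight N_i / N
        est_mean += W_s * mean(y)
        est_var += W_s ** 2 * (1 / n_s - 1 / N_s) * variance(y)
    return est_mean, est_var


# HYPOTHETICAL stratum sizes, for illustration only.
print(proportional_allocation({"F": 32, "M": 28}, 20))  # {'F': 11, 'M': 9}
```

Rounding means the allocation is only approximately proportional when n N_i / N is not a whole number, which is usual in practice.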
Application of Proportional Sampling:
As an application of proportional sampling, we can use any data set that records gender. Since such data sets usually record gender in two categories, we can stratify the population on that variable. A real-life application would be to estimate the average age of people by gender. This type of analysis shows what proportion of the population falls in each group and what the average age of each group is. The method is well suited to analysing the average ages of men and women in a particular population. The following R example will make it clear.
To illustrate the sampling technique, we will use a data set taken from “Kaggle”, analysed in R. The data set contains information on people's color preferences together with their gender and age. Using our analysis we can estimate the average age of a person for each gender, and in particular the average age of the people who like “cool”. The ages range from about 40 to 60, so we can label the strata as elderly men and elderly women.
We can classify them into two strata based on gender, i.e. “F” as one stratum and “M” as the other.
Thus, we get the average age of the people who like “cool” as 48.9 years. The average age of elderly women who like “cool” is 48.72 years, and the average age of elderly men who like “cool” is 49.11 years. This illustrates how proportional allocation can be applied when gender is the stratifying variable. We have also obtained a confidence interval for the true population value.
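The overall figure is a weighted average of the two stratum means, with weights equal to each stratum's share of the population. The short Python sketch below shows the arithmetic. The stratum means are the ones reported above, but the stratum sizes are not given in this post, so the 32 and 28 used here are purely hypothetical values chosen to sum to N = 60.

```python
# Stratum means reported in the post (years).
mean_age = {"F": 48.72, "M": 49.11}

# ASSUMPTION: the post does not report stratum sizes; 32 women and 28 men
# are hypothetical values that sum to the stated population size N = 60.
stratum_size = {"F": 32, "M": 28}

N = sum(stratum_size.values())

# Stratified mean: sum over strata of (N_i / N) * stratum mean.
overall = sum(stratum_size[g] / N * mean_age[g] for g in mean_age)
print(round(overall, 2))
```

Because the two stratum means are so close, any split with somewhat more women than men gives an overall mean near the reported 48.9 years.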
This technique helps in getting better and more valid estimates of the population parameters. In our analysis we have chosen a sample of size 20 (n) from a total population of 60 (N) people. The proportion of males is lower than that of females. This sampling technique can also be used for hypothesis testing. In our study, for instance, the hypotheses were:
H0: There is no significant difference in the ages of elderly men and women who like the color “cool”.
H1: There is a significant difference in the ages of elderly men and women who like the color “cool”.
From the results we have obtained, the two average ages almost coincide (48.72 and 49.11 years). Hence, we do not reject the null hypothesis that there is no significant difference in the ages of elderly men and women who like the color “cool”. A formal test would compare this difference with its standard error.
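One common way to make such a comparison formal is Welch's two-sample t statistic, sketched below in Python. This is a swapped-in illustration, not the test carried out in this post: the function name is made up, the standard deviations and the 11/9 split of the sample are hypothetical because the post does not report them, and the finite population correction is ignored for simplicity.

```python
from math import sqrt


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2  # squared standard errors
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Means are from the post. The standard deviation of 3.0 years and the
# sample sizes 11 and 9 are HYPOTHETICAL placeholders.
t, df = welch_t(48.72, 3.0, 11, 49.11, 3.0, 9)
print(f"t = {t:.3f}, df = {df:.1f}")
```

A value of |t| well below 2 is consistent with not rejecting H0 at the usual 5% level. With the real within-stratum standard deviations the statistic would of course differ.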
In this analysis we have chosen proportional allocation because it can estimate the true value more precisely than SRS. It is useful here because there are fewer men than women in our study: an SRS of 20 people could, by chance, over- or under-represent men, whereas proportional stratification fixes each gender's share of the sample in advance. Hence, we go for proportional stratification in such cases.
Advantages of Proportional Allocation:
There are several advantages of using proportional
stratified sampling over SRS.
1. It gives estimates with more precision, i.e. lower variance, than the ones given by SRS (Simple Random Sampling), provided the strata differ from one another.
2. When the entire population is grouped into strata and the sample is collected in proportion to the stratum sizes, the sample is a better representative of the population and therefore gives better estimates.
3. It is better to use this sampling technique when the units under study vary greatly. When the population is homogeneous, an SRS is sufficient. However, if there is greater heterogeneity, we go for stratification.
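The precision gain in the first point can be checked exactly on a small made-up population. The Python sketch below compares the variance of the sample mean under SRS without replacement, (1/n - 1/N) S^2, with the variance under proportional allocation, (1/n - 1/N) times the weighted sum of the within-stratum variances. The population values and function names are hypothetical and chosen only so that the two strata have very different means.

```python
from statistics import variance


def var_srs_mean(population, n):
    """Variance of the sample mean under SRS without replacement."""
    N = len(population)
    return (1 / n - 1 / N) * variance(population)


def var_prop_mean(strata, n):
    """Variance of the stratified mean under exact proportional allocation."""
    N = sum(len(units) for units in strata.values())
    # Weighted within-stratum variance: sum of (N_i / N) * S_i^2.
    within = sum(len(units) / N * variance(units) for units in strata.values())
    return (1 / n - 1 / N) * within


# HYPOTHETICAL population of 10 units in two strata with distant means.
strata = {"A": [10, 11, 12, 13, 14, 15], "B": [30, 31, 32, 33]}
population = [y for units in strata.values() for y in units]

n = 5  # proportional allocation gives 3 units from A and 2 from B
print(var_srs_mean(population, n))
print(var_prop_mean(strata, n))
```

The stratified variance is far smaller here because almost all of the variability lies between the strata. When the stratum means are close, as with the two age averages above, the gain over SRS is correspondingly small.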
Disadvantages of Proportional Allocation:
1. A major disadvantage of proportional allocation arises when a unit of the population cannot be classified into any of the strata formed.
2. Another disadvantage arises when the sample required from a single stratum is very large. This increases the cost of sampling.