Stratified Sampling - Proportional Allocation

 

Stratified Sampling Technique: Proportional Allocation

Done By:

Anna Serene Boby, 2048115

An Introduction:

Sampling procedures are familiar to many of us. People have always been curious about information that can help improve the quality of life, and this curiosity motivated techniques for collecting information on a large scale. It in turn led to sampling methods, which in many cases let us study a sample instead of carrying out a complete enumeration.

We use different sampling techniques in order to get the best estimate of the true population values. Several such techniques are in use. The simplest way to collect a sample is Simple Random Sampling (SRS), which consists of selecting units at random from the population. Though this method is easy, it has several drawbacks. Another method, developed to improve the selection of units, is Stratified Random Sampling.

Stratified random sampling collects a sample by first dividing the entire population into strata. The division is such that the units within each stratum are similar to one another, while units from different strata are not. This happens because the division into strata is based on a particular characteristic. The technique resembles cluster sampling but differs in how the groups are formed: strata are built to be internally homogeneous, and a sample is drawn from every stratum. Several allocations are possible in stratified random sampling: Equal, Proportional, Neyman and Cost-Optimum.

We will focus on Proportional Allocation; the other allocation techniques are based on other parameters. The simplest is Equal Allocation, where an equal number of units is selected from each stratum, irrespective of the stratum size.

In Proportional Allocation, units are collected from each stratum in proportion to the size of that stratum. This is the sensible choice in many situations, because the sample then mirrors the composition of the population. The number of units taken from each stratum is obtained using the formulas below:

 

$N$: the total population size

$n$: the total sample size

$n_1, n_2, \ldots, n_L$ are the sample sizes drawn from each stratum, whose sum is $n$.

$N_1, N_2, \ldots, N_L$ are the sizes of the strata, whose sum is $N$. $L$ is the total number of strata. The sample size for the $i$-th stratum is obtained as:

$$n_i = \frac{n}{N} N_i, \qquad i = 1, 2, \ldots, L$$
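The allocation rule $n_i = n N_i / N$ can be sketched in a few lines of Python (standard library only). This is a minimal illustration, not a definitive implementation: the function names, the largest-remainder rounding and the stratum sizes are all assumptions made for the example. Equal allocation is included for contrast.

```python
def proportional_allocation(stratum_sizes, n):
    """Split a total sample size n across strata as n_i = n * N_i / N."""
    N = sum(stratum_sizes.values())
    if not 0 < n <= N:
        raise ValueError("n must satisfy 0 < n <= N")
    # Exact (possibly fractional) shares n * N_i / N.
    exact = {name: n * N_i / N for name, N_i in stratum_sizes.items()}
    # Round down, then give the leftover units to the largest remainders
    # so that the allocated sizes still add up to n (an assumed rounding rule).
    alloc = {name: int(share) for name, share in exact.items()}
    leftover = n - sum(alloc.values())
    ranked = sorted(exact, key=lambda name: exact[name] - alloc[name], reverse=True)
    for name in ranked[:leftover]:
        alloc[name] += 1
    return alloc


def equal_allocation(stratum_sizes, n):
    """Give every stratum the same sample size, ignoring stratum size."""
    base, extra = divmod(n, len(stratum_sizes))
    # Any remainder goes to the first strata in iteration order.
    return {name: base + (1 if i < extra else 0)
            for i, name in enumerate(stratum_sizes)}


# Hypothetical stratum sizes, chosen only for illustration.
sizes = {"F": 36, "M": 24}
print(proportional_allocation(sizes, 20))  # {'F': 12, 'M': 8}
print(equal_allocation(sizes, 20))         # {'F': 10, 'M': 10}
```

With proportional allocation the larger stratum contributes more units, whereas equal allocation takes 10 from each stratum regardless of size.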



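The variance of the stratified sample mean $\bar{y}_{st}$ under proportional allocation can be obtained as:

$$V(\bar{y}_{st}) = \left(\frac{1}{n} - \frac{1}{N}\right) \sum_{i=1}^{L} \frac{N_i}{N} S_i^2$$

where

$$S_i^2 = \frac{1}{N_i - 1} \sum_{j=1}^{N_i} (Y_{ij} - \bar{Y}_i)^2$$

is the population mean square of the $i$-th stratum, with $Y_{ij}$ the $j$-th unit of stratum $i$ and $\bar{Y}_i$ the mean of that stratum.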
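A minimal Python sketch of this variance, assuming simple random sampling without replacement within each stratum, so that $V(\bar{y}_{st}) = (1/n - 1/N) \sum_i (N_i/N) S_i^2$. The function names and the numbers are hypothetical and serve only to show the arithmetic.

```python
from statistics import variance


def stratum_mean_square(values):
    """S_i^2: squared deviations from the stratum mean, divided by N_i - 1."""
    return variance(values)


def variance_of_stratified_mean(stratum_sizes, stratum_mean_squares, n):
    """V(y_st) = (1/n - 1/N) * sum_i (N_i / N) * S_i^2 (proportional allocation)."""
    N = sum(stratum_sizes.values())
    weighted = sum(stratum_sizes[name] / N * stratum_mean_squares[name]
                   for name in stratum_sizes)
    return (1 / n - 1 / N) * weighted


# Hypothetical stratum sizes and mean squares, for illustration only.
sizes = {"F": 36, "M": 24}
mean_squares = {"F": 25.0, "M": 16.0}
print(variance_of_stratified_mean(sizes, mean_squares, n=20))
```

In practice the $S_i^2$ are unknown and are replaced by the sample variances of the units drawn from each stratum.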


 

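The mean can be estimated as:

$$\bar{y}_{st} = \frac{1}{N} \sum_{i=1}^{L} N_i \bar{y}_i$$

where

$$\bar{y}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} y_{ij}$$

is the sample mean of the $i$-th stratum, computed independently within each stratum.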
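The stratified mean is a weighted average of the stratum sample means, with weights $N_i / N$. A short Python sketch follows; the function name, stratum sizes and ages are hypothetical and used only to illustrate the calculation.

```python
from statistics import mean


def stratified_mean(stratum_sizes, samples):
    """y_st = (1/N) * sum_i N_i * ybar_i, with ybar_i the i-th stratum sample mean."""
    N = sum(stratum_sizes.values())
    return sum(stratum_sizes[name] * mean(samples[name])
               for name in stratum_sizes) / N


# Hypothetical stratum sizes and sampled ages, for illustration only.
sizes = {"F": 36, "M": 24}
ages = {"F": [48, 50, 52], "M": [44, 46]}
print(stratified_mean(sizes, ages))  # 48.0
```

Because the sample sizes here (3 and 2) are in the same ratio as the stratum sizes (36 and 24), the stratified mean coincides with the plain mean of all five sampled ages. This self-weighting property is a convenient feature of proportional allocation.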


Application of Proportional Sampling:

As an application of proportional sampling we can use any data set that records gender. Since many data sets record only two genders, we can stratify the population on that variable. A real-life application would be to estimate the proportion or the average age of people by gender. This type of analysis helps in understanding which group makes up the larger share of the population and what its average age is. The following R example will make it clear.

To explain the sampling technique we will use a data set taken from “Kaggle”. It contains information on people's color preferences along with their gender and age. Using our analysis we can estimate the average age of a person by gender, and in particular the average age of a person who likes the color “cool”. The people in the data are aged from about 40 to 60, so we can take the strata to be elderly men and elderly women.

The analysis is carried out in R.







We can classify the units into strata based on gender, i.e. “F” as one stratum and “M” as the other.


Thus, we have classified our entire population into 2 strata.


In our case we can move ahead with proportional selection of units. Let the sample size be 20.
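The selection step can be sketched as follows in Python (standard library only). This is an illustration under stated assumptions, not the code used for the analysis: the population below is synthetic, the split of 36 women and 24 men is hypothetical, and the function name `draw_proportional_sample` is made up for the example. Only the totals (60 people, a sample of 20, ages of about 40 to 60) come from the post.

```python
import random
from collections import defaultdict


def draw_proportional_sample(records, stratum_of, n, seed=None):
    """Draw a simple random sample without replacement inside each stratum,
    with stratum sample sizes proportional to stratum sizes."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)
    N = len(records)
    sample = {}
    for name, units in strata.items():
        # n_i = n * N_i / N; plain rounding can make the total differ
        # slightly from n when the shares are not whole numbers.
        n_i = round(n * len(units) / N)
        sample[name] = rng.sample(units, n_i)
    return sample


# Synthetic population: the 36/24 split and the ages are made up.
rng = random.Random(0)
population = (
    [{"gender": "F", "age": rng.randint(40, 60)} for _ in range(36)]
    + [{"gender": "M", "age": rng.randint(40, 60)} for _ in range(24)]
)
sample = draw_proportional_sample(population, lambda r: r["gender"], n=20, seed=1)
print({gender: len(units) for gender, units in sample.items()})  # {'F': 12, 'M': 8}
```

Each stratum is sampled independently, so both genders are guaranteed to appear in the sample in the same 3:2 ratio as in this synthetic population.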







Thus, we get the average age of the people who like “cool” as 48.9 years. The average age of elderly women who like “cool” is 48.72 years and the average age of elderly men who like “cool” is 49.11 years. Proportional allocation is a natural choice when gender is the stratifying variable, since each gender is represented in the sample in the same proportion as in the population. We have also obtained a confidence interval within which the true population value is expected to lie.
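A confidence interval for the stratified mean can be sketched as below in Python. The sketch assumes simple random sampling without replacement within strata and uses a normal quantile; for a sample as small as 20 a t quantile would give a somewhat wider interval. The function name, stratum sizes and ages are hypothetical and are not the values from the analysis.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def stratified_mean_ci(stratum_sizes, samples, level=0.95):
    """Return (estimate, lower, upper) for the population mean."""
    N = sum(stratum_sizes.values())
    estimate = 0.0
    var_hat = 0.0
    for name, N_i in stratum_sizes.items():
        y = samples[name]
        n_i = len(y)
        W_i = N_i / N
        estimate += W_i * mean(y)
        # Estimated variance of W_i * ybar_i, including the finite
        # population correction (1/n_i - 1/N_i).
        var_hat += W_i ** 2 * (1 / n_i - 1 / N_i) * variance(y)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    half_width = z * sqrt(var_hat)
    return estimate, estimate - half_width, estimate + half_width


# Hypothetical stratum sizes and sampled ages, for illustration only.
sizes = {"F": 36, "M": 24}
ages = {"F": [45, 52, 47, 50, 49, 48], "M": [51, 46, 49, 50]}
estimate, lower, upper = stratified_mean_ci(sizes, ages)
print(round(estimate, 2), round(lower, 2), round(upper, 2))
```

Under exact proportional allocation the variance estimate used here reduces to $(1/n - 1/N) \sum_i (N_i/N) s_i^2$, with $s_i^2$ the sample variance of stratum $i$.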

This technique helps in getting better and more valid estimates of the population parameters. In our analysis we have chosen a sample of size 20 (n) from a total population of 60 (N) people, in which there are fewer males than females. This sampling technique can also be used for hypothesis testing. For instance, in our study the hypotheses were:

H0: There is no significant difference in the ages of elderly men and women who like color “cool”.

H1: There is a significant difference in the ages of elderly men and women who like color “cool”.

From the results that we have obtained, the two average ages almost coincide. Hence, we do not reject the null hypothesis that there is no significant difference in the ages of elderly men and women who like the color “cool”.
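To back such a conclusion with a test statistic, one common option is Welch's two-sample t statistic. It is shown here as an assumption of this sketch, not as the method used in the analysis above, and it ignores the finite population correction. The function name and the ages are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Estimated variances of the two sample means.
    v_a = variance(sample_a) / n_a
    v_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(v_a + v_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical sampled ages, for illustration only.
women = [45, 52, 47, 50, 49, 48]
men = [51, 46, 49, 50]
t, df = welch_t(women, men)
print(round(t, 3), round(df, 2))
```

The null hypothesis is rejected at the 5% level when the absolute value of t exceeds the critical value of the t distribution with df degrees of freedom, which can be read from a t table.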

In this analysis we have chosen proportional allocation because it helps us estimate the true value more precisely than SRS. It can be more efficient here because there are fewer men than women in our study, and an SRS might by chance have over-represented or under-represented one gender. Hence, we go for proportional stratification in such cases.

 

Advantages of Proportional Allocation:

There are several advantages of using proportional stratified sampling over SRS.

1.      It gives estimates with more precision, i.e. lower variance, than the ones given by SRS (Simple Random Sampling).

2.      When the entire population is grouped into strata and the sample is collected in accordance with the proportion of each stratum, it gives us better estimates because the sample is a better representative of the population.

3.      It is better to use this sampling technique when there is great variability among the units under study. When the population is homogeneous, an SRS is sufficient for our study. However, if there is greater heterogeneity, we go for stratification.
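The precision gain in the first point can be illustrated numerically. The Python sketch below compares the variance of the sample mean under SRS, $(1/n - 1/N) S^2$, with the variance under proportional allocation, $(1/n - 1/N) \sum_i (N_i/N) S_i^2$, assuming sampling without replacement. The population values and the function name are hypothetical, chosen so that the two strata have clearly different means.

```python
from statistics import variance


def srs_vs_proportional(strata, n):
    """strata maps a stratum name to all population values in that stratum.
    Returns (variance under SRS, variance under proportional allocation)."""
    N = sum(len(values) for values in strata.values())
    fpc = 1 / n - 1 / N
    pooled = [y for values in strata.values() for y in values]
    # SRS: based on the overall mean square S^2.
    var_srs = fpc * variance(pooled)
    # Proportional allocation: based on the within-stratum mean squares S_i^2.
    var_prop = fpc * sum(len(values) / N * variance(values)
                         for values in strata.values())
    return var_srs, var_prop


# Hypothetical population with two well-separated strata.
population = {"young": [21, 23, 25, 22, 24, 26], "old": [61, 63, 65, 62]}
var_srs, var_prop = srs_vs_proportional(population, n=5)
print(round(var_srs, 3), round(var_prop, 3))
```

Stratification removes the between-stratum variation from the variance, so the gain is large when the stratum means differ and negligible when the strata are alike.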


Disadvantages of Proportional Allocation:

1.      One of the major disadvantages of proportional allocation arises when a unit from the population cannot be classified into any of the strata so formed.

2.      Another disadvantage arises when the sample required from a single stratum is very large. This increases the cost of sampling.

 

 
