Linear Systematic Sampling Technique

 


Done by: Hari Prasad (2048102)

 



Introduction 

 

The world we live in today produces enormous volumes of data, and these have become a significant source of information: if we can extract useful results from them, a majority of sectors and people around the world could benefit. The challenge faced by professionals in data modelling and analytics is to extract the useful content from these vast databases and fit appropriate models to it. Over the years, statisticians and other professionals have formulated different sampling techniques that help the user extract the part of the population that best represents the data. In this blog, I cover the linear systematic sampling technique.

 

Systematic Sampling  

 

Systematic sampling is a type of probability sampling in which the researcher chooses a random start from the target population and then selects sampling units at a fixed sampling interval. This technique is similar to simple random sampling but easier to conduct, which makes it attractive when the researcher has budget constraints. It is necessary to have the entire sampling frame to perform systematic sampling on a population. It is also crucial that the data, in the order in which they are listed, do not follow any pattern or classification.

 

Basic terminologies and notations 

 

·         N --> Total number of observations in the data or population size 

·         n --> Total number of observations in the sample or sample size 

·         k --> Sampling interval, k = N/n

 



 

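In systematic sampling, the N = nk observations can be arranged in a table with k rows and n columns. Row r holds the units with serial numbers r, r + k, r + 2k, ..., r + (n-1)k, so each row is one possible systematic sample.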
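As a small illustration of this arrangement, the R sketch below lays out the unit labels of a hypothetical population of N = 12 units with k = 3 (so n = 4). The sizes and the row and column names are made up for the example. Because `matrix()` fills column by column, row r ends up holding the units r, r + k, r + 2k and so on.

```r
# Hypothetical sizes chosen only for illustration
N <- 12          # population size
k <- 3           # sampling interval
n <- N / k       # sample size (assumes N is an exact multiple of k)

# matrix() fills column-wise, so row r contains units r, r + k, ..., r + (n - 1) * k
arrangement <- matrix(1:N, nrow = k, ncol = n,
                      dimnames = list(paste0("start_", 1:k),
                                      paste0("unit_", 1:n)))
print(arrangement)
```

Each row of `arrangement` is one candidate sample, and picking the random start amounts to picking a row.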


 

Steps involved in systematic sampling 

1.      Select a random number between 1 and k.

2.      Call this number 'r'.

3.      Select the unit with serial number r as the first sample unit.

4.      Select the unit k positions after it, that is, the (r + k)th unit.

5.      Repeat this (n-2) more times, moving k units ahead each time, until n units have been chosen.
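The steps above can be traced in a few lines of R. This is only a sketch under assumed values: the population size N = 20, the sample size n = 5 and the seed are hypothetical, and N is taken to be an exact multiple of n.

```r
set.seed(1)                # fixed seed so the sketch is reproducible

N <- 20                    # hypothetical population size
n <- 5                     # desired sample size
k <- N / n                 # sampling interval (assumes N is a multiple of n)

r <- sample.int(k, 1)      # steps 1-2: random start r between 1 and k
selected <- r              # step 3: the first unit has serial number r
for (j in seq_len(n - 1)) {
  # steps 4-5: the j-th further unit lies j * k positions after unit r
  selected <- c(selected, r + j * k)
}
print(selected)
```

The loop is written out to mirror the steps; `seq(from = r, by = k, length.out = n)` should give the same serial numbers in one call.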

 

Types of Systematic Sampling 

·         Linear Systematic Sampling 

·         Circular Systematic Sampling 

 

Where to use Systematic Sampling? 

Systematic sampling is mainly suitable in the following situations:

1.      When there is a budget restriction

2.      When the population is large or when responses are collected from individuals

3.      When the units do not follow a particular pattern

4.      When there should be little or no data manipulation or bias

 

Linear Systematic Sampling 

 

Linear systematic sampling is systematic sampling in which there are k possible samples, each with an equal probability of 1/k of being selected. The first unit of the sample is selected at random from the first k units, while the other (n-1) units are selected systematically using the sampling interval k.
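To make the k possible samples concrete, the R sketch below enumerates them for a small hypothetical population and compares the sample means with the population mean. The values in `y` are made up for illustration, and N is assumed to be an exact multiple of k. Under that assumption, the average of the k equally likely sample means should equal the population mean.

```r
# Hypothetical population values, made up for illustration
y <- c(5, 8, 2, 9, 4, 7, 1, 6, 3, 10, 12, 11)
N <- length(y)
k <- 3                      # sampling interval
n <- N / k                  # sample size (N is an exact multiple of k here)

# Each row is one of the k possible samples: units r, r + k, ..., r + (n - 1) * k
possible_samples <- matrix(y, nrow = k, ncol = n)
sample_means <- rowMeans(possible_samples)

print(sample_means)         # one mean per possible random start
# Averaging over the k equally likely samples gives the population mean
print(mean(sample_means))
print(mean(y))
```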

 

This blog focuses on linear systematic sampling and a real-life application of it.

 

Application of Linear Systematic Sampling 

To understand how systematic sampling works and how to estimate the population parameters, I have used a data set from kaggle.com. The data were collected by a health insurance company and are ordered by client id number. They contain the following information about 1000 clients:

 

·         Age: age of the client 

·         Sex: gender of the client 

·         BMI: body mass index 

·         Children: number of children 

·         Smoker: whether the client is a smoker or not (logical entries) 

·         Region: part of the country they are from 

·         Charges: maximum insurable amount

Since the data do not follow any pattern or classification, it is feasible to apply linear systematic sampling to the data to obtain the required sample. To select the systematic sample, we will use the R programming language.

A researcher would like to select n = 100 sampling units from the population of N = 1000 clients, so the sampling interval is k = N/n = 10.
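The selection could be sketched in R as follows. This is a minimal sketch, not the original code: the data frame `insurance` is only a stand-in that holds an id column for 1000 clients (the real Kaggle file and its file name are not assumed here), and the seed is arbitrary.

```r
set.seed(2048)                        # arbitrary seed, for reproducibility only

# Stand-in for the insurance data: only the id column is created here.
# With the real file, `insurance` would be read in with read.csv() instead.
insurance <- data.frame(id = 1:1000)

N <- nrow(insurance)                  # population size, 1000
n <- 100                              # required sample size
k <- N / n                            # sampling interval, 10

r <- sample.int(k, 1)                 # random start between 1 and k
rows <- seq(from = r, by = k, length.out = n)   # r, r + k, ..., r + (n - 1) * k

systematic_sample <- insurance[rows, , drop = FALSE]
print(head(systematic_sample))
print(nrow(systematic_sample))
```

With the full data set, the mean of a numeric column in `systematic_sample` (for instance the charges column, whatever it is named in the file) would then serve as the estimate of the corresponding population mean.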




 

Thus, we have selected a 1-in-10 systematic sample from the population.

 

Advantages of Systematic Sampling 

·         Convenient and straightforward. 

·         Represents the population well and can be carried out quickly

·         Free from favouritism and personal bias 

·         Involves minimal risk

·         Cost-efficient 

 

Limitations of Systematic Sampling 

·         There are certain cases where sample units have unequal chances of being selected. In addition, some combinations of units can never form the sample, because only the k systematic samples are possible.

·         If, after ordering, the population units follow some pattern, then systematic sampling may not provide us with the best representatives of the data.
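The effect of a pattern in the ordering can be seen in a small R sketch. The population below is hypothetical: its values repeat in a cycle of length 5, and the sampling interval is deliberately set to the same length, which is the worst case.

```r
# Hypothetical population with a cycle of length 5 (values made up for illustration)
y <- rep(c(1, 2, 3, 4, 5), times = 4)   # N = 20
k <- 5                                   # interval equal to the cycle length

# Row r holds the sample that starts at unit r
possible_samples <- matrix(y, nrow = k)
sample_means <- rowMeans(possible_samples)

print(sample_means)   # every sample repeats a single value of the cycle
print(mean(y))        # population mean
```

Here each possible sample contains one value repeated four times, so only the sample that starts at the third unit has a mean matching the population mean.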

 
