DOUBLE SAMPLING FOR NONRESPONSE
INTRODUCTION:
What is a Non-sampling error?
A non-sampling error is a statistical term for an error that arises during data collection and causes the data to differ from the true values, i.e. a difference between an estimate and the population quantity that does not arise solely from the fact that only a sample, instead of the whole population, is observed.
What is a Non-response?
Non-response means failure to obtain a measurement on one or more study variables for one or more elements k selected for the survey. The self-selection of respondents may produce bias, e.g. only people with certain opinions may respond to some questions.
PROCEDURE
Double sampling can be used to adjust for non-response in the form of call-backs. Non-response is an important problem to consider in any survey. We can treat the two groups, respondents and non-respondents, as two strata.
The two steps of double sampling for non-response:
1. Step 1: An initial simple random sample of n' units is selected from a population of N units. These units are classified into two strata, response and non-response:
- n1' of these units respond and form stratum 1
- n2' of these units do not respond and form stratum 2
2. Step 2: Call back n2 units selected by simple random sampling from the n2' non-respondents. Thus, we are in a double sampling setting where n1 = n1' and n2 is the number of call-backs.
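The two steps above can be sketched in a few lines of Python. This is only an illustration: the function name, the `responds` predicate (a stand-in for whether a unit answers the first contact) and the 40% response pattern in the usage lines are hypothetical and not part of the original procedure.

```python
import random

def double_sample_for_nonresponse(population, responds, n_prime, n2, seed=0):
    """Sketch of the two-step selection (hypothetical helper).

    population : list of unit identifiers (N units)
    responds   : function unit -> bool, True if the unit answers the first contact
    n_prime    : size of the initial simple random sample (n')
    n2         : number of call-backs to draw from the non-respondents
    """
    rng = random.Random(seed)
    # Step 1: initial simple random sample of n' units, split into two strata.
    initial = rng.sample(population, n_prime)
    stratum1 = [u for u in initial if responds(u)]      # n1' respondents
    stratum2 = [u for u in initial if not responds(u)]  # n2' non-respondents
    # Step 2: simple random sample of call-backs from the non-respondents.
    callbacks = rng.sample(stratum2, min(n2, len(stratum2)))
    return stratum1, stratum2, callbacks

# Usage with a made-up response pattern (roughly 40% of units respond).
population = list(range(1, 1001))
s1, s2, cb = double_sample_for_nonresponse(
    population, lambda u: u % 5 < 2, n_prime=106, n2=20
)
print(len(s1), len(s2), len(cb))
```

All respondents from step 1 are kept (n1 = n1'), while only a sub-sample of the non-respondents is followed up.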
APPLICATION
In surveys, double sampling is applied to non-response through call-backs. It is also widely used in studies of forests and forest ecosystems. It yields an estimate of the mean and of the variance of that estimate.
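The estimate for the mean weights the sample mean of each stratum by that stratum's share of the initial sample (n1'/n' and n2'/n').
The estimated variance of this estimate combines the sample variances within each stratum with the spread of the stratum means around the overall estimate.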
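One common textbook form of these two estimators for double sampling with stratification, written in the notation above, is sketched below. The symbols w_h, \bar{y}_h and s_h^2 (first-phase stratum weight, second-phase sample mean and sample variance in stratum h) are introduced here for illustration, and the variance expression is the commonly cited estimator for this design, which may differ in detail from the one originally shown.

```latex
\bar{y}_d = \sum_{h=1}^{2} w_h \,\bar{y}_h, \qquad w_h = \frac{n_h'}{n'}

\widehat{\mathrm{Var}}(\bar{y}_d)
  = \frac{N-1}{N} \sum_{h=1}^{2}
      \left( \frac{n_h'-1}{n'-1} - \frac{n_h-1}{N-1} \right)
      \frac{w_h s_h^2}{n_h}
  + \frac{N-n'}{N\,(n'-1)} \sum_{h=1}^{2} w_h \left( \bar{y}_h - \bar{y}_d \right)^2
```

With non-response, stratum 1 is fully observed (n_1 = n_1'), so only the call-back stratum is sub-sampled.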
Example: Time spent studying
In a college with 1000 students, a
questionnaire is mailed to a simple random sample of 106 students asking them
about the amount of time they spend per week studying. Out of these students,
46 respond. From the 60 non-respondents, a simple random sample of 20 is
selected and intensive efforts are made by telephone and personal visit to
obtain responses.
These are the data obtained. Here "Questionnaire" refers to the students who responded to the questionnaire, and "Telephone and Visit" refers to the students who were contacted and responded by telephone or visit.
- Questionnaire: sample mean 20.5 hours, sample SD 6.2 hours
- Telephone and Visit: sample mean 10.9 hours, sample SD 5.1 hours
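As a rough check of the arithmetic, the study-time figures can be plugged into the stratified double-sampling estimators. The sketch below assumes that all 20 call-backs produced a response (n2 = 20) and uses one commonly cited variance estimator for double sampling with stratification; the function name and argument layout are hypothetical.

```python
def double_sampling_estimate(N, n_prime, strata):
    """Mean and variance estimate for double sampling with stratification.

    strata: list of (n_h_prime, n_h, ybar_h, s_h) tuples, one per stratum, where
    n_h_prime is the first-phase count, n_h the number actually measured,
    ybar_h the sample mean and s_h the sample standard deviation.
    """
    weights = [nhp / n_prime for nhp, _, _, _ in strata]  # w_h = n_h'/n'
    mean = sum(w * ybar for w, (_, _, ybar, _) in zip(weights, strata))
    # Within-stratum contribution.
    within = sum(
        ((nhp - 1) / (n_prime - 1) - (nh - 1) / (N - 1)) * w * s ** 2 / nh
        for w, (nhp, nh, _, s) in zip(weights, strata)
    )
    # Contribution from the spread of stratum means around the overall mean.
    between = sum(
        w * (ybar - mean) ** 2 for w, (_, _, ybar, _) in zip(weights, strata)
    )
    var = (N - 1) / N * within + (N - n_prime) / (N * (n_prime - 1)) * between
    return mean, var

# Stratum 1: 46 questionnaire respondents; stratum 2: 60 non-respondents,
# of whom 20 were reached by telephone or visit (assumed all responded).
mean, var = double_sampling_estimate(
    N=1000, n_prime=106,
    strata=[(46, 46, 20.5, 6.2), (60, 20, 10.9, 5.1)],
)
print(round(mean, 2))  # 15.07
print(var, var ** 0.5)  # estimated variance and standard error
```

The estimate of about 15.07 hours sits well below the 20.5 hours reported by the questionnaire respondents alone, which illustrates the bias that ignoring non-respondents would introduce.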
Selecting the Number of Call Backs
- c0: the initial cost of sampling each respondent (the set-up cost for each respondent)
- c1: the cost of a standard response (cost of producing the response)
- c2: the cost of a call-back response
Total cost = (n' × c0) + (n1' × c1) + (n2 × c2)
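The cost model can be written as a small helper, taking the total cost to be n'·c0 + n1'·c1 + n2·c2. The second function is an assumption added for planning purposes: before sampling, n1' and n2 are unknown, so it replaces them with their expected values n'·W1 and n'·W2/k, where W1 and W2 = 1 − W1 are the anticipated response and non-response proportions and k = n2'/n2. The function names and the numbers in the usage line are illustrative only.

```python
def total_cost(n_prime, n1_prime, n2, c0, c1, c2):
    """Realized cost: set-up for every sampled unit, plus standard
    responses, plus call-back responses."""
    return n_prime * c0 + n1_prime * c1 + n2 * c2

def expected_cost(n_prime, w1, k, c0, c1, c2):
    """Expected cost before sampling (assumed planning model):
    E[n1'] = n' * W1 and E[n2] = n' * W2 / k."""
    w2 = 1.0 - w1
    return n_prime * (c0 + c1 * w1 + c2 * w2 / k)

# Illustrative numbers only: 106 sampled, 46 respondents, 20 call-backs,
# with costs c0 = 0, c1 = 1, c2 = 4.
print(total_cost(106, 46, 20, c0=0, c1=1, c2=4))  # 126
```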
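We want to determine the value of k (k > 1), where n2 = n2'/k, so that one call-back is made for every k non-respondents. The variance of the estimated mean can then be derived as a function of k and n', and one can find the values of k and n' that minimize the expected cost of sampling for a desired fixed value of this variance, which we denote as V0.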
When N is large, the optimal values of k and n' depend on the costs, the expected response rate, V0, σ² (the variance of the entire population) and σ₂² (the variance of the non-response group).
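A sketch of how these optimal values arise, following the usual textbook treatment of sub-sampling non-respondents. W_1 and W_2 = 1 - W_1 (the population proportions of respondents and non-respondents) are symbols introduced here, and the variance and expected-cost expressions are the standard assumed forms rather than formulas taken from the text above.

```latex
% Variance of the estimated mean and expected cost
\mathrm{Var}(\bar{y}_d) = \left( \frac{1}{n'} - \frac{1}{N} \right) \sigma^2
                        + \frac{(k-1)\, W_2}{n'}\, \sigma_2^2,
\qquad
E[C] = n' \left( c_0 + c_1 W_1 + \frac{c_2 W_2}{k} \right)

% Setting Var = V_0 and solving for n'
n' = \frac{N \left( \sigma^2 + (k-1)\, W_2\, \sigma_2^2 \right)}{N V_0 + \sigma^2}

% Substituting n' into E[C] and minimizing over k
k = \sqrt{ \frac{ c_2 \left( \sigma^2 - W_2\, \sigma_2^2 \right) }
                { \sigma_2^2 \left( c_0 + c_1 W_1 \right) } }
```

The optimal k grows with the relative cost of a call-back (c_2) and shrinks as the non-response group becomes more variable, which matches intuition: expensive call-backs are used sparingly unless the non-respondents are heterogeneous.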
Example: Weekly living expenditures
In a college of 1000 students, we want to find out students' average weekly living expenditure. The response rate is anticipated to be about 60%. It is thought that the response group has a higher variance than the non-response group. The overall variance σ² is about 120 and the variance of the non-response group σ₂² is about 80, with c0 = 0, c1 = 1, c2 = 4.
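The numbers in this example can be turned into a call-back fraction with a short calculation. The sketch assumes the standard large-N formulas k = sqrt(c2(σ² − W2σ₂²) / (σ₂²(c0 + c1W1))) and n' = N(σ² + (k − 1)W2σ₂²) / (N·V0 + σ²), with W1 = 0.6 and W2 = 0.4 taken from the anticipated response rate. The function names are hypothetical, and because the example does not state a target variance, the value V0 = 1.0 used for n' is purely a placeholder.

```python
import math

def optimal_k(sigma2, sigma2_nonresp, w1, c0, c1, c2):
    """Optimal ratio k = n2'/n2 under the assumed large-N formula."""
    w2 = 1.0 - w1
    return math.sqrt(
        c2 * (sigma2 - w2 * sigma2_nonresp)
        / (sigma2_nonresp * (c0 + c1 * w1))
    )

def optimal_n_prime(N, sigma2, sigma2_nonresp, w1, k, v0):
    """Initial sample size n' that achieves variance v0 for a given k."""
    w2 = 1.0 - w1
    return N * (sigma2 + (k - 1.0) * w2 * sigma2_nonresp) / (N * v0 + sigma2)

k = optimal_k(sigma2=120, sigma2_nonresp=80, w1=0.6, c0=0, c1=1, c2=4)
print(round(k, 3))  # 2.708

# V0 is not given in the example; 1.0 is a hypothetical target variance.
n_prime = optimal_n_prime(N=1000, sigma2=120, sigma2_nonresp=80,
                          w1=0.6, k=k, v0=1.0)
print(n_prime)
```

Under these assumptions, k of roughly 2.7 means calling back about one in every 2.7 non-respondents.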
Disadvantages of Non-Response
- It can invalidate the results of an investigation or research.
- It may result in higher variances for the estimates, since the sample size the researcher ends up with is smaller than expected.
- It may lead to inconclusive research.