PPS SYSTEMATIC SAMPLING: HARTLEY-RAO ESTIMATOR
CENSUS 2001 & 2011 (LITERACY RATES)
SHIVANI AJITH
INTRODUCTION: A sampling scheme with replacement in which each sampling unit has an unequal probability of selection, the probability being proportional to the size of an auxiliary variable associated with the unit, is called a probability proportional to size with replacement (PPSWR) sampling scheme.
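The definition above can be illustrated with a minimal Python sketch of a PPSWR draw. The function name `ppswr_sample` and the size values are hypothetical, chosen only for illustration; they are not taken from the census data.

```python
import random

def ppswr_sample(sizes, n, seed=None):
    """Draw n unit indices with replacement.

    On every draw, unit i is selected with probability sizes[i] / sum(sizes).
    Returns the drawn indices and the per-draw selection probabilities.
    """
    rng = random.Random(seed)
    total = sum(sizes)
    probs = [x / total for x in sizes]
    # random.choices samples with replacement using the given weights
    draws = rng.choices(range(len(sizes)), weights=sizes, k=n)
    return draws, probs

# Hypothetical auxiliary sizes for four units (illustration only)
sizes = [10, 20, 30, 40]
draws, probs = ppswr_sample(sizes, n=3, seed=1)
print(draws, probs)
```

Because sampling is with replacement, the same unit can appear more than once in `draws`.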
PPS systematic sampling has the great advantage that it is easy to implement. It is also a type of so-called πps sampling, i.e., a unit's inclusion probability πi is proportional to its size. Like simple systematic sampling, the PPS version has the disadvantage that there is no unbiased variance estimator for it, which is why an approximate variance estimator such as the Hartley-Rao estimator is used. Note that in PPS sampling with replacement (ppswr), the probability of selecting a given unit on any given draw is proportional to its size, but the overall inclusion probability is not, i.e., it is not a πps sampling scheme (unless the sample size is one).
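The contrast between the two schemes can be sketched in Python. This is an illustrative sketch, not the code used for the analysis: the function names and the four sizes are hypothetical, and the selection routine assumes that n times any unit's size does not exceed the total size, so that no unit can be selected twice.

```python
import random
from itertools import accumulate

def pps_systematic(sizes, n, start=None, seed=None):
    """Select n unit indices by PPS systematic sampling.

    Assumes n * sizes[i] <= sum(sizes) for every unit (no repeated selection).
    """
    total = sum(sizes)
    step = total / n                      # sampling interval
    if start is None:
        start = random.Random(seed).random() * step   # random start in [0, step)
    cum = list(accumulate(sizes))         # cumulative sizes
    sample, i = [], 0
    for k in range(n):
        point = start + k * step
        # unit i covers the interval [cum[i-1], cum[i])
        while i < len(cum) - 1 and cum[i] <= point:
            i += 1
        sample.append(i)
    return sample

def pips_inclusion_probs(sizes, n):
    """Inclusion probabilities under PPS systematic sampling: n * x_i / X."""
    total = sum(sizes)
    return [n * x / total for x in sizes]

def ppswr_inclusion_probs(sizes, n):
    """Probability that a unit appears at least once in n PPSWR draws."""
    total = sum(sizes)
    return [1 - (1 - x / total) ** n for x in sizes]

sizes = [10, 20, 30, 40]                  # hypothetical sizes
print(pps_systematic(sizes, n=2, start=5))
print(pips_inclusion_probs(sizes, 2))     # proportional to size
print(ppswr_inclusion_probs(sizes, 2))    # not proportional to size
```

For n = 1 the two sets of inclusion probabilities coincide, which matches the remark that ppswr is a πps scheme only when the sample size is one.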
FORMULA FOR HARTLEY-RAO ESTIMATOR:
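\[ \hat{V}_{HR} = \frac{1}{2 N^{2} (n-1)} \sum_{i} \sum_{j} \left(1 - \pi_i - \pi_j + \frac{1}{n} \sum_{k} \pi_k^{2}\right) \left(\frac{y_i}{\pi_i} - \frac{y_j}{\pi_j}\right)^{2} \]
Here n is the sample size, N is the population size, y_i is the observed value of unit i and π_i is its inclusion probability; the double sum runs over the sampled units.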
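A direct Python translation of the formula may help in reading it. This is a sketch under stated assumptions, not the R code used for the results: the function name is hypothetical, the double sum is taken over all ordered pairs of sampled units (hence the factor 1/2), and the sum of squared inclusion probabilities is passed in as a single number, assumed here to be taken over the whole population.

```python
def hartley_rao_variance(y, pi, sum_pi_sq, N):
    """Hartley-Rao variance estimate as written in the formula above.

    y         : observed values for the sampled units
    pi        : inclusion probabilities of the sampled units
    sum_pi_sq : sum of squared inclusion probabilities
                (ASSUMPTION: taken over the whole population)
    N         : population size
    """
    n = len(y)
    total = 0.0
    for i in range(n):
        for j in range(n):
            weight = 1 - pi[i] - pi[j] + sum_pi_sq / n
            # the i == j terms vanish because the difference is zero
            total += weight * (y[i] / pi[i] - y[j] / pi[j]) ** 2
    return total / (2 * N ** 2 * (n - 1))

# Toy check with equal probabilities: N = 4, n = 2, pi = n / N = 0.5
v = hartley_rao_variance(y=[1, 3], pi=[0.5, 0.5], sum_pi_sq=4 * 0.5 ** 2, N=4)
print(v)  # 0.5
```

With equal inclusion probabilities π_i = n/N the expression reduces to the familiar simple random sampling result (1 - n/N) s²/n, which for the toy sample above is (1 - 0.5) × 2 / 2 = 0.5.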
APPLICATION TO A REAL-LIFE EXAMPLE IN R:
CONTEXT
This is the official dataset released by the Government of India, based on the 2001 and 2011 censuses.
CONTENT
The data cover 35 Indian states and union territories.
The literacy rate is reported for three major categories: Overall, Rural and Urban.
All values are percentages of the total population of the state.
About the dataset
The CSV file contains data from the Government of India website on the literacy rates of the 35 states and union territories.
There are 3 key fields: overall literacy rate, Category, and Name of the State/Union Territory.
ANALYSIS
INTERPRETATION-
1) The selected samples are
i)1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1 for the literacy rate of states
ii)2,2,2,2,2,2,2 for the literacy rates of Union Territories
2) The estimate of the total literacy rate for 2011 obtained using the Hartley-Rao estimator is 3066.689, with a standard error of 32.4589, so a one-standard-error interval around the estimate is 3066.689 ± 32.4589.
3) The bias of the estimate of the total literacy rate for 2011 is 342.0893.
CONCLUSION:
Under mild conditions, the variance estimator remains nearly unbiased for a large sample drawn from a relatively larger population. The estimated standard error of the Hartley-Rao estimate of the total literacy rate in 2011 is 32.4589. The smaller the standard error, the more precise the estimate.
Here the bound on the error of estimation, B = 2*SE ≈ 64.918, indicates that the error of the Hartley-Rao estimate is unlikely to exceed this margin, which gives the interval 3066.689 ± 64.918.
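The arithmetic behind the bound on the error can be checked with a few lines of Python. The estimate and standard error are the values reported in this post; the variable names are illustrative.

```python
estimate = 3066.689   # Hartley-Rao estimate reported above
se = 32.4589          # its reported standard error

bound = 2 * se                                   # bound on the error of estimation
interval = (estimate - bound, estimate + bound)  # estimate +/- B

print(round(bound, 4))                              # 64.9178
print(tuple(round(x, 4) for x in interval))         # (3001.7712, 3131.6068)
```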
Also, the total literacy rates for 2001 and 2011 are positively correlated.
The variance of the estimator given by Hartley and Rao is smaller than that given by Horvitz and Thompson, which shows that the Hartley-Rao estimator is the better of the two here.