Influence of informative sampling on dependence between variables

  • Julia Aru, Statistics Estonia, 15174 Tallinn
Keywords: Covariance matrix, inclusion probabilities, informative sampling, multinomial distribution, multivariate exponential family, multivariate normal distribution


Under informative sampling, the sampling scheme depends explicitly or implicitly on the response variables. As a result, neither the sample distribution of the response variables nor their sample covariance matrix reflects its population counterpart. In this paper, a relationship between the multivariate sample and population distributions is used to investigate the influence of informative sampling on the covariance matrix. It is shown that when the inclusion probabilities have a multiplicative form with respect to the study variables, independence between the variables is preserved in the sample. Further, it is shown that when the inclusion probabilities depend exponentially on the study variables, the multivariate exponential family is invariant under sampling: the sample distribution belongs to the same family as the population distribution, but with different parameters. The relationship between the parameters is given. The multinomial and multivariate normal distributions are examined in more detail, and the parameters of their sample distributions are derived explicitly. The effect of informative sampling on the respective covariance matrices and correlations is analysed and illustrated with examples.
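
The claim about multiplicative inclusion probabilities can be checked numerically on a small discrete example. The sketch below is illustrative only and is not taken from the paper: it assumes the commonly used relationship in which the sample pmf is the population pmf weighted by the inclusion probability and renormalised, f_s(y) = pi(y) f_p(y) / E_p[pi(Y)]. The marginals `p1` and `p2` and the two inclusion-probability functions are hypothetical values chosen for the illustration.

```python
from itertools import product


def sample_pmf(pop_pmf, incl_prob):
    """Weight the population pmf by the inclusion probability and renormalise.

    Assumes f_s(y) = pi(y) * f_p(y) / E_p[pi(Y)].
    """
    weighted = {y: incl_prob(*y) * p for y, p in pop_pmf.items()}
    total = sum(weighted.values())  # E_p[pi(Y)]
    return {y: w / total for y, w in weighted.items()}


def covariance(pmf):
    """Covariance of (Y1, Y2) under a bivariate pmf given as {(y1, y2): prob}."""
    mean1 = sum(y[0] * p for y, p in pmf.items())
    mean2 = sum(y[1] * p for y, p in pmf.items())
    return sum((y[0] - mean1) * (y[1] - mean2) * p for y, p in pmf.items())


# Hypothetical population: Y1 and Y2 independent, so the joint pmf factorises.
p1 = {0: 0.5, 1: 0.3, 2: 0.2}
p2 = {0: 0.6, 1: 0.4}
population = {(a, b): p1[a] * p2[b] for a, b in product(p1, p2)}


def multiplicative(y1, y2):
    # Inclusion probability that factorises into a Y1 part and a Y2 part.
    return 0.1 * (1 + y1) * 0.2 * (1 + y2)


def additive(y1, y2):
    # Inclusion probability that does not factorise (for contrast).
    return 0.1 + 0.1 * y1 + 0.2 * y2


cov_pop = covariance(population)
cov_mult = covariance(sample_pmf(population, multiplicative))
cov_add = covariance(sample_pmf(population, additive))

print(f"population covariance:           {cov_pop:.6f}")
print(f"sample covariance (product pi):  {cov_mult:.6f}")
print(f"sample covariance (additive pi): {cov_add:.6f}")
```

With the product-form inclusion probability the weighted pmf still factorises, so the sample covariance remains zero up to rounding error. The additive form breaks the factorisation and induces a nonzero covariance in the sample even though the variables are independent in the population.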

