Question
Your colleague Patricia is conducting a regression analysis based on a large sample (N > 30) from her bank's customer database. The dependent variable is the customer's FICO credit score. The independent variable is an internal composite score based on the customer's education level and other factors. Her classical linear regression model (CLRM) is therefore given by FICO(i) = β(0) + β(1)×SCORE(i) + u(i). She generates her sample regression function (SRF) from a large random sample of size N, where the population presumably has a mean, μ, and finite variance. For example, her sample dependent values are FICO(1), FICO(2), FICO(3), ..., FICO(N). We can assume her random selections are independently and identically distributed. Each of the following statements is true EXCEPT which one?
Options
A. According to the central limit theorem (CLT), as the sample size increases, the sample average of the FICO scores will itself tend to follow a normal distribution.
B. According to the CLT, the intercept, β(0), and slope, β(1), estimators in her regression should follow an approximately normal distribution.
C. If the other assumptions of CLRM are valid, including that the error term has a conditional mean of zero and constant variance, then the error terms are approximately normal.
D. If the FICO score--which happens to be the dependent variable in the regression--is positively or negatively skewed, then the distribution of its own sample mean will be skewed even for a large sample; and further, this will violate a CLRM assumption if we regress it against the internal composite score.
Answer
D
Explanation
The CLRM makes no assumption about the distribution of the dependent variable, and the CLT (under which the sample mean is approximately normal for a large sample) is indifferent to the shape of the underlying distribution. Statement (D) is therefore false on both counts: a skewed FICO score still has an approximately normal sample mean when N is large, and its skew does not violate any CLRM assumption. In regard to (A), (B) and (C), each is TRUE. Note that the CLT informs not only the sample mean of the FICO score, in terms of its own univariate distribution, but also the approximate normality of each of the coefficient estimators and of the error term in the regression model.
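The claim that a skewed dependent variable still gives an approximately normal sample mean and slope estimator can be checked with a small simulation. The sketch below is illustrative only: the function names, the intercept of 600, the slope of 1.5, the uniform SCORE range and the shifted-exponential error are all hypothetical choices made for this example. They do not come from the question and are not real FICO data. It measures skewness (which is near zero for a normal distribution) of the raw simulated FICO values, of their sample means, and of the OLS slope estimates across many repeated samples.

```python
import numpy as np


def skewness(x):
    """Sample skewness: third central moment over the cubed standard deviation."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return float((centered ** 3).mean() / (centered ** 2).mean() ** 1.5)


def simulate(n_obs=200, n_trials=2000, seed=0):
    """Draw n_trials samples of size n_obs and summarize the skewness.

    All numeric values here are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical internal composite score, one row per repeated sample.
    score = rng.uniform(0.0, 100.0, size=(n_trials, n_obs))

    # Right-skewed, mean-zero error term (an exponential shifted by its mean).
    u = rng.exponential(scale=40.0, size=(n_trials, n_obs)) - 40.0

    # Hypothetical population regression line: FICO = 600 + 1.5 * SCORE + u.
    fico = 600.0 + 1.5 * score + u

    # Sample mean of FICO in each repeated sample.
    means = fico.mean(axis=1)

    # OLS slope in each repeated sample: cov(SCORE, FICO) / var(SCORE).
    score_c = score - score.mean(axis=1, keepdims=True)
    fico_c = fico - fico.mean(axis=1, keepdims=True)
    slopes = (score_c * fico_c).sum(axis=1) / (score_c ** 2).sum(axis=1)

    return {
        "raw_skew": skewness(fico.ravel()),
        "mean_skew": skewness(means),
        "slope_skew": skewness(slopes),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the raw values should show clear positive skew, while the skewness of the sample means and of the slope estimates should sit close to zero, which is consistent with (A) and (B) being true and (D) being false.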