* When I was first learning about t-tests and F-tests, I was told that a t-test estimates the probability of falsely rejecting the null for a single coefficient,
* while the F-test estimates the probability of falsely rejecting the joint null that all of the slope coefficients are zero (that is, that the model explains nothing).
* It was also stated that the t-tests can sometimes fail to reject the null when the F-test rejects, primarily as a result of correlation among the explanatory variables.
* All of these things I believe are well understood.
* However, what I always wanted to know was, if the t-test rejected did that mean that the f-test must also reject?
* This seems intuitively to me to be true.
* That is, if one part of the model seems to be statistically significant, mustn't the whole model be statistically significant?
* Now thinking back on the question, I think the answer must be no.
* Assuming we are testing at the 10% level, I argue this is because if the F-test's assumptions are met then it should falsely reject the null 10% of the time.
* Likewise, if a t-test's assumptions are met it should falsely reject the null 10% of the time.
* However, if we are estimating two ts and they are independent, then the probability that at least one of them rejects the null at the 10% level is 1-(1-.10)^2=19%.
* Since the F-test rejects only 10% of the time, there must therefore be cases in which one or more t-stats reject but the F-stat fails to reject.
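* A quick arithmetic check of the claim above (no simulation needed; this just evaluates the formula):
di "P(at least one of two independent 10% t-tests rejects) = " 1 - (1-.10)^2
* Since this exceeds the F-test's 10% rejection rate, the two kinds of tests cannot always agree.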
* Let's see if we can see this in action through simulation.
cap program drop ft_test
program define ft_test, rclass
clear
set obs 1000
gen x1=rnormal()
gen x2=rnormal()
gen y=rnormal()
reg y x?
* Calculate the p-stats for the individual coefficients.
* We multiply by 2 because ttail is one-sided and we are interested in the two-sided alternative.
return scalar pt1 = 2 * ttail(e(df_r), abs(_b[x1]/_se[x1]))
return scalar pt2 = 2 * ttail(e(df_r), abs(_b[x2]/_se[x2]))
* We also want to extract the F stat
return scalar pF = Ftail(e(df_m),e(df_r),e(F))
end
ft_test
ft_test
ft_test
* Running the regression a few times, I can easily see that the t-tests' p-values diverge from the F-test's p-value fairly frequently.
simulate pt1=r(pt1) pt2=r(pt2) pF=r(pF), reps(1000): ft_test
gen rt1 = (pt1<=.1)
gen rt2 = (pt2<=.1)
gen rF = (pF<=.1)
sum r*
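* As an added check (not in the original analysis), cross-tabulating the rejection indicators shows directly how often the t and F tests agree or disagree:
tab rt1 rF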
* All of the tests seem to be rejecting at the correct rate.
* It might be the case that we always reject the null for the f if the rejection of the null for the t-tests are correlated.
pwcorr rt1 rt2, sig
* There does not appear to be any correlation between the two t-tests' rejections.
* By now we should already know the answer to the question.
* But let's check directly.
gen rtF = 0 if rt1 | rt2
replace rtF = 1 if rF == 1 & rtF == 0
sum rtF
* Thus the probability of rejecting the F null, given that we have rejected at least one of the t nulls, is only a little above 50%.
* It does make sense that the F and t rejections would be correlated.
* That is, when the individual coefficients seem to be explaining the unknown variance then overall the model seems to be working relatively well.
pwcorr rF rt1 rt2, sig
* There is one more thing to check: how frequently do we reject the null for the F but not for either of the ts?
gen rFt = 0 if rF
* Note the parentheses: in Stata, & binds more tightly than |, so without them rt1 alone would trigger the replace even where rFt is missing.
replace rFt = 1 if (rt1 | rt2) & rFt == 0
sum rFt
* In this simulation, we only reject the F-stat when at least one of the t-stats rejects.
* We could therefore argue that the F-stat is a more conservative test than the t-stats.
* However, I do not believe this to be entirely the case.
* As mentioned before, I think it is possible for the t-stats to fail to reject while the F-stat rejects when the explanatory variables are correlated.
* Let's see if we can simulate this.
cap program drop ft_test2
program define ft_test2, rclass
clear
set obs 1000
gen x1=rnormal()
gen x2=rnormal()+x1*3
* This will cause x1 and x2 to be strongly correlated.
gen y=rnormal()
reg y x?
* Calculate the p-stats for the individual coefficients.
* We multiply by 2 because ttail is one-sided and we are interested in the two-sided alternative.
return scalar pt1 = 2 * ttail(e(df_r), abs(_b[x1]/_se[x1]))
return scalar pt2 = 2 * ttail(e(df_r), abs(_b[x2]/_se[x2]))
* We also want to extract the F stat
return scalar pF = Ftail(e(df_m),e(df_r),e(F))
end
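* As an aside, we can work out the correlation this design implies. With x2 = 3*x1 + e, where x1 and e are independent standard normals, cov(x1,x2) = 3 and var(x2) = 9 + 1 = 10, so:
di "Implied corr(x1,x2) = " 3/sqrt(10)
* which is roughly .95, i.e. x1 and x2 are very strongly correlated.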
simulate pt1=r(pt1) pt2=r(pt2) pF=r(pF), reps(1000): ft_test2
* Same analysis as previously
gen rt1 = (pt1<=.1)
gen rt2 = (pt2<=.1)
gen rF = (pF<=.1)
sum r*
pwcorr rt1 rt2, sig
* The rejections of the two t-tests are highly correlated.
gen rtF = 0 if rt1 | rt2
replace rtF = 1 if rF == 1 & rtF == 0
sum rtF
* Under this setup, when at least one of the ts rejects, the F rejects the null only about 45% of the time.
pwcorr rF rt1 rt2, sig
* We can see that the correlations between the F rejection and the individual t rejections are still very strong.
* There is one more thing to check. How frequently do we reject the null for the F but not for either of the ts?
gen rFt = 0 if rF
* Again, the parentheses matter: & binds more tightly than | in Stata.
replace rFt = 1 if (rt1 | rt2) & rFt == 0
sum rFt
* Now we can see the result discussed previously: about 25% of the time the F-stat rejects the null even though neither t-stat does.
* Thus it may be informative to use an F-stat to check model fit even when the t-stats do not suggest statistical significance of the individual components.
* The ultimate result of this simulation is to emphasize for me the need to do tests of model fit.
* If I were to look only at the t-tests in this example, then I would falsely conclude that the model fits well nearly twice as frequently as if I were to look only at the F-stat.
Francis - I have been trying to leave this comment on your post from Monday, but apparently I am a robot!
Here's the comment:
Francis: you might be interested in the following 2 papers:
The article by Geary and Leser in "The American Statistician" (1968):
http://www.jstor.org/stable/pdfplus/2681875.pdf
and the one by Duchan, in the same journal (1969):
http://www.jstor.org/stable/pdfplus/2682578.pdf
Best,
Dave Giles