I have been slowly working my way through the grad program in stats here, and the latest course was a biostats course on categorical and survival analysis. I noticed in the semi-parametric and parametric material (Wang and Lee is the text) that they use stepwise regression a lot. I learned in econometrics that stepwise is poor practice, as it defaults to the “theory of the regression line”, that is, no theory at all, just the variation in the data. I don’t find the topic on your blog, and wonder if you have addressed the issue.

Stepwise regression is one of these things, like outlier detection and pie charts, which appear to be popular among non-statisticians but are considered by statisticians to be a bit of a joke. For example, Jennifer and I don’t mention stepwise regression in our book, not even once.

To address the issue more directly: the motivation behind stepwise regression is that you have a lot of potential predictors but not enough data to estimate their coefficients in any meaningful way. This sort of problem comes up all the time; here’s an example from my research, a meta-analysis of the effects of incentives in sample surveys. The trouble with stepwise regression is that, at any given step, the model is fit using unconstrained least squares. I prefer methods such as factor analysis or lasso that group or constrain the coefficient estimates in some way.

This entry was posted in Bayesian Statistics, Decision Theory, Teaching by Andrew.

Andrew, I think your original statement was correct. The problem with outlier detection arises when people don’t see the gap between what they conceive an outlier to be (“a bad point”, “a suspicious transaction”, etc.) and what outlier tests do: determine how likely a point is with respect to a particular model. People who stop at detection are implicitly acknowledging that their model may be useful but is not well-motivated. But “outlier” is still a loaded term and one of the more dangerous ones in statistics, and “outlier test” without further explanation should raise a red flag in your mind. That’s why you as a statistician can tolerate that approach: who can argue with a kind of tripwire that simply says, “You might want to look more closely at this”?
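
To make the point about unconstrained least squares concrete, here is a minimal sketch, not anything from the post itself. It assumes simulated data in which the outcome is pure noise, unrelated to any predictor, and it uses hypothetical helper names (`forward_step`, `lasso_cd`, `noise_data`). One forward-stepwise step always selects some predictor and gives it its full least-squares coefficient, while a lasso fit by coordinate descent shrinks coefficients toward zero and, with a large enough penalty, sets all of them to exactly zero.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become 0.0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def forward_step(cols, y):
    """One forward-stepwise step (no intercept): choose the predictor that
    most reduces the residual sum of squares, fit by unconstrained OLS."""
    best_j, best_b, best_gain = -1, 0.0, -1.0
    for j, xj in enumerate(cols):
        xy = dot(xj, y)
        bj = xy / dot(xj, xj)   # univariate least-squares coefficient
        gain = bj * xy          # reduction in RSS from adding predictor j
        if gain > best_gain:
            best_j, best_b, best_gain = j, bj, gain
    return best_j, best_b


def lasso_cd(cols, y, lam, n_iter=100):
    """Coordinate descent for (1/(2n)) * ||y - Xb||^2 + lam * ||b||_1."""
    n, p = len(y), len(cols)
    xx_n = [dot(xj, xj) / n for xj in cols]
    b = [0.0] * p
    resid = list(y)             # residual y - Xb; b starts at zero
    for _ in range(n_iter):
        for j, xj in enumerate(cols):
            # Covariance of x_j with the residual that excludes j's own term.
            rho = dot(xj, resid) / n + b[j] * xx_n[j]
            new_bj = soft_threshold(rho, lam) / xx_n[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                resid = [r - delta * x for r, x in zip(resid, xj)]
                b[j] = new_bj
    return b


def noise_data(n, p, seed):
    """Assumed setup: p Gaussian predictors, outcome independent of all."""
    rng = random.Random(seed)
    cols = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return cols, y


cols, y = noise_data(n=40, p=15, seed=1)

j, b_ols = forward_step(cols, y)
print("stepwise picks predictor", j, "with coefficient", b_ols)

# Smallest penalty at which the all-zero vector solves the lasso problem.
lam_max = max(abs(dot(xj, y)) for xj in cols) / len(y)
print("lasso at lam_max:", lasso_cd(cols, y, lam_max))
print("lasso at lam_max / 2:", lasso_cd(cols, y, 0.5 * lam_max))
```

The penalty `lam` is the knob that stepwise selection lacks: the predictor that stepwise selects keeps its full least-squares estimate even though, by construction, nothing here is related to the outcome. In practice the penalty would be chosen by cross-validation or a similar procedure, which this sketch omits.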