A statistics question about R-squared in linear regression

Visiting · March 28, 2020, 1:53pm

I developed a simple linear regression model with a sample size of 12,000. The coefficient is significant, the outcome variable is normally distributed, and the residual diagnostics for the model look good. But the R-squared is only 0.002, which is very low. Can this model be used?
With a large sample it is quite easy to obtain a significant p-value. Is there any other measure I can use to evaluate how good the analysis is? Thank you.
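
To illustrate how sample size alone drives significance, here is a minimal sketch that converts an R-squared into an approximate p-value for the slope of a simple linear regression. It relies on the identity t^2 = R^2 (n - 2) / (1 - R^2) and, as a simplifying assumption, replaces the t distribution with a normal one, which is only reasonable for large n. The function name `slope_p_value` is hypothetical.

```python
import math


def slope_p_value(r_squared: float, n: int) -> float:
    """Approximate two-sided p-value for the slope in a simple linear regression.

    Uses t^2 = R^2 * (n - 2) / (1 - R^2). The t distribution is approximated
    by a standard normal (an assumption, adequate only for large n).
    """
    t_stat = math.sqrt(r_squared * (n - 2) / (1.0 - r_squared))
    # Two-sided tail probability of a standard normal at |t|.
    return math.erfc(t_stat / math.sqrt(2.0))


# Same tiny R-squared, different sample sizes.
for n in (100, 1_000, 12_000):
    print(f"n={n:>6}  R^2=0.002  p~{slope_p_value(0.002, n):.2g}")
```

With R-squared held at 0.002, the slope is nowhere near significant at n = 100 but falls well below 0.001 at n = 12,000, so the p-value here mostly reflects the sample size rather than the strength of the relationship.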

FJCC · March 28, 2020, 2:49pm

I suggest you think about whether the model should be used rather than whether it can be used. You do not give any information about the subject of the model or the purpose of the study, so I can only answer very generally. The R^2 says you can control or predict only 0.2% of the variance in the outcome. Is that a meaningful change in this field or application? Let's say you are predicting a cost on the order of $100,000 with a variance of 10,000 (in squared dollars, so a standard deviation of $100). The model accounts for about 20 of that 10,000. I cannot think of a field where that sort of change would be considered worth trying to control, but maybe there is one.
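
The arithmetic in the cost example can be checked directly. This is a minimal sketch, not part of any library: the function name `variance_breakdown` is hypothetical, and it assumes the variance of 10,000 is in squared dollars, which makes the standard deviation $100.

```python
import math


def variance_breakdown(total_variance: float, r_squared: float) -> dict:
    """Split the outcome variance into explained and residual parts."""
    explained = r_squared * total_variance
    residual = total_variance - explained
    return {
        "explained_variance": explained,
        "residual_variance": residual,
        "sd_before": math.sqrt(total_variance),  # spread with no model
        "sd_after": math.sqrt(residual),         # spread around the fitted line
    }


# Numbers from the example: variance of 10,000 and R^2 of 0.002.
result = variance_breakdown(total_variance=10_000, r_squared=0.002)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

On the standard-deviation scale the typical prediction error shrinks from $100 to about $99.90, which is another way of seeing how little the model changes.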

Generally, before worrying too much about "is this result true", it can be helpful to think about "does it matter if it is true". With such a small effect, I would guess that it is not a practically significant result but only you can make that judgement.

By the way, I would avoid saying a coefficient is significant without stating at what level. The traditional p < 0.05 threshold for significance has no justification beyond popularity and is, in my opinion, often far too generous.

Visiting · March 28, 2020, 3:06pm

Thank you!
By the way, p < 0.05 was the significance threshold; the actual p-value is less than 0.001.
