Hi! I'm new to RStudio and having a problem with the lm function.
I created a new variable in my dataframe:
df <- df %>% mutate(logweekpay = log(weekpay))
But when I try to regress it on the female variable with:
lm(logweekpay ~ female, data = df)
I get: Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
NA/NaN/Inf in 'y'
I tried regressing df$weekpay on df$female and it worked just fine. Can somebody help me figure out the problem?
typeof(df$female)
[1] "integer"
It depends what you mean by "make it work." The fundamental problem is that weekpay contains zeros, and log(0) is -Inf, which lm cannot accept as a response value. Data with zeros in it can't follow a log model, so the model is wrong for these data.
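A quick way to check this is sketched below. The toy data frame is invented for illustration; only the column names weekpay and female come from the question.

```r
# Toy data standing in for the asker's df (values are invented)
df <- data.frame(
  weekpay = c(0, 250, 400, 0, 600, 320),
  female  = c(1L, 0L, 1L, 0L, 0L, 1L)
)

# log(0) evaluates to -Inf, which lm() rejects in the response
log_of_zero <- log(0)

# Count the zero-pay rows and the rows whose log is not finite
n_zero <- sum(df$weekpay == 0, na.rm = TRUE)
n_bad  <- sum(!is.finite(log(df$weekpay)))

print(c(zero_rows = n_zero, non_finite_logs = n_bad))
```

If both counts are above zero on the real data, the zeros are the likely cause of the error.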
If you're willing to change models, you can use the linear version, as you've done. Or you can drop the observations that equal zero. Or you can add a small number to all the observations so that the zeros become positive. Each of these solutions will change the results somewhat.
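As a rough sketch, the three options could look like this. The data are invented, and the constant 1 in the third fit is an arbitrary assumption; a different constant would give different estimates.

```r
# Toy data (invented); column names follow the question
df <- data.frame(
  weekpay = c(0, 250, 400, 0, 600, 320, 510, 280),
  female  = c(1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L)
)

# Option 1: keep the model linear in weekpay
fit_linear <- lm(weekpay ~ female, data = df)

# Option 2: drop the zero-pay rows, then model log pay
fit_drop <- lm(log(weekpay) ~ female, data = subset(df, weekpay > 0))

# Option 3: add a small constant so the zeros become positive
# (the value 1 is an arbitrary choice for illustration)
fit_shift <- lm(log(weekpay + 1) ~ female, data = df)

# Compare the coefficient on female across the three fits
female_coefs <- sapply(
  list(linear = fit_linear, drop = fit_drop, shift = fit_shift),
  function(m) coef(m)[["female"]]
)
print(female_coefs)
```

The coefficients are not directly comparable: the first is in pay units, the other two are on a log scale, and the second describes only people with positive pay.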
Another approach is a two-part model: model whether someone gets any pay at all separately, and model how much the pay is only for those who are working.
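A minimal sketch of that idea follows, assuming a logistic regression for the first part and a log-linear regression for the second. The data and the helper column working are invented for illustration, and other model choices are possible.

```r
# Toy data (invented); column names follow the question
df <- data.frame(
  weekpay = c(0, 250, 400, 0, 600, 320, 510, 280, 0, 450),
  female  = c(1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 1L, 0L)
)

# Hypothetical indicator: 1 if the person has any pay, 0 otherwise
df$working <- as.integer(df$weekpay > 0)

# Part 1: model the probability of getting any pay at all
fit_any <- glm(working ~ female, family = binomial, data = df)

# Part 2: model log pay only among those with positive pay
fit_amount <- lm(log(weekpay) ~ female, data = subset(df, working == 1))

print(coef(fit_any))
print(coef(fit_amount))
```

The first fit describes who is paid at all; the second describes pay levels among those who are paid, so no zeros ever reach the log.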