I forgot to mention: check the homework policy, if applicable.
To debug this type of problem, it helps to look at what there is to work with. First, a digression to clarify my terminology.
One of the hard things to get used to in R is the concept that everything is an object that has properties. Some objects have properties that allow them to operate on other objects to produce new objects. Those are functions.
Think of R as school algebra writ large: f(x) = y, where the objects are f, a function; x, an object (and there may be several) termed the argument; and y, an object termed a value, which can be as simple as a single number (in R, an atomic vector of length one) or a very packed object with a multitude of data and labels.
And, because functions are also objects, they can be arguments to other functions, like the old g(f(x)) = y. (Trivia: this is what it means for functions to be first-class objects.)
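A small sketch of functions as objects in base R. The names square, add_one, and compose are made up for illustration.

```r
# Functions are objects: they have a class and can be stored in variables.
square <- function(x) x^2
add_one <- function(x) x + 1
class(square)  # "function"

# A function can be an argument to another function, as in g(f(x)) = y.
compose <- function(g, f) function(x) g(f(x))
add_one_after_square <- compose(add_one, square)
y <- add_one_after_square(3)  # square(3) is 9, then add_one(9) is 10

# sapply() takes a function as an argument and applies it to each element.
squares <- sapply(1:4, square)  # 1 4 9 16
```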
Although there are function objects in R that operate like control statements in imperative/procedural languages, they are best used "under the hood." As it presents to users interactively, R is a functional programming language. Instead of saying
take this, take that, do this, then do that, then if the result is this one thing, do this other thing, but if not do something else and give me the answer
in the style of most common programming languages, R allows the user to say
use this function to take this argument and turn it into the value I want for a result
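To make the contrast concrete, here is a rough sketch of the same small task written both ways. The vector x and the labels "big" and "small" are invented for the example.

```r
x <- c(3, 8, 1, 12)

# Imperative style: spell out each step and each branch.
labels_loop <- character(length(x))
for (i in seq_along(x)) {
  if (x[i] > 5) {
    labels_loop[i] <- "big"
  } else {
    labels_loop[i] <- "small"
  }
}

# Functional style: one function turns the argument into the value.
labels_fun <- ifelse(x > 5, "big", "small")

# Both approaches produce the same character vector.
same <- identical(labels_loop, labels_fun)
```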
The roles in
A <- lm(Murder ~ Population + Illiteracy + Income + Frost, data=state.x77)
consist of <-, a so-called primitive function that works as an assignment operator to send the return value of lm to the new object A; lm, the function doing the work; state.x77, an object; Murder, an object within state.x77; and the ~ operator, which marks Murder as the response and identifies the objects that follow it to lm as the terms of the model.
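The pieces of that call can be inspected one at a time. A brief sketch using only base R; the name f is chosen here for illustration.

```r
# The assignment operator is itself a function, and a primitive one.
is.primitive(`<-`)   # TRUE
# lm is an ordinary (non-primitive) function object.
class(lm)            # "function"
is.primitive(lm)     # FALSE

# The formula built by ~ is an object too; nothing is evaluated yet.
f <- Murder ~ Population + Illiteracy + Income + Frost
class(f)             # "formula"
vars <- all.vars(f)  # variable names in the formula, response first
```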
What do we see when we run the command?
A <- lm(Murder ~ Population + Illiteracy + Income + Frost, data=state.x77)
#> Error in model.frame.default(formula = Murder ~ Population + Illiteracy + : 'data' must be a data.frame, not a matrix or an array
Created on 2020-04-06 by the reprex package (v0.3.0)
This clearly points to state.x77 as the culprit. It's the wrong kind of object.
class(state.x77)
#> [1] "matrix"
matrix ≠ data frame
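A rough illustration of how the two kinds of object differ, using a toy matrix m and a toy data frame d invented for the purpose. Note that in R 4.0.0 and later, class() on a matrix returns both "matrix" and "array", so inherits() is a more stable way to test for a class.

```r
# A matrix holds a single type in a two-dimensional grid.
m <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("a", "b")))
# A data frame is a list of equal-length columns, which may differ in type.
d <- data.frame(a = 1:3, b = c("x", "y", "z"))

is.data.frame(m)  # FALSE
is.data.frame(d)  # TRUE
is.list(m)        # FALSE
is.list(d)        # TRUE

# $ works on a data frame but raises an error on a matrix.
d_a <- d$a
m_dollar <- tryCatch(m$a, error = function(e) "error")

# Testing the object from the failing call.
state_is_df <- inherits(state.x77, "data.frame")  # FALSE
```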
So, what to do?
# Convert the matrix to a data frame so that lm() will accept it
frame.x77 <- as.data.frame(state.x77)
class(frame.x77)
#> [1] "data.frame"
A <- lm(Murder ~ Population + Illiteracy + Income + Frost, data=frame.x77)
summary(A)
#>
#> Call:
#> lm(formula = Murder ~ Population + Illiteracy + Income + Frost,
#>     data = frame.x77)
#>
#> Residuals:
#>     Min      1Q  Median      3Q     Max
#> -4.7960 -1.6495 -0.0811  1.4815  7.6210
#>
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 1.235e+00  3.866e+00   0.319   0.7510
#> Population  2.237e-04  9.052e-05   2.471   0.0173 *
#> Illiteracy  4.143e+00  8.744e-01   4.738 2.19e-05 ***
#> Income      6.442e-05  6.837e-04   0.094   0.9253
#> Frost       5.813e-04  1.005e-02   0.058   0.9541
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 2.535 on 45 degrees of freedom
#> Multiple R-squared:  0.567,  Adjusted R-squared:  0.5285
#> F-statistic: 14.73 on 4 and 45 DF,  p-value: 9.133e-08
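The value returned by lm is one of those "very packed" objects mentioned earlier. As a sketch, assuming the same model as above, its parts can be pulled out with other functions rather than read off the printed summary:

```r
frame.x77 <- as.data.frame(state.x77)
A <- lm(Murder ~ Population + Illiteracy + Income + Frost, data = frame.x77)

class(A)   # "lm"
names(A)   # the components stored inside the fitted model object

# coef() extracts the estimates as a named numeric vector.
coefs <- coef(A)

# summary(A) is also an object; its coefficient table is a matrix.
coef_table <- coef(summary(A))
colnames(coef_table)
```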