I'm new to RStudio and I'm having a problem coding a linear regression.

I'm trying to build a multiple linear regression using a dummy variable. I have successfully created the dummy variable, but I also need other variables for a simple linear regression analysis. I have imported my data and named it "data2". It contains 4 columns: price.of.house, Region, schools.rating, and house.age.

For the simple linear regression analysis, I need 2 variables, X and Y. But I don't know how to name my variables, because I want to use x <- price.of.house and y <- Region. price.of.house and Region are columns in the imported data2. Can you help me with the syntax to use these 2 variables in my code? Thank you so much!

This is my code:
#I need x = price.of.house and y=Region. How can I call these variables in the data2. I will need X, Y and Z for my linear regression analysis.

x_new <- read.table("data2")  # I tried this, but it did not work

#This part is for the dummy variables
region.list <- c(Region, data=data2)
region <- region.list[sample(c(1,2,3,4,5), 500, replace=TRUE)]
z <- vector(length = 500)
for (i in 1:500){
if (region[i] == "A|B") z [i] <- 0
else z[i] <- 1
}

result1 <- lm(y ~ x*z)  # the linear regression line

No need to rename your variables. lm() lets you specify multiple right-hand-side variables by putting + between them. Note also that the dependent variable goes first, to the left of the ~, and is generally called y, so price.of.house would be the response rather than x.
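To illustrate how + differs from the * used in the question's lm(y ~ x*z), base R's terms() can list the terms a formula expands to. This is only a small sketch; y, x and z are placeholder names and no data is needed to inspect the formula.

```r
# "+" adds main effects only.
additive <- attr(terms(y ~ x + z), "term.labels")
print(additive)

# "*" adds the main effects plus their interaction term (x:z).
with_interaction <- attr(terms(y ~ x * z), "term.labels")
print(with_interaction)
```

So x*z fits a separate slope for each value of the dummy, while x + z only shifts the intercept.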

If your data is in a data frame named data2, try something like:

lm(price.of.house ~ schools.rating + z, data = data2)
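As a rough sketch of the whole workflow, the example below builds the dummy variable as a column of the data frame and then fits the model. It runs on simulated data standing in for data2, and it assumes Region holds single-letter codes "A" to "E"; the question does not show the real values, so adjust the codes to match your file.

```r
# Simulated stand-in for data2 (assumption: the real values are not shown).
set.seed(1)
n <- 500
data2 <- data.frame(
  Region         = sample(c("A", "B", "C", "D", "E"), n, replace = TRUE),
  schools.rating = runif(n, 1, 10),
  house.age      = sample(1:60, n, replace = TRUE)
)
data2$price.of.house <- 100 + 15 * data2$schools.rating -
  0.5 * data2$house.age + rnorm(n, sd = 10)

# Dummy variable: 0 for regions A or B, 1 otherwise.
# %in% tests membership; region == "A|B" only matches the literal string "A|B".
data2$z <- ifelse(data2$Region %in% c("A", "B"), 0, 1)

# Columns are referenced by name through the data argument, so no renaming
# to x and y is needed.
result1 <- lm(price.of.house ~ schools.rating + house.age + z, data = data2)
summary(result1)
```

If Region is left as a character or factor column, lm(price.of.house ~ Region, data = data2) will also work: R then creates one dummy per non-reference level automatically.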

Thank you so much! I really appreciate it.
