Fitting linear regression model


I am having trouble fitting two linear regression models.

My dataset, named 'heart', contains mode of treatment (trtment) for congenital heart disease in infants, with 0 standing for circulatory arrest treatment and 1 for low-flow bypass treatment. Psychomotor Development Index (PDI) score and Mental Development Index (MDI) score are then noted against the type of treatment administered to the infant.

I am to fit two linear regression models, one with PDI score as the response and the other with MDI score as the response.

I am using the following command for PDI, but the output is not what I need to construct a linear model.

fit <- with(heart, lm(trtment~pdi))

lm(formula = trtment ~ pdi)

(Intercept) pdi104 pdi105 pdi109 pdi110 pdi111 pdi114 pdi115 pdi117
1.617e-14 5.000e-01 2.727e-01 1.000e+00 6.667e-01 2.500e-01 1.000e+00 1.000e+00 3.333e-01
pdi118 pdi120 pdi122 pdi124 pdi130 pdi134 pdi50 pdi52 pdi60
1.000e+00 1.000e+00 7.500e-01 1.000e+00 -1.530e-14 -1.390e-14 1.000e+00 1.000e+00 -1.508e-14
pdi63 pdi66 pdi67 pdi70 pdi71 pdi75 pdi76 pdi77 pdi78
-1.629e-14 -1.438e-14 -1.449e-14 3.333e-01 -1.604e-14 5.000e-01 -1.428e-14 -1.618e-14 -1.534e-14
pdi80 pdi82 pdi86 pdi87 pdi90 pdi92 pdi93 pdi98 pdi99
4.545e-01 -1.348e-14 2.000e-01 3.333e-01 1.000e+00 6.667e-01 7.500e-01 6.522e-01 6.000e-01

If I use the following command instead, I still get warning messages and incorrect output:

fit <- with(heart, lm(pdi~trtment))
Warning messages:
1: In model.response(mf, "numeric") :
using type = "numeric" with a factor response will be ignored
2: In Ops.factor(y, z$residuals) : ‘-’ not meaningful for factors

I would really appreciate it if someone could help me troubleshoot my commands, as I need to model both PDI and MDI.



It seems your pdi column is a factor rather than numeric. It is hard to say why without seeing more of your code, especially the reading in of the data. Please make a Reproducible Example (reprex).

Just a few lines of your data and then the fitting process should be enough.
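As a quick way to check this before building a reprex, here is a minimal sketch. The data frame `toy` is a hypothetical stand-in for `heart` (the real data is not shown), built on the assumption that one `pdi` entry is a non-numeric string, which is enough to make the whole column a factor.

```r
# Hypothetical stand-in for 'heart': one pdi entry is "." instead of a number
toy <- data.frame(trtment = c(0, 1, 1, 0, 0, 0),
                  pdi = c("80", "118", "122", "98", ".", "111"),
                  stringsAsFactors = TRUE)

class(toy$pdi)    # "factor", not "numeric"
levels(toy$pdi)   # the stray "." shows up as one of the levels
str(toy)          # column types at a glance

# With a factor predictor, lm() estimates an intercept plus one dummy
# coefficient per remaining level, which is the pattern in the output above
# (pdi104, pdi105, ...)
n_coef <- length(coef(lm(trtment ~ pdi, data = toy)))
n_coef == nlevels(toy$pdi)
```

If `class(heart$pdi)` reports `"factor"` or `"character"`, the problem is in how the column was read, not in the `lm()` call itself.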

Thanks for your reply. Here is a reproducible example:

trtment pdi mdi
1 0 80 74
2 1 118 124
3 1 122 109
4 0 98 78
5 0 98 91
6 0 111 130

Just to clarify, the first column (1-6) holds the row numbers.

Your example data work fine for me, as shown below.

DATA <- data.frame(trtment = c(0, 1, 1, 0, 0, 0),
                   pdi = c(80, 118, 122, 98, 98, 111),
                   mdi = c(74, 124, 109, 78, 91, 130))
DATA
#>   trtment pdi mdi
#> 1       0  80  74
#> 2       1 118 124
#> 3       1 122 109
#> 4       0  98  78
#> 5       0  98  91
#> 6       0 111 130
summary(DATA)
#>     trtment            pdi             mdi        
#>  Min.   :0.0000   Min.   : 80.0   Min.   : 74.00  
#>  1st Qu.:0.0000   1st Qu.: 98.0   1st Qu.: 81.25  
#>  Median :0.0000   Median :104.5   Median :100.00  
#>  Mean   :0.3333   Mean   :104.5   Mean   :101.00  
#>  3rd Qu.:0.7500   3rd Qu.:116.2   3rd Qu.:120.25  
#>  Max.   :1.0000   Max.   :122.0   Max.   :130.00
fit <- with(DATA, lm(pdi ~ trtment))
fit
#> Call:
#> lm(formula = pdi ~ trtment)
#> Coefficients:
#> (Intercept)      trtment  
#>       96.75        23.25

Created on 2019-04-22 by the reprex package (v0.2.1)

What do you see when you run summary(heart)?

Is pdi treated as numbers, with a Mean and Median, or is it treated as a factor or character? If it is not numeric, I suspect you have some bad data or possibly NA-denoting strings somewhere in the column.
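To find the offending entries, one option is to convert the column to character, attempt a numeric conversion, and list whatever fails to parse. The sketch below uses a hypothetical vector `pdi_raw` in place of `heart$pdi`. Note that calling `as.numeric()` directly on a factor returns the internal level codes, not the original values, so the detour through `as.character()` matters.

```r
# Hypothetical factor column containing one non-numeric entry
pdi_raw <- factor(c("80", "118", ".", "98"))

pdi_chr <- as.character(pdi_raw)                  # labels, not level codes
pdi_num <- suppressWarnings(as.numeric(pdi_chr))  # unparseable entries become NA

# Entries that were present as text but failed numeric conversion
bad <- is.na(pdi_num) & !is.na(pdi_chr)
pdi_chr[bad]   # the problematic strings
which(bad)     # their row positions
```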

Great! Thank you so much for the quick reply. I am going to try that again, but your post prompted me to check my dataset and, lo and behold, there is a row that has no PDI data:

trtment PDI MDI
0 . 109

Literally a dot there!

I think I am going to omit this reading and proceed with my modeling. Really appreciate your help.
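For anyone hitting the same problem, a possible alternative to deleting the row by hand is to tell R at read time that "." means missing. The sketch below is an illustration under assumptions: the inline text `raw` combines the six example rows from this thread with the dotted row, and the object names `heart2`, `fit_pdi`, and `fit_mdi` are made up for the example. The real file would be read with the same `na.strings` argument.

```r
# Assumed input: the six example rows plus the row with "." for PDI
raw <- "trtment pdi mdi
0 80 74
1 118 124
1 122 109
0 98 78
0 98 91
0 111 130
0 . 109"

# Treat "." (as well as "NA") as missing, so pdi is read as numeric
heart2 <- read.table(text = raw, header = TRUE, na.strings = c("NA", "."))
str(heart2)

# Response on the left of ~, predictor on the right.
# With the default na.action, lm() drops rows where a model variable is NA.
fit_pdi <- lm(pdi ~ trtment, data = heart2)
fit_mdi <- lm(mdi ~ trtment, data = heart2)

coef(fit_pdi)
nobs(fit_pdi)   # rows actually used for the PDI model
nobs(fit_mdi)   # the MDI model keeps the row, since its mdi value is present
```

This way the incomplete row is excluded only from the PDI model and still contributes to the MDI model.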

