I am having trouble fitting two linear regression models.

My dataset, named 'heart', contains mode of treatment (trtment) for congenital heart disease in infants, with 0 standing for circulatory arrest treatment and 1 for low-flow bypass treatment. Psychomotor Development Index (PDI) score and Mental Development Index (MDI) score are then noted against the type of treatment administered to the infant.

I am to fit two linear regression models, one with PDI score as the response and the other with MDI score as the response.

I am using the following command for PDI and I am not getting the correct output to construct a linear model.

fit <- with(heart, lm(trtment~pdi))

lm(formula = trtment ~ pdi)

(Intercept) pdi104 pdi105 pdi109 pdi110 pdi111 pdi114 pdi115 pdi117
1.617e-14 5.000e-01 2.727e-01 1.000e+00 6.667e-01 2.500e-01 1.000e+00 1.000e+00 3.333e-01
pdi118 pdi120 pdi122 pdi124 pdi130 pdi134 pdi50 pdi52 pdi60
1.000e+00 1.000e+00 7.500e-01 1.000e+00 -1.530e-14 -1.390e-14 1.000e+00 1.000e+00 -1.508e-14
pdi63 pdi66 pdi67 pdi70 pdi71 pdi75 pdi76 pdi77 pdi78
-1.629e-14 -1.438e-14 -1.449e-14 3.333e-01 -1.604e-14 5.000e-01 -1.428e-14 -1.618e-14 -1.534e-14
pdi80 pdi82 pdi86 pdi87 pdi90 pdi92 pdi93 pdi98 pdi99
4.545e-01 -1.348e-14 2.000e-01 3.333e-01 1.000e+00 6.667e-01 7.500e-01 6.522e-01 6.000e-01

If I use the following command instead, I still get warning messages and incorrect output:

fit <- with(heart, lm(pdi~trtment))
Warning messages:
1: In model.response(mf, "numeric") :
using type = "numeric" with a factor response will be ignored
2: In Ops.factor(y, z$residuals) : ‘-’ not meaningful for factors

I would really appreciate it if someone could help me troubleshoot my commands, as I need to model both PDI and MDI.



It seems your pdi column is a factor rather than numeric. It is hard to say why without seeing more of your code, especially the reading in of the data. Please make a Reproducible Example (reprex).

Just a few lines of your data and then the fitting process should be enough.
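
One common way to share a few rows in copy-pasteable form is dput(), which prints R code that recreates the object with its column types intact. Here is a minimal sketch, assuming the data frame is called heart as in the question. The three rows below are a made-up stand-in, not the real data, and pdi is deliberately stored as a factor to mimic the suspected problem.

```r
# Hypothetical stand-in for the real data; pdi is deliberately a factor here
heart <- data.frame(trtment = c(0, 1, 1),
                    pdi = factor(c("80", ".", "122")),
                    mdi = c(74, 124, 109))

# dput() prints R code that rebuilds the object, so the column types survive
dput(head(heart))

# str() shows at a glance whether each column is numeric or a factor
str(heart)
```

Pasting the dput() output into a post lets others rebuild exactly the same object, which a plain printout of the rows cannot guarantee.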

Thanks for your reply. Here is a reproducible example:

trtment pdi mdi
1 0 80 74
2 1 118 124
3 1 122 109
4 0 98 78
5 0 98 91
6 0 111 130

To clarify, the first column (1-6) just holds the row numbers.

Your example data work fine for me, as shown below.

DATA <- data.frame(trtment = c(0, 1, 1, 0, 0, 0),
                   pdi = c(80, 118, 122, 98, 98, 111),
                   mdi = c(74, 124, 109, 78, 91, 130))
DATA
#>   trtment pdi mdi
#> 1       0  80  74
#> 2       1 118 124
#> 3       1 122 109
#> 4       0  98  78
#> 5       0  98  91
#> 6       0 111 130
summary(DATA)
#>     trtment            pdi             mdi        
#>  Min.   :0.0000   Min.   : 80.0   Min.   : 74.00  
#>  1st Qu.:0.0000   1st Qu.: 98.0   1st Qu.: 81.25  
#>  Median :0.0000   Median :104.5   Median :100.00  
#>  Mean   :0.3333   Mean   :104.5   Mean   :101.00  
#>  3rd Qu.:0.7500   3rd Qu.:116.2   3rd Qu.:120.25  
#>  Max.   :1.0000   Max.   :122.0   Max.   :130.00
fit <- with(DATA, lm(pdi ~ trtment))
fit
#> Call:
#> lm(formula = pdi ~ trtment)
#> Coefficients:
#> (Intercept)      trtment  
#>       96.75        23.25

Created on 2019-04-22 by the reprex package (v0.2.1)

What do you see when you run the following?

summary(heart)

Is pdi treated as numbers, with a Mean and Median, or is it treated as a factor or character? If it is not numeric, I suspect you have some bad data or possibly NA-denoting strings somewhere in the column.
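
If a stray missing-value marker does turn out to be the cause, one option is to declare it when reading the data. The sketch below is only an illustration: the inline text stands in for the real file, whose name and format are not known here, and the "." marker is an assumed example of such a string.

```r
# Hypothetical file contents with a "." standing in for a missing PDI value
raw <- "trtment pdi mdi
0 80 74
1 118 124
0 . 109"

# Without na.strings, the "." forces pdi to be read as text
# (a factor in R < 4.0, a character vector in later versions)
bad <- read.table(text = raw, header = TRUE)
class(bad$pdi)

# Declaring "." as a missing-value marker keeps pdi numeric, with NA in that row
good <- read.table(text = raw, header = TRUE, na.strings = c("NA", "."))
class(good$pdi)
summary(good$pdi)
```

With a real file, the same na.strings argument can be passed to read.table() or read.csv() along with the file path.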

Great! Thank you so much for the quick reply. I am going to try that again. Your post prompted me to check my dataset and, lo and behold, there is a row that has no PDI data:

trtment PDI MDI
0 . 109

Literally a dot there!

I think I am going to omit this reading and proceed with my modeling. Really appreciate your help.
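
For anyone who lands here with the same problem, this is a rough sketch of one way to do that. It assumes pdi was read in as a factor and uses the six example rows from above plus the row with the dot. The names fit_pdi and fit_mdi are just illustrative.

```r
# Example rows from this thread, plus the row whose PDI is "."
heart <- data.frame(trtment = c(0, 1, 1, 0, 0, 0, 0),
                    pdi = factor(c("80", "118", "122", "98", "98", "111", ".")),
                    mdi = c(74, 124, 109, 78, 91, 130, 109))

# Go through as.character() first: as.numeric() on a factor returns level codes.
# The "." cannot be parsed as a number, so it becomes NA (with a coercion warning).
heart$pdi <- suppressWarnings(as.numeric(as.character(heart$pdi)))

# By default lm() drops rows with NA in the model variables (na.action = na.omit)
fit_pdi <- lm(pdi ~ trtment, data = heart)
fit_mdi <- lm(mdi ~ trtment, data = heart)

coef(fit_pdi)
coef(fit_mdi)
```

Note that the PDI model silently uses one row fewer than the MDI model here, because only the PDI value is missing in that row. Calling na.omit(heart) before fitting would instead drop the row from both models.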


