Hi everyone,
I want to conduct a LOSEM twin analysis using OpenMX under R. (It's used in behavioral genetics and twin studies to investigate the genetic and environmental influences on traits and behaviors).
My Variables: a.res. and b.res(dependent variables [standardized]),z_age (age [standardized]), zygof1f2 (zygosity,MZ meaning monozygotic = 1, DZ meaning dizygotic = 2)
My Code:
\#Select dependent variables.
selDVs = c("a.res.", "b.res.")
\#Select moderator
moderator = "z_age"
\#Subset MZ and DZ data
mzData =subset(DS_HA, zygof1f2 =="MZ",c(selDVs, moderator))
dzData =subset(DS_HA, zygof1f2 =="DZ",c(selDVs, moderator))
\#Define LOSEM increments
targets = seq(from = -1.0, to = 2.0, by =.01)
\#Run and plot for specified windows
output\<-umxGxE_window(selDVs = selDVs, moderator = moderator, mzData = mzData, dzData = dzData, target = targets)
After executing the last command, the following error is displayed: Error in if (target < min(modVar)) { : the condition has length > 1
This confuses me as the R help page states a target is "A user-selected list of moderator values to test (default = NULL = explore the full range)". It's also strange because using targets <- seq(from = -1.0, to = 2.0, by =.01) worked in a paper I read. They also used R and uploaded their scripts to an OpenScience website, so I don't understand why it doesn't work for me.
Here is a reprex of twin data (I hope the format works, I am quite new to coding in R so sorry if this is the wrong format)
Please include a reprex (see the FAQ) using twinData or one of the other builtin datasets from umx or OpenMX if that reproduces the problem; otherwise enough of the data you are actually using.
If default target is taken, the error does not occur
library(umx)
#> Loading required package: OpenMx
#> OpenMx may run faster if it is compiled to take advantage of multiple cores.
#> For an overview type '?umx'
#>
#> Attaching package: 'umx'
#> The following object is masked from 'package:stats':
#>
#> loadings
DS_HA <- data.frame(
a.res. = c(0.5, 0.7, 0.6, 0.8, 0.9, 0.4, 0.3,
0.2, 0.1, 0.2),
b.res. = c(0.4, 0.6, 0.5, 0.7, 0.8, 0.3, 0.2,
0.1, 0.2, 0.1),
z_age = structure(c(-1.60475777759526, -0.818111808185819,
-0.0314658387763781, 0.755180130633063, 1.5418261000425, -1.1327701959496,
-0.503453420422042, 0.283192548987398, 0.597850936751175, 0.912509324514951
), dim = c(10L, 1L),
`scaled:center` = 35.2,
`scaled:scale` = 6.35609943282828),
zygof1f2 = c("MZ", "MZ", "DZ", "DZ", "MZ", "MZ", "DZ", "MZ",
"DZ", "MZ"))
#Select dependent variables.
selDVs = c("a.res.", "b.res.")
#Select moderator
moderator = "z_age"
#Subset MZ and DZ data
mzData =subset(DS_HA, zygof1f2 =="MZ",c(selDVs, moderator))
dzData =subset(DS_HA, zygof1f2 =="DZ",c(selDVs, moderator))
#Define LOSEM increments
targets = seq(from = -1.0, to = 2.0, by =.01)
#Run and plot for specified windows
# reproduces error " condition has length > 1
output <- umxGxE_window(selDVs = selDVs, moderator = moderator, mzData = mzData, dzData = dzData, target = targets)
#> Error in if (target < min(modVar)) {: the condition has length > 1
length(targets)
#> [1] 301
# take target default of NULL
output <- umxGxE_window(selDVs = selDVs, moderator = moderator, mzData = mzData, dzData = dzData)
#> mod = -1.60475777759526
#> Polite note: It's better to use 'sep'. This might become compulsory as it helps umx manage variable names in twin models.
#> Polite note: Variance of variable(s) 'a.res.' and 'b.res.' is < 0.1.
#> You might want to express the variable in smaller units, e.g. multiply to use cm instead of metres.
#> Alternatively umx_scale() for data already in long-format, or umx_scale_wide_twin_data for wide data might be useful.
#> Error: The name 'a.res.' is illegal because it contains the '.' character in mxData(observed = data, type = "raw")
# error message indicates problems with a.res. and b.res.
# 1. variance < 0.1 (change units?)
# names should not contain period character
# reform names
colnames(DS_HA)[1:2] <- c("a_res","b_res")
# rerun
#Select dependent variables.
selDVs = c("a_res", "b_res")
#Select moderator
moderator = "z_age"
#Subset MZ and DZ data
mzData =subset(DS_HA, zygof1f2 =="MZ",c(selDVs, moderator))
dzData =subset(DS_HA, zygof1f2 =="DZ",c(selDVs, moderator))
#Define LOSEM increments
targets = seq(from = -1.0, to = 2.0, by =.01)
# Run and plot for specified windows
# with only reformed variable name does not fix
output <- umxGxE_window(selDVs = selDVs, moderator = moderator, mzData = mzData, dzData = dzData)
#> mod = -1.60475777759526
#> Polite note: It's better to use 'sep'. This might become compulsory as it helps umx manage variable names in twin models.
#> Polite note: Variance of variable(s) 'a_res' and 'b_res' is < 0.1.
#> You might want to express the variable in smaller units, e.g. multiply to use cm instead of metres.
#> Alternatively umx_scale() for data already in long-format, or umx_scale_wide_twin_data for wide data might be useful.
#> Polite note: Variance of variable(s) 'a_res' and 'b_res' is < 0.1.
#> You might want to express the variable in smaller units, e.g. multiply to use cm instead of metres.
#> Alternatively umx_scale() for data already in long-format, or umx_scale_wide_twin_data for wide data might be useful.
#> Running ACE with 4 parameters
#> mod = -0.60475777759526
#> Polite note: It's better to use 'sep'. This might become compulsory as it helps umx manage variable names in twin models.
#> Polite note: Variance of variable(s) 'a_res' and 'b_res' is < 0.1.
#> You might want to express the variable in smaller units, e.g. multiply to use cm instead of metres.
#> Alternatively umx_scale() for data already in long-format, or umx_scale_wide_twin_data for wide data might be useful.
#> Polite note: Variance of variable(s) 'a_res' and 'b_res' is < 0.1.
#> You might want to express the variable in smaller units, e.g. multiply to use cm instead of metres.
#> Alternatively umx_scale() for data already in long-format, or umx_scale_wide_twin_data for wide data might be useful.
#> Running ACE with 4 parameters
#> mod = 0.39524222240474
#> Polite note: It's better to use 'sep'. This might become compulsory as it helps umx manage variable names in twin models.
#> Polite note: Variance of variable(s) 'a_res' and 'b_res' is < 0.1.
#> You might want to express the variable in smaller units, e.g. multiply to use cm instead of metres.
#> Alternatively umx_scale() for data already in long-format, or umx_scale_wide_twin_data for wide data might be useful.
#> Polite note: Variance of variable(s) 'a_res' and 'b_res' is < 0.1.
#> You might want to express the variable in smaller units, e.g. multiply to use cm instead of metres.
#> Alternatively umx_scale() for data already in long-format, or umx_scale_wide_twin_data for wide data might be useful.
#> Running ACE with 4 parameters
#> mod = 1.39524222240474
#> Polite note: It's better to use 'sep'. This might become compulsory as it helps umx manage variable names in twin models.
#> Polite note: Variance of variable(s) 'a_res' and 'b_res' is < 0.1.
#> You might want to express the variable in smaller units, e.g. multiply to use cm instead of metres.
#> Alternatively umx_scale() for data already in long-format, or umx_scale_wide_twin_data for wide data might be useful.
#> Polite note: Variance of variable(s) 'a_res' and 'b_res' is < 0.1.
#> You might want to express the variable in smaller units, e.g. multiply to use cm instead of metres.
#> Alternatively umx_scale() for data already in long-format, or umx_scale_wide_twin_data for wide data might be useful.
#> Running ACE with 4 parameters
Thanks a lot, that already helps! Strangely it works with the reprex data but not with my original data when the default target is used. When I use the reprex data I get the same results as you. However, when I use my data, R replies with
Error in is.data.frame(x) :
'list' object cannot be coerced to type 'double'
I have checked everything again and the reprex data has exactly the same data structure as my original data and the data type (numeric etc) is also equivalent. Do you think this could be a bug in the umxGxE_window function? (Unfortunately I cannot upload my original data, but as I said it is completely equivalent in terms of data structure, variable names and types).
As far as I know, tbl_df is a class used in the dplyr package for data frames. It's a subclass of tbl and data.frame, which means I should be able to use it like a regular data frame (but apparently I wasn't). When I reuploaded the data in R it returned class(DS_ACE) as a regular data frame and the error message was gone. It still doesn't make much sense to me, but at least my code works now.
So thanks so much again!
That's a tibble, which is a superclass of data.frame. I'm not a big fan. Anything I might use it for I use {data.table} instead. Glad that's working now.