I have a data frame containing continuous data of subjects' emotional responses to different stimuli. Some of the stimuli have three parts. I want to create a new variable, based on the timestamps in the dataset, that assigns each row to the stimulus part it belongs to. I've created a test dataset to illustrate my problem:
#create variable subject
subject=c("VP01", "VP01", "VP01", "VP01", "VP01", "VP01", "VP01", "VP01", "VP01", "VP01", "VP01", "VP02", "VP02", "VP02", "VP02", "VP02", "VP02", "VP02", "VP02", "VP02", "VP02", "VP02", "VP02", "VP02", "VP02")
#create variable event
event=c("calib", "calib", "stim1", "stim1", "stim1", "stim2", "stim2", "stim2", "stim2", "stim2", "stim2", "calib", "calib", "stim1", "stim1", "stim1", "stim3", "stim3", "stim3", "stim3", "stim3", "stim3", "stim3", "stim3", "stim3")
#create variable sad
sad=c(0, 0, 1, 1, 2, 3, 3, 6, 6, 4, 7, 1, 1, 2, 1, 1, 4, 7, 2, 4, 6, 7, 5, 4, 6)
#create variable happy
happy=c(0, 1, 1, 0, 2, 3, 4, 6, 7, 4, 6, 1, 1, 2, 5, 1, 4, 6, 2, 7, 4, 7, 5, 2, 3)
#create variable time
time=c("00:10:49.863", "00:10:50.863", "00:10:51.863", "00:10:52.863", "00:10:53.863", "00:10:54.863", "00:10:55.863", "00:10:56.863", "00:10:57.863", "00:10:58.863", "00:10:59.863", "00:11:00.863", "00:11:01.863", "00:11:02.863", "00:11:03.863", "00:11:04.863", "00:11:05.863", "00:11:06.863", "00:11:07.863", "00:11:08.863", "00:11:09.863", "00:11:10.863", "00:11:11.863", "00:11:12.863", "00:11:13.863")
#create test data set
testdata <- data.frame(subject,event,time,sad,happy)
My first problem is that the timestamps are currently stored as strings. I guess I have to convert them into actual time values before I can tackle my second problem. The format here is hh:mm:ss.000, although hh:mm:ss would suffice.
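One possible way to handle the string timestamps is to turn them into seconds since midnight, which makes later differences easy to compute. This is only a minimal sketch in base R: the helper name `to_seconds` is made up for illustration, and it assumes every value has exactly the form hh:mm:ss or hh:mm:ss.000.

```r
# A few example timestamps in the same format as the "time" column
time <- c("00:10:49.863", "00:10:50.863", "00:11:00.863")

# Hypothetical helper: split "hh:mm:ss.000" at the colons and
# weight hours, minutes and seconds to get seconds since midnight
to_seconds <- function(x) {
  vapply(strsplit(x, ":", fixed = TRUE),
         function(p) sum(as.numeric(p) * c(3600, 60, 1)),
         numeric(1))
}

secs <- to_seconds(time)

# Seconds elapsed since the first timestamp,
# rounded to milliseconds to hide floating-point noise
elapsed <- round(secs - secs[1], 3)
```

Working with plain numbers avoids attaching a calendar date to the values, which a date-time class would require.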
The variable "event" indicates which stimulus each row belongs to. stim2 and stim3 are each divided into three parts. The start of part 1 is relative, so its timestamp differs between subjects. For stim2, I would infer the start of part 1 from the first row in which "stim2" appears for a subject. Part 1 always ends 2 seconds after that, part 2 ends 1 second after part 1, and part 3 ends at the last row in which "stim2" appears in the column "event" for that subject. stim3 is similar: part 1 starts at the first appearance of "stim3" in "event" and ends after 1 second, part 2 ends 2 seconds after part 1, and part 3 ends at the last row in which "stim3" appears in "event" for that subject.
What I would like to end up with is a data frame that looks like testdata2:
#create variable part
part=c("calib", "calib", "stim1", "stim1", "stim1", "stim2_1", "stim2_1", "stim2_1", "stim2_2", "stim2_3", "stim2_3", "calib", "calib", "stim1", "stim1", "stim1", "stim3_1", "stim3_1", "stim3_2", "stim3_2", "stim3_2", "stim3_3", "stim3_3", "stim3_3", "stim3_3")
#create test data set
testdata2 <- data.frame(subject,event,part,time,sad,happy)
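The part rules described above could be sketched roughly as follows in base R. This is not a definitive solution. It uses a small stand-in data frame rather than the full testdata, the names `part_durations` and `to_seconds` are invented for illustration, and it assumes that a row falling exactly on a boundary belongs to the earlier part. With only one row per second, that boundary choice decides where such rows land, so the result may differ from a hand-labelled column on those rows.

```r
# Small stand-in for testdata: two subjects, one multi-part stimulus each
df <- data.frame(
  subject = c(rep("VP01", 8), rep("VP02", 6)),
  event   = c("calib", "stim1", rep("stim2", 6), "calib", rep("stim3", 5)),
  time    = c(sprintf("00:10:%02d.863", 52:59), sprintf("00:11:%02d.863", 0:5)),
  stringsAsFactors = FALSE
)

# Hypothetical lookup: durations (in seconds) of part 1 and part 2 per stimulus
part_durations <- list(stim2 = c(2, 1), stim3 = c(1, 2))

# Hypothetical helper: "hh:mm:ss.000" -> seconds since midnight
to_seconds <- function(x) {
  vapply(strsplit(x, ":", fixed = TRUE),
         function(p) sum(as.numeric(p) * c(3600, 60, 1)),
         numeric(1))
}

df$secs <- to_seconds(df$time)

# Seconds since the first row of each subject/event combination,
# rounded to milliseconds so boundary comparisons are not upset by
# floating-point noise
df$elapsed <- ave(df$secs, df$subject, df$event,
                  FUN = function(s) round(s - min(s), 3))

# Default label: the event name itself (calib, stim1, ...)
df$part <- df$event

for (ev in names(part_durations)) {
  rows <- df$event == ev
  # Cumulative end points of part 1 and part 2, e.g. c(2, 3) for stim2
  ends <- cumsum(part_durations[[ev]])
  # left.open = TRUE: a row exactly on an end point stays in the earlier part
  idx <- findInterval(df$elapsed[rows], ends, left.open = TRUE) + 1
  df$part[rows] <- paste0(ev, "_", idx)
}

df[, c("subject", "event", "time", "part")]
```

Because the elapsed time is computed per subject and event, the same code should also work when there are several rows per second; only `part_durations` would need to change if the part lengths differ.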
My actual data is much more fine-grained (7-8 rows per second), but I simplified things for the sake of this example; I hope it still works. I'm an absolute beginner with R and coding in general, and I don't know how to approach this at all. Any help is greatly appreciated!