Hello,
I have a dataframe with values for ages from 0 to 110, and I would like to create a new dataframe, or add a column to my existing one, containing the cumulative sum of one column.
For example: for age = 2 I want the value to be the sum of the values for ages 0, 1 and 2 in my existing data, and for age = 110 I want it to be the sum of all the values in my existing data.
I tried this code but all I get is an empty vector.
i <- 0
repeat {
  p_lifetable_1955 <- cumsum(data1955$Fem_px[0:i])
  i <- i + 1
  if (i == 111) break
}
A quick addendum to @melih_guven's terrific answer: make sure you sort your data by age before summing, otherwise the running totals will not line up with the ages.
Also, for what it's worth, running a loop over a dataframe is typically not a good idea. Explicit loops are often slow in R, and running them over dataframes can lead to errors that are difficult to debug. Vectorized functions, like those provided by dplyr or data.table, are almost always a better choice.
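As a rough sketch of the sort-then-sum advice above, in base R: the data frame `toy` and the column name `Age` are made-up stand-ins (only `Fem_px` appears in the original question), and the values are invented for illustration.

```r
# Toy stand-in for data1955 (hypothetical values), deliberately out of order
toy <- data.frame(Age = c(2, 0, 3, 1), Fem_px = c(30, 10, 40, 20))

# Sort by age first so the running total follows age order
toy <- toy[order(toy$Age), ]

# cumsum() returns the running total for every row in one vectorized call
toy$Fem_px_cum <- cumsum(toy$Fem_px)

print(toy)
```

With the real data, the same two steps would apply to data1955, assuming its age column is named accordingly.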
Loops are not necessarily bad; more often the problem is how they are written, by people who do not fully understand what the loop is doing. That is why base functions in the apply() family, or {purrr}, are more often suggested. Additionally, {data.table} and {dplyr} make vectorized functions easier to use (e.g. case_when()). Also see the following SO post about the difference between {data.table} and {dplyr}, and when one might be more advantageous than the other.
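For illustration only, here is how the apply() family could express the same running total, next to the vectorized built-in. The vector `px` is a hypothetical stand-in for data1955$Fem_px, assumed to be already sorted by age.

```r
# Hypothetical values standing in for data1955$Fem_px, already sorted by age
px <- c(10, 20, 30, 40)

# For each position i, sum the elements from the first up to position i
running_total <- sapply(seq_along(px), function(i) sum(px[1:i]))

# Check that this matches the built-in vectorized function
same_as_cumsum <- isTRUE(all.equal(running_total, cumsum(px)))
print(same_as_cumsum)
```

For this particular task cumsum() alone is the simpler choice; the sapply() version is only meant to show what the loop was trying to do.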
To directly answer @Diego17's question as to why you are getting an empty vector: the likely cause is that the loop overwrites p_lifetable_1955 on every pass instead of storing each result, so nothing accumulates, and the first pass (i = 0) selects no elements at all because R vectors are indexed from 1. If you create an empty list, results_list <- list(), or a copy of the dataframe, and store the result for each i in it, you will end up with a vector of values. I hope this helps.
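A minimal sketch of that fix, assuming a preallocated numeric vector rather than a list. The names `px` and `results` are hypothetical, and `px` stands in for data1955$Fem_px sorted by age.

```r
# Hypothetical values standing in for data1955$Fem_px, already sorted by age
px <- c(10, 20, 30, 40)

# Preallocate one slot per age so each pass has somewhere to store its result
results <- numeric(length(px))

# seq_along() starts at 1, because R vectors are indexed from 1
for (i in seq_along(px)) {
  results[i] <- sum(px[1:i])
}

print(results)
```

Each pass writes to its own position in `results`, so earlier values are no longer overwritten.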