Dear Community,
I am currently working on a dataset that shows the development of mortality caused by road traffic accidents across a number of countries over the last 50 years. For my research, I would like to delete all columns (years) and all rows (countries) that contain no data at all. For the columns, I was able to use a solution from the forum: I count the empty cells (NA or "") in each column and then delete every column in which that count equals the number of rows:
#Count the empty values in a column
colSums(is.na(PS22) | PS22 == "")
#Create a logical vector that indicates whether a column is entirely empty (TRUE) or not (FALSE)
empty_columns <- colSums(is.na(PS22) | PS22 == "") == nrow(PS22)
#Remove empty columns
PS222 <- PS22[, !empty_columns, drop = FALSE]  # drop = FALSE keeps a data frame even if only one column remains
For the rows (countries), however, this does not work directly, because a row contains not only the yearly figures but also non-numeric values for other variables, such as the country code, so a row is never completely empty. I tried to apply the same logic as for the columns while taking into account only the columns that actually represent years, but I did not manage to find an acceptable solution. Could some of you give me a hand here?
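To make the goal concrete, here is a minimal sketch of the row-wise logic on a made-up data frame. The data frame `toy`, the names `year_cols`, `year_data`, `empty_rows` and `toy_clean`, and the assumption that year columns can be recognised by four-digit names are all for illustration only; the real PS22 column names may look different (for example with an "X" prefix), in which case the pattern would need adjusting.

```r
# Toy data standing in for PS22 (made-up values, not the real dataset)
toy <- data.frame(
  Country = c("A", "B", "C"),
  Code    = c("AAA", "BBB", "CCC"),
  `1970`  = c(1.2, NA, NA),
  `1971`  = c(NA, NA, 3.4),
  check.names = FALSE,        # keep "1970" instead of "X1970"
  stringsAsFactors = FALSE
)

# Assumption: year columns are exactly those whose names are four digits
year_cols <- grepl("^[0-9]{4}$", names(toy))

# Restrict the emptiness check to the year columns only
year_data <- toy[, year_cols, drop = FALSE]

# A row counts as empty if every year cell is NA or ""
empty_rows <- rowSums(is.na(year_data) | year_data == "") == ncol(year_data)

# Keep only the rows that have at least one year value
toy_clean <- toy[!empty_rows, , drop = FALSE]
```

In this toy example, country B has no value in any year column, so it is the only row that would be removed, while the country name and code columns are ignored by the check.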
Many thanks in advance!