Hi, I have a dataset consisting of 3 columns and 59 rows. I want to create a fourth row, which is based on the data of two other rows. Firstly, it should check in one column which city it is about. Secondly, if that row matches a particular city, I want to divide the "length" (which is another column) by the total amount of length of that particular city. Hence, the new column would be something like: percentage of length per city (but I named it Percentage). The data looks something like this:
A tibble: 59 x 3
Groups: City [4]
City Omschrijving Length
1 Amsterdam Bedrijfsterrein 6756.
2 Amsterdam Begraafplaats 1359.
3 Amsterdam Bos 506.
4 Amsterdam Bouwterrein 1584.
5 Amsterdam Dagrecreatief terrein 2127.
6 Amsterdam Detailhandel en horeca 3179.
7 Amsterdam Hoofdweg 20958.
8 Amsterdam Nat natuurlijk terrein 578.
9 Amsterdam Openbare voorziening 2118.
10 Amsterdam Overig agrarisch gebruik 476.... with 49 more rows
My code looks something like this
City_Omschrijving <-
City_Omschrijving %>%
filter(!Omschrijving == "Park en plantsoen") %>%
group_by(City, Omschrijving) %>%
mutate(Percentage = Length / sum(Length))
after which I did:
City_Omschrijving$Percentage <-
formattable::percent(City_Omschrijving$Percentage, digits =1)
But this gives me, unfortunately, the percentage of each Length with regards to the total amount of Length of all the column, instead of the percentage of each Length per city:
Groups: City, Omschrijving [59]
City Omschrijving Length Percentage
1 Amsterdam Bedrijfsterrein 6756. 1.6%
2 Amsterdam Begraafplaats 1359. 0.3%
3 Amsterdam Bos 506. 0.1%
4 Amsterdam Bouwterrein 1584. 0.4%
5 Amsterdam Dagrecreatief terrein 2127. 0.5%
6 Amsterdam Detailhandel en horeca 3179. 0.8%
7 Amsterdam Hoofdweg 20958. 4.9%
8 Amsterdam Nat natuurlijk terrein 578. 0.1%
9 Amsterdam Openbare voorziening 2118. 0.5%
10 Amsterdam Overig agrarisch gebruik 476. 0.1%... with 49 more rows