Hi, I'm trying to replicate the apple mobility plot https://www.apple.com/covid19/mobility/ with ggplot and having trouble with it. My data frame has the countries as rows and the dates a as column names.
Any help would be greatly appreciated. I tried to upload a screenshot and it didn't work
If you have the dates as columns, then you will need to reshape the data first:
https://tidyr.tidyverse.org/articles/tidy-data.html
Thanks, so I make the dates as the rows and the countries as the columns?
tidy data would have columns for country, date, value(i.e. number_of_cases), the rows would have the information that is appropriate under such headings.
No, you reshape the data to a 'long' format via e.g. tidyr::pivot_longer()
such that your data looks like:
"USA", "2020-04-01", 100
"USA", "2020-04-02", 101
"Italy", "2020-04-01", 200
"Italy", "2020-04-02", 201
maps1 %>%
pivot_longer(-Group.1, names_to = "Dates", values_to = "frequency")
Is still returning the same data frame as there original, am I doing something wrong?
We don't really have enough info to help you out. Could you ask this with a minimal REPR oducible EX ample (reprex)? A reprex makes it much easier for others to understand your issue and figure out how to help.
If you've never heard of a reprex before, you might want to start by reading this FAQ:
Martin thanks for you advice. I have now got the data frame reshaped. I tried to add my code using datapasta package but it was giving me grief to. I've managed to plot the data using ggplot and have loads the geom_line from multiple data sets. Its all working but how do I create a legend representing each country?
If it's from multiple datasets you are making it more difficult than otherwise.
One dataset where the the country variable can be the colour mapping of the geom would be easier
How do I write that? Below is how I separated into county data frames and then plotted.
But you are suggesting that I can plot it from the original data frame?
AustraliaDataMapMobilityTransit <- subset(transitData, region =="Australia")
BelgiumDataMapMobilityTransit <- subset(transitData, region == "Belgium")
BrazilDataMapMobilityTransit <- subset(transitData, region == "Brazil")
CanadaDataMapMobilityTransit <- subset(transitData, region == "Canada")
CzechRepublicDataMapMobilityTransit <- subset(transitData, region == "Czech Republic")
ggplot()+
geom_line(data = AustraliaDataMapMobilityTransit, aes(x = Dates, y= frequency), colour = "red")+
geom_line(data=BelgiumDataMapMobilityTransit, aes(x= Dates, y = frequency), colour = "blue")+
geom_line(data=BrazilDataMapMobilityTransit, aes(x= Dates, y = frequency), colour = "yellow")+
geom_line(data=CanadaDataMapMobilityTransit, aes(x= Dates, y = frequency), colour = "black")+
geom_line(data=CzechRepublicDataMapMobilityTransit, aes(x= Dates, y = frequency), colour = "orange")
ggplot(data = transitData)+
geom_line( mapping = aes(x = Dates, y= frequency, colour = region))
Thanks you are a legend. How do I only plot top 5 and bottom 5.
Top 5 countries and bottom 5 countries, each country has approx 100 rows and there is approx 90 odd countries in the list
how do you calculate a ranking of the counties ? i.e. you have a summarisation over their dates in mind ?
The countries are listed in order, but there is 100 entries for each country before it moves to the next
Or is there a way to specify what countries I want to include
Say I want to graph Australia, Canada,America and Russia
sorry friend, you are a little all over the place...
you can certainly filter your data on the way into your graph.
dplyr::filter() function is for that purpose.
filtered_data <- transitData %>% filter(region %in% c("Australia", "Canada"))
ggplot(data = filtered_data )+
geom_line( mapping = aes(x = Dates, y= frequency, colour = region))
This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.