Trying to filter gapminder data over time.

Hey guys. I'm working on a plot that displays data over time with sliding animated bars. Currently I'm using the example GDP per capita data from gapminder. Each country is assigned a rank for each year recorded in the data, based on its GDP per capita that year. The plot will only show the top 10-15 countries at any given time, so I'm trying to filter out any country that never reaches the top 15 (rank <= 15) in any of the recorded years. As you can see, there are 12 years recorded for each country, and lots of countries never reach the top 15 in any of those years:

The 'rank' is used to position the bar representing the country in the plot. As you can see, only 10 countries are visible in this case. I'm hoping to speed up render time by getting rid of the unused data.

Any help is greatly appreciated!

Here is my data code:

library(tidyverse)
library(gapminder)

plotData <- gapminder %>%
  filter(continent == "Americas") %>%
  group_by(year) %>%
  # The * 1 makes it possible to have non-integer ranks while sliding
  mutate(rank = min_rank(-gdpPercap) * 1) %>%
  mutate(height = gdpPercap / max(gdpPercap)) %>%
  ungroup()

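You can achieve this with a grouped filter using any(). Grouping by country makes any() evaluate once per country, so every row for a country is kept if at least one of its years has rank <= 15: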

plotData %>%
  group_by(country) %>%
  filter(any(rank <= 15)) %>%
  ungroup()
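
To see what the grouped filter keeps and drops, here is a minimal sketch on a toy table rather than the real plotData. The `toy` data, the country names "A", "B", "C" and the `cutoff` value are made up for illustration; only group_by(), filter(), any() and ungroup() carry over to the real case.

```r
library(dplyr)

# Hypothetical stand-in for plotData: three countries, two years each
toy <- tibble(
  country = rep(c("A", "B", "C"), each = 2),
  year    = rep(c(2002, 2007), times = 3),
  rank    = c(1, 3,   # A: top in 2002 only
              2, 1,   # B: top in 2007 only
              3, 2)   # C: never top
)

cutoff <- 1  # hypothetical threshold; the question uses 15

kept <- toy %>%
  group_by(country) %>%
  # any() yields one TRUE/FALSE per country, recycled to all of its rows
  filter(any(rank <= cutoff)) %>%
  ungroup()

print(kept)
```

Countries A and B each reach the cutoff in one year, so all of their rows survive, including the years where they rank lower. Country C never reaches it, so it is dropped entirely. That is what you want for the animation: a country that enters the visible range at some point still has data for the frames where it is sliding in or out.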

I've attempted to do this as well and have implemented it as an R package: barRacer

https://github.com/jl5000/barRacer

Please feel free to propose improvements.
