ggplot: plot multiple variables on one axis

Hi!
I'm working with a dataset similar to the one in the attached image. For each patient I have an ID, the treatment day on which a blood sample was taken, and the treatment day on which a complication occurred.

[Image: screenshot, 2021-11-28 at 09.42.22]

I would like to make a plot (using ggplot) with ID on the y-axis and both the day of the blood sample and the day of the complication on the x-axis of the same graph.

I've plotted ID against blood sample and ID against complication separately, using the following code:

ggplot(data = df, mapping = aes(x = Bloodsample, y = ID)) + geom_point() + geom_path()

Can anyone tell me how to add an extra variable to the x-axis?
Thanks!

Hello!

There are a couple of ways of doing this. The best is what I've labelled the "reshape" method, but the "wide data" method also works:


library(tidyverse)

dat = tribble(
  ~id, ~blood, ~comp,
  1, 90, 112,
  2, 115, 130,
  3, 100, 90,
  4, 50, 70,
  5, 120, 10,
  6, 70, 120
)


# Reshape Method ----------------------------------------------------------

dat |> 
  pivot_longer(-id) |> 
  ggplot(aes(y = id, x = value, color = name)) +
  geom_point() +
  geom_path()


# Wide Method -------------------------------------------------------------

dat |> 
  ggplot(aes(y = id)) +
  geom_point(aes(x = blood, color = "blood")) +
  geom_path(aes(x = blood, color = "blood")) +
  geom_point(aes(x = comp, color = "comp")) +
  geom_path(aes(x = comp, color = "comp"))

Though, if I may presume a little, a plot like this could also be effective:

dat |> 
  ggplot(aes(y = id)) +
  geom_segment(aes(x = blood, xend = comp, yend = id)) +
  geom_point(aes(x = blood, color = "blood")) +
  geom_point(aes(x = comp, color = "comp"))
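
If it helps to see what the reshape step in the first method is doing, here is a minimal sketch that runs pivot_longer() on its own, using a cut-down copy of the mock data above. The object name `long` is just something I picked for illustration.

```r
library(tidyverse)

# Cut-down copy of the mock data from the reply above
dat <- tribble(
  ~id, ~blood, ~comp,
  1, 90, 112,
  2, 115, 130,
  3, 100, 90
)

# One row per id and event: `name` holds the old column name,
# `value` holds the day. These are pivot_longer()'s default column names.
long <- pivot_longer(dat, -id)
long
```

Because every event now lives in the same `value` column, a single x aesthetic covers both the blood sample and the complication, and `name` can be mapped to colour.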

Thank you so much!

I actually have four complication columns rather than one, but using the wide method I was able to add them all to the same graph. I would also like to connect all dots with the same ID. Right now, however, the line only connects the blood sample to my first complication.

Do you also know how to connect bloodsample and complication (based on ID), for all four complications?

Best regards

No worries, here's how I'd do that - the key thing is pulling out the maximum and minimum values for each ID.

library(tidyverse)

# Set up Data -------------------------------------------------------------

vec = seq(10,100,5)

dat = tibble(id = 1:20,
       blood = sample(vec, 20, replace = TRUE),
       comp1 = sample(vec, 20, replace = TRUE),
       comp2 = sample(vec, 20, replace = TRUE),
       comp3 = sample(vec, 20, replace = TRUE))

# Plot --------------------------------------------------------------------

dat |>
  pivot_longer(-id) |>
  group_by(id) |> 
  mutate(max = max(value),
         min = min(value)) |> 
  ggplot(aes(y = id, x = value)) +
  geom_segment(aes(x = min, xend = max, yend = id)) +
  geom_point(aes(color = name))

If you're a newer user of ggplot2, it's worth knowing that we can finesse this further (e.g., putting the cases in order of when the blood was taken, improving the legend, etc.):

dat |>
  mutate(id = as.character(id) |> fct_reorder(blood)) |> 
  pivot_longer(-id) |>
  mutate(name = str_replace(name, "comp", "Complication "),
         name = str_replace(name, "blood", "Blood Taken")) |> 
  group_by(id) |> 
  mutate(max = max(value),
         min = min(value)) |> 
  ungroup() |> 
  ggplot(aes(y = id, x = value)) +
  geom_segment(aes(x = min, xend = max, yend = id)) +
  geom_point(aes(color = name)) +
  theme_bw() +
  theme(axis.text.y = element_blank(), 
        axis.ticks.y = element_blank(),
        legend.position = "top") +
  labs(x = "Day", y = NULL, color = "Event") +
  expand_limits(x = 0)

Thanks!

I am new to ggplot, so this is very helpful!

When setting up the data, I'm not sure I understand what you are doing in the first line (vec = seq(10,100,5)). What are these numbers referring to? Also, my ID should be treated as a factor instead of numeric, if that makes a difference.

Once again, thanks! :smiley:

No worries! So that "vec" bit is just me setting up some data for me to use as I don't have access to your data - I wouldn't worry!
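
In case it's useful, here is a small sketch of what those numbers do. The seed and the name `blood` are arbitrary choices of mine, only there so the example is repeatable.

```r
# seq(from, to, by): every 5th value from 10 up to 100
vec <- seq(10, 100, 5)
vec
length(vec)   # 19 candidate "days"

# sample() then draws 20 of those values at random, repeats allowed,
# which is how each mock column in `dat` was filled
set.seed(1)   # arbitrary seed, only so the draw can be repeated
blood <- sample(vec, 20, replace = TRUE)
blood
```

None of this is needed with real data, where the days are already recorded in the data frame.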

You can transform a variable to a factor simply by using, e.g., mutate(var = factor(var))
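
As a rough sketch of how that fits into the plot: the `df` below is a made-up stand-in for your data, and the column names Complication1 and Complication2 are my assumption (only ID and Bloodsample appear in your code). With real data you would skip the tibble() step and start from your own data frame.

```r
library(tidyverse)

# Made-up stand-in for the real data; complication column names are assumed
df <- tibble(
  ID = c("A01", "B07", "C12"),
  Bloodsample = c(90, 115, 100),
  Complication1 = c(112, 130, 90),
  Complication2 = c(120, 140, 95)
)

long <- df |>
  mutate(ID = factor(ID)) |>     # treat ID as a category, not a number
  pivot_longer(-ID) |>           # one row per ID and event
  group_by(ID) |>
  mutate(max = max(value),       # earliest and latest day per ID,
         min = min(value)) |>    # used as the segment ends
  ungroup()

p <- ggplot(long, aes(y = ID, x = value)) +
  geom_segment(aes(x = min, xend = max, yend = ID)) +
  geom_point(aes(color = name))

p
```

The min and max here are computed from the day values within each ID, not from the ID itself, so it doesn't matter that a factor has no minimum or maximum.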

Thank you so much for your help - I am having a bit of trouble with this second plot

dat = tibble(id = 1:20,
       blood = sample(vec, 20, replace = TRUE),
       comp1 = sample(vec, 20, replace = TRUE),
       comp2 = sample(vec, 20, replace = TRUE),
       comp3 = sample(vec, 20, replace = TRUE))

Is it correct that I should prepare my data similarly? When writing id = 1:20, you're referring to your min and max ID values, but as my ID is a factor there isn't any min or max. So for now I wrote

df3= tibble(id,
blood = sample(df3, replace = T),
comp1 = sample(df3, replace = T),
comp2 = sample(df3, replace = T),
comp3 = sample(df3, replace = T))

df3 being my dataset. I understand that the vector you're creating is just some mock data, but I'm unsure what I should replace it with in my own dataset.

R does run this, however, and then I just copy-pasted your code

df3 |>
  pivot_longer(-id) |>
  group_by(id) |>
  mutate(max = max(value),
         min = min(value)) |>
  ggplot(aes(y = id, x = value)) +
  geom_segment(aes(x = min, xend = max, yend = id)) +
  geom_point(aes(color = name))

Which doesn't run, probably because I made a mistake at the beginning. I'm not sure if I'm supposed to fill in min and max values myself or if R can work them out. Again, thanks for helping an R newbie :relaxed:

I think what may be useful is providing a Reproducible Example.

Could you copy-paste the result of typing dput(head(your_data)) into your console, where your_data is your data frame?
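
For illustration, this is roughly what that looks like on a made-up data frame. The values and the Complication1 column name are invented; your real output will differ.

```r
# Made-up data frame standing in for `your_data`
your_data <- data.frame(
  ID = factor(c("A01", "B07", "C12")),
  Bloodsample = c(90, 115, 100),
  Complication1 = c(112, 130, 90)
)

# dput() prints R code that re-creates the object, including column types
dput(head(your_data))

# Whoever reads the post can run that printed code to rebuild the data
txt <- capture.output(dput(head(your_data)))
rebuilt <- eval(parse(text = txt))
str(rebuilt)
```

The printed structure(...) call is the part to paste into your reply. It carries the column types along, so it will show whether ID is stored as a factor.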
