Not all my datapoints are shown in the plot

Hi, I am new to R and I want to plot 2 variables (classification and total energy) against each other using ggplot, whereby I only want to use 2 of the 3 classes in the variable classification, e.g. LB/B and LP/P.

I plotted my results using the script below, but then not all of my data points show up.

data %>%
  filter(classification == c("LB/B", "LP/P")) %>%
  ggplot(aes(classification, total_energy)) +
  geom_boxplot() +
  ylim(-4, 20)

I set the limits for y but this only increased the range without including more datapoints. I am not sure what I am doing wrong.

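Hi @mulderr01, try replacing == with %in%. With ==, the two-element vector is recycled down the column and compared row by row in alternation, so many rows that belong to either class are dropped; %in% checks every row against both values.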

data %>%
  filter(classification %in% c("LB/B", "LP/P")) %>%
  ggplot(aes(classification, total_energy)) +
  geom_boxplot() +
  ylim(-4, 20)
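
To see the difference between == and %in% without any plotting, here is a minimal sketch in base R. The vector below is made-up toy data (the "VUS" label and the names eq_match and in_match are only for illustration), not the original dataset.

```r
# Toy data; the values are made up for illustration only.
classification <- c("LB/B", "LB/B", "LP/P", "LP/P", "VUS", "LB/B")

# == recycles c("LB/B", "LP/P") along the vector: element 1 is compared
# with "LB/B", element 2 with "LP/P", element 3 with "LB/B", and so on.
eq_match <- classification == c("LB/B", "LP/P")

# %in% asks, for each element, whether it appears anywhere in the set.
in_match <- classification %in% c("LB/B", "LP/P")

print(eq_match)
print(in_match)

# Number of elements each approach would keep.
print(sum(eq_match))
print(sum(in_match))
```

Rows that hold one of the two classes but happen to line up with the "wrong" recycled value are silently dropped by ==, which is why the boxplot looked like it was missing data.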

That really worked. Thanks:)

Could you help me with something else?

Now I only want to plot these 2 classes if a condition is met. There is another column called 'source' that records the source by which the class was established, e.g. VKGL or ClinVar. I only want to use the LB/B and LP/P classes from VKGL. How could I add such a condition to the above script?

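You can add the condition on source to the filter() call. Conditions separated by a comma are combined with AND, so a row is kept only if it satisfies all of them.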

data %>%
  filter(classification %in% c("LB/B", "LP/P"), source == "VKGL") %>%
  ggplot(aes(classification, total_energy)) +
  geom_boxplot() +
  ylim(-4, 20)
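
If you want to check what the filter keeps before plotting, a small sketch like the one below may help. The data frame toy and its values are invented for illustration, and it assumes dplyr is installed; only the column names follow the thread.

```r
library(dplyr)

# Toy data; values are invented for illustration only.
toy <- tibble(
  classification = c("LB/B", "LP/P", "LB/B", "VUS", "LP/P"),
  source         = c("VKGL", "VKGL", "ClinVar", "VKGL", "ClinVar"),
  total_energy   = c(0.5, 3.2, -1.0, 1.1, 7.4)
)

# Comma-separated conditions: a row is kept only if all of them are TRUE.
with_comma <- toy %>%
  filter(classification %in% c("LB/B", "LP/P"), source == "VKGL")

# The same filter written with an explicit &.
with_and <- toy %>%
  filter(classification %in% c("LB/B", "LP/P") & source == "VKGL")

print(with_comma)
print(nrow(with_and))
```

Both forms should select the same rows; the comma version is just the more common way to write it in dplyr.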
