Modelling and data wrangling woes

Given that I have three previous crops, N rates and grain yield and I computed the agronomic optimum N rates using a quadratic model, please how do I know that those three agronomic optimum N rates (AONR) predicted from the models are different. I have attached a fake dataset below. Thank you very much for the help.

Packages

library(ggplot2)
library(ggpmisc)
library(tidyverse)
library(lmerTest)
library(performance)

Data

n_seq <- seq(0, 300, length.out = 500)

Parameters: y0 = yield at zero N, a = max additional yield from N

yS <- 8200 + 4000 * (1 - exp(-0.015 * n_seq)) # Soybean - highest
yF <- 6500 + 5000 * (1 - exp(-0.015 * n_seq)) # Fallow - middle
yC <- 5000 + 5500 * (1 - exp(-0.015 * n_seq)) # Corn - lowest


df <- data.frame(
  N_rate    = rep(n_seq, 3),
  Rep=c(1:4),
  Yield  = c(yS, yF, yC),
  Previous_crop = factor(rep(c("Soybean", "Fallow", "Corn"), each = length(n_seq)),
                levels = c("Soybean", "Fallow", "Corn"))
) %>% 
  mutate(Rep=factor(Rep))

Model

quad_mod <- lmer(Yield ~Previous_crop*N_rate+Previous_crop*I(N_rate^2)+(1|Previous_crop)+(1|Rep), data = df)


## Summary of the model output
anova(quad_mod,type = 3,ddf = 'Kenward-Roger')
summary(quad_mod)

performance_aic(x = quad_mod)

Model prediction

newdf<- df %>%
  
  dplyr::select(Previous_crop) %>%
  
  unique() %>%
  
  expand_grid(N_rate = seq(0,300, by=1)) %>%
  
  mutate(pred = predict(quad_mod, newdata=., re.form=NA))


newdf

Corn AONR

newdf %>% 
  filter(Previous_crop=='Corn') %>% 
  group_by(N_rate) %>% 
  reframe(AONR_corn=range(pred)) %>% 
  filter(AONR_corn==max(AONR_corn))

Soybean AONR


newdf %>% 
  filter(Previous_crop=='Soybean') %>% 
  group_by(N_rate) %>% 
  reframe(AONR_soybean=range(pred)) %>% 
  filter(AONR_soybean==max(AONR_soybean))

Fallow AONR


newdf %>% 
  filter(Previous_crop=='Fallow') %>% 
  group_by(N_rate) %>% 
  reframe(AONR_fallow=range(pred)) %>% 
  filter(AONR_fallow==max(AONR_fallow))

Visualization


p1 <- ggplot(df, aes(x = N_rate, y = Yield,colour = Previous_crop)) +
  geom_line(linewidth = 1.2) +
  #  stat_poly_eq(use_label('eq','adj.R2','P'),label.x = 'left',label.y = 'top',size=3.7,
  #              formula = y~poly(x,2,raw = T,coefs = F),coef.digits = 4)+
  
  # Crop labels at end of each curve
  annotate("text", x = 253, y = tail(yS, 1), label = "Soybean",
           hjust = 0, size = 4, color = "#3d2b1f", fontface = "bold") +
   annotate("text", x = 253, y = tail(yF, 1), label = "Fallow",
           hjust = 0, size = 4, color = "#3d2b1f", fontface = "bold") +
  annotate("text", x = 253, y = tail(yC, 1), label = "Corn",
           hjust = 0, size = 4, color = "#3d2b1f", fontface = "bold") +
  # Dotted horizontal reference lines at AONR_S
  # annotate("segment",
  #         x = 0, xend = 28,
  #        y = 8000, yend = 8000,
  #       linetype = "dotted", color = brown, linewidth = 0.8) +
  
  scale_x_continuous(
    # name   = "N Rate (kg N/ha)",
    limits = c(0, 270),
    breaks = seq(0, 250, by = 50),
    expand = c(0, 0)
  ) +
  scale_y_continuous(
    #name   = "Corn Grain Yield (kg/ha)",
    limits = c(0, 13000),
    breaks = seq(0, 12000, by = 2000),
    expand = c(0, 0)
  ) +
  labs(
    #title = paste(
    #   "2024 Missouri-Columbia"),
    y = expression(bold('Corn grain yield '~(kg ~ ha ^ -1))),
    x = expression(bold('N rate '~(kg ~ ha ^ -1))))+
  
  #ggtitle("Year 2 Corn Grain Yield Response to N Rate\nby Previous Crop") +
  
  theme_classic(base_size = 13) +
  theme(
    plot.title   = element_text(hjust = 0.5, size = 12, face = "bold"),
    # legend.title = element_blank(),
    #legend.text = element_blank(),
    axis.line    = element_line(color = "black", linewidth = 0.7),
    axis.ticks   = element_line(color = "black"),
    panel.grid   = element_blank(),
    text = element_text(family = 'serif',face = 'bold',colour = 'black',size = 12)
  )


p1

Hello!
If I understand you correctly (correct me please if not) your question is:

Ìs there a significant difference between the three crops regarding their predicted AONR?

If so there are several solutions. If this is a paper I would definetly consider performing another ANOVA (or a nonparametric alternative). If it is just for you, it is enough to simply check visually using a boxplot or even better, what I like to do is using the ggstatsplot package. It combines both approaches. You can extend your provided code like this:

ggstatsplot::ggbetweenstats(newdf, Previous_crop, pred)

The output clearly indicates there is a strong significant difference between every crop.

At the end just a short hint for future code sharing (which was actually good thanks for that): Please check that you included all necessary packages at the beginning, e.g. I had to install and load lme4 and performance. As an ecologist I know this packages and the regarding functions, somebody else however would have to search for the package before he/she can run your code. :slight_smile:

this seems to be a nice project, please let me know if this was helpful and if it was let me know if i can help you more :slight_smile: