How to fit and read data from the standard curve graph with non-linear regression line on it having 2 sets of data

remi133 · November 21, 2021, 1:24am

Hello everyone,

I have recently been working on producing a standard curve with a non-linear regression line that would be presented with a 4 parameter logistic function. I managed to achieve that by using the dcr and plotrix packages, and I got

by using the following code:

library(drc)
library(plotrix)
x <- c(0, 1.2689671, 0.7365379, 0.2978207, 0.2838790, 0.2287194, 0.1317211)
y <- c(0, 5, 20, 80, 250, 1000)
model <- drm(STD ~ Absorbance, fct = LL.4(), data = STD)
plot(model,      cex=1.5,
                 col= "blue",
                 pch=16,
                 sub = "PGD2 Elisa",
                 xlab = expression(paste("%B/"~B[0]),size = 1, adj= 0),
                 ylab = expression(paste("Prostaglandin" ~~~  D[2], ~~"MOX  Concentration (pg/ml) "))
     )+
  title("Quantifying PGD2 in cell culture lysates and its enzymatic reactions ", adj = 0)+
  grid(5,10)+
  legend("topright", legend = c("PGD2 MOX - Standard Curve"),
           lwd = 1.5, pch=16, col = c("blue")
         )

However, the graph above is the visual representation of the biosynthetic assay. After fitting the line by non-linear regression, now I have to convert the sample absorbance (the second set of data) to pg/ml PGD2 (y-axis). I could do it manually, but first of all, it would not be precise, it is time-consuming, and by using Rstudio, I can improve my coding knowledge. Therefore, I have been thinking of data.frame that I was using while working with ggplot2 , so I created the following data.frame :

DataFramePGD2<-structure(list(Drug = structure(c(1L, 1L, 1L, 1L, 5L, 5L, 5L, 5L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L), 
                                               .Label = c("Vehicle-Vehicle", "Vehicle-NicAcid", "SP-Vehicle", "SP-NicAcid"), 
                                                class = "factor"), Absorbance = c(0.3476667, 0.3364504, 0.3590866, 0.3533717,
                                                                                  0.3622728, 0.3300868, 0.3402785, 0.3545702,  
                                                                                  0.3813823, 0.3201596, 0.3240057, 0.3440934,
                                                                                  0.2884533, 0.3655528, 0.3594112, 0.3916317)), class = "data.frame", row.names = c(NA, 
                                                                                                                                                                      -16L))

However, I never used data.frame within basic r graphics, and I do not know how to make my data.frame be fitted within the graph. I do not know what kind of code to use and how to do it.

DataUsed to produce above graph
DataUsed to produce data.frame

Thank you for any help

technocrat · November 21, 2021, 7:24am

{ggplot2} provides good facilities to overlay plots of objects from multiple data frames. This would require extracting components from the drm model as explained in this S/O post. It's bedtime, so I'm not going to attempt that, but take a look and I'll check back tomorrow.

Here's a cut-and-paste for the first set of data

std <- data.frame(
  STD =
    c(0, 5, 20, 80, 250, 1000),
  Absorbance =
    c(1.268967137, 0.736537869, 0.297820701, 0.283878975, 0.228719425, 0.131721101))

For the second data frame, I don't understand the relationship with the first set. Is there a common key field, like a sample ID?

remi133 · November 21, 2021, 12:56pm

Dear @technocrat,

I am not sure if there is a possibility to use together plot, and ggplot2. So I decided to use plotrix, and dcr packages mainly because within the ggplot2 environment, it was "impossible" to fit a four parametric logistic function.

Answering the second question, the data.frame is the second part of my data (please see attached)

I know that I must have made a mistake, and I divided these data instead of keeping them as one set of data, not two as in this example. So, the relationship between the first and second set of data is, that the first set of data was used to produce the standard curve with a four parametric logistic function. The second part of the data should be used to read the absorbance from the y-axis.

Thank you

jrkrideau · November 21, 2021, 3:55pm

I believe @ technocrat was suggesting doing all the plotting in ggplot2 rather than in base graphics.

I think you are correct that the data should not be separated but I am not familiar with the type of analysis you are doing. Unless there is a huge amound of code, it might be better to post sone sample raw data and your code for readers to get a feel for what you are doing overall. BTW, you did do the initial analysis in R?

FAQ: How to do a minimal reproducible example ( reprex ) for beginners Guides & FAQs

A minimal reproducible example consists of the following items: A minimal dataset, necessary to reproduce the issue The minimal runnable code necessary to reproduce the issue, which can be run on the given dataset, and including the necessary information on the used packages. Let's quickly go over each one of these with examples: Minimal Dataset (Sample Data) You need to provide a data frame that is small enough to be (reasonably) pasted on a post, but big enough to reproduce your issue. Let's say, as an example, that you are working with the iris data frame head(iris) #> Sepal.Length Sepal.Width Petal.Length Petal.Width Species #> 1 5.1 3.5 1.4 0.…

remi133 · November 21, 2021, 4:24pm

Dear @jrkrideau,

Thank you for your reply, but I have shared all my data, including small (see the 'data.frame and the graph from the basic plot`), so I do not see a purpose in sharing FAQs.

Nevertheless, the analysis I am doing consists of two separate steps. First of all:
(1) The PGD2 Kit that I have been using to experiment comes with six different standard concentrations (0, 5, 20, 80, 250, and 1000 pg/ml) <- This set of data goes to the y-axis, and to obtain the x-axis data (plus all of the drugs that had been used) I had to run an ELISA. The results of reading you can see in the above example. So, from the readings, I could create a graph that is consisted of standard concentrations (the y-axis) and the readings of the standard concentrations (x-axis) and the drugs that the cells were treated with (see the screenshot above).
(2) Now, the second step must be done. I must fit the readings from the drugs used on the cells to the graphs to see the pattern. I could do that manually, but it won't be precise at all. Also, I want to learn how to do it in the R.

Before I managed to achieve the standard curve with the four parametric logistic function, I tried to achieve that using the ggplot2 but I literally could not, even the internet could not help me.

Answering your last question, yes I did.

Thank you

EDIT

The biggest challenge with the ggplot2 was to fit the non-linear four parametric logistic function (y=d + a-d/1+ (x/c)^2) within my graph. Using the basic graph and the plotrix and dcr packages, I could easily (the matter of 1 code) create the graph that I wanted. Each time when I was trying to do it, the ggplot2 did not see the p, or t values. Moreover, it did not see the b, c, d, and e intercepts, and without them you cannot fit the non-linear regression line.

In the basic r graphic after coding summary(mydata) I am getting the following outcome:
summary basic plot

While using the ggplot2 and summary(mydata) I can see
summaryy ggplo2

That is the main challenge

jrkrideau · November 21, 2021, 4:30pm

Sorry, I misread your original post and was interpreting the lower part of the png as a separate data set.

technocrat · November 21, 2021, 9:14pm

The purpose of a reprex is to allow someone to get to exactly the point at which the process is stuck. A reprex is cut-and-paste into a fresh R session and produces identical objects. In general, it's not necessary to provide all the data or even real data, just so long as it exhibits the same behavior and challenges. Having to reverse engineer problem deters many community members from attempting an answer.

I see from the screenshot, which is missing from the data, that the rows of the second data frame represent standard concentrations of data type double in the same general range as variables 1-12 in the first data frame. What remains unclear is how the Value variable of the first data frame is matched up with the "Vehicle-Vehicle" like columns of the second frame.

{plotrix} may provide the functionality needed for a {ggplot2} layering. I'm not familiar. The S/O flow link cited a paper by the authors of {drc} on how to extract what is needed for a {ggplot2} object should {plotrix} not be up to it.

remi133 · November 21, 2021, 9:14pm

Ok, a little update. After spending hours with the ggplot2 and after trying to fit the four parametric logarithmic function with the sclae_x_continous (in my case I do not need to present the x-axis values in the form of logs). I failed to do so. There are my results.
So first of all,

1.) I put the two sets of data together in one CSV. file and:

DataFramePGD2 = structure(list(Drug = c("Vehicle-Vehicle", "Vehicle-NicAcid", "SP-Vehicle", "SP-NicAcid"), 
                               Absorbance = c(0.3476667, 0.3364504, 0.3590866, 0.3533717,
                                              0.3622728, 0.3300868, 0.3402785, 0.3545702,  
                                              0.3813823, 0.3201596, 0.3240057, 0.3440934,
                                              0.2884533, 0.3655528, 0.3594112, 0.3916317), 
                               STD=         c(0, 5, 20, 80, 250, 1000, 
                                              NA, NA, NA, NA, NA,
                                              NA, NA, NA, NA, NA), 
                               Absorbance1 = c(1.2689671, 0.7365379, 0.2978207, 0.2838790,
                                               0.2287194, 0.1317211, NA, NA, NA,
                                               NA, NA, NA, NA, NA, NA, NA)), .Names = c("Drug", 
"Absorbance", "STD", "Absorbance1"), class = "data.frame", row.names = c(NA, -16L
))

So, far it looks good, however, moving forward:

ggplot(reshape2::melt(DataFramePGD2,id.vars = "Drug"), aes("Drug", "Absorbance")) +
  geom_point() +
  stat_function(fun = function(x){
    drm_y=function(x, drm){
      coef(drm)[2]+((coef(drm)[3]-coef(drm)[2])/(1+exp((coef(drm)[1]*(log(x)-log(coef(drm)[4]))))))
    }
+ drm_y(x,drm = drm(DataFramePGD2 = reshape2::melt(DataFramePGD2,id.vars = "Drug"), value~Drug, fct=LL.4(), na.action = na.omit))
 })

After executing this function I am getting a warning

Warning: Computation failed in `stat_function()`:
could not find function "drm"

No error, and no execution of my code.

technocrat · November 21, 2021, 10:28pm

means "not in namespace," perhaps because the example is running in a session without a

library(drc)

Try calling the function with drc::drm

Also avoid naming objects after functions, make the data object DRM instead of drm.

remi133 · November 21, 2021, 10:38pm

Thank for your help, but this time I am getting the following error:

Warning: Computation failed in `stat_function()`:
unused argument (DataFramePGD2 = reshape2::melt(DataFramePGD2, id.vars = "Drug"))

Anyway, I am not going to waste more of your time. I am just running out of mine. I have 3 weeks to finish my project (the experiment I finished on Thursday, and since then I am fighting and trying to understand what I am doing) above that I must finish writing up my lab report and prepare the presentation, and most of my time I am spending on the R ehhh. So, this time I must give up, I will do it manually and put the results to the table. Thank you once again.

technocrat · November 21, 2021, 11:04pm

Good call, I think, since the principal purpose is not learning R and there's no guarantee that this is the last hump. If you decide to return to it after you finish with the substantive problem and get stuck come back with a reprex or DM me, depending on how much general interest you perceive in the problem.

remi133 · November 21, 2021, 11:24pm

I mean, do not take me wrong. I have started the R journey having zero knowledge about it. I spent hundreds of hours, and after a few years, I decided to ask for help (last year, I wrote the first post, and the fundamental knowledge I was gaining from the books, forums, etc.) So, the side principal is indeed learning R. It allows me to learn more, learn from my mistakes, and become better and better. But, this time, I literally cannot sacrifice more of my time. I will defo come back to it after my presentation that is scheduled on the mid of December. Thank you once again

bradyjohnston · November 25, 2021, 7:27am

Hey there! I've been stuck in a similar place before (doing biochemistry lab work, trying to do the simple model fitting and getting nowhere.).

What I believe you are after is the stats::predict() function which lets you take a model and some unknowns, and predict what the values will be with those unknowns.

# unknown measurements, looking to predict their values.
unknowns <- data.frame(
  abs = c(0.3300868, 0.3402785, 0.3545702, 0.3813823, 0.3201596, 0.3240057, 0.3440934, 0.2884533, 0.3655528, 0.3594112, 0.3916317)
)

# measurements for the standard curve
std <- data.frame(
  STD = c(0, 5, 20, 80, 250, 1000),
  Absorbance = c(1.268967137, 0.736537869, 0.297820701, 0.283878975, 0.228719425, 0.131721101)
)

# create & fit the model
# I personally don't like loading {drc} with library(drc) as I usually only use it to fit a single model
# and it tends to produce a bunch of namespace clashes, so I just selectively call it with drc::drm() and drc::LL.4()
model <- drc::drm(STD ~ Absorbance, fct = drc::LL.4(), data = std)

# using the model, 'predict' what the values would be given particular measurements
unknowns$predicted <- stats::predict(model, unknowns)

At this point, you have all the values that you want, you can just print the data frame and you have your calculated results!

> print(unknowns)
         abs predicted
1  0.3300868 20.154784
2  0.3402785 16.159646
3  0.3545702 12.007736
4  0.3813823  7.167234
5  0.3201596 25.175213
6  0.3240057 23.077285
7  0.3440934 14.906165
8  0.2884533 53.626582
9  0.3655528  9.656600
10 0.3594112 10.895934
11 0.3916317  5.969585

If you want to plot them, you can do so quickly in base graphics doing this:

# create plot of model
plot(model)
# add points of the unknown values, colouring them red (tomato)
points(unknowns, col = "tomato")

If you want to instead plot it inside of ggplot2 then it requires a bit more work. You need to get the standard curve from the model.

# create a data.frame that will be used to draw the standard curve line.
line <- data.frame(
  x = seq(min(std$Absorbance * 0.9), max(std$Absorbance) * 1.1, length.out = 100)
  )

# fill it with the 'predictions' that will be the Y points along the line
line$y <- stats::predict(model, line)

library(ggplot2)
ggplot(std, aes(Absorbance, STD)) + 
  # add the points of original data
  geom_point() + 
  # plot the standard curve
  geom_line(data = line, aes(x = x, y = y), colour = "gray30") + 
  # plot the measured unknown values
  geom_point(data = unknowns, aes(abs, predicted), colour = "tomato") + 
  # transform the x axis to log10
  scale_x_log10() + 
  # change the ggplot theme to something nicer
  theme_classic()

Hope this helps for trying to do similar work in the future!

remi133 · November 26, 2021, 8:52pm

bradyjohnston:

line <- data.frame(
  x = seq(min(std$Absorbance * 0.9), max(std$Absorbance) * 1.1, length.out = 100)
  )

# fill it with the 'predictions' that will be the Y points along the line
line$y <- stats::predict(model, line)

library(ggplot2)
ggplot(std, aes(Absorbance, STD)) + 
  # add the points of original data
  geom_point() + 
  # plot the standard curve
  geom_line(data = line, aes(x = x, y = y), colour = "gray30") + 
  # plot the measured unknown values
  geom_point(data = unknowns, aes(abs, predicted), colour = "tomato") + 
  # transform the x axis to log10
  scale_x_log10() + 
  # change the ggplot theme to something nicer
  theme_classic()

Wow, I am impressed. Thank you so much. I have checked your suggestions using simple plot and everything was working, however when I tried with ggplot it was giving an error that says Error in FUN(X[[i]], ...) : object 'predicted' not found, that strange because I used the following codes:

#to create data.frame
line <- data.frame(
  x = seq(min(std$Absorbance * 0.9), max(std$Absorbance) * 1.1, length.out = 100)
  )

then,

#for predictions
> line$y <- stats::predict(model, line)

then,


ggplot(std, aes(Absorbance, STD)) +
  labs(title = "Quantifying PGD2 in cell culture lysates and its enzymatic reactions ",
       caption = "PGD2 ELISA")+
  geom_point(colour = "#69b3a2")+
  geom_line(data = line, aes(x = x, y = y), colour = "gray30") + 
  geom_point(data = unknowns, aes(abs, predicted), colour = "tomato") +
  scale_x_log10() +  *#it cannot find predicted object, despite it was called*
  scale_y_continuous(breaks = seq(0,1000, by=100))+

    xlab(expression(paste("%B/"~B[0])))+
    ylab(expression(paste("Prostaglandin"~ D[2], ~~ " MOX Concentration (pg/ml) ")))+
theme(plot.background =  element_rect(fill = "transparent"),
         panel.background = element_blank(),
         panel.grid.major = element_line(color = "grey",
                                         size = 0.1,
                                         linetype = 2),
         panel.grid.minor = element_line(color = "grey",
                                         size = 0.1,
                                         linetype = 2),
         axis.line = element_line(colour = "black"))+
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))+
  
   theme(legend.spacing.y = unit(0.01, "cm"))+
   theme(legend.position = c(0.77, .91),
         legend.background = element_rect(colour = NA, fill = NA))+
   theme(plot.title = element_text(size = 12, face = "bold.italic"),
         plot.caption = element_text(hjust = 0))

BTW i quite like col = "tomato". Thank you so much.

EDIT
Maybe you will know, when I am using basic plot it indeed produces great graph:

and the warning:
Error in e2[[j]] : subscript out of bounds

#code to produce this outstanding graph
unknowns <- data.frame(
  abs = c(0.347666713, 0.336450405, 0.359086619, 0.353371707, 0.362272805,	0.33008683, 0.340278523, 0.35457017, 0.38138226,	0.320159559, 0.324005727,	0.344093381, 0.288453324,	0.365552809, 0.359411168,	0.391631708)
)

std <- data.frame(
  STD = c(0, 5, 20, 80, 250, 1000),
  Absorbance = c(1.268967137, 0.736537869, 0.297820701, 0.283878975, 0.228719425, 0.131721101)
)

model <- drc::drm(STD ~ Absorbance, fct = drc::LL.4(), data = std)

unknowns$predicted <- stats::predict(model, unknowns)

  
plot(model,      cex=1.5,
                 col= "blue",
                 pch=16,
                 sub = "PGD2 Elisa",
                 xlab = expression(paste("%B/"~B[0]),size = 1, adj= 0),
                 ylab = expression(paste("Prostaglandin" ~~~  D[2], ~~"MOX  Concentration (pg/ml) "))
     )+
  grid(45,45)+
  points(unknowns, cex=1.0, pch=16, col = "tomato") +
  axis(side = 1, at=seq(0.2, 1.25, by = 0.1), labels = TRUE)+
  grid(43, 43)+
  title("Quantifying PGD2 in cell culture lysates and its enzymatic reactions ", adj = 0)+
  legend("topright", legend = c("PGD2 MOX - Standard Curve"),
           lwd = 1.5, pch=16, col = c("blue")
         )

Any ideas why?

Again, thank you so much. You are a pure genius.

bradyjohnston · November 29, 2021, 2:17am

I'm glad it was what you were looking for! I am not a genius, just somebody who has been down the same path before

I'm not super well versed with base plot() so I'm not sure waht the Error in e2[[j]] : subscript out of bounds is from.

For the ggplot() error, it is stating that

 Error in FUN(X[[i]], ...) : object 'predicted' not found

And we are using :

 geom_point(data = unknowns, aes(abs, predicted),  colour = "tomato")  +

So either the unknowns data frame hasn't been created prior to the ggplot call, or you haven't added the predicted values using this line of code:

# using the model, 'predict' what the values would be given particular measurements
unknowns$predicted <- stats::predict(model, unknowns)

Do you definitely have these lines before the ggplot call?

remi133 · November 29, 2021, 11:44pm

Dear bradyjohnston,

Thank you, I missed the last code despite I was certain I used that


unknowns$predicted <- stats::predict(model, unknowns)

Thank you once again

system · December 6, 2021, 11:45pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.