Non-Linear Regressions

logancutler1999 · November 19, 2020, 3:58am

For a Fisheries Research class project, I am using R to get p-values for linear and non-linear regressions. I have successfully gotten p-values for linear and logarithmic regressions, and the equations matched the regressions I did in excel to check.

Despite some success, I have had some troubles with exponential and power function regressions. I have gotten outputs for both, but the estimated parameters are very different from the ones I got in Excel. I've included the code that I've been using for each regression type below.

As additional context, I am doing regressions with percent brown trout (p_brown) on the y axis and environmental conditions on the x axis. I have named the environmental conditions as follows: season_condition_extreme. For example, fa_d_max would correspond to fall_discharge_maximum.

Linear (works & matches Excel)

plot(p_brown~fa_d_max)
linreg_fa_d_max <- lm(p_brown~fa_d_max)
abline(linreg_fa_d_max)
summary(linreg_fa_d_max)

Logarithmic (works & matches Excel)

plot(p_brown~su_d_max)
logreg_su_d_max <- nls(p_brown~a*log(su_d_max)+b,
                       start = list(a = 4, b = -2))
summary(logreg_su_d_max)

Exponential (works, but doesn't match Excel)
Excel regression: y = 5.6287 * e^(0.0325 * x)
R regression: y = 7.248 * e^(0.0271 * x)

plot(p_brown~sp_d_min)
expreg_sp_d_min <- nls(p_brown~a*exp(b*sp_d_min),
                       start = list(a = 5, b = 0))
summary(expreg_sp_d_min)

Power (works, but doesn't match Excel)
Excel regression: y = 3.4559 * x^(0.4272)
R regression: y = 7.5279 * x^(0.2282)

plot(p_brown~fa_d_min)
powreg_fa_d_min <- nls(p_brown~a*(fa_d_min^(b)),
                       start = list(a = 3.5, b = .4))
summary(powreg_fa_d_min)

I appreciate any help or pointers that I can get. My experience in R is minimal, so this is all very new to me. Thanks!

williaml · November 19, 2020, 4:14am

Can you provide a reproducible example?

FAQ: How to do a minimal reproducible example ( reprex ) for beginners Guides & FAQs

A minimal reproducible example consists of the following items: A minimal dataset, necessary to reproduce the issue The minimal runnable code necessary to reproduce the issue, which can be run on the given dataset, and including the necessary information on the used packages. Let's quickly go over each one of these with examples: Minimal Dataset (Sample Data) You need to provide a data frame that is small enough to be (reasonably) pasted on a post, but big enough to reproduce your issue. Let's say, as an example, that you are working with the iris data frame head(iris) #> Sepal.Length Sepal.Width Petal.Length Petal.Width Species #> 1 5.1 3.5 1.4 0.…

nirgrahamuk · November 19, 2020, 9:08am

Have you compared performance metrics/residuals to form an opinion of whether the excel solutions are preffered to your nls solutions?

logancutler1999 · November 19, 2020, 2:31pm

Here is a test dataset to use for each regression type. I've included a screenshot of the Excel output that I get from it.

tibble::tribble(
    ~env_cond,    ~p_fish,
40.35139752, 34.93150685,
 28.4673913, 21.70564746,
40.75807453, 24.90221643,
36.28462733, 7.305936073,
28.01552795, 13.32464571,
36.32981366, 15.70881226,
30.68152174, 8.983890954,
30.18447205, 18.71401152
)
#> # A tibble: 8 x 2
#>   env_cond p_fish
#>    <dbl>   <dbl>
#> 1   40.4   34.9 
#> 2   28.5   21.7 
#> 3   40.8   24.9 
#> 4   36.3    7.31
#> 5   28.0   13.3 
#> 6   36.3   15.7 
#> 7   30.7    8.98
#> 8   30.2   18.7

logancutler1999 · November 19, 2020, 2:56pm

I just compared the Residual Standard Errors of one of the power functions in Excel vs R. The RSE for R was 8.416 and 6.168 in Excel.

williaml · November 19, 2020, 9:35pm

How does the data that you have provided relate to the code that you initially posted?

logancutler1999 · November 20, 2020, 2:31pm

After some more digging, it seems that the difference in results between in Excel and R is due to a difference in regression methodology. The nls() function in R takes a more complex yet direct approach of running a regression on the data, while Excel transforms the data, runs a linear regression, then converts the parameters back into the nonlinear, non-transformed equivalent.

jrkrideau · November 22, 2020, 5:13pm

My general rule is if there a difference between R and Excel trust R.

Excel will do some strange things sometimes.

jrkrideau · November 23, 2020, 5:34pm

I am Linex so I cannot check in Excel but, if you do not mind a slightly stupid suggestion, what about trying the regression in the spreadsheet program gnumeric? I know this does not answer the R vs Excel question directly but it is another source of comparison and gnumeric algorithms in regression seem a bit more trustworthy than Excel's.

system · December 14, 2020, 5:34pm

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.