GAM (Generalized Additive Model) Regression

There are three things that bother me here, though I admit that I could have missed something.

  1. I do not see how increased rainfall could result in reduced discharge. I add more water to the system and less water comes out?
  2. I have trouble seeing that increasing rainfall results in an exponential growth in discharge. Maybe small rain events are more local, and large rain events tend to cover most of the drainage basin?
  3. The discharge is a function of the water holding capacity of the soil. So a large rain event that follows a drought could be held in the soil, but if it followed another large rain event it would mostly run off. There should be some penalty to discharge based on soil water holding capacity and how close the soil is to capacity at any point in time. You could guess at values based on literature sources. Essentially, the soil can hold "this much," evapotranspiration will cause the soil to lose water at "this rate," and the soil can absorb water "this fast." Given a rain event of "this size," this portion will be held in the soil.
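The soil-storage idea in point 3 can be sketched as a single "bucket." This is only an illustration under stated assumptions: the function name `bucket_runoff` and the parameters `capacity`, `et_loss`, and `max_infiltration` are hypothetical, and the numbers are placeholders rather than literature values.

```python
def bucket_runoff(rain_events, capacity, et_loss, max_infiltration,
                  initial_storage=0.0):
    """Return a list of (runoff, storage) pairs, one per rain event.

    All arguments are hypothetical placeholders in the same depth units
    (e.g. mm); et_loss is the water lost between consecutive events.
    """
    storage = initial_storage
    results = []
    for rain in rain_events:
        # Evapotranspiration dries the soil before the next event arrives.
        storage = max(0.0, storage - et_loss)
        # The soil absorbs no more than the rain that fell, its infiltration
        # limit, and the room left before it reaches capacity.
        absorbed = min(rain, max_infiltration, capacity - storage)
        storage += absorbed
        # Whatever is not absorbed leaves as runoff.
        results.append((rain - absorbed, storage))
    return results


# Two identical events: the first falls on dry soil, the second on wet soil.
print(bucket_runoff([20.0, 20.0], capacity=30.0, et_loss=2.0,
                    max_infiltration=25.0))
# prints [(0.0, 20.0), (8.0, 30.0)]
```

The same 20-unit event produces no runoff on dry soil and 8 units of runoff once the soil is nearly full, which is the "penalty" described in point 3.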

In any case, I suspect that some of the weirdness in these data is an artifact of using a simple model, as well as a general decrease in the quantity of data as rainfall increases. The latter effect was well discussed earlier in other posts.

Hi Bugs, I agree that those issues bother me as well. I have included a response to your points below.

  1. As shown in the figure below, more rainfall did not result in reduced discharge. In both the pre- ("m_pre", blue line) and post- ("m_post") models, the square root of the increase in streamflow (y-axis) increases as the square root of precipitation (x-axis) increases. The rate of increase in stream discharge, however, plateaus between root 3.5 and root 4.5 in the m_pre scenario and between 4 and 5 in the m_post scenario. I am not sure why it plateaus like this; I am assuming it is a result of noise in the data. Other than this plateau, there is a continuous increase in discharge as precipitation increases.

  2. I am also unsure why increasing rainfall results in exponential growth in discharge. I was expecting the models to be closer to a straight line. I think the GAM has added extra wiggles to the fitted curves, but I am not sure why. I tried root-transforming the data to eliminate some of the wiggles, but it still looks more like an exponential trend to me. Maybe someone could provide some insight here. I like how the models were fitted in the previous plots (before I combined both curves on the same plot), as those curves appeared to better represent the expected trend. I am not sure why I am getting added wiggles when I plot both models on the same graph.

  3. I agree that discharge is a function of the water holding capacity of the soil; however, I do not have those data available. That is a limitation that I intend to note in the discussion of my results. This is definitely an oversimplified model, but I first wanted to see whether there is an overall trend in the relationship between precipitation and the increase in stream discharge, and then cover the limitations of these models in the discussion.

My main concerns right now are:

  1. Why the confidence intervals around my regression curves are so narrow. Given the limitations in my dataset, I would expect fairly wide confidence intervals.
  2. Why I have more wiggles in my curves when the curves are plotted together than when they are plotted separately (as shown in the black and white plots, i.e., figures 1 and 2, above).
  3. How to determine the slope or regression equation of each model in order to compare the overall slope of each model.

I appreciate any feedback that anyone may have.


If anyone has any advice I would greatly appreciate it.

Why? The confidence intervals represent uncertainty in the mean or expected value as a function of precipitation. Are you thinking of prediction intervals, which include the variance of the data too?
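To make the distinction concrete, here is a minimal sketch for the simplest possible case: a constant mean with a normal approximation (z = 1.96). It is not the poster's GAM; the function name `mean_intervals` is hypothetical, and the only point is how the two half-widths scale with the number of observations.

```python
import math
import statistics


def mean_intervals(values, z=1.96):
    """Return (mean, ci_half_width, pi_half_width) for a constant-mean model.

    Assumes roughly normal data; z = 1.96 gives approximate 95% intervals.
    """
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    # Confidence interval: uncertainty in the estimated mean only.
    ci_half = z * sd / math.sqrt(n)
    # Prediction interval: uncertainty in the mean plus the spread of a
    # single new observation around that mean.
    pi_half = z * sd * math.sqrt(1.0 + 1.0 / n)
    return mean, ci_half, pi_half


# Made-up values, purely for illustration.
values = [float(i % 7) for i in range(100)]
mean, ci_half, pi_half = mean_intervals(values)
print(ci_half, pi_half)
```

In this setting the prediction interval is wider than the confidence interval by a factor of sqrt(n + 1), so a narrow confidence band can sit inside a widely scattered cloud of points.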

Which Figures 1 and 2? You've submitted posts with several Figures 1 and 2.

How are you defining the overall slope of each model? Without such a definition it is hard to comment more.
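As an illustration of why the definition matters, the sketch below computes two candidate "overall slopes" for a curved function. Both definitions and the function names are assumptions made for this example, and `curve` is a stand-in for a fitted smooth, not one of the models in this thread.

```python
def secant_slope(curve, x_min, x_max):
    """Slope of the straight line joining the two ends of the curve."""
    return (curve(x_max) - curve(x_min)) / (x_max - x_min)


def least_squares_slope(curve, x_min, x_max, n_points=101):
    """Slope of the least-squares line through the curve on an even grid."""
    step = (x_max - x_min) / (n_points - 1)
    xs = [x_min + i * step for i in range(n_points)]
    ys = [curve(x) for x in xs]
    x_bar = sum(xs) / n_points
    y_bar = sum(ys) / n_points
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def curve(x):
    # Hypothetical nonlinear curve standing in for a fitted smooth.
    return x ** 3


print(secant_slope(curve, 0.0, 2.0))
print(least_squares_slope(curve, 0.0, 2.0))
```

For a straight line the two numbers agree, but for a curved fit they differ, so any comparison between models depends on which definition is chosen.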

Part of the problem of the wiggles (that as rainfall increases, discharge plateaus or even declines in some models/smooths in some posts) is due, I think, to the fact that your models assume homogeneity in the data, i.e. that the observations all have the same variance. This seems unlikely given the plots you've shown here; the variance is very small at low "increase in discharge" and increases considerably at high "increase in discharge". Because you told the model that the data have the same variance, the very different variances in the data are likely being seen as changes in the mean, hence the extra wiggles.
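One rough way to check for non-constant variance is to compare the spread of the residuals across bins of the predictor. The sketch below uses simulated data whose noise grows with x, not the data from this thread; the function name `variance_by_bin` is hypothetical. In practice the residuals would come from the fitted model rather than from a known true mean.

```python
import random
import statistics


def variance_by_bin(x, residuals, n_bins=4):
    """Order observations by x, split them into equal-count bins, and
    return the variance of the residuals in each bin (lowest x first)."""
    pairs = sorted(zip(x, residuals))
    size = len(pairs) // n_bins
    variances = []
    for b in range(n_bins):
        start = b * size
        # The last bin takes any leftover observations.
        end = (b + 1) * size if b < n_bins - 1 else len(pairs)
        variances.append(statistics.variance([r for _, r in pairs[start:end]]))
    return variances


# Simulated example: the noise standard deviation increases with x.
rng = random.Random(1)
x = [rng.uniform(0.0, 5.0) for _ in range(400)]
y = [2.0 * xi + rng.gauss(0.0, 0.2 + 0.8 * xi) for xi in x]
# Residuals around the true mean, which is known only because the data
# are simulated.
residuals = [yi - 2.0 * xi for xi, yi in zip(x, y)]
print(variance_by_bin(x, residuals))
```

If the variances climb steadily from the first bin to the last, the constant-variance assumption is doubtful.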

If you want to continue modelling this "increased discharge" (I remain unconvinced that this is a good thing) then you need to address this problem.
