Determine the maximum value for each date from one .csv file, sum of all values for each date from a second .csv file, and combine both .csv files into a single data object

I have two .csv files, one containing precipitation data taken at 15 minute intervals between 2010 and 2022 (PrecipitationforPosit.csv), and another containing discharge data taken at 15 minute intervals between 2010 and 2022 (StageDischargeforPosit.csv). Both the precipitation and discharge date were taken between May 1 and October 31 of each year. I want to sum the precipitation data taken at 15-minute intervals so that I have a daily total precipitation for each date, find the daily maximum stream discharge for each date, and combine the data into a single datafile that contains the date, total daily precipitation depth, and daily maximum discharge.

I then want to add another column to this dataset for the Daily Change in Maximum Discharge, where Daily
Change in Maximum Discharge is the Daily Maximum Discharge from the previous day minus the Daily Maximum Discharge from the current day.

I then want to add another column to the dataset that shows only the daily increase in stream discharge, whereby all negative values in the Daily Change in Maximum Discharge are converted to zero.

The first file contains instantaneous stage (i.e. water level) and discharge data taken at a 15 minute interval from 2010-05-01 until 2022-10-31. This file is in the format Date, Time, Stage, Discharge as shown below:

Date,Time,Stage,Discharge,
2010-05-01,0:00:00,5.042,0.299,
2010-05-01,0:15:00,5.043,0.302,
2010-05-01,0:30:00,5.044,0.305,
2010-05-01,0:45:00,5.044,0.308,

The second file contains precipitation data taken at the same 15 minute interval from 2010-05-01 until 2022-10-31. This file is in the format Date,Time,Precipitation as shown below:

Date,Time,Precipitation
2010-01-01,00:00:00,0
2010-01-01,00:15:00,0
2010-01-01,00:30:00,0
2010-01-01,00:45:00,0
2010-01-01,01:00:00,0
2010-01-01,01:15:00,0

I want to first find the maximum discharge for each date from the "StageDischargeforPosit.csv" file, and the total precipitation depth for each date from the "Precipitation2010to2022.csv" file. I then want to create a new data object with the "Date", and "Daily Precipitation Depth", and "Maximum Discharge", such as what is as shown below:

Date,Daily Precipitation Depth,Daily Maximum Discharge
2010-05-01,0.0,0.308
2010-05-02,0.0,0.215
2010-05-03,8.4,0.413

I then want to add a column to the above dataset for the change in maximum discharge, such as what is shown below

Date,Daily Precipitation Depth,Daily Maximum Discharge, Change in Daily Maximum Discharge
2010-05-01,0.0,0.308, 0.00
2010-05-02,0.0,0.215, -0.093
2010-05-03,8.4,0.413, 0.198

Note that the daily precipitation depths and maximum discharge values are for illustration purposes only. I have included files of the precipitation and discharge data that I am trying to combine. Does anyone have any advice?

Here are the links to the data

This should be close. I didn't test it, so there may be an error or two.

library(dplyr)

Disch <- read.csv("StageDischargeforPosit.csv") 
Precip <- read.csv("Precipitation2010to2022.csv")
MaxDisch <- Disch |> group_by(Date) |> 
  summarize(Max_Disch = max(Discharge, na.rm = TRUE))
TotPrecip <- Precip |> group_by(Date) |> 
  summarize(Total_Precip = sum(Precipitation, na.rm = TRUE))

Disch_Precip <- inner_join(MaxDisch, TotPrecip, by = "Date")
Disch_Precip <- Disch_Precip |> 
  mutate(LagDisch = lag(Max_Disch),
          ChangeDisch = Max_Disch - LagDisch)

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.