I am pretty new to RStudio, so I am having trouble figuring out how to determine how long each of my field plots has been submerged in water (that is, the water level reaches the elevation of the plot). I have imported my data (as Excel files): one lists my plots and their average elevation (ft), and the other shows a list of dates and the water level on each day (in ft).
I am not sure what package I would need to use and how to write the code to figure this out. If anyone can help me out with this I would really appreciate it!! Thanks in advance!
Welcome to the forum.
I think the first thing we need is some sample data from both files. A handy way to supply sample data is to use the dput() function. See ?dput. If you have a very large data set then something like dput(head(myfile, 100)) will likely supply enough data for us to work with.
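As a minimal sketch of what that looks like, assuming a small made-up data frame (the name `myfile` and its columns are placeholders, not the poster's real data): `head()` is applied first, so `dput()` only prints the first rows, and the printed text can be pasted straight into a reply.

```r
# Hypothetical stand-in for one of the imported data frames
myfile <- data.frame(
  plot = c("A", "B", "C"),
  elevation_ft = c(12.5, 14.0, 9.8)
)

# Take the first rows, then print them as R code that recreates the object
sample_rows <- head(myfile, 2)
dput(sample_rows)
```

Anyone who copies the printed `structure(...)` call into their own session gets an identical copy of those rows.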
For general information on how to ask questions, data and code formatting and so on, please have a look at FAQ: How to do a minimal reproducible example (reprex) for beginners
@jrkrideau is right—a little data will attract more specific answers.
Here's a framework to approach R.
Every R problem can be thought of with advantage as the interaction of three objects: an existing object, `x`, a desired object, `y`, and a function, `f`, that will return a value of `y` given `x` as an argument. In other words, school algebra: f(x) = y. Any of the objects can be composites.
Here, you have two Excel files to start. That's `x`. `y` is an object (everything in R is an object) that contains two variables, `plot` and `days_flooded`.
`f` is composed of several functions.
You can use `readxl::read_excel` (or `readr::read_csv`, if you export the sheets to csv) to bring `x` into two data frames, `DF1` and `DF2`. `DF1` has two variables, `plot` and `base_elevation`, and `DF2` has `plot`, `elevation` and `date_measured`. (Or whatever you want to name the objects.)
The two data frames need to be combined, which can be done using one of the join functions in {dplyr}, yielding `plot`, `base_elevation`, `elevation` and `date_measured`. For each `date_measured`, create a new variable based on whether `elevation` > `base_elevation`. Then it's a matter of doing the date arithmetic on `date_measured`.
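The steps above can be sketched roughly as follows. This is only an illustration under several assumptions: the values are made up, the water-level table has no plot column (as the question describes it), so every plot is paired with every date via `dplyr::cross_join()` (available in dplyr 1.1.0 and later), there is one reading per day, and `>=` is used so that a level that just reaches the plot elevation counts as flooded.

```r
library(dplyr)

# Hypothetical stand-ins for the two imported files
DF1 <- data.frame(
  plot = c("A", "B"),
  base_elevation = c(10, 20)
)
DF2 <- data.frame(
  date_measured = as.Date("2021-01-01") + 0:3,
  elevation = c(5, 12, 25, 18)   # water level on each day
)

# Pair every plot with every date, flag flooded days, then count them per plot
y <- cross_join(DF1, DF2) %>%
  mutate(flooded = elevation >= base_elevation) %>%
  group_by(plot) %>%
  summarise(days_flooded = sum(flooded))

y
```

With these made-up numbers, plot A is flooded on three of the four days and plot B on one. Counting rows equals counting days only because each row is assumed to be one daily reading.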
Hi there,
I agree with @jrkrideau that a sample of the data would be nice to work with. However, since this is your first time posting and you have given a clear description, I have come up with some sample data to show one way you could approach this.
I have provided a less theoretical approach than @technocrat though this might not be exactly what you need of course.
```r
library(tidyverse)
set.seed(3) # Just for reproducibility

# Sample plots with random elevations
plots = data.frame(
  id = LETTERS[1:5],
  elevation = sample(10:100, 5)
)
plots
#>   id elevation
#> 1  A        14
#> 2  B        67
#> 3  C        21
#> 4  D        45
#> 5  E        99

# Sample water levels, one reading per month
water = data.frame(
  date = seq(as.Date("2021-1-1"), as.Date("2021-12-31"), by = "months"),
  level = sample(10:75, 12)
)
water
#>          date level
#> 1  2021-01-01    17
#> 2  2021-02-01    29
#> 3  2021-03-01    19
#> 4  2021-04-01    64
#> 5  2021-05-01    49
#> 6  2021-06-01    57
#> 7  2021-07-01    71
#> 8  2021-08-01    75
#> 9  2021-09-01    46
#> 10 2021-10-01    11
#> 11 2021-11-01    38
#> 12 2021-12-01    53

# For every plot, flag each date on which the water level reaches the plot elevation
data = map_df(plots$id, function(x){
  water %>% mutate(
    inun = level >= plots$elevation[plots$id == x],
    id = x)
})
data
#>          date level  inun id
#> 1  2021-01-01    17  TRUE  A
#> 2  2021-02-01    29  TRUE  A
#> 3  2021-03-01    19  TRUE  A
#> 4  2021-04-01    64  TRUE  A
#> 5  2021-05-01    49  TRUE  A
#> 6  2021-06-01    57  TRUE  A
#> 7  2021-07-01    71  TRUE  A
#> 8  2021-08-01    75  TRUE  A
#> 9  2021-09-01    46  TRUE  A
#> 10 2021-10-01    11 FALSE  A

# Count the inundated readings per plot and express them as a fraction of all readings
data %>% group_by(id) %>%
  summarise(inun = sum(inun), total = n()) %>%
  mutate(perc = inun / total)
#> # A tibble: 5 x 4
#>   id     inun total  perc
#>   <chr> <int> <int> <dbl>
#> 1 A        11    12 0.917
#> 2 B         2    12 0.167
#> 3 C         9    12 0.75
#> 4 D         7    12 0.583
#> 5 E         0    12 0
```
Created on 2022-01-27 by the reprex package (v2.0.1)
I made use of the Tidyverse packages dplyr and purrr (map_df). If you're not familiar with Tidyverse, please check this out as it is very handy.
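One caveat: the summary above counts readings, which equals days only when there is one reading per day. If the readings are spaced unevenly (monthly, as in the sample data), a rough way to turn them into days is to let each reading stand for the period until the next one. The sketch below does this for a single plot; the elevation of 20 ft and the four water levels are made up for illustration, and treating the level as constant between readings is an assumption.

```r
library(dplyr)

# Hypothetical readings, one per month
water <- data.frame(
  date = as.Date(c("2021-01-01", "2021-02-01", "2021-03-01", "2021-04-01")),
  level = c(17, 29, 19, 64)
)
plot_elevation <- 20  # hypothetical elevation of one plot, in ft

result <- water %>%
  arrange(date) %>%
  mutate(
    # Days from this reading to the next one (NA for the last reading)
    days_covered = as.numeric(lead(date) - date),
    inun = level >= plot_elevation
  ) %>%
  # Sum the days covered by inundated readings; the last reading has no
  # following date, so it contributes nothing
  summarise(days_inundated = sum(days_covered[inun], na.rm = TRUE))

result
```

Here only the February reading is both inundated and followed by another reading, so the estimate is the 28 days from 1 February to 1 March.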
Hope this helps,
PJ
Thank you, @pieterjanvc! I think this is just what I was hoping to do. I'll look into the Tidyverse packages to make sure this is what I need.