I think my question can be simplified: I think this is a loop question. Can anyone direct me to a good summary of this?

Thanks

Hi, I have a dataset with three columns: stations, Year, and avg.

There are 19 stations, and five to six years per Year, and a value for each year.

I am trying to run the Mann Kendall trend test on each of the 19 stations, using the rkt function.

I starting writing the code, copying, then changing the name of the station 19 times. I am sure there is a more efficient way to do this with loop, or lappy? Could someone help me on this? I am still learning this R!

Thank you.

My code: ("Year" has to be numeric for this to run)

library(rkt)
######
tn_trend1 <-rkt(date=alachua.tn.wide$Year,
y=alachua.tn.wide$BLUCOPK)
#> Error in integer(n): object 'alachua.tn.wide' not found
tn_trend1
#> Error in eval(expr, envir, enclos): object 'tn_trend1' not found

^{Created on 2021-05-03 by the reprex package (v1.0.0)}

My data:--just the first two stations, of five years

library(tidyverse)
library(rkt)
#The data
myData = data.frame(
station = c("ALACHCHAN","ALACHCHAN",
"ALACHCHAN","ALACHCHAN","ALACHCHAN","BLUCOPK","BLUCOPK",
"BLUCOPK","BLUCOPK","BLUCOPK"),
year = c(2014L,2015L,2016L,2017L,
2018L,2014L,2015L,2016L,2017L,2018L),
value = c(0.925, 1.97, 1.5, 2.3, 1.15, 1.5, 1.5, 1.7, 1.95, 1.33)
)
#Run for ONE station
testData = myData %>% filter(station == "ALACHCHAN")
rkt(testData$year, testData$value)
#>
#> Standard model
#> Tau = 0.2
#> Score = 2
#> var(Score) = 16.66667
#> 2-sided p-value = 0.8064959
#> Theil-Sen's (MK) or seasonal/regional Kendall (SKT/RKT) slope= 0.110625
#Run for ALL stations
result = map_df(unique(myData$station), function(myStation){
#Filter data
testData = myData %>% filter(station == myStation)
#Run test
testData = rkt(testData$year, testData$value)
#Transform result into data frame
unlist(testData) %>% t() %>% as.data.frame() %>%
add_column(station = myStation, .before = T)
})
result
#> station sl S B varS sl.corrected varS.corrected partial.S
#> 1 ALACHCHAN 0.8064959 2 0.110625 16.66667 NA NA NA
#> 2 BLUCOPK 1.0000000 1 0.050000 15.66667 NA NA NA
#> partial.sl partial.varS partial.sl.corrected partial.varS.corrected tau
#> 1 NA NA NA NA 0.2
#> 2 NA NA NA NA 0.1

^{Created on 2021-05-13 by the reprex package (v2.0.0)}

@pieterjanvc I having a bit of an issue. Perhaps it is my dataframe. I used your code, but changed:
"myData" to "alachua.tn2"
"year" to "Year"
"value" to "avg"

Why do you have "L" after your year data? I changed my Year to numeric, since rkt requires this.

I am getting the same value for each station--some of my results below:

station

sl

S

B

varS

sl.corrected

varS.corrected

partial.S

partial.sl

partial.varS

partial.sl.corrected

partial.varS.corrected

tau

ALACHCHAN

0.806495941

2

0.111458333

16.66666667

NA

NA

NA

NA

NA

NA

NA

0.2

BLUCOPK

0.806495941

2

0.111458333

16.66666667

NA

NA

NA

NA

NA

NA

NA

0.2

BLUOTTER

0.806495941

2

0.111458333

16.66666667

NA

NA

NA

NA

NA

NA

NA

0.2

BOULSP

0.806495941

2

0.111458333

16.66666667

NA

NA

NA

NA

NA

NA

NA

0.2

CEL441

0.806495941

2

0.111458333

16.66666667

NA

NA

NA

NA

NA

NA

NA

0.2

GLENSP

0.806495941

2

0.111458333

16.66666667

NA

NA

NA

NA

NA

NA

NA

0.2

HOG30US

0.806495941

2

0.111458333

16.66666667

NA

NA

NA

NA

NA

NA

NA

0.2

And my code below:

#Run for ALL stations
result = map_df(unique(alachua.tn2$station), function(myStation){
#Filter data
testData2 = alachua.tn2 %>% filter(station == myStation)
#Run test
testData2 = rkt(testData$Year, testData$avg)
#Transform result into data frame
unlist(testData2) %>% t() %>% as.data.frame() %>%
add_column(station = myStation, .before = T)
})
#> Error in map_df(unique(alachua.tn2$station), function(myStation) {: could not find function "map_df"

^{Created on 2021-05-13 by the reprex package (v1.0.0)}

This is because you have not loaded Tidyverse. It contains the function map_df, plus the other dpyr functions I used. Just install it like this

install.packages("tidyverse") #run only once
library(tidyverse)

It's quite large package, but it contains a lot of amazing other packages that are extremely helpful. You can check it out here: https://tidyverse.tidyverse.org/

@pieterjanvc I actually use "tidyverse" quite alot and I had it loaded. I reloaded it, but got the same result of the same results for all stations. I am at a loss as to what happening. Could it be the data?

Do you have the issue when running my reprex example? If not, then the error lies in your conversion from my code to yours, if you do, there's something with your libraries that is messing up the code...

@pieterjanvc Your code worked great--output was different for each station. Any idea how I trouble shoot this (sorry, I am a relatively new R user, and struggling with this....) Thank you again for your help.

However, when I ran it again, using PURR, I am coming up with this error:

Error in filter(., station == myStation) : object 'station' not found >

result2
station sl S B varS sl.corrected varS.corrected partial.S partial.sl partial.varS partial.sl.corrected
1 ALACHCHAN NA NA NA NA NA NA NA NA NA NA
2 BLUCOPK NA NA NA NA NA NA NA NA NA NA
3 BLUOTTER NA NA NA NA NA NA NA NA NA NA
4 BOULSP NA NA NA NA NA NA NA NA NA NA
5 CEL441 NA NA NA NA NA NA NA NA NA NA
6 GLENSP NA NA NA NA NA NA NA NA NA NA
7 HOG30US NA NA NA NA

You're really confusing me now
Could you please create a real reprex following the guide so that at least I can work with the actual dataset. You are now just showing me random pieces of data and code which are not helping.