I am having trouble extracting certain values from my dataset so that I can apply a correlation to the resulting lists.
My dataset has the following format; shown here are only the values from the first location.
month | year | precipitation | temperature |
---|---|---|---|
Jan | 2014 | 149,6 | 26,434432 |
Feb | 2014 | 251 | 26,192469 |
Mar | 2014 | 161 | 26,379851 |
Apr | 2014 | 244,3 | 26,158357 |
May | 2014 | 760,7 | 25,848118 |
Jun | 2014 | 185,5 | 26,676528 |
Jul | 2014 | 44,3 | 26,655914 |
Aug | 2014 | 3,8 | 26,85959 |
Sep | 2014 | 0,5 | 27,271498 |
Oct | 2014 | 2,2 | 0 |
Nov | 2014 | 1,8 | 27,512732 |
Dec | 2014 | 29,2 | 27,666532 |
Jan | 2015 | 30,2 | 27,422312 |
Feb | 2015 | 90,7 | 27,034226 |
Mar | 2015 | 425,9 | 26,151944 |
Apr | 2015 | 366,5 | 26,217222 |
May | 2015 | 399,7 | 26,529839 |
Jun | 2015 | 106,7 | 26,482361 |
Jul | 2015 | 75,9 | 26,681855 |
Aug | 2015 | 0 | 26,952016 |
Sep | 2015 | 0 | 27,293333 |
Oct | 2015 | 0 | 27,618817 |
Nov | 2015 | 0,3 | 27,906528 |
Dec | 2015 | 11,3 | 28,453629 |
Jan | 2016 | 204,5 | 27,386962 |
Feb | 2016 | 140,8 | 27,195259 |
Mar | 2016 | 362,1 | 26,78414 |
Apr | 2016 | 327,4 | 27,35 |
May | 2016 | 264,9 | 27,212903 |
Jun | 2016 | 133,4 | 26,838599 |
Jul | 2016 | 52,1 | 26,644489 |
Aug | 2016 | 13,4 | 27,16413 |
Sep | 2016 | 1,8 | 27,425139 |
Oct | 2016 | 0 | 27,935887 |
Nov | 2016 | 0 | 28,212361 |
Dec | 2016 | 42,9 | 27,853091 |
Jan | 2017 | 380 | 26,873656 |
Feb | 2017 | 355,4 | 25,938839 |
Mar | 2017 | 442,8 | 25,816263 |
Apr | 2017 | 362,1 | 26,486806 |
May | 2017 | 329,9 | 26,72379 |
Jun | 2017 | 120,6 | 26,829647 |
Jul | 2017 | 201,4 | 26,430181 |
Aug | 2017 | 4,8 | 0 |
Sep | 2017 | 0 | 0 |
Oct | 2017 | 1,2 | 27,88125 |
Nov | 2017 | 0,6 | 27,604722 |
Dec | 2017 | 63,2 | 27,641264 |
Jan | 2018 | 253 | 26,639247 |
Feb | 2018 | 531 | 25,598512 |
Mar | 2018 | 251,4 | 26,650269 |
Apr | 2018 | 478,2 | 25,727778 |
May | 2018 | 374,2 | 25,767742 |
Jun | 2018 | 106,3 | 26,402639 |
Jul | 2018 | 75,4 | 26,162903 |
Aug | 2018 | 27,2 | 26,484677 |
Sep | 2018 | 11,6 | 27,076389 |
Oct | 2018 | 4,2 | 27,53293 |
Nov | 2018 | 8,5 | 27,963889 |
Dec | 2018 | 209,8 | 26,327107 |
Jan | 2019 | 327,9 | 0 |
Feb | 2019 | 521,6 | 0 |
Mar | 2019 | 818,2 | 25,942721 |
Apr | 2019 | 433,7 | 25,956945 |
May | 2019 | 226,7 | 26,466263 |
Jun | 2019 | 218,3 | 26,475417 |
Jul | 2019 | 107,4 | 26,397043 |
Aug | 2019 | 3,4 | 26,905645 |
Sep | 2019 | 4,9 | 27,213056 |
Oct | 2019 | 2,1 | 27,478419 |
Nov | 2019 | 16 | 27,710833 |
Dec | 2019 | 72,8 | 28,079839 |
Jan | 2020 | 556,7 | 26,296838 |
Feb | 2020 | 435,2 | 26,259339 |
Mar | 2020 | 657,8 | 25,947849 |
Apr | 2020 | 538,1 | 26,135972 |
May | 2020 | 236 | 26,918432 |
Jun | 2020 | 136,3 | 26,792917 |
Jul | 2020 | 54,5 | 26,629301 |
Aug | 2020 | 11,9 | 26,892473 |
Sep | 2020 | 0 | 27,460694 |
Oct | 2020 | 0 | 27,748118 |
Nov | 2020 | 35 | 27,759722 |
Dec | 2020 | 46,9 | 27,760081 |
The dataset covers 123 different locations; each location has monthly climate data from January 2014 to December 2020, giving 10,332 rows in total (123 locations × 84 months).
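In case it helps, here is a minimal sketch of how such a table could be loaded with pandas, assuming a semicolon-separated file with decimal commas and a column identifying the location; the file name and column names are hypothetical, since they do not appear in the post:

```python
import pandas as pd

# Loading sketch -- "climate.csv", the separator, and the column names
# (location, month, year, precipitation, temperature) are assumptions.
df = pd.read_csv(
    "climate.csv",
    sep=";",
    decimal=",",  # parse "149,6" as the float 149.6
)

# Build a real datetime so months sort chronologically, not alphabetically.
df["date"] = pd.to_datetime(
    df["month"] + " " + df["year"].astype(str), format="%b %Y"
)
df = df.sort_values(["location", "date"]).reset_index(drop=True)
```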
A while ago I posted this problem and got it partly solved, but I have not yet found a criterion that compiles the lists in the way I need.
What the code currently does is select a period for one location and generate the lists of weather-variable values for, say, the last six months at that location.
But I would like it to repeat this process for every location without including values from the previous location. One "solution" for making it run through all the neighborhoods, and not just one, was to extend the date criterion well beyond December 2020 so that it would cover all rows of the dataset, since the dates only mark where the variable values should be extracted from. The problem is that when the code generates the list for the second location, it takes the December 2020 value from the previous location in the dataset and puts it into the current location's list.
What I would like is for the list values to correspond only to the current location, never pulling in values from previous locations. I don't know whether setting the date criterion to very high values, just to work around processing only one neighborhood instead of all 123, is what causes this problem, since I haven't been able to build a loop or something similar. In other words, after December 2020 the process should restart, and the new list should take values only from the next location when forming the lists for the last six months.
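Assuming the data ends up in a DataFrame like the one sketched above, one way to guarantee that lists never bleed across locations is to let pandas group the rows by the location column before slicing, instead of relying on the date criterion alone; `location` is still a hypothetical column name:

```python
# Take the last six rows of every location independently; groupby
# guarantees that a slice can never cross a location boundary, so the
# December 2020 value of one location cannot leak into the next one.
last6 = df.groupby("location", sort=False).tail(6)

# One list per location and per weather variable, ready for correlation.
precip_lists = last6.groupby("location")["precipitation"].apply(list)
temp_lists = last6.groupby("location")["temperature"].apply(list)
```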
Could someone give me a hand? Thank you!