Code for increase/decrease by a value using lag and lead function

Hi everyone!
I would like to automatically filter out the values which are greater or less than 10% of their previous values. Is it possible to code in RStudio? Do you have any idea?

I have 30-min data for the whole day. I would like to filter the values in every next column by the values +/- 10% of those values in every previous column. In the morning, the values will increase according the values in the previous columns + 10% of those values, but in the afternoon, the values will decrease by - 10%. Something like that. Let me show the excel calculation as follow:

6:00 6:30 7:00 7:30 8:00 8:30 9:00 9:30 10:00 10:30 11:00 11:30 12:00 12:30 13:00 13:30 14:00 14:30 15:00 15:30 16:00 16:30 17:00 17:30 18:00
10.00 11.00 12.10 13.31 14.64 16.11 17.72 19.49 21.44 23.58 25.94 11.00 12.10 10.89 9.80 8.82 7.94 7.14 6.43 5.79 5.21 4.69 4.22 3.80 3.42

I want to code like this in RStudio.

Thanks in advance!

May I get some help from you all @williaml, @nirgrahamuk, and @AlexisW. Thanks in advance!

Here is something reproducible to start with anyway:

df <- tibble::tribble(
  ~`6:00`, ~`6:30`, ~`7:00`, ~`7:30`, ~`8:00`, ~`8:30`, ~`9:00`, ~`9:30`, ~`10:00`, ~`10:30`, ~`11:00`, ~`11:30`, ~`12:00`, ~`12:30`, ~`13:00`, ~`13:30`, ~`14:00`, ~`14:30`, ~`15:00`, ~`15:30`, ~`16:00`, ~`16:30`, ~`17:00`, ~`17:30`, ~`18:00`,
      10L,     11L,    12.1,   13.31,   14.64,   16.11,   17.72,   19.49,    21.44,    23.58,    25.94,      11L,     12.1,    10.89,      9.8,     8.82,     7.94,     7.14,     6.43,     5.79,     5.21,     4.69,     4.22,      3.8,     3.42
  )

Hmm, not like that! Actually, I already have dataset. But I want to automatically filter out the values which are less/greater than 10% of the previous values for each column. Excel is just example.

Yeah, I know, it is reproducible data to start with. It will run in R. Your initial post won't.

Anyway, what have you tried, can you provide a reproducible example of what you have done? Can you better explain what your expected output should be with an example?

Dear @williaml
The following is the reproducible example of my data. I will like to filter out the values by the values in the previous cell +/- 10% of the value in the previous cell.

Row No. Col1 Col2 Col3 Col4 Col5 Col6 Col7 Col8 Col9 Col10 Col11 Col12 Col13 Col14 Col15 Col16 Col17 Col18
1 10.34666667 10.41666667 10.43666667 10.39333333 10.57333333 10.46333333 10.79333333 11.28 11.19333333 11.19 11.32666667 11.07666667 11.16666667 10.83666667 10.50333333 10.53 10.46 10.83333333
2 10.21333333 10.45333333 10.29666667 10.34333333 10.63333333 10.39 10.62666667 10.89333333 11.00333333 11.16 11.04666667 11.09333333 11.24666667 10.62666667 10.43333333 10.45 10.54 10.80666667
3 10.22666667 10.31333333 10.31666667 10.18333333 10.35666667 10.22666667 10.46666667 10.8 10.77666667 11.12666667 10.99666667 11.23666667 11.17333333 10.51 10.51666667 10.35666667 10.71333333 10.49
4 10.2 10.2 10.28666667 10.28666667 10.35333333 10.06666667 10.35 10.63666667 10.74333333 10.87666667 10.78666667 11.07333333 10.94 10.32666667 10.23333333 10.24 10.59 10.33
5 10.21666667 10.24 9.976666667 10.02333333 10.39333333 9.866666667 10.11 10.65 10.69666667 10.93666667 10.89333333 11.08666667 10.93666667 10.43333333 10.36333333 10.19333333 10.39333333 10.52333333
6 10.01333333 10.03666667 9.886666667 10.1 10.20666667 9.776666667 10.12 10.44666667 10.58 10.82 10.67 10.77666667 10.66666667 10.23333333 9.963333333 10.12333333 10.16333333 10.51333333

Have a read on what is a reproducible example. It will make it easier for people to help you.

I have 30-min data for the whole day. I would like to filter the values in every next column by the values +/- 10% of those values in every previous column. In the morning, the values will increase according the values in the previous columns + 10% of those values, but in the afternoon, the values will decrease by - 10%.

b <- c(10.35,10.42,10.44,10.39,10.57,10.46,10.79,11.28,11.19,11.19, 11.33,11.08,11.17,10.84,10.50,10.53,10.46,10.83) d <- c(10.21,10.45,10.30,10.34,10.63,10.39,10.63,10.89,11.00, 11.16,11.05,11.09,11.25,10.63,10.43,10.45,10.54,10.81) DF <- rbind(b,d) colnames(DF) <- c("06:00","06:30","07:00","07:30","08:00","08:30", "09:00","09:30","10:00","10:30","11:00","11:30", "12:00","12:30","13:00","13:30","14:00","14:30") View(DF)

How can I code in RStudio?

Hi @nchan08,
Credit to @williaml and @nirgrahamuk for getting you to post some reproducible data.
The data is in the "wrong" format for R - it needs to be in the columns of a dataframe so that the required calculations can easily be made.
From your description, I think that you want to calculate lagged differences from each of the b and d columns, identify any rows where the differences are > 10%, and then filter the rows.
Try this code:

library(tidyverse)
library(lubridate)

# Edit some of the data to make bigger differences
b <- c(10.35,9.0,10.44,12.0,10.57,10.46,10.79,11.28,11.19,11.19,
       11.33,11.08,11.17,10.84,10.50,10.53,10.46,10.83)

d <- c(10.21,10.45,10.30,10.34,10.63,12.0,9.0,10.89,11.00,
       11.16,11.05,11.09,11.25,10.63,10.43,10.45,10.54,10.81)

ee <- c("06:00","06:30","07:00","07:30","08:00","08:30",
        "09:00","09:30","10:00","10:30","11:00","11:30",
        "12:00","12:30","13:00","13:30","14:00","14:30")

tt <- as_datetime(paste(today(), ee, ":00"))

DF <- data.frame(tt=tt, b=b, d=d)

# Calculate lagged differences in columns b and d
# Record if they are > +- 10%
DF %>% 
  mutate(lagb = lag(b),
         lagd = lag(d),
         b_diff = (b-lagb)/b,
         d_diff = (d-lagd)/d,
         b_omit = if_else(abs(b_diff) > 0.1, "Bad", "Good"),
         d_omit = if_else(abs(d_diff) > 0.1, "Bad", "Good")) -> out_df

out_df

out_df %>%  
  filter(b_omit == "Good") %>% 
  filter(d_omit == "Good")

HTH

1 Like

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.