How to find the max value of a column for changing set of rows and get the corresponding value of another column?

I am having the stock price data of IBM.
Code:
library(tseries)
IBM<-tail(get.hist.quote("IBM"),250)

I inserted a new column containing the Serial No. using
Code:
IBM$Serial <- 1:nrow(IBM)
[1] "Open" "High" "Low" "Close" "Serial"

Now, I would like to create a new column named "Max_Vals" and get from the 100th row the corresponding column - Serial No. Value when Close was max.

It's not entirely clear to me, what you are trying to accomplish. It is much easier to get help, if you supply a reproducible example also please look into code formatting using backticks.

Perhaps you're looking for something like this:

library("tidyverse")
set.seed(182635)
d = tibble(Open  = rnorm(n = 250, mean = 100),
           High  = rnorm(n = 250, mean = 100),
           Low   = rnorm(n = 250, mean = 100),
           Close = rnorm(n = 250, mean = 100))
d = d %>% mutate(Serial = 1:nrow(.))
d
# A tibble: 250 x 5
    Open  High   Low Close Serial
   <dbl> <dbl> <dbl> <dbl>  <int>
 1 101.   99.3 100.  100.       1
 2 101.  102.   98.4  98.2      2
 3 101.  100.  100.  101.       3
 4 101.  102.  103.   99.9      4
 5  98.2 102.  101.   98.6      5
 6 101.  100.   98.2 102.       6
 7  98.4 100.   99.4 101.       7
 8 102.   98.5 101.   99.2      8
 9  99.4  99.7 101.  101.       9
10 101.   99.8 100.0 101.      10
# ... with 240 more rows
d %>% summarise(Max_Vals = max(Close), Serial = which.max(Close))
# A tibble: 1 x 2
  Max_Vals Serial
     <dbl>  <int>
1     103.    149

More information on how to get help, can be found here :slightly_smiling_face:

Thank you for your quick reply,
My actual requirement is:
Suppose there are 1000 rows of IBM Historical Stock Price Data, I want to create a new column(starting from the 100th row) named "Max_Val_Serial_No" which gives the Serial No.Value of the Max closing price for the previous 100 days for the entire column.

Note: Every row in the new column should take its previous 100 days closing price for calculating the max closing price and give its corresponding serial no. in the new column.

Like so?

# Load libraries
library("tidyverse")

# Set seed for reproducibility
set.seed(182635)

# Define function for looking back 'size' positions in vector
which_max_in_window = function(x, size){
  out = vector(mode = "numeric", length = length(x))
  win = seq(1, size)
  out[win] = which.max(x[win])
  for( i in seq(size + 1, length(x)) ){
    first  = i - size + 1
    last   = i
    win    = seq(first, last)
    out[i] = which.max(x[win]) + first - 1
  }
  return(out)
}

# Generate dummy data
n = 1000
d = tibble(Open  = rnorm(n = n, mean = 100),
           High  = rnorm(n = n, mean = 100),
           Low   = rnorm(n = n, mean = 100),
           Close = rnorm(n = n, mean = 100))

# Create new variable with max indices for previous 100 elements of Close vector
d = d %>% mutate(Serial = seq(1, nrow(.)),
                 Max_Val_Serial_No = which_max_in_window(x = Close, size = 100))

# Get tibble of top closing values in window
d_max = tibble(x = d %>% select(Max_Val_Serial_No) %>% distinct %>% pull,
               y = d %>% select(Close) %>% slice(x) %>% pull)

# Visualise
d %>%
  ggplot(aes(x = Serial, y = Close)) +
  geom_line() +
  geom_point(data = d_max, aes(x = x, y = y), inherit.aes = FALSE, pch = 'X',
             size = 5, colour = "tomato") +
  scale_x_continuous(breaks = seq(0, nrow(d), by = 100)) +
  guides(colour = FALSE) +
  theme_bw() +
  ggtitle("Top Close Values Within Sliding Windows of size 100")

Please do run some tests, to see if you get the expected output - No guarantees given :wink:

3 Likes

Sorry, I am getting an error.
'''
Code:
library(tseries)
IBM<-tail(get.hist.quote("IBM"),250)
IBM<-tail(get.hist.quote("IBM"),1000)
library(tidyverse)
which_max_in_window = function(x, size){

  • out = vector(mode = "numeric", length = length(x))
    
  • win = seq(1, size)
    
  • out[win] = which.max(x[win])
    
  • for( i in seq(size + 1, length(x)) ){
    
  •     first  = i - size + 1
    
  •     last   = i
    
  •     win    = seq(first, last)
    
  •     out[i] = which.max(x[win]) + first - 1
    
  • }
    
  • return(out)
    
  • }
    IBM = IBM %>% mutate(Serial = seq(1, nrow(.)),
  •              Max_Val_Serial_No = which_max_in_window(x = Close, size = 100))
    

Error in UseMethod("mutate_") :
no applicable method for 'mutate_' applied to an object of class "zoo"
'''

Please insert your code using the following formatting (except for #), it makes it a lot easier to read:

#```{r}
#```

The dummy data we created in my example has the following class and structure:

class(d)
[1] "tbl_df"     "tbl"        "data.frame"
str(d)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':	1000 obs. of  4 variables:
 $ Open : num  101.1 101.3 101 100.9 98.2 ...
 $ High : num  102.7 99.2 101.3 101.3 99.6 ...
 $ Low  : num  99.7 98.8 100.8 99.3 101 ...
 $ Close: num  99.6 97.4 99.6 98.8 100.9 ...

The error message is related to the zoo class object you have. There is a stackoverflow discussion here on that particular error. In other words for the function I wrote to work, you will have to convert your zoo object to a data frame.

1 Like

Sorry for the formatting error. Thanks a lot for your timely help, Sir.

How to create a multiple Line graphs below each other ?
1st Graph:
x axis - Time Period
y axis - Closing Price
2nd Graph:
x axis - Serial
y axis - Max_Value_Serial_No and Closing Price

Briefly:

library("tidyverse")
tibble(my_x_var, my_y_var) %>%
  ggplot(aes(x = my_x_var, y = my_y_var)) +
  geom_line() + 
  theme_bw()

Garrett Grolemund and Hadley Wickham has written this really good book on R for Data Science.

Based on your question reg. x/y scatter plots, I would highly recommend, that you to go through this specific section on data visualisation using ggplot

Try out a few things and then come back here with a suggestion of your own for the plots and we will help you on your way :slightly_smiling_face:

1 Like