Using for loop to plot data

wilson · October 16, 2018, 10:33pm

Hello, I'm very new to using RStudio. For a school assignment, we have to scrape market data on cryptocurrencies and then use a for loop to plot the mean and standard deviation of daily returns. I'm not sure how to begin writing the code for this. I've attached the code I have so far; it plots only one point, but I need to plot the 10 coins with the highest mean returns. Any help would be greatly appreciated.

library(crypto)
library(ggplot2)
library(dplyr)
crypto_risk_reward_tradeoff <- data.frame(symbol = character(),
                                          mean_return = numeric(),
                                          sd_return = numeric(),
                                          stringsAsFactors=FALSE)
crypto.list <- crypto_list()
coin <- crypto.list[10, 1]
coin_charts <- daily_market(coin)
coin_charts$timestamp <- as.Date(coin_charts$timestamp)
coin2USD <- coin_charts %>%
  group_by(timestamp) %>%
  summarize(Avg_priceUSD = mean(price_usd)) %>%
  arrange(timestamp)
coin2USD <- as.data.frame(coin2USD[, 2])
N <- nrow(coin2USD)
todays_price <- coin2USD[2:N, 1]
# Parentheses matter here: 1:N-1 parses as (1:N) - 1, i.e. 0:(N - 1)
yesterdays_price <- coin2USD[1:(N - 1), 1]
coin2USD_dailyReturn <- (todays_price - yesterdays_price)/yesterdays_price
crypto_risk_reward_tradeoff[1, ] <- list(coin, mean(coin2USD_dailyReturn),
                                         sd(coin2USD_dailyReturn))
ggplot(crypto_risk_reward_tradeoff,
       aes(x = mean_return, y = sd_return, label = symbol)) +
  geom_point() +
  geom_text(aes(label = symbol), hjust = 0.5, vjust = 1.5) +
  labs(x = "Mean daily return", y = "Standard deviation of daily return") +
  ggtitle("Cryptocurrency risk-reward tradeoff") +
  theme(plot.title = element_text(hjust = 0.5))

cderv · October 17, 2018, 6:29am

Hi,
This community has a policy regarding homework. Please be sure to comply with it when asking questions here.

hint on your code

There could be something here. Look for how to subset list and data.frame.

jcblum · October 17, 2018, 6:49am

I think it might help if you can explain more about the plot (or plots?) you are trying to produce. What do you expect it (them?) to look like? (To comply with our homework policy, it's important to do this in your own words, without copy-pasting verbatim text from your actual assignment).

You're using ggplot2 to make your plot here, but you say your assignment specifically requires you to use a for loop. In the ggplot2 graphics system, a for loop is only going to make sense if you're making multiple plots. However, in the R base graphics system, points can be iteratively added to a single plot using a for loop. Have you been using ggplot2 exclusively so far, or is there a chance that the assignment is meant to be completed using base graphics?
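To illustrate what "iteratively adding points" means in base graphics, here is a minimal sketch. The `assets` data frame and its values are made up for illustration and have nothing to do with real market data; the idea is just to draw an empty plot first and then let the loop add one point per pass.

```r
# Made-up summary values for three hypothetical assets (illustration only)
assets <- data.frame(
  name = c("A", "B", "C"),
  mean_return = c(0.010, 0.025, 0.018),
  sd_return = c(0.05, 0.12, 0.08),
  stringsAsFactors = FALSE
)

# Set up the axes without drawing anything yet (type = "n" means "no points")
plot(
  assets$mean_return, assets$sd_return,
  type = "n",
  xlab = "Mean daily return",
  ylab = "Standard deviation of daily return"
)

# Each pass through the loop adds one point and its label to the same plot
for (i in seq_len(nrow(assets))) {
  points(assets$mean_return[i], assets$sd_return[i], pch = 19)
  text(assets$mean_return[i], assets$sd_return[i],
       labels = assets$name[i], pos = 3)
}
```

Note that the loop variable `i` picks out a different row each time, which is what makes the loop worth having.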

wilson · October 17, 2018, 2:57pm

Sorry if I'm bad at explaining this; I still know very little about R. Basically, the graph will have the standard deviation of daily return on the y-axis and the mean daily return on the x-axis. I need to plot the top 10 average returns, so there will be 10 points on the graph, each representing an individual currency. As for base graphics, I don't know for sure. We've only been shown ggplot2, so I'm assuming we are supposed to use that. The code I have currently will plot each point individually, but I need a for loop to get any credit for the assignment. I guess I just need help with how to start this code. In class, they only showed us basic for loops, like this:
for (i in 1:5) {
  z <- 2 * z
  print(z)
}
I know obviously you can't answer the question specifically in my assignment, but would you be able to help me figure out where to begin? Thank you

jcblum · October 18, 2018, 11:18pm

Thanks for the explanation! That does help to clarify things.

There isn't any reasonable way to construct the plot you describe using both a for loop and ggplot2, so I'd suggest thinking about how a for loop might be part of preparing the data for plotting, rather than part of the plotting itself.
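As a rough sketch of that division of labor (not a solution to any particular assignment), the loop below only fills in a summary table, and ggplot2 is called once afterwards on the finished table. The group names, the simulated `values`, and the column names are all made up for illustration.

```r
library(ggplot2)

set.seed(42)  # only so the made-up data is reproducible

# Hypothetical groups with simulated values; none of this is real data
groups <- c("alpha", "beta", "gamma")
values <- list(
  alpha = rnorm(50, mean = 0.01, sd = 0.05),
  beta  = rnorm(50, mean = 0.02, sd = 0.10),
  gamma = rnorm(50, mean = 0.00, sd = 0.02)
)

# Set up the results table before the loop, one row per group
summary_df <- data.frame(
  group = groups,
  mean_value = NA_real_,
  sd_value = NA_real_,
  stringsAsFactors = FALSE
)

# The loop's only job is to fill in the table, one row per pass
for (i in seq_along(groups)) {
  x <- values[[groups[i]]]
  summary_df$mean_value[i] <- mean(x)
  summary_df$sd_value[i] <- sd(x)
}

# The plotting happens once, after the loop, on the finished table
p <- ggplot(summary_df, aes(x = mean_value, y = sd_value, label = group)) +
  geom_point() +
  geom_text(vjust = 1.5)
print(p)
```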

Something that's challenging about for loops is that they are a very general purpose code tool that can be used in a number of different ways, so it can be hard to reason from simple examples if the example doesn't show the kind of usage you need.

One thing that might make the simple example you gave difficult to extend to your needs is that the code inside the loop doesn't actually make use of the loop variable (i). Your example loop is accomplishing the same thing as just copying and pasting the code inside the loop five times.

z <- 1

for (i in 1:3){ 
  z <- 2 * z 
  print(z) 
}
#> [1] 2
#> [1] 4
#> [1] 8

# This has the same effect as if you'd written...
z <- 1

z <- 2 * z 
print(z)
#> [1] 2
z <- 2 * z 
print(z) 
#> [1] 4
z <- 2 * z 
print(z) 
#> [1] 8

Created on 2018-10-18 by the reprex package (v0.2.1)

Things get a lot more interesting when the code in the loop depends on the loop variable somehow — that allows you to take an existing list of things and "loop over it", applying the same complex operation to each item. In this usage, you typically treat the incrementing loop variable as an index number that grabs corresponding items. Now it's like you copied and pasted the code inside the loop, but each time you changed something to correspond with the list of things you are looping over.

some_zs <- c("Zoe", "Zelda", "Zebulon")

random_zs <- data.frame(
  z_name = some_zs,
  n = NA_integer_,
  mean = NA_real_,
  sd = NA_real_
)
random_zs
#>    z_name  n mean sd
#> 1     Zoe NA   NA NA
#> 2   Zelda NA   NA NA
#> 3 Zebulon NA   NA NA

for (i in seq_along(some_zs)){ 
  # For each name, find the number of characters in the name.
  # Then draw that many random numbers from a normal distribution,
  # find the mean and SD of the random numbers, 
  # and update the `random_zs` data frame with those values
  random_zs$n[[i]] <- nchar(some_zs[i])
  samples <- rnorm(random_zs$n[[i]])
  random_zs$mean[[i]] <- mean(samples)
  random_zs$sd[[i]] <- sd(samples)
}

# Here's the outcome for the data frame we were updating
random_zs
#>    z_name n        mean        sd
#> 1     Zoe 3 -0.22104488 0.4738899
#> 2   Zelda 5  0.09132254 1.0937531
#> 3 Zebulon 7 -0.12718158 0.7068360

# `seq_along()`? Huh??
# This is an R function that's really useful in loops.
# It takes a variable and generates a sequence "along" that variable.
# It's safer than writing `1:length(my_variable)`, which counts down
# from 1 to 0 (instead of giving an empty sequence) when the variable is empty
seq_along(some_zs)
#> [1] 1 2 3

Created on 2018-10-18 by the reprex package (v0.2.1)
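If you're curious about why `seq_along()` is the safer choice, the difference shows up when the thing you loop over happens to be empty. This small sketch (the variable names are just for illustration) counts how many times each style of loop runs on an empty vector:

```r
empty <- character(0)

# length(empty) is 0, so 1:length(empty) is 1:0, which counts DOWN: 1, 0
1:length(empty)

# seq_along(empty) is a zero-length sequence, so there is nothing to loop over
seq_along(empty)

# Count how many times each loop body actually runs
runs_colon <- 0
for (i in 1:length(empty)) runs_colon <- runs_colon + 1

runs_seq <- 0
for (i in seq_along(empty)) runs_seq <- runs_seq + 1

runs_colon  # 2: the loop ran for i = 1 and i = 0, even though there was no data
runs_seq    # 0: the loop was skipped entirely
```

With `1:length()`, the loop body would try to use items that don't exist, which tends to produce confusing errors or `NA` values further down the line.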

This loop accomplishes the same thing as this mess of copypasta:

random_zs$n[[1]] <- nchar(some_zs[1])
samples <- rnorm(random_zs$n[[1]])
random_zs$mean[[1]] <- mean(samples)
random_zs$sd[[1]] <- sd(samples)

random_zs$n[[2]] <- nchar(some_zs[2])
samples <- rnorm(random_zs$n[[2]])
random_zs$mean[[2]] <- mean(samples)
random_zs$sd[[2]] <- sd(samples)

random_zs$n[[3]] <- nchar(some_zs[3])
samples <- rnorm(random_zs$n[[3]])
random_zs$mean[[3]] <- mean(samples)
random_zs$sd[[3]] <- sd(samples)

For your task, you seem to be on your way to writing code that does what you need once (this is good!). You need to think about what variable you will "loop over", and how that variable fits into your code. If it helps, maybe try writing the horrible copypasta version first! Taking things step by step when you're learning a new kind of abstraction can be really helpful.

wilson · October 19, 2018, 1:59pm

Thank you very much, this is really helpful! I think I understand better what I’m trying to accomplish with the for loop. Thank you!