What I want is a function that calculates, for each column (metric) in my data frame, the percentage of rows that are 0. I believe I can then simply plug it into lapply to apply it to every column.
Here are two ways to do it. The difference between them is how NAs are handled. In the first function, NAs are ignored completely, so you get the proportion of non-NA values that are zero. In the second function, the denominator includes NA rows, so you get the proportion of all values that are zero.
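As a rough sketch of those two approaches (the function names `prop_zero_ignore_na` and `prop_zero_all_rows` and the toy data frame `df` are made up for illustration, and this assumes every column is numeric):

```r
# Proportion of non-NA values that are zero:
# NAs are dropped from both the numerator and the denominator.
prop_zero_ignore_na <- function(x) {
  mean(x == 0, na.rm = TRUE)
}

# Proportion of all values that are zero:
# NAs never count as zero, but they stay in the denominator.
prop_zero_all_rows <- function(x) {
  sum(x == 0, na.rm = TRUE) / length(x)
}

# Hypothetical example data, not taken from the original question
df <- data.frame(
  clicks = c(0, 3, 0, NA),
  visits = c(1, 0, 2, 5)
)

# lapply() returns a list; sapply() simplifies it to a named vector
res_ignore_na <- sapply(df, prop_zero_ignore_na)
res_all_rows  <- sapply(df, prop_zero_all_rows)
```

With this toy data, `clicks` comes out as 2/3 under the first function and 2/4 under the second, while `visits` is 1/4 under both because it has no NAs. Multiply by 100 if you want percentages rather than proportions.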
By the way, it’s better not to post screenshots of code. They can be hard to read, and are invisible to search. If you want to post an error message, you can copy and paste it from the console. To format your code properly, select your pasted code (or console output) and use the little </> button at the top of the posting box.
Thanks a lot! This is exactly what I was looking for. I wasn't thinking: I was already using lapply, so there was no reason to use the for loop. I got so hung up on the class matching. I thought I had done something wrong when computing a numeric from two numbers, i.e. the binary operation referred to in the error message.
No worries, I already took care of all the NAs in Vertica and recoded them to 0, so the na.rm part is not needed. But it's a good point to always think about the scope of the denominator.