I am not particularly proficient with R, but I am doing some basic machine learning with kNN in RStudio for my thesis. I am confused about one section of the code and am hoping someone can clarify.
When establishing the window size, the code looks like this:
win <- rep(1:736, each = 10)
win
My lecturer explained to me that 'rep()' establishes the size of the window and 'each' establishes the number of repetitions. However, when I run the code I get increased accuracy when reducing 'each' and increasing 'rep()', which makes me think 'each' might be the window size and 'rep()' the number of repetitions. Who's correct here?
Could you please turn this into a self-contained reprex (short for minimal reproducible example)? It will help us help you if we can be sure we're all working with/looking at the same stuff.
If you run into problems with access to your clipboard, you can specify an outfile for the reprex, and then copy and paste the contents into the forum.
Without a reprex, it's not obvious to me what the connection is between rep and the window of a kNN model. However, purely in terms of what the function rep does, your lecturer is correct.
rep(1:5, each = 4)
#output: 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 5 5 5 5
The first argument to rep is the vector to be repeated, and "each" determines how many times each element is repeated before moving on to the next one. Alternatively, "times" repeats the whole vector end to end:
rep(1:5, times = 2)
#output: 1 2 3 4 5 1 2 3 4 5
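Applied to the numbers from the original question, the same behaviour can be checked directly. This is only a sketch: it assumes (the thread does not confirm this) that win is later used as a label that groups consecutive rows, so that each run of identical labels is one "window". The name win_times is made up here for comparison.

```r
# The call from the question: labels 1..736, each repeated 10 times in a row.
win <- rep(1:736, each = 10)

length(win)            # 7360, i.e. 736 labels * 10 repetitions
head(win, 12)          # 1 1 1 1 1 1 1 1 1 1 2 2
all(table(win) == 10)  # TRUE: every label occurs exactly 10 times

# For contrast (hypothetical variable): "times" cycles through the whole
# vector instead, so identical labels are no longer adjacent.
win_times <- rep(1:736, times = 10)
head(win_times, 5)     # 1 2 3 4 5
```

Under that assumption, "each" is the number of consecutive rows sharing a label, and the length of the first argument (736 here) is the number of distinct labels. That would reconcile the two descriptions: "each" is a repetition count as far as rep is concerned, but it also ends up being the number of rows per group.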
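As for why you have increased accuracy with a smaller window: are you measuring training accuracy or test/validation accuracy? On the training set, the accuracy of kNN generally rises toward 100% as you decrease k, and reaches it at k = 1, because each training point is then its own nearest neighbour (barring duplicate points with different labels). That says nothing about how well the model generalises.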
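To see the difference between training and test accuracy for yourself, something along these lines may help. It is a rough sketch on made-up data, not your thesis setup: it assumes the class package (one of R's recommended packages) and its knn() function, and every variable name and number below is invented for illustration.

```r
library(class)  # provides knn(); assumed to be installed

set.seed(42)

# Synthetic two-class data with noisy labels (purely illustrative).
n <- 300
x <- matrix(rnorm(n * 2), ncol = 2)
y <- factor(ifelse(x[, 1] + rnorm(n) > 0, "a", "b"))

# Simple split: first 200 rows for training, the rest held out.
train_idx <- 1:200
x_train <- x[train_idx, ]
y_train <- y[train_idx]
x_test  <- x[-train_idx, ]
y_test  <- y[-train_idx]

ks <- c(1, 5, 15, 45)

# Accuracy when predicting the training points themselves.
train_acc <- sapply(ks, function(k) {
  mean(knn(x_train, x_train, cl = y_train, k = k) == y_train)
})

# Accuracy on rows the model has never seen.
test_acc <- sapply(ks, function(k) {
  mean(knn(x_train, x_test, cl = y_train, k = k) == y_test)
})

data.frame(k = ks, train_acc = train_acc, test_acc = test_acc)
```

With k = 1 the training accuracy comes out as exactly 1, since every training point finds itself at distance zero. The held-out accuracy is the number worth comparing across settings.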