Why does the return value of initial_split() need to include a copy of the data? The way I think of it, just returning the indices would be fine. Is there any other purpose for returning a copy of the data?
If you don't like the API, that's fine; you can do otherwise, but that's the API.
For what it's worth, because of R's copy-on-write approach, the initial split will not tie up significant memory unless it or the original data are altered. However, creating the training and testing data sets will perform a copy, as they are modifications of the initial data.
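
If you want to check this yourself, here is a rough sketch. It assumes the rsample and lobstr packages are installed; the data frame `big` and its size are made up for illustration. `lobstr::obj_size()` counts shared memory only once when given several objects, so comparing the combined sizes should show where the copies actually happen.

```r
library(rsample)

set.seed(123)

# Hypothetical data, large enough that a copy would be noticeable
big <- data.frame(x = rnorm(1e6), y = rnorm(1e6))

# The split holds a reference to the data plus the training row indices
split <- initial_split(big, prop = 0.75)

# Size of the data alone, then of the data and the split together.
# If the split shares memory with `big`, the second number should be
# only slightly larger than the first (roughly the cost of the indices).
lobstr::obj_size(big)
lobstr::obj_size(big, split)

# Extracting the sets subsets the rows, which allocates new data frames
train <- training(split)
test  <- testing(split)

# Combined size now also includes the copied training and testing rows
lobstr::obj_size(big, split, train, test)
```

So the extra memory only shows up once training() and testing() are called, not when the split is created.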