It's a great tutorial, but I had a few questions on it.
It seems like last_fit() evaluates on the test data from the split. How can I apply the fitted model to some external data that wasn't part of initial_split()?
After fitting on the training data (specifically fit(review_train) in the code), is it possible to tune the penalty on the test data? If both fitting and tuning are done on the training data, wouldn't that lead to overfitting on the training data?
Bonus Q: the code demonstrates how to make a grid on penalty. How can I tune on both penalty and mixture?
You can use extract_workflow() on the object produced by last_fit() and use that workflow for prediction (or save() it, etc.). It's a self-contained object that you can use going forward.
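As a rough sketch of what that looks like (the names `last_fit_res` and `new_reviews` are placeholders, not from the tutorial):

```r
library(tidymodels)

# `last_fit_res` is assumed to be the object returned by last_fit();
# `new_reviews` is assumed to be external data with the same predictor columns.
fitted_wf <- extract_workflow(last_fit_res)

# The extracted workflow carries both the preprocessing and the fitted model,
# so it can score data that was never part of initial_split().
predict(fitted_wf, new_data = new_reviews)

# It can also be saved and reloaded later for scoring.
saveRDS(fitted_wf, "review_model.rds")
```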
Nope, please don't do that. The test set is reserved purely for assessment and should not be part of the tuning process.
Not if you use resampling to tune. That is the main purpose of resampling: "How do I get good performance estimates without using the test set?"
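A minimal sketch of that workflow, assuming `review_train` is the training set from initial_split() and that the recipe and outcome names below are placeholders (the test set is never touched until the final last_fit()):

```r
library(tidymodels)

# Resample the training data only; tuning is evaluated on these folds.
set.seed(123)
review_folds <- vfold_cv(review_train, v = 10)

# Mark both penalty and mixture for tuning in the glmnet spec.
lasso_spec <- logistic_reg(penalty = tune(), mixture = tune()) %>%
  set_engine("glmnet")

review_wf <- workflow() %>%
  add_recipe(review_rec) %>%   # `review_rec`: hypothetical preprocessing recipe
  add_model(lasso_spec)

# grid_regular() can cover several parameters at once, e.g. penalty and mixture.
lambda_grid <- grid_regular(penalty(), mixture(), levels = c(30, 5))

tuned <- tune_grid(review_wf, resamples = review_folds, grid = lambda_grid)

# Pick the best parameters from the resampling results, finalize the workflow,
# then run last_fit() once on the original split for the final test-set estimate.
best_params <- select_best(tuned, metric = "roc_auc")
```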
There is an extra layer of resampling, nested resampling, that can be used to be extra careful about optimization bias. The thing is, while the papers on this are excellent, I have yet to see a case in the wild where optimization bias is significant enough to rise above the uncertainty produced by the data and/or model. In other words, IMO optimization bias is real but usually negligible.
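If you do want to go that route, rsample has nested_cv() for building the resamples; a minimal sketch, with illustrative fold and bootstrap counts and `review_train` assumed to be the training data:

```r
library(rsample)

set.seed(123)
nested_folds <- nested_cv(
  review_train,
  outside = vfold_cv(v = 10),       # outer loop: estimates performance
  inside  = bootstraps(times = 25)  # inner loop: used for tuning
)

# Each outer analysis set gets its own inner resamples, so the tuned model is
# always assessed on data it was not tuned on.
nested_folds
```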