Keywords: Natural Language Processing, Natural Language Processing in R, NLP, Text analysis
Hosted by @julia
When and where: Thurs 10:30-11AM in the BoF Lounge 2
If you would like to focus on a specific topic within this category, or ensure you are connecting with the right folks, reply below, discuss, and share widely!
I'm excited about hosting this BoF session at rstudio::conf! Who thinks they might come, and what kinds of topics would be fun to chat about in a casual, face-to-face setting?
Yep, we are. I just finished running a concept extraction model (BiLSTM-CRF network) this morning to see how this baseline model would handle part of our data. Not too bad for the first try via transfer learning.
Our application involves free-form medical notes, which are notoriously more difficult to work with (and yield less accurate results) than the usual NLP dataset. To make things even MORE challenging, our domain is veterinary medicine. So we have text that may look like “the boy just ain’t right”, with no ICD-10 codes for outcomes.
We've been working with some ULMFiT models internally, which has been really interesting and fun. I'm looking forward to getting to chat in person at rstudio::conf with some of you about the kinds of work you are doing!
Interesting, Julia. I'm trying to get a feel for the relative merits of ULMFiT compared to ELMo. It's interesting that ELMo seems to benefit from training on a domain-specific data set, even though it is supposed to be more general than word embeddings. Have you made the same or similar observations with ULMFiT?
Hi Julia!
ULMFiT looks promising for a problem that I'm working on (essentially a text classification problem with quite small data sets). I'd love to pick your brain about it.
@DPaschall I haven't tried ELMo on the same datasets that we are using with ULMFiT, so I don't know if I can speak to a direct performance comparison at this point. However, we are doing something similar: we have a large dataset of domain-specific language, and then a quite small dataset of labeled data for the classifier. It's remarkable what good results we are getting!