Remove all text after the last instance of a specific word, in text analysis

Hi, and welcome!

Please see the FAQ: What's a reproducible example (`reprex`) and how do I do one? A reprex, complete with representative data, will attract quicker and more numerous answers. This question doesn't require one, though.

The general approach

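library(stringr)  # provides str_replace()

text_orig <- c("Some text written about a topic, with the word Reference in it",
               "Some more text, with the word References followed by text I wish to delete",
               "More pages with stuff to be deleted")
# Match from the first "Reference" in each string through the end of that string
pat <- "Reference.*$"
# Text that replaces the whole match, so the word itself is kept
retain <- "Reference"
str_replace(text_orig, pat, retain)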
#> [1] "Some text written about a topic, with the word Reference"
#> [2] "Some more text, with the word Reference"                 
#> [3] "More pages with stuff to be deleted"

Created on 2020-03-19 by the reprex package (v0.3.0)
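
If a single string can contain the word more than once, the pattern above cuts at the first occurrence rather than the last. Here is a minimal sketch of one way to cut after the last occurrence, using a greedy capture group. The sample strings and the names `text_multi`, `pat_last`, and `trimmed` are made up for illustration. It uses base R's `sub()` so it runs without extra packages; `str_replace()` accepts the same pattern and replacement.

```r
# Hypothetical sample strings; the first contains the word twice.
text_multi <- c("One Reference here, then another Reference followed by text to delete",
                "No match in this string")

# The greedy `.*` inside the capture group runs to the LAST "Reference";
# replacing the whole match with group 1 drops everything after it.
pat_last <- "^(.*Reference).*$"
trimmed <- sub(pat_last, "\\1", text_multi, perl = TRUE)
print(trimmed)
#> [1] "One Reference here, then another Reference"
#> [2] "No match in this string"
```

Note that `.` does not match a newline by default, so text spanning several lines would need the `(?s)` flag at the start of the pattern.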

For this to work twice, on separate sentences, you'll need to split the string first and apply the replacement to each piece; that avoids getting tied up in knots writing a single regex. Doing it all in one regex is possible, and if you are an adept, go for it. Otherwise, don't torture yourself.
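
A rough sketch of the split-then-replace idea, assuming sentences end with a period followed by whitespace (a naive rule that will trip on abbreviations). The sample text and the names `doc`, `sentences`, and `cleaned` are hypothetical, and base R is used so the snippet is self-contained.

```r
# Hypothetical two-sentence input.
doc <- "First sentence mentions Reference and trailing words. Second sentence has References and more to drop."

# Split on whitespace that follows a period (naive sentence splitter).
sentences <- strsplit(doc, "(?<=\\.)\\s+", perl = TRUE)[[1]]

# Apply the same replacement to each sentence separately.
cleaned <- sub("Reference.*$", "Reference", sentences)
print(cleaned)
#> [1] "First sentence mentions Reference" "Second sentence has Reference"
```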