I have a file of 500,000 queries, and I want to label the ones that mention companies (a list of 30,000), locations, and entries from other long lists. However, the labeling is taking a very long time. Is there a better way to do this?
Without knowing the structure of g_query_samp, how its search_q variable is delimited, or what assigning the statement to 1 is supposed to represent, I'd be speculating too much to provide a useful answer.
All I can say is that, in general, you are better off vectorizing over whole columns than subsetting. Do you have some representative data you could share in a reprex? It doesn't have to be big, just enough to show what needs parsing.
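
In the meantime, here is a minimal sketch of what I mean by vectorizing. It is only a guess at your setup: the toy contents of `g_query_samp`, the `companies` vector, and the new `company` column are all made up for illustration, and I am assuming that "labeling" means flagging queries that contain a company name as a literal substring.

```r
# Hypothetical toy data; the real structure of g_query_samp is unknown.
g_query_samp <- data.frame(
  search_q = c("jobs at acme corp", "weather in paris",
               "globex careers", "cheap flights"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the 30,000-entry company list.
companies <- c("acme corp", "globex")

# Start with "no match" for every query.
is_company <- logical(nrow(g_query_samp))

# One pass per company name, but each grepl() call scans the whole
# search_q column at once instead of subsetting the data frame row by row.
# fixed = TRUE treats the name as a literal string, not a regex.
for (name in companies) {
  is_company <- is_company | grepl(name, g_query_samp$search_q, fixed = TRUE)
}

# Assign the label to the full column in a single step.
g_query_samp$company <- as.integer(is_company)
```

If your queries only need exact whole-string matches, `g_query_samp$search_q %in% companies` would avoid the loop entirely, but whether that applies depends on how search_q is delimited.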