features won't cluster as expected; any functions I don't know?

Hello there,

I am fairly new to R and entirely new to this forum and I am sorry if this is not the right place to ask.

I am currently trying to prove that armistice, truce, and ceasefire agreements are NOT used interchangeably in diplomatic practice.

So what I did is collect agreements (.txt format) into a corpus without specifying which of the three they are. I am hoping to find a way for R to produce a plot in which three distinct topics emerge as clusters in one way or another.
I have used "textplot_network" and the result seemed to be somewhat accurate, but ceasefire and truce did not show up at all, and that seems odd to me.
Now it could very well be the case that both of them appear too scarcely in the corpus, although I doubt it.
Is there any method I am missing out on where I could analyze the corpus for the three concepts without a predefined dictionary?
Ideally the output would give me key features for each of them.
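
One way to rule out the "too scarce" explanation is to count how often each of the three terms occurs per document before plotting anything. The following is only a minimal base-R sketch: the three texts are invented stand-ins for the real agreements, and `count_term` is a hypothetical helper, not a function from any package.

```r
# Invented toy texts for illustration only; the real ones would be read
# from the .txt files (for example with readLines()).
texts <- c(
  doc1 = "The parties agree to an armistice. The armistice enters into force at noon.",
  doc2 = "A ceasefire is declared; the ceasefire shall be monitored by observers.",
  doc3 = "The truce holds for thirty days unless the armistice commission decides otherwise."
)

terms <- c("armistice", "truce", "ceasefire")

# Hypothetical helper: whole-word, case-insensitive match count per document.
count_term <- function(term, x) {
  pattern <- paste0("\\b", term, "\\b")
  hits <- gregexpr(pattern, x, ignore.case = TRUE, perl = TRUE)
  # gregexpr() returns -1 when there is no match, so count positive positions.
  vapply(hits, function(h) sum(h > 0), integer(1))
}

# Rows are documents, columns are the three terms.
term_counts <- sapply(terms, count_term, x = texts)
rownames(term_counts) <- names(texts)

print(term_counts)
print(colSums(term_counts))
```

These exact patterns would miss spelling variants such as "cease-fire", so the term list may need extending. If the network plot was drawn from only the most frequent features, low totals here could explain why two of the terms never appeared.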

Thank you all in advance!
Kindest regards
Valentin

It's better to think of this as trying to test the claim rather than prove it, to avoid analyst bias.

Your objective might be realizable as a problem in latent semantic analysis. See, for example, the {lsa} package or search rseek.org for "latent semantic".
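
To give a rough idea of what latent semantic analysis does, here is a minimal sketch that uses base R's `svd()` directly rather than the {lsa} package's own functions. The six documents are invented, the choice of `k = 3` dimensions is an assumption reflecting the three expected concepts, and no term weighting is applied. Real agreements share far more vocabulary than this, so the groups would not separate this cleanly.

```r
# Invented toy documents standing in for the agreements (illustration only).
texts <- c(
  a1 = "armistice commission demarcation line armistice",
  a2 = "armistice demarcation line commission",
  c1 = "ceasefire monitoring observers ceasefire",
  c2 = "ceasefire observers monitoring violations",
  t1 = "truce humanitarian pause truce",
  t2 = "truce humanitarian pause relief"
)

# Tokenise on whitespace and build a term-by-document count matrix.
tokens <- strsplit(tolower(texts), "\\s+")
vocab <- sort(unique(unlist(tokens)))
tdm <- vapply(
  tokens,
  function(tok) as.numeric(table(factor(tok, levels = vocab))),
  numeric(length(vocab))
)
dimnames(tdm) <- list(vocab, names(texts))

# Truncated SVD: keep k latent dimensions (k = 3 is an assumption).
k <- 3
s <- svd(tdm)
doc_coords <- s$v[, 1:k, drop = FALSE] %*% diag(s$d[1:k], nrow = k)
rownames(doc_coords) <- colnames(tdm)

# Cosine similarity between documents in the reduced space.
unit <- doc_coords / sqrt(rowSums(doc_coords^2))
cos_sim <- unit %*% t(unit)

# Hierarchical clustering on cosine distance, cut into k groups.
d <- as.dist(pmax(1 - cos_sim, 0))
groups <- cutree(hclust(d, method = "average"), k = k)
print(groups)

# Most frequent terms per cluster, as a rough stand-in for "key features".
top_terms <- lapply(split(colnames(tdm), groups), function(docs) {
  head(sort(rowSums(tdm[, docs, drop = FALSE]), decreasing = TRUE), 3)
})
print(top_terms)
```

Because the documents are never labelled, any grouping comes from the texts alone, which fits the goal of testing the claim instead of assuming it. Whether the resulting clusters line up with armistice, truce, and ceasefire is then something to check, not something built in.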
