This is, once again, not a reprex.
You may ask why. Here are a few reasons:
- We don't have access to your local files. I understand that the allowable file formats are limited, but you could have written code to create those files and then read them back in.
- You haven't included any `library` call. I (and possibly many others in this community) do not know what a `jaccard distance` is. You used a function `textrank_jaccard`, but have not mentioned its package. I'm guessing `textrank`, but it may not be the case.
- What is `top`? I've no idea regarding this one.
Please go through the reprex guide. A minimal reproducible example helps others to figure out what problems you may have been facing, and consequently, to help you.
There are a few problems with your code.
- `files_names3` is a vector. You can't use `nrow` with it.
- You used `i` in both the `for` loops.
- Why are you using `all <- ''`? It's a character vector, and you can't add rows to it later.
- I'm unable to figure out why you expect the output to be in the format shown in your post. The documentation says it returns a single number, so why a tuple? Also, as all the files are identical, why do you think different values will be produced? I've no idea regarding this particular distance measure, but I don't think this is how it is expected to behave.
- This is not a problem, but `1.txt`, `2.txt`, `3.txt` as names of some objects is probably a bad idea. It's very confusing in my opinion.
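
To illustrate the first and fourth points, here is a small sketch in base R. The names `file_names`, `terms` and `jaccard` are made up for this example, and `jaccard` is only my assumption of what the measure computes (intersection over union of the two term vectors), not the package's own code.

```r
# Hypothetical stand-in for a vector of file names
file_names <- c("1.txt", "2.txt", "3.txt")

# nrow() is meant for matrices and data frames; on a plain vector it gives NULL
nrow(file_names)
#> NULL

# length() is what you want for a vector
length(file_names)
#> [1] 3

# Hypothetical helper (an assumption, not the textrank implementation):
# number of shared terms divided by number of distinct terms overall
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

terms <- c("ok", "good", "funny")

# identical inputs share everything, so the result is a single number, 1
jaccard(terms, terms)
#> [1] 1

# one shared term out of four distinct terms
jaccard(terms, c("ok", "bad"))
#> [1] 0.25
```

So for identical files, I'd expect every pair to give the same value.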
Since it is your first post, I'm providing a reprex after modifying your code a little bit.
a working code
```r
# loading required library
library(textrank)

# creating files
write.table(x = "ok, good, funny",
            file = "1.txt",
            row.names = FALSE,
            col.names = FALSE)
write.table(x = "ok, good, funny",
            file = "2.txt",
            row.names = FALSE,
            col.names = FALSE)
write.table(x = "ok, good, funny",
            file = "3.txt",
            row.names = FALSE,
            col.names = FALSE)

# listing files
file_names <- list.files(pattern="*.txt")

# reading files
file_contents <- vector(mode = "list",
                        length = length(x = file_names))
for (i in seq_len(length.out = length(x = file_names)))
{
  file_contents[[i]] <- read.delim(file = file_names[i])
}

# calculation
all <- matrix(ncol = 3,
              nrow = ((length(x = file_names)) ^ 2))
for(i in seq_len(length.out = length(x = file_names)))
{
  for(j in seq_len(length.out = length(x = file_names)))
  {
    all[((i - 1) * length(x = file_names) + j), ] <- c(file_names[i], file_names[j], textrank_jaccard(termsa = file_contents[[i]],
                                                                                                      termsb = file_contents[[j]]))
  }
}
all <- as.data.frame(x = all)
all
#>      V1    V2 V3
#> 1 1.txt 1.txt  1
#> 2 1.txt 2.txt  1
#> 3 1.txt 3.txt  1
#> 4 2.txt 1.txt  1
#> 5 2.txt 2.txt  1
#> 6 2.txt 3.txt  1
#> 7 3.txt 1.txt  1
#> 8 3.txt 2.txt  1
#> 9 3.txt 3.txt  1
```
Created on 2019-03-25 by the reprex package (v0.2.1)