Hi!
On my Mac desktop I have a folder with many .txt files. I'm trying to use the "quanteda" package, but I haven't understood how to load them. With tm it was easy, but with quanteda / readtext I couldn't find a way to load files from the desktop. Can anyone help me, please?
Thanks and best wishes to all!
The vignette for readtext provides an example you can adapt:
library(readtext)
DATA_DIR <- system.file("extdata/", package = "readtext")
(rt1 <- readtext(paste0(DATA_DIR, "/txt/UDHR/*")))
#> readtext object consisting of 13 documents and 0 docvars.
#> # Description: df[,2] [13 × 2]
#> doc_id text
#> <chr> <chr>
#> 1 UDHR_chinese.txt "\"世界人权宣言\n联合国\"..."
#> 2 UDHR_czech.txt "\"VŠEOBECNÁ \"..."
#> 3 UDHR_danish.txt "\"Den 10. de\"..."
#> 4 UDHR_english.txt "\"Universal \"..."
#> 5 UDHR_french.txt "\"Déclaratio\"..."
#> 6 UDHR_georgian.txt "\"FLFVBFYBC \"..."
#> # … with 7 more rows
Created on 2019-12-28 by the reprex package (v0.3.0)
For DATA_DIR, use the path to the *.txt files on your desktop (assuming that you have only *.txt files on your desktop).
DATA_DIR <- system.file("/Users/YOURS/Desktop")
(rt1 <- readtext(DATA_DIR))
Thank you very much! I apologize: I have a folder, "texts", on my Desktop, and I need the .txt files in that folder. Would DATA_DIR <- system.file("/Users/YOURS/Desktop/folder") work?
Thanks!
If the folder is named texts, you can use
DATA_DIR <- system.file("/Users/YOURS/Desktop/texts")
(rt1 <- readtext(DATA_DIR/*))
Thanks! The first line, DATA_DIR <- system.file("/Users/YOURS/Desktop/texts"), is clear, but the second, (rt1 <- readtext(DATA_DIR/*)), doesn't work:
(rt1 <- readtext(DATA_DIR/*))
Error: unexpected '*' in "(rt1 <- readtext(DATA_DIR/*"
For example, I can load the same folder of files using tm, in this way:
> library(tm)
Loading required package: NLP
> cname <- file.path("~", "Desktop", "texts")
> cname
[1] "~/Desktop/texts"
> dir(cname)
[1] "BIN 2017 1.txt" "BIN 2017 2.txt" "BIN 2017 3.txt" "BIN 2017 4.txt" "BIN 2018 1.txt" "BIN 2018 2.txt"
> docs <- VCorpus(DirSource(cname))
> summary(docs)
Length Class Mode
BIN 2017 1.txt 2 PlainTextDocument list
BIN 2017 2.txt 2 PlainTextDocument list
BIN 2017 3.txt 2 PlainTextDocument list
BIN 2017 4.txt 2 PlainTextDocument list
BIN 2018 1.txt 2 PlainTextDocument list
> inspect(docs)
<<VCorpus>>
Metadata: corpus specific: 0, document level (indexed): 0
Content: documents: 32
And then I can analyze it.
I would like to do the same thing using quanteda, but it doesn't work:
> library(readtext)
> DATA_DIR <- system.file("extdata/", package = "readtext")
> DATA_DIR <- system.file("~", "Desktop", "texts")
> (rt1 <- readtext(DATA_DIR/*))
Error: unexpected '*' in "(rt1 <- readtext(DATA_DIR/*"
> library(readtext)
> DATA_DIR <- system.file("extdata/", package = "readtext")
> DATA_DIR <- system.file("~", "Desktop", "texts")
> (rt1 <- readtext(paste0(DATA_DIR, "/txt/texts/*")))
Error in list_files(file, ignore_missing, TRUE, verbosity) :
File '' does not exist.
>
I looked for tutorials and examples, but the examples I found load files from URLs, not from folders on disk: https://cran.r-project.org/web/packages/quanteda/vignettes/quickstart.html.
Instead of ~, use the full path name, for example /Users/rc/Desktop/texts.
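To expand on this a little: system.file() only looks up files inside an installed package, so for a Desktop path it returns an empty string, which is likely behind the errors above. Also, DATA_DIR/* without quotes is not valid R, because the path has to be a character string. Here is a minimal sketch, assuming the folder is called texts and sits on your Desktop (data_dir and pattern are just names I picked, and the readtext step only runs if the package is installed and the folder has matching files):

```r
# Assumed location: a folder named "texts" on the Desktop.
# path.expand() turns "~" into the full home directory path.
data_dir <- file.path(path.expand("~"), "Desktop", "texts")

# system.file() searches inside an installed package (base by default),
# so for a path like this it finds nothing and returns "".
system.file("~", "Desktop", "texts")

# Build the glob pattern as a quoted string instead of writing DATA_DIR/*.
pattern <- file.path(data_dir, "*.txt")

# List the files the pattern matches before trying to read them.
matched <- Sys.glob(pattern)
print(matched)

# Read the files only if readtext is available and something matched.
if (requireNamespace("readtext", quietly = TRUE) && length(matched) > 0) {
  rt <- readtext::readtext(pattern)
  print(rt)
  # With quanteda installed, the result can be turned into a corpus:
  # corp <- quanteda::corpus(rt)
}
```

If Sys.glob(pattern) lists your files, readtext should be able to read them with the same pattern.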