Hi,
I want to use this code to identify spam, but I have a problem when I create a document term matrix for spam and easy_ham.
Sys.setenv(LANG = "en")
require(tm)
#> Loading required package: tm
#> Loading required package: NLP
suppressWarnings(require(RTextTools))
#> Loading required package: RTextTools
#> Loading required package: SparseM
#>
#> Attaching package: 'SparseM'
#> The following object is masked from 'package:base':
#>
#> backsolve
setwd("C:/Users/Maciek/Desktop/spamassasin")
spam <- Corpus(DirSource("spam"), readerControl = list(language="lat"))
easy_ham <- Corpus(DirSource("easy_ham"), readerControl = list(language="lat"))
if (file.exists("easy_ham/cmds")) file.remove("easy_ham/cmds")
if (file.exists("spam/cmds")) file.remove("spam/cmds")
meta(spam, tag = "type") <- "spam"
meta(easy_ham, tag = "type") <- "easy_ham"
combinedcorpusEasy <- c(spam,easy_ham, recursive=T)
combinedResampledCorpusEasy <- sample(combinedcorpusEasy, 750)
spamTDMEasy<- DocumentTermMatrix(combinedResampledCorpusEasy)
#> Error in UseMethod("TermDocumentMatrix", x): no applicable method for 'TermDocumentMatrix' applied to an object of class "character"
I don't know how to fix that. Can someone help me?
Here is the dataset I use (at the bottom)
mara
March 21, 2018, 12:48pm
2
Could you please turn this into a self-contained reprex (short for minimal reproducible example)? It will help us help you if we can be sure we're all working with/looking at the same stuff.
Right now the best way to install reprex is:
# install.packages("devtools")
devtools::install_github("tidyverse/reprex")
If you've never heard of a reprex before, you might want to start by reading the tidyverse.org help page. The reprex dos and don'ts are also useful.
For pointers specific to the community site, check out the reprex FAQ, linked to below.
Why reprex?
Getting unstuck is hard. Your first step here is usually to create a reprex, or reproducible example. The goal of a reprex is to package your code, and information about your problem so that others can run it and feel your pain. Then, hopefully, folks can more easily provide a solution.
What's in a Reproducible Example?
Parts of a reproducible example:
background information - Describe what you are trying to do. What have you already done?
complete set up - include any library() calls and data to reproduce your issue.
data for a reprex: Here's a discussion on setting up data for a reprex
make it run - include the minimal code required to reproduce your error on the data…
nviau
March 21, 2018, 5:41pm
3
Your Corpus object has been turned into a character vector. If you supply data and a reproducible example, I can help you out.
Here's a simple recreation using the crude corpus from the tm package:
library(tm)
library(RTextTools)
data(crude)
test1 <- crude[1:10]
test2 <- crude[11:20]
meta(test1, tag = "type") <- "test1"
meta(test2, tag = "type") <- "test2"
combinedcorpusEasy <- c(test1, test2, recursive=T)
combinedResampledCorpusEasy <- sample(combinedcorpusEasy, 10)
spamTDMEasy <- DocumentTermMatrix(combinedResampledCorpusEasy)
# This works...
spamTDMEasy
#> <<DocumentTermMatrix (documents: 10, terms: 654)>>
#> Non-/sparse entries: 1060/5480
#> Sparsity : 84%
#> Maximal term length: 15
#> Weighting : term frequency (tf)
# Recreating your error with a character vector...
DocumentTermMatrix("text")
#> Error in UseMethod("TermDocumentMatrix", x): no applicable method for 'TermDocumentMatrix' applied to an object of class "character"
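One possible cause, offered as an assumption rather than a diagnosis of your setup: in recent tm versions, Corpus() on a DirSource with the default plain-text reader can return a SimpleCorpus instead of a VCorpus (crude is a VCorpus), and c() on such objects may fall back to base R and flatten them. A minimal sketch that asks for VCorpus() explicitly is below. The toy documents and the names spam_docs, ham_docs, combined and resampled are hypothetical stand-ins for your folders.

```r
library(tm)

# Hypothetical toy documents standing in for the spam/ and easy_ham/ folders
spam_docs <- c("win money now", "cheap pills online", "claim your free prize")
ham_docs  <- c("meeting moved to friday", "patch for the build script",
               "lunch plans tomorrow")

# VCorpus() is requested explicitly so that c() dispatches to tm's corpus method
spam     <- VCorpus(VectorSource(spam_docs))
easy_ham <- VCorpus(VectorSource(ham_docs))

meta(spam, tag = "type") <- "spam"
meta(easy_ham, tag = "type") <- "easy_ham"

combined <- c(spam, easy_ham)
class(combined)  # expected to include "VCorpus", not "character"

# Resample by index so the subset goes through the corpus's own `[` method
set.seed(42)
resampled <- combined[sample(length(combined), 4)]

dtm <- DocumentTermMatrix(resampled)
dim(dtm)
```

With your data, the equivalent change would be VCorpus(DirSource("spam"), ...) and VCorpus(DirSource("easy_ham"), ...), then checking class() of the combined object before calling DocumentTermMatrix().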
I use that data.
I have code that works from the book Machine Learning for Hackers; here is the code. It is a Naive Bayes classifier, but I also want to classify with a Support Vector Machine and I don't know how. That is why I'm trying to use the code from the website that I linked in the first post.
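For the Support Vector Machine part, here is a rough sketch of one way to go from a document term matrix to an SVM. It assumes the e1071 package is installed (it is not used in the code above) and calls its svm() function directly rather than going through RTextTools. The toy documents, labels and variable names are hypothetical, and this is not the method from the book or the website.

```r
library(tm)
library(e1071)  # assumption: e1071 is installed; not part of the original code

# Hypothetical toy documents and labels, in matching order
docs <- c("win money now", "cheap pills online", "claim your free prize",
          "meeting moved to friday", "patch for the build script",
          "lunch plans tomorrow")
labels <- factor(c("spam", "spam", "spam",
                   "easy_ham", "easy_ham", "easy_ham"))

corpus <- VCorpus(VectorSource(docs))
dtm <- DocumentTermMatrix(corpus)

# Dense matrix: fine for toy data, may be too large for a full corpus
x <- as.matrix(dtm)

# Linear-kernel SVM; scale = FALSE avoids scaling issues with sparse term counts
model <- svm(x = x, y = labels, kernel = "linear", scale = FALSE)

# Predicting on the training rows only demonstrates the call pattern
pred <- predict(model, x)
table(predicted = pred, actual = labels)
```

For a real evaluation you would hold out some documents, fit on the rest, and predict on the held-out rows using a matrix with the same term columns.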