# Data Preprocessing Template
# Importing the dataset
dataset = read.csv('Data.csv')
# Splitting the dataset into the Training set and Test set
install.packages('caTools')
library(caTools)
set.seed(101)
sample = sample.split(dataset$DependentVariable, SplitRatio = 0.75)
training_set = subset(dataset, sample == TRUE)
test_set = subset(dataset, sample == FALSE)
# Feature Scaling
training_set = scale(training_set)
test_set = scale(test_set)
I am getting an error:
test_set = subset(dataset, split == FALSE)
Error in split == FALSE :
comparison (1) is possible only for atomic and list types
test_set = scale(test_set)
Error in scale(test_set) : object 'test_set' not found
FJCC
May 13, 2019, 5:00pm
2
tri■■■■a:
sample = sample.split(dataset$DependentVariable, SplitRatio = 0.75)
training_set = subset(dataset, sample == TRUE)
test_set = subset(dataset, sample == FALSE)
The above part of your code makes sense. Then when you quote the error, it says
test_set = subset(dataset, split == FALSE)
Error in split == FALSE :
comparison (1) is possible only for atomic and list types
What is split? Shouldn't that be sample? If split is not a vector, that would account for the error.
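To illustrate the point, here is a minimal sketch. It assumes a fresh R session in which no variable named split has been created, so the name resolves to the base R function split(). Comparing a function with FALSE then raises the comparison error quoted above. The exact wording of the message depends on the R version and locale.

```r
# Assumption: no user-defined variable `split` exists, so the name
# refers to the base R function. base::split makes that explicit here.
is.function(base::split)

# Comparing a function (not an atomic vector or list) with a logical
# value is an error; tryCatch() captures the message instead of stopping.
msg <- tryCatch(
  base::split == FALSE,
  error = function(e) conditionMessage(e)
)
print(msg)

# Once `split` is a logical vector, the same comparison works.
split <- c(TRUE, FALSE, TRUE)
split == FALSE
```

So if the line that creates split fails or is skipped, every later line that uses split == TRUE or split == FALSE will fail as well.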
# Data Preprocessing Template
# Importing the dataset
dataset = read.csv('Data.csv')
# Splitting the dataset into the Training set and Test set
install.packages('caTools')
library(caTools)
set.seed(123)
split = sample.split(dataset$DependentVariable, SplitRatio = 0.8)
training_set = subset(dataset, split == TRUE)
test_set = subset(dataset, split == FALSE)
# Feature Scaling
training_set = scale(training_set)
test_set = scale(test_set)
I am getting the following error:
test_set = scale(test_set)
Error in scale(test_set) : object 'test_set' not found
mara
May 14, 2019, 10:15am
4
Translated into English, the error message says
Error in scale(test_set) : object 'test_set' not found
This is strange, since you seem to create test_set earlier in the code.
Could you please turn this into a self-contained reprex (short for reproducible example)? It will help us help you if we can be sure we're all working with/looking at the same stuff.
install.packages("reprex")
If you've never heard of a reprex before, you might want to start by reading the tidyverse.org help page. The reprex dos and don'ts are also useful.
There's also a nice FAQ on how to do a minimal reprex for beginners, below:
A minimal reproducible example consists of the following items:
A minimal dataset, necessary to reproduce the issue
The minimal runnable code necessary to reproduce the issue, which can be run on the given dataset, including the necessary information on the packages used.
Let's quickly go over each one of these with examples:
Minimal Dataset (Sample Data)
You need to provide a data frame that is small enough to be (reasonably) pasted into a post, but big enough to reproduce your issue.
Let's say, as an example, that you are working with the iris data frame
head(iris)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1 5.1 3.5 1.4 0.…
What to do if you run into clipboard problems
If you run into problems with access to your clipboard, you can specify an outfile for the reprex, and then copy and paste the contents into the forum.
reprex::reprex(input = "fruits_stringdist.R", outfile = "fruits_stringdist.md")
For pointers specific to the community site, check out the reprex FAQ.
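As a rough sketch of the "minimal dataset" item above: base R's dput() prints R code that recreates an object, and that output can be pasted straight into a post. The built-in iris data and the name sample_data are only stand-ins for your own data.

```r
# Take a small slice of the data; iris is a stand-in for your own data frame.
sample_data <- head(iris, 5)

# dput() prints R code that rebuilds the object; paste that output into your post.
dput(sample_data)

# Check that the printed code really recreates an equivalent object.
# deparse() produces the same text that dput() prints.
rebuilt <- eval(parse(text = deparse(sample_data)))
all.equal(rebuilt, sample_data)
```

Anyone reading the post can then assign the pasted structure(...) call to a variable and work with the same data you have.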
Now I am having this error:
Error in sample.split(dataset$DependentVariable, SplitRatio = 0.8) :
Error in sample.split: 'SplitRatio' parameter has to be i [0, 1] range or [1, length(Y)] range
training_set = subset(dataset, split == TRUE)
Error in split == TRUE :
comparison (1) is possible only for atomic and list types
FJCC
May 14, 2019, 4:49pm
6
Please post a reproducible example as requested above by Mara. It is very difficult to debug code without data and the full actual code. Here is a reproducible example of the type of thing you are trying to do that works for me.
library(caTools)
#> Warning: package 'caTools' was built under R version 3.5.2
df <- data.frame(X = runif(100, 0, 5),
DependentVar = rnorm(100))
split = sample.split(df$DependentVar, SplitRatio = 0.8)
training_set = subset(df, split == TRUE)
test_set = subset(df, split == FALSE)
training_set = scale(training_set)
test_set = scale(test_set)
head(training_set)
#> X DependentVar
#> 1 -1.435215 0.7215806
#> 3 1.459590 0.2844358
#> 4 -1.505967 0.5092594
#> 6 0.811956 0.5417502
#> 7 1.219653 1.2789982
#> 8 -1.511057 -1.2552037
Created on 2019-05-14 by the reprex package (v0.2.1)
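One possible cause of the SplitRatio error quoted earlier, offered only as a guess since the actual data has not been posted: if Data.csv has no column literally named DependentVariable, then dataset$DependentVariable is NULL and has length 0, and as far as I can tell sample.split() then rejects any SplitRatio. The sketch below uses a made-up data frame and a hypothetical column name, Purchased, to show how to check this with base R before splitting.

```r
# Made-up data; `Purchased` is a hypothetical column name standing in
# for whatever the outcome column in Data.csv is really called.
dataset <- data.frame(Age = c(25, 32, 47, 51), Purchased = c(0, 1, 0, 1))

# `$` returns NULL (length 0) for a column that does not exist.
missing_col <- dataset$DependentVariable
is.null(missing_col)
length(missing_col)

# Check the column names before splitting.
names(dataset)
"DependentVariable" %in% names(dataset)

# Use the real column name instead of the template placeholder.
y <- dataset$Purchased
length(y)
```

If the check shows that the column is missing, replacing DependentVariable with the actual column name in the sample.split() call would be the first thing to try.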