r - Predict the class variable using naiveBayes
I tried to use the naiveBayes function in the e1071 package. Here is the process:
> library(e1071)
> data(iris)
> head(iris, n=5)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
> model <- naiveBayes(Species~., data = iris)
> pred <- predict(model, newdata = iris, type = 'raw')
> head(pred, n=5)
      setosa   versicolor    virginica
[1,] 1.00000 2.981309e-18 2.152373e-25
[2,] 1.00000 3.169312e-17 6.938030e-25
[3,] 1.00000 2.367113e-18 7.240956e-26
[4,] 1.00000 3.069606e-17 8.690636e-25
[5,] 1.00000 1.017337e-18 8.885794e-26
So far, everything is fine. In the next step, I tried to create a new data point and used the naiveBayes model (model) to predict the class variable (Species). I chose one of the training data points:
> test = c(5.1, 3.5, 1.4, 0.2)
> prob <- predict(model, newdata = test, type=('raw'))
And here is the result:
> prob
        setosa versicolor virginica
[1,] 0.3333333  0.3333333 0.3333333
[2,] 0.3333333  0.3333333 0.3333333
[3,] 0.3333333  0.3333333 0.3333333
[4,] 0.3333333  0.3333333 0.3333333
This is strange. The data point I used as test is a row of the iris dataset. Based on the actual data, the class variable of this data point is setosa:
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
and naiveBayes predicted it correctly:
      setosa   versicolor    virginica
[1,] 1.00000 2.981309e-18 2.152373e-25
But when I try to predict the test data point, it returns incorrect results. Why does it return four rows of predictions when I am looking for the prediction of just one data point? Am I doing something wrong?
You need column names that correspond to the training data column names. Taking a row from your training data:
test2 = iris[1,1:4]
predict(model, newdata = test2, type=('raw'))
     setosa   versicolor    virginica
[1,]      1 2.981309e-18 2.152373e-25
"New" test data defined as a data.frame:
test1 = data.frame(Sepal.Length = 5.1, Sepal.Width = 3.5, Petal.Length = 1.4, Petal.Width = 0.2)
predict(model, newdata = test1, type=('raw'))
     setosa   versicolor    virginica
[1,]      1 2.981309e-18 2.152373e-25
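If the new observation starts out as a plain numeric vector, as in the question, one way to give it matching column names is sketched below. It uses only base R; `feature_names` and `test_df` are illustrative names chosen here, not part of e1071, and the sketch assumes the vector lists the features in the same order as the iris columns.

```r
# Feature columns the model was trained on (everything except Species)
feature_names <- setdiff(names(iris), "Species")

# The unnamed vector from the question
test <- c(5.1, 3.5, 1.4, 0.2)

# t() turns the vector into a 1 x 4 matrix, so it becomes one row, not four
test_df <- setNames(as.data.frame(t(test)), feature_names)

test_df
```

Passing `test_df` as `newdata` to `predict()` should then behave the same way as `test1`.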
If you feed it only one dimension, it can still predict via Bayes' rule.
predict(model, newdata = data.frame(Sepal.Width = 3), type=('raw'))
        setosa versicolor virginica
[1,] 0.2014921  0.3519619  0.446546
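To make the Bayes' rule step concrete, here is a minimal base R sketch of the same single-feature calculation. It assumes each class gets a Gaussian likelihood built from the per-class sample mean and standard deviation, with class frequencies as priors; the variable names (`mu`, `sigma`, `prior`, `posterior`) are illustrative only.

```r
x <- 3  # the Sepal.Width value used in the prediction

# Per-class mean and standard deviation of Sepal.Width
by_class <- split(iris$Sepal.Width, iris$Species)
mu    <- sapply(by_class, mean)
sigma <- sapply(by_class, sd)

# Class priors: relative class frequencies (1/3 each for iris)
prior <- as.numeric(table(iris$Species)) / nrow(iris)

# Bayes' rule: posterior is proportional to prior times likelihood
unnormalised <- prior * dnorm(x, mean = mu, sd = sigma)
posterior <- unnormalised / sum(unnormalised)
names(posterior) <- levels(iris$Species)

round(posterior, 4)
```

Under these assumptions the result should come out close to the 0.201, 0.352 and 0.447 returned by `predict()`.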
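If you feed it a dimension not found in the training data, you get equal probabilities for all classes, because no feature matches and only the class priors (1/3 each for iris) remain. Inputting a longer vector just gives you more predictions, one per element.
predict(model, newdata = 1, type=('raw'))
        setosa versicolor virginica
[1,] 0.3333333  0.3333333 0.3333333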
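This also appears to explain the four rows in the question. The sketch below assumes that `predict()` coerces `newdata` with `as.data.frame()` and matches features by column name; that is my understanding of the e1071 method, not something shown in the output here. It uses only base R, and `coerced` is an illustrative name.

```r
test <- c(5.1, 3.5, 1.4, 0.2)

# Coercing a bare vector gives a single column with one row per element
coerced <- as.data.frame(test)
dim(coerced)    # 4 rows, 1 column

# None of the training column names appear in the coerced data frame
intersect(names(coerced), names(iris))
```

With four rows and no matching feature columns, each row would fall back to the class priors, which is the 4 x 3 table of 0.3333333 seen in the question.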