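I am trying to learn how to use KNN in R and am practicing on the flights dataset from the nycflights13 package. When I run the code below, I get this error:
'train' and 'class' have different lengths
My code: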
library(nycflights13)
library(class)
library(ggvis)  # provides ggvis(), layer_points() and the %>% pipe used below
deparr <- na.omit(flights[c(4, 7, 16)])
classframe <- deparr[3]
flights %>% ggvis(~dep_time, ~arr_time, fill = ~distance) %>% layer_points()
set.seed(1234)
ind <- sample(2, nrow(deparr), replace=TRUE, prob=c(0.67, 0.33))
flights.training <- deparr[ind==1, 1:2]
flights.test <- deparr[ind==2, 1:2]
flights.trainlabels <- deparr[ind==1, 3]
flights.testlabels <- deparr[ind==2, 3]
predictions <- knn(train = flights.training, test = flights.test, cl = flights.trainlabels[,1], k = 3)
Posted on 2017-05-24 20:25:26
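Below is code that splits the data into training and test sets by percentage. If you want to split the two subsets differently, you should be able to adapt it from here; as written, it shows that the approach works.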
deparr <- na.omit(flights[c(4, 7, 16)])
set.seed(1234)
# prepare to divide up the full dataset into two groups, 65%/35%
n <- nrow(deparr)
train_n <- round(0.65 * n)
# randomize the row order (the trailing comma matters: we index rows, not columns)
deparr <- deparr[sample(n), ]
# split up the actual data. We will use these as inputs to knn
flights.train <- deparr[1:train_n, ]
flights.test <- deparr[(train_n + 1):n, ]
# target variable, $distance, is in column 3, so exclude from train and test
predictions <- knn(train = flights.train[, 1:2], test = flights.test[, 1:2], cl = flights.train$distance, k = 10)
The result I get is:
> str(predictions)
Factor w/ 209 levels "80","94","96",..: 121 159 18 54 207 18 94 55 159 136 ...
https://stackoverflow.com/questions/44166228
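For what it's worth, a likely reason the question's code fails is that flights is a tibble. Subsetting a tibble with [, 1] gives back a one-column tibble rather than a vector, so length(cl) is 1 (the number of columns) and knn() complains that 'train' and 'class' have different lengths. Using $distance, as in the answer, or [[1]] yields a plain vector instead. Here is a minimal sketch of that behaviour; the toy tibble and its values are made up for illustration and are not taken from nycflights13.

```r
library(class)
library(tibble)

# Hypothetical stand-in for the three flights columns (values are invented)
toy <- tibble(
  dep_time = c(500, 530, 600, 900, 930, 1000),
  arr_time = c(700, 745, 810, 1100, 1150, 1230),
  distance = c(200, 200, 200, 1400, 1400, 1400)
)

train_rows <- c(1, 2, 4, 5)
train <- toy[train_rows, 1:2]
test  <- toy[c(3, 6), 1:2]

labels_tbl <- toy[train_rows, 3]        # still a one-column tibble
labels_vec <- toy$distance[train_rows]  # a plain numeric vector

length(labels_tbl)  # 1: length() of a tibble is its number of columns
length(labels_vec)  # 4: one label per training row

# labels_tbl[, 1] is again a tibble, so knn() rejects it;
# capture the error message instead of stopping the script
err <- tryCatch(
  knn(train = train, test = test, cl = labels_tbl[, 1], k = 1),
  error = function(e) conditionMessage(e)
)
print(err)

# With a vector of labels the call goes through
pred <- knn(train = train, test = test, cl = labels_vec, k = 1)
print(pred)
```

In the question's code, the equivalent change would presumably be cl = flights.trainlabels[[1]] (or cl = deparr$distance[ind == 1]) in place of cl = flights.trainlabels[,1].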