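In sklearn, the kNN classification algorithm is implemented by the KNeighborsClassifier class in the neighbors module. The following example uses the iris dataset that ships with sklearn: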
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn import neighbors
from sklearn.metrics import classification_report
# Load the iris dataset that ships with sklearn
iris = load_iris()
# Extract the feature data and the target labels
X = iris.data
y = iris.target
# Randomly split the dataset into training and test sets at a 9:1 ratio; random_state fixes the random seed so the split is reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.9, test_size=0.1, random_state=188)
# Instantiate the kNN classifier (default parameters)
knc = neighbors.KNeighborsClassifier()
# Fit the kNN classifier on the training data
knc.fit(X_train, y_train)
# Use the fitted knc classifier to predict labels for the test data
y_pred = knc.predict(X_test)
# classification_report builds a text report of the main classification metrics
print(classification_report(y_test, y_pred, target_names=['setosa', 'versicolor', 'virginica']))
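Running the program prints the following output (newer scikit-learn releases replace the "avg / total" row with accuracy, macro avg, and weighted avg rows):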
             precision    recall  f1-score   support

     setosa       1.00      1.00      1.00         7
 versicolor       1.00      1.00      1.00         2
  virginica       1.00      1.00      1.00         6

avg / total       1.00      1.00      1.00        15
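The example above instantiates KNeighborsClassifier with its default parameters, which means it uses n_neighbors=5. As a minimal sketch, the loop below refits the classifier with a few values of k on the same split and reports test accuracy through the score method. The candidate values (1, 3, 5, 7) and the names scores and default_k are illustrative assumptions, not part of the original example, and with only 15 test samples the differences between values of k may be small or zero.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Same dataset and split as in the example above
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.1, random_state=188
)

# The default number of neighbors used when none is given
default_k = KNeighborsClassifier().n_neighbors
print("default n_neighbors:", default_k)

# Illustrative (assumed) candidate values of k; not taken from the original text
scores = {}
for k in (1, 3, 5, 7):
    knc = KNeighborsClassifier(n_neighbors=k)
    knc.fit(X_train, y_train)
    # score returns the mean accuracy on the given test data and labels
    scores[k] = knc.score(X_test, y_test)
    print(f"n_neighbors={k}: test accuracy = {scores[k]:.2f}")
```

A single small test split gives only a rough comparison; cross-validation on the training set is usually a more reliable way to choose k.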