Handwritten Digit Recognition in Practice: From Classical Machine Learning to Deep Learning

Published: 2025-08-12

Keywords: MNIST dataset, SVM, neural network, model comparison

python

# Part 1: Recognizing handwritten digits with a scikit-learn SVM
from sklearn import datasets, svm, metrics
from sklearn.model_selection import train_test_split

# Load scikit-learn's digits dataset (8x8 images, a small MNIST-style set,
# not the full 28x28 MNIST), and flatten each image into a 64-dim vector
digits = datasets.load_digits()
X, y = digits.images.reshape((len(digits.images), -1)), digits.target

# Split into training and test sets (80/20)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and fit the SVM classifier
clf = svm.SVC(gamma=0.001)
clf.fit(X_train, y_train)

# Predict and evaluate
y_pred = clf.predict(X_test)
print(f"Classification report:\n{metrics.classification_report(y_test, y_pred)}")

Example output

text

              precision    recall  f1-score   support
           0       1.00      1.00      1.00        33
           1       1.00      1.00      1.00        28
           2       1.00      1.00      1.00        33
           3       1.00      0.97      0.99        34
...
    accuracy                           0.99       360
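To see where the few remaining errors fall, the report above can be supplemented with a confusion matrix. A minimal self-contained sketch that re-fits the same pipeline (same `gamma` and `random_state` as above):

```python
# Sketch: confusion matrix for the SVM pipeline above (re-fits from scratch).
from sklearn import datasets, svm, metrics
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
X = digits.images.reshape((len(digits.images), -1))
X_train, X_test, y_train, y_test = train_test_split(
    X, digits.target, test_size=0.2, random_state=42)

clf = svm.SVC(gamma=0.001)
clf.fit(X_train, y_train)

# Rows are true digits, columns are predicted digits;
# off-diagonal entries are the misclassifications.
cm = metrics.confusion_matrix(y_test, clf.predict(X_test))
print(cm)
```

With near-99% accuracy, almost all mass sits on the diagonal; the handful of off-diagonal cells show which digit pairs the SVM confuses.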

python

# Part 2: A neural network with PyTorch
import torch
import torch.nn as nn
import torch.optim as optim

# Define the neural network (64 input pixels -> 128 hidden -> 10 classes)
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(64, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Convert the training data to tensors
X_train_t = torch.FloatTensor(X_train)
y_train_t = torch.LongTensor(y_train)

# Train the model (full-batch gradient descent)
model = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

for epoch in range(100):
    optimizer.zero_grad()
    outputs = model(X_train_t)
    loss = criterion(outputs, y_train_t)
    loss.backward()
    optimizer.step()
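The loop above trains but never evaluates. A self-contained sketch that repeats the same setup, adds a fixed seed for reproducibility, and measures test accuracy (the seed and the `torch.no_grad()` evaluation step are additions, not part of the original listing):

```python
# Sketch: train the same MLP and measure accuracy on the held-out test set.
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn import datasets
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
X = digits.images.reshape((len(digits.images), -1))
X_train, X_test, y_train, y_test = train_test_split(
    X, digits.target, test_size=0.2, random_state=42)

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(64, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

torch.manual_seed(0)  # added for reproducibility
model = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

X_train_t = torch.FloatTensor(X_train)
y_train_t = torch.LongTensor(y_train)

for epoch in range(100):
    optimizer.zero_grad()
    loss = criterion(model(X_train_t), y_train_t)
    loss.backward()
    optimizer.step()

# Evaluate without tracking gradients
with torch.no_grad():
    preds = model(torch.FloatTensor(X_test)).argmax(dim=1)
    acc = (preds == torch.LongTensor(y_test)).float().mean().item()
print(f"Test accuracy: {acc:.3f}")
```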

Key takeaways

  • The SVM reaches ~99% accuracy on this small dataset

  • The neural network learns its features automatically, which tends to generalize better as datasets grow

  • Model-size comparison: the SVM is defined by its support vectors, the network by its weight matrices
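The last point can be made concrete by counting the network's learnable parameters (for the SVM, the analogous quantity is `clf.support_vectors_.shape[0]` after fitting):

```python
# Sketch: count the MLP's learnable parameters.
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(64, 128)  # 64*128 weights + 128 biases
        self.fc2 = nn.Linear(128, 10)  # 128*10 weights + 10 biases

model = Net()
n_params = sum(p.numel() for p in model.parameters())
print(n_params)  # 8192 + 128 + 1280 + 10 = 9610
```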