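1. Installing the Atlas 300I Duo inference card
This 48 GB Atlas 300I Duo inference card first ran the DeepSeek-R1-Distill-Qwen-7B large language model through MindIE + WebUI, but for reasons not yet understood its answers were somewhat incoherent. This guide therefore covers running DeepSeek-R1-Distill-Qwen-14B instead.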
1.1 Server OS and kernel
CPU | OS version | Kernel version |
---|---|---|
S5000C | Kylin V10 | 4.19.90-89.11.v2401.ky10.aarch64 |
P.S. After installing the OS, do not run yum update -y right away. It upgrades the kernel from 4.19.90-89.11 to 4.19.90-89.21, and the Atlas 300I Duo driver package then fails to install.
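A small guard can make this precondition explicit before the driver is installed. The sketch below is only an illustration: the function name kernel_is_supported and the EXPECTED_KERNEL variable are hypothetical and not part of the official installation procedure. It compares a kernel release string against the release listed in the table in section 1.1.

```shell
# Kernel release that the driver package was installed against in this guide.
EXPECTED_KERNEL="4.19.90-89.11.v2401.ky10.aarch64"

# kernel_is_supported RELEASE
# Returns 0 when RELEASE equals the expected kernel release, 1 otherwise.
kernel_is_supported() {
    [ "$1" = "$EXPECTED_KERNEL" ]
}

# Typical use on the server (not executed here):
#   if kernel_is_supported "$(uname -r)"; then
#       echo "kernel OK, safe to install the driver"
#   else
#       echo "unexpected kernel $(uname -r); do not install yet" >&2
#   fi
```

If packages must be updated anyway, yum's --exclude option (for example yum update -y --exclude='kernel*') skips kernel packages. Whether that alone keeps the driver installable has not been verified here.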
1.2 System environment
Server IP address: 192.168.2.71
Login user: root
Open a new terminal and run the following command to confirm that the Atlas 300I Duo inference card is detected:
lspci | grep Huawei
If the card is present, the output is:
0000:01:00.0 Processing accelerators: Huawei Technologies Co., Ltd. Device d500 (rev 23)
uname -a
Output:
Linux localhost.localdomain 4.19.90-89.11.v2401.ky10.aarch64 #1 SMP Thu Apr 25 18:20:10 CST 2024 aarch64 aarch64 aarch64 GNU/Linux
cat /etc/*release
Output:
Kylin Linux Advanced Server release V10 (Halberd)
DISTRIB_ID=Kylin
DISTRIB_RELEASE=V10
DISTRIB_CODENAME=Halberd
DISTRIB_DESCRIPTION="Kylin V10"
DISTRIB_KYLIN_RELEASE=V10
DISTRIB_VERSION_TYPE=enterprise
DISTRIB_VERSION_MODE=normal
NAME="Kylin Linux Advanced Server"
VERSION="V10 (Halberd)"
ID="kylin"
VERSION_ID="V10"
PRETTY_NAME="Kylin Linux Advanced Server V10 (Halberd)"
ANSI_COLOR="0;31"
Kylin Linux Advanced Server release V10 (Halberd)
############################################################################################
1.3 Preparing to install the driver and firmware
1.3.1 Create the HwHiAiUser user
groupadd HwHiAiUser
useradd -g HwHiAiUser -d /home/HwHiAiUser -m HwHiAiUser -s /bin/bash
1.3.2 Prepare and install the driver and firmware packages
Ascend-hdk-310p-npu-driver_24.1.0.1_linux-aarch64.run
Ascend-hdk-310p-npu-firmware_7.5.0.5.220.run
Place the downloaded installation files in the /root/work directory:
cd /root/work
chmod +x *
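Install the driver and firmware by following the method described in chapter "2 物理机安装与卸载" (installation and uninstallation on a physical machine) of the document 《Atlas 中心推理卡 24.1.0 NPU驱动和固件安装指南 02.pdf》.
Because this Atlas 300I Duo card is newly purchased, this is a first-time installation: install the driver first, then the firmware.
Install the driver: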
./Ascend-hdk-310p-npu-driver_24.1.0.1_linux-aarch64.run --check
./Ascend-hdk-310p-npu-driver_24.1.0.1_linux-aarch64.run --full
On success, the output is:
Driver package installed successfully! The new version takes effect immediately.
Install the firmware:
./Ascend-hdk-310p-npu-firmware_7.5.0.5.220.run --check
./Ascend-hdk-310p-npu-firmware_7.5.0.5.220.run --full
On success, the output is:
Firmware package installed successfully! Reboot now or after driver installation for the installation/upgrade to take effect.
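Run the reboot command to restart the server.
If the driver and firmware are installed correctly, running npu-smi info reports the card's information.
The driver and firmware installation is now complete.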
############################################################################################
2. Installing Docker
Kylin V10 does not ship with the docker command, so it must be installed manually. For reference: https://blog.csdn.net/weixin_43273656/article/details/145469516
2.1 Check the kernel version
uname -a
Output:
Linux localhost.localdomain 4.19.90-89.11.v2401.ky10.aarch64 #1 SMP Thu Apr 25 18:20:10 CST 2024 aarch64 aarch64 aarch64 GNU/Linux
2.2 Check the kernel version details
cat /proc/version
Output:
Linux version 4.19.90-89.11.v2401.ky10.aarch64 (root@localhost.localdomain) (gcc version 7.3.0 (GCC)) #1 SMP Thu Apr 25 18:20:10 CST 2024
2.3 Check detailed system and kernel information
hostnamectl
Output:
Static hostname: localhost.localdomain
Icon name: computer-server
Chassis: server
Machine ID: 889689ba3a9f48c4985c1519c2d8f553
Boot ID: 24cf07b36d6d4db69befaca323c4be93
Operating System: Kylin Linux Advanced Server V10 (Halberd)
Kernel: Linux 4.19.90-89.11.v2401.ky10.aarch64
Architecture: arm64
Conclusion: download the official offline Docker package for aarch64; docker-27.2.0.tgz is used here.
2.4 Place the downloaded file in the /root/work directory and extract it
cd /root/work
tar -zxvf docker-27.2.0.tgz
2.5 Move the Docker binaries
mv /root/work/docker/* /usr/bin/
2.6 Edit docker.service
vim /usr/lib/systemd/system/docker.service
Add the following content:
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target
[Service]
Type=notify
ExecStart=/usr/bin/dockerd
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=infinity
LimitNPROC=infinity
TimeoutStartSec=0
Delegate=yes
KillMode=process
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s
[Install]
WantedBy=multi-user.target
2.7 Edit the daemon.json file
mkdir -p /etc/docker
vim /etc/docker/daemon.json
Add the following content:
{
"exec-opts": ["native.cgroupdriver=systemd"],
"insecure-registries": [
"http://172.31.192.88:81","http://111.51.123.456:2222"
]
}
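daemon.json has to be strict JSON with straight double quotes, and text pasted from web pages sometimes arrives with typographic quotes that dockerd cannot parse. As a rough pre-flight check, the following sketch (the function name has_typographic_quotes is hypothetical) reports whether a file contains such characters. It does not validate the JSON structure itself.

```shell
# has_typographic_quotes FILE
# Returns 0 if FILE contains curly single or double quotes, which are not
# valid JSON string delimiters; returns 1 if none are found.
has_typographic_quotes() {
    grep -F -q -e '“' -e '”' -e '‘' -e '’' "$1"
}

# Typical use on the server (not executed here):
#   if has_typographic_quotes /etc/docker/daemon.json; then
#       echo "replace curly quotes with straight quotes" >&2
#   fi
```

Where Python 3 is available, python3 -m json.tool /etc/docker/daemon.json is a fuller check, since it parses the whole file.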
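2.8 Run the daemon to start Docker
dockerd
Note that dockerd started this way runs in the foreground of the terminal.
2.9 Other Docker commands
- Start
systemctl start docker
- Check status
systemctl status docker
- Enable start at boot
systemctl enable docker
Reboot the server with reboot, then carry out the steps below.
3. Installation and deployment
3.1 Pull the image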
https://www.hiascend.com/developer/ascendhub/detail/af85b724a7e5469ebd7ea13c3439d48f
Switch to the image versions page, find the 1.0.0-300I-Duo-py311-openeuler24.03-lts image, click download, and follow the instructions to pull the image onto the server:
3.1.1 docker login -u cn-south-1@HST3UBLG0X38GM0FMAGK swr.cn-south-1.myhuaweicloud.com
3.1.2 Password: [d153e20f53b515e9f388f5bedf341c09b22b573e143c0cf33e1dd1f834535862]
3.1.3 docker pull swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie:1.0.0-300I-Duo-py311-openeuler24.03-lts
After the pull completes, run docker images and confirm that the mindie image is listed.
3.2 Create the container
docker run -it -d --net=host --shm-size=1g \
--privileged \
--name sakway \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver:ro \
-v /usr/local/sbin:/usr/local/sbin:ro \
-v /root/work:/root/work:rw \
-v /path-to-weights:/path-to-weights:ro \
swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie:1.0.0-300I-Duo-py311-openeuler24.03-lts bash
3.3 List the Docker containers
docker ps -a
3.4 Enter the container:
docker exec -it sakway bash
3.5 Download the model weights:
3.5.1 Once inside the container:
cd /root/work/
3.5.2 Install the modelscope command:
pip install modelscope --index-url https://mirrors.huaweicloud.com/repository/pypi/simple/
3.5.3 Download the weights:
modelscope download --model deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
Move the weights to the /root/work/ directory:
mv /root/.cache/modelscope/hub/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B /root/work/
With the weights in /root/work/, tighten the file permissions and set the model's config.json to mode 750:
cd /root/work/
chmod 750 /root/work/DeepSeek-R1-Distill-Qwen-14B/config.json
vim /root/work/DeepSeek-R1-Distill-Qwen-14B/config.json
In this file, change "torch_dtype": "bfloat16", to "torch_dtype": "float16",
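The same edit can be scripted instead of made by hand in vim. The sketch below is one possible approach, not the author's method: set_torch_dtype_fp16 is a hypothetical helper that keeps a .bak copy and rewrites the file in place, so the permissions set with chmod are preserved. It assumes the key and value sit on one line, as they do in the model's config.json.

```shell
# set_torch_dtype_fp16 CONFIG_JSON
# Rewrites "torch_dtype": "bfloat16" to "torch_dtype": "float16" in place,
# keeping a backup copy next to the file as CONFIG_JSON.bak.
set_torch_dtype_fp16() {
    cp "$1" "$1.bak" &&
    sed 's/"torch_dtype"[[:space:]]*:[[:space:]]*"bfloat16"/"torch_dtype": "float16"/' "$1.bak" > "$1"
}

# Typical use inside the container (not executed here):
#   set_torch_dtype_fp16 /root/work/DeepSeek-R1-Distill-Qwen-14B/config.json
```

The change is presumably needed because the 310P chip on this card works with float16 rather than bfloat16 weights; that reasoning is an assumption, as the original text does not state it.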
vim /usr/local/Ascend/mindie/latest/mindie-service/conf/config.json
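Nine settings need to be changed in this file. The /usr/local/Ascend/mindie/latest/mindie-service/conf/ directory holds both the modified config.json and the original config.json_org, so the specific changes can be found by comparing the two. The modified file is: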
{
"Version" : "1.1.0",
"LogConfig" :
{
"logLevel" : "Info",
"logFileSize" : 20,
"logFileNum" : 20,
"logPath" : "logs/mindservice.log"
},
"ServerConfig" :
{
"ipAddress" : "192.168.2.71",
"managementIpAddress" : "127.0.0.2",
"port" : 1040,
"managementPort" : 1041,
"metricsPort" : 1042,
"allowAllZeroIpListening" : false,
"maxLinkNum" : 1000,
"httpsEnabled" :false,
"fullTextEnabled" : false,
"tlsCaPath" : "security/ca/",
"tlsCaFile" : ["ca.pem"],
"tlsCert" : "security/certs/server.pem",
"tlsPk" : "security/keys/server.key.pem",
"tlsPkPwd" : "security/pass/key_pwd.txt",
"tlsCrlPath" : "security/certs/",
"tlsCrlFiles" : ["server_crl.pem"],
"managementTlsCaFile" : ["management_ca.pem"],
"managementTlsCert" : "security/certs/management/server.pem",
"managementTlsPk" : "security/keys/management/server.key.pem",
"managementTlsPkPwd" : "security/pass/management/key_pwd.txt",
"managementTlsCrlPath" : "security/management/certs/",
"managementTlsCrlFiles" : ["server_crl.pem"],
"kmcKsfMaster" : "tools/pmt/master/ksfa",
"kmcKsfStandby" : "tools/pmt/standby/ksfb",
"inferMode" : "standard",
"interCommTLSEnabled" : true,
"interCommPort" : 1121,
"interCommTlsCaPath" : "security/grpc/ca/",
"interCommTlsCaFiles" : ["ca.pem"],
"interCommTlsCert" : "security/grpc/certs/server.pem",
"interCommPk" : "security/grpc/keys/server.key.pem",
"interCommPkPwd" : "security/grpc/pass/key_pwd.txt",
"interCommTlsCrlPath" : "security/grpc/certs/",
"interCommTlsCrlFiles" : ["server_crl.pem"],
"openAiSupport" : "vllm"
},
"BackendConfig" : {
"backendName" : "mindieservice_llm_engine",
"modelInstanceNumber" : 1,
"npuDeviceIds" : [[0,1]],
"tokenizerProcessNumber" : 8,
"multiNodesInferEnabled" : false,
"multiNodesInferPort" : 1120,
"interNodeTLSEnabled" : true,
"interNodeTlsCaPath" : "security/grpc/ca/",
"interNodeTlsCaFiles" : ["ca.pem"],
"interNodeTlsCert" : "security/grpc/certs/server.pem",
"interNodeTlsPk" : "security/grpc/keys/server.key.pem",
"interNodeTlsPkPwd" : "security/grpc/pass/mindie_server_key_pwd.txt",
"interNodeTlsCrlPath" : "security/grpc/certs/",
"interNodeTlsCrlFiles" : ["server_crl.pem"],
"interNodeKmcKsfMaster" : "tools/pmt/master/ksfa",
"interNodeKmcKsfStandby" : "tools/pmt/standby/ksfb",
"ModelDeployConfig" :
{
"maxSeqLen" : 2560,
"maxInputTokenLen" : 2048,
"truncation" : false,
"ModelConfig" : [
{
"modelInstanceType" : "Standard",
"modelName" : "DeepSeek-R1-Distill-Qwen-14B",
"modelWeightPath" : "/root/work/DeepSeek-R1-Distill-Qwen-14B",
"worldSize" : 2,
"cpuMemSize" : 5,
"npuMemSize" : -1,
"backendType" : "atb",
"trustRemoteCode" : false
}
]
},
"ScheduleConfig" :
{
"templateType" : "Standard",
"templateName" : "Standard_LLM",
"cacheBlockSize" : 128,
"maxPrefillBatchSize" : 50,
"maxPrefillTokens" : 8192,
"prefillTimeMsPerReq" : 150,
"prefillPolicyType" : 0,
"decodeTimeMsPerReq" : 50,
"decodePolicyType" : 0,
"maxBatchSize" : 200,
"maxIterTimes" : 512,
"maxPreemptCount" : 0,
"supportSelectBatch" : false,
"maxQueueDelayMicroseconds" : 5000
}
}
}
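Since the modified and original files sit side by side, the changed settings can be listed with diff rather than by eye. This is a minimal sketch (show_config_changes is a hypothetical helper name); it treats "files differ" as a normal outcome and only fails when diff itself cannot run.

```shell
# show_config_changes ORIGINAL MODIFIED
# Prints a unified diff of the two files. diff exits with status 1 when the
# files differ, so statuses 0 and 1 are both treated as success here; only
# status 2 or higher (for example a missing file) is reported as failure.
show_config_changes() {
    status=0
    diff -u "$1" "$2" || status=$?
    [ "$status" -le 1 ]
}

# Typical use inside the container (not executed here):
#   cd /usr/local/Ascend/mindie/latest/mindie-service/conf/
#   show_config_changes config.json_org config.json
```

Lines starting with - come from config.json_org and lines starting with + from the modified config.json.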
4. Starting the service (loading the model takes some time)
cd /usr/local/Ascend/mindie/latest/mindie-service/bin&&./mindieservice_daemon
Success indicator:
Daemon start success!
4.1 Command-line inference
Open a new terminal (requests can be sent from outside the container):
curl 192.168.2.71:1040/generate -d '{
"prompt": "请输出100个生僻字?",
"max_tokens": 32,
"stream": false,
"do_sample":true,
"repetition_penalty": 1.00,
"temperature": 0.01,
"top_p": 0.001,
"top_k": 1,
"model": "qwen"
}'
The answer comes back in about 3 seconds.
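To try other prompts or a larger max_tokens value without retyping the request body, the body can be generated by a small helper. The sketch below is an illustration only: build_generate_payload is a hypothetical name, the sampling parameters are copied from the request above, and no JSON escaping is performed, so the prompt must not contain double quotes, backslashes, or control characters.

```shell
# build_generate_payload PROMPT MAX_TOKENS
# Prints a JSON body for the /generate endpoint using the sampling
# parameters from this guide. Assumption: PROMPT needs no JSON escaping.
build_generate_payload() {
    printf '{"prompt": "%s", "max_tokens": %d, "stream": false, "do_sample": true, "repetition_penalty": 1.00, "temperature": 0.01, "top_p": 0.001, "top_k": 1, "model": "qwen"}\n' "$1" "$2"
}

# Typical use (not executed here; 256 is an arbitrary example value):
#   curl 192.168.2.71:1040/generate -d "$(build_generate_payload "请输出100个生僻字?" 256)"
```

Because max_tokens caps the length of the reply, a value of 32 cuts off an answer that is meant to list 100 characters; raising it is the main reason to parameterize the request.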
5. MindIE + WebUI
Stop the firewall (run this outside the container):
systemctl stop firewalld
Install the WebUI:
Open a new terminal and enter the container:
docker exec -it sakway bash
cd /root/work/
pip install open-webui --index-url https://mirrors.huaweicloud.com/repository/pypi/simple/
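Installing open-webui takes roughly ten minutes.
Once it is installed, go to the terminal running the service and press Ctrl+C to stop the service process (mindieservice_daemon), then check the configuration: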
vim /usr/local/Ascend/mindie/latest/mindie-service/conf/config.json
Change "ipAddress" : "127.0.0.1", to the server's actual IP address, "ipAddress" : "192.168.2.71",
If it already reads "ipAddress" : "192.168.2.71", no change is needed.
Start the Open-WebUI service:
open-webui serve
Success indicator:
A large OPEN WEBUI logo is printed in the terminal.
Open a new terminal and enter the container:
docker exec -it sakway bash
Start the service (loading the model takes some time):
cd /usr/local/Ascend/mindie/latest/mindie-service/bin&&./mindieservice_daemon
Success indicator:
Daemon start success!
Open the following address in a web browser:
http://192.168.2.71:8080
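Click "Get started".
On first use, create the administrator account:
Name: sakway
Email: jliangli@126.com
Password: ABC123
Click "Create Admin Account"; a message confirms that registration succeeded and that you are logged in.
Click confirm to start using it.
Click the colored circular icon in the top-right corner and choose the Admin Panel.
Click Settings, at the right end of the top row.
Click Connections on the left.
Under "Manage OpenAI API Connections", replace the default URL with the server's actual address:
change https://api.openai.com/v1 to http://192.168.2.71:1040/v1
Click the settings icon at the far right of that row, click refresh, then click Save in the "Edit Connection" dialog.
Open a new browser window and go to http://192.168.2.71:8080 to start a conversation.
Other PCs on the same LAN can also open http://192.168.2.71:8080 in a browser to start new conversations.
The large language model is now running successfully.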
############################################################################################