Training CenterNet on an RTX 4090

Published: 2024-03-27

Reference implementation

url=https://github.com/xingyizhou/CenterNet.git 
commit_id=5b1a490a52da57d3580e80b8bb4bbead9ef2af96

Python environment setup

The original repository's documentation calls for Python 3.6. In practice, Python 3.9 also works.
conda create --name py39_pt python=3.9

Installing PyTorch

conda install pytorch==2.0.1 torchvision==0.15.2 pytorch-cuda=11.8 -c pytorch -c nvidia

Installing COCOAPI

git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
make
python setup.py install --user

On Windows, running make fails with:

cl : Command line error D8021 : invalid numeric argument '/Wno-cpp'
error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.39.33519\\bin\\HostX86\\x64\\cl.exe' failed with exit code 2
Makefile:3: recipe for target 'all' failed
make: *** [all] Error 1

The flags are GCC-style options that MSVC's cl.exe does not accept. Comment out the compile arguments in setup.py:

#extra_compile_args=['-Wno-cpp', '-Wno-unused-function', '-std=c99'],
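
Commenting the flags out also drops them on Linux. As an alternative, the flags could be chosen per platform. The following is only a minimal sketch: the helper name coco_compile_args is hypothetical and not part of cocoapi, and it assumes MSVC needs no replacement flags.

```python
import sys


def coco_compile_args(platform=sys.platform):
    """Return extra compile flags for the pycocotools extension (hypothetical helper)."""
    if platform.startswith("win"):
        # MSVC's cl.exe rejects GCC-style flags such as -Wno-cpp, so pass none.
        return []
    # GCC and Clang accept the flags listed in the original setup.py line.
    return ["-Wno-cpp", "-Wno-unused-function", "-std=c99"]


if __name__ == "__main__":
    print(coco_compile_args())
```

In setup.py this would be used as extra_compile_args=coco_compile_args() in place of the commented-out line.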

Downloading the CenterNet source

git clone https://github.com/xingyizhou/CenterNet
pip install -r CenterNet/requirements.txt

Re-downloading and building DCNv2

cd CenterNet/src/lib/models/networks
git clone https://github.com/lucasjinreal/DCNv2_latest DCNv2
cd DCNv2
./make.sh

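If ./make.sh fails with ModuleNotFoundError: No module named 'torch', remove sudo from the last line of make.sh. Under sudo, python3 most likely resolves to the system interpreter rather than the one in the conda environment, so torch is not found.

Original script: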

#!/usr/bin/env bash
sudo rm *.so
sudo rm -r build/
sudo python3 setup.py build develop

Script after removing sudo:
#!/usr/bin/env bash
sudo rm *.so
sudo rm -r build/
python3 setup.py build develop

Running ./make.sh again now builds successfully.
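
The ModuleNotFoundError under sudo is consistent with sudo launching a different python3 than the one in the conda environment. A small sketch for checking this follows. The function name interpreter_report and the file name check_env.py are illustrative, not part of DCNv2.

```python
import importlib.util
import os
import sys


def interpreter_report():
    """Describe the running interpreter and whether it can locate torch."""
    return {
        # Full path of the Python binary that is executing this script.
        "executable": sys.executable,
        # Set by `conda activate`; typically missing when run through sudo.
        "conda_prefix": os.environ.get("CONDA_PREFIX"),
        # True if a torch package is visible on this interpreter's import path.
        "has_torch": importlib.util.find_spec("torch") is not None,
    }


if __name__ == "__main__":
    print(interpreter_report())
```

Saving this as check_env.py and comparing the output of python3 check_env.py with that of sudo python3 check_env.py should show whether the two commands use the same interpreter.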

Building NMS

cd src/lib/external
make

Testing the demo

On Windows:

python demo.py ctdet --demo ..\images\16004479832_a748d55f21_k.jpg --load_model ..\models\ctdet_coco_dla_2x.pth

On Linux:

python demo.py ctdet --demo ../images/16004479832_a748d55f21_k.jpg --load_model ../models/ctdet_coco_dla_2x.pth

Test error 1:

 raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

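Fix:

This error means the installed PyTorch is a CPU-only build. Reinstall a CUDA-enabled build with a matching torchvision.
Reference: https://blog.csdn.net/WOSHIRENXIN/article/details/127415609
Look up which torch and torchvision versions go together.
Find the matching .whl files at: https://download.pytorch.org/whl/torch_stable.html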
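
After reinstalling, it helps to confirm that the new build actually has CUDA support before running the demo again. The sketch below uses torch.version.cuda and torch.cuda.is_available(). The function name describe_torch_install is illustrative.

```python
import importlib.util


def describe_torch_install():
    """Summarize the installed PyTorch build, or report that it is missing."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}
    import torch

    return {
        "installed": True,
        # pip wheels built without CUDA typically carry a "+cpu" suffix here.
        "version": torch.__version__,
        # CUDA version the build was compiled against; None for CPU-only builds.
        "cuda_build": torch.version.cuda,
        # True only if the build has CUDA support and a usable GPU is visible.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch_install())
```

If cuda_build is None, the installed package is still a CPU-only build and the AssertionError will recur.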

Test error 2:

training chunk_sizes: [32]
The output will be saved to  E:\GPU\CenterNet\src\lib\..\..\exp\ctdet\default
heads {'hm': 80, 'wh': 2, 'reg': 2}
Creating model...
Downloading: "http://dl.yf.io/dla/models\imagenet\dla34-ba72cf86.pth" to C:\Users\xxx/.cache\torch\hub\checkpoints\dla34-ba72cf86.pth
Traceback (most recent call last):
  File "D:\Program Files\MindStudio 6.0.0\plugins\python-ce\helpers\pydev\pydevd.py", line 1496, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "D:\Program Files\MindStudio 6.0.0\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "E:/GPU/CenterNet/src/demo.py", line 57, in <module>
    demo(opt)
  File "E:/GPU/CenterNet/src/demo.py", line 21, in demo
    detector = Detector(opt)
  File "E:\GPU\CenterNet\src\lib\detectors\ctdet.py", line 26, in __init__
    super(CtdetDetector, self).__init__(opt)
  File "E:\GPU\CenterNet\src\lib\detectors\base_detector.py", line 24, in __init__
    self.model = create_model(opt.arch, opt.heads, opt.head_conv)
  File "E:\GPU\CenterNet\src\lib\models\model.py", line 28, in create_model
    model = get_model(num_layers=num_layers, heads=heads, head_conv=head_conv)
  File "E:\GPU\CenterNet\src\lib\models\networks\pose_dla_dcn.py", line 486, in get_pose_net
    model = DLASeg('dla{}'.format(num_layers), heads,
  File "E:\GPU\CenterNet\src\lib\models\networks\pose_dla_dcn.py", line 434, in __init__
    self.base = globals()[base_name](pretrained=pretrained)
  File "E:\GPU\CenterNet\src\lib\models\networks\pose_dla_dcn.py", line 314, in dla34
    model.load_pretrained_model(data='imagenet', name='dla34', hash='ba72cf86')
  File "E:\GPU\CenterNet\src\lib\models\networks\pose_dla_dcn.py", line 300, in load_pretrained_model
    model_weights = model_zoo.load_url(model_url)
  File "d:\miniconda3\envs\py39\lib\site-packages\torch\hub.py", line 746, in load_state_dict_from_url
    download_url_to_file(url, cached_file, hash_prefix, progress=progress)
  File "d:\miniconda3\envs\py39\lib\site-packages\torch\hub.py", line 611, in download_url_to_file
    u = urlopen(req)
  File "D:\Miniconda3\envs\py39\lib\urllib\request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "D:\Miniconda3\envs\py39\lib\urllib\request.py", line 523, in open
    response = meth(req, response)
  File "D:\Miniconda3\envs\py39\lib\urllib\request.py", line 632, in http_response
    response = self.parent.error(
  File "D:\Miniconda3\envs\py39\lib\urllib\request.py", line 561, in error
    return self._call_chain(*args)
  File "D:\Miniconda3\envs\py39\lib\urllib\request.py", line 494, in _call_chain
    result = func(*args)
  File "D:\Miniconda3\envs\py39\lib\urllib\request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
python-BaseException

Fix: download the model manually and copy it to C:\Users\xxxx/.cache\torch\hub\checkpoints\dla34-ba72cf86.pth
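
The URL in the log contains backslashes (models\imagenet\dla34-ba72cf86.pth), which suggests it was assembled with an OS path join on Windows. That is an inference from the log, not something confirmed against the source here. The sketch below reproduces the effect with the standard library. BASE_URL and model_url are illustrative names, and whether the server still hosts the file at the forward-slash URL has not been verified.

```python
import ntpath
import posixpath

BASE_URL = "http://dl.yf.io/dla/models"


def model_url(join, data="imagenet", name="dla34", hash_prefix="ba72cf86"):
    """Build the checkpoint URL using the given path-join function."""
    return join(BASE_URL, data, "{}-{}.pth".format(name, hash_prefix))


if __name__ == "__main__":
    # ntpath is what os.path resolves to on Windows: it joins with backslashes.
    print(model_url(ntpath.join))
    # posixpath always joins with forward slashes, which is what a URL needs.
    print(model_url(posixpath.join))
```

The first call yields the same malformed URL as in the traceback, and the second yields a well-formed one. Copying the checkpoint into the cache directory by hand sidesteps the download either way.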

Starting training

https://github.com/xingyizhou/CenterNet/blob/master/readme/GETTING_STARTED.md

python main.py ctdet --exp_id pascal_resdcn18_384 --arch resdcn_18 --dataset pascal --batch_size 32 --master_batch 15 --lr 1.25e-4 --gpus 0
