# [TF 2.X] Mask R-CNN for Object Detection and Segmentation
[Notice]: The original Mask R-CNN implementation uses TensorFlow 1.x. I modified it for TensorFlow 2.x.
### Development Environment
- OS : Ubuntu 20.04.2 LTS
- GPU : Geforce RTX 3090
- CUDA : 11.2
- Tensorflow : 2.5.0
- Keras : 2.5.0 (tensorflow backend)
- Python : 3.8
This is an implementation of [Mask R-CNN](https://arxiv.org/abs/1703.06870) on Python 3, Keras, and TensorFlow. The model generates bounding boxes and segmentation masks for each instance of an object in the image. It's based on Feature Pyramid Network (FPN) and a ResNet101 backbone.

*Instance segmentation sample*
The repository includes:
* Source code of Mask R-CNN built on FPN and ResNet101.
* Training code for MS COCO
* Pre-trained weights for MS COCO
* Jupyter notebooks to visualize the detection pipeline at every step
* ParallelModel class for multi-GPU training
* Evaluation on MS COCO metrics (AP)
* Example of training on your own dataset

The code is documented and designed to be easy to extend. If you use it in your research, please consider citing this repository (bibtex below).

You can see more examples [here](https://matterport.com/gallery/).
# Getting Started
* [demo.ipynb](samples/demo.ipynb) is the easiest way to start. It shows an example of using a model pre-trained on MS COCO to segment objects in your own images. It includes code to run object detection and instance segmentation on arbitrary images; a minimal inference sketch appears after this list.
* [train_shapes.ipynb](samples/shapes/train_shapes.ipynb) shows how to train Mask R-CNN on your own dataset. This notebook introduces a toy dataset (Shapes) to demonstrate training on a new dataset.
* ([model.py](mrcnn/model.py), [utils.py](mrcnn/utils.py), [config.py](mrcnn/config.py)): These files contain the main Mask RCNN implementation.
* [inspect_data.ipynb](samples/coco/inspect_data.ipynb). This notebook visualizes the different pre-processing steps to prepare the training data.
* [inspect_model.ipynb](samples/coco/inspect_model.ipynb). This notebook goes in depth into the steps performed to detect and segment objects. It provides visualizations of every step of the pipeline.
* [inspect_weights.ipynb](samples/coco/inspect_weights.ipynb). This notebook inspects the weights of a trained model and looks for anomalies and odd patterns.
# Step by Step Detection
To help with debugging and understanding the model, there are 3 notebooks
([inspect_data.ipynb](samples/coco/inspect_data.ipynb),
[inspect_model.ipynb](samples/coco/inspect_model.ipynb),
[inspect_weights.ipynb](samples/coco/inspect_weights.ipynb))
that provide a lot of visualizations and allow running the model step by step to inspect the output at each point. Here are a few examples:
## 1. Anchor sorting and filtering
Visualizes every step of the first stage Region Proposal Network and displays positive and negative anchors along with anchor box refinement.
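As a rough illustration of what those anchors are, here is a simplified numpy sketch of anchor generation for a single feature-map level; the repository's `utils.generate_pyramid_anchors` does the equivalent across all FPN levels, and the scale/stride values below are arbitrary examples:

```python
# Simplified numpy sketch of RPN anchor generation for one feature-map level.
import numpy as np

def generate_anchors(scale, ratios, feature_shape, feature_stride):
    """Return [N, (y1, x1, y2, x2)] anchors in image coordinates."""
    # Anchor heights/widths for each aspect ratio at this scale.
    heights = scale / np.sqrt(ratios)
    widths = scale * np.sqrt(ratios)
    # Anchor centers: one per feature-map cell, mapped back to the image.
    ys = np.arange(feature_shape[0]) * feature_stride
    xs = np.arange(feature_shape[1]) * feature_stride
    cx, cy = np.meshgrid(xs, ys)
    boxes = []
    for h, w in zip(heights, widths):
        boxes.append(np.stack([cy - h / 2, cx - w / 2,
                               cy + h / 2, cx + w / 2], axis=-1).reshape(-1, 4))
    return np.concatenate(boxes, axis=0)

# Example: 3 aspect ratios on a 32x32 feature map with stride 16.
anchors = generate_anchors(scale=128, ratios=np.array([0.5, 1, 2]),
                           feature_shape=(32, 32), feature_stride=16)
```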

## 2. Bounding Box Refinement
This is an example of final detection boxes (dotted lines) and the refinement applied to them (solid lines) in the second stage.
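Conceptually, the refinement applies predicted `(dy, dx, log(dh), log(dw))` deltas to each box, in the spirit of the repository's `utils.apply_box_deltas`. A minimal numpy sketch:

```python
# Hedged sketch of second-stage box refinement: shift the box center by the
# predicted offsets and rescale its size by the predicted log factors.
import numpy as np

def apply_box_deltas(boxes, deltas):
    """boxes: [N, (y1, x1, y2, x2)], deltas: [N, (dy, dx, log(dh), log(dw))]."""
    boxes = boxes.astype(np.float32)
    h = boxes[:, 2] - boxes[:, 0]
    w = boxes[:, 3] - boxes[:, 1]
    cy = boxes[:, 0] + 0.5 * h
    cx = boxes[:, 1] + 0.5 * w
    # Shift the center and rescale the size.
    cy += deltas[:, 0] * h
    cx += deltas[:, 1] * w
    h *= np.exp(deltas[:, 2])
    w *= np.exp(deltas[:, 3])
    return np.stack([cy - 0.5 * h, cx - 0.5 * w,
                     cy + 0.5 * h, cx + 0.5 * w], axis=1)
```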

## 3. Mask Generation
Examples of generated masks. These then get scaled and placed on the image in the right location.
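A minimal sketch of that scaling-and-placing step, in the spirit of the repository's `utils.unmold_mask`; `skimage` is assumed to be available and 0.5 is the usual binarization threshold:

```python
# Hedged sketch: place a predicted low-resolution mask (e.g. 28x28 floats)
# into the full image at its bounding box location.
import numpy as np
from skimage.transform import resize

def unmold_mask(mask, bbox, image_shape, threshold=0.5):
    """mask: [h, w] float mask; bbox: (y1, x1, y2, x2) in image coords."""
    y1, x1, y2, x2 = bbox
    # Scale the small mask to the box size, then binarize.
    mask = resize(mask, (y2 - y1, x2 - x1)) >= threshold
    # Paste it into an empty full-size canvas at the box location.
    full_mask = np.zeros(image_shape[:2], dtype=bool)
    full_mask[y1:y2, x1:x2] = mask
    return full_mask
```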

## 4. Layer activations
Often it's useful to inspect the activations at different layers to look for signs of trouble (all zeros or random noise).

## 5. Weight Histograms
Another useful debugging tool is to inspect the weight histograms. These are included in the inspect_weights.ipynb notebook.
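If you want the same check outside the notebook, a minimal matplotlib sketch is below; it assumes `model` is a loaded `MaskRCNN` instance, whose underlying Keras model is exposed as `model.keras_model`:

```python
# Hedged sketch: plot weight histograms for a trained model to spot
# anomalies (dead layers, exploding weights).
import matplotlib.pyplot as plt

for layer in model.keras_model.layers:
    weights = layer.get_weights()
    if not weights:
        continue  # skip layers without trainable weights
    plt.hist(weights[0].flatten(), bins=50)
    plt.title(layer.name)
    plt.show()
```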

## 6. Logging to TensorBoard
TensorBoard is another great debugging and visualization tool. The model is configured to log losses and save weights at the end of every epoch, so you can point TensorBoard at the log directory (by default `logs/`) to browse them.

## 7. Composing the different pieces into a final result

# Training on MS COCO
We're providing pre-trained weights for MS COCO to make it easier to start.
You can use those weights as a starting point to train your own variation on the network.
Training and evaluation code is in `samples/coco/coco.py`.
You can import this module in a Jupyter notebook (see the provided notebooks for examples) or you can run it directly from the command line as such:
```
# Train a new model starting from pre-trained COCO weights
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=coco
# Train a new model starting from ImageNet weights
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=imagenet
# Continue training a model that you had trained earlier
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=/path/to/weights.h5
# Continue training the last model you trained. This will find
# the last trained weights in the model directory.
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=last
```
You can also run the COCO evaluation code with:
```
# Run COCO evaluation on the last trained model
python3 samples/coco/coco.py evaluate --dataset=/path/to/coco/ --model=last
```
The training schedule, learning rate, and other parameters should be set in `samples/coco/coco.py`.
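One hedged way to do that without editing the file is to subclass the COCO config and override the attributes you need; the attribute names below come from `mrcnn/config.py`, while the values and the class name are illustrative:

```python
# Hedged sketch: override training parameters by subclassing the COCO config.
from samples.coco.coco import CocoConfig

class TrainingConfig(CocoConfig):
    LEARNING_RATE = 0.001   # the repo default; the paper's 0.02 tends to explode here
    STEPS_PER_EPOCH = 1000  # gradient updates per epoch
    IMAGES_PER_GPU = 2      # effective batch size = IMAGES_PER_GPU * GPU_COUNT
    GPU_COUNT = 1
```

You would then pass an instance of this config to `modellib.MaskRCNN` when building the model yourself.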
# Training on Your Own Dataset
Start by reading this [blog post about the balloon color splash sample](https://engineering.matterport.com/splash-of-color-instance-segmentation-with-mask-r-cnn-and-tensorflow-7c761e238b46). It covers the process starting from annotating images to training to using the results in a sample application.

In summary, to train the model on your own dataset you'll need to extend two classes:
```Config```
This class contains the default configuration. Subclass it and modify the attributes you need to change.
```Dataset```
This class provides a consistent way to work with any dataset.
It allows you to use new datasets for training without having to change the code of the model.
It also supports loading multiple datasets at the same time, which is useful if the objects you want to detect are not all available in one dataset.
See examples in
`samples/shapes/train_shapes.ipynb`,
`samples/coco/coco.py`,
`samples/balloon/balloon.py`,
`samples/nucleus/nucleus.py`, and the minimal sketch below.
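Below is a minimal, hypothetical sketch of both subclasses, loosely modeled on the shapes and balloon samples; the class names, file layout, and mask shapes are placeholders you would replace with your own data:

```python
# Hedged sketch of the two subclasses needed for a custom dataset.
import numpy as np
from mrcnn.config import Config
from mrcnn import utils

class MyConfig(Config):
    NAME = "my_dataset"
    NUM_CLASSES = 1 + 1  # background + 1 object class
    STEPS_PER_EPOCH = 100

class MyDataset(utils.Dataset):
    def load_my_dataset(self, dataset_dir):
        # Register classes and images; IDs and paths are placeholders.
        self.add_class("my_dataset", 1, "object")
        self.add_image("my_dataset", image_id=0,
                       path=f"{dataset_dir}/image_0.png")

    def load_mask(self, image_id):
        # Return (masks [H, W, instance_count] bool, class_ids [instance_count]).
        masks = np.zeros([128, 128, 1], dtype=bool)  # placeholder masks
        class_ids = np.array([1], dtype=np.int32)
        return masks, class_ids

dataset = MyDataset()
dataset.load_my_dataset("/path/to/data")
dataset.prepare()  # must be called before training on the dataset
```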
## Differences from the Official Paper
This implementation follows the Mask RCNN paper for the most part, but there are a few cases where we deviated in favor of code simplicity and generalization. These are some of the differences we're aware of. If you encounter other differences, please do let us know.
* **Image Resizing:**
To support training multiple images per batch we resize all images to the same size.
For example, 1024x1024px on MS COCO. We preserve the aspect ratio, so if an image is not square we pad it with zeros.
In the paper the resizing is done such that the smallest side is 800px and the largest is trimmed at 1000px.
* **Bounding Boxes:**
Some datasets provide bounding boxes and some provide masks only. To support training on multiple datasets we opted to ignore the bounding boxes that come with the dataset and generate them on the fly instead.
We pick the smallest box that encapsulates all the pixels of the mask as the bounding box (see the sketch after this list). This simplifies the implementation and also makes it easy to apply image augmentations that would otherwise be harder to apply to bounding boxes, such as image rotation.
To validate this approach, we compared our computed bounding boxes to those provided by the COCO dataset.
We found that ~2% of bounding boxes differed by 1px or more, ~0.05% differed by 5px or more, and only 0.01% differed by 10px or more.
* **Learning Rate:**
The paper uses a learning rate of 0.02, but we found that to be too high, and it often causes the weights to explode, especially when using a small batch size.
It might be related to differences between how Caffe and TensorFlow compute gradients (sum vs mean across batches and GPUs).
Or, maybe the official model uses gradient clipping to avoid this issue. We do use gradient clipping, but don't set it too aggressively. We found that smaller learning rates converge faster anyway, so we go with that.
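A minimal numpy sketch of the on-the-fly box computation referenced in the Bounding Boxes item above, in the spirit of the repository's `utils.extract_bboxes`:

```python
# Hedged sketch: compute the tightest bounding box containing all mask pixels.
import numpy as np

def bbox_from_mask(mask):
    """mask: [H, W] boolean array -> (y1, x1, y2, x2), with y2/x2 exclusive."""
    ys, xs = np.where(mask)
    if ys.size == 0:
        return np.zeros(4, dtype=np.int32)  # empty mask -> empty box
    return np.array([ys.min(), xs.min(), ys.max() + 1, xs.max() + 1])
```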
## Citation
Use this bibtex to cite this repository:
```
@misc{matterport_maskrcnn_2017,
title={Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow},
author={Waleed Abdulla},
year={2017},
publisher={Github},
journal={GitHub repository},
howpublished={\url{https://github.com/matterport/Mask_RCNN}},
}
```
## Requirements
Python 3.8, TensorFlow 2.5, Keras 2.5, and other common packages listed in `requirements.txt` (the upstream repository targets Python 3.4, TensorFlow 1.3, and Keras 2.0.8).
### MS COCO Requirements:
To train or test on MS COCO, you'll also need:
* pycocotools (installation instructions below)
* [MS COCO Dataset](http://cocodataset.org/#home)
* Download the 5K [minival](https://dl.dropboxusercontent.com/s/o43o90bna78omob/instances_minival2014.json.zip?dl=0)
and the 35K [validation-minus-minival](https://dl.dropboxusercontent.com/s/s3tw5zcg7395368/instances_valminusminival2014.json.zip?dl=0)
subsets. More details in the original [Faster R-CNN implementation](https://github.com/rbgirshick/py-faster-rcnn/blob/master/data/README.md).
If you use Docker, the code has been verified to work on
[this Docker container](https://hub.docker.com/r/waleedka/modern-deep-learning/).
## Installation
1. Clone this repository
2. Install dependencies
```bash
pip3 install -r requirements.txt
```
3. Run setup from the repository root directory
```bash
python3 setup.py install
```
4. Download pre-trained COCO weights (mask_rcnn_coco.h5) from the [releases page](https://github.com/matterport/Mask_RCNN/releases).
5. (Optional) To train or test on MS COCO install `pycocotools` from one of these repos. They are forks of the original pycocotools with fixes for Python3 and Windows (the official repo doesn't seem to be active anymore).
* Linux: https://github.com/waleedka/coco
* Windows: https://github.com/philferriere/cocoapi. You must have the Visual C++ 2015 build tools on your path (see the repo for additional details).
# Projects Using this Model
If you extend this model to other datasets or build projects that use it, we'd love to hear from you.
### [4K Video Demo](https://www.youtube.com/watch?v=OOT3UIXZztE) by Karol Majek.
### [Images to OSM](https://github.com/jremillard/images-to-osm): Improve OpenStreetMap by adding baseball, soccer, tennis, football, and basketball fields.
![Identify sport fields in satellite images](assets/images_to_osm.png)
### [Splash of Color](https://engineering.matterport.com/splash-of-color-instance-segmentation-with-mask-r-cnn-and-tensorflow-7c761e238b46). A blog post explaining how to train this model from scratch and use it to implement a color splash effect.

### [Segmenting Nuclei in Microscopy Images](samples/nucleus).
Built for the [2018 Data Science Bowl](https://www.kaggle.com/c/data-science-bowl-2018)
Code is in the `samples/nucleus` directory.

### [Detection and Segmentation for Surgery Robots](https://github.com/SUYEgit/Surgery-Robot-Detection-Segmentation) by the NUS Control & Mechatronics Lab.

### [Reconstructing 3D buildings from aerial LiDAR](https://medium.com/geoai/reconstructing-3d-buildings-from-aerial-lidar-with-ai-details-6a81cb3079c0)
A proof of concept project by [Esri](https://www.esri.com/), in collaboration with Nvidia and Miami-Dade County. Along with a great write up and code by Dmitry Kudinov, Daniel Hedges, and Omar Maher.

### [Usiigaci: Label-free Cell Tracking in Phase Contrast Microscopy](https://github.com/oist/usiigaci)
A project from Japan to automatically track cells in a microfluidics platform. Paper is pending, but the source code is released.
 
### [Characterization of Arctic Ice-Wedge Polygons in Very High Spatial Resolution Aerial Imagery](http://www.mdpi.com/2072-4292/10/9/1487)
Research project to understand the complex processes between degradations in the Arctic and climate change. By Weixing Zhang, Chandi Witharana, Anna Liljedahl, and Mikhail Kanevskiy.

### [Mask-RCNN Shiny](https://github.com/huuuuusy/Mask-RCNN-Shiny)
A computer vision class project by HU Shiyu to apply the color pop effect on people with beautiful results.

### [Mapping Challenge](https://github.com/crowdAI/crowdai-mapping-challenge-mask-rcnn): Convert satellite imagery to maps for use by humanitarian organisations.

### [GRASS GIS Addon](https://github.com/ctu-geoforall-lab/i.ann.maskrcnn) to generate vector masks from geospatial imagery. Based on a [Master's thesis](https://github.com/ctu-geoforall-lab-projects/dp-pesek-2018) by Ondřej Pešek.
