
Commit 3af2e21

littletomatodonkey authored and qingqing01 committed
Add models based on OpenImage-v5 and Object365 (#26)
* Add cascade_rcnn_cls_aware_r200_vd_fpn_dcnv2_nonlocal_softnms model on oidv5 and obj365.
* Add obj365 config annotation.
* Update docs.
1 parent dadbbca commit 3af2e21

14 files changed: +1016 −6 lines

README.md

Lines changed: 7 additions & 1 deletion
@@ -71,6 +71,7 @@
 - [Pretrained models for pedestrian and vehicle detection](contrib/README_cn.md) Detection models for different scenarios
 - [YOLOv3 enhanced model](docs/YOLOv3_ENHANCEMENT.md) Improves the original YOLOv3 to 41.4% mAP (33.0% in the original paper) and speeds up inference as well
 - [Objects365 2019 Challenge champion model](docs/CACascadeRCNN.md) One of the best single models in the Objects365 Full Track, reaching 31.7% mAP
+- [Open Images V5 and Objects365 dataset models](docs/OIDV5_BASELINE_MODEL.md)


 ## Model compression
@@ -90,8 +91,13 @@

 ## Version updates

-### 10/2019
+### 21/11/2019
+- Add the CascadeClsAware RCNN model.
+- Add CBNet, ResNet200 and Non-local models.
+- Add Soft-NMS.
+- Add models for the Open Images V5 and Objects365 datasets.

+### 10/2019
 - Add the enhanced YOLOv3 model, mAP up to 41.4%.
 - Add face detection models BlazeFace and Faceboxes.
 - Enrich COCO-based models, mAP up to 51.9%.

README_en.md

Lines changed: 7 additions & 0 deletions
@@ -80,6 +80,7 @@ Advanced Features:
 - [Pretrained models for pedestrian and vehicle detection](contrib/README.md) Models for object detection in specific scenarios.
 - [YOLOv3 enhanced model](docs/YOLOv3_ENHANCEMENT.md) Compared to the 33.0% mAP reported in the paper, the enhanced YOLOv3 reaches 41.4% mAP, and inference speed is improved as well.
 - [Objects365 2019 Challenge champion model](docs/CACascadeRCNN.md) One of the best single models in the Objects365 Full Track, with mAP of 31.7%.
+- [Open Images Dataset V5 and Objects365 Dataset models](docs/OIDV5_BASELINE_MODEL.md)

 ## Model compression

@@ -98,6 +99,12 @@ Advanced Features:

 ## Updates

+#### 21/11/2019
+- Add the CascadeClsAware RCNN model.
+- Add CBNet, ResNet200 and Non-local models.
+- Add Soft-NMS.
+- Add models for Open Images Dataset V5 and Objects365 Dataset.
+
 #### 10/2019

 - Add enhanced YOLOv3 models, box mAP up to 41.4%.
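
Both update lists above mention Soft-NMS, which the new config below enables through MultiClassSoftNMS (softnms_sigma: 0.15, score_threshold: 0.001, keep_top_k: 300). For reference, here is a minimal NumPy sketch of the standard Gaussian Soft-NMS rescoring rule, s <- s * exp(-IoU^2 / sigma): overlapping boxes are down-weighted rather than discarded. This is an illustration of the technique only, not PaddleDetection's MultiClassSoftNMS implementation; the function name and box layout are assumptions for the example.

import numpy as np

def soft_nms_gaussian(boxes, scores, sigma=0.15, score_threshold=0.001, keep_top_k=300):
    """Gaussian Soft-NMS: decay the scores of boxes that overlap a selected box.

    boxes:  (N, 4) array of [x1, y1, x2, y2]
    scores: (N,)   array of detection scores
    """
    boxes = boxes.astype(np.float64).copy()
    scores = scores.astype(np.float64).copy()
    keep = []
    while len(keep) < keep_top_k and scores.size and scores.max() > score_threshold:
        i = int(scores.argmax())
        keep.append((boxes[i].copy(), float(scores[i])))
        # IoU of the selected box against every box
        x1 = np.maximum(boxes[i, 0], boxes[:, 0])
        y1 = np.maximum(boxes[i, 1], boxes[:, 1])
        x2 = np.minimum(boxes[i, 2], boxes[:, 2])
        y2 = np.minimum(boxes[i, 3], boxes[:, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
        iou = inter / (areas[i] + areas - inter)
        # Gaussian decay instead of hard suppression
        scores *= np.exp(-(iou ** 2) / sigma)
        scores[i] = 0.0  # never select the same box twice
    return keep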
Lines changed: 190 additions & 0 deletions
@@ -0,0 +1,190 @@
architecture: CascadeRCNNClsAware
train_feed: FasterRCNNTrainFeed
eval_feed: FasterRCNNEvalFeed
test_feed: FasterRCNNTestFeed
max_iters: 800000
snapshot_iter: 10000
use_gpu: true
log_smooth_window: 20
save_dir: output
pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/ResNet200_vd_pretrained.tar
weights: output/cascade_rcnn_cls_aware_r200_vd_fpn_dcnv2_nonlocal_softnms/model_final
# obj365 dataset format and its eval method are same as those for coco
metric: COCO
num_classes: 366
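# 366 = 365 Objects365 categories plus one background class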

CascadeRCNNClsAware:
  backbone: ResNet
  fpn: FPN
  rpn_head: FPNRPNHead
  roi_extractor: FPNRoIAlign
  bbox_head: CascadeBBoxHead
  bbox_assigner: CascadeBBoxAssigner

ResNet:
  norm_type: bn
  depth: 200
  feature_maps: [2, 3, 4, 5]
  freeze_at: 2
  variant: d
  dcn_v2_stages: [3, 4, 5]
  nonlocal_stages: [4]

FPN:
  min_level: 2
  max_level: 6
  num_chan: 256
  spatial_scale: [0.03125, 0.0625, 0.125, 0.25]

FPNRPNHead:
  anchor_generator:
    anchor_sizes: [32, 64, 128, 256, 512]
    aspect_ratios: [0.5, 1.0, 2.0]
    stride: [16.0, 16.0]
    variance: [1.0, 1.0, 1.0, 1.0]
  anchor_start_size: 32
  min_level: 2
  max_level: 6
  num_chan: 256
  rpn_target_assign:
    rpn_batch_size_per_im: 256
    rpn_fg_fraction: 0.5
    rpn_positive_overlap: 0.7
    rpn_negative_overlap: 0.3
    rpn_straddle_thresh: 0.0
  train_proposal:
    min_size: 0.0
    nms_thresh: 0.7
    pre_nms_top_n: 2000
    post_nms_top_n: 2000
  test_proposal:
    min_size: 0.0
    nms_thresh: 0.7
    pre_nms_top_n: 1000
    post_nms_top_n: 1000

FPNRoIAlign:
  canconical_level: 4
  canonical_size: 224
  min_level: 2
  max_level: 5
  box_resolution: 14
  sampling_ratio: 2

CascadeBBoxAssigner:
  batch_size_per_im: 512
  bbox_reg_weights: [10, 20, 30]
  bg_thresh_lo: [0.0, 0.0, 0.0]
  bg_thresh_hi: [0.5, 0.6, 0.7]
  fg_thresh: [0.5, 0.6, 0.7]
  fg_fraction: 0.25
  class_aware: True
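  # Foreground IoU thresholds rise 0.5 -> 0.6 -> 0.7 across the three cascade stages, the standard Cascade R-CNN schedule.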

CascadeBBoxHead:
  head: CascadeTwoFCHead
  nms: MultiClassSoftNMS

CascadeTwoFCHead:
  mlp_dim: 1024

MultiClassSoftNMS:
  score_threshold: 0.001
  keep_top_k: 300
  softnms_sigma: 0.15

LearningRate:
  base_lr: 0.01
  schedulers:
  - !PiecewiseDecay
    gamma: 0.1
    milestones: [520000, 740000]
  - !LinearWarmup
    start_factor: 0.1
    steps: 1000
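  # Resulting schedule, assuming the usual warmup/decay semantics: the lr warms up
  # from 0.001 to 0.01 over the first 1000 iters, then drops by gamma=0.1 to 0.001
  # at iter 520000 and to 0.0001 at iter 740000.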

OptimizerBuilder:
  optimizer:
    momentum: 0.9
    type: Momentum
  regularizer:
    factor: 0.0001
    type: L2

FasterRCNNTrainFeed:
  batch_size: 1
  dataset:
    dataset_dir: dataset/obj365
    annotation: train.json
    image_dir: train
  sample_transforms:
  - !DecodeImage
    to_rgb: True
    with_mixup: False
  - !RandomFlipImage
    prob: 0.5
  - !NormalizeImage
    is_channel_first: false
    is_scale: True
    mean:
    - 0.485
    - 0.456
    - 0.406
    std:
    - 0.229
    - 0.224
    - 0.225
  - !ResizeImage
    interp: 1
    target_size: [416, 448, 480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800, 832, 864, 896, 928, 960, 992, 1024, 1056, 1088, 1120, 1152, 1184, 1216, 1248, 1280, 1312, 1344, 1376, 1408]
    max_size: 1800
    use_cv2: true
  - !Permute
    to_bgr: false
  batch_transforms:
  - !PadBatch
    pad_to_stride: 32
  drop_last: false
  num_workers: 2

FasterRCNNEvalFeed:
  batch_size: 1
  dataset:
    dataset_dir: dataset/obj365
    annotation: val.json
    image_dir: val
  sample_transforms:
  - !DecodeImage
    to_rgb: True
    with_mixup: False
  - !NormalizeImage
    is_channel_first: false
    is_scale: True
    mean:
    - 0.485
    - 0.456
    - 0.406
    std:
    - 0.229
    - 0.224
    - 0.225
  - !ResizeImage
    interp: 1
    target_size:
    - 1200
    max_size: 2000
    use_cv2: true
  - !Permute
    to_bgr: false
  batch_transforms:
  - !PadBatch
    pad_to_stride: 32

FasterRCNNTestFeed:
  batch_size: 1
  dataset:
    annotation: dataset/obj365/val.json
  batch_transforms:
  - !PadBatch
    pad_to_stride: 32
  drop_last: false
  num_workers: 2
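
The training feed above resizes each image so its shorter side matches one of 32 targets between 416 and 1408, with the longer side capped at max_size 1800; evaluation uses a single 1200/2000 scale. As a quick illustration of that multi-scale behaviour, here is a short Python sketch of the usual short-side resize logic. It is an assumption about how the ResizeImage transform behaves (random shorter-side target during training, longer side bounded by max_size), not PaddleDetection's actual op, and the function name resize_short_side is made up for this example. Note that interp: 1 corresponds to cv2.INTER_LINEAR.

import random
import cv2

def resize_short_side(img, target_sizes, max_size):
    """Scale the image so its short side hits a randomly chosen target,
    shrinking further if the long side would exceed max_size."""
    target = random.choice(target_sizes)    # e.g. one of 416, 448, ..., 1408 during training
    h, w = img.shape[:2]
    scale = target / min(h, w)
    if scale * max(h, w) > max_size:        # keep the long side within max_size
        scale = max_size / max(h, w)
    return cv2.resize(img, None, fx=scale, fy=scale, interpolation=cv2.INTER_LINEAR)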
