CenterNet Paper
GitHub Repo

Simple Intro

One Shot

The output directly represents the objects.

Anchor Free

Fast

A very small backbone can be swapped in.
No NMS

The output is a 5-channel feature map with tensor shape $(Batch, Channel=5, Height, Width)$.
Ignoring the batch dimension, the 5 channel values at each position $(H_i, W_i)$ are

$(p, w, h, offset_x, offset_y)$

meaning that, with probability $p$, there is an object of width $w$ and height $h$ at $(W_i + offset_x, H_i + offset_y)$.

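offset: after the backbone downsamples the input (typically by 4x; the example here uses 16x), a $128\times 128$ input becomes an $8\times 8$ feature map, so one feature-map position represents a $16\times 16$ region of the original input (for example, feature-map coordinate $(x, y)$ maps to input coordinate $(x\times 16, y\times 16)$). The offset is needed to recover the precise coordinate in the original input.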
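As a worked example of why the offset is needed, the sketch below splits an object center given in input pixels into a feature-map cell plus the remainder that the cell alone cannot express, then recombines them. The names `encode_center` and `decode_center` are hypothetical, and the offset is assumed to be stored in input pixels so that it matches the Rescale formula later in this post; the original paper stores it in feature-map units instead (the same remainder divided by the stride).

```python
def encode_center(x, y, stride):
    """Split an input-pixel center into a feature-map cell and a pixel offset."""
    col, row = int(x // stride), int(y // stride)
    # The offset is whatever is lost when the cell index is scaled back up.
    off_x, off_y = x - col * stride, y - row * stride
    return (col, row), (off_x, off_y)


def decode_center(cell, offset, stride):
    """Invert encode_center: cell * stride + offset."""
    (col, row), (off_x, off_y) = cell, offset
    return col * stride + off_x, row * stride + off_y


stride = 16  # 128x128 input -> 8x8 feature map, as in the example above
cell, offset = encode_center(75.0, 40.0, stride)
print(cell, offset)                         # (4, 2) (11.0, 8.0)
print(decode_center(cell, offset, stride))  # (75.0, 40.0)
```

Without the offset, the cell `(4, 2)` alone would decode to `(64, 32)`, which is up to one stride away from the true center.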

Decoding the positions where $p$ exceeds a threshold yields the object bboxes.

All head outputs are regression values, so predicting something else only requires changing the number of output channels of the head.

Multiple Classes

For multi-class detection, increase the number of $p$ channels in the head to one per class, i.e. $(p_{class-1}, p_{class-2}, p_{class-3}, \dots, p_{class-n}, w, h, offset_x, offset_y)$
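The channel layout above can be illustrated with a small sketch that splits the channel vector of a single location into its parts. The helper `split_head_channels` and the sample numbers are made up for illustration; the only assumption taken from the text is the channel order (class probabilities first, then $w, h$, then the two offsets).

```python
def split_head_channels(values, n_classes):
    """Split one location's channel vector into class probs, size, and offset."""
    expected = n_classes + 4
    if len(values) != expected:
        raise ValueError(f"expected {expected} channels, got {len(values)}")
    return {
        "class_probs": values[:n_classes],
        "wh": values[n_classes:n_classes + 2],
        "offset": values[n_classes + 2:],
    }


# Hypothetical output at one location for a 3-class detector (3 + 4 channels).
location = [0.1, 0.8, 0.05, 32.0, 20.0, 3.0, 7.0]
parts = split_head_channels(location, n_classes=3)
best_class = max(range(3), key=lambda c: parts["class_probs"][c])
print(best_class, parts["wh"], parts["offset"])  # 1 [32.0, 20.0] [3.0, 7.0]
```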

Decode

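In the output probability channel (HeatMap), the position with the highest probability is the center of the predicted box. The actual output is not strictly 0 or 1, though: it is a blob that spreads out around a point, so the highest point within that region has to be picked.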

Filter low score

First, discard the positions whose probability is below the threshold.

NMS

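Near a peak there can be many points with similarly high scores, which would produce multiple boxes, so some form of "NMS" is needed. Here it is done by applying Max Pool directly to the HeatMap: the positions where $MaxPool(hm) == hm$ are the ones to keep.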
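The two steps so far (score threshold, then max-pool "NMS") can be sketched in plain Python on a tiny heatmap. This is only an illustration with a made-up heatmap and hypothetical helper names; in PyTorch the same comparison is usually written with `torch.nn.functional.max_pool2d` using kernel size 3, stride 1, and padding 1. The 3x3 window is an assumption, since the text does not state a kernel size.

```python
def max_pool_3x3(hm):
    """3x3 max pooling with stride 1; out-of-range neighbours are ignored."""
    h, w = len(hm), len(hm[0])
    pooled = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            pooled[r][c] = max(
                hm[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, h))
                for cc in range(max(c - 1, 0), min(c + 2, w))
            )
    return pooled


def find_peaks(hm, threshold):
    """Return (row, col, score) for cells that pass the threshold and are local maxima."""
    pooled = max_pool_3x3(hm)
    return [
        (r, c, hm[r][c])
        for r in range(len(hm))
        for c in range(len(hm[0]))
        if hm[r][c] >= threshold and pooled[r][c] == hm[r][c]
    ]


heatmap = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.6, 0.0],
    [0.1, 0.6, 0.3, 0.0],
    [0.0, 0.0, 0.0, 0.7],
]
print(find_peaks(heatmap, threshold=0.5))  # [(1, 1, 0.9), (3, 3, 0.7)]
```

The two 0.6 cells pass the threshold but sit next to the 0.9 peak, so the pooled value there is 0.9 and they are dropped.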

Rescale

All positions are in downsampled coordinates and need to be restored to the scale of the input image to obtain the final BBox.

$$
x = Pos_x \times scale + offset_x, \\
y = Pos_y \times scale + offset_y, \\
width = w, \\
height = h
$$
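A minimal sketch of the formula above, applied to one detected peak. It assumes, as the formula does, that the offset and the size are already in input pixels and that $(x, y)$ is the box center; if a model predicts the offset in feature-map units, the scale would be applied to the sum $Pos + offset$ instead. The corner conversion in `to_corners` is an added convenience, not something the text specifies.

```python
def decode_box(pos_x, pos_y, w, h, offset_x, offset_y, scale):
    """Apply x = Pos_x * scale + offset_x (and likewise for y) to one peak."""
    return {
        "cx": pos_x * scale + offset_x,
        "cy": pos_y * scale + offset_y,
        "width": w,
        "height": h,
    }


def to_corners(box):
    """Convert a center/size box to (x_min, y_min, x_max, y_max)."""
    half_w, half_h = box["width"] / 2, box["height"] / 2
    return (
        box["cx"] - half_w,
        box["cy"] - half_h,
        box["cx"] + half_w,
        box["cy"] + half_h,
    )


# Peak at feature-map cell (4, 2) of an 8x8 map, scale 16.
box = decode_box(pos_x=4, pos_y=2, w=32.0, h=20.0,
                 offset_x=11.0, offset_y=8.0, scale=16)
print(box)              # {'cx': 75.0, 'cy': 40.0, 'width': 32.0, 'height': 20.0}
print(to_corners(box))  # (59.0, 30.0, 91.0, 50.0)
```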

Loss

Focal Loss on heatmap channel.
L1 Loss on the other channels.
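As a rough illustration of the two loss terms, the sketch below writes them in plain Python over flat lists. It assumes the heatmap loss is the penalty-reduced focal loss from the CenterNet paper, with $\alpha = 2$ and $\beta = 4$ and normalization by the number of positive locations, and that the L1 term is evaluated only at ground-truth center locations. The function names are hypothetical, and predictions are assumed to lie strictly between 0 and 1 (for example after a clamped sigmoid).

```python
import math


def heatmap_focal_loss(pred, target, alpha=2.0, beta=4.0):
    """Penalty-reduced focal loss over flat lists of heatmap values."""
    loss, n_pos = 0.0, 0
    for p, t in zip(pred, target):
        if t == 1.0:
            # Positive location: penalize low confidence.
            loss -= (1.0 - p) ** alpha * math.log(p)
            n_pos += 1
        else:
            # Negative location: penalty shrinks as the target approaches 1.
            loss -= (1.0 - t) ** beta * p ** alpha * math.log(1.0 - p)
    return loss / max(n_pos, 1)


def l1_loss(pred, target):
    """Mean absolute error, e.g. over w, h or offsets at object centers."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


good = heatmap_focal_loss([0.9, 0.1], [1.0, 0.0])
bad = heatmap_focal_loss([0.1, 0.9], [1.0, 0.0])
print(good < bad)                              # True
print(l1_loss([30.0, 18.0], [32.0, 20.0]))     # 2.0
```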

Sample

import torch
import torch.nn as nn


class CenterNetHead(nn.Module):
    def __init__(self, in_channels, inner_conv_channels, n_classes):
        super().__init__()
        self.head_in_channels = in_channels
        self.head_conv_channels = inner_conv_channels
        self.p_head_out_channels = n_classes
        self.p_head = nn.Sequential(
            nn.Conv2d(
                self.head_in_channels,
                self.head_conv_channels,
                kernel_size=3,
                padding=1,
                bias=True,
            ),
            nn.ReLU(inplace=True),
            nn.Conv2d(
                self.head_conv_channels,
                self.p_head_out_channels,
                kernel_size=1,
                stride=1,
                padding=0,
                bias=True,
            ),
        )
        self.wh_head = nn.Sequential(
            nn.Conv2d(
                self.head_in_channels,
                self.head_conv_channels,
                kernel_size=3,
                padding=1,
                bias=True,
            ),
            nn.ReLU(inplace=True),
            nn.Conv2d(
                self.head_conv_channels,
                out_channels=2,
                kernel_size=1,
                stride=1,
                padding=0,
            ),
        )
        self.offset_head = nn.Sequential(
            nn.Conv2d(
                self.head_in_channels,
                self.head_conv_channels,
                kernel_size=3,
                padding=1,
                bias=True,
            ),
            nn.ReLU(inplace=True),
            nn.Conv2d(
                self.head_conv_channels,
                out_channels=2,
                kernel_size=1,
                stride=1,
                padding=0,
            ),
        )

    def forward(self, x):
        prob = self.p_head(x)

        # need clamp sigmoid
        # https://github.com/xingyizhou/CenterNet/blob/master/src/lib/models/utils.py#L9
        prob = torch.clamp(torch.sigmoid(prob), min=1e-5, max=1 - 1e-5)
        wh = self.wh_head(x)
        offset = self.offset_head(x)
        return {"prob": prob, "wh": wh, "offset": offset}