Metrics in MOT
Definitions are taken from:
https://arxiv.org/abs/1907.12740
@article{ciaparrone2020deep,
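Trajectory vs. Tracklet
Trajectory: the sequence of positions of one target over a period of time. Each trajectory corresponds to exactly one target.
Tracklet: a fragment of a trajectory produced while the trajectory is being built. A complete trajectory is made up of the tracklets that belong to the same object.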
Metrics
Classical metrics
Mostly Tracked (MT) trajectories : number of ground-truth trajectories that are correctly tracked in at least 80% of the frames.
Note that it is irrelevant for this measure whether the ID remains the same throughout the track.
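In other words, a frame counts as correctly tracked whenever the ground-truth object is matched to some hypothesis, so a trajectory whose predicted ID changes along the way is still MT if it is matched in at least 80% of its frames.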
Mostly Lost (ML) trajectories : number of ground-truth trajectories that are correctly tracked in less than 20% of the frames.
Partially Tracked (PT) trajectories : number of ground-truth trajectories that are neither MT nor ML.
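The MT / PT / ML split can be sketched in a few lines of Python. This is only an illustration: the function name `classify_trajectory` and the per-frame boolean input are assumptions made here, not part of any library.

```python
def classify_trajectory(tracked_flags, mt_threshold=0.8, ml_threshold=0.2):
    """Classify one ground-truth trajectory as 'MT', 'PT' or 'ML'.

    tracked_flags holds one boolean per frame in which the ground-truth
    object exists. True means the object was matched to some hypothesis
    in that frame, whatever the hypothesis ID was.
    """
    if not tracked_flags:
        raise ValueError("trajectory must span at least one frame")
    ratio = sum(tracked_flags) / len(tracked_flags)
    if ratio >= mt_threshold:   # tracked in at least 80% of the frames
        return "MT"
    if ratio < ml_threshold:    # tracked in less than 20% of the frames
        return "ML"
    return "PT"                 # everything in between


# Hypothetical example: matched in 9 of 10 frames.
label = classify_trajectory([True] * 9 + [False])
```

MT, ML and PT are then the number of ground-truth trajectories that receive each label.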
False trajectories : number of predicted trajectories that do not correspond to a real object (i.e. to a ground-truth trajectory).
ID switches : number of times an object is correctly tracked, but the ID associated with the object is mistakenly changed.
Fragmentation (FM) : number of times a ground-truth trajectory is interrupted (untracked). A fragmentation is counted each time a trajectory changes its status from tracked to untracked and tracking of that same trajectory is resumed at a later point. FM only measures interruptions of tracking and is independent of ID changes.
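To make the difference between ID switches and fragmentations concrete, here is a minimal sketch that counts both for a single ground-truth object. The input format (one matched hypothesis ID per frame, `None` when untracked) and the function name are assumptions of this example. It also assumes that an ID switch is detected by comparing against the last matched ID, even across a gap; evaluation tools may differ in such details.

```python
def count_id_switches_and_fragmentations(matched_ids):
    """Count ID switches and fragmentations for one ground-truth object.

    matched_ids: per-frame hypothesis ID matched to the object,
    or None for frames in which the object is untracked.
    """
    id_switches = 0
    fragmentations = 0
    last_id = None        # last hypothesis ID matched to this object
    was_tracked = False   # tracked status in the previous frame
    seen_tracked = False  # has the object been tracked at least once?

    for hyp_id in matched_ids:
        tracked = hyp_id is not None
        if tracked:
            if last_id is not None and hyp_id != last_id:
                id_switches += 1      # tracked, but under a different ID
            if seen_tracked and not was_tracked:
                fragmentations += 1   # tracking resumed after an interruption
            last_id = hyp_id
            seen_tracked = True
        was_tracked = tracked
    return id_switches, fragmentations


# Hypothetical sequence: two gaps that are later resumed, one ID change.
idsw, fm = count_id_switches_and_fragmentations(
    [1, 1, None, 1, 2, None, None, 2, None]
)
```

A trailing gap is not counted as a fragmentation here because tracking is never resumed afterwards.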
CLEAR MOT metrics
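FP : the number of false positives in the whole video, i.e. hypotheses that cannot be associated with any ground-truth bounding box (false alarms).
A hypothesis and a ground-truth box are associated when their intersection over union satisfies $IoU \geqslant \alpha$, with $\alpha = 0.5$. Existing matches are preserved across frames: if ground truth $o_i$ and hypothesis $h_j$ are matched in frame $t-1$, i.e. $IoU(o_i,h_j)\geqslant 0.5$, and $IoU(o_i,h_j)\geqslant 0.5$ still holds in frame $t$, then $o_i$ and $h_j$ remain matched even if another hypothesis $h_k$ has $IoU(o_i,h_k)>IoU(o_i,h_j)$.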
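FN : the number of false negatives in the whole video, i.e. ground-truth bounding boxes that cannot be associated with any hypothesis (misses).
Fragm : the total number of fragmentations. A fragmentation is counted each time tracking of a ground-truth object is interrupted and later resumed (same as FM above).
IDSW : the total number of ID switches. An ID switch is counted each time the ID of a tracked ground-truth object is incorrectly changed during the track (same as ID switches above).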
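The IoU criterion behind FP and FN, including the rule that an existing match is kept while it stays above the threshold, can be sketched as follows. The box format `(x1, y1, x2, y2)` and both function names are assumptions of this example. It handles one ground-truth box at a time with a greedy fallback, whereas a full evaluation would solve an assignment problem over all remaining objects and hypotheses.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def match_with_continuity(gt_box, hyp_boxes, previous_match=None, alpha=0.5):
    """Return the hypothesis ID matched to gt_box, or None (a miss).

    hyp_boxes maps hypothesis ID -> box. If the hypothesis matched in the
    previous frame still reaches IoU >= alpha, it is kept even when another
    hypothesis overlaps more.
    """
    if previous_match in hyp_boxes and iou(gt_box, hyp_boxes[previous_match]) >= alpha:
        return previous_match
    best_id, best_iou = None, alpha
    for hyp_id, box in hyp_boxes.items():
        overlap = iou(gt_box, box)
        if overlap >= best_iou:
            best_id, best_iou = hyp_id, overlap
    return best_id


# Hypothetical frame: hypothesis 2 overlaps perfectly, hypothesis 1 has IoU 0.8.
gt = (0, 0, 10, 10)
hyps = {1: (0, 0, 10, 8), 2: (0, 0, 10, 10)}
kept = match_with_continuity(gt, hyps, previous_match=1)  # keeps hypothesis 1
fresh = match_with_continuity(gt, hyps)                   # picks hypothesis 2
```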
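The MOTA (Multiple Object Tracking Accuracy) score combines these counts:
$$MOTA = 1 - \frac{FN + FP + IDSW}{GT}$$
where $GT$ is the number of ground truth boxes.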
MOTA subtracts three error ratios from 1: the miss rate $\frac{FN}{GT}$, the false positive rate $\frac{FP}{GT}$ and the mismatch rate $\frac{IDSW}{GT}$.
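The closer MOTA is to 1, the better; its range is $(-\infty, 1]$, since the error counts can exceed $GT$. MOTA accounts for all object matching errors made during tracking, namely FP, FN and ID switches. It gives a very intuitive measure of how well the tracker detects objects and keeps their trajectories, regardless of how precisely the object positions are estimated.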
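As a quick numerical check, MOTA can be computed directly from the four counts using the standard CLEAR MOT definition $1 - (FN + FP + IDSW)/GT$. The function name and the example counts below are made up for illustration.

```python
def mota(num_fn, num_fp, num_idsw, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with all counts summed over the video."""
    if num_gt <= 0:
        raise ValueError("the number of ground-truth boxes must be positive")
    return 1.0 - (num_fn + num_fp + num_idsw) / num_gt


# Hypothetical counts: 10 misses, 5 false positives, 5 ID switches, 100 GT boxes.
good = mota(num_fn=10, num_fp=5, num_idsw=5, num_gt=100)

# With many errors the score becomes negative.
bad = mota(num_fn=80, num_fp=50, num_idsw=0, num_gt=100)
```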
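MOTP (Multiple Object Tracking Precision) is computed as
$$MOTP = \frac{\sum_{t,i} d_{t,i}}{\sum_{t} c_t}$$
where $c_t$ denotes the number of matches in frame $t$ and $d_{t,i}$ is the bounding box overlap between the hypothesis $i$ and its assigned ground truth object.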
Note that this metric (MOTP) takes very little tracking information into account and mostly reflects the quality of the detections.
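A minimal sketch of the MOTP computation, assuming $d_{t,i}$ is the IoU overlap of each matched pair and that the overlaps are already grouped by frame. The function name and the input layout are choices made for this example.

```python
def motp(overlaps_per_frame):
    """MOTP = sum of overlaps d_{t,i} over all matches / total number of matches.

    overlaps_per_frame: one list per frame t, holding the overlap d_{t,i}
    of every matched (hypothesis, ground truth) pair in that frame.
    """
    total_matches = sum(len(frame) for frame in overlaps_per_frame)  # sum_t c_t
    if total_matches == 0:
        return float("nan")  # undefined when nothing was matched
    total_overlap = sum(sum(frame) for frame in overlaps_per_frame)
    return total_overlap / total_matches


# Hypothetical video: two matches in frame 1, none in frame 2, one in frame 3.
score = motp([[0.8, 0.6], [], [0.7]])
```

Unmatched objects and hypotheses do not enter the sum at all, which is why MOTP says little about tracking quality.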
ID scores
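ID-related metrics. In some settings (e.g. aerial scenes) long-term tracking errors matter most, so the focus is on how well identities are preserved.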
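The computation builds a bipartite graph. The first vertex set $V_{T}$ has one node for each ground-truth trajectory plus one false positive node for each computed trajectory; the second vertex set $V_{C}$ has one node for each computed trajectory plus one false negative node for each ground-truth trajectory. A minimum-cost matching is then computed; the edge costs are explained in more detail in \cite{ristani2016performance}. Frames in which a ground-truth trajectory and the computed trajectory matched to it agree count towards $IDTP$. Ground-truth detections left unexplained by the matching (in particular, all detections of a ground-truth trajectory matched to its false negative node) count towards $IDFN$, and computed detections left unexplained (in particular, all detections of a computed trajectory matched to its false positive node) count towards $IDFP$.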
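The idea of the matching can be illustrated with a brute-force sketch for very small inputs. It relies on the observation that the total mismatch cost equals the number of ground-truth detections plus the number of computed detections minus twice $IDTP$, so minimizing the cost is the same as maximizing $IDTP$. The function name, the input dictionaries and the exhaustive search are assumptions of this example; practical implementations solve the assignment with the Hungarian algorithm instead.

```python
from itertools import permutations


def id_counts(gt_lengths, hyp_lengths, overlap_frames):
    """Return (IDTP, IDFP, IDFN) for a one-to-one trajectory matching.

    gt_lengths:     ground-truth trajectory ID -> number of detections
    hyp_lengths:    computed trajectory ID -> number of detections
    overlap_frames: (gt_id, hyp_id) -> number of frames in which the two
                    trajectories overlap above the IoU threshold
    """
    gt_ids = list(gt_lengths)
    hyp_ids = list(hyp_lengths)
    # None plays the role of a false negative node: the ground-truth
    # trajectory is matched to no computed trajectory at all.
    slots = hyp_ids + [None] * len(gt_ids)

    best_idtp = 0
    for assignment in permutations(slots, len(gt_ids)):
        idtp = sum(
            overlap_frames.get((g, h), 0)
            for g, h in zip(gt_ids, assignment)
            if h is not None
        )
        best_idtp = max(best_idtp, idtp)

    idfn = sum(gt_lengths.values()) - best_idtp   # unexplained GT detections
    idfp = sum(hyp_lengths.values()) - best_idtp  # unexplained computed detections
    return best_idtp, idfp, idfn


# Hypothetical example with two ground-truth and two computed trajectories.
counts = id_counts(
    gt_lengths={"a": 10, "b": 5},
    hyp_lengths={1: 8, 2: 6},
    overlap_frames={("a", 1): 6, ("a", 2): 2, ("b", 1): 1, ("b", 2): 4},
)
```

In this example the best matching pairs "a" with 1 and "b" with 2. Computed trajectories that end up unmatched contribute all of their detections to $IDFP$.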
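Identification precision (IDP): $IDP = \frac{IDTP}{IDTP + IDFP}$
Identification recall (IDR): $IDR = \frac{IDTP}{IDTP + IDFN}$
Identification F1 (IDF1): $IDF1 = \frac{2}{\frac{1}{IDP} + \frac{1}{IDR}} = \frac{2\,IDTP}{2\,IDTP + IDFP + IDFN}$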
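Given the three counts, the ID scores follow the usual precision / recall / F1 pattern: $IDP = IDTP/(IDTP+IDFP)$, $IDR = IDTP/(IDTP+IDFN)$ and $IDF1 = 2\,IDTP/(2\,IDTP+IDFP+IDFN)$. The sketch below uses a made-up function name and returns 0 when a denominator is zero, which is a convention chosen here rather than something the source specifies.

```python
def id_scores(idtp, idfp, idfn):
    """Return (IDP, IDR, IDF1) from the identity-level counts."""
    idp = idtp / (idtp + idfp) if (idtp + idfp) > 0 else 0.0
    idr = idtp / (idtp + idfn) if (idtp + idfn) > 0 else 0.0
    denom = 2 * idtp + idfp + idfn
    idf1 = 2 * idtp / denom if denom > 0 else 0.0
    return idp, idr, idf1


# Hypothetical counts.
idp, idr, idf1 = id_scores(idtp=80, idfp=20, idfn=20)
```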
Reference
https://www.yuque.com/aicv/lab/tc9yqd
Computation:
https://github.com/cheind/py-motmetrics
@inproceedings{ristani2016performance,