Efficient Video Object Detection and the Neglected Delay Issue

Speaker: Huizi Mao

Date & Time: Wednesday, October 30, 2019, 17:00

Where: Main Conference Room, 3rd floor, Samsung Electronics SNU Research Center (삼성전자 서울대연구소)


We first present CaTDet (Cascaded Tracked Detector), a system for efficient video object detection. CaTDet breaks the standard detect-to-track pipeline by feeding the tracker's output back to the detector, which reduces the detector's workload. In most videos, the same object appears in adjacent frames (temporal locality) and at nearby locations (spatial locality), and CaTDet is designed to exploit both types of locality. We next examine how video object detection is evaluated and point out that the standard average precision (AP) metric does not tell the whole story. We introduce a novel average delay (AD) metric that measures how quickly a video object detector responds to an object. We find that multiple published methods increase the delay even though they preserve or improve average precision, which shows that delay has been a neglected issue. CaTDet appeared at SysML 2019, and the AD metric will be presented at ICCV 2019.
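
To make the idea of detection delay concrete, the following is a minimal sketch, not the definition used in the ICCV 2019 paper. It assumes the delay of an object is the number of its annotated frames that pass before some detection overlaps it with IoU of at least 0.5, and that a never-detected object is charged its full track length. The names `iou`, `detection_delay`, and `average_delay`, the data layout, and the threshold are all illustrative assumptions.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_delay(
    track: Dict[int, Box],
    detections: Dict[int, List[Box]],
    iou_thresh: float = 0.5,
) -> int:
    """Count the annotated frames of one object that pass before it is first detected.

    `track` maps frame index -> ground-truth box for a single object.
    `detections` maps frame index -> detector output boxes for that frame.
    """
    frames = sorted(track)
    for offset, frame in enumerate(frames):
        if any(iou(track[frame], d) >= iou_thresh for d in detections.get(frame, [])):
            return offset
    # Assumption: an object that is never detected is charged its full track length.
    return len(frames)


def average_delay(
    tracks: List[Dict[int, Box]],
    detections: Dict[int, List[Box]],
    iou_thresh: float = 0.5,
) -> float:
    """Mean detection delay over all ground-truth objects (0.0 if there are none)."""
    delays = [detection_delay(t, detections, iou_thresh) for t in tracks]
    return sum(delays) / len(delays) if delays else 0.0
```

Under these assumptions, a detector that misses an object in its first few frames but finds it reliably afterwards can still score well on AP while showing a large average delay, which is the gap the abstract describes.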


Huizi Mao is a fourth-year PhD student at Stanford University, advised by William J. Dally. He is broadly interested in efficient and low-latency video understanding, with a focus on video object detection. He has also worked on general neural network compression and acceleration. Before coming to Stanford, Huizi earned a B.E. (with honors) in Electronic Engineering and a B.S. (minor) in Mathematics, both from Tsinghua University.


Host: Prof. Jung Ho Ahn (안정호, gajh@snu.ac.kr), Intelligent Convergence Systems Program, Department of Transdisciplinary Studies