TY - JOUR
T1 - DART : distribution aware retinal transform for event-based cameras
AU - Ramesh, Bharath
AU - Yang, Hong
AU - Orchard, Garrick
AU - Le Thi, Ngoc Anh
AU - Zhang, Shihao
AU - Xiang, Cheng
PY - 2020
Y1 - 2020
N2 - We introduce a generic visual descriptor, termed as distribution aware retinal transform (DART), that encodes the structural context using log-polar grids for event cameras. The DART descriptor is applied to four different problems, namely object classification, tracking, detection and feature matching: (1) The DART features are directly employed as local descriptors in a bag-of-words classification framework and testing is carried out on four standard event-based object datasets (N-MNIST, MNIST-DVS, CIFAR10-DVS, NCaltech-101); (2) Extending the classification system, tracking is demonstrated using two key novelties: (i) Statistical bootstrapping is leveraged with online learning for overcoming the low-sample problem during the one-shot learning of the tracker, (ii) Cyclical shifts are induced in the log-polar domain of the DART descriptor to achieve robustness to object scale and rotation variations; (3) To solve the long-term object tracking problem, an object detector is designed using the principle of cluster majority voting. The detection scheme is then combined with the tracker to result in a high intersection-over-union score with augmented ground truth annotations on the publicly available event camera dataset; (4) Finally, the event context encoded by DART greatly simplifies the feature correspondence problem, especially for spatio-temporal slices far apart in time, which has not been explicitly tackled in the event-based vision domain.
AB - We introduce a generic visual descriptor, termed as distribution aware retinal transform (DART), that encodes the structural context using log-polar grids for event cameras. The DART descriptor is applied to four different problems, namely object classification, tracking, detection and feature matching: (1) The DART features are directly employed as local descriptors in a bag-of-words classification framework and testing is carried out on four standard event-based object datasets (N-MNIST, MNIST-DVS, CIFAR10-DVS, NCaltech-101); (2) Extending the classification system, tracking is demonstrated using two key novelties: (i) Statistical bootstrapping is leveraged with online learning for overcoming the low-sample problem during the one-shot learning of the tracker, (ii) Cyclical shifts are induced in the log-polar domain of the DART descriptor to achieve robustness to object scale and rotation variations; (3) To solve the long-term object tracking problem, an object detector is designed using the principle of cluster majority voting. The detection scheme is then combined with the tracker to result in a high intersection-over-union score with augmented ground truth annotations on the publicly available event camera dataset; (4) Finally, the event context encoded by DART greatly simplifies the feature correspondence problem, especially for spatio-temporal slices far apart in time, which has not been explicitly tackled in the event-based vision domain.
KW - cameras
KW - classification
KW - computer vision
UR - http://hdl.handle.net/1959.7/uws:57066
U2 - 10.1109/TPAMI.2019.2919301
DO - 10.1109/TPAMI.2019.2919301
M3 - Article
SN - 0162-8828
VL - 42
SP - 2767
EP - 2780
JO - IEEE Transactions of Pattern Analysis and Machine Intelligence
JF - IEEE Transactions of Pattern Analysis and Machine Intelligence
IS - 11
ER -