DART: distribution aware retinal transform for event-based cameras

Bharath Ramesh, Hong Yang, Garrick Orchard, Ngoc Anh Le Thi, Shihao Zhang, Cheng Xiang

Research output: Contribution to journal › Article › peer-review

64 Citations (Scopus)

Abstract

We introduce a generic visual descriptor, termed the distribution aware retinal transform (DART), that encodes the structural context using log-polar grids for event cameras. The DART descriptor is applied to four different problems, namely object classification, tracking, detection, and feature matching: (1) The DART features are directly employed as local descriptors in a bag-of-words classification framework, and testing is carried out on four standard event-based object datasets (N-MNIST, MNIST-DVS, CIFAR10-DVS, NCaltech-101); (2) Extending the classification system, tracking is demonstrated using two key novelties: (i) statistical bootstrapping is leveraged with online learning to overcome the low-sample problem during the one-shot learning of the tracker, (ii) cyclical shifts are induced in the log-polar domain of the DART descriptor to achieve robustness to object scale and rotation variations; (3) To solve the long-term object tracking problem, an object detector is designed using the principle of cluster majority voting. The detection scheme is then combined with the tracker, yielding a high intersection-over-union score with augmented ground-truth annotations on the publicly available event camera dataset; (4) Finally, the event context encoded by DART greatly simplifies the feature correspondence problem, especially for spatio-temporal slices far apart in time, a setting that has not been explicitly tackled in the event-based vision domain.
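To make the descriptor concrete, the minimal sketch below (Python with NumPy; not the authors' implementation) bins events around a reference location into a log-polar grid and checks that an in-plane rotation of the event cloud appears as a cyclical shift along the angular axis of the grid, the property the tracker exploits for rotation robustness. The function name log_polar_descriptor, the grid size (8 rings by 12 wedges), the radii, and the synthetic events are all illustrative assumptions.

import numpy as np

def log_polar_descriptor(events, center, n_rings=8, n_wedges=12,
                         r_min=1.0, r_max=32.0):
    """Histogram events around `center` on a log-polar grid.

    events : (N, 2) array of (x, y) event coordinates
    center : (x, y) reference location
    Returns an (n_rings, n_wedges) count grid, L1-normalized.
    """
    d = events - np.asarray(center, dtype=float)
    r = np.hypot(d[:, 0], d[:, 1])
    theta = np.arctan2(d[:, 1], d[:, 0])            # in [-pi, pi]

    # Keep events inside the annulus [r_min, r_max).
    mask = (r >= r_min) & (r < r_max)
    r, theta = r[mask], theta[mask]

    # Logarithmic radial bins, uniform angular bins.
    ring = np.floor(np.log(r / r_min) / np.log(r_max / r_min) * n_rings)
    wedge = np.floor((theta + np.pi) / (2 * np.pi) * n_wedges)
    ring = np.clip(ring.astype(int), 0, n_rings - 1)
    wedge = np.clip(wedge.astype(int), 0, n_wedges - 1)

    grid = np.zeros((n_rings, n_wedges))
    np.add.at(grid, (ring, wedge), 1.0)
    return grid / max(grid.sum(), 1.0)

# Rotating the event cloud about the center shows up as a cyclical
# shift of the descriptor along the wedge (angular) axis.
rng = np.random.default_rng(0)
pts = rng.uniform(-30, 30, size=(500, 2))
angle = 2 * np.pi / 12                              # one wedge width
rot = np.array([[np.cos(angle), -np.sin(angle)],
                [np.sin(angle),  np.cos(angle)]])
d0 = log_polar_descriptor(pts, (0, 0))
d1 = log_polar_descriptor(pts @ rot.T, (0, 0))
# Expected: True (up to points landing exactly on bin edges).
print(np.allclose(np.roll(d0, 1, axis=1), d1))

Because radius is binned logarithmically, a uniform scaling of the scene likewise becomes a shift along the ring axis, so scale and rotation robustness both fall out of the same cyclical-shift mechanism described in the abstract.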
Original language: English
Pages (from-to): 2767-2780
Number of pages: 14
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume: 42
Issue number: 11
DOIs
Publication status: Published - 2020

Keywords

  • cameras
  • classification
  • computer vision
