Spatial and temporal downsampling in event-based visual classification

Gregory Cohen, Saeed Afshar, Garrick Orchard, Jonathan Tapson, Ryad Benosman, André van Schaik

Research output: Contribution to journal › Article › peer-review

24 Citations (Scopus)

Abstract

As interest in event-based vision sensors for mobile and aerial applications grows, there is an increasing need for high-speed and highly robust algorithms that perform visual tasks on event-based data. Because event rate and network structure directly affect the power consumed by such systems, it is important to explore the efficiency of the event-based encoding used by these sensors. The work presented in this paper is the first study focused solely on the effects of both spatial and temporal downsampling on event-based vision data, and it uses a variety of data sets chosen to fully explore and characterize the nature of downsampling operations. The results show that both spatial and temporal downsampling improve classification accuracy while also lowering the overall data rate. This finding is particularly relevant for bandwidth- and power-constrained systems. For a network containing 1000 hidden-layer neurons, the spatially downsampled systems achieved a best-case accuracy of 89.38% on N-MNIST, as opposed to 81.03% with no downsampling at the same hidden-layer size. On the N-Caltech101 data set, the downsampled system achieved a best-case accuracy of 18.25%, compared with 7.43% with no downsampling. The results show that downsampling is an important preprocessing technique in event-based visual processing, especially for applications sensitive to power consumption and transmission bandwidth.
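
As a rough illustration of the two operations named in the abstract, the sketch below applies a simple form of spatial and temporal downsampling to a list of events. It is not the authors' method, which the abstract does not specify. The event layout `(x, y, timestamp_us, polarity)`, the function names, the integer-division pooling of pixel coordinates, and the rule of keeping one event per pixel, polarity, and time bin are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical event layout (an assumption, not taken from the paper):
# (x, y, timestamp in microseconds, polarity)
Event = Tuple[int, int, int, int]


def spatial_downsample(events: List[Event], factor: int) -> List[Event]:
    """Map each event onto a coarser pixel grid by integer division."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    return [(x // factor, y // factor, t, p) for x, y, t, p in events]


def temporal_downsample(events: List[Event], bin_us: int) -> List[Event]:
    """Quantize timestamps to bins of bin_us microseconds and keep only the
    first event per (pixel, bin, polarity). Assumed rule for illustration."""
    if bin_us < 1:
        raise ValueError("bin_us must be >= 1")
    seen = set()
    kept = []
    for x, y, t, p in events:
        key = (x, y, (t // bin_us) * bin_us, p)
        if key not in seen:
            seen.add(key)
            kept.append(key)
    return kept


# Four events on a 4x4 sensor, reduced to a 2x2 grid with 1 ms time bins.
events = [(0, 0, 10, 1), (1, 1, 20, 1), (2, 3, 1500, 0), (3, 2, 1600, 0)]
coarse = temporal_downsample(spatial_downsample(events, 2), 1000)
```

In this toy case the four input events collapse to two, which mirrors the lower data rate the abstract describes. Spatial pooling on its own leaves the event count unchanged here; the reduction comes from merging events that land in the same coarse pixel and time bin.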
Original language: English
Pages (from-to): 5030-5044
Number of pages: 15
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 29
Issue number: 10
Publication status: Published - 2018

Keywords

  • classification
  • detectors
