
A spatial-spectral multi-view contrastive learning framework for scene representation learning

  • Jubo Chen
  • Xiaosheng Yu
  • Zhengxuan Jiang
  • Zi Teng
  • Hao Wu
  • Chengdong Wu
  • Macquarie University

Research output: Contribution to journal › Article › peer-review

Abstract

Although numerous scene representation learning methods have improved the quality of novel view synthesis and demonstrated better generalization in new scenarios, several pivotal challenges remain. Firstly, representation models based on deep neural networks entail significant computational demands, especially when acquiring prior knowledge from large datasets. Secondly, even when traditional spatial-domain models converge to ideal solutions, their learning paradigms inherently exhibit inconsistencies in detail and edge information in the spectral domain. To address these issues, our Spatial-spectral Multi-view Contrastive Learning (SMCL) framework integrates modules specifically tailored for processing different scene data. It employs operations such as a ray-based multi-head self-attention mechanism, cross-domain enhancement, and dual-domain convolution to update representation features. In addition, to constrain the training process of SMCL, we introduce two specific loss functions. The cross-domain contrastive loss is designed to ensure that the fast Fourier transform can efficiently extract edge features in the spectral domain at low computational cost, while the multi-view frequency loss preserves the consistency of feature representation learning across multiple views and coordinated frequency responses. Finally, high-quality novel views are synthesized from the constructed scene representations through improved neural rendering. Experimental results demonstrate that SMCL surpasses current state-of-the-art models on four popular datasets. The code and data can be found at: https://github.com/jubo-neu/SMCL.
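The abstract's multi-view frequency loss can be illustrated with a minimal NumPy sketch. This is a generic frequency-domain formulation, not the paper's exact loss: the function names `frequency_loss` and `multi_view_frequency_loss` and the choice of an L1 distance between FFT magnitudes are assumptions for illustration only; the idea shown is simply that comparing FFT magnitudes exposes edge and detail content that spatial losses can under-weight, and averaging over views encourages consistent frequency responses.

```python
import numpy as np

def frequency_loss(pred, target):
    # Hypothetical spectral loss (not the paper's exact definition):
    # L1 distance between the 2D FFT magnitudes of a predicted view
    # and its ground-truth counterpart. The FFT makes high-frequency
    # (edge/detail) discrepancies explicit.
    pred_f = np.fft.fft2(pred)
    target_f = np.fft.fft2(target)
    return float(np.mean(np.abs(np.abs(pred_f) - np.abs(target_f))))

def multi_view_frequency_loss(pred_views, target_views):
    # Average the per-view spectral loss so every rendered viewpoint
    # is pushed toward a consistent frequency response.
    return float(np.mean([frequency_loss(p, t)
                          for p, t in zip(pred_views, target_views)]))
```

With `np.fft.fft2` the forward transform costs O(HW log HW) per view, which is the "low computational cost" property the abstract attributes to FFT-based spectral feature extraction.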
Original language: English
Pages (from-to): 103889
Number of pages: 21
Journal: Information Fusion
Volume: 127
DOIs
Publication status: Published - 2026
Externally published: Yes

Keywords

  • Representation learning
  • Novel view synthesis
  • Spectral information
  • Fast Fourier transform
  • Cross-domain fusion
