Abstract
Although numerous scene representation learning methods have improved the quality of novel view synthesis and generalize better to new scenarios, several pivotal challenges persist. First, representation models based on deep neural networks entail significant computational demands, especially when acquiring prior knowledge from large datasets. Second, even when traditional spatial-domain models converge to ideal solutions, their learning paradigms inherently exhibit inconsistencies in detail and edge information in the spectral domain. To address these issues, our Spatial-spectral Multi-view Contrastive Learning (SMCL) framework integrates modules specifically tailored for processing different scene data. It employs operations such as a ray-based multi-head self-attention mechanism, cross-domain enhancement, and dual-domain convolution to update representation features. To constrain the training of SMCL, we also introduce two loss functions. The cross-domain contrastive loss is designed to ensure that the fast Fourier transform can efficiently extract edge features in the spectral domain at low computational cost, while the multi-view frequency loss preserves the consistency of feature representation learning across multiple views and coordinated frequency responses. Finally, high-quality novel views are synthesized from the constructed scene representations through improved neural rendering. Experimental results demonstrate that SMCL surpasses current state-of-the-art models on four popular datasets. The code and data can be found at: https://github.com/jubo-neu/SMCL.
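The abstract does not say how SMCL applies the fast Fourier transform, so the following is only a generic illustration, not the authors' implementation. It is a minimal sketch, assuming NumPy, of why the high-frequency part of a spectrum carries edge information: a high-pass mask in the Fourier domain suppresses flat regions and leaves a response concentrated at intensity jumps. The function name `high_frequency_component` and the `cutoff` parameter are hypothetical names chosen for this example.

```python
import numpy as np


def high_frequency_component(image, cutoff=0.25):
    """Keep only spatial frequencies with normalized radius >= cutoff.

    `cutoff` is a hypothetical parameter in cycles per sample
    (0.5 is the Nyquist limit along each axis).
    """
    spectrum = np.fft.fft2(image)
    # Per-axis frequency grids, in cycles per sample.
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    # Boolean high-pass mask: zero out low frequencies, including the mean.
    mask = radius >= cutoff
    return np.fft.ifft2(spectrum * mask).real


# Toy input: a vertical step edge between columns 15 and 16.
image = np.zeros((32, 32))
image[:, 16:] = 1.0

high_pass = high_frequency_component(image)
# Sum the absolute response over rows to get one value per column.
column_energy = np.abs(high_pass).sum(axis=0)
```

In this toy case the per-column response is much larger next to the step than in the middle of either flat half. Because the FFT treats the image as periodic, the wrap-around boundary between the last and first columns also registers as an edge.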
| Original language | English |
|---|---|
| Article number | 103889 |
| Number of pages | 21 |
| Journal | Information Fusion |
| Volume | 127 |
| Publication status | Published - 2026 |
| Externally published | Yes |
Keywords
- Representation learning
- Novel view synthesis
- Spectral information
- Fast Fourier transform
- Cross-domain fusion
Fingerprint
Dive into the research topics of 'A spatial-spectral multi-view contrastive learning framework for scene representation learning'. Together they form a unique fingerprint.