Abstract
Ad hoc table retrieval refers to the task of performing semantic matching between given queries and candidate tables. In recent years, the approach to this retrieval task has shifted significantly, from hand-crafted features to Pre-Trained Language Models (PLMs). However, key challenges arise when candidate tables contain shared items, or when queries refer to only a subset of table items rather than the entire table. Existing models often struggle to distinguish the most informative items and fail to accurately identify the relevant items required to match the query. To bridge this gap, we propose the Conditional Optimal Transport based table retrievER (COTER). The proposed algorithm simplifies candidate tables: the semantic meaning of one or several words (from the original table) can be effectively "transported" to individual words (from the simplified table), under the prior condition of the query. COTER achieves two essential goals simultaneously: minimizing the semantic loss during table simplification and ensuring that retained items from simplified tables effectively match the given query. Importantly, the theoretical foundation of COTER allows it to adapt dynamically to different queries and enhances the overall performance of table retrieval. Experiments on two popular Web-Table retrieval benchmarks show that COTER can effectively identify informative table items without sacrificing retrieval accuracy. This leads to a new state of the art, with substantial gains of up to 0.48 absolute Mean Average Precision (MAP) points compared to the previously reported best result.
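
The transport idea in the abstract can be illustrated with a generic entropy-regularised optimal transport problem solved by Sinkhorn iterations. The sketch below is not COTER's formulation, which the abstract does not specify. It assumes toy 2-d embeddings, a cosine cost, and a query-dependent target marginal (a softmax over query similarity) as a stand-in for the query condition. All names (`sinkhorn`, `cosine_cost`, `original_items`, `retained_items`, `query`) are hypothetical.

```python
import math

def cosine_cost(x, y):
    """Cost of moving mass between two embeddings: 1 - cosine similarity."""
    dot = sum(p * q for p, q in zip(x, y))
    norm = math.sqrt(sum(p * p for p in x)) * math.sqrt(sum(q * q for q in y))
    return 1.0 - dot / norm

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sinkhorn(cost, a, b, reg=0.5, n_iters=200):
    """Entropy-regularised transport plan with row marginal a and column marginal b."""
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy data (assumed): four items in the original table, two retained items.
original_items = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
retained_items = [[1.0, 0.0], [0.0, 1.0]]
query = [0.2, 0.8]

cost = [[cosine_cost(x, y) for y in retained_items] for x in original_items]
a = [1.0 / len(original_items)] * len(original_items)  # uniform source mass
# Assumption: retained items closer to the query receive more target mass.
b = softmax([sum(p * q for p, q in zip(y, query)) for y in retained_items])

plan = sinkhorn(cost, a, b)
for row in plan:
    print([round(p, 4) for p in row])
```

Each row of `plan` shows how the mass of one original item is split across the retained items. Changing `query` changes `b`, and with it the plan, which is the sense in which this toy transport depends on the query.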
Original language | English |
---|---|
Title of host publication | WSDM ’24: Proceedings of the 17th ACM International Conference on Web Search and Data Mining |
Place of Publication | U.S. |
Publisher | Association for Computing Machinery |
Pages | 911-919 |
Number of pages | 9 |
ISBN (Electronic) | 9798400703713 |
Publication status | Published - Mar 2024 |
Event | International Conference on Web Search & Data Mining, Merida, Mexico; Duration: 4 Mar 2024 → 8 Mar 2024; Conference number: 17th |
Conference
Conference | International Conference on Web Search & Data Mining |
---|---|
Country/Territory | Mexico |
City | Merida |
Period | 4/03/24 → 8/03/24 |
Keywords
- conditional optimal transport
- semantic matching
- table representation
- table retrieval
- web search