TY - JOUR
T1 - Cross-domain lithology identification using active learning and source reweighting
AU - Chang, Ji
AU - Kang, Yu
AU - Li, Zerui
AU - Zheng, Wei Xing
AU - Lv, Wenjun
AU - Feng, De-Yong
N1 - Publisher Copyright: © 2004-2012 IEEE.
PY - 2022
Y1 - 2022
AB - Cross-domain lithology identification (CDLI) is a common case in lithology identification, which aims to train a machine learning model using the logging data of an interpreted well to predict the lithology of another uninterpreted well. Compared with the general lithology identification problem, the CDLI problem is more challenging for two reasons: the data distribution shift between the wells, and the expensive label acquisition on the uninterpreted well. To tackle these issues, we propose a novel framework that embeds active learning (AL) and domain adaptation into lithology identification. The proposed framework is composed of two components: an AL algorithm that selects the most uncertain and diverse target samples to query their real labels, and a source reweighting method that leverages the target labels to reduce data distribution discrepancy. Experimental results on two real-world data sets demonstrate that the proposed method can more effectively suppress the performance degradation caused by the data distribution shift than the baselines, with fewer target label queries.
UR - http://hdl.handle.net/1959.7/uws:61006
U2 - 10.1109/LGRS.2020.3041960
DO - 10.1109/LGRS.2020.3041960
M3 - Article
SN - 1545-598X
VL - 19
JO - IEEE Geoscience and Remote Sensing Letters
JF - IEEE Geoscience and Remote Sensing Letters
ER -