TY - GEN
T1 - Learning out-of-sample mapping in non-vectorial data reduction using constrained twin kernel embedding
AU - Guo, Yi
AU - Gao, Junbin
AU - Kwan, Paul W.
PY - 2007
Y1 - 2007
N2 - Twin Kernel Embedding (TKE) is a powerful non-vectorial data reduction algorithm proposed recently for advanced applications in clustering and visualization, manifold learning, etc. Due to the requirement of online processing in many cutting-edge research problems involving highly structured data like DNA, protein sequences and biometric features that are non-vectorial in nature, learning the out-of-sample (OOS) mapping becomes a necessity. To address this, we propose Constrained TKE, which is an OOS extension of TKE capable of learning such a mapping function. This is achieved by including the mapping in the objective function optimized by the TKE algorithm. More broadly, this mapping function can be applied in other data reduction methods as an OOS extension. Furthermore, to improve the accuracy of predictions in cases where new samples are presented in batches, a refinement strategy is introduced by exploiting the similarity between new samples, which is often ignored by other methods. Experimental results on the Reuters-21578 text collection confirmed the usefulness of the proposed method.
AB - Twin Kernel Embedding (TKE) is a powerful non-vectorial data reduction algorithm proposed recently for advanced applications in clustering and visualization, manifold learning, etc. Due to the requirement of online processing in many cutting-edge research problems involving highly structured data like DNA, protein sequences and biometric features that are non-vectorial in nature, learning the out-of-sample (OOS) mapping becomes a necessity. To address this, we propose Constrained TKE, which is an OOS extension of TKE capable of learning such a mapping function. This is achieved by including the mapping in the objective function optimized by the TKE algorithm. More broadly, this mapping function can be applied in other data reduction methods as an OOS extension. Furthermore, to improve the accuracy of predictions in cases where new samples are presented in batches, a refinement strategy is introduced by exploiting the similarity between new samples, which is often ignored by other methods. Experimental results on the Reuters-21578 text collection confirmed the usefulness of the proposed method.
KW - Dimensionality reduction
KW - Out-of-sample
KW - TKE
UR - http://www.scopus.com/inward/record.url?scp=37849027343&partnerID=8YFLogxK
U2 - 10.1109/ICMLC.2007.4370108
DO - 10.1109/ICMLC.2007.4370108
M3 - Conference Paper
AN - SCOPUS:37849027343
SN - 142440973X
SN - 9781424409730
T3 - Proceedings of the Sixth International Conference on Machine Learning and Cybernetics, ICMLC 2007
SP - 19
EP - 24
BT - Proceedings of the Sixth International Conference on Machine Learning and Cybernetics, ICMLC 2007
T2 - 6th International Conference on Machine Learning and Cybernetics, ICMLC 2007
Y2 - 19 August 2007 through 22 August 2007
ER -