Leveraging vision-language embeddings for zero-shot learning in histopathology images

Md Mamunur Rahaman, Ewan K.A. Millar, Erik Meijering

Research output: Contribution to journal › Article › peer-review

Abstract

Zero-shot learning (ZSL) offers tremendous potential for histopathology image analysis, enabling models to generalize to unseen classes without extensive labeled data. Recent vision-language model (VLM) advancements have expanded ZSL capabilities, allowing task performance without task-specific fine-tuning. However, applying VLMs to histopathology presents considerable challenges due to the complexity of histopathological imagery and the nuanced nature of diagnostic tasks. We propose Multi-Resolution Prompt-guided Hybrid Embedding (MR-PHE), a novel framework for zero-shot histopathology image classification. MR-PHE mimics pathologists' workflow through multiresolution patch extraction to capture key cellular and tissue features. It introduces a hybrid embedding strategy that integrates global image embeddings with weighted patch embeddings, effectively combining local and global contextual information. Additionally, we develop a comprehensive prompt generation and selection framework, enriching class descriptions with domain-specific synonyms and clinically relevant features to enhance semantic understanding. A similarity-based patch weighting mechanism assigns attention-like weights to patches based on their relevance to class embeddings, emphasizing diagnostically important regions during classification. Experimental results demonstrate MR-PHE significantly improves zero-shot classification performance on histopathology datasets, often surpassing fully supervised models, showing its effectiveness and potential to advance computational pathology.
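The abstract describes the hybrid embedding and the similarity-based patch weighting only in words, so the following is a minimal sketch of one way these two ideas could fit together, not the authors' implementation. It assumes the global image, patch, and class (text) embeddings have already been produced by some vision-language model and are passed in as NumPy arrays. The function names, the mixing factor `alpha`, the softmax `temperature`, and the choice of "maximum cosine similarity to any class" as the patch relevance score are all illustrative assumptions that do not come from the paper.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along the given axis."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    z = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def hybrid_embedding(global_emb, patch_embs, class_embs,
                     alpha=0.5, temperature=0.1):
    """Combine a global embedding with relevance-weighted patch embeddings.

    global_emb: shape (D,)   embedding of the whole image
    patch_embs: shape (N, D) embeddings of N patches (any resolutions)
    class_embs: shape (C, D) one text embedding per class
    alpha, temperature: hypothetical hyperparameters, not from the paper.
    """
    g = l2_normalize(global_emb)
    p = l2_normalize(patch_embs)
    c = l2_normalize(class_embs)

    # Assumed relevance score: each patch's highest cosine similarity
    # to any class embedding.
    relevance = (p @ c.T).max(axis=1)

    # Attention-like weights over patches; they are positive and sum to 1.
    weights = softmax(relevance / temperature)

    # Weighted average of patch embeddings, then mix with the global view.
    weighted_patch = l2_normalize(weights @ p)
    hybrid = l2_normalize(alpha * g + (1.0 - alpha) * weighted_patch)
    return hybrid, weights


def classify(hybrid, class_embs):
    """Zero-shot prediction: pick the class with the highest cosine similarity."""
    scores = l2_normalize(class_embs) @ hybrid
    return int(np.argmax(scores)), scores


# Toy example with 3-dimensional embeddings and two classes.
class_embs = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])
global_emb = np.array([1.0, 1.0, 0.0])      # ambiguous at the global level
patch_embs = np.array([[0.0, 1.0, 0.0],     # patch aligned with class 1
                       [0.0, 0.0, 1.0]])    # patch unrelated to either class

hybrid, weights = hybrid_embedding(global_emb, patch_embs, class_embs)
predicted, scores = classify(hybrid, class_embs)
```

In this toy case the global embedding is equally similar to both classes, and the patch that aligns with the second class receives almost all of the weight, so the hybrid embedding tips the prediction toward that class. A lower `temperature` concentrates weight on the most relevant patches, while a higher one approaches a plain average.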

Original language: English
Number of pages: 12
Journal: IEEE Journal of Biomedical and Health Informatics
Publication status: E-pub ahead of print (In Press) - 2025

Keywords

  • Computational Pathology
  • Histopathology
  • Hybrid Embedding
  • Prompt Generation
  • Vision-Language Models (VLMs)
  • Zero-Shot Learning
