In the field of machine learning, semi-supervised learning (SSL) plays an important role, especially when labelled observations are scarce. In scenarios where labelled data is difficult or costly to obtain, SSL provides a way to leverage both labelled and unlabelled data. Compared with traditional supervised learning, this paradigm integrates unlabelled data with the scarce labelled data, allowing models to generalize and make predictions more accurately. Despite the scarcity of labelled data, SSL's effectiveness has been demonstrated in a number of domains. In computer vision, for example, SSL enhances a model's ability to identify and categorize objects from a limited labelled dataset by making use of a large number of unlabelled images. In natural language processing, SSL techniques improve text analysis and classification even when few labelled examples are available. SSL has also shown promise in medical image analysis, disease detection, and the prediction of patient outcomes in healthcare, where acquiring labelled medical data can be constrained by privacy concerns and the need for expert annotations. When labelled data is scarce, techniques such as few-shot learning, active learning, and transfer learning can be used to build accurate classifiers by combining the limited labelled observations with large amounts of unlabelled data. A classifier is only useful if it is accurate, so it is necessary to develop algorithms that can be trained on a small set of labelled data while maintaining classification accuracy. We aim to develop novel active learning, few-shot learning, and transfer learning techniques, or extend existing ones, that can help us classify unlabelled data even when there are few labels and the data is manifold-distributed.
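The core SSL idea above, folding unlabelled data into training alongside a small labelled set, can be illustrated with a self-training loop. This is a minimal sketch, not the thesis's method: it assumes a toy nearest-centroid classifier and treats distance to the nearest centroid as confidence.

```python
import numpy as np

def nearest_centroid_fit(X, y):
    """Fit a toy classifier: one mean vector (centroid) per class."""
    classes = np.unique(y)
    return classes, np.array([X[y == c].mean(axis=0) for c in classes])

def self_train(X_lab, y_lab, X_unl, rounds=5, batch=10):
    """Self-training SSL: repeatedly pseudo-label the unlabelled points
    the current model is most confident about and add them to the
    labelled training set."""
    X_unl = X_unl.copy()
    for _ in range(rounds):
        if len(X_unl) == 0:
            break
        classes, cents = nearest_centroid_fit(X_lab, y_lab)
        # distance of each unlabelled point to each class centroid
        d = np.linalg.norm(X_unl[:, None, :] - cents[None, :, :], axis=2)
        conf = -d.min(axis=1)              # closer to a centroid = more confident
        take = np.argsort(conf)[-batch:]   # most confident pseudo-labels first
        pseudo = classes[d[take].argmin(axis=1)]
        X_lab = np.vstack([X_lab, X_unl[take]])
        y_lab = np.concatenate([y_lab, pseudo])
        X_unl = np.delete(X_unl, take, axis=0)
    return X_lab, y_lab
```

With only two labelled points per class and well-separated clusters, the loop grows the labelled set with mostly correct pseudo-labels; on manifold-distributed data the confidence measure would need to respect the manifold geometry, which is precisely where more sophisticated SSL methods come in.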
Our active learning method, developed for limited labelled datasets that are manifold-distributed, will identify unlabelled samples for manual labelling by a human expert; adding this manually labelled data to the training data will significantly enhance the accuracy of a classifier. Through our few-shot learning technique, we intend to develop accurate few-shot learning classifiers for limited labelled datasets that are manifold-distributed. Lastly, we intend to develop novel transfer learning techniques for limited labelled datasets that are manifold-distributed; these techniques will be capable of assessing a dataset's suitability for transfer learning before it is applied. Our ultimate objective is to construct accurate classifiers for limited labelled datasets using transfer learning.
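The active learning loop described above, querying a human expert for the labels the model would benefit from most, is commonly realised as uncertainty (margin) sampling. The sketch below is illustrative only, assuming the same toy nearest-centroid model rather than the thesis's manifold-aware method: it selects the pool points whose two nearest class centroids are almost equidistant.

```python
import numpy as np

def fit_centroids(X, y):
    """Minimal stand-in model: per-class mean vectors."""
    classes = np.unique(y)
    return classes, np.array([X[y == c].mean(axis=0) for c in classes])

def query_most_uncertain(X_pool, centroids, k=5):
    """Margin-based uncertainty sampling: return indices of the k pool
    points where the gap between the two nearest class centroids is
    smallest, i.e. where the model is least sure of the label."""
    d = np.linalg.norm(X_pool[:, None, :] - centroids[None, :, :], axis=2)
    d.sort(axis=1)
    margin = d[:, 1] - d[:, 0]   # small margin = ambiguous point
    return np.argsort(margin)[:k]
```

The queried indices would be sent to the human expert; their true labels are appended to the training set and the model is refit, repeating until the labelling budget is exhausted. On manifold-distributed data, Euclidean margins can be misleading, which motivates the manifold-aware selection criteria pursued in the thesis.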
| Date of Award | 2025 |
|---|---|
| Original language | English |
| Awarding Institution | Western Sydney University |
| Supervisor | Laurence Park (Supervisor) & Oliver Obst (Supervisor) |
Semi-supervised learning with a limited number of labelled observations
Qayyumi, S. W. (Author). 2025
Western Sydney University thesis: Doctoral thesis