Masked-enhanced food segment anything model for automatic dietary intake monitoring

Zhongsui Guo, Bahman Javadi, Sonit Singh, Arcot Sowmya

Research output: Chapter in Book / Conference Paper › Conference Paper › peer-review

Abstract

Dietary management plays a crucial role in maintaining long-term health, preventing disease and aiding recovery, particularly amid the increasing prevalence of chronic conditions such as hypertension, cardiovascular disease and diabetes. Best practice in dietary management depends on insight into calorie intake based on the food items that people actually eat. In this paper, we propose a methodology to achieve this in three phases: instance segmentation, volume estimation and calorie calculation. Building on the Food Segment Anything Model (FoodSAM), we propose Masked-enhanced FoodSAM (MFoodSAM), which focuses on enhancing semantic and instance masks to provide better food identification. MFoodSAM incorporates ResNet50 as a background model, improving its ability to separate non-food elements and reducing its dependency on merging thresholds. We follow a coarse-to-fine segmentation approach to create detailed and semantically enriched masks, yielding accurate instance and panoptic segmentation masks. Once food items are identified, we estimate food volume from the pixel count of each food instance mask and combine this with information from the USDA National Nutrient Database and NutritionData to compute the caloric content of the food. We applied the proposed method to both a public and a private food dataset and obtained state-of-the-art results: mean IoU of 87.99, mean Acc of 95.55 and average Acc of 94.85 on the MyFood dataset, and mean IoU of 82.38, mean Acc of 92.85 and average Acc of 94.66 on the UNSW-Food dataset. This intelligent food monitoring system provides users with precise caloric information, facilitating more effective nutrition monitoring and contributing to improved health outcomes.
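
The abstract reports three segmentation scores without defining them. The sketch below shows how such scores are commonly computed from a confusion matrix. It is an illustration only, not the authors' evaluation code: it assumes that "average Acc" means overall pixel accuracy, and the function names, the class-skipping rule and the toy labels are all hypothetical.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(truth: Sequence[int], pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        matrix[t][p] += 1
    return matrix


def segmentation_scores(matrix: List[List[int]]) -> Tuple[float, float, float]:
    """Return (mean IoU, mean class accuracy, overall pixel accuracy) in percent.

    Assumption: classes with no ground-truth pixels are skipped in both means.
    """
    n = len(matrix)
    ious: List[float] = []
    accs: List[float] = []
    correct = 0
    total = 0
    for c in range(n):
        tp = matrix[c][c]                               # correctly labelled pixels of class c
        gt = sum(matrix[c])                             # pixels whose true class is c
        predicted = sum(matrix[r][c] for r in range(n)) # pixels predicted as class c
        correct += tp
        total += gt
        if gt == 0:
            continue
        ious.append(tp / (gt + predicted - tp))         # intersection over union
        accs.append(tp / gt)                            # per-class accuracy
    mean_iou = 100.0 * sum(ious) / len(ious)
    mean_acc = 100.0 * sum(accs) / len(accs)
    overall_acc = 100.0 * correct / total
    return mean_iou, mean_acc, overall_acc


if __name__ == "__main__":
    # Toy flattened label maps with three classes (hypothetical data).
    truth = [0, 0, 1, 1, 2, 2]
    pred = [0, 1, 1, 1, 2, 0]
    miou, macc, aacc = segmentation_scores(confusion_matrix(truth, pred, 3))
    print(f"mean IoU {miou:.2f}, mean Acc {macc:.2f}, overall Acc {aacc:.2f}")
```

Mean IoU and mean accuracy weight every class equally, whereas overall accuracy weights every pixel equally, so the two accuracy figures can differ when class sizes are unbalanced.
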
Original language: English
Title of host publication: Proceedings of the 25th International Conference on Digital Image Computing: Techniques and Applications (DICTA 2024), Perth, Australia, 27-29 November 2024
Place of Publication: U.S.
Publisher: IEEE
Pages: 190-197
Number of pages: 8
ISBN (Electronic): 9798350379037
Publication status: Published - 2024
Event: DICTA (Conference) - Perth, Australia
Duration: 27 Nov 2024 to 29 Nov 2024
Conference number: 25th

Conference

Conference: DICTA (Conference)
Country/Territory: Australia
City: Perth
Period: 27/11/24 to 29/11/24

Keywords

  • calorie estimation
  • convolutional neural networks
  • deep learning
  • food monitoring
  • food volume estimation
  • semantic segmentation
