Authors:
Bruno G. Ferreira [1, 2, 3, 4]; Armando Sousa [2, 3] and Luis Paulo Reis [4, 3]
Affiliations:
[1] Edge Innovation Center, Federal University of Alagoas, Maceió, Brazil
[2] INESC TEC - INESC Technology and Science, Porto, Portugal
[3] FEUP - Faculty of Engineering of the University of Porto, Porto, Portugal
[4] LIACC - Artificial Intelligence and Computer Science Laboratory, Porto, Portugal
Keyword(s):
Semantic Segmentation, Robotic Indoor Mapping, Semantic Mapping, Segment Anything Model (SAM), Performance Evaluation, Computer Vision, FastSAM, SAM2, MobileSAM.
Abstract:
Semantic segmentation is an important step in creating the rich semantic maps required for indoor navigation by autonomous robots. While foundation models like the Segment Anything Model (SAM) have significantly advanced the field by enabling object segmentation without prior references, selecting an efficient variant for real-time robotics applications remains a challenge due to the trade-off between speed and accuracy. This paper evaluates three such variants (FastSAM, MobileSAM, and SAM 2), comparing their speed and accuracy to determine their suitability for semantic mapping tasks. The models were assessed on the Robot@VirtualHome dataset across 30 distinct scenes, with performance quantified using Frames Per Second (FPS), Precision, Recall, and an Over-Segmentation metric, which quantifies the fragmentation of an object into multiple masks, a failure mode that prevents high-quality semantic segmentation. The results reveal distinct performance profiles: FastSAM achieves the highest speed but exhibits poor precision and significant mask fragmentation. Conversely, SAM 2 provides the highest precision but is computationally intensive, which limits its use in real-time applications. MobileSAM emerges as the most balanced model, delivering high recall, good precision, and viable processing speed with minimal over-segmentation. We conclude that MobileSAM offers the most effective trade-off between segmentation quality and efficiency, making it a good candidate for indoor semantic mapping in robotics.
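To make the quality metrics named in the abstract concrete, the following is a minimal Python sketch of how pixel-level Precision, Recall, and a fragmentation count could be computed for one ground-truth object. The abstract does not give the paper's exact definitions, so everything here is an assumption for illustration: the function names `precision_recall` and `count_fragments`, the `min_inside` threshold, and the rule that a predicted mask counts as a fragment of the object when at least half of its pixels fall inside the ground-truth mask.

```python
import numpy as np


def precision_recall(pred, gt):
    """Pixel-level precision and recall of a predicted mask against a ground-truth mask."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    tp = np.logical_and(pred, gt).sum()  # pixels predicted as object that are object
    precision = tp / pred.sum() if pred.sum() > 0 else 0.0
    recall = tp / gt.sum() if gt.sum() > 0 else 0.0
    return float(precision), float(recall)


def count_fragments(pred_masks, gt, min_inside=0.5):
    """Count predicted masks that lie mostly inside one ground-truth object.

    Hypothetical over-segmentation indicator (not the paper's definition):
    a value above 1 means the object was split into several masks.
    """
    gt = np.asarray(gt, dtype=bool)
    count = 0
    for mask in pred_masks:
        mask = np.asarray(mask, dtype=bool)
        area = mask.sum()
        if area > 0 and np.logical_and(mask, gt).sum() / area >= min_inside:
            count += 1
    return count


# Toy scene: one object covering columns 0..3 of a 4x6 image.
gt = np.zeros((4, 6), dtype=bool)
gt[:, 0:4] = True

# Two predicted masks that split the object; the second spills one column outside it.
mask_a = np.zeros((4, 6), dtype=bool)
mask_a[:, 0:2] = True
mask_b = np.zeros((4, 6), dtype=bool)
mask_b[:, 2:5] = True

union = np.logical_or(mask_a, mask_b)
precision, recall = precision_recall(union, gt)
fragments = count_fragments([mask_a, mask_b], gt)
print(f"precision={precision:.2f} recall={recall:.2f} fragments={fragments}")
```

In this toy case the union of the two masks covers the whole object (recall 1.0) with some spill-over (precision 0.8), yet the object is represented by two masks. This is the kind of fragmentation that precision and recall alone do not reveal, which is presumably why a separate over-segmentation measure is reported.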