Exploring the Application Boundaries of Lightweight Multimodal Models

Rui Wang

2025

Abstract

With the rapid development of Multimodal Large Language Models (MLLMs), models have shown unprecedented potential in graph comprehension and generation tasks. However, current research focuses mostly on performance breakthroughs in giga-parameter-level models and lacks systematic exploration of the capability boundaries of lightweight models (3B parameters or fewer) in real-world application scenarios. Using the open-source Bunny-v1_0-2B-zh model, this paper systematically explores the performance boundaries of a lightweight MLLM across different application scenarios. A multi-dimensional evaluation framework covering core capabilities such as text comprehension and visual reasoning is designed to reveal the practical application potential of a 2-billion-parameter model. Experiments show that the model can achieve excellent performance in resource-constrained scenarios such as mobile device deployment, but has shortcomings in complex logical reasoning and specialized domain knowledge. This study provides empirical evidence for selecting suitable application scenarios and optimization directions for lightweight models.

Paper Citation


in Harvard Style

Wang R. (2025). Exploring the Application Boundaries of Lightweight Multimodal Models. In Proceedings of the 2nd International Conference on Data Science and Engineering - Volume 1: ICDSE; ISBN 978-989-758-765-8, SciTePress, pages 476-480. DOI: 10.5220/0013699600004670


in Bibtex Style

@conference{icdse25,
author={Rui Wang},
title={Exploring the Application Boundaries of Lightweight Multimodal Models},
booktitle={Proceedings of the 2nd International Conference on Data Science and Engineering - Volume 1: ICDSE},
year={2025},
pages={476-480},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013699600004670},
isbn={978-989-758-765-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 2nd International Conference on Data Science and Engineering - Volume 1: ICDSE
TI - Exploring the Application Boundaries of Lightweight Multimodal Models
SN - 978-989-758-765-8
AU - Wang R.
PY - 2025
SP - 476
EP - 480
DO - 10.5220/0013699600004670
PB - SciTePress