Visual-Text Multimodal Large Language Models
TextLLM: a document multimodal large model based on dynamic resolution
- “A recent study proposes TextLLM, a multimodal document model based on dynamic resolution that processes high-resolution document images without OCR tools and significantly improves document understanding performance.”
- Vol. 30, Issue 9, Pages: 3068-3082 (2025)
Received: 16 October 2024,
Revised: 17 January 2025,
Accepted: 18 February 2025,
Published: 16 September 2025
DOI: 10.11834/jig.240608