Abstract: Large language models (LLMs) have drawn considerable attention due to their exceptional comprehension and reasoning capabilities. The development of LLM methods is leading to countless ...
Abstract: Recent progress in document retrieval increasingly leverages vision-language models that bypass OCR by jointly encoding visual and textual content. These models capture layout, font, ...