Abstract: To effectively reduce the visual tokens in Visual Large Language Models (VLLMs), we propose a novel approach called Window Token Concatenation (WiCo). Specifically, we employ a sliding ...