Abstract: Weakly supervised video anomaly detection aims to locate abnormal activities in untrimmed videos without the need for frame-level supervision. Prior work has utilized graph convolution ...
Video Dense Caption: PPLLaVA can effectively balance the content, state, and motion of both the foreground and background, while maintaining detail and accuracy. Multi-turn dialogue and reasoning: ...
A Model Context Protocol (MCP) server that provides a "prompts" primitive for managing and serving customizable prompt templates. This server allows you to create, organize, and serve prompt templates ...
Abstract: Pre-trained vision-language models, like CLIP, make few-shot action recognition possible via text prompts. However, teaching scenarios are complex and CLIP has difficulties in understanding ...