Abstract: The ability to perceive emotions is an important criterion for judging whether a machine is intelligent. To this end, a large number of emotion recognition algorithms have been developed ...
Abstract: Extending large image-text pre-trained models (e.g., CLIP) to video understanding has led to significant advances. To enable CLIP to perceive dynamic information in ...