thecvf.com
https://openaccess.thecvf.com/content/CVPR2025/htm…
CVPR 2025 Open Access Repository
Building on this foundation, we introduce SmartCLIP, a novel approach that identifies and aligns the most relevant visual and textual representations in a modular manner.
thecvf.com
https://openaccess.thecvf.com/content/CVPR2023/pap…
Mod-Squad: Designing Mixtures of Experts As Modular Multi-Task Learners
In this work, we propose Mod-Squad, a modular multi-task learner based on mixture-of-experts and a novel loss to address the gradient conflicts among tasks. We demonstrate its potential to scale up in both model capacity and target task numbers while keeping the computation cost low.
thecvf.com
https://openaccess.thecvf.com/content/WACV2024/pap…
MOPA: Modular Object Navigation With PointGoal Agents
We propose a simple but effective modular approach MOPA (Modular ObjectNav with PointGoal agents) to systematically investigate the inherent modularity of the object navigation task in Embodied AI.
thecvf.com
https://openaccess.thecvf.com/content/CVPR2024/pap…
PARA-Drive: Parallelized Architecture for Real-time Autonomous Driving
End-to-End Yet Modular Architecture distinguishes itself from traditional and end-to-end planning approaches by integrating modular design with end-to-end training. As a result, it maintains safety and interpretability, while simultaneously optimizing all modules for downstream planning.
thecvf.com
https://openaccess.thecvf.com/content/ICCV2025/htm…
ICCV 2025 Open Access Repository
This necessitates the creation of modular and scalable ways to teach VLMs about physical reasoning. To that end, we introduce Physics Context Builders (PCBs), a modular framework where specialized smaller VLMs are fine-tuned to generate detailed physical scene descriptions.
thecvf.com
https://openaccess.thecvf.com/content/CVPR2024/pap…
MoReVQA: Exploring Modular Reasoning Models for Video Question Answering
The modular architecture of MoReVQA allows it to be dynamically tailored to a wide range of datasets, question types, and tasks by selectively engaging different APIs and reasoning strategies based on the task at hand.
thecvf.com
https://openaccess.thecvf.com/content/CVPR2025/pap…
SmartCLIP: Modular Vision-language Alignment with Identification Guarantees
Building on this foundation, we introduce SmartCLIP, a novel approach that identifies and aligns the most relevant visual and textual representations in a modular manner.
thecvf.com
https://openaccess.thecvf.com/content/CVPR2024/htm…
CVPR 2024 Open Access Repository
Modular Blind Video Quality Assessment Wen Wen, Mu Li, Yabin Zhang, Yiting Liao, Junlin Li, Li Zhang, Kede Ma; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 2763-2772
thecvf.com
https://openaccess.thecvf.com/content/CVPR2025/htm…
CVPR 2025 Open Access Repository
We design a modular adapter consisting of a functional adapter and a representation descriptor. The representation descriptors are trained as a distribution shift indicator and used to trigger self-expansion signals.
thecvf.com
https://openaccess.thecvf.com/content/CVPR2024/pap…
Orthogonal Adaptation for Modular Customization of Diffusion Models
In this paper, we address a new problem called Modular Customization, with the goal of efficiently merging customized models that were fine-tuned independently for individual concepts.