E-book, English, Volume 15106, 486 pages
18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part XLVIII
Series: Lecture Notes in Computer Science
ISBN: 978-3-031-73195-2
Publisher: Springer International Publishing
Format: PDF
Copy protection: 1 - PDF Watermark
Target audience
Research
Further information and material
SmartControl: Enhancing ControlNet for Handling Rough Visual Conditions
InterFusion: Text-Driven Generation of 3D Human-Object Interaction
GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval
DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving
Flow-Assisted Motion Learning Network for Weakly-Supervised Group Activity Recognition
NeRF-XL: NeRF at Any Scale with Multi-GPU
CoSIGN: Few-Step Guidance of ConSIstency Model to Solve General INverse Problems
The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models?
Compositional Substitutivity of Visual Reasoning for Visual Question Answering
LightenDiffusion: Unsupervised Low-Light Image Enhancement with Latent-Retinex Diffusion Models
DNI: Dilutional Noise Initialization for Diffusion Video Editing
Two-Stage Video Shadow Detection via Temporal-Spatial Adaption
Towards Physical World Backdoor Attacks against Skeleton Action Recognition
SAM-guided Graph Cut for 3D Instance Segmentation
Fully Authentic Visual Question Answering Dataset from Online Communities
Active Generation for Image Classification
FuseTeacher: Modality-fused Encoders are Strong Vision Supervisors
Learning Local Pattern Modularization for Point Cloud Reconstruction from Unseen Classes
Understanding Multi-compositional learning in Vision and Language models via Category Theory
FedRA: A Random Allocation Strategy for Federated Tuning to Unleash the Power of Heterogeneous Clients
Panel-Specific Degradation Representation for Raw Under-Display Camera Image Restoration
Unlocking Textual and Visual Wisdom: Open-Vocabulary 3D Object Detection Enhanced by Comprehensive Guidance from Text and Image
Diffusion-Guided Weakly Supervised Semantic Segmentation
Weakly-Supervised Spatio-Temporal Video Grounding with Variational Cross-Modal Alignment
When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset
NVS-Adapter: Plug-and-Play Novel View Synthesis from a Single Image
Segment and Recognize Anything at Any Granularity