E-book, English, Volume 15108, 485 pages
18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part L
Series: Lecture Notes in Computer Science
ISBN: 978-3-031-72973-7
Publisher: Springer International Publishing
Format: PDF
Copy protection: 1 - PDF Watermark
Target audience
Research
Authors/Editors
Further information & material
Revisit Human-Scene Interaction via Space Occupancy
Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control
WeConvene: Learned Image Compression with Wavelet-Domain Convolution and Entropy Model
Grid-Attention: Enhancing Computational Efficiency of Large Vision Models without Fine-Tuning
Mitigating Background Shift in Class-Incremental Semantic Segmentation
Relation DETR: Exploring Explicit Position Relation Prior for Object Detection
BKDSNN: Enhancing the Performance of Learning-based Spiking Neural Networks Training with Blurred Knowledge Distillation
Agent Attention: On the Integration of Softmax and Linear Attention
Learning by Aligning 2D Skeleton Sequences and Multi-Modality Fusion
Resolving Scale Ambiguity in Multi-view 3D Reconstruction using Dual-Pixel Sensors
Object-Oriented Anchoring and Modal Alignment in Multimodal Learning
Towards Stable 3D Object Detection
FYI: Flip Your Images for Dataset Distillation
On-the-fly Category Discovery for LiDAR Semantic Segmentation
Dual-Camera Smooth Zoom on Mobile Phones
ProtoComp: Diverse Point Cloud Completion with Controllable Prototype
CONDA: Condensed Deep Association Learning for Co-Salient Object Detection
Cascade Prompt Learning for Visual-Language Model Adaptation
PolyRoom: Room-aware Transformer for Floorplan Reconstruction
BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models
SMFANet: A Lightweight Self-Modulation Feature Aggregation Network for Efficient Image Super-Resolution
HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras
Hierarchical Unsupervised Relation Distillation for Source Free Domain Adaptation
Customized Generation Reimagined: Fidelity and Editability Harmonized
AUFormer: Vision Transformers are Parameter-Efficient Facial Action Unit Detectors
Improving Video Segmentation via Dynamic Anchor Queries
Controllable Contextualized Image Captioning: Directing the Visual Narrative through User-Defined Highlights