Book, English, 458 pages, format (W × H): 155 mm × 235 mm, weight: 715 g
13th International Conference, CVM 2025, Hong Kong SAR, China, April 19-21, 2025, Proceedings, Part III
Series: Lecture Notes in Computer Science
ISBN: 978-981-965814-5
Publisher: Springer Nature Singapore
This book constitutes the refereed proceedings of CVM 2025, the 13th International Conference on Computational Visual Media, held in Hong Kong SAR, China, in April 2025.
The 67 full papers were carefully reviewed and selected from 335 submissions. The papers are organized in topical sections as follows:
Part I: Medical Image Analysis, Detection and Recognition, Image Enhancement and Generation, Vision Modeling in Complex Scenarios
Part II: 3D Geometry and Rendering, Generation and Editing, Image Processing and Optimization
Part III: Image and Video Analysis, Multimodal Learning, Geometrical Processing, Applications
Target audience
Research
Authors/Editors
Subject areas
- Mathematics | Computer Science > IT | Computer Science > Applied Computer Science
- Mathematics | Computer Science > IT | Computer Science > Computer Science > Artificial Intelligence > Computer Vision
- Mathematics | Computer Science > IT | Computer Science > Computer Science > Artificial Intelligence > Pattern Recognition, Biometrics
- Mathematics | Computer Science > IT | Computer Science > Programming | Software Development > Algorithms & Data Structures
- Mathematics | Computer Science > IT | Computer Science > Programming | Software Development > Graphics Programming
Further information & material
Image and Video Analysis
- DepthFisheye: Efficient Fine-Tuning of Depth Estimation Models for Fisheye Cameras
- DIMATrack: Dimension Aware Data Association for Multi-Object Tracking
- Efficient Transformer Network for Visible and Ultraviolet Object Tracking
- LightGR-Transformer: Light Grouped Residual Transformer for Multispectral Object Detection
- ADMMOA: Attribute-Driven Multimodal Optimization for Face Recognition Adversarial Attacks
- Training-Free Language-Guided Video Summarization via Multi-Grained Saliency Scoring
Multimodal Learning
- Reinforced Label Denoising for Weakly-Supervised Audio-Visual Video Parsing
- Bridging the Modality Gap: Advancing Multimodal Human Pose Estimation with Modality-Adaptive Pose Estimator and Novel Benchmark Datasets
- Momentum-Based Uni-Modal Soft-Label Alignment and Multi-Modal Latent Projection Networks for Optimizing Image-Text Retrieval
- Multi-Granularity and Multi-Modal Prompt Learning for Person Re-Identification
- Local and Global Feature Cross-attention Multimodal Place Recognition
- IML-CMM - A Multimodal Sentiment Analysis Framework Integrating Intra-Modal Learning and Cross-Modal Mixup Enhancement
Geometrical Processing
- MCFG with GUMAP: A Simple and Effective Clustering Framework on Grassmann Manifold
- Joint UMAP for Visualization of Time-Dependent Data
- Unsupervised Domain Adaptation on Point Cloud Classification via Imposing Structural Manifolds into Representation Space
Applications
- Learning Adaptive Basis Fonts to Fuse Content Features for Few-shot Font Generation
- TaiCrowd: A High-Performance Simulation Framework for Massive Crowd
- Feature Disentanglement and Fusion Model for Multi-Source Domain Adaptation with Domain-Specific Features
- A Trademark Retrieval Method Based on Self-Supervised Learning
- Weaken Noisy Feature: Boosting Semi-Supervised Learning by Noise Estimation
- Multi-Dimension Full Scene Integrated Visual Emotion Analysis Network
- Gap-KD: Bridging the Significant Capacity Gap Between Teacher and Student Model