Book, English, 304 pages, format (W × H): 178 mm × 254 mm
LLM Deployment, Fine-Tuning, and Application
ISBN: 978-1-041-09000-7
Publisher: CRC Press
From fundamental concepts to advanced implementations, this book thoroughly explores the DeepSeek-V3 model, focusing on its Transformer-based architecture, technological innovations, and applications.
The book begins with the theoretical foundations, including self-attention, positional encoding, the Mixture of Experts mechanism, and distributed training strategies. It then explores DeepSeek-V3’s technical advancements, including sparse attention mechanisms, FP8 mixed-precision training, and hierarchical load balancing, which optimize memory and energy efficiency. Case studies and API integration techniques illustrate the model’s capabilities in text generation, mathematical reasoning, and code completion. The book also introduces DeepSeek’s open platform and covers secure API authentication, concurrency strategies, and real-time data processing for scalable AI applications. Finally, it addresses industry applications such as chat client development, using DeepSeek’s context caching and callback functions for automation and predictive maintenance.
This book is aimed primarily at AI researchers and developers working on large-scale AI models. It is an invaluable resource for professionals seeking to understand the theoretical underpinnings and practical implementation of advanced AI systems, particularly those interested in efficient, scalable applications.
Target audience
Academic, Postgraduate, Professional Practice & Development, Professional Reference, Undergraduate Advanced, and Undergraduate Core
Further information and material
Part I: Theoretical Foundations and Technical Architecture of Generative AI
1. Core Principles of Transformer and Attention Mechanisms
2. DeepSeek-V3 Core Architecture and its Training Techniques in Detail
3. Introduction to DeepSeek-V3 Model-Based Development

Part II: Development and Application of Generative AI and Advanced Prompt Design
4. A First Look at the DeepSeek-V3 Big Model
5. DeepSeek Open Platform and API Development Details
6. Dialogue Generation, Code Completion, and Customized Model Development
7. Conversation Prefix Completion, FIM and JSON Output Development Details
8. Callback Functions and Contextual Disk Caching
9. The DeepSeek Prompt Library: Exploring More Possibilities for Prompts

Part III: Integration of Practical Experience and Advanced Applications
10. Integration Practice 1: LLM-Based Chat Client Development
11. Integration Hands-On 2: AI Assisted Development
12. Integration Practice 3: Assisted Programming Plugin Development Based on VS Code