Overview
Objectives
This course provides practical experience in managing Generative AI systems and develops the skills needed to assess large language models (LLMs). It focuses on the deployment, monitoring, optimization, and scaling of Generative AI applications in production using Generative AI Operations (GenAIOps).
A light touch on Machine Learning Operations (MLOps) ensures foundational understanding before diving into LLM-specific challenges such as performance tuning, cost efficiency, observability, retrieval augmented generation (RAG), LLM Agents, and prompt management.
Participants will gain experience with common (commercial and open-source) cloud-based LLM tooling and platforms, ensuring they can build scalable, efficient, and cost-effective LLM applications.
Audience
Learning outcomes
This course equips participants with key competencies in deploying, monitoring, optimizing, and scaling large language models (LLMs) in operational environments:
- Foundational Knowledge of MLOps and LLMOps: Understanding the lifecycle of LLMs including pre-training, fine-tuning, inference, and monitoring; distinguishing between MLOps and LLMOps.
- Deployment Strategies for LLMs: Learning various hosting options, tackling scaling issues like model parallelism, and exploring cost optimization strategies such as caching.
- Prompt Engineering and Management: Gaining skills in prompt engineering for enhanced efficiency, accuracy, and cost-effectiveness; managing and versioning prompts to ensure consistency and maintainability.
- Performance and Cost Optimization: Techniques to optimize inference speed and reduce operational costs.
- Observability and Monitoring: Setting up systems to track LLM performance metrics such as latency and token usage, detecting errors, and implementing feedback loop mechanisms.
Programme
Day 1. GenAIOps/LLMOps Fundamentals
1. Introduction to MLOps and LLMOps
- MLOps vs. GenAIOps/LLMOps: Key differences and challenges
- LLM lifecycle: Pre-training, fine-tuning, inference, and monitoring
- Challenges in deploying LLMs: Scalability, latency, cost, observability
2. GenAIOps/LLMOps Lifecycle
- Foundation models
- Fine-tuning
- LLM Deployment strategies and infrastructure
- Scaling challenges: Model parallelism, quantization, and distillation
- Cost optimization strategies: Caching, batch inference, serverless deployments, optimizing inference speed (see the caching sketch below)
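To make the caching idea concrete, here is a minimal, illustrative Python sketch. It is not tied to any particular provider: call_llm is a hypothetical stand-in for whatever SDK call a deployment actually uses; only the cache-key pattern is the point.

```python
import hashlib
import json

def call_llm(prompt: str, temperature: float = 0.0) -> str:
    """Hypothetical stand-in for a real provider SDK call."""
    return f"answer to: {prompt}"

_cache: dict[str, str] = {}  # in-memory store; a real deployment would use Redis or similar

def _cache_key(prompt: str, temperature: float) -> str:
    # Identical prompt + generation parameters -> identical key -> the stored answer is reused
    payload = json.dumps({"prompt": prompt, "temperature": temperature}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def cached_completion(prompt: str, temperature: float = 0.0) -> str:
    key = _cache_key(prompt, temperature)
    if key not in _cache:                      # cache miss: pay for one model call
        _cache[key] = call_llm(prompt, temperature)
    return _cache[key]                         # cache hit: no tokens billed, near-zero latency
```

Repeated identical requests then cost a single model call, which is where most of the savings in FAQ-style workloads come from.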
3. Hands-on Session: LLM Fine-tuning
- Fine-tune a small LLM
4. Prompt Engineering
- Prompt Engineering Basics
- Optimizing prompts for efficiency, accuracy, and cost
- Structuring prompts for different use cases (see the template sketch below)
- Fine-tuning vs. prompt engineering: When to use which
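As an illustration of structuring prompts for a specific use case, the following minimal sketch separates instructions, context, and question in a fixed template; the template wording itself is invented for this example.

```python
# A fixed template keeps prompts consistent across calls and easy to review.
PROMPT_TEMPLATE = """\
You are a support assistant for an internal knowledge base.

### Instructions
- Answer only from the provided context.
- If the context is insufficient, say so explicitly.

### Context
{context}

### Question
{question}
"""

def build_prompt(context: str, question: str) -> str:
    return PROMPT_TEMPLATE.format(context=context, question=question)

print(build_prompt("Refunds are processed within 14 days.", "How long do refunds take?"))
```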
5. Prompt Management
- Why prompt management matters: Consistency, scalability, and maintainability
- Versioning and tracking prompts: Best practices (see the registry sketch below)
- Using prompt management tools
- A/B testing prompts: Measuring effectiveness and iterating
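A minimal sketch of what prompt versioning can look like in code, assuming a simple in-memory registry; real deployments would back this with a database, version control, or a dedicated prompt-management tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class PromptVersion:
    name: str          # logical prompt id, e.g. "summarise"
    version: int       # monotonically increasing version number
    template: str
    created_at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

class PromptRegistry:
    """Keeps every version of every prompt so changes stay traceable and reversible."""

    def __init__(self) -> None:
        self._store: dict[str, list[PromptVersion]] = {}

    def register(self, name: str, template: str) -> PromptVersion:
        versions = self._store.setdefault(name, [])
        pv = PromptVersion(name=name, version=len(versions) + 1, template=template)
        versions.append(pv)
        return pv

    def get(self, name: str, version: int | None = None) -> PromptVersion:
        versions = self._store[name]
        return versions[-1] if version is None else versions[version - 1]

registry = PromptRegistry()
registry.register("summarise", "Summarise the text below in 3 bullet points:\n{text}")
registry.register("summarise", "Summarise the text below in exactly 3 short bullet points:\n{text}")
print(registry.get("summarise").version)  # -> 2 (latest); get("summarise", 1) returns the first version
```

Pinning a production application to an explicit version, rather than "whatever the prompt says today", is what makes A/B tests and rollbacks practical.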
6. Hands-on Session: Prompt Engineering & Management
- Experimenting with different prompt engineering techniques and tools
Day 2. Advanced LLMOps
1. Retrieval-augmented generation (RAG)
- What is RAG and how does it work? (see the minimal sketch below)
- Exploring the different types of RAG
- Common issues in RAG solutions
- When to use fine-tuning vs. prompt engineering vs. RAG
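To ground the idea, here is a deliberately tiny RAG sketch: retrieval is reduced to lexical overlap instead of a real vector search, and call_llm is again a hypothetical stand-in for a provider call.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real provider SDK call."""
    return f"[model answer grounded in]\n{prompt}"

def score(query: str, document: str) -> float:
    # Toy lexical overlap used in place of an embedding-based similarity search
    q, d = set(query.lower().split()), set(document.lower().split())
    return len(q & d) / (len(q) or 1)

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    return sorted(documents, key=lambda doc: score(query, doc), reverse=True)[:k]

def answer_with_rag(query: str, documents: list[str]) -> str:
    context = "\n".join(retrieve(query, documents))                          # 1. retrieve relevant passages
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"  # 2. augment the prompt
    return call_llm(prompt)                                                  # 3. generate

docs = ["Refunds are processed within 14 days.", "Support is available on weekdays."]
print(answer_with_rag("How long do refunds take?", docs))
```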
2. Agentic workflows
- What are LLM Agents?
- Key components of an LLM agent, with an overview of open-source frameworks such as LangChain and LlamaIndex (see the agent-loop sketch below)
- Use cases
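The following sketch shows the core loop of a tool-using agent in a few lines, without any framework; fake_planner is a hypothetical stand-in for the LLM planning step that LangChain or LlamaIndex would normally drive.

```python
def calculator(expression: str) -> str:
    # Toy tool; never eval untrusted input in a real system
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def fake_planner(task: str) -> tuple[str, str]:
    """Hypothetical stand-in for the LLM planning step: returns (tool_name, tool_input)."""
    return "calculator", "2 + 2 * 10"

def run_agent(task: str) -> str:
    tool_name, tool_input = fake_planner(task)   # 1. the model chooses a tool and its arguments
    observation = TOOLS[tool_name](tool_input)   # 2. the loop executes the chosen tool
    return f"{task} -> {tool_name}({tool_input}) = {observation}"  # 3. the observation is returned or fed back

print(run_agent("What is 2 + 2 * 10?"))
```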
3. Hands-on: Building a GenAI Application
- Building a RAG and Agent application
- Implementing monitoring & logging for performance tracking
- Optimizing deployment for cost and scalability
4. Observability, Monitoring, and Feedback Loops
- Tracking LLM performance: Latency, token usage, response quality (see the monitoring sketch below)
- Detecting hallucinations and errors
- Implementing feedback loop mechanisms
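A minimal sketch of the metrics a feedback loop typically starts from, assuming a stubbed model call; real setups export these to an observability backend and use the provider's own token counts.

```python
import time

def call_llm(prompt: str) -> str:
    """Stubbed model call so the sketch runs end to end."""
    return "stub answer"

def log_metrics(metrics: dict) -> None:
    print(metrics)  # a production setup would export to a metrics/observability backend instead

def observed_call(prompt: str) -> dict:
    start = time.perf_counter()
    response = call_llm(prompt)
    latency_s = time.perf_counter() - start
    metrics = {
        "latency_s": round(latency_s, 3),
        "prompt_tokens": len(prompt.split()),       # crude estimate; real systems use the provider's token counts
        "completion_tokens": len(response.split()),
        "empty_response": not response.strip(),     # trivial error signal; quality and hallucination checks plug in here
    }
    log_metrics(metrics)
    return {"response": response, "metrics": metrics}

observed_call("Summarise our refund policy.")
```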
5. Security, Compliance, and Responsible AI
- Data privacy in LLMOps: PII redaction, secure API handling (see the redaction sketch below)
- Regulatory compliance
- Bias detection and mitigation strategies
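To illustrate PII redaction before data leaves the organization, here is a deliberately small, regex-based sketch; real pipelines use much broader detectors and named-entity recognition.

```python
import re

# Two illustrative patterns only; production redaction covers many more PII categories.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\+?\d[\d\s.-]{7,}\d")

def redact_pii(text: str) -> str:
    """Masks obvious PII before the text is sent to an external LLM API."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text

print(redact_pii("Contact Jane at jane.doe@example.org or +41 22 379 11 11."))
# -> Contact Jane at [EMAIL] or [PHONE].
```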
Registration
Registration deadline
Fees:
- CHF 1300.- for the full course
- CHF 650.- (reduced rate)
- An additional CHF 200.- for the Micro-credential
Speakers
Hisham MOHAMED, PhD, University of Geneva
Hisham is an AI and machine learning expert with over 10 years of experience in machine learning, software engineering, and big data. With a PhD in Computer Science from the University of Geneva, he has led high-impact projects and built and managed diverse teams.
Hisham has deep experience in deploying and scaling AI systems. In this session, he will focus on GenAIOps/LLMOps, sharing insights on managing, optimizing, and operationalizing large language models in real-world applications.
Director(s)
Prof. Giovanna DI MARZO SERUGENDO, Centre universitaire d'informatique (CUI), University of Geneva