Short course Managing and Evaluating Generative AI systems 2025

Information
Period
5 June 2025 - 6 June 2025
EQF-level
Language
English
Registration
Registration deadline
23 May 2025
Fees:
- CHF 1300.- for the full course
- CHF 650.- (reduced rate)
- An additional CHF 200.- for the Micro-credential
Contribution to the SDGs
Objectives
This course provides practical experience in managing Generative AI systems and develops the skills needed to assess different Generative AI models. It focuses on the deployment, monitoring, optimization, and scaling of large language models (LLMs) in production using Large Language Model Operations (LLMOps). A light touch on Machine Learning Operations (MLOps) ensures a foundational understanding before diving into LLM-specific challenges such as performance tuning, cost efficiency, observability, LLM agents, and prompt management. Participants will gain hands-on experience with common commercial and open-source cloud-based LLM tooling and platforms, enabling them to build scalable, efficient, and cost-effective LLM applications.
Audience
Learning outcomes
This course equips participants with key competencies in deploying, monitoring, optimizing, and scaling large language models (LLMs) within operational environments:
- Foundational Knowledge of MLOps and LLMOps: Understanding the lifecycle of LLMs including pre-training, fine-tuning, inference, and monitoring; distinguishing between MLOps and LLMOps.
- Deployment Strategies for LLMs: Learning various hosting options, tackling scaling issues like model parallelism, and exploring cost optimization strategies such as caching.
- Prompt Engineering and Management: Gaining skills in prompt engineering for enhanced efficiency, accuracy, and cost-effectiveness; managing and versioning prompts to ensure consistency and maintainability.
- Performance and Cost Optimization: Techniques to optimize inference speed and reduce operational costs.
- Observability and Monitoring: Setting up systems to track LLM performance metrics such as latency and token usage, detecting errors, and implementing feedback loop mechanisms.
Programme
Day 1 - LLMOps Fundamentals
- Introduction to MLOps and LLMOps
- LLM Deployment Strategies and Infrastructure
- Prompt Engineering and Prompt Management
- Prompt Management in LLMOps
- Hands-on Session: Deploying an LLM and Experimenting with Prompts
Day 2 - Advanced LLMOps
- Performance and Cost Optimization in LLMOps
- Observability, Monitoring, and Feedback Loops
- Security, Compliance, and Responsible AI
- Agentic Workflows
- Hands-on Capstone: Deploying a Scalable LLM Application
Director(s)
Prof. Giovanna DI MARZO SERUGENDO, Centre universitaire d'informatique (CUI), University of Geneva
Coordinator(s)
Date(s)
Speakers
Description
1. Introduction to MLOps and LLMOps
- MLOps vs. LLMOps: Key differences and challenges
- LLM lifecycle: Pre-training, fine-tuning, inference, and monitoring
- Challenges in deploying LLMs: Scalability, latency, cost, observability
2. LLM Deployment Strategies and Infrastructure
- Hosting options: API-based vs. self-hosted
- Scaling challenges: Model parallelism, quantization, and distillation
- Cost optimization strategies: Caching, batch inference, serverless deployments
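To make the caching strategy above concrete, here is a minimal sketch of response caching around a generic LLM call. The call_llm function is a hypothetical placeholder for whatever API-based or self-hosted endpoint is used, and an in-memory dictionary stands in for a shared cache such as Redis.
```python
import hashlib
import json

def call_llm(model: str, prompt: str) -> str:
    # Hypothetical placeholder: wire this to your provider's SDK or a local server.
    return f"[stub response from {model}]"

_cache: dict[str, str] = {}  # in production, a shared store (e.g. Redis) would be used

def cached_completion(model: str, prompt: str) -> str:
    """Return a cached response for identical (model, prompt) pairs, paying for one call only."""
    key = hashlib.sha256(json.dumps([model, prompt]).encode()).hexdigest()
    if key not in _cache:
        _cache[key] = call_llm(model, prompt)
    return _cache[key]
```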
3. Prompt Engineering and Prompt Management
- Prompt Engineering Basics
- Optimizing prompts for efficiency, accuracy, and cost
- Structuring prompts for different use cases
- Fine-tuning vs. prompt engineering: When to use which
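As a small illustration of structuring prompts for different use cases, the sketch below shows a reusable prompt template with fixed slots for instructions, constraints, and input. The template text and field names are illustrative only, not tied to any particular tool.
```python
SUMMARY_TEMPLATE = """You are a concise technical writer.
Summarize the text below in at most {max_words} words for an audience of {audience}.

Text:
{text}
"""

def build_summary_prompt(text: str, audience: str = "business executives", max_words: int = 100) -> str:
    # Keeping instructions, constraints, and input in fixed slots makes prompts
    # easier to test, version, and reuse across use cases.
    return SUMMARY_TEMPLATE.format(text=text, audience=audience, max_words=max_words)
```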
4. Prompt Management in LLMOps
- Why prompt management matters: Consistency, scalability, and maintainability
- Versioning and tracking prompts: Best practices
- Using prompt management tools
- A/B testing prompts: Measuring effectiveness and iterating
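The prompt-management ideas above (versioning, tracking, A/B testing) can be illustrated with a minimal in-memory registry; dedicated prompt-management tools provide the same features as a managed service. The prompt names and versions below are invented for the example.
```python
import random

PROMPTS = {
    ("summarize", "v1"): "Summarize the following text:\n{text}",
    ("summarize", "v2"): "Summarize the following text in three bullet points:\n{text}",
}

def get_prompt(name: str, version: str) -> str:
    """Fetch an exact prompt version so deployments stay reproducible."""
    return PROMPTS[(name, version)]

def ab_test_prompt(name: str, versions=("v1", "v2"), weights=(0.5, 0.5)):
    """Randomly assign a version for A/B testing; log the chosen version with each output."""
    version = random.choices(list(versions), weights=list(weights))[0]
    return version, get_prompt(name, version)
```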
5. Hands-on Session: Deploying an LLM and Experimenting with Prompts
- Deploying an LLM
- Experimenting with different prompt engineering techniques
Date(s)
Description
1. Performance and Cost Optimization in LLMOps
- Optimizing inference speed
- Cost-effective LLM usage: Reducing API calls, caching, batch inference
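A quick way to reason about cost-effective usage is to estimate the cost of each call from its token counts. The per-token prices below are assumptions for illustration only; real rates vary by provider and model.
```python
PRICE_PER_1K_INPUT = 0.0005   # USD per 1,000 input tokens (assumed, not a real rate)
PRICE_PER_1K_OUTPUT = 0.0015  # USD per 1,000 output tokens (assumed, not a real rate)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one LLM call from its token counts."""
    return (input_tokens / 1000) * PRICE_PER_1K_INPUT + (output_tokens / 1000) * PRICE_PER_1K_OUTPUT

# Example: a 1,200-token prompt that produces a 300-token answer
print(f"~${estimate_cost(1200, 300):.4f} per request")
```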
2. Observability, Monitoring, and Feedback Loops
- Tracking LLM performance: Latency, token usage, response quality
- Detecting hallucinations and errors
- Implementing feedback loop mechanisms
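A minimal sketch of the observability ideas above: wrapping any LLM call to record latency, approximate token counts, and errors. Here call_fn is a placeholder for the real client call, and token counts are approximated by whitespace splitting purely for illustration.
```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("llmops")

def monitored_call(call_fn, prompt: str) -> str:
    """Wrap an LLM call with basic observability: latency, sizes, and errors."""
    start = time.perf_counter()
    try:
        response = call_fn(prompt)
    except Exception:
        log.exception("llm_call_failed")
        raise
    latency_ms = (time.perf_counter() - start) * 1000
    log.info("llm_call ok latency_ms=%.1f prompt_tokens~%d response_tokens~%d",
             latency_ms, len(prompt.split()), len(response.split()))
    return response
```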
3. Security, Compliance, and Responsible AI
- Data privacy in LLMOps: PII redaction, secure API handling
- Regulatory compliance
- Bias detection and mitigation strategies
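PII redaction can be sketched with a couple of regular expressions applied before a prompt leaves your system. Real deployments rely on dedicated PII-detection services or NER models; the two patterns below (e-mail addresses and phone-like numbers) are only an illustration.
```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact_pii(text: str) -> str:
    """Replace obvious e-mail addresses and phone numbers before sending a prompt."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text

print(redact_pii("Contact Jane at jane.doe@example.org or +41 22 379 00 00."))
# -> "Contact Jane at [EMAIL] or [PHONE]."
```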
4. Agentic Workflows
- What are LLM Agents?
- Key components of an LLM Agent
- Introduction to LangChain & LlamaIndex
- Use cases
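To show the key components of an LLM agent without committing to any framework's API, here is a deliberately simplified, framework-free sketch: a stubbed reasoning step that picks a tool, a tool registry, and a step that executes the choice. LangChain and LlamaIndex provide production-grade versions of this pattern; all names below are invented for the example.
```python
from typing import Callable

# Tool registry: the actions the agent can take.
TOOLS: dict[str, Callable[[str], str]] = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),  # demo only, never eval untrusted input
    "search": lambda query: f"[stub search results for '{query}']",
}

def choose_tool(question: str) -> tuple[str, str]:
    # Stand-in for the reasoning step, which in a real agent is an LLM call
    # returning the tool name and its input.
    if any(ch.isdigit() for ch in question):
        return "calculator", question
    return "search", question

def run_agent(question: str) -> str:
    tool_name, tool_input = choose_tool(question)  # 1. the model picks a tool
    observation = TOOLS[tool_name](tool_input)     # 2. the tool is executed
    return f"Used {tool_name}: {observation}"      # 3. the observation becomes the answer

print(run_agent("2 + 2 * 10"))  # -> "Used calculator: 22"
```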
5. Hands-on Capstone: Deploying a Scalable LLM Application
- Building an LLM-powered chatbot, search engine, or knowledge assistant
- Implementing monitoring & logging for performance tracking
- Optimizing deployment for cost and scalability
Dr Hisham MOHAMED, University of Geneva
Hisham is an AI and machine learning expert with over 10 years of experience in machine learning, software engineering, and big data. With a PhD in Computer Science from the University of Geneva, he has led high-impact projects and built and managed diverse teams. Hisham has deep experience in deploying and scaling AI systems. In this session, he will focus on LLMOps, sharing insights on managing, optimizing, and operationalizing large language models in real-world applications.