University of Geneva

Short course: Managing and Evaluating Generative AI Systems 2025

Gain practical experience in evaluating and deploying your own Generative AI systems, develop skills to assess various Generative AI models, and learn how to prototype rapidly with GenAI.
Micro-credential

Information

Period

5 June 2025 - 6 June 2025
2 ECTS credits (micro-credential)

Language

English

Contact

Dr Jose Luis FERNANDEZ-MARQUEZ
cui-formationcontinue(at)unige.ch

Location

Campus Biotech Innovation Park

Registration

Registration deadline

23 May 2025

Fees

  • CHF 1300.- for the full course
  • CHF 650.- (reduced rate)
  • An additional CHF 200.- for the Micro-credential

Objectives

This course provides practical experience in managing Generative AI systems and develops skills to assess various Generative AI models. It focuses on the deployment, monitoring, optimization, and scaling of large language models (LLMs) in production using Large Language Model Operations (LLMOps). A brief grounding in Machine Learning Operations (MLOps) ensures a foundational understanding before diving into LLM-specific challenges such as performance tuning, cost efficiency, observability, LLM Agents, and prompt management. Participants will gain experience with common commercial and open-source cloud-based LLM tooling and platforms, enabling them to build scalable, efficient, and cost-effective LLM applications.

Audience

This session is for IT professionals who manage AI systems. Participants should have a foundational understanding of machine learning (particularly LLMs), intermediate-level Python skills, basic API usage, and a laptop with a working Python environment. Prior experience with MLOps, cloud platforms, or prompt engineering is helpful but not mandatory.

Learning outcomes

This course equips participants with key competencies in deploying, monitoring, optimizing, and scaling large language models (LLMs) in operational environments.

The key competencies are:

  • Foundational Knowledge of MLOps and LLMOps: Understanding the lifecycle of LLMs including pre-training, fine-tuning, inference, and monitoring; distinguishing between MLOps and LLMOps.
  • Deployment Strategies for LLMs: Learning various hosting options, tackling scaling issues like model parallelism, and exploring cost optimization strategies such as caching.
  • Prompt Engineering and Management: Gaining skills in prompt engineering for enhanced efficiency, accuracy, and cost-effectiveness; managing and versioning prompts to ensure consistency and maintainability.
  • Performance and Cost Optimization: Techniques to optimize inference speed and reduce operational costs.
  • Observability and Monitoring: Setting up systems to track LLM performance metrics such as latency and token usage, detecting errors, and implementing feedback-loop mechanisms.

Programme

Day 1 - LLMOps Fundamentals

  • Introduction to MLOps and LLMOps
  • LLM Deployment Strategies and Infrastructure
  • Prompt Engineering and Prompt Management
  • Prompt Management in LLMOps
  • Hands-on Session: Deploying an LLM and Experimenting with Prompts

Day 2 - Advanced LLMOps

  • Performance and Cost Optimization in LLMOps
  • Observability, Monitoring, and Feedback Loops
  • Security, Compliance, and Responsible AI
  • Agentic workflows
  • Hands-on Capstone: Deploying a Scalable LLM Application

Director(s)

Prof. Giovanna DI MARZO SERUGENDO, Centre universitaire d'informatique (CUI), University of Geneva

Coordinator(s)

Dr Jose Luis FERNANDEZ-MARQUEZ, University of Geneva

Date(s)

5 June 2025

Speakers

Dr Hisham MOHAMED, University of Geneva

Description

1. Introduction to MLOps and LLMOps

  • MLOps vs. LLMOps: Key differences and challenges
  • LLM lifecycle: Pre-training, fine-tuning, inference, and monitoring
  • Challenges in deploying LLMs: Scalability, latency, cost, observability

2. LLM Deployment Strategies and Infrastructure

  • Hosting options: API-based vs. self-hosted
  • Scaling challenges: Model parallelism, quantization, and distillation
  • Cost optimization strategies: Caching, batch inference, serverless deployments (see the caching sketch after this list)
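
To make the caching point concrete, here is a minimal sketch of an exact-match response cache. The call_llm function is a hypothetical placeholder for any provider's completion API, not a specific library call.

import hashlib

# Hypothetical stand-in for a real completion call (hosted API or
# self-hosted model); replace with your client of choice.
def call_llm(prompt: str) -> str:
    return f"model response for: {prompt}"

_cache: dict[str, str] = {}

def cached_completion(prompt: str) -> str:
    # Identical prompts are common in production (FAQ-style queries),
    # so even a simple exact-match cache avoids repeated paid calls.
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = call_llm(prompt)  # the model is called only on a miss
    return _cache[key]

print(cached_completion("What is LLMOps?"))  # miss: calls the model
print(cached_completion("What is LLMOps?"))  # hit: served from the cache

Real deployments typically move the cache into a shared store such as Redis and may add semantic (embedding-based) matching, but the cost logic is the same.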

3. Prompt Engineering and Prompt Management

  • Prompt Engineering Basics
  • Optimizing prompts for efficiency, accuracy, and cost
  • Structuring prompts for different use cases (see the sketch after this list)
  • Fine-tuning vs. prompt engineering: When to use which
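
As a sketch of structuring prompts for different use cases, the snippet below separates the fixed structure (role, constraints, output format) from the variable input. The template names and wording are illustrative, not taken from the course materials.

from string import Template

TEMPLATES = {
    "summarize": Template(
        "You are a concise technical writer.\n"
        "Summarize the following text in at most $max_words words:\n\n$text"
    ),
    "classify": Template(
        "Classify the following support ticket as one of: "
        "billing, bug, feature_request.\n"
        "Answer with the label only.\n\nTicket: $text"
    ),
}

def build_prompt(task: str, **fields: str) -> str:
    # Only the variable input lives in `fields`; the structure is reusable.
    return TEMPLATES[task].substitute(**fields)

print(build_prompt("summarize", max_words="50", text="LLMOps covers deployment..."))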

4. Prompt Management in LLMOps

  • Why prompt management matters: Consistency, scalability, and maintainability
  • Versioning and tracking prompts: Best practices
  • Using prompt management tools
  • A/B testing prompts: Measuring effectiveness and iterating (see the sketch after this list)
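
A minimal sketch of prompt versioning and A/B routing, assuming a toy in-memory registry; dedicated prompt-management tools offer the same primitives backed by storage and analytics.

import random
from dataclasses import dataclass

@dataclass(frozen=True)
class PromptVersion:
    name: str      # logical prompt, e.g. "ticket_classifier"
    version: str   # explicit version so regressions can be rolled back
    template: str

REGISTRY = {
    ("ticket_classifier", "1.0"): PromptVersion(
        "ticket_classifier", "1.0", "Classify this ticket: {text}"),
    ("ticket_classifier", "1.1"): PromptVersion(
        "ticket_classifier", "1.1",
        "Classify this ticket as billing, bug, or feature_request: {text}"),
}

def pick_for_ab_test(name: str, split: float = 0.5) -> PromptVersion:
    # Route a share of traffic to the candidate version; log `version`
    # with every response so effectiveness can be compared offline.
    version = "1.1" if random.random() < split else "1.0"
    return REGISTRY[(name, version)]

chosen = pick_for_ab_test("ticket_classifier")
print(chosen.version, chosen.template.format(text="I was charged twice"))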

5. Hands-on Session: Deploying an LLM and Experimenting with Prompts

  • Deploying an LLM (a minimal serving sketch follows this list)
  • Experimenting with different prompt engineering techniques
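
For orientation before the hands-on session, here is a minimal serving sketch, assuming FastAPI is installed (pip install fastapi uvicorn) and with the model call stubbed out as a hypothetical placeholder.

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class GenerateRequest(BaseModel):
    prompt: str

# Hypothetical placeholder; swap in a hosted API or a self-hosted model.
def call_llm(prompt: str) -> str:
    return f"echo: {prompt}"

@app.post("/generate")
def generate(req: GenerateRequest) -> dict:
    return {"completion": call_llm(req.prompt)}

# Run with: uvicorn app:app --reload  (if this file is saved as app.py)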

Date(s)

6 June 2025

Description

1. Performance and Cost Optimization in LLMOps

  • Optimizing inference speed
  • Cost-effective LLM usage: Reducing API calls, caching, batch inference (see the batching sketch after this list)
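
To illustrate the batching point: one call that labels N inputs replaces N separate calls, so shared instructions are paid for once. The call_llm function is a hypothetical placeholder returning a canned reply so the sketch runs as-is.

def call_llm(prompt: str) -> str:
    # Hypothetical placeholder standing in for a real completion call.
    return "1. positive\n2. negative\n3. positive"

def classify_batch(texts: list[str]) -> list[str]:
    # One request for N inputs: fewer round trips, lower per-request cost.
    numbered = "\n".join(f"{i + 1}. {t}" for i, t in enumerate(texts))
    prompt = (
        "Label each review below as positive or negative.\n"
        "Answer as a numbered list with one label per line.\n\n" + numbered
    )
    reply = call_llm(prompt)
    return [line.split(". ", 1)[1] for line in reply.splitlines()]

print(classify_batch(["Great tool!", "Crashed twice.", "Works as advertised."]))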

2. Observability, Monitoring, and Feedback Loops

  • Tracking LLM performance: Latency, token usage, response quality (see the logging sketch after this list)
  • Detecting hallucinations and errors
  • Implementing feedback-loop mechanisms
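
A minimal sketch of the tracking wrapper referenced above, using only the standard library; real deployments read exact token counts from the provider's response rather than estimating them.

import time
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("llm-metrics")

def call_llm(prompt: str) -> str:
    # Hypothetical placeholder; real clients also return token counts.
    return "a model response"

def tracked_completion(prompt: str) -> str:
    # Wrap every model call so latency and (approximate) token usage are
    # logged; these records feed dashboards and alerting.
    start = time.perf_counter()
    response = call_llm(prompt)
    latency_ms = (time.perf_counter() - start) * 1000
    approx_tokens = (len(prompt) + len(response)) // 4  # rough heuristic
    log.info("latency_ms=%.1f approx_tokens=%d", latency_ms, approx_tokens)
    return response

tracked_completion("Summarize our Q2 incident report.")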

3. Security, Compliance, and Responsible AI

  • Data privacy in LLMOps: PII redaction, secure API handling (see the redaction sketch after this list)
  • Regulatory compliance
  • Bias detection and mitigation strategies
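
As a toy illustration of PII redaction: scrub inputs before they reach the model or the logs. The regexes are deliberately simple; production systems rely on NER models or cloud DLP services, but the flow is the same.

import re

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact_pii(text: str) -> str:
    # Replace each detected entity with a label so downstream systems
    # never see the raw value.
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact_pii("Contact Anna at anna@example.com or +41 22 379 00 00."))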

4. Agentic workflows

  • What are LLM Agents?
  • Key components of an LLM Agent (a minimal agent-loop sketch follows this list)
  • Introduction to LangChain & LlamaIndex
  • Use cases
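
A framework-free sketch of the agent loop that libraries like LangChain implement: the model picks a tool, the tool result is fed back, and the loop repeats until a final answer. The model's decisions are faked here so the example runs without an API key.

def calculator(expression: str) -> str:
    # Toy tool; never eval untrusted input in production.
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def fake_llm(history: list[str]) -> str:
    # Stand-in for the model's decision; a real agent parses this from
    # the LLM's tool-call output.
    if not any("TOOL_RESULT" in m for m in history):
        return "CALL calculator 2 + 2 * 10"
    return "FINAL The answer is 22."

def run_agent(question: str, max_steps: int = 5) -> str:
    history = [f"USER {question}"]
    for _ in range(max_steps):
        decision = fake_llm(history)
        if decision.startswith("FINAL"):
            return decision.removeprefix("FINAL ").strip()
        _, tool_name, arg = decision.split(" ", 2)
        history.append(f"TOOL_RESULT {TOOLS[tool_name](arg)}")
    return "step limit reached"

print(run_agent("What is 2 + 2 * 10?"))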

5. Hands-on Capstone: Deploying a Scalable LLM Application

  • Building an LLM-powered chatbot, search engine, or knowledge assistant
  • Implementing monitoring & logging for performance tracking
  • Optimizing deployment for cost and scalability

Dr Hisham MOHAMED, University of Geneva

Hisham is an AI and machine learning expert with over 10 years of experience in machine learning, software engineering, and big data. With a PhD in Computer Science from the University of Geneva, he has led high-impact projects and built and managed diverse teams. Hisham has deep experience in deploying and scaling AI systems. In this session, he will focus on LLMOps, sharing insights on managing, optimizing, and operationalizing large language models in real-world applications.
