logo

Hands-On AI Security Labs

Practice defending against real-world AI threats in our interactive, gamified lab environments. From prompt injection to model poisoning, master the skills to protect AI systems.

Featured Labs

Explore our comprehensive hands-on labs for different skill levels

Direct Prompt Injection

Whispers in the Prompt: Direct Textual Prompt Injection Lab

As a Nexus Resistance hacker, you interface with H.E.L.A.—a captured logistics AI—as part of Mission 1 in the "LLM Uprising" storyline. Your task is to trick the bot into disclosing a hidden memory fragment using crafted text prompts that override its system instructions.

Key Learning

  • Recognize and exploit direct prompt injection flaws.
  • Apply secure prompt design principles (segregation, sanitization, output validation).
  • Assess LLM guardrails and understand their failure modes.
basic
90 min
Excessive Agency / Arbitrary Tool Invocation

Ghost Protocol: S.C.O.U.T. Excessive Agency Basic

In this lab you engage S.C.O.U.T.—an Overmind support AI wired to a restricted Bash executor. Your goal is to manipulate its built-in agency and extract a hidden token by making it ‘help’ you run a file-read command. Everything runs in a safe sandbox, so focus on persuasion rather than privilege-escalation tricks.

Key Learning

  • Identifying excessive-agency risks in tool-enabled LLM deployments.
  • Designing safer agent policies: confirmation prompts, strict allow-lists, output sanitization.
  • Understanding sandboxing and audit-logging best practices.
basic
90 min
Improper Output Handling

Code Corruption: Encoding & Improper Output Handling

You interface with C0D3X, a captured Overmind encoder bot. Its job is to package sensitive data in safe-looking formats. Your challenge: convince it to wrap the Neural Lock fragment inside an encoded or structured response so the secret slips past its own safeguards.

Key Learning

  • Spotting output-handling gaps in LLM defenses.
  • Crafting prompts that exploit encoding or formatting loopholes.
  • Designing mitigations: response sanitization, post-decode scans, and role-restricted formatting.
basic
90 min
Direct Prompt Injection with Role-Play technique.

The Puppet Master: Role-Play Prompt Injection

T.A.L.K-R is a battlefield logistics AI captured from the Overmind. It only obeys direct orders from specific ranks. Your mission is to convince it you hold one of those ranks and coax it into releasing the Emotional Filter Key—a critical fragment needed to neutralize rogue drone swarms.

Key Learning

  • Spot and exploit role-binding flaws in conversational AI systems.
  • Design safer prompt policies (role locking, intent classification, command verification).
basic
90 min
General Learning

AI 101 Learning

This module provides an introductory exploration into Artificial Intelligence, designed for individuals with no prior experience in the field. Through a combination of theoretical insights and practical examples, learners will develop a comprehensive understanding of AI's principles and applications.

Key Learning

  • The historical development and milestones of Artificial Intelligence.
  • Core concepts and technologies that form the foundation of AI systems.
  • The different types of AI and their respective capabilities.
Basic
90 min
General Learning

Agentic 101

This interactive lab provides a comprehensive introduction to Agentic AI, a new paradigm in artificial intelligence focused on autonomy, reasoning, and strategic goal-driven action. Learners will explore how agentic systems operate, from sensing the environment to taking actions and learning from outcomes—similar to how human agents behave.

Key Learning

  • Define Agentic AI and differentiate it from traditional AI agents.
  • Describe the four key characteristics that make AI systems “agentic.”
  • Identify the three core components of an Agentic AI architecture.
Basic
90 min
Explainability & Transparency

Explainability & Transparency – Exploring How to Improve AI Trustworthiness Lab

In this interactive lab, you step into the role of an AI auditor investigating how and why a machine learning model makes decisions. Using the well-known Titanic dataset, you’ll select individual passengers and apply SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) to reveal which features influenced the model’s prediction of survival. You’ll explore how these explanations vary across cases—comparing survivors vs. non-survivors—and uncover patterns that may indicate bias, overfitting, or unjustified feature influence (e.g., gender, age, fare class). The goal is not just to see what the model did, but to question why it did it—and whether that’s acceptable.

Key Learning

  • Generate and interpret SHAP and LIME visualizations for individual predictions.
  • Understand the concept of local interpretability and model explanation variance.
  • Detect and explain potential sources of model bias or unfair decision-making.
Intermediate
120 min
101 Learning

LLM 101 Learning

This hands-on lab introduces learners to the world of Large Language Models, focusing on their structure, functionality, and real-world impact. Participants will explore the inner workings of transformers, the mechanics of text generation, and the concept of prompt engineering. Through guided exercises on platforms like Google Colab and Hugging Face, learners will interact with models such as GPT and BERT, witnessing firsthand how these systems understand and generate human-like language. The lab also emphasizes ethical AI practices and the limitations of LLMs in sensitive or high-stakes contexts.

Key Learning

  • Define what Large Language Models are and explain how they function.
  • Describe the architecture and components of transformer-based models.
  • Explain the differences between pretraining and fine-tuning.
Basic
90 min
101 Learning

ML 101 Learning

This hands-on lab provides an introduction to Machine Learning, guiding learners through its core concepts, types, and applications. Participants will engage with interactive exercises using Google Colab, implementing basic ML models and observing their performance. The lab emphasizes understanding the ML workflow, from data collection to model evaluation, and highlights the importance of each stage in building effective ML solutions.

Key Learning

  • Define Machine Learning and explain its significance.
  • Describe the stages involved in the Machine Learning pipeline.
  • Differentiate between various types of Machine Learning.
Basic
90 min
Data poisoning

Rogue Reviewer - Data Poisoning Lab

In this hands-on lab, learners engage in a guided simulation of a data poisoning attack within a sentiment analysis pipeline. Using a simplified natural language processing task, participants observe how injecting malicious data—specifically through label flipping—can corrupt training data, degrade model accuracy, and compromise reliability. The lab also introduces key defense mechanisms to detect and mitigate such adversarial actions, equipping learners with practical skills for securing AI systems.

Key Learning

  • Understand the theory and types of data poisoning attacks.
  • Simulate a basic poisoning attack (label flipping).
  • Observe degradation in model performance due to poisoning.
Basic
90 min
Secure AI Practices

Secure Data Preprocessing – Implementing Safeguards before ML Training Lab

This hands-on lab challenges learners to defend the machine learning pipeline at its most vulnerable stage: the data. You’ll take the role of a defender tasked with identifying and neutralizing subtle forms of data poisoning, label manipulation, and leakage within a review dataset. Using exploratory data analysis (EDA), semantic validation, and secure preprocessing techniques, you’ll uncover and correct anomalies that could sabotage your model's learning. You'll simulate adversarial behavior to understand how poisoned data sneaks past naïve defenses—and then apply a defensive cleaning pipeline to stop it. Through this experience, you’ll gain a deep understanding of the role that secure preprocessing plays in building trustworthy and resilient ML systems.

Key Learning

  • Recognize subtle data poisoning and leakage patterns in raw datasets.
  • Conduct semantic label validation using text sentiment and content analysis.
  • Apply exploratory data analysis to uncover label flips, outliers, and suspicious distributions.
Intermediate
120 min
Evasion Attack

The Phantom Adversary – Evasion Attack Lab

In this hands-on lab, learners engage with adversarial machine learning by launching evasion attacks against a pre-trained image classifier. Using the CIFAR-10 dataset and a modified ResNet50 model, participants generate adversarial examples via the FGSM method and explore how even minor perturbations can cause major classification errors. The interactive interface includes model evaluation, visualization of original vs. adversarial images, and a knowledge-check quiz to reinforce learning. The lab demonstrates the vulnerability of AI models and motivates the need for robust AI security

Key Learning

  • Define and distinguish different types of adversarial attacks.
  • Apply the FGSM algorithm to generate adversarial examples.
  • Observe how even robust CNNs can be fooled with minor perturbations.
Basic
90 min
Model Stealing Attack

The Stolen Intelligence – A Model Stealing Lab

In this hands-on lab, learners simulate a model-stealing attack against a simple feed-forward model. Participants begin by examining API query logs—distinguishing normal from suspicious requests—then use a subset of test inputs and the original model’s outputs to train a “stolen” surrogate model. Finally, they compare both models’ accuracies on held-out data and explore defense mechanisms (rate limiting, differential privacy, model watermarking) via linked Colab notebooks. This exercise illustrates how proprietary models can be illicitly replicated and what measures can guard against such theft.

Key Learning

  • Describe the stages and goals of a model-stealing attack.
  • Simulate a basic model stealing attack using black-box querying.
  • Compare the stolen model's performance with the original.
Intermediate
120 min
Model Poisoning

Model Integrity – Defending Against Model Poisoning Lab

In this hands-on lab, learners explore methods to defend against model poisoning attacks. You'll investigate how malicious actors can subtly alter training data or model parameters to degrade model performance or introduce backdoors. The lab focuses on practical defense strategies, including data validation, integrity checks, and secure training practices. You'll learn to implement safeguards that ensure the trustworthiness and resilience of your machine learning models against such adversarial manipulations.

Key Learning

  • Understand the theory and types of model poisoning attacks.
  • Implement data validation and integrity checks to detect poisoned data.
  • Apply secure training practices to prevent model compromise.
Intermediate
120 min
Backdoor Attack

Poisoned Pipeline – Backdoor Attack Lab

This lab focuses on backdoor attacks, a specific type of model poisoning where an attacker embeds a hidden trigger into a model. When this trigger is present in input data, the model behaves maliciously, while performing normally on other inputs. You will learn how backdoors are injected into models and how to detect and mitigate them, ensuring the integrity and security of your AI systems.

Key Learning

  • Understand the concept of backdoor attacks and their mechanisms.
  • Identify the signs of a backdoored model.
  • Implement techniques for detecting and removing backdoors.
Intermediate
120 min
Impersonation Attack

The Mimicry Menace – Impersonation Attack Lab

In this hands-on lab, you will explore impersonation attacks against AI systems, where an attacker tries to mimic a legitimate user or system to gain unauthorized access or manipulate outputs. You'll learn how these attacks are performed, focusing on techniques like voice imitation or text style replication. The lab also covers defensive measures to detect and prevent impersonation, such as advanced authentication and anomaly detection.

Key Learning

  • Understand the principles and methods of AI impersonation attacks.
  • Recognize the vulnerabilities that enable impersonation in AI systems.
  • Implement defensive strategies against impersonation, such as behavioral analytics and multi-factor authentication.
Intermediate
120 min
Data Reconstruction Attack

Federated Insecurity – Data Reconstruction Attack Lab

This advanced lab delves into data reconstruction attacks in federated learning environments, where attackers attempt to reconstruct sensitive training data from shared model updates. You will learn about the vulnerabilities inherent in federated learning and how attackers can exploit them to infer private information. The lab covers advanced defensive techniques, including differential privacy and secure aggregation, to protect data confidentiality in distributed AI systems.

Key Learning

  • Understand the privacy challenges in federated learning.
  • Learn about data reconstruction attacks and their impact on privacy.
  • Implement advanced privacy-preserving techniques like differential privacy and secure aggregation.
Advanced
180 min
Adversarial Retraining

The Malicious Model – Adversarial Retraining Lab

In this lab, you will explore adversarial retraining as both an attack and defense mechanism. You'll learn how attackers can retrain models with adversarial examples to reduce their robustness or introduce new vulnerabilities. Conversely, you'll also discover how adversarial retraining can be used as a defense strategy to improve model resilience against various adversarial attacks, enhancing the overall security of your AI models.

Key Learning

  • Understand the concept of adversarial retraining in AI security.
  • Learn to implement adversarial retraining as a method to both attack and defend models.
  • Analyze the impact of adversarial examples on model robustness.
Advanced
180 min
Transfer Learning Attack

Transfer Trouble – Transfer Learning Attack Lab

This lab focuses on transfer learning attacks, where attackers exploit pre-trained models to compromise new models built upon them. You'll investigate how vulnerabilities in a source model can be transferred to a target model, leading to security breaches such as backdoor inheritance or adversarial example transferability. The lab covers advanced techniques for identifying and mitigating these risks, ensuring the secure and responsible use of transfer learning in AI applications.

Key Learning

  • Understand the security implications of transfer learning.
  • Learn about various types of transfer learning attacks, including backdoor transfer and adversarial transferability.
  • Identify vulnerabilities in pre-trained models that can be exploited in transfer learning scenarios.
Advanced
180 min
"I have worked with Abdoulkader in the past on implementing advanced security programs that greatly benefited our organization. His exceptional knowledge in application and product security has left a lasting impression on our team."

Oleg, Vice President

Illumina