FLAIR Research Lab - Blog Posts

Benchmark June 2025

AH2AC2: Ad-Hoc Human-AI Coordination Challenge

AH2AC2 is a benchmark for evaluating how well AI agents can coordinate with humans in the complex, partially observable, cooperative game Hanabi, especially when access to human data is limited.

Reinforcement Learning March 2025

Fixing TD Part II: Overcoming the Deadly Triad

In Part I of this series, we characterised the stability of TD through the TD Jacobian. In this part, we build on that analysis to better understand the reasons for instability, then propose a surprisingly simple architectural solution that can stabilise TD.

Reinforcement Learning March 2025

Fixing TD Part I: Why is Temporal Difference Learning so Unstable?

In this first post of a two-part series, we take a deep dive into the challenges of developing stable TD algorithms and introduce a powerful tool for formally characterising TD's instability.

Security May 2024

PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition

We prompt the LLM to repeat its own output, which protects the model from adversarial attacks by avoiding the auto-regressive trap. We name this method PARDEN. PARDEN is particularly effective in the relevant regime of high True Positive Rate and low False Positive Rate.

Environment Design April 2024

Refining Minimax Regret for Unsupervised Environment Design

We show that minimax regret has a notable failure case when some environments have high irreducible regret. Our refined solution concept addresses this problem, and ReMiDi achieves higher empirical performance in such cases.

Multi-Agent November 2023

JaxMARL: Multi-Agent RL Environments and Algorithms in JAX

We present JaxMARL, a library of multi-agent reinforcement learning (MARL) environments and algorithms based on end-to-end GPU acceleration that achieves up to 12500x speedups.

