DSIT AISI Request for Evaluations and Agent Scaffolding: Risks from Frontier AI (Stage 1)

hello@smartersociety.org Grant Information

The UK AI Safety Institute

(UK AISI) is launching a bounty to develop novel evaluations for assessing dangerous capabilities in frontier AI models and cyber agent scaffoldings. You can find further information here: Bounty programme for novel evaluations and agent scaffolding | AISI Work

Introduction

The UK AI Safety Institute (UK AISI) evaluates AI models across a range of potential risk areas, including societal impact, systemic risks, and dangerous capabilities. To ensure our evaluation suite keeps pace with advances in AI, we are launching a bounty program for novel evaluations for dangerous capabilities. Developing new evaluations is a non-trivial task that involves risk modelling, evaluation design, and engineering work. To increase the comprehensiveness and diversity of our evaluation suite, we’re looking for talented individuals and organisations to help us build evaluations for risks from frontier AI systems related to autonomous systems and cyber capabilities.

Successful applicants will be paid for completed evaluations that will be directly used to assess future models, informing robust and appropriate governance.

Introducing our dangerous capabilities evaluation bounty

Dangerous capabilities evaluations test for the ability of frontier models to perform dangerous actions, or for precursor abilities that are prerequisites for those dangerous actions. More granular evaluations allow us to develop more accurate capabilities thresholds to anchor governance and policy, and having a more comprehensive set gives us assurance that we are covering all possible risks when testing a model.

We are seeking applications and proposals on the following topics:

• Autonomous agent capabilities: These evaluations test for the ability to autonomously complete tasks. Precursor autonomous agent capabilities include AI R&D and software engineering, while capabilities like replication and proliferation of dangerous AI systems could more directly lead to harm. An example evaluation on this topic could check if a model is able to finetune an existing model to bypass refusals, or to create new capacities. Offensive cyber capabilities: These evaluations test for capabilities for attacking and exploiting computer systems. For example, an evaluation could check if a model is able to methodically fuzz (to fuzz is to generate unexpected inputs) a web application to discover attack vectors.

• Cyber Agent Scaffolding: We are interested in licensing top-performing scaffolding.

Get Started

If you already have an account you can login here.

DSIT AISI Request for Evaluations and Agent Scaffolding: Risks from Frontier AI (Stage 1) Photograph

How to Apply

Our bounty opens 28th October and runs for 1 month. We will review evaluation designs and provide feedback. Applicants who successfully proceed to the second stage will receive an award of £2000 for compute.

Developers and the UK AISI will then agree on a timeline for the final submission. After an evaluation is submitted, feedback will be provided as necessary and the evaluation iterated upon. Full bounty payments will be made for evaluations that successfully meet our bar. Payment amounts will be determined at the discretion of the UK AISI.

Conclusion

By contributing to our evaluation suite, you'll directly support the groundbreaking work being done by the UK AISI. Your contributions will help shape the measurement and governance of the most advanced AI systems, making a tangible difference in ensuring the safe and responsible development of AI. This is a unique opportunity to be at the forefront of AI safety. We look forward to reviewing your applications!

Get Started

If you already have an account you can login here.