ELK Results

1st Place: $15,000

For a proposal identifying direct translators by penalizing large changes in output given changes in data quality.

For a proposal predicting the predictor’s output given part of its initial state.

For a proposal extending the idea of penalizing excessive predictor activations by removing any random activations.

Jack Edwards, for an approach to distinguish translators from human simulators

Peter Berggren, for an approach utilizing a debate between multiple agents

Raymond Douglas, for a strategy penalizing the parts of a reporter that rely on human simulation

Ulisse Mini and Luke Bousfield, for a proposal drawing from shard theory, a novel idea in alignment research