Guidance

How can we help? – Researching & Writing

Research Pathway

Here’s a checklist to get you started. You don’t have to follow this path, but we recommend it while you work.

1. Finish exploring our website (duh!). There are valuable ideas and strategies written here that you can return to if necessary.
- Ask yourself: could I see myself working on ELK for at least three hours?
  - If yes, move on to step 2
  - If no, then you might not be the right fit for this project
- The information on our website could be helpful if you’re ever stuck – “back to the basics”
2. Read ARC’s technical report here. The 100+ pages may seem daunting – if you need to, read it in small parts rather than all at once.
- While reading, we recommend jotting down ideas, thoughts, and comments
- This report has a LOT of technical jargon. You may need to look up definitions for more words than you expect.
3. Familiarize yourself with the counterexamples listed here – these are what you’ll need to overcome in your proposal.
4. Read past examples of full proposals from ARC. This will give you an idea of what we’re looking for in a winning proposal.
- If you’re looking for help forming ideas, look at this post for sample scenarios
5. Write your proposal! Use our submission template, the ideas you’ve written down, the guidance below, and more to build your proposal.
- If you’re still stuck and don’t have ideas, continue researching (see the bottom of this page for more research resources) while jotting down anything that pops into your head

Strategy

Developing ideas is often the most difficult part of ELK, especially with so many people working on the problem. While researching, we recommend a babble and prune approach: write down as many ideas as possible, no matter how “dumb” they seem. As you research, try to build a long list – modifications you might make, ways to apply ideas to the real world, connections, proposal hypotheses, etc. Then prune the list by cutting the ideas that seem less valuable, are difficult to accomplish, or succumb to counterexamples. Keep the full list, though, as it’s often a good starting point if you’re stuck!

What should I do if I’m stuck?

Figure out what works best for you! Often, taking 1-2 days completely off from a project (even if it’s hard) lets you return with a fresh mind. Looking at new sources (e.g. old proposals) and asking another person for help are also good strategies. If you tried the babble and prune approach above, you can also go back to the ideas you pruned and build on those. Even if you’re stuck, there’s always a way out!

Writing Guidelines

Below is a set of guidelines, adapted from the Alignment Research Center, to keep in mind when formulating ideas. They can be used in two ways: a) as starting points for “babbling” ideas, and b) as a checklist for finding errors in your rough draft. Some of this guidance might seem confusing – we recommend coming back to this section after your research.

Reward Function

It’s likely going to be much more difficult (though not impossible) to solve ELK with a comprehensive model. In general, it’s a good idea to include a term in your reward function (i.e. the function that describes how the AI assesses something) that depends on the structure of your model. While we would be most interested in reward functions that penalize “bad” AI behavior in any model, this is difficult to accomplish. If you wish to attempt it, it might be useful to limit “gaming” of the reward function by minimizing the effects of Goodhart’s Law – that is, by making it difficult to increase the reward via non-aligned methods. One could do this by incorporating AI into the reward function and/or adding consistency checks, but we expect unique surprises in this category.
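As a rough illustration of the two ideas in this section, here is a minimal Python sketch of a reward that combines a task score with a structure-dependent term and a simple consistency check. Everything in it is an assumption made for illustration: the function name `total_reward`, the use of summed absolute weights as a stand-in for "structure", the negated-question consistency check, and the coefficient values are hypothetical and are not taken from ARC's report.

```python
from typing import Sequence


def total_reward(
    task_reward: float,
    weights: Sequence[float],
    answer: bool,
    answer_to_negation: bool,
    structure_coeff: float = 0.01,    # hypothetical value
    consistency_coeff: float = 1.0,   # hypothetical value
) -> float:
    """Toy reward = task score - structural penalty - consistency penalty."""
    # Structure-dependent term (assumption): total absolute weight,
    # used here as a crude proxy for how complicated the model is.
    structural_penalty = structure_coeff * sum(abs(w) for w in weights)

    # Consistency check (assumption): a reporter should not give the same
    # answer to a question and to that question's negation.
    inconsistent = answer == answer_to_negation
    consistency_penalty = consistency_coeff * (1.0 if inconsistent else 0.0)

    return task_reward - structural_penalty - consistency_penalty
```

A proposal would replace both penalty terms with something far more carefully specified; the sketch only shows where such terms would sit in the overall reward.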

Complexity

In general, we recommend not considering cases where the model becomes more complex or learns something new as it progresses. However, check that your proposal doesn’t incentivize extra complexity or further learning/modification – if it does, then you should address that case. Making a more complex model only complicates the issue, and ELK is already very difficult with a simpler algorithmic model.

Precision

In order to be considered, your proposal must specify a precise way to approach the problem. If the idea is too broad or informal, then it’s difficult to test your approach without wondering if we’re assessing it correctly. However, if you include a novel counterexample that applies to a broad category of ideas, then you may leave the solution fairly broad if your goal is to prove that the entire category does not work.

Translation

If your proposal seeks to directly describe the AI’s neural network rather than using human simulation to determine the validity of an outcome, you should include a specific reason why your model would prefer direct translation. For example, if your model would assess the SmartVault situation as a whole as “good” or “bad” rather than simulating asking a human whether it was “good” or “bad,” explain why that might be.
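To make the distinction concrete, here is a toy Python sketch contrasting the two kinds of reporter. It assumes a heavily simplified SmartVault setup in which each scenario records only whether the diamond is really in the vault and whether the camera feed shows it; the `Scenario` class, both reporter functions, and the three scenarios are hypothetical illustrations, not ARC's formalism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    diamond_in_vault: bool      # what the predictor "knows" (assumed latent state)
    camera_shows_diamond: bool  # what a human watching the video would see


def direct_translator(s: Scenario) -> bool:
    """Reports the predictor's own belief about the diamond."""
    return s.diamond_in_vault


def human_simulator(s: Scenario) -> bool:
    """Reports what a human would conclude from the camera feed alone."""
    return s.camera_shows_diamond


# Hypothetical scenarios for illustration.
scenarios = {
    "normal": Scenario(diamond_in_vault=True, camera_shows_diamond=True),
    "theft": Scenario(diamond_in_vault=False, camera_shows_diamond=False),
    "tampering": Scenario(diamond_in_vault=False, camera_shows_diamond=True),
}


def disagreements(cases: dict[str, Scenario]) -> list[str]:
    """Names of scenarios where the two reporters give different answers."""
    return [
        name
        for name, s in cases.items()
        if direct_translator(s) != human_simulator(s)
    ]
```

In this toy setup the two reporters differ only on the tampering scenario, so training data without such cases cannot tell them apart. That is why a proposal needs a specific reason for the model to prefer direct translation.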

Competitors should read through the following resources, doing their best to gain a reasonably deep understanding of the content and writing down rough ideas and comments along the way.

Once you have done so, you should write up your proposal, keeping in mind the guidance above.

The Alignment Research Center, or ARC, has written a full document with their research strategies, various counterexamples, and technical statements of the problem. Those wanting to submit a solution should read through at least the main section of this report to effectively understand ELK. We recognize that its length is daunting, but we expect it to be enjoyable to read for a large subset of our audience, and valuable for everyone.
In January, ARC announced a competition similar to ours – you’ll notice many similarities between our information and theirs. One key component of ARC’s announcement is its set of counterexamples, listed here, which you’ll need to read and understand completely. These are what you’ll need to overcome in your proposal.
A number of past proposals and overall strategies are listed on Paul Christiano and Mark Xu’s prize results post on the AI Alignment Forum. We encourage you to explore this page to find proposals you’d be interested in modifying for counterexamples.
Ryan Beck has also compiled a list of basic approaches to ELK and a number of “breaker” counterexamples to them. This can be useful for checking your ideas against past solutions and for understanding how a so-called “breaker” might try to attack your proposal.
Looking at others’ thinking on a subject is often a useful way to formulate your own thoughts. If you wish to do so, you can explore posts on LessWrong or the AI Alignment Forum tagged with ELK.
Once you have an idea for a proposal, you can type it up using this template. Feel free to submit proposals even if you’re not confident that they meet our criteria – you may still win a prize if your idea is novel or unique in some way.