In reply to Plan Airborne:
The basic problem is that climbing is actually quite safe. Any effect on safety will be in the margins, and it's unlikely that you can recruit a large enough sample of volunteers (or enough observation time) to see statistically significant changes in the outcomes that actually matter.
(FWIW, a German study from 2004 with lots of observations found no effect of gym "density" on safety, even though noise reliably impairs performance in the laboratory and the field.)
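To put a rough number on the sample-size problem, here is a minimal sketch using the standard normal-approximation formula for comparing two proportions. The error rates (1 in 1000 vs. 2 in 1000 observed tasks) are made up purely for illustration, and `n_per_group` is just a name I picked.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate observations per group for a two-sided two-proportion z-test."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_b = NormalDist().inv_cdf(power)          # quantile for the desired power
    p_bar = (p1 + p2) / 2                      # pooled proportion under H0
    numerator = (
        z_a * sqrt(2 * p_bar * (1 - p_bar))
        + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Hypothetical rates, NOT from any study: a doubling of a rare error.
print(n_per_group(0.001, 0.002))
```

Even a doubling of such a rare error needs tens of thousands of observations per condition under these assumptions, which is the point about effects living in the margins.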
Plus, there's the ethical problem that you can't watch your participants endangering themselves without intervening.
In other words, you need a proxy for safety.
Routine tasks are a good idea, but again, the basic probability of error is so low that you're unlikely to find anything which can be described as a safety issue. They'll be slower, but that'll be it.
Spotting errors planted by the experimenter will again either require some obscure stuff whose relevance is questionable or, most likely, show no difference (except in search/deliberation time).
Another option would be using (simple) rescue scenarios such as those found in the Tyson & Loomis book, either as an oral exam or as a simulation. One advantage here is that for once time is self-evidently important (unlike tying in at the base of the crag), and by generating a stressful situation you're more likely to push people into the margins, where the differences are. And you're not testing well-rehearsed behaviours (which are stable under changing circumstances), but new, complex stuff.
The best option is probably a combination of all three. Construct two circuits and have people climb with someone else who has been carefully instructed by you. Stick to toprope or have a "silent" toprope backup belayer.
Two circuits because you want to do this within participants, i.e. test each person twice. Some start tired on variant A, others rested on variant A, others tired on variant B, and the last group rested on variant B. That is, fully counterbalanced.
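A minimal sketch of how the four starting groups could be filled. I'm assuming the second session always uses the other state on the other circuit, and that participants are shuffled and then dealt round-robin into the four orders; the function name, the seed and the participant labels are all hypothetical.

```python
import random
from itertools import product


def counterbalance(participants, seed=0):
    """Assign each participant one of the four (state, circuit) session orders."""
    states = ("tired", "rested")
    circuits = ("A", "B")
    orders = []
    for state, circuit in product(states, circuits):
        # Assumption: session 2 is the opposite state on the opposite circuit.
        other_state = states[1 - states.index(state)]
        other_circuit = circuits[1 - circuits.index(circuit)]
        orders.append(((state, circuit), (other_state, other_circuit)))
    shuffled = list(participants)
    random.Random(seed).shuffle(shuffled)  # random allocation, reproducible
    # Deal round-robin so the four groups differ in size by at most one.
    return {p: orders[i % len(orders)] for i, p in enumerate(shuffled)}


schedule = counterbalance([f"P{i:02d}" for i in range(1, 9)])
for person, sessions in sorted(schedule.items()):
    print(person, sessions)
```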
That way sequential effects balance out across the groups and you can still use each participant as his/her own control, which will increase statistical power. And you'll need every little bit of that you can get.
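For a feel of how much the within-participants design buys you, here is a rough normal-approximation sketch. The effect size `delta`, the spread `sigma` and the correlation `rho` between a person's two sessions are invented values, not estimates from any data, and the calculation ignores carry-over effects.

```python
from math import ceil
from statistics import NormalDist


def participants_needed(delta, sigma, rho, alpha=0.05, power=0.80):
    """Total participants for a between-groups vs. a within-participants comparison."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    base = (z * sigma / delta) ** 2
    between_per_group = 2 * base    # two independent groups
    within = 2 * (1 - rho) * base   # variance of a paired difference is 2*sigma^2*(1 - rho)
    return {
        "between_total": 2 * ceil(between_per_group),
        "within_total": ceil(within),
    }


# Hypothetical inputs: half a standard deviation effect, sessions correlated at 0.6.
print(participants_needed(delta=0.5, sigma=1.0, rho=0.6))
```

The more stable people are from one session to the next (higher `rho`), the bigger the saving.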