I don't want to sound defeatist, but are we asking for an accuracy that is unattainable? Is it too hard to map the CCDs?
Would we need to ensure that the spec falls onto the same pixels each time? Maybe each pixel has a different response at each wavelength, so if the spec image moves between two exposures, the same spectrum would give a different result.
A one-dimensional pixel array should be easier to calibrate.
One way to check this may be to let the spec drift up across the camera array and measure the levels along the extended Ha line, for example. Do they change along the Ha line? (Illustrated by Terry's image)
I.e. I think we need to get the image to land on the same pixels each time.
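To make the drift check a bit more concrete, here is a rough sketch in Python. It assumes the drifted frame is available as a plain list of rows, with the dispersion running along the columns, so each row is the same spectrum landing on a different set of pixels. The function names, the column choices and the numbers in the frame are all made up for illustration; nothing here comes from a real exposure. For each row it divides the mean level in the Ha columns by the mean level in some nearby continuum columns, which should cancel any brightness change during the drift. If the pixels all responded the same way, that ratio would be the same in every row.

```python
from statistics import mean, pstdev


def line_ratio_per_row(image, line_cols, cont_cols):
    """Return, for each row, mean level in the line columns / mean continuum level."""
    ratios = []
    for row in image:
        line_level = mean(row[c] for c in line_cols)
        cont_level = mean(row[c] for c in cont_cols)
        ratios.append(line_level / cont_level)
    return ratios


def fractional_scatter(ratios):
    """Row-to-row scatter of the ratios, as a fraction of their mean."""
    return pstdev(ratios) / mean(ratios)


# Hypothetical 4-row x 6-column frame (made-up counts).
# Columns 2-3 sit on the Ha line, columns 0, 1, 4, 5 on the continuum.
frame = [
    [100, 100, 40, 42, 100, 100],
    [200, 200, 80, 84, 200, 200],
    [150, 150, 60, 63, 150, 150],
    [100, 100, 45, 47, 100, 100],  # this row responds differently in the line
]

ratios = line_ratio_per_row(frame, line_cols=range(2, 4), cont_cols=[0, 1, 4, 5])
scatter = fractional_scatter(ratios)

for row_index, ratio in enumerate(ratios):
    print(f"row {row_index}: line/continuum = {ratio:.3f}")
print(f"fractional scatter = {scatter:.3f}")
```

If the scatter from a real drifted frame came out well above the noise, that would suggest the response does change from row to row, and that landing on the same pixels each time matters. Note that pixel differences in the continuum columns feed into the ratio as well, so this only shows whether there is a variation, not where it comes from.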
I hope this makes sense.