2009-04-08

Spirit Sol 94

One thing I tried to do better on RoSE than on any of my previous projects was to write better error messages. I've developed a whole theory about this. They need to make sense at the user level (as opposed to the programmer level), should tell someone not intimately familiar with the software what's wrong and whether they can fix it -- and, if they can fix it, how.

On my way to work, Brian calls me about one of the error messages where I followed almost none of those rules. What the hell was I thinking? Fortunately, there's only one thing the error message can mean, and I'm able to fix the problem over the phone in less than a minute. But I should have done it right in the first place.

I've come in to do another TV interview. This one's with a couple of students from Glendale Community College; the best production in the class will apparently be shown on CNN. (And the interviewer is a beautiful young gray-eyed blonde, so that's nice.) This time I do much better; I talk a little too fast, but I keep it at the right level, and I'm relaxed and open and confident. I think it helps that we're standing, so I feel a little less trapped than I did during the seated interview the other day. It also helps that this isn't my first TV interview, so I feel a little more like I know what I'm doing. I did well, and I'm happy about it.

Today also brings a continuation of an email exchange I've been having with Susan Kurtik. In my last message to her, I'd written, "'Weekend?' What is this ... 'weekend'?" Today she wrote back: "A 'weekend' is when I use my computer at home to do work instead of the one at JPL."

[Speaking of which, this is where I started taking my days off, uh, off. The flight software upload was going on during this period anyway. So the next entry will be Sol 99's, on April 13.]

3 comments:

bert said...

Scott,

Thanks for reiterating the importance of good error messages! At one point, I got so fed up I blogged about it:
http://blog.netherlabs.nl/articles/2006/09/08/the-perfect-error-message

changcho@hotmail.com said...

"They need to make sense at the user level (as opposed to the programmer level), should tell someone not intimately familiar with the software what's wrong and whether they can fix it -- and, if they can fix it, how."

So true - yet programmers (and I am one) often don't have time for detailed error messages, so one just writes a quick message that makes sense only to the programmer because, well, you'll fix it later. Of course, later never comes because of lack of time/resources/etc. It is good practice to take the time to do it right the first time.

Scott Maxwell said...

@changcho Here's an approach that works pretty well for me. While I'm in the thick of coding, I just write any slapdash error message; doesn't matter what. But I flag it with a FIXME. When I'm checking my code in, I do a separate pass where I grep for the FIXMEs and replace the slapdash error messages with good ones.

The main advantage of this for me is that I don't have to keep switching perspectives -- from "how do I explain this to a computer" to "how do I explain this to a human" and back. I often find that a jarring transition. It does require some discipline, though, because -- as you say -- there's always pressure to move on to the next feature.
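For anyone who wants to try that second pass, here is a minimal sketch of it in Python rather than plain grep. It assumes the slapdash messages are flagged with a literal FIXME somewhere on the same line; the function name `find_fixmes` and the default `.py` suffix filter are made up for illustration and are not anything from RoSE.

```python
import re
from pathlib import Path

# Match FIXME as a whole word, so "FIXMEs" in prose is not counted.
FIXME = re.compile(r"\bFIXME\b")


def find_fixmes(root, suffixes=(".py",)):
    """Return (path, line_number, line_text) for each line flagged FIXME.

    Hypothetical helper: walks every file under `root` whose suffix is in
    `suffixes` and reports the lines that still carry a FIXME flag.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if FIXME.search(line):
                hits.append((str(path), number, line.strip()))
    return hits


if __name__ == "__main__":
    # List every flagged line under the current directory before check-in.
    for path, number, line in find_fixmes("."):
        print(f"{path}:{number}: {line}")
```

An empty listing means every placeholder message has been replaced, which makes this easy to run as a last step before checking in.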

One criterion I left out of my description of good error messages: where appropriate, they should also tell you what the software already tried to do. For example, the message shouldn't just say "configuration file not found," it should say where the code looked for that file -- including whether any relevant environment variable was set. When done well, this can be of tremendous help to the user who's trying to propitiate your code.
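As a rough illustration of that criterion (a sketch only, not RoSE's actual code), here is how such a lookup might report what it tried. The names `find_config`, `ConfigNotFound`, and the environment variable `MYAPP_CONFIG_DIR` are all hypothetical.

```python
import os
from pathlib import Path


class ConfigNotFound(Exception):
    """Raised when the configuration file is in none of the searched places."""


def find_config(filename, env_var, fallback_dirs, environ=None):
    """Return the first existing path for `filename`, or raise with details.

    Search order (an assumption for this sketch): the directory named by
    `env_var`, if it is set, followed by each directory in `fallback_dirs`.
    """
    environ = os.environ if environ is None else environ
    tried = []
    env_value = environ.get(env_var)
    if env_value:
        tried.append(Path(env_value) / filename)
    tried.extend(Path(d) / filename for d in fallback_dirs)

    for candidate in tried:
        if candidate.is_file():
            return candidate

    # Say what is wrong, where the code looked, whether the environment
    # variable was set, and how the user can fix the problem.
    if env_value:
        env_note = f"{env_var} is set to '{env_value}'"
    else:
        env_note = f"{env_var} is not set"
    looked = "\n".join(f"  {p}" for p in tried)
    raise ConfigNotFound(
        f"Configuration file '{filename}' not found.\n"
        f"Looked in:\n{looked}\n"
        f"({env_note}.)\n"
        f"To fix this, put the file in one of those locations, or set "
        f"{env_var} to the directory that contains it."
    )
```

The message is longer than "configuration file not found," but every extra line answers a question the user would otherwise have to ask the programmer.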

I didn't do a perfect job at this (or at anything else) when developing RoSE, but I have had users spontaneously thank me for the good quality of RoSE's error messages! So I think I must have done all right.