2009-01-23

Spirit Sol 20

Jeng says it's very bad. No word from Spirit.

I stop by the sequencing MSA anyway, just out of habit. Frank's there.

"No news is bad news?" I ask.

"That's correct," he says.

"What can we do?"

He shrugs, but with a worried expression. "I'd be doing it."

He's set up RSVP to show Spirit as it was the last time we got word from it, with the sun, Earth, Phobos, and Deimos wheeling around it in real time.

The daily press conference comes on. Pete Theisinger and Richard Cook are earning their paychecks. It doesn't look like there's a single cause. They've run the same sequences in the testbed that were running on Spirit, and nothing goes wrong, which suggests it's a hardware problem. Spirit is acting erratically, making some comm passes -- though cutting them short -- and missing others altogether. The data it's sending back, when it sends anything, hasn't been much help so far.

Randy Lindemann comes in to watch the press conference with us. We got a beep[1] ten minutes ago, he says. This is really good: the computer's OK, and a bunch of things in a line from the computer to the antenna have to be OK as well.

Henry Stone also drops by, characterizing the beeps as "good and solid." The spacecraft is telling us it's in a fault mode, but we don't know which mode or why. We sent a comm window[2] but did so close to Earth-set, so we may get no data. Spirit is sick but alive. Drama queen!

Just as they're ending their statements, Pete and Richard get the news about the beep. Obviously relieved, they report it to the press. This is good: instead of "Spirit Dead," the headlines will read "NASA Hears From Spirit."

I go hang out in the SMSA for a little while, but there's not much news. We might have uplinked the comm window during the Odyssey comm pass, in which case Spirit wasn't listening. We wait for it anyway. Earth sets. There's no real chance we'll get data now, but the DSN continues to listen for a few minutes, in case a signal is on its way. Chris Voorhees says: "It's actually amazing in some sense that it took 19 days for this to happen. ... So we just declare a PORTism[3], clear the errors, and move on, right?" We keep waiting. Nothing.

But we're commandable at 7.8125 bps. It was a good day in its way. They tell the subsystems people to go home, report back at 7PM. They don't need rover drivers at all at this point; we can go home until further notice.

Screw that. My spacecraft needs me. I go to the 430B conference room. They're running down a list of things that need doing, and most of them are things I don't know enough about to help with. I listen patiently for something I might be able to help with. Arthur Amador notices me and asks if I've seen an ISA[4] that got filed last night. I haven't, so I check my email. Saina Ghandchi reported what sounds like a weird bug in RoSE, so I check it out. I can't find any sign of the reported bug, with good reason as it turns out: the problem isn't caused by a bug in RoSE after all. (Though I have a nasty moment when it looks like there really is a bug, and I realize it could have caused problems for the spacecraft, maybe even the problems they're seeing now.) They use a script to prepare the day's master sequence, and they ran it twice instead of once. The resulting file looked weird when they opened it in RoSE, so RoSE got the blame. OK by me: I like bug reports that turn out like that. I return to the assemblage of the worried.

They're still listing things that need to be done. It's amazing how little I can help. Then they mention something that's right up my alley. We compose our commands in a MER-specific XML-based format called RML, and then we convert RML to a legacy format called SSF, and convert that to another format called SCMF, the binary format that goes to the rover. They want to be sure that what went to the rover is what we think we sent, so we need to compare sol 17's RML, SSF, and SCMF files to ensure that they're consistent with each other. They're actually talking about getting someone to go through the files manually, comparing hundreds of commands and their arguments. Insane. That's a job for a computer if I ever saw one. I tell them I'll take care of it.
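For illustration only, here is a minimal sketch of what that kind of automated comparison might look like once each file has been decoded into a list of commands. It assumes hypothetical decoders have already turned the RML, SSF, and SCMF versions into (name, arguments) pairs; the command names and the `compare_sequences` helper are made up for the example and are not the actual MER tools or formats.

```python
from itertools import zip_longest

def compare_sequences(named_sequences):
    """Compare command lists decoded from several formats.

    named_sequences maps a format label (e.g. "RML") to a list of
    (command_name, args_tuple) pairs. Returns a list of mismatch
    descriptions; an empty list means every format agrees.
    """
    labels = list(named_sequences)
    mismatches = []
    # Walk the lists in lockstep; a list that runs out early yields None.
    columns = zip_longest(*(named_sequences[label] for label in labels))
    for index, row in enumerate(columns):
        if any(cmd != row[0] for cmd in row[1:]):
            detail = ", ".join(f"{l}={c!r}" for l, c in zip(labels, row))
            mismatches.append(f"command {index}: {detail}")
    return mismatches

# Hypothetical decoded data, invented for this example.
rml = [("DRIVE", (1.5,)), ("TAKE_IMAGE", ("navcam",))]
ssf = [("DRIVE", (1.5,)), ("TAKE_IMAGE", ("navcam",))]
scmf = [("DRIVE", (1.5,)), ("TAKE_IMAGE", ("pancam",))]

for line in compare_sequences({"RML": rml, "SSF": ssf, "SCMF": scmf}):
    print(line)
```

The hard part in practice is the decoding, not the comparison: once all three formats are reduced to the same in-memory shape, checking hundreds of commands and their arguments is a loop.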

I could have sworn there was already software to decode SCMFs, but I look around and ask around, and can't find it. So I'll have to write it myself. I spend the rest of my day reverse-engineering the format. It's fun, actually, and the kind of thing I haven't done in a long time -- staring at a pile of 1s and 0s and figuring out what it all means, then encoding that understanding in the software I'm writing. I get to the point where I don't need the software to do the job any more; even in the binary formats, I can find commands and their arguments by eye. I feel just like a Mars rover.

By the end of the day my SCMF decoder is working, more or less. It still has a number of minor problems, plus two that really stand out. First, the software doesn't understand the format used for floating-point numbers, but that will be easy enough to fix; I just haven't had time for it yet. The other problem is more serious. SCMFs are divided into chunks called "messages," and while my decoder works for commands within a message, it doesn't understand where messages begin and end -- because I don't, either. It looks like there's a field in the message that tells you how long the rest of the message is, but it's inconsistent -- the number in that field is larger for long messages, smaller for short ones, more or less in proportion to the actual message length. But not exactly, and software demands exactitude.
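To make the exactitude point concrete, here is a minimal sketch of splitting a byte stream into length-prefixed messages. The layout is purely an assumption for illustration (a 2-byte big-endian length that counts only the bytes following it); it is not the real SCMF format, and `split_messages` is a hypothetical helper.

```python
import struct

def split_messages(data):
    """Split data into message bodies using a 2-byte big-endian length prefix.

    Assumed layout (illustrative only): [length: uint16][body: length bytes]...
    Raises ValueError if the stream ends in the middle of a message.
    """
    messages = []
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError(f"truncated length field at offset {offset}")
        (length,) = struct.unpack_from(">H", data, offset)
        offset += 2
        body = data[offset:offset + length]
        if len(body) != length:
            raise ValueError(f"truncated message at offset {offset}")
        messages.append(body)
        offset += length
    return messages

# Two made-up messages: a 3-byte body followed by a 2-byte body.
stream = struct.pack(">H", 3) + b"abc" + struct.pack(">H", 2) + b"de"
print(split_messages(stream))
```

If the interpretation of the length field is off by even one byte, every message boundary after the first lands in the wrong place, which is why a field that is only roughly proportional to the message length isn't enough to go on.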

Well, there was a story in the morning's newspaper about a study that proves what we all knew: sleep helps you fix problems like this. I decide to make my own personal contribution to the field of sleep research by going home and getting nine hours of sleep.



Footnotes:

[1] A radio signal.

[2] That is, we told Spirit when to call home.

[3] Our name for an error that crops up during an operational readiness test. When it happens in the testbed, you cheat: learn the lesson, fix the problem as fast as you can, and move on, so that you can continue getting value out of the test. We had lots of PORTisms during our many pre-landing tests; by contrast, surface operations had generally gone amazingly well.

[4] Incident/Surprise/Anomaly report. It's sort of a generalized bug report, but not just for software problems; it includes hardware problems, manual errors in operations, and so on.

3 comments:

italianguy said...

Scott, really nice to see you on the video "Five Years on Mars"!
(http://www.jpl.nasa.gov/video/index.cfm?id=795).
You are really great, but is quite fun because of this sort of "media-mania" that you let us know from the very first bunch of posts!
Thanks man!

changcho said...

So, all this talk about 'comm passes' from Odyssey, or perhaps MGS... How useful to you guys would an areostationary satellite be (parked above the equator roughly at the same longitude where Spirit lies)?

Anonymous said...

@Changcho: an areostationary satellite wouldn't be nearly as useful. It gives you altitude, but once the Earth is eclipsed by Mars you only get a little while longer from orbit than from the surface.

Compare that to a satellite that's whipping around the planet, can hear the rover as it passes overhead, and can then get around to the other side and relay the data back to Earth.