For a brief period, the kernel tried to deal with gamma rays corrupting the processor cache

Authored by blogs.msdn.microsoft.com and submitted by kirbyfan64sos

At one point, the following code was added to the part of the kernel that brings the system out of a low-power state:

;
; Invalidate the processor cache so that any stray gamma
; rays (I'm serious) that may have flipped cache bits
; while in S1 will be ignored.
;
; Honestly. The processor manufacturer asked for this.
; I'm serious.
;
        invd

I'm not sure what the thinking here is. I mean, if the cache might have been zapped by a stray gamma ray, then couldn't RAM have been zapped by a stray gamma ray, too? Or is processor cache more susceptible to gamma rays than RAM? The person who wrote the comment seems to share my incredulity.

Less than three weeks later, the INVD instruction was commented out. But the comment block remains.

In case we decide to resume trying to deal with gamma rays corrupting the processor cache, I guess.

Bonus chatter: One of my colleagues wasn't part of this specific change, but recalled that these sorts of strange-sounding requests were not uncommon, especially for early processor steppings. The workaround was removed once the problem was fixed in microcode or in a later processor stepping.

leif_erikson503 on November 21st, 2018 at 05:41 UTC »

I worked in high-performance computing for a year and a half. A guy I saw speak at a conference, who ran a gargantuan supercomputer for one of the national labs, said that cosmic radiation flips bits in the memory of those machines about a dozen times per day. Error-correcting codes prevent anything bad from happening, and also let them count events like this.
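To make the "correct it and count it" idea concrete, here is a toy sketch using a Hamming(7,4) code, which can repair any single flipped bit in a 7-bit codeword. This is an illustration only: real ECC memory commonly uses wider codes over 64-bit words, and the names `encode`, `decode`, `EccMemory`, and `flip_bit` are made up for this example rather than taken from any real system.

```python
def encode(nibble):
    """Encode 4 data bits as a 7-bit Hamming codeword (bit i holds position i+1)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]  # parity over positions 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # parity over positions 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # parity over positions 5, 6, 7
    # Layout by position: 1=p1, 2=p2, 3=d0, 4=p4, 5=d1, 6=d2, 7=d3
    bits = [p1, p2, d[0], p4, d[1], d[2], d[3]]
    return sum(b << i for i, b in enumerate(bits))


def decode(word):
    """Return (nibble, corrected). Repairs at most one flipped bit."""
    bits = [(word >> i) & 1 for i in range(7)]
    # XOR of the positions of all set bits is 0 for a valid codeword;
    # after a single flip it equals the position of the flipped bit.
    syndrome = 0
    for pos in range(1, 8):
        if bits[pos - 1]:
            syndrome ^= pos
    corrected = syndrome != 0
    if corrected:
        bits[syndrome - 1] ^= 1
    nibble = bits[2] | (bits[4] << 1) | (bits[5] << 2) | (bits[6] << 3)
    return nibble, corrected


class EccMemory:
    """Hypothetical nibble-wide memory that counts corrected bit flips."""

    def __init__(self, size):
        self.cells = [encode(0)] * size
        self.corrected_errors = 0

    def write(self, addr, nibble):
        self.cells[addr] = encode(nibble)

    def read(self, addr):
        nibble, corrected = decode(self.cells[addr])
        if corrected:
            self.corrected_errors += 1
            self.cells[addr] = encode(nibble)  # write back the repaired word
        return nibble

    def flip_bit(self, addr, bit):
        """Simulate a radiation-induced flip of one stored bit (0..6)."""
        self.cells[addr] ^= 1 << bit
```

A caller would write a value, and a later read returns the original data even after `flip_bit` has corrupted the stored word, while `corrected_errors` goes up by one. Two flips in the same word are beyond what this small code can repair.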

__j_random_hacker on November 21st, 2018 at 00:34 UTC »

In the Raymond Chen articles I remember, what would have happened next is: Some dipshit writes a wildly popular game that depends on gamma rays flipping bits in cache, forcing MS to write a hack into the next 10 versions of Windows that detects the presence of this program and simulates random bit flips just to keep it running.

FlyingRhenquest on November 21st, 2018 at 00:07 UTC »

Back in the OS/2 days I got a support question from a guy on the forums who was trying to do some satellite software. There was an API call that would allow him to adjust the time down to milliseconds, but whenever he tried to adjust milliseconds, the time would be wrong the next time he checked. Turns out the OS/2 kernel monitored two interrupts to keep track of that. There was an interrupt that rolled around every 22ms that it would use to increment the millisecond counter and a 1 second periodic interrupt.

Turns out, the system could occasionally not process the 22ms timer if it was doing something else when that interrupt rolled around, so it would just zero out the ms timer when the 1 second periodic interrupt hit.

Filed an APAR on it that got closed "Working as designed." :/
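The behavior described above can be modeled in a few lines. This is a rough simulation of the two-interrupt scheme as the comment describes it, not OS/2 code: the class `TwoInterruptClock`, its methods, and the `missed_ticks` parameter are all invented for illustration, and the model makes no attempt to carry milliseconds over into seconds.

```python
class TwoInterruptClock:
    """Toy model: a 22 ms tick bumps the ms counter, a 1 s interrupt zeroes it."""

    TICK_MS = 22

    def __init__(self):
        self.elapsed = 0  # true simulated time in ms, not visible to callers
        self.seconds = 0
        self.millis = 0

    def set_millis(self, ms):
        """Stand-in for the API call that adjusts the millisecond field."""
        self.millis = ms

    def advance(self, ms, missed_ticks=frozenset()):
        """Step simulated time forward; tick numbers in missed_ticks are dropped."""
        for _ in range(ms):
            self.elapsed += 1
            if self.elapsed % self.TICK_MS == 0:
                tick_index = self.elapsed // self.TICK_MS
                if tick_index not in missed_ticks:
                    self.millis += self.TICK_MS
            if self.elapsed % 1000 == 0:
                # The periodic interrupt resynchronizes by discarding the ms
                # counter, which also discards any adjustment a caller made.
                self.seconds += 1
                self.millis = 0


clock = TwoInterruptClock()
clock.set_millis(500)  # caller adjusts the clock by half a second
clock.advance(100)     # adjustment is still visible within the same second
clock.advance(900)     # the 1 s interrupt fires and the adjustment is gone
```

In this model, zeroing the counter once a second hides any drift from dropped 22 ms ticks, and the price is that a millisecond adjustment survives only until the next one-second interrupt.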