[Cialug] Crashing with errors in mcelog

David Champion dave at dchamp.net
Tue Mar 3 10:16:11 CST 2009


Daniel A. Ramaley wrote:
> On 2009-03-03 at 09:32:52, Matthew Nuzum wrote:
>   
>> Regarding crashes coming in pairs, is it possible the reason for the
>> second crash is a warm-boot vs. cold-boot problem? For example, I've
>> seen several instances where a computer will not properly reset
>> itself on warm boot (reboot command, ctrl+alt+del, etc.) and crash
>> very shortly after boot. However, if you hit the power button and give
>> the computer 30s of rest, then it works.
>>     
>
> I've seen situations similar to what you describe, where warm and cold 
> boots differ in their result. In my case the machine crashes too 
> totally for a warm boot to be possible, so i reboot by hitting the 
> reset button on the front of the case. I didn't power cycle it. But i 
> do let the BIOS RAM checks run to completion (does that zero out the 
> RAM?). Hitting the reset button in most cases *should* be equivalent to 
> a power outage, but i know it isn't *entirely* identical. The hard 
> drives keep spinning for one thing, and i'm guessing miscellaneous 
> device memory (such as drive controllers, graphics card, sound card 
> buffer) might not be reset the same.
>
> After the second boot the machine seems to run fine for a while. Several 
> months ago it had this double crash problem, and then it was fine until 
> this weekend. I figured i'd have a few more months again, but then it 
> did it this morning. Arrrgh.
>
> I hope the problem turns out to be something relatively cheap and easy 
> to fix, like RAM. All the components in the machine are name-brand and 
> i've had it running 24/7 for about a year and a half though, so i'm not 
> sure why it would start having trouble now.
>   
A warm boot and a cold boot are pretty different. I've had situations 
where a device's firmware had to be initialized by booting into Windows 
to load it, then doing a warm boot into Linux, because the Linux driver 
wouldn't load the firmware. I don't recall offhand what that was, maybe a 
SCSI controller or a modem, and that was probably more than 10 years 
ago. :)

There have been security notices about viruses that can survive a warm 
boot by loading into a high memory location in RAM, since a warm boot 
does not necessarily clear memory.

I'm sure there's a utility to look through RAM for interesting 
things. When I was taking mainframe programming classes at DMACC, it was 
always interesting when you crashed a program and got a hex dump of your 
memory space, and it contained uninitialized memory from someone 
else's stuff. Sometimes you'd get some or all of another student's code 
or data. Of course, you'd get into a lot of trouble if they caught you 
doing this on purpose...
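
For what it's worth, here's a rough sketch of the "look for interesting 
things" idea in Python. It assumes you already have the memory contents 
as a byte string (getting the dump is a separate problem and not covered 
here), and it just pulls out runs of printable ASCII, roughly what the 
Unix strings command does. The function name and the sample buffer are 
made up for illustration.

```python
import re


def printable_runs(dump: bytes, min_len: int = 4):
    """Yield (offset, text) for each run of printable ASCII of at least min_len bytes."""
    # 0x20-0x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(dump):
        yield match.start(), match.group().decode("ascii")


# Hypothetical buffer standing in for a real dump: binary junk with
# a few readable fragments left over from "someone else's" program.
sample = b"\x00\x01MOVE A TO B\x00\xff\x10hi\x00PAYROLL-REC\x00"

for offset, text in printable_runs(sample):
    # Print the offset in hex, the way a dump listing would.
    print(f"{offset:08x}  {text}")
```

The short "hi" fragment is skipped because it is under the 4-byte 
minimum, which is what keeps random binary noise out of the results.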

-dc
