[Cialug] Suse upgrades and grub issues

Matt Patterson matt at usrlocal.com
Thu Jul 24 13:35:25 CDT 2008


See inline...

On Thu, 24 Jul 2008, Josh More wrote:

> I've never experienced this particular problem on any of the SUSE
> servers we have.
> I *have*, however, experienced something similar on a laptop.
>
> 1) Are you using /boot/ directory or a partition?  SUSE prefers
> dedicated /boot/ partitions.
>

It's a mix.  We typically try to use a dedicated /boot partition, but that
is not always the case.  GRUB in theory shouldn't care.  The only thing it
should care about is that the partition it is pointed at, whether it holds
just /boot or all of /, is the one it should boot from.
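
Since our servers are a mix, it may help to record which layout each box
actually has before patching.  A rough sketch, assuming a Linux host with
/proc/mounts; the function names mount_source and boot_layout are made up
for illustration and are not part of any SUSE tool:

```shell
# Print the device mounted at $1, reading a /proc/mounts-style table on stdin.
# Prints nothing if $1 is not a mount point in that table.
mount_source() {
    awk -v mp="$1" '$2 == mp { dev = $1 } END { if (dev != "") print dev }'
}

# Report whether /boot is its own partition or just a directory on /.
boot_layout() {
    dev=$(mount_source /boot < /proc/mounts)
    if [ -n "$dev" ]; then
        echo "/boot is a dedicated partition on $dev"
    else
        echo "/boot is a directory on the root filesystem"
    fi
}
```

Running boot_layout on each server before a kernel update would at least
tell us whether the failures line up with one layout or the other.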


> 2) If the /boot/ partition fills up, SUSE can have interesting failure
> states.
>

So far we haven't run into a full /boot partition.
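
To rule this out rather than rely on memory, a pre-patch check along these
lines could be added.  This is only a sketch: usage_pct and has_room are
hypothetical helper names, and the 90% default threshold is an arbitrary
assumption.

```shell
# Print the use% column (as a bare integer) from `df -P` output on stdin.
usage_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Return 0 if the filesystem holding $1 is below the threshold
# ($2, default 90; the default is an arbitrary choice).
has_room() {
    pct=$(df -P "$1" | usage_pct)
    [ "$pct" -lt "${2:-90}" ]
}
```

For example:

    has_room /boot 90 || echo "/boot is nearly full, skipping patch" >&2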



> I was able to fix the laptop problem by using LILO, though it should
> also have been fixable by changing how GRUB was installed.  The problem
> in that case was the kernel having the driver for the hard drives, but
> initrd not knowing about it.  You might also be able to fix these by
> rolling your own initrd.
>
>

I'm curious whether simply adding an additional check would show our
errors:

grub-install --recheck /dev/sda2

If not, we can manually run the grub command that SUSE runs, and we should
see the error right away.  That command, for those who may have missed it
in the previous email, is:

# grub

grub> setup --stage2=/boot/grub/stage2 (hd0) (hd0,1)

And you should see something that states it succeeded.
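
For servers that are hundreds of miles away, the same command can be run
non-interactively and its output checked before rebooting.  A rough sketch,
assuming GRUB legacy's --batch mode and assuming that failures show up as
"Error NN: ..." lines while a good run prints "succeeded"; the function
names grub_output_ok and run_grub_setup are made up for illustration:

```shell
# Return 0 if GRUB shell output on stdin reports success and no error.
# Assumption: errors appear as "Error NN: ..." and success as "succeeded".
grub_output_ok() {
    out=$(cat)
    case $out in
        *Error*) return 1 ;;
    esac
    case $out in
        *succeeded*) return 0 ;;
    esac
    return 1
}

# Run the setup command from this email in batch mode and check the result.
# The devices (hd0) and (hd0,1) are the ones quoted above; adjust per server.
run_grub_setup() {
    grub --batch <<'EOF' 2>&1 | grub_output_ok
setup --stage2=/boot/grub/stage2 (hd0) (hd0,1)
quit
EOF
}
```

For example:

    run_grub_setup || echo "grub setup failed, do NOT reboot" >&2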

-Matt



>>>> Matt Patterson <matt at usrlocal.com> 07/24/08 10:58 AM >>>
> Ok,
>
> I'm not sure how many people run SuSE on here but I seem to have a
> recurring issue that is really starting to irk me.
>
> We only have a few SLES servers and a bunch of opensuse servers and I
> see the issue on both platforms.  Which makes sense as one is based
> off the other.
>
> The issue happens when installing the kernel patches.  I would say
> that I have a 90% failure rate with this.  Not with the installation
> of the kernel itself, but the grub updates that happen in relation to
> the kernel updates.  I have had issues where the menu.lst file is
> corrupted or the bigger, more popular error is that the stage2 file
> becomes corrupt and when the server reboots, it will go into a reboot
> loop.
>
> I've opened a ticket before with Novell for one of the SLES servers
> and provided a b0rked stage2 file for them to look at.  Nothing really
> ever came out of this incident since I was able to get the system up
> on my own.  For the most part, I have been able to restore the servers
> by booting a CD in rescue mode and untar a backup of the /boot
> directory and everything is fine in the world.  But this is rather
> annoying when you are trying to patch servers that are hundreds of
> miles away.
>
> So the question is...am I alone with these issues?  I'm not running
> anything out of the ordinary as far as I can tell.  IBM x3550 servers
> for the most part, and a variety of opensuse 10.1, opensuse 10.2 and
> SLES 10 sp1.
>
> Anyone have any ideas here?   I'm honestly thinking about removing
> grub from all the servers and installing lilo and hoping that it
> resolves this issue.  Though I think that the real underlying issue is
> just that YaST is not handling an error condition correctly and moving
> on like nothing has happened.
>
>  -Matt
> _______________________________________________
> Cialug mailing list
> Cialug at cialug.org
> http://cialug.org/mailman/listinfo/cialug