Discussion:
Netbooting Trouble with 712/100
a***@zadzmo.org
2013-02-19 05:08:23 UTC
Hello all:

I'm trying to netboot an HP 712/100 but it's giving me trouble. This network is
having no trouble booting sparc64, alpha, and amd64 machines but this HP just
won't go.

The typical attempt looks like this on the wire:

23:55:44.355576 0.0.0.0.68 > 255.255.255.255.67: xid:0x500eae02 [|bootp]

23:55:44.363792 192.168.102.1.67 > 192.168.102.8.68: xid:0x500eae02
Y:192.168.102.8 S:192.168.102.1 [|bootp] (DF)

23:55:46.364620 0.0.0.0.68 > 255.255.255.255.67: xid:0x500eae02 secs:2 [|bootp]

23:55:46.372827 192.168.102.1.67 > 192.168.102.8.68: xid:0x500eae02 secs:2
Y:192.168.102.8 S:192.168.102.1 [|bootp] (DF)

23:55:51.373736 0.0.0.0.68 > 255.255.255.255.67: xid:0x500eae02 secs:7 [|bootp]


If I run 'search' on the boot admin menu of the HP, it gives me the IP address
of the DHCP server, so I know it's receiving traffic. It just won't "go" for
some reason, and I'm not seeing any error or diagnostic messages outside of
'boot failed'.

Any ideas or well-known gotchas?
Nick Hudson
2013-02-19 09:02:36 UTC
On 02/19/13 05:08, ***@zadzmo.org wrote:
> Hello all:
>
> I'm trying to netboot an HP 712/100 but it's giving me trouble. This network is
> having no trouble booting sparc64, alpha, and amd64 machines but this HP just
> won't go.

Here's the DHCP host entry I use to netboot my b160l:

host stan {
    # HP/9000 b160l
    hardware ethernet 00:60:b0:18:95:14;
    filename "b160.lif";
    #filename "C7120023.frm";
    next-server arthur;
    server-name arthur;
    option root-path "/usr/exports/stan/root";
    fixed-address stan;
}


>
> The typical attempt looks like this on the wire:
>
> 23:55:44.355576 0.0.0.0.68 > 255.255.255.255.67: xid:0x500eae02 [|bootp]
>
> 23:55:44.363792 192.168.102.1.67 > 192.168.102.8.68: xid:0x500eae02
> Y:192.168.102.8 S:192.168.102.1 [|bootp] (DF)
>
> 23:55:46.364620 0.0.0.0.68 > 255.255.255.255.67: xid:0x500eae02 secs:2 [|bootp]
>
> 23:55:46.372827 192.168.102.1.67 > 192.168.102.8.68: xid:0x500eae02 secs:2
> Y:192.168.102.8 S:192.168.102.1 [|bootp] (DF)
>
> 23:55:51.373736 0.0.0.0.68 > 255.255.255.255.67: xid:0x500eae02 secs:7 [|bootp]

I guess the client doesn't like the reply.

> If I run 'search' on the boot admin menu of the HP, it gives me the IP address
> of the DHCP server, so I know it's receiving traffic. It just won't "go" for
> some reason, and I'm not seeing any error or diagnostic messages outside of
> 'boot failed'.

It should report the tftp server where the kernel is held.
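
For anyone chasing the same symptom, here is a minimal Python sketch that decodes the fixed BOOTP header of a captured reply and flags the two fields a netboot client needs to find its boot file. The field layout is the one from RFC 951; the helper names `parse_bootp` and `netboot_problems` are made up for this example. It assumes the boot file name travels in the fixed header rather than in DHCP options, and the thread does not say which server setting was actually wrong.

```python
import socket
import struct

# Fixed BOOTP header from RFC 951: 236 bytes before the vendor/options area.
BOOTP_HEADER = struct.Struct("!BBBBIHH4s4s4s4s16s64s128s")


def parse_bootp(packet: bytes) -> dict:
    """Decode the fixed BOOTP header fields that matter for netbooting."""
    if len(packet) < BOOTP_HEADER.size:
        raise ValueError("packet shorter than the 236-byte BOOTP header")
    (op, htype, hlen, hops, xid, secs, flags,
     ciaddr, yiaddr, siaddr, giaddr,
     chaddr, sname, file) = BOOTP_HEADER.unpack_from(packet)
    return {
        "op": op,                                # 1 = request, 2 = reply
        "xid": xid,
        "secs": secs,
        "yiaddr": socket.inet_ntoa(yiaddr),      # address offered to the client
        "siaddr": socket.inet_ntoa(siaddr),      # next server (TFTP) address
        "sname": sname.split(b"\0", 1)[0].decode("ascii", "replace"),
        "file": file.split(b"\0", 1)[0].decode("ascii", "replace"),
    }


def netboot_problems(reply: dict) -> list:
    """List obvious reasons a netboot client might ignore this reply."""
    problems = []
    if reply["op"] != 2:
        problems.append("not a BOOTREPLY")
    if reply["siaddr"] == "0.0.0.0":
        problems.append("siaddr (next-server) is unset")
    if not reply["file"]:
        problems.append("boot file name is empty")
    return problems
```

Feed it the UDP payload of a reply captured with a large enough snap length; the `[|bootp]` markers in the trace above mean those captures were cut short and would not contain the full header.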


> Any ideas or well known gotchas?
>
>
Nick
a***@zadzmo.org
2013-02-20 06:12:27 UTC
On Tue, 19 Feb 2013 09:02:36 +0000 Nick Hudson <***@gmx.co.uk> wrote

Thanks for the fast reply, Nick!


> > If I run 'search' on the boot admin menu of the HP, it gives me the IP
> > address of the DHCP server, so I know it's receiving traffic. It just won't
> > "go" for some reason, and I'm not seeing any error or diagnostic messages
> > outside of 'boot failed'.
>
> It should report the tftp server where the kernel is held.
>

This was the critical hint I needed. I found the issue in the DHCP server
settings and got the machine to fetch its bootloader.

I'm getting a different issue now, however. If I do nothing, the bootloader
spits out this error message:

open lf0a:netbsd Input/output error

...and tries again in an endless loop. Manually typing in 'lf0a:netbsd' usually
fails the same way, but adding one of the prompted options (like -c) gets just
a bit further:

Start @ 0x2000000 [1=0xf0e000-0xf0e45c]
btlb info: minsz=128, maxsz=16384
btlb fixed: i=0, d=0, c=8
btlb varbl: i=0, d=0, c=0


This was with the SYSNBSD file from 6.0.1; results with netinstall.lif and
with SYSNBSD from 6.0 were similar.
Nick Hudson
2013-02-20 23:42:47 UTC
On 02/20/13 06:12, ***@zadzmo.org wrote:
> On Tue, 19 Feb 2013 09:02:36 +0000 Nick Hudson <***@gmx.co.uk> wrote
>
> Thanks for the fast reply, Nick!
>
>
>>> If I run 'search' on the boot admin menu of the HP, it gives me the IP
>>> address of the DHCP server, so I know it's receiving traffic. It just won't
>>> "go" for some reason, and I'm not seeing any error or diagnostic messages
>>> outside of 'boot failed'.
>> It should report the tftp server where the kernel is held.
>>
> This was the critical hint I needed. I found the issue in the DHCP server
> settings and got the machine to fetch its bootloader.
>
> I'm getting a different issue now, however. If I do nothing, the bootloader
> spits out this error message:
>
> open lf0a:netbsd Input/output error


This looks like a truncated file on the tftp server to me. Or maybe too
many errors on download?
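
One way to rule a truncated file in or out is to compare the copy under the TFTP root against a known-good copy of the same image. The Python sketch below does that by size first and digest second; the function names and the choice of MD5 are illustrative assumptions, not anything the bootloader itself does.

```python
import hashlib
import os


def file_digest(path: str, algorithm: str = "md5", chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large images need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_copies(reference: str, served: str) -> str:
    """Classify the served copy as 'identical', 'truncated' or 'different'."""
    ref_size = os.path.getsize(reference)
    srv_size = os.path.getsize(served)
    if srv_size < ref_size:
        # A shorter file is the classic sign of an interrupted copy.
        return "truncated"
    if srv_size != ref_size or file_digest(reference) != file_digest(served):
        return "different"
    return "identical"
```

Running it on the file fetched back over TFTP from a second machine, instead of the file on the server's disk, also exercises the transfer path.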
>
> ..and tries again on an endless loop. Manually typing in 'lf0a:netbsd' usually
> fails with the same, but adding one of the prompted options (like -c) goes just
> a bit further:
>
> Start @ 0x2000000 [1=0xf0e000-0xf0e45c]
> btlb info: minsz=128, maxsz=16384
> btlb fixed: i=0, d=0, c=8
> btlb varbl: i=0, d=0, c=0
>
>
> This was with trying the SYSNBSD file from 6.0.1; results with trying
> netinstall.lif and from SYSNBSD from 6.0 were similar.
>

Same here (truncated download).

Nick
a***@zadzmo.org
2013-02-27 06:53:28 UTC
On Wed, 20 Feb 2013 23:42:47 +0000 Nick Hudson <***@gmx.co.uk> wrote

> >
> > ..and tries again on an endless loop. Manually typing in 'lf0a:netbsd'
> > usually fails with the same, but adding one of the prompted options (like
> > -c) goes just a bit further:
> >
> > Start @ 0x2000000 [1=0xf0e000-0xf0e45c]
> > btlb info: minsz=128, maxsz=16384
> > btlb fixed: i=0, d=0, c=8
> > btlb varbl: i=0, d=0, c=0
> >
> >
> > This was with trying the SYSNBSD file from 6.0.1; results with trying
> > netinstall.lif and from SYSNBSD from 6.0 were similar.
> >
>
> Same (truncate download)
>

I've learned a lot about the TFTP protocol these last few days...

1) Solaris 10's TFTP server implementation and the HP712 aren't compatible. The
HP wants an option that Solaris doesn't implement.

2) Using Wireshark during boot, it appears the entire LIF file is being
transferred.

3) After downloading the LIF onto another system using tftp, the md5sum matches
what's on the tftp server.
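
To see which TFTP options a client asks for (point 1), the read request can be decoded straight from a capture. Below is a minimal Python sketch of the RRQ layout from RFC 1350 with the option extension from RFC 2347; `parse_rrq` is a hypothetical helper, and nothing in this thread says which option the 712 actually requests.

```python
import struct

OPCODE_RRQ = 1  # read request, per RFC 1350


def parse_rrq(packet: bytes) -> dict:
    """Split a TFTP read request into filename, mode and requested options."""
    (opcode,) = struct.unpack_from("!H", packet)
    if opcode != OPCODE_RRQ:
        raise ValueError(f"not a read request (opcode {opcode})")
    # After the opcode everything is a sequence of NUL-terminated strings.
    fields = packet[2:].split(b"\0")
    if fields and fields[-1] == b"":
        fields.pop()  # drop the empty item left by the final terminator
    if len(fields) < 2:
        raise ValueError("missing filename or mode")
    filename, mode, *rest = (f.decode("ascii", "replace") for f in fields)
    if len(rest) % 2:
        raise ValueError("dangling option without a value")
    # Option names are case-insensitive, so normalise them to lower case.
    options = {rest[i].lower(): rest[i + 1] for i in range(0, len(rest), 2)}
    return {"filename": filename, "mode": mode.lower(), "options": options}
```

An empty `options` dict means a plain RFC 1350 request. Any name that does show up there is one the server has to acknowledge with an OACK before the client can rely on it.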

...and I still get that 'btlb' output mentioned above just before a hard hang.
No I/O error messages, though.

It seems to be loading the kernel, and then something goes wrong before the
kernel runs. I doubt it's a hardware problem, as I get a working install of
HP-UX 10.20 if I boot from disk; it even starts X11 with CDE.
Nick Hudson
2013-03-04 07:55:45 UTC
On 02/27/13 06:53, ***@zadzmo.org wrote:
> On Wed, 20 Feb 2013 23:42:47 +0000 Nick Hudson <***@gmx.co.uk> wrote
>
>>> ..and tries again on an endless loop. Manually typing in 'lf0a:netbsd'
>>> usually fails with the same, but adding one of the prompted options (like
>>> -c) goes just a bit further:
>>>
>>> Start @ 0x2000000 [1=0xf0e000-0xf0e45c]
>>> btlb info: minsz=128, maxsz=16384
>>> btlb fixed: i=0, d=0, c=8
>>> btlb varbl: i=0, d=0, c=0
>>>
>>>
>>> This was with trying the SYSNBSD file from 6.0.1; results with trying
>>> netinstall.lif and from SYSNBSD from 6.0 were similar.
>>>
>> Same (truncate download)
>>
> I've learned a lot about the TFTP protocol these last few days...
>
> 1) Solaris 10's TFTP server implementation and the HP712 aren't compatible. the
> HP wants an option that Solaris doesn't implement.
>
> 2) Using Wireshark during boot, it appears the entire LIF file is being
> transferred.
>
> 3) After downloading the LIF onto another system using tftp, the md5sum matches
> what's on the tftp server.
>
> ..and I still get that 'btlb' output mentioned above just before a hard hang.
> No I/O error messages though.
>
> It seems to be loading the kernel, and then something goes wrong before the
> kernel runs. I doubt it's a hardware problem as I get a working install of
> HP-UX 10.20 if I boot from disk; It even starts X11 with CDE.
>
>
>

Can you post the full boot output and try adding the debug flag (-d) to
the boot? Hopefully we can set breakpoints to see how far it gets.

Nick
a***@zadzmo.org
2013-03-07 03:24:29 UTC
On Mon, 04 Mar 2013 07:55:45 +0000 Nick Hudson <***@netbsd.org> wrote

>
> Can you post the full boot output and try adding the debug flag (-d) to
> the boot? Hopefully we can set breakpoints to see how far it gets.
>

It's not even getting that far. I'm 99% certain that the kernel, if loaded,
isn't being executed.

As a test I tried OpenBSD 5.2's LIF image and reached multiuser without any
trouble. Digging further, I found that netinstall.lif from version 4.0 in the
NetBSD archives also boots without trouble. I suspect a bootloader bug.


It may or may not be relevant, but I'm not running this system with a serial
console - it's got keyboard and mouse. This is why I haven't posted the full
output. I tried to photograph the screen with my smartphone but it's too blurry
to read. I'll be (hopefully) borrowing a friend's DSLR for an unrelated task
soon and can send it to you then.

In the meantime, I'll try to find the latest NetBSD LIF image that boots.
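
The search for the newest working image can be organised as a bisection so that each manual boot attempt halves the candidate range. This is only a sketch: it assumes the oldest image in the list boots, the newest does not, and there is a single point where things broke. `last_booting` and the `boots` callback are hypothetical names; in practice `boots` would be a human recording the result of a boot attempt.

```python
from typing import Callable, Sequence


def last_booting(releases: Sequence[str], boots: Callable[[str], bool]) -> str:
    """Return the newest release that boots.

    Assumes releases are ordered oldest to newest, releases[0] boots,
    releases[-1] does not, and there is one good-to-bad transition.
    """
    lo, hi = 0, len(releases) - 1  # invariant: [lo] boots, [hi] does not
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if boots(releases[mid]):
            lo = mid   # breakage is somewhere after mid
        else:
            hi = mid   # breakage is at or before mid
    return releases[lo]
```

With 4.0 known good and 6.0.1 known bad, this needs roughly log2(n) boot attempts for n images in between, instead of trying each one in order.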