Archive for May, 2008

USA travel!

I’m in the US right now for a work conference and some meetings. It’s been an interesting trip already: an uneventful flight over (I was expecting it to be really bad, having not flown economy to the US before; yes, I know I’m spoilt), and a fascinating and very enjoyable tour of Los Angeles yesterday with a local limo driver. The technical programme of the conference starts tomorrow, and it looks very intensive. More details as they happen!

Edit: Technically I have flown economy to the United States: when Susan and I went to Hawaii we were on economy. I should have said that I haven’t flown economy to the mainland US before. Or maybe I shouldn’t have said anything… ;) Anyway, it was far-and-away the longest economy class flight I’ve ever done and I must say it wasn’t anywhere near as bad as I was imagining it.

Zeroshell redux

I wrote about Zeroshell, and how I thought it was pretty great. I still do, but it hasn’t taken centre-stage in my network configuration like I thought it would. I’ve had to tone down my raves about some of its integrated features as well.

The fact that it hasn’t taken centre-stage is possibly as much to do with VMware’s bogus clock-drift problems as anything, as I haven’t dedicated hardware to my Zeroshell instance yet (I could keep it running virtual, but some of the things I want to do with it will make more sense if it’s a separate machine). VMware Server takes another barb for its handling of VLAN tagging (but to be fair that might just be how the Linux 8021q module works). It seems that if you have any VLAN definitions on a network card, VMware won’t get to see any VLAN tags on that NIC. You can get a guest attached to a bridged interface to see the real VLAN tags, but only if Linux has no VLAN awareness of that NIC.
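For anyone wondering what “seeing the VLAN tag” means in practice, here is a rough Python sketch. It only inspects a raw Ethernet header and reports the VLAN ID if an 802.1Q tag is present; it says nothing about what VMware or the 8021q module actually do internally. The function name and the sample frames are made up for illustration.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def vlan_id(frame: bytes):
    """Return the VLAN ID if the frame carries an 802.1Q tag, else None."""
    if len(frame) < 14:
        raise ValueError("frame too short for an Ethernet header")
    # Bytes 0-5: destination MAC, 6-11: source MAC, 12-13: EtherType or TPID.
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != TPID_8021Q:
        return None
    if len(frame) < 18:
        raise ValueError("frame too short for an 802.1Q tag")
    # The 16-bit tag control field follows the TPID; its low 12 bits are the VLAN ID.
    (tci,) = struct.unpack_from("!H", frame, 14)
    return tci & 0x0FFF


# Hand-built sample headers (made-up MAC addresses, no payload).
macs = bytes.fromhex("ffffffffffff" "020000000001")
tagged = macs + struct.pack("!HHH", TPID_8021Q, 10, 0x0800)
untagged = macs + struct.pack("!H", 0x0800)

print(vlan_id(tagged))
print(vlan_id(untagged))
```

A guest that gets the first kind of frame can sort traffic by VLAN itself; a guest that only ever gets the second kind has nothing to work with, which matches the symptom described above.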

Alright, so enough ragging on VMware. I have Zeroshell attached to the networks it needs and all is fine. Except that I can’t actually change anything! The web interface that I spoke so highly of originally is actually very restricted in some areas. One of these is in the RADIUS server, and it bit me badly when I decided I’d use Zeroshell’s RADIUS server to authenticate access to the Web interface of my Linksys switch. Turns out that the Linksys firmware expects a particular attribute to appear in the response from the RADIUS server.
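To make the RADIUS problem a bit more concrete, here is a minimal Python sketch of checking a server response for a given attribute. It assumes the standard RADIUS packet layout (a 20-byte header followed by type/length/value attributes) and is not anything Zeroshell or Linksys ship. The attribute numbers in the example are placeholders only: I am not saying which attribute the Linksys firmware wants.

```python
import struct

ACCESS_ACCEPT = 2  # RADIUS packet code for Access-Accept


def radius_attributes(packet: bytes):
    """Return (code, [(attr_type, value), ...]) for a RADIUS packet."""
    if len(packet) < 20:
        raise ValueError("packet shorter than the 20-byte RADIUS header")
    # Header: code (1 byte), identifier (1), length (2), authenticator (16).
    code, _ident, length = struct.unpack_from("!BBH", packet, 0)
    if length < 20 or length > len(packet):
        raise ValueError("bad length field")
    attrs = []
    pos = 20
    while pos < length:
        if pos + 2 > length:
            raise ValueError("truncated attribute header")
        a_type, a_len = packet[pos], packet[pos + 1]
        # The attribute length counts its own two header bytes.
        if a_len < 2 or pos + a_len > length:
            raise ValueError("bad attribute length")
        attrs.append((a_type, packet[pos + 2:pos + a_len]))
        pos += a_len
    return code, attrs


def has_attribute(packet: bytes, wanted_type: int) -> bool:
    """True if the packet is an Access-Accept carrying the wanted attribute."""
    code, attrs = radius_attributes(packet)
    return code == ACCESS_ACCEPT and any(t == wanted_type for t, _ in attrs)


# Hand-built sample: Access-Accept with one attribute of type 18 (placeholder).
attr = bytes([18, 2 + len(b"ok")]) + b"ok"
sample = struct.pack("!BBH", ACCESS_ACCEPT, 1, 20 + len(attr)) + bytes(16) + attr

print(has_attribute(sample, 18))
print(has_attribute(sample, 6))
```

A client that insists on an attribute the server never sends would fail the second kind of check every time, however correct the password is.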

The fact that Linksys don’t document this anywhere is not Zeroshell’s fault, but the lack of any interface for extending the records beyond what Zeroshell uses for its own applications is a bit of an issue. It means that instead of a Zeroshell box potentially becoming the hub of administration functions, it is in danger of becoming just another little vertical application server that doesn’t integrate.

Having said that, the backend for most (all?) authentication data is LDAP, so a tool like phpLDAPadmin might be usable to extend the base records. But, arguably, I shouldn’t have to do that! It is still beta software though, so improvements and enhancements will be made.

The other area that it’s a bit lacking in is monitoring/graphing. Okay sure, I’d probably integrate Zeroshell into the rest of my Cacti setup, but it would be nice if Zeroshell did like other router distros and had a pre-built statistics/graphing page.

Zeroshell is still my pick (I revisited pfSense and fixed the problem updating, but to me it doesn’t have enough function to justify running its own hardware), but it’s just not quite the bee’s knees it was when I first saw it.

When Upgrades Go Wrong

I’m running Debian on a Linksys NSLU2 storage device, and it works really well in general. So well in fact that a lot of the time I forget the thing is even there! It’s sitting in the garage minding its own business, serving out video and music files, and storing backups of the other systems in the house. Just occasionally, however, the thought pops into my head to run a system update over it — a habit I’ve gotten into for the Gentoo systems in the house, but “the Slug” usually misses out. About a fortnight ago however I decided to do the “apt-get shuffle”. Timing, as they say in sport and comedy, is everything.

I’ve become fairly complacent about system updates. All the distros I use now have got excellent tools for keeping everything up-to-date, and for making sure that things don’t go wrong in the process. It’s all just software, however, and it’s all too easy for something to get missed or for a bug to creep in. One such bug that did exactly that is this one. Unreported at the time I did my update, it rendered my Slug unbootable after the update I gave it.

It took me a day to realise that the Slug was off the network. The failure of the nightly backups was my first clue. Next was the inability to stream any of the media files stored on it. For the next week, on-and-off, I tried a dozen things in an attempt to get it working again. I finally arrived at a process that used the Debian Installer firmware image as a way to get a running system onto the device, allowing me to then access the hard disk and try and reflash earlier kernel and initrd images to it.

I started trying to work on the boot disk, but I couldn’t see it for some reason. Then I discovered that the power supply of the USB2 disk enclosure that holds it was playing up! Now, I had two problems–was one related to the other? Was my boot problem just a hard disk problem all along? Turns out that the power supply failure was a coincidence–replacing the power supply got the disk working again but made no improvement in the bootup scenario.

The NSLU2 boots differently to a PC. On a PC, the BIOS locates some boot code on a storage device and executes that, which usually is a program like LILO or GRUB that has more intelligence and (in the case of GRUB) a way to interact with it. These boot loader programs then load in the kernel and start executing it. With the NSLU2, however, the kernel and the “initial root device” are written into the flash memory of the device–they more-or-less are the BIOS.

On a PC, if there’s a problem with the kernel or initrd you can generally select another one from a list. Worst-case would have you installing the hard disk in a different PC and fixing the problem from there. On an NSLU2, however, a problem with the kernel or initrd can’t be fixed by changing the hard disk, because the kernel and initrd aren’t read from the hard disk but from the flash memory instead. There’s also no option for selecting another kernel, since the NSLU2 is a “headless” device with no console (besides, there’d be no room in the flash memory for two copies of kernel and initrd).

Once I’d been able to get my Slug booting (by writing out a previous version of a kernel and initrd) I was going to leave it alone… but curiosity got the better of me. I’d suspected a bad update to the utility that generates the initrd, and sure enough an “apt-get update && apt-get upgrade” revealed a pending update to the initramfs-tools package. Google led me then to the above bug report. With fingers crossed I did the update, reflashed, and rebooted… successfully!

The Slug is now back in its usual place, quietly going about its business of entertaining us and keeping critical data safe. I might at least think twice before doing a kernel update on the poor beast in future though!
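In the spirit of thinking twice, here is a rough Python sketch of a pre-upgrade check. It assumes the output of a simulated run (“apt-get -s upgrade”) lists each package to be upgraded on a line starting with “Inst”, and it flags anything that would change what gets written to flash. The list of “risky” package patterns is my own assumption for this sketch, and the sample text is made up rather than captured from a real run.

```python
import re

# Packages whose upgrade affects the kernel or initrd on the Slug.
# These patterns are an assumption chosen for this sketch.
RISKY = (re.compile(r"^initramfs-tools$"), re.compile(r"^linux-image-"))


def risky_upgrades(simulated_output: str):
    """Return names of boot-critical packages found in simulated upgrade output."""
    names = []
    for line in simulated_output.splitlines():
        parts = line.split()
        # Only "Inst <package> ..." lines name a package to be upgraded.
        if len(parts) >= 2 and parts[0] == "Inst":
            if any(pattern.search(parts[1]) for pattern in RISKY):
                names.append(parts[1])
    return names


# Illustrative text only, with placeholder version numbers.
sample = """\
Inst initramfs-tools [1.0] (1.1 Debian:testing)
Inst vim [1.0] (1.1 Debian:testing)
Conf initramfs-tools (1.1 Debian:testing)
"""

print(risky_upgrades(sample))
```

If the list comes back non-empty, that would be the cue to keep copies of the currently flashed kernel and initrd somewhere handy before going ahead.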

Laptop hard disk replacement, part one

A couple of weeks ago I had bootup problems with my old Sony laptop. I had replaced the hard disk in it last year (February), and everything was pointing to another busted hard disk. First time I’d had a machine outlive two hard disks! :(

Sure enough, I put a different disk in the laptop and it worked, and the original disk in a USB caddy failed (but only after working successfully a couple of times, leading me to think it was a transient problem and reassemble the laptop, at which point it failed again… sigh).

Through persistence and determination (and a couple of goes in the freezer) I managed to get a copy of the disk onto another drive. I then went shopping, but decided to check the warranty on the dud drive: lo-and-behold, it still had nearly four years of a five year warranty to run. Better yet, unlike the Western Digital I had to send at my own cost to Singapore for replacement, Seagate have an address in Australia that can be used.

Sod it, I said: anything more than the original 80GB is wasted on this particular machine, even though for less than what I paid for the 80GB a year ago I’d now be looking at 160GB or more. So I completed the RMA, found a box to pack the drive in, and sent it off.

The address in Australia is a mail forwarder to Seagate in Singapore. I had to keep that in mind when I checked their order status page, which a week later was still showing “awaiting your return”. Nevertheless, it wasn’t long before the page changed to “shipped”. Looking a bit closer I could see that my 80GB drive must have put on a bit of weight on the way to its birthplace, as Seagate was sending me a 100GB drive in return!

Having left Singapore last Thursday the drive arrived on Monday, but due to work commitments (plus having to fix the Slug first) I wasn’t able to do anything with it until today. Stay tuned for the recovery exercise…