Stratix 8000 IOS recovery (or how dd saved the day)
Oct.15, 2010, under Networks, Sysadmin
I’ve been playing with some Stratix 8000 switches lately. If you’ve never come across them, they are built for heavy-duty environments and are the result of a collaboration between Rockwell and Cisco. They run Cisco IOS, so if you’ve used a Cisco Catalyst switch you’ll be in familiar territory. During my work with them I somehow ended up with a corrupt IOS image following an upgrade, and the switch would no longer boot, giving the console error message below:
...
mifs[7]: 684 files, 26 directories
mifs[7]: Total bytes : 64094208
mifs[7]: Bytes used : 11614208
mifs[7]: Bytes available : 52480000
mifs[7]: mifs fsck took 60 seconds.
...done Initializing Flash.
done.
Loading "flash:/ies-lanbase-mz.122-50.SE2/ies-lanbase-mz.122-50.SE2.bin"...flash:/ies-lanbase-mz.122-50.SE2/ies-lanbase-mz.122-50.SE2.bin: magic number mismatch: bad mzip file
Error loading "flash:/ies-lanbase-mz.122-50.SE2/ies-lanbase-mz.122-50.SE2.bin"
Interrupt within 5 seconds to abort boot process.
Boot process failed...
The system is unable to boot automatically. The BOOT environment variable needs to be set to a bootable image.
No problem, I thought, I’ll just use ROMMON and suck down a clean image from a TFTP server. How wrong I was! These switches don’t have ROMMON. Instead they have their own boot OS, which bizarrely doesn’t seem to support any kind of networking whatsoever. It can format the filesystem and do basic file operations, but that’s it as far as I can tell. You quickly find yourself stuck with no way to upload an image, and the scant documentation unhelpfully suggests that you reset your switch to factory defaults. If you follow this advice you now have a switch with no config and still a corrupt IOS. There doesn’t appear to be any documentation at all about the strange little OS you find yourself stuck in, so it’s time to experiment.
Plugging the flash card into my Windows machine showed that it wasn’t formatted in a way that Windows could read, so you can’t copy an image that way. Formatting it as FAT resulted in a strange situation where both Windows and the switch could write to the flash card, but neither could see the other’s files. Unfortunately I didn’t have easy access to a Linux machine to see if the card was readable there, so I needed another way to get the right data onto it. I did have other working Stratixes, so I had the idea of cloning a working flash card. You can’t do this natively in Windows, so I had to find a Windows version of the well-known Linux tool, dd.
dd is a very low-level tool that copies data block by block. It doesn’t see files or folders or even disk formats, it just sees the raw bits. The plan was to make an image of a working flash card, and then dump that onto the failed one. In theory you should end up with a perfect clone, and this way Windows doesn’t need to be able to read the disk format. I used the tool as follows, first listing the available drives, then making an image of a good flash card, and finally writing that image onto the corrupt one:
D:\Programs\dd>dd --list
rawwrite dd for windows version 0.6beta3.
Written by John Newbigin <jn@it.swin.edu.au>
This program is covered by terms of the GPL Version 2.
Win32 Available Volume Information
[snip]
\\.\Volume{43371b24-d6a0-11df-b040-005056c00008}\
link to \\?\Device\HarddiskVolume10
removeable media
Mounted on \\.\l:
[snip]
D:\Programs\dd>dd if=\\.\l: of=stratix.img
rawwrite dd for windows version 0.6beta3.
Written by John Newbigin <jn@it.swin.edu.au>
This program is covered by terms of the GPL Version 2.
125440+0 records in
125440+0 records out
D:\Programs\dd>dd if=stratix.img of=\\.\l:
rawwrite dd for windows version 0.6beta3.
Written by John Newbigin <jn@it.swin.edu.au>
This program is covered by terms of the GPL Version 2.
125440+0 records in
125440+0 records out
D:\Programs\dd>
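For anyone wondering what dd is actually doing in those two copy steps, here is a minimal Python sketch of the same idea: read a source in fixed-size blocks, write each block out unchanged, and then compare checksums to confirm the clone matches. The names `copy_blocks`, `sha256_of` and `BLOCK_SIZE` are my own for illustration, and the 512-byte block size is an assumption. I have only had this in mind for ordinary image files; pointing it at a raw device path such as `\\.\l:` is untested and would likely need administrator rights.

```python
import hashlib

# Assumed block size for this sketch; the filesystem is never interpreted,
# so any size works as long as every byte is copied.
BLOCK_SIZE = 512


def copy_blocks(src_path, dst_path, block_size=BLOCK_SIZE):
    """Copy src to dst one block at a time and return the number of blocks read."""
    blocks = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(block_size)
            if not chunk:
                break  # end of the source reached
            dst.write(chunk)  # written as-is: no files, folders or formats involved
            blocks += 1
    return blocks


def sha256_of(path, block_size=BLOCK_SIZE):
    """Return the SHA-256 hex digest of a file, read block by block."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(block_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `copy_blocks("stratix.img", "clone.img")` and then checking that `sha256_of("stratix.img") == sha256_of("clone.img")` would be one way to convince yourself that a copy is bit-for-bit identical before trusting it in a switch.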
Happily it worked flawlessly: the cloned flash card contained an exact copy of the working IOS and I was able to get my switch working again. I’d love to know the manufacturer’s recommended restore method but, as is often the case, the documentation is lacking.