Sysadmin
Stratix 8000 IOS recovery (or how dd saved the day)
by rxtx on Oct.15, 2010, under Networks, Sysadmin
I’ve been playing with some Stratix 8000 switches lately. If you’ve never come across them, they are built for heavy-duty environments and are the result of a collaboration between Rockwell and Cisco. They run Cisco IOS, so if you’ve used a Catalyst switch you’ll be in familiar territory. During my work with them I somehow ended up with a corrupt IOS image following an upgrade, and the switch would no longer boot, giving the console error message below:
...
mifs[7]: 684 files, 26 directories
mifs[7]: Total bytes : 64094208
mifs[7]: Bytes used : 11614208
mifs[7]: Bytes available : 52480000
mifs[7]: mifs fsck took 60 seconds.
...done Initializing Flash.
done.
Loading "flash:/ies-lanbase-mz.122-50.SE2/ies-lanbase-mz.122-50.SE2.bin"...flash:/ies-lanbase-mz.122-50.SE2/ies-lanbase-mz.122-50.SE2.bin: magic number mismatch: bad mzip file
Error loading "flash:/ies-lanbase-mz.122-50.SE2/ies-lanbase-mz.122-50.SE2.bin"
Interrupt within 5 seconds to abort boot process.
Boot process failed...
The system is unable to boot automatically. The BOOT environment variable needs to be set to a bootable image.
No problem, I thought, I’ll just use ROMMON and suck down a clean image from a TFTP server. How wrong I was! These switches don’t have ROMMON; instead they have their own boot OS, which bizarrely doesn’t seem to support any kind of networking whatsoever. It can format the filesystem and do basic file operations, but that’s it as far as I can tell. You quickly find yourself stuck with no way to upload an image, and the scant documentation unhelpfully suggests that you reset your switch to factory defaults. If you follow this advice you now have a switch with no config and still a corrupt IOS. There doesn’t appear to be any documentation at all about the strange little OS you find yourself stuck in, so it’s time to experiment.
Plugging the flash card into my Windows machine showed that it wasn’t formatted in a way that Windows could read, so you can’t copy an image that way. Formatting it as FAT resulted in a strange situation where both Windows and the switch could write to the flash card, but neither could see the other’s files. Unfortunately I didn’t have easy access to a Linux machine to see if the card was readable there, so I needed another way to get the right data onto it. I did have other working Stratixes, so I had the idea of cloning a working flash card. You can’t do this natively in Windows, so I had to find a Windows version of the well-known Linux tool, dd.
dd is a very low level tool that copies data at a block level. It doesn’t see files or folders or even disk formats, it just sees the raw bits. The plan was to make an image of a working flash card, and then dump that onto the failed one. In theory you should end up with a perfect clone, and this way Windows doesn’t need to be able to read the disk format. I used the tool as follows, first listing the available drives, second making an image of a good flash card and finally writing that image onto the corrupt one:
D:\Programs\dd>dd --list
rawwrite dd for windows version 0.6beta3.
Written by John Newbigin <jn@it.swin.edu.au>
This program is covered by terms of the GPL Version 2.
Win32 Available Volume Information
[snip]
\\.\Volume{43371b24-d6a0-11df-b040-005056c00008}\
link to \\?\Device\HarddiskVolume10
removeable media
Mounted on \\.\l:
[snip]
D:\Programs\dd>dd if=\\.\l: of=stratix.img
rawwrite dd for windows version 0.6beta3.
Written by John Newbigin <jn@it.swin.edu.au>
This program is covered by terms of the GPL Version 2.
125440+0 records in
125440+0 records out
D:\Programs\dd>dd if=stratix.img of=\\.\l:
rawwrite dd for windows version 0.6beta3.
Written by John Newbigin <jn@it.swin.edu.au>
This program is covered by terms of the GPL Version 2.
125440+0 records in
125440+0 records out
D:\Programs\dd>
Happily it worked flawlessly: the cloned flash card contained an exact copy of the working IOS and I was able to get my switch working again. I’d love to know the manufacturer’s recommended restore method, but as is often the case the documentation is lacking.
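If you’re wondering what dd is actually doing there, here is a minimal Python sketch of the same idea: read fixed-size blocks from a source, write them out unchanged, and count the records. It is not how dd for Windows is implemented, and the 512-byte block size is an assumption based on dd’s usual default. If that holds, the 125440 records above would come to 125440 × 512 = 64,225,280 bytes, which fits a 64 MB card. The file names are made up, and the sketch works on ordinary files rather than raw devices.

```python
import hashlib
import os
import tempfile

BLOCK_SIZE = 512  # assumption: dd's conventional default block size


def block_copy(src_path, dst_path, block_size=BLOCK_SIZE):
    """Copy src to dst one block at a time.

    Returns (full_records, partial_records, sha256_hex_of_data_copied).
    """
    full = partial = 0
    digest = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            dst.write(block)
            digest.update(block)
            if len(block) == block_size:
                full += 1
            else:
                partial += 1
    return full, partial, digest.hexdigest()


def sha256_of(path, block_size=BLOCK_SIZE):
    """Hash a file block by block so large images never sit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "good_card.img")  # hypothetical stand-in for the good card
        dst = os.path.join(tmp, "clone.img")      # hypothetical stand-in for the clone
        with open(src, "wb") as f:
            f.write(os.urandom(10 * BLOCK_SIZE + 100))
        full, partial, checksum = block_copy(src, dst)
        print(f"{full}+{partial} records in")
        print("clone matches:", checksum == sha256_of(dst))
```

Comparing a checksum of the image against a checksum read back from the freshly written card would be a cheap sanity check before putting the card back in the switch.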
Netbackup tape inventory
by rxtx on May.05, 2010, under Sysadmin
One of the strange things about the NetBackup Windows GUI is that there’s no way to manually inventory a standalone tape drive. To do this you need a bit of command line knowledge. You’ll mainly need to do this to import media from other servers, or media which has been used previously in different backup software. For a standalone drive the command is as follows, but you can use this on libraries too.
(Standalone drive inventory)
vmphyinv {-n drive_name | -u device_number} [-h device_host]
[-non_interactive] [-verbose]
C:\Program Files\VERITAS\Volmgr\bin>vmphyinv.exe -u 2 -h tapesvr
Proposed Change(s) to Update the Volume Configuration
=====================================================
Logically add new media BE????.
Logically update EMM database, if required.
Update volume configuration? (y/n) n: y
Added new media BE0000 on host tapesvr.
Added media ID BE0000 to EMM database.
C:\Program Files\VERITAS\Volmgr\bin>
You get the device number from the activity monitor->drives screen. Once you run this command NetBackup will start to read the images on the tape; you can see this on the catalog->results screen. Once this is complete the media will appear on the catalog->search screen, ready for the phase 2 import.
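If you need to do this for more than a couple of drives, it can be handy to build the command from a script. Here is a rough Python sketch that assembles the argument list from the usage line above and nothing more. The function name and its parameters are my own invention, and it only uses the switches shown in the syntax (-n, -u, -h, -non_interactive, -verbose).

```python
def build_vmphyinv_args(drive_name=None, device_number=None, device_host=None,
                        non_interactive=False, verbose=False,
                        executable="vmphyinv"):
    """Build the argument list for a standalone drive inventory.

    Exactly one of drive_name (-n) or device_number (-u) must be given,
    mirroring the {-n drive_name | -u device_number} part of the usage line.
    """
    if (drive_name is None) == (device_number is None):
        raise ValueError("give exactly one of drive_name or device_number")
    args = [executable]
    if drive_name is not None:
        args += ["-n", drive_name]
    else:
        args += ["-u", str(device_number)]
    if device_host is not None:
        args += ["-h", device_host]
    if non_interactive:
        args.append("-non_interactive")
    if verbose:
        args.append("-verbose")
    return args


if __name__ == "__main__":
    # Same invocation as the session above: device 2 on host tapesvr.
    print(" ".join(build_vmphyinv_args(device_number=2, device_host="tapesvr")))
```

The list can be handed to subprocess.run() on a machine where vmphyinv is on the PATH; the sketch only prints it, so it is safe to try anywhere.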
Windows 7 command line USB partitions
by rxtx on Mar.19, 2010, under Sysadmin
There is a very annoying issue in Windows, in that it doesn’t let you have more than one partition on a USB drive. There was a workaround for this in XP, but I haven’t been able to get it working in the newer versions. In addition if you have a multi partition USB device and try to use Windows to format it via disk management, you will run into more difficulties where it can only manage the first partition. I can’t help with the first problem, but here is how you solve the second.
Building a 2008 R2 template VM
by rxtx on Mar.12, 2010, under Sysadmin
Building a template is something you don’t do very often – you tend to do it once and then forget about it. Today I had to make a Windows 2008 R2 template VM for VMware, but luckily I found a handy guide which had done all the hard work for me.
If you use it yourself it’s worth looking at each setting and asking if it applies to your environment. Some of the settings will probably be set by your GPOs anyway once you add machines deployed from it to your domain; other bits are just slightly anal and unnecessary.
As a side note, 2008 R2 only supports 64-bit processors, so make sure your environment is capable of this before you proceed.
Running a command on every machine in the domain
by rxtx on Feb.24, 2010, under Security, Sysadmin
This post on pauldotcom is a handy way of running a command line instruction on every machine in the domain. Ideally you’d use group policy for this kind of thing, but it’s still useful to know.
Resetting your DRAC
by rxtx on Feb.24, 2010, under Sysadmin
I had an issue today with a Dell remote access card (DRAC). This is a card which you get in Dell servers, and it is used to perform remote management. In some situations it can be better than other remote access methods, since it gives you access to the console from boot (so you can view any BIOS messages) and can be used to power on the server remotely. At least that’s the idea. In this particular case the card was running very slowly and the remote power on functionality wasn’t working. This isn’t great when you’ve just turned off a server which you don’t have physical access to. Luckily we can solve this by SSHing onto the DRAC and running a reset command. There are actually quite a lot of things we can do from the SSH interface:
login as: root
root@192.168.100.12's password:
Dell Remote Access Controller 5 (DRAC 5)
Firmware Version 1.40 (Build 08.08.22)
$ racadm help
help [subcommand] -- display usage summary for a subcommand
arp -- display the networking ARP table
clearasrscreen -- clear the last ASR (crash) screen
clrraclog -- clear the RAC log
clrsel -- clear the System Event Log (SEL)
config -- modify RAC configuration properties
coredump -- display the last RAC coredump
coredumpdelete -- delete the last RAC coredump
fwupdate -- update the RAC firmware
getconfig -- display RAC configuration properties
getniccfg -- display current network settings
getraclog -- display the RAC log
getractime -- display the current RAC time
getsel -- display records from the System Event Log (SEL)
getssninfo -- display session information
getsvctag -- display service tag information
getsysinfo -- display general RAC and system information
gettracelog -- display the RAC diagnostic trace log
ifconfig -- display network interface information
netstat -- display routing table and network statistics
ping -- send ICMP echo packets on the network
racdump -- display RAC diagnostic information
racreset -- perform a RAC reset operation
racresetcfg -- restore the RAC configuration to factory defaults
serveraction -- perform system power management operations
setniccfg -- modify network configuration properties
sslcertview -- view SSL certificate information
sslcsrgen -- generate a certificate CSR from the RAC
testemail -- test RAC e-mail notifications
testtrap -- test RAC SNMP trap notifications
version -- display the version info of RACADM
vmdisconnect -- disconnect virtual media connections
vmkey -- perform virtual media key operations
usercertview -- view user certificate information
$
To reset the DRAC, we need the racreset subcommand (racadm racreset). This will re-initialise the DRAC, and after a minute or so everything should be working again.
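If you look after a lot of these cards, the reset-and-wait routine can be scripted. The Python sketch below is only an outline under a few assumptions: that the DRAC accepts a command passed on the ssh command line (typing racadm racreset at the $ prompt, as in the session above, is the interactive equivalent), and that the SSH port answering again is a good enough sign that the card is back. The function names are made up for the example.

```python
import socket
import time


def racreset_command(drac_host, user="root"):
    """Return an ssh invocation that runs the racreset subcommand on the DRAC."""
    return ["ssh", f"{user}@{drac_host}", "racadm", "racreset"]


def port_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until_back(host, attempts=20, delay=10.0, check=port_open, sleep=time.sleep):
    """Poll until check(host) succeeds.

    Returns the attempt number that succeeded, or None if none did.
    """
    for attempt in range(1, attempts + 1):
        if check(host):
            return attempt
        sleep(delay)
    return None


if __name__ == "__main__":
    # Only prints the command; nothing is sent to the DRAC.
    print(" ".join(racreset_command("192.168.100.12")))
```

The check and sleep arguments are there so the polling loop can be tried out without a real DRAC or real waiting.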
CCIE count drops again
by rxtx on Feb.09, 2010, under Sysadmin
Each month Cisco publish the worldwide CCIE count, which shows how many people have gained certifications over the last month. However it is possible to do a little maths and get a fuller picture – in this case that the numbers are dropping.
The CCIE consists of two parts: the first is a written exam which tests basic knowledge, and after that you do a day-long lab exam. The lab exam is considered to be the harder of the two, with most people requiring multiple attempts. Once you have the certification, you just need to pass the written exam every few years to keep it.
Without any input from those who didn’t recertify it’s hard to work out why they didn’t bother. Change in job role could account for some, but it seems unlikely that this would account for the full 61. Is it just that now there are more people with it, the CCIE is less highly regarded?
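The arithmetic itself is simple. One plausible way to do it, sketched in Python, assumes you know the total count for two consecutive months and how many new certifications were issued in between: anyone not explained by the net change must have dropped off. The function name is mine and the figures in the example are invented purely for illustration, they are not Cisco’s published numbers.

```python
def lapsed_count(previous_total, current_total, newly_certified):
    """Estimate how many people dropped off the count between two months.

    lapsed = newly_certified - (current_total - previous_total)
    """
    net_change = current_total - previous_total
    return newly_certified - net_change


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    print(lapsed_count(previous_total=1000, current_total=990, newly_certified=25))
```

With those made-up figures the total fell by 10 even though 25 people passed, so 35 people must have lapsed.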
CCNP track updated
by rxtx on Jan.27, 2010, under Sysadmin
Every now and then Cisco update their exam tracks, and this time it’s the CCNP’s turn. Personally, I think the CCNP is hands down the most useful Cisco qualification to have if you work with WAN and LAN networks on a regular basis. The CCNA is too basic to be of much practical use, and the CCIE is great if you do networks full time, but today people tend to expect you to know more than one area.
If you are unfamiliar with the CCNP, the previous track consisted of four exams which can be briefly summed up as follows: BSCI (routing), BCMSN (switching), ONT (QoS + wireless), and ISCW (everything else – VPNs, DSL, MPLS, security). The new track is three exams.
The changes are very interesting. I always saw the core of this track as being routing and switching, and Cisco seem to be acknowledging that with the first two exams, ROUTE and SWITCH. If you delve a bit deeper into the actual exam topics you can see that they’ve cut out a lot of the content which isn’t routing or switching. ROUTE looks to be basically the BSCI exam, with a very small coverage of the VPN and DSL topics from ISCW. SWITCH is the BCMSN with a bit of security. The third exam is TSHOOT, which aligns with the new CCIE track by adding a dedicated troubleshooting element.
Personally I’m 50/50 about the changes. Cisco seem to be trying to make each track very specific with no overlaps (the current CCNP has some overlap with the CCVP, CCSP and CCIP), and while I can see why they would want to do this I think it will produce less rounded engineers at the end of it. If you do the current CCNP you come out of it knowing a lot about routing and switching, and enough about everything else that you can work out most issues after a little research. It’s kind of the jack of all trades qualification, which you might expect based on the acronym. With the changes it is turning more into the CCR&SP. However, what I do like is the inclusion of the troubleshooting section, since just setting equipment up in the first place is only the start of your job; you then have to go and support it.
Luckily I got my CCNP just last year so I’m not affected by the changes, but candidates who are halfway through theirs can either continue with the current track (until July), or substitute BSCI and BCMSN exams they have already completed for ones on the new track. More info on this here
New Microsoft security technologies
by rxtx on Jan.26, 2010, under Sysadmin
We’ve just had a visit from some Microsoft guys who were going over their new offerings, and on paper it looks very impressive. They seem to be moving to fill in all the holes which previously required 3rd party applications, and it all integrates nicely with existing MS infrastructures.
One of the most interesting things is that they have finally come up with their own AV solution, which uses multiple existing engines plus one of their own. I’m also pretty happy that there is finally an IPS solution (built into TMG, which is roughly the replacement for ISA). They are also jumping on the ‘cloud’ bandwagon and providing outsourced Exchange spam filtering and mail archiving.
All this stuff is either out now or coming out pretty soon, so it will be interesting to see if it holds up to competition once it gets in the wild.
Apcupsd PowerChute in Linux
by rxtx on Jan.23, 2010, under Sysadmin
If you have ever had to install a PowerChute agent on a Linux machine, you have probably come across the problem that a Java-based GUI is required to configure it. Many Linux servers don’t have a GUI, and you may not want to have to worry about setting up Java. If this is the case, we can turn to apcupsd to help with our problem. The example commands here work on RHEL5; if you have a different distribution your files may be in different places.