Friday, July 17, 2009

atreus service restored

atreus is now connected and running again. At some point soon I may need to reboot it to confirm that it will come back up more smoothly after any future power cycle, but I will announce that in the usual way.

atreus status

The data centre that housed atreus was hit by a power failure last night. atreus has since been moved to another data centre run by the same ISP (big thanks to Jon Morby and the fido.net guys for their work here).

The primary kernel image is not functional, so atreus is being brought up on the secondary image, which apparently still works fine according to the people on the ground. It isn't yet correctly plugged back into the network; that should happen in the next few minutes, at which point I can log in, figure out what's going on, and fix the kernel boot problem. Even after a power cycle, with hardware RAID and a reasonably resilient filesystem, we shouldn't have any further problems as a result of this outage.

Thursday, July 16, 2009

Current downtime

Some time between about 19:45 and 21:15 BST today (16th July), atreus became unreachable, apparently from everywhere. At this stage, given what I can see from the network, I think a kernel crash is the most likely cause, although there are other possibilities.

There isn't much I can do right now, but I'll investigate this further tomorrow morning.