Sunday, August 16, 2009

Service restored: treat with caution

atreus is running again, but an attempt to upgrade the kernel failed for reasons as yet undetermined. (I'm finding that about 50% of kernel upgrade attempts fail at the moment, and I really miss the days of Linux 2.2.)

Logs from the problem period suggest that the outage was caused by memory exhaustion; I believe this then simply made the system grind to a halt in the usual fashion. It doesn't appear to have been an external attack, which means it might happen again. I'll monitor things over the next few days, and if I'm really lucky and have been a good boy, Linux will give us a new stable kernel that can boot on my hardware.

Update

atreus had to be hard-rebooted (possibly because of a SYN flood attack, although I'm unconvinced); it was completely unresponsive at the terminal. It then took most of two hours to verify the RAID set, and hopefully it will now be able to finish booting. With luck we'll be back up less than 24 hours after we went down.

atreus status

atreus is currently unavailable on all services; I'm waiting for an engineer from my ISP to reach the site so we can investigate.