• John Walker posted an update in the group Updates 4 months, 3 weeks ago

    2020 July 10

    Around 14:56 UTC, the site stopped responding to any requests
    from the outside, even ICMP ECHO (ping).  The site thus appeared
    to be hard down, like the crash we experienced on 2020-06-30.  I
    even logged in to Fourmilab, which is in the same AWS data
    centre, and was unable to ping RB from there.  While I was preparing
    to bring up the AWS EC2 console and try rebooting the instance,
    it "got better" and everything returned to normal.  The
    WordPress dashboard showed nothing out of the ordinary and the
    site responded normally.  I checked /var/log/messages around the
    incident and it was completely nominal.

    My only guess is that some Dilbert (or Fritz) in the Frankfurt
    AWS data centre was fiddling with cables and briefly
    disconnected the server running our site from their internal
    backbone.  There were no persistent consequences from the
    outage, which lasted less than two minutes.  I ran a Garback
    so that recent firewall additions would not be lost if something
    further happened.  I guess you can say the site was briefly
    "on the Fritz".