Site Downtime (May 2016)

Kison
.GIFted
Posts: 6714
Joined: January 22, 2007

Post #0 (isolation #0) » Tue May 17, 2016 1:29 am

Post by Kison »

Here's a brief recap of what I know:

Mafiascum's server went down unexpectedly on Saturday. Mith may have more details on why. All I know is that the server came back up in the early AM today. Unfortunately, it seems to have been shut down abruptly, which corrupted a lot of the database tables. That is why the board was only partially available until a few minutes ago. We had to run some repair operations, which took a fair amount of time. Those are now complete, so we should (famous last words) be up and running without any data loss.

Questions? Let us know here.

Post #35 (isolation #1) » Tue May 17, 2016 12:24 pm

Post by Kison »

In post 7, Kublai Khan wrote: Are there any truths to the reports of catnip being found in the server rooms?
Tiger nip.

In post 10, inte wrote: shutup nerds i expect to be compensated for this downtime
I've got like a piece of gum. Want it?

In post 15, BNL wrote: Also apparently there was a temporary forum where MS people went to when the site was down, and I was uninformed about it? :(
Yeah, I think the biggest lesson here is that we're just not well prepared for a complete server outage like the one we witnessed:
  • There was a delay in notifying the host of the issue because mith wasn't aware of it and my method of contacting him wasn't ideal. We've resolved that.
  • The fallout shelter had registration shut off, so many people couldn't get in.
  • Not many people know about the fallout shelter to begin with.
We'll want to resolve these issues so that a similar event is less disruptive. My thoughts:

- Fix the spam issues in the fallout shelter & make sure registrations are always open. Even better, figure out a way to sync the primary userbase with the fallout userbase on a nightly basis.
- Point an easy to remember subdomain to the fallout: fallout.mafiascum.net or something similar.
- Better communication via social media.
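
For anyone curious what the nightly userbase sync could look like, here is a minimal sketch. Everything in it is an assumption for illustration: the `users` table, its columns, and the use of SQLite are hypothetical stand-ins, and the real boards almost certainly use a different schema and database. The idea is only to copy accounts that exist on the primary board but are missing from the fallout board, leaving existing fallout accounts untouched.

```python
import sqlite3

# Hypothetical schema; the real forum tables will differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    username TEXT PRIMARY KEY,
    email TEXT NOT NULL,
    password_hash TEXT NOT NULL
)
"""


def sync_users(primary, fallout):
    """Copy accounts missing from the fallout board; return how many were added."""
    rows = primary.execute(
        "SELECT username, email, password_hash FROM users"
    ).fetchall()
    before = fallout.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    # INSERT OR IGNORE skips usernames the fallout board already has,
    # relying on the PRIMARY KEY constraint on username.
    fallout.executemany(
        "INSERT OR IGNORE INTO users (username, email, password_hash) "
        "VALUES (?, ?, ?)",
        rows,
    )
    fallout.commit()
    after = fallout.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    return after - before


if __name__ == "__main__":
    # In-memory databases stand in for the two boards.
    primary = sqlite3.connect(":memory:")
    fallout = sqlite3.connect(":memory:")
    primary.execute(SCHEMA)
    fallout.execute(SCHEMA)
    primary.execute(
        "INSERT INTO users VALUES (?, ?, ?)",
        ("example_user", "user@example.com", "not-a-real-hash"),
    )
    primary.commit()
    print("accounts added:", sync_users(primary, fallout))
```

A real version would be run from a nightly scheduled job and would need to decide what to do about password changes and banned accounts, which this sketch ignores.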

Post #37 (isolation #2) » Tue May 17, 2016 1:05 pm

Post by Kison »

It would work no problem. DNS isn't impacted by our server going down, and you can point two subdomains to separate locations.
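
As a rough illustration of two subdomains pointing to separate locations, the sketch below resolves each hostname on its own and checks that they land on different addresses. The hostnames are assumptions (`fallout.mafiascum.net` is only the name floated above, and `www.mafiascum.net` is used as a stand-in for the main site), and the `resolver` parameter is a hypothetical hook added so the check can be tried without touching the network.

```python
import socket

# Illustrative hostnames only; the fallout subdomain does not necessarily exist.
HOSTNAMES = ["www.mafiascum.net", "fallout.mafiascum.net"]


def resolve_all(hostnames, resolver=socket.gethostbyname):
    """Resolve each hostname independently; one failure does not affect the rest."""
    results = {}
    for name in hostnames:
        try:
            results[name] = resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = None
    return results


def point_to_separate_hosts(results):
    """True when every name resolved and no two names share an address."""
    addresses = list(results.values())
    return None not in addresses and len(set(addresses)) == len(addresses)
```

Calling `resolve_all(HOSTNAMES)` with the default resolver performs real DNS lookups. Because the lookup is answered by the DNS provider rather than the forum server, the fallout name would keep resolving even while the main machine is down, provided DNS is hosted separately from it.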

Post #43 (isolation #3) » Tue May 17, 2016 4:01 pm

Post by Kison »

Well, even if we do everything I said in post 35 we could still have another crash like this; it would just help minimize the pandemonium. Under our current hosting situation it's not really feasible to prevent crashes entirely (though we've done pretty well so far). Higher fault tolerance would be a lot more achievable if we were hosted somewhere like AWS, since we could spin up and shut down machines as needed and more easily create redundancy. If one of your machines crashes you can just start it back up yourself, or kill it completely and set up a new one without having to wait on anyone. I just need to get around to pricing it out to see if it's something we can afford.
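
To make the "start it back up yourself" idea concrete, here is a minimal watchdog sketch. It is not tied to any hosting provider: `is_healthy` and `restart` are hypothetical callables supplied by the caller, and on a cloud host `restart` would wrap whatever API the provider offers for rebooting or replacing a machine (not shown here). The threshold of three consecutive failures is an arbitrary assumption.

```python
def watchdog_step(is_healthy, restart, failures, max_failures=3):
    """Run one health check and return the updated consecutive-failure count.

    `restart` is called once the count reaches `max_failures`, after which
    the count resets to zero.
    """
    if is_healthy():
        return 0
    failures += 1
    if failures >= max_failures:
        restart()
        return 0
    return failures
```

A scheduler would call `watchdog_step` every minute or so, carrying the returned count into the next call. Requiring several failures in a row avoids restarting the machine over a single slow response.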
