Hacker News
BitBucket down with Guru Meditation notice (bitbucket.org)
24 points by CoreDev 5015 days ago
3 comments

Maybe you should wait until they are down longer than 15 minutes to make a big deal about it?
They've been down for significantly longer for myself and several other people on #bitbucket on Freenode.
They've been down for at least two hours now.
Seems to be up now, anyhow.

For those who didn't get the reference, and failed to see it "live" since the site is working: http://en.wikipedia.org/wiki/Guru_Meditation.

Basically, it's an Amiga retro thing. Which is awesome.

The screenshot of the error page before: http://s7.directupload.net/images/120919/69zd4jog.png
Still down for me (Italy)
I can confirm that for Germany too. Maybe only Europeans are affected?
Still down. India.
and? (nice to know they use varnish though)
The fact that there doesn't seem to have been any formal response from bitbucket/atlassian is quite disappointing to me. I can live with a few hours of downtime, but the lack of communication violates my trust in the company. Frankly, the "All systems up." message on http://status.bitbucket.org/ feels like a slap in the face.
You're right. We can do better. Here are the full details which are also posted on our blog (http://blog.bitbucket.org/2012/09/19/post-mortem-on-our-avai...)

Earlier today at 2am San Francisco time Bitbucket experienced about three hours of 500 error page responses for users attempting to access the user newsfeed and repository overview pages. The outage was caused by a kernel panic on our Redis server, which is responsible for pages that display recent events related to a user. We are very sorry for the inconvenience this outage has caused.

After we rebooted the Redis server, we found that the index Redis uses to serve the newsfeed content was corrupt, which caused certain pages on Bitbucket to fail. For users accessing pages deeper into the site, such as pull requests, commit views, wikis and issues, the site continued to work as expected. During this time, Git and Mercurial access continued to work over both HTTP and SSH. After identifying the cause of the problem, we turned off the newsfeed for all of Bitbucket, bringing an end to the 500 errors.

With the newsfeed temporarily disabled, we began investigating the corruption problem and discovered a forum post (https://groups.google.com/forum/?fromgroups=#!topic/redis-db...) with instructions and a repair tool to fix the corrupted index. We then used the instructions to repair the index and restore full service to Bitbucket.

During this outage we have identified areas for improvement and are implementing changes to the way we manage the operations of Bitbucket:

1. Improve our escalation procedures so that the response times are faster during non-office hours
2. Update the Bitbucket codebase so we do not have the dashboard and repo overview fail when Redis becomes unavailable
3. Increase the number of tests that status.bitbucket.org performs triggering our automatic phone alert system
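
As a rough illustration of the second item, here is a minimal Python sketch of degrading gracefully when the newsfeed store is unreachable. It is not Bitbucket's actual code: `FeedUnavailable`, `load_newsfeed`, `render_dashboard`, and the `fetch` callable are all hypothetical names, and the sketch assumes the store client raises an ordinary exception when it is down.

```python
from typing import Callable, List


class FeedUnavailable(Exception):
    """Hypothetical error for when the newsfeed store cannot be reached."""


def load_newsfeed(fetch: Callable[[str], List[str]], user: str) -> List[str]:
    """Return the user's recent events, or an empty list if the store is down."""
    try:
        return fetch(user)
    except (FeedUnavailable, ConnectionError, TimeoutError):
        # Degrade gracefully: the page renders without the newsfeed
        # instead of the error bubbling up as a 500 response.
        return []


def render_dashboard(fetch: Callable[[str], List[str]], user: str) -> str:
    """Build a plain-text dashboard that still renders when the feed is empty."""
    events = load_newsfeed(fetch, user)
    lines = [f"Dashboard for {user}"]
    if events:
        lines.extend(f"- {event}" for event in events)
    else:
        lines.append("(newsfeed temporarily unavailable)")
    return "\n".join(lines)
```

With a `fetch` that raises `ConnectionError`, `render_dashboard` still returns a page, just with the placeholder line in place of the events.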

We are very sorry for the inconvenience this outage has caused.

Cheers,
Justen
Bitbucket product manager

kudos for the "lesson learned" part of the story
The git backend is still up. Can commit to a private repo. Edit: Actually I can browse repos on the site and can view my account.

It just seems to be the home page. Everything else I've tested is working for me.

The git backend is also working for me (Italy)