MetaChat

13 September 2006

Why is Metafilter down every morning? Is it post-partum depression?
Or did my IT department finally block it?
posted by thirteenkiller 13 September | 10:33
Is it post-partum depression?

The JRuns.
posted by trondant 13 September | 10:34
Looks like a bad day for downtime everywhere.

My blog is out, as well as my (Dream)host.

Even Slashdot is giving a 503 error.
posted by grabbingsand 13 September | 10:38
It's been down the entire UK working day. Does this mean that, since no adverts have been served today, Matt's just lost 1/365th of his salary? That sucks.
posted by blag 13 September | 11:01
Matt said a while back that adding new sections (like music. and jobs.) sometimes causes inexplicable server crashes, and that they usually "organically" work themselves out.

Although it seems to me like it's been getting worse recently, not better. Yesterday and today were the first times it's stretched into the US morning...
posted by muddgirl 13 September | 11:05
I think he's trying out MetaFilterPM™
posted by chococat 13 September | 11:11
Matt said a while back that adding new sections (like music. and jobs.) sometimes causes inexplicable server crashes, and that they usually "organically" work themselves out.


That makes absolutely no sense. Not the "inexplicable" and certainly not the "organically".
posted by orthogonality 13 September | 11:11
Don't look at me, ortho. I didn't say it.
posted by muddgirl 13 September | 11:16
status.metafilter.com says, "There was a hardware failure, hopefully coming back online later today."

metafilter.ishavingamassage.com says, "metafilter is getting a massage and will return soon."
posted by thatwhichfalls 13 September | 11:23
I wondered what organic products Matt was smoking when he had that explanation.

I'm beginning not to miss it though.
posted by birdherder 13 September | 11:23
Well, 'organically' might just mean that he, as an organic entity, does troubleshooting when problems arise.

I think you probably take Matt a little tooooooo literally orthogonality.

[Is it just me or is metachat flowing like treacle too?]
posted by peacay 13 September | 11:29
peacay said: [Is it just me or is metachat flowing like treacle too?]
Yes. Also, are you on weboggle sometimes? I've seen someone with this username there.
posted by frecklefaerie 13 September | 11:38
weboggle

What on earth is that?
posted by peacay 13 September | 11:59
That "Sometimes these things just work themselves out organically" line: I'd go along with that. I think computers are so complicated these days that they do exhibit "organic" style complex behaviours.

I have more of a tendency to treat bugs as bugs only if they happen more than once. I've seen so much weirdness that crops up a couple of times and then seemingly, magically fixes itself while you're trying to trace it. Don't ask me why.

metachat: Looks like the recent outages may be related to the server I keep my VPS on. The ISP is going to do a bit of upgrading (woot - more downtime), and we'll see if that sorts things out. Just hang on in there folks. I am working on it.
posted by seanyboy 13 September | 12:00
Some days I don't notice the burning bulb of Metafilter is a soft glow like a chap on his wedding night stuck somewhere in Wales or Bangkok, but lately the problem is annoying to say the least.

The servers and/or code need a nudge. Otherwise, I will be stuck feeling like my 5 bucks to participate in one of the greatest communities ever --- is super worthwhile.
posted by funmonkey1 13 September | 12:05
Matt said a while back that adding new sections (like music. and jobs.) sometimes causes inexplicable server crashes, and that they usually "organically" work themselves out.


Sometimes I think maybe expansion and new features and such shouldn't be at the top of the Metafilter to-do list and a better priority would be improving basic site functionality like changing passwords and having a reliable server. But it's Metafilter; if people want gimmicks like "uptime" they can go to Fark.
posted by thirteenkiller 13 September | 12:19
thirteen - matt's addressed that a couple times, too. It's one of the FAQ questions: "Why is MeFi down?"

Ach, I think I sound like I'm coming down really anti-mathowie. I think he's doing a great job (much better than I could do) with what is essentially a home-built 30,000 person playpen.
posted by muddgirl 13 September | 12:26
Yeah, for a non-programmer, matt's doing a great job.
posted by orthogonality 13 September | 12:32
It's not my fault!!!
posted by jrun 13 September | 13:06
C'mon, Metafilter! What am I supposed to do all day? Work? Fuck!
posted by Fuzzy Monster 13 September | 13:25
I think maybe expansion and new features and such shouldn't be at the top of the Metafilter to-do list and a better priority would be improving basic site functionality

Actually, the site functionality had been fine lately until the jobs site went live. I'm not sure if Matt is even planning to add a change password feature and I don't think I've seen it come through feature requests in MetaTalk for a while, maybe ask there?

The site has been hard down -- as in won't come back after a CF reboot and didn't come up after a server reboot -- since about four this morning EST and Matt has a trouble ticket in with the colo facility that the MeFi box is at.
posted by jessamyn 13 September | 13:28
It was the ghost of the jabber server.
posted by appidydafoo 13 September | 16:04
Thanks for the update, Jess.
posted by blag 13 September | 17:58
Not that I'm any kind of System Administrator-type, but I echo seanyboy's comment here. Sometimes, problems with computers just seem to go away by themselves and never come back. It's only when they keep appearing that I do anything. I agree that it is probably the complexity of modern computers causing it - a single machine does so many different tasks all at once and there are so many possible reasons for things to go wrong that to try and duplicate the exact combination and/or sequence of events that caused the problem is nearly impossible unless it recurs by itself (while someone is watching/logging exactly the right events).
posted by dg 13 September | 18:02
As I hear stuff I'll put it on the MeFi wiki. It's looking like this evening, California time at this point for an uptime estimate.
posted by jessamyn 13 September | 18:10
Go here if you want to be emailed when MeFi comes back up.
posted by cillit bang 13 September | 19:15
Yeah, for a non-programmer, matt's doing a great job.

Actually, no. For a programmer, matt's doing a decent job of server administration. But we've found out in the past that it's never been his strongest suit.

Anyway, I doubt he does root-cause analysis or anything like that. It's more like, "Hmm, maybe the log files got too big, so I'll delete them."

This time it might be more than that, like a disk failure or something.

I'm not sure if Matt is even planning to add a change password feature and I don't think I've seen it come through feature requests in MetaTalk for a while, maybe ask there?

Yeah, I think he convinced everybody not to ask for that one anymore ...
posted by stilicho 13 September | 19:25
Metafilter is down as much as it is, because it isn't more important to anybody that it be up more. That may sound harsh towards User #1, but I don't intend it that way, and it's a fact of life.

Professional IT guys talk about "five 9's reliability" as some kind of commercial benchmark, meaning that systems which handle transactions/money/communications commercially are generally expected to be operable 99.999% of the time. To get to that vaunted standard, you can only have about 5 & 1/4 minutes of downtime a year (0.001% of the 525,600 minutes in a 365-day year), so obviously, rebooting a server once a year would drop it below "five 9's" reliability, given that many servers take about that long to perform a graceful shutdown & reboot. So, to get "five 9's" means you have to have at least 2 servers, either of which can support the user load, and some way of switching between them instantly while keeping everything on both machines totally in sync. It can be done, but it's way more expensive than running a single server (or Web + database servers) hobby site.
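
For anyone who wants to check the arithmetic, here is a rough sketch of the downtime budget that a given availability figure allows. It assumes a plain 365-day year, and the function name is just something made up for illustration, not anything from a real monitoring tool.

```python
# Rough downtime budget for a given availability percentage.
# Assumption: a 365-day year (525,600 minutes); leap years are ignored.
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability_percent):
    """Return the minutes of downtime per year allowed at this availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return (100 - availability_percent) / 100 * MINUTES_PER_YEAR


if __name__ == "__main__":
    for pct in (99.999, 99.9, 99.0, 98.0, 97.0):
        minutes = downtime_minutes_per_year(pct)
        print(f"{pct}% uptime allows {minutes:.1f} min/year "
              f"({minutes / 60:.1f} hours)")
```

By this reckoning "five 9's" leaves a little over five minutes a year, while something like 97% leaves roughly eleven days, which is the gap between a mission-critical service and a hobby site.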

So, with the outages of the last several days, MeFi is heading for something like 98 or even 97% reliability, and that might still be pretty good, compared to some services, but it's clearly not "mission critical" service level, because it's not built that way. Could it be?

The answer to that isn't going to come clear until Matt decides he wants reliability more than he wants control of the site, and until he can attract competent people to take on a reliability engineering mission. But The Blue is Matt's, and until/unless he changes his mind enough to change the structure, or somebody makes him an economic offer he can't refuse, 90% reliability may be good enough.

Because if it were $1 million, or $10 million important enough to someone, it could be "five 9's" reliable. Put your money where your mouth is, if it's that important to you, and make Matt an offer.
posted by paulsc 13 September | 19:56
Meta-DTs
posted by caddis 13 September | 20:20
Damn, it looks like Matt is having to become quite the server engineer:

http://status.metafilter.com/2005/10/sweet_jesus_the.html
posted by scarabic 13 September | 21:39
Pinging the server works now... that's a good sign.
posted by gsteff 13 September | 21:44
You are not authorized to view this page
You do not have permission to view this directory or page using the credentials that you supplied because your Web browser is sending a WWW-Authenticate header field that the Web server is not configured to accept.
--------------------------------------------------------------------------------

Please try the following:

Contact the Web site administrator if you believe you should be able to view this directory or page.
Click the Refresh button to try again with different credentials.
HTTP Error 401.2 - Unauthorized: Access is denied due to server configuration.
Internet Information Services (IIS)


My Mefi links bar button is set for lofi.mefi and I just tried and got that page in response. Somebody please explain what that means in English dogs and cats could understand. Is it something on my side or are things still screwy?
posted by y2karl 13 September | 22:58
Nothing is wrong on your side. It just means that some machine is up on that IP address, but that evidently there's still quite a bit of configuration to do. Especially considering that mefi runs on the apache web server, and that error message came from microsoft's iis.
posted by gsteff 13 September | 23:04
It's psychosomatic - you need a lobotomy. I'll get a saw.
posted by dg 13 September | 23:10
Is this better?
Seeing this instead of the website you expected?

This page is here because the site administrator has changed the configuration of this web server. Please contact the person responsible for maintaining this server with questions. The Apache Software Foundation, which wrote the web server software this site administrator is using, has nothing to do with maintaining this site and cannot help resolve configuration issues.
posted by dg 14 September | 00:56