MetaChat

14 September 2005

MeFi's downtime

Apologies if this has been explained in detail before: could someone explain to this non-technical user why MeFi is down with such frequency? I've done some front-end web design freelance work for friends, but that's the extent of it. It's the server end of things where I confess almost total ignorance. What is it about MeFi's particular architecture that makes it so unreliable? What would improve it? I mean this less as a gripe than as genuine technical curiosity. Of all the sites I hit on a daily basis, MeFi seems to be down twice as much as any of them. Why is this not true of other sites?
Twice as much as all of them put together.
posted by dreamsign 14 September | 12:03
The code that runs Metafilter is held together with bubblegum and pipe cleaners, and Matt (God love him) isn't the most technical tech there is out there.

What probably happens is that certain back-end programs crash due to memory errors or other errors, and like a house of cards, the server comes tumbling down.

Many have offered in the past to assist Matt in migrating the metafilter database to something inherently more stable, but Matt, for reasons only he knows, has been very resistant.

People have literally thrown buckets of money to the cause, to no avail.
posted by crunch 14 September | 12:08
Most of the time I don't care that it happens and I've sort of migrated away from metafilter anyway, for most of the reasons that others here mention, but I still like to visit and comment sometimes. Recently it's really started to piss me off, despite all the "it's Matt's site and he can run it the way he wants and wadda ya want it's free" type comments. He enjoys tremendous perks and notoriety for the site (and he could make a bunch more dough off it and he knows it) and if he's such a big internet reputation economy guy he should acknowledge that and show some fucking pride. That's the Irish/Swedish immigrant in me talking, I can just see my grandpa saying "I would rather burn this house down than have one floorboard out of true."
posted by Divine_Wino 14 September | 12:15
MetaFilter's unreliability -- more than a bit of joke now, really -- is to me part of its charm.
and, if MeFi worked well, we wouldn't have MetaChat. so yay for Jrun!

and jenleigh, when the site comes back to life, search MetaTalk, this has been discussed very often and tech-savvy users have given plenty of opinions.
as crunch said, the gist of it is, MeFi needs extensive work to become as reliable as most of the other popular sites out there. Matt apparently seems reluctant to take certain steps to fix it. but the site remains quite popular even if it crashes all the time, so more power to him. I think it was quonsar who said MeFi will never be really fixed until Matt sells it. it's probably true, at this point. he'd have fixed it already, if this weren't the case (the late-2004 newbies money didn't do anything, mostly, to improve the site's reliability)

posted by matteo 14 September | 12:37
This Metafilter, it vibrates?
posted by Cryptical Envelopment 14 September | 12:52
no, it doesn't. That's the problem
posted by puke & cry 14 September | 13:20
Well, then buy another one!
posted by Cryptical Envelopment 14 September | 13:28
What probably happens is that certain back-end programs crash due to memory errors or other errors, and like a house of cards, the server comes tumbling down.

Those are the kinds of technical details I'm interested in hearing more about.
posted by jenleigh 14 September | 14:11
The irony that Carl can keep Plastic running better than Matt can with MeFi is, well, something. Words fail me.
posted by mischief 14 September | 14:25
It runs on a windows box.

It uses Jrun.

From comments Matt has made, it uses a less than optimal database platform (MySQL?).
posted by orthogonality 14 September | 14:42
I'm pretty sure it's MS SQL
posted by dodgygeezer 14 September | 14:43
What probably happens is that certain back-end programs crash due to memory errors or other errors, and like a house of cards, the server comes tumbling down.

Those are the kinds of technical details I'm interested in hearing more about.
database-driven websites are simple. and then there's metafilter.

first strike: windows platform. i almost need say no more at that point, but i will because i'm, ya know, quonsar.

Cold Fusion: expensive, complex, finicky, commercial windows server-side programming language. slow. clunky. resource hog. PHP on Unix would do everything it does, reliably and faster and using fewer resources. i'm baffled by his choices.

Jrun: more expensive, finicky, commercial software. i don't even know what the fuck it does, or why it runs on the machine. it's an additional layer of complexity on what was already a house of cards. i think Jrun is used to communicate with the database. PHP handles database communication and doesn't require another application layer to do it. and keeps running. so i am baffled at his choices.

Database: i have no clue what he uses. if it's MS SQL, well, there ya go. more microsoft shit. more layers of complexity piled on top of a teetering bamboo hut. PostgreSQL is a proven, open source database which could easily handle the mefi load.

Unix, PHP, PostgreSQL == high volume high availability site.
Windows, Cold Fusion, JRun, and MS SQL == metafilter.

it's the reason he looks like this:
[image]
posted by quonsar 14 September | 15:13
"People have literally thrown buckets of money to the cause, to no avail."

That is quite a picture.
posted by sarah connor 14 September | 15:21
Hey look, there's a fire. Let's throw buckets of money on it!
posted by eekacat 14 September | 16:45
I think ColdFusion may have been the best choice for Matt when he started the site (in 1999, right?). I don't think PHP/MySQL was widely used then, and even if it was, ColdFusion might've been a better match for Matt's background.

MetaFilter started out as a personal project shared by a small group of people, and wasn't designed from the ground up to handle the load it (sometimes) does now.

JRun is a "reliable server" [!] that provides "superior performance and reliability" [!]. I'm not sure why MetaFilter needs both ColdFusion and JRun.

The JRun FAQ page doesn't list "How come JRun crashes like a mofo?" as an FAQ.
posted by kirkaracha 14 September | 17:47
I really couldn't care less about the technical stuff, more out of ignorance than anything.

What boggles my mind is that people who know what they're doing (allegedly, anyway) have practically begged Matt to let them work on the system. He's been offered money towards fixing the problems. And yet he keeps refusing any help. Matt says he's had someone working on it. A few tweaks here and there have been done with Matt's assurance that everything should be hunky-dory. Well, MeFi continues to crash.

Regarding Matt selling MeFi: one - I don't think that'll ever happen. I think he'd shut it down first. And two - if it does happen, it's the death of MeFi. I've been part of other communities (and yes, MeFi is a community) and any major upheaval, like a sale, has always been the end of it. It may stumble on in some sort of wounded manner, but it'll never be the same.

As said above, we do have one thing to thank Jrun for and that is MetaChat.

*group grope hug*
posted by deborah 14 September | 18:15
cold fusion was originally written in c++. it was rewritten in java (which isn't completely crazy - it made unicode support easier, for example, and j2ee was seen as the future back then) at around the same time (2001ish) that allaire, who developed it, was bought by macromedia.

moving to java/j2ee also meant that the cold fusion core could focus on one thing - generating web pages. the rest of the things a server side application does (connecting to databases, receiving requests, implementing transactions, managing object and session lifetimes) could be left to the j2ee server. again, this was a not unreasonable idea.

jrun is a minimal j2ee environment (actually a servlet server, something like tomcat) that can provide the support environment for cold fusion.

i used cold fusion way back in the day (c++ incarnation) and it was nice enough, although commercial (but, iirc, reasonably priced). so matt choosing it wasn't such a bad move.

anyway, that's the history, but not an explanation of why it crashes. i guess three things:

1 - the emphasis on java technology means that it's not going to be that happy on microsoft oses. people who use java typically (although by no means always) use unixes of some kind.

2 - the jrun container that cold fusion is sitting in is a fairly simple thing. for heavy duty work you really want a good, solid j2ee server. jboss, weblogic, etc (i hate those things, but i suspect they are better than jrun).

3 - matt can't code for toffee. for example, look at how each sub-site has its own implementation (bugs will typically only appear on one, small details differ on each, etc). that's a pretty basic error for a programmer to make - if you do this for a living you pretty soon learn to generalise, generalise, generalise.
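
A minimal sketch of what "generalise" means in point 3, in Python rather than the site's actual ColdFusion. Everything here is hypothetical and for illustration only: the `SubSite` type, the `render_comment` function and the output format are not taken from MetaFilter's code. The idea is that one shared routine takes the sub-site as a parameter, so a bug fixed once is fixed everywhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubSite:
    """Per-site settings (hypothetical); the only thing that differs between sub-sites."""
    name: str
    date_separator: str


def render_comment(site: SubSite, author: str, date: str, time: str, body: str) -> str:
    """Single shared implementation used by every sub-site."""
    byline = f"posted by {author} {date} {site.date_separator} {time}"
    return f"[{site.name}]\n{body}\n{byline}"


# Illustrative configuration: two sub-sites, one code path.
SITES = {
    "main": SubSite(name="main site", date_separator="|"),
    "talk": SubSite(name="talk site", date_separator="at"),
}

if __name__ == "__main__":
    print(render_comment(SITES["main"], "example_user", "14 September", "12:03", "hello"))
    print(render_comment(SITES["talk"], "example_user", "14 September", "12:03", "hello"))
```

With this shape, small differences between sub-sites live in the `SITES` table instead of in separate copies of the rendering code.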

note that while 3 is something of a personal criticism, it's pretty clear that matt does have what it takes to build a successful community. whatever that is, it's a damn sight rarer than the ability to write clean code. i would happily trade some of my ability to regularly indent java code for some of his diplomatic abilities. so i wouldn't feel too sorry for him...
posted by andrew cooke 14 September | 21:16
apart from the typos and random words, i suspect that is a bit too technical.

basically, cold fusion makes web pages, but it relies on a container (more software) to help it do so. these containers are now big business, and the java standard for them is called j2ee. jrun is a baby j2ee implementation. sort of.
posted by andrew cooke 14 September | 21:20
What has always amazed me is that MS didn't just send in a swat team of elite programmers to fix the nasty a long time ago. They spend millions on advertising, so it seems like this would be a drop in the bucket to head off such bad PR.

Anyway... You know what other site is down for me a lot? Flickr. I don't even go there -that- often, but it seems to be down about one out of four or five times I try to go there. Right now, for example.
posted by taz 15 September | 00:13
taz - jrun isn't an MS product, it's Macromedia.

This might be interesting: I remember about a year or so ago, Metafilter was going through some fairly regular crashes (which was a more unusual occurrence in those days). At about the same time I was asked to look at a very flaky Windows web server. Interestingly the behaviour of both was very similar, and the problem with the server I was looking at was that it hadn't been patched and so was wide open to attack.

A pretty obvious problem with a simple solution.

When I mentioned this in Metatalk Matt gave an answer that, while not explicitly, seemed to indicate that he hadn't patched his server for a while.

That's madness. If you're running a Windows box exposed to the world you have to patch regularly.

And now my point: It doesn't matter what OS he runs because he clearly isn't keeping on top of his server. You could run MeFi on pretty much any OS, but if you're not doing all the boring and fiddly maintenance stuff that keeps servers from dying then you're going to have regular crashes.
posted by dodgygeezer 15 September | 04:43
macromedia! yeah... all those "m" things confuse me. *exits, blushing*
posted by taz 15 September | 05:11
Yeah, about the patches, back in the day the site got hacked by someone who fortunately only wanted to tag the site. It turned out that mathowie and delfuego (who's a bit better at the server stuff, shoulda known better) had reinstalled Windows without realizing that the data store that reports your patched status ... will happily report that you're patched, when you aren't, if you reinstall Windows underneath it. Cute. So mathowie thought he was up to date when he was basically running unpatched Windows Server.

Anyway.
posted by stilicho 15 September | 05:43
Thanks guys. This clears up some questions I had. I would've checked back in earlier but I've been down with the flu.
posted by jenleigh 15 September | 10:07
I use ColdFusion extensively for Intranet development and as long as you run it on Linux in conjunction with MySQL it's rock-solid. MeFi could be moved to this platform with only minimal changes to the code required (where/if physical file paths are referenced and SQL syntax differences for example) and it'd be vastly more stable than it is now.
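
To give a rough idea of the kind of "SQL syntax differences" meant here, a small Python sketch follows. It relies on one well-known difference: MS SQL Server limits rows with `SELECT TOP n`, while MySQL uses a trailing `LIMIT n`. The `posts` table, its columns and the `latest_posts_query` helper are made up for illustration and say nothing about MeFi's real schema or code.

```python
def latest_posts_query(dialect: str, limit: int) -> str:
    """Build a 'newest N posts' query for the given SQL dialect.

    Table and column names are hypothetical.
    """
    n = int(limit)  # force an integer so nothing else ends up in the SQL text
    if dialect == "mssql":
        # SQL Server puts the row limit right after SELECT.
        return f"SELECT TOP {n} id, title FROM posts ORDER BY posted_at DESC"
    if dialect == "mysql":
        # MySQL puts the row limit at the end of the statement.
        return f"SELECT id, title FROM posts ORDER BY posted_at DESC LIMIT {n}"
    raise ValueError(f"unsupported dialect: {dialect}")


if __name__ == "__main__":
    print(latest_posts_query("mssql", 10))
    print(latest_posts_query("mysql", 10))
```

A port would mean finding each query that uses dialect-specific syntax like this and rewriting it; how many there are depends on code nobody in this thread has seen.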

It's kind of unfair to blame ColdFusion in this instance (although there's some truth in the criticism of JRun). MeFi's problems are a combination of poor coding and poor choice of server OS and database.
posted by NeonSurge 15 September | 11:48
Oh and Apache > IIS goes without saying of course.
posted by NeonSurge 15 September | 11:49
matt IS running apache, since oh, maybe a year ago.
posted by quonsar 15 September | 21:41