MetaChat

07 June 2005

Lately, I’ve been having stability problems with one of my servers. By Matt Haughey
Lately? Right this second.
posted by gramschmidt 07 June | 19:19
Well, that's something.
posted by mudpuppie 07 June | 19:20
Self-explanatory
posted by blag 07 June | 19:30
every single day since May 31st? i hope he's getting refunded for the downtime.
posted by amberglow 07 June | 19:31
Most of them seem to be very short outages (~20mins) so I'm guessing that they're reboots. Still, 7 reboots today (so far) shows that something is screwed.
posted by blag 07 June | 19:40
I'm glad to see this. I know Matt is taking the whole thing seriously, but it's good to see him acknowledge it, even if obliquely. Somehow especially if obliquely.
posted by omiewise 07 June | 19:54
They (and Matt) make that sound like it's a big deal. It's generous of those guys, but the tech doesn't seem very impressive. For example, I'd monitor from two distant locations, both physically and virtually, to form a triangle with the monitored site.
posted by kmellis 07 June | 20:05
well, of course you would.
posted by quonsar 07 June | 21:54
Wow, I knew there were some big problems as of late, but that log is frightening. Things were fine and then somewhere around the end of April they fell apart, and then somewhere around the end of May all Hell broke loose.
posted by caddis 07 June | 21:56
huh. why haven't we seen this before? not that it would matter, but you think it would have come up.
posted by puke & cry 07 June | 22:19
the uptime monitor, that is, not the article.
posted by puke & cry 07 June | 22:21
Quoted from the linked article: " I’ve got a bullet-proof uptime monitor checking it every few minutes to see if it’s online."

I was sure this was going to be followed by "they're called the Metafilter Membership, and they email me (or post to any number of satellite websites) every time the server goes down for more than a few seconds"....
posted by anastasiav 07 June | 22:25
I ping the websites of my stalkees every 4 seconds.

Anastasiav, you need to talk to your hosting co. I'm losing like .4% of my pings.
posted by mr_crash_davis 07 June | 22:43
"I’ve got a bullet-proof uptime monitor checking the bullet-riddled, artillery-blasted heap of nuclear slag i call my site every few minutes to make sure it’s offline."
posted by quonsar 07 June | 22:49
I'm convinced Schrödinger's cat lives in the mefi server. Or uses it as a litterbox. Or something.
posted by trondant 07 June | 23:35
^Best explanation ever.
posted by taz 08 June | 00:29
This is a cheap way to see if the host box is available (which is important) but so what? It gives no view into any other reason for the site to be down. What if it’s network latency or an outage? What if your ISP is failing? What if you are generating unneeded DNS queries and bogging down the whole network segment? Poor DB schema, poor queries, weak middleware, memory leaks, all cause outages. How about something that tells you why queries fail and JRun runs for its life? It’s an OK tool but does not really do much.
posted by arse_hat 08 June | 01:19
again
posted by arse_hat 08 June | 01:34
i'm actually about equal parts stunned and amused that matt posted that.
posted by quonsar 08 June | 01:43
i like the multitaskiness of it
posted by ethylene 08 June | 02:10
I like the duality of it.
posted by arse_hat 08 June | 02:14
that was always the nifty thing about the old cia
multitasking
posted by ethylene 08 June | 02:20
the new cia is multithreaded.
posted by quonsar 08 June | 02:22
i love having old daily shows be my main news source, but is this newt for prez thing for real?
posted by ethylene 08 June | 02:23
But with an either/or dichotomy can the centre hold? Wither mefi?
posted by arse_hat 08 June | 02:26
*smashes your top hat
pulls your cravat*
posted by ethylene 08 June | 02:28
I’m shocked and appalled.
And a little intrigued.
posted by arse_hat 08 June | 02:31
do tell
posted by ethylene 08 June | 02:31
Erm, so this thing checks for the existence of a text file, but it won't check if the application's up or down. That's right, isn't it? Or is the server unable to serve files when JRun goes tits-up?
posted by dodgygeezer 08 June | 04:39
I would imagine that the server could still serve text files even when JRun is down: Apache is still running. These are just server outages - JRun crashes are a whole 'nother layer of fuckups on top of this.

Only one way to find out though, and we'll no doubt be able to do it soon...

posted by blag 08 June | 06:00
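A minimal sketch of the distinction raised in the last two comments, assuming the static text file is served by the web server alone while a dynamic page has to pass through the application layer. The function names and the example URLs in the comments are hypothetical and are not taken from the monitor being discussed. Probing both kinds of URL lets a monitor tell a dead host apart from a crashed application server:

```python
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with an HTTP 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Covers HTTP error statuses, refused connections and timeouts.
        return False


def diagnose(static_ok: bool, dynamic_ok: bool) -> str:
    """Combine a static-file probe and a dynamic-page probe into one verdict."""
    if static_ok and dynamic_ok:
        return "up"
    if static_ok:
        # The web server answers, but the page that needs the app layer does not.
        return "web server up, application layer down"
    if dynamic_ok:
        return "application answering, static file missing"
    return "host or network unreachable"


# Hypothetical usage (placeholder URLs, assumptions for illustration only):
#   static_ok = probe("http://example.com/monitor.txt")
#   dynamic_ok = probe("http://example.com/index.cfm")
#   print(diagnose(static_ok, dynamic_ok))
```

A monitor that only requests the text file can never report the second verdict, which is the gap dodgygeezer is pointing at.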
My irony meter crashed. JRun strikes again.
posted by dhruva 08 June | 06:06
Too much!
posted by dabitch 08 June | 09:21
Metachat: when JRun goes tits-up.
posted by redfisch 08 June | 12:13
"I would imagine that the server could still serve text files even when JRun is down"

I'm under the impression that all the logged-in pages are dynamically generated, at least. I think the front page for non-logged-in members is generic and cached where Apache can just serve it. Maybe not.

My preference about this kind of thing, you might even call it an obsession, is that as much content as possible, including most of what seems to the end user to be dynamically generated, be cached at the webserver and not require any (or require only minimal) processing by the application layer or, especially, the database. The company I worked for whose software does high-end, enterprise level CMS and application server stuff had this design as part of its architecture from the very beginning. But even if this isn't a central strategy built into the software, it can still be a strategy the site designer can pursue as much as possible. I have the strong impression that Matt has only minimally, and after the fact, optimized metafilter in this way. Thus, mefi is extremely vulnerable to any of the layers failing. And they do, often. But it's telling that it's the application layer that fails most often, the infamous JRun error. That's ColdFusion's fault. I simply don't believe that ColdFusion is inherently robust for very high-traffic applications. But my bias towards high-end, enterprise class software that runs into five, six, or seven figures, would of course lead me to assert so.
posted by kmellis 08 June | 16:38
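A rough sketch of the caching strategy kmellis describes, assuming pages can be keyed by path and tolerate being a little stale. `PageCache`, its `ttl_seconds` parameter and the `render` callback are hypothetical names for illustration; this is not how MetaFilter or any particular CMS implements it. The point is that the expensive renderer (application layer plus database) runs only on a miss or after expiry:

```python
import time
from typing import Callable, Dict, Tuple


class PageCache:
    """Keep rendered pages for a fixed time so most requests skip the app layer."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, path: str, render: Callable[[str], str]) -> str:
        """Return the cached page for path, calling render only when needed."""
        now = self.clock()
        hit = self._store.get(path)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # fresh copy: no application or database work
        html = render(path)  # miss or expired: pay the full rendering cost once
        self._store[path] = (now, html)
        return html
```

With a cache like this in front, a crash in the application layer only affects paths that are missing or expired, which is the reduced exposure the comment argues for.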
"I'm under the impression that all the logged-in pages are dynamically generated, at least."

Sure. But my question was, does a text file (this one) get served via JRun? The reason I ask is because I believe it doesn't, in which case it's a pretty lousy test of whether the server's operational or not.

I should read up more about caching - the little I've read recently has been in reference to MediaWiki, but my needs are only for a small departmental intranet so I've not been tempted to read further.
posted by dodgygeezer 08 June | 16:59
"But my question was, does a text file (this one) get served via JRun?"

God, I hope not. Surely not. But maybe.
posted by kmellis 08 June | 22:54