Unable to open site.

Posted: Thu Jan 13, 2011 8:16 am
by dn29626
What happened over the last several days with this site?
Is there an email available to enquire of site status?

Posted: Thu Jan 13, 2011 8:05 pm
by asavage
You want to know why the site went down? Isn't it enough that it's back? ;)

I lost the backup drive a couple of months ago, so nightly backups haven't been happening to a separate drive for some time.

Back in August or September, I bought four matched boxes (HP DC7100s, circa 2005) to replace the 1999-vintage hardware that I built to host the data back then. Over the years I average one HDD failure per year, and now after so much uptime various fans and power supplies are reaching end-of-life and giving me trouble, so when I had an opportunity to buy a bunch of newer, matched hardware, I did so.

Bought some new, big HDDs too. And software to upgrade the base OS.

However, life intrudes, and much of this has just been sitting in the server room. The "new" hardware is nowhere near ready to deploy.

Then, last week the main server's data drive developed four bad blocks, and that shut down the site. It took me several nights to get that rectified. That drive is still in use as I type. I *do* have backups to a separate drive running nightly again, though, but only of the site and not of the supporting data or pics.

Since I'm not ready to migrate everything to the "new" hardware, I have to keep the old hardware in place. So, I bought a couple of brand-new WD 500GB PATA HDDs today (one for the server, one for the backup). But since almost everything I built back in the day is SCSI, I found I don't have any of the 80-wire PATA cables these things require (only 40-wire cables), so I can't add the new HDDs tonight. Perhaps tomorrow night. I don't like IDE, never have, but as a stopgap this is the cheapest "new drive" solution I can put into place on the non-SATA-supporting antique servers. I hope.

All this is by way of saying that I *am* working on the various problems, and that it's failing hardware that's to blame. When I built all this stuff, I was a lot younger, had more time to indulge these hobbies, and I bought very good quality stuff. However, 24/7 operation for eleven years is really pushing it.

The "new" DC7100s aren't new either, but have far fewer hours on them. I'll have to strip out all the fans and replace them, as I am hearing too much noise from them. The drives will be all new. The CPUs are about three times faster, but I don't really need more HP on the server end of things; nothing that happens here requires more than the 1GHz Athlons I bought over a decade ago. The DSL outbound isn't fast enough to stress the old server's throughput, not even close. But my distributed.net parallel processing numbers are going to go up by a whole bunch :) And the old Athlons are very power hungry; I use them to heat that end of the house! The DC7100s require a lot less cooling -- I hope.

I trust that this gives you some insight into why the site is not always up 100% of the time. Even with thousands invested in hardware (UPS batteries are one of my biggest recurring expenses), this isn't a mission-critical sort of operation. I have received ONE donation in the past 14 months, so this is all out-of-pocket and then there's the time involved in just maintenance. I try to keep up with it, but there are days when I have to call it a night and head for bed; 5am comes early.

Expect more downtime. Be patient. Be well.

Posted: Thu Jan 13, 2011 8:45 pm
by dn29626
Thank you.
My fear is of this site shutting down permanently. If that happened without notice, it would leave us hanging, which would be even worse.

If you know you are going to be down for a period of time, an advance notice would be nice. If a notice did go out, I apologise for overlooking it.

Posted: Fri Jan 14, 2011 5:53 am
by asavage
When I have to do something planned, the site is never down for more than 10 minutes, and I don't give notice for outages that short.

Every six months or so, the server software hiccups and dies. I usually find out about that within a day.

Naturally, when hardware fails, I can't give any notice.

The only time I tell people about a pending outage is when I move the hardware to another location, which has happened three times since 2000.

I could set up a redirect of the site to another webserver with a page, with an ETA, but that's just a PITA for a hobbyist-type setup like mine. More things to maintain.
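
For anyone curious what that kind of stand-in page might involve, here is a minimal sketch, not something running here. It assumes Python's standard library on a spare box; the ETA text, retry interval, port, and all function names are made up for illustration. It answers every request with HTTP 503 and a Retry-After header so browsers and crawlers know the outage is temporary.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical values for illustration only; nothing here comes from the real setup.
ETA_TEXT = "tomorrow evening"
RETRY_AFTER_SECONDS = 3600


def maintenance_response(eta_text, retry_after_seconds):
    """Build (status, headers, body) for a temporary 'site is down' page."""
    body = (
        "<html><body><h1>Site temporarily down</h1>"
        f"<p>Hardware maintenance in progress. Expected back: {eta_text}.</p>"
        "</body></html>"
    ).encode("utf-8")
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        "Content-Length": str(len(body)),
        # Retry-After tells clients how many seconds to wait before trying again.
        "Retry-After": str(retry_after_seconds),
    }
    return 503, headers, body


class MaintenanceHandler(BaseHTTPRequestHandler):
    """Answer every GET with the same 503 maintenance page."""

    def do_GET(self):
        status, headers, body = maintenance_response(ETA_TEXT, RETRY_AFTER_SECONDS)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)


def run(port=8080):
    """Serve the maintenance page until interrupted (port is an assumption)."""
    HTTPServer(("", port), MaintenanceHandler).serve_forever()
```

Calling run() on the stand-in machine would serve the page; pointing the site's traffic at that machine is a separate step and is exactly the extra maintenance mentioned above.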

Posted: Fri Jan 14, 2011 8:12 am
by dn29626
I understand.

Is there an email to contact you with to ask in the future? If you do not want it here, email it to me direct.

dttk44@bellsouth.net

Posted: Fri Jan 14, 2011 8:51 pm
by asavage
dn29626 wrote:Is there an email to contact you with to ask in the future?
It's in this post.

Any time the site is down, feel free to email. As I don't check the site every day anymore, I appreciate the heads up.

Posted: Sat Jan 15, 2011 8:13 pm
by asavage
I had the site down most of today, still trying to replace the main data drive; not there yet. I'll be working on it tomorrow too, but I'm done for tonight.

Posted: Tue Jan 18, 2011 8:57 pm
by asavage
After a grueling weekend, working full-time, and two more nights after work, I believe that the failing data HDD in the main server is replaced and working well.

While SCSI has its teething problems and interoperability concerns, I had that all sorted out over a decade ago, with multiple HDDs on multiple host adapters. I have a whole stack of Sun external enclosures that have served me well for many years.

But SCSI has now become a niche product, and SATA offers a much better price/performance ratio these days. I'll hate to leave SCSI behind, but the writing is on the wall.

I've purchased three SATA HDDs for the new systems, and two modern new PATA HDDs as temporary replacements for the old systems. What I managed to do over the past four days was replace three old SCSI HDDs with one of the new PATA HDDs.

The rest of the week, I'll be doing the same kind of task to the backup box, the target of the nightly backups.

The summary of the problem I had getting the new PATA HDD to work on the old hardware: the old hardware is supposed to be able to do UDMA66, but apparently something isn't up to that speed. Dialing it back to UDMA33, after 16 hours' work, finally got the hardware cooperating.

Then I fought software for the next couple of days. Ask anyone who's been there: mixing SCSI & IDE wasn't something you did lightly, a decade ago.

When I get the backup box updated to the new PATA HDD, and the rest of the (non-nissandiesel) data migrated onto the new drives, I'll finally retire the Sun external enclosures and piles of drives.

The new drive is much, much quieter :)
Speed is not really an issue. I can always move more data than the DSL will ship.

Anyway -- no more significant downtime is on the horizon. I'll be taking the box down for up to a half-hour when I tidy up the wiring and put the panels back on the main server, sometime in the next few days. The rest of the work here will not be on that world-facing box, so nothing should "show" to any of you.

At some point in the next few months, all of this will move to the newer boxes, but no timeline is firm on that yet.

Off to bed . . .

THANKS, AL

Posted: Thu Jan 20, 2011 10:47 am
by Cmdr. Ron
T H A N K S , A L !!

:D THANKS for your determination & long hours maintaining & restoring site operations :!:

I don't know where I'd find some of the data you & the guys freely share if NissanDiesel weren't here.
That encouraging data means diagnosing & repairing baffling problems on my 28 yr old SD22 Daily Driver.

You persevered.
So did I. Not much choice, considering the investment.

The Leather Diaphragm in the Pneumatic Governor served for 27 years.
The last straw on the camel was hidden, but showed-up in full regalia for the site outage last week.
I had no idea why my noble steed was down, but found-out courtesy of NissanDiesel.
Southwest Diesel shipped . . . Cmdr. Ron will ride again!

T H A N K S , a g a i n , A L !

Is there a Hillman thread?

Re: THANKS, AL

Posted: Fri Jan 21, 2011 6:15 am
by asavage
Cmdr. Ron wrote:Is there a Hillman thread?
While I have a Hillman (Rootes) site, there's no forum there. There's a good Yahoo! group, though. I haven't looked at it in years.

Posted: Tue Jan 25, 2011 6:08 pm
by asavage
I don't know why, but the MySQL server (the database manager software that runs on the web server) was totally screwed up by the time I got home. I restarted it, it cleaned itself up, and it seems to be working again. Weird.

I might have to look at upgrading that software . . .

----------------

The backup box now has the new PATA HDD, and nightly mirroring is happening again -- Yea!

I have one more data drive to look at for migration, and then I'm packing up all the SCSI stuff. The only SCSI hardware that'll still be in use will be my B-size scanners, and I'm not replacing them! Flatbed scanners that will scan two 8.5x11" papers (or an opened book) in one pass are damned expensive.