
Server Problems

Skip

Active member
Veteran
As you may have noticed, the site was down today. Our server lost all its data thanks to a tech problem. We are restoring the data at this time, but it looks like the gallery was NOT backed up as it should've been (you can thank our hosting company for that, as well as today's server problem!).

I really doubt we can restore all those images: although the files still exist on the server, we lost the file names & directory structure, so it's really screwed.

I apologize to all of you, and we'll see what can be done to get as many images back as possible.

Hang in there folks, these things do happen (more than once to me it seems... :( )

BTW if you can't upload images, PM me with the error message & I'll fix it for you!
 
G

Guest

Shit man I was counting on this server to be backed up......I never save pics on my personal computer.

Damn
 

groo

New member
I find it amazing that so many so-called ISPs have no backup policy, no restore policy, no hot-failover servers, nothing. It's like they're all run by kids with a server CD and absolutely no actual data center or business experience.

On the other hand, I guess that's what the industry deserves for pushing tech wages lower than that of a freakin' plumber.
 

Skip

Active member
Veteran
Actually we ARE supposed to be backed up! In fact when I was in their office signing the contract for this server 2 months ago I specifically asked them to do the backup right then! We PAID for the backup, but they apparently never did it.

We do have the image files from up to March 2, when we switched servers, but it might mean we lose everything uploaded since then...

At least we didn't lose the database!

Nobody here is happy about this, I assure you!

Why wouldn't you keep a copy of your images? How else you gonna post them on OG?
 
G

Guest

Those damn tech guys. It's always something except their fault.

Great to have it back up and running but that's a real bummer about the galleries. Oh well.

Texas Kid
 

HOT CARGO

The Best Is Yet To Come
Veteran
O My GOD!!!!!

even the PMs are lost. i got confirmation on them but my inbox is empty

this downtime makes me go nuts hahaha
hopefully the problem will get fixed.

peace

HC
 

Skip

Active member
Veteran
PMs should be fine. Mine are. Only images since March are the problem... They are working on recovering what they can, but I don't hold out much hope.

This whole problem (which affected many other websites as well) was caused by one wrong keystroke on the unix command line...

I would go and murder the dude who did this right now, if he wasn't busy fixing the sites...

At least this site is up! Most of the others are still down and one may never come back... :(
 

melnibone_ca

New member
tape backups can be finicky

Some servers that I worked on ran some fairly generic backup scripts to tar up the filesystems and dump them to SCSI DDS-4 tapes.

The script also took care to verify that tar could read back the file index from the tape...at least in theory

tar -tvvf /dev/st0 would faithfully read back the index, but tar -xvvf failed. That sucked. Thankfully, we had other backups to fall back to.
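
The gap between "the index lists fine" and "the data extracts fine" can be checked in a script. Here is a minimal Python sketch, not the script described above: the function name `verify_archive` is made up for illustration, and it assumes the backup is an ordinary tar file on disk rather than a tape device. It reads every file's contents back instead of only listing the members.

```python
import tarfile


def verify_archive(path):
    """Read back the data of every regular file in a tar archive.

    Returns the number of files verified. Raises tarfile.TarError (or
    OSError) if the archive cannot be opened or a member cannot be read.
    Hypothetical helper for illustration only.
    """
    verified = 0
    with tarfile.open(path, "r:*") as archive:
        for member in archive:
            if not member.isfile():
                continue  # directories, links, etc. carry no data to read
            stream = archive.extractfile(member)
            remaining = member.size
            while True:
                chunk = stream.read(1 << 20)  # read in 1 MiB pieces
                if not chunk:
                    break
                remaining -= len(chunk)
            if remaining != 0:
                raise tarfile.ReadError(f"{member.name}: short read")
            verified += 1
    return verified
```

Listing alone only touches the headers; a check like this forces the data blocks to be read as well, which is closer to what an extract does.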

Backup policy was daily backups Mon, Wed, Thurs, Fri.

A weekly backup was done Tuesday night, and taken off site, round-robin fashion.

A monthly backup was done on the 28th of every month, and taken offsite round-robin.
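
That schedule can be written down as a small lookup. This is a rough Python sketch only: `backup_kind` is a made-up name, and two things are assumptions the post does not state, namely that the monthly run wins when the 28th falls on another backup day, and that nothing runs on weekends.

```python
from datetime import date


def backup_kind(day):
    """Return which backup runs on a given date, or None.

    Assumptions (not from the original schedule): the monthly backup
    takes precedence on the 28th, and no backup runs on Sat/Sun.
    """
    if day.day == 28:
        return "monthly"  # taken offsite, round-robin
    weekday = day.weekday()  # Monday is 0, Sunday is 6
    if weekday == 1:
        return "weekly"  # Tuesday night, taken offsite, round-robin
    if weekday in (0, 2, 3, 4):
        return "daily"  # Mon, Wed, Thurs, Fri
    return None


if __name__ == "__main__":
    print(backup_kind(date.today()))
```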

We also had a hotspare system to rsync the entire disk image of 6-10 critical servers to a central machine (nightly cronjob) so we always had at least yesterday's data...

This was also handy if we wanted to take a machine offline for maintenance or repair. As the hotspare had the entire filesystem for each server on separate partitions over multiple disks, a little filesystem magic in /etc/fstab, and the bootloader conf in grub.conf, and we had a hotspare backup server online in no more than a reboot.

We weren't quite big enough to worry about RAID arrays, but we thought about it. Can't comment on that, sorry....

Anyway, I hope you get your stuff sorted out, and I feel for the poor tech who mistyped an rm -rf foo *

that hurts. I think every unix admin has had a similar experience at some point. :-/
 

Skip

Active member
Veteran
Hey everyone, thanks for taking this in stride. I wish I could!

We're talking about doing something to help compensate for the loss of images, and of course your time and energy.

Teflon & I were discussing how to improve the image integration into the forums (we agree not to be like OG). Anyway, we'll make it up to you kind folks somehow.

There is an upside to this. In fixing this problem the techs got to swap out our RAID controller for a top o' the line new one. This evidently also solved the bottleneck with the two processors that caused a problem.

So we should be blisteringly fast now...(hopefully!)
 

Skip

Active member
Veteran
melnibone_ca
Hey thanks for the empathy! They were supposed to back it up to a server they use just for backups, so it would've been on a hard drive & taken only 10 minutes or so to restore all the sites. Instead it's probably going to take a couple of days, with much stuff missing.

The tech told me he was installing an upgraded control panel. He'd installed a test copy and it worked fine, so he went to delete it and wrote out the wrong directory in the command line (one keystroke off I guess).

That's what I don't like about Unix. Pretty unforgiving and not interactive enough. It should've warned...

"You SURE you want to delete every domain on this server, IDIOT?"

But I guess that computers aren't quite there yet...

Although it WAS human error, not the computer.

And in this tech's defense, he did fess up immediately and was still up at 1am working on the server.
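
For what it's worth, the kind of warning wished for above is easy to bolt on in a wrapper script. A minimal Python sketch follows, with everything in it hypothetical: `guarded_delete`, its parameters, and the rule that the caller must retype the directory name are illustrative choices, not anything the hosting company actually used.

```python
import shutil
from pathlib import Path


def guarded_delete(target, allowed_root, confirm):
    """Delete a directory tree only if it sits strictly inside allowed_root
    and the caller has retyped the directory's name as confirmation.

    Hypothetical helper: names and rules are assumptions for illustration.
    """
    target = Path(target).resolve()
    root = Path(allowed_root).resolve()
    # Refuse the root itself and anything outside it.
    if target == root or root not in target.parents:
        raise ValueError(f"refusing to delete {target}: not inside {root}")
    # One wrong keystroke in the path changes the name to retype.
    if confirm != target.name:
        raise ValueError(f"confirmation {confirm!r} does not match {target.name!r}")
    shutil.rmtree(target)
```

A mistyped path then has to be mistyped the same way twice before anything is removed.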
 

Skip

Active member
Veteran
LOL!

Notice the time difference! The two posts immediately below this one were actually posted before (I had to figure out why mine was posting above theirs after I'd read them).

Turns out the techs have just installed the new RAID system into our server (with the wrong time set on it).
 
G

Guest

Damn, Oh well, I guess it just be's that way sometime...
Glad we are back up and running...
Good thing My pics are backed up elsewhere, whew!
 
G

Guest

Sharp_Pain said:
Shit man I was counting on this server to be backed up......I never save pics on my personal computer.

same here... oh well nothing too great of mine was lost... I'm off to my digicam. time to start stocking my gallery again

and my PM's from yesterday were gone earlier... but other ones are still there... whatever
 

Einsteinguy

Member
Bummer Skip

It's only data.
We all know this is a new site so working out the bugs is part of it.
Hate it when the moment you find out the backup isn't working is the moment you need it.

Seen it a few times and it's not pretty; it is always a good idea to test the backup before you need it!


Einstein

:D
 

groo

New member
Skip said:
Why wouldn't you keep a copy of your images? How else you gonna post them on OG?

I keep all my images and can easily repost them.

My beef is with so-called tech companies who don't follow through on their SLA. I'm in the industry myself, and have had to deal with far too many disaster recoveries that cost long, long hours because of incompetent data center management.

Sure the company got all kinds of penalty fees paid and service discounts because of the screwups, but that didn't end up in my pocket and it wouldn't have given me back the lost nights anyhow.

Their incompetence makes the rest of the tech industry look bad.
 

groo

New member
Skip said:
melnibone_ca
That's what I don't like about Unix. Pretty unforgiving and not interactive enough. It should've warned...

"You SURE you want to delete every domain on this server, IDIOT?"

But I guess that computers aren't quite there yet...

Although it WAS human error, not the computer.

And in this tech's defense, he did fess up immediately and was still up at 1am working on the server.

If you think Windows, AS/400, or any other system is more "secure" when the system admin reconfigs the server, you are seriously fooling yourself.

Even if the system did have such checks, you can bet that the first thing an admin does is disable the dialogs and warnings that keep them from doing the job.

It's one thing to have to click the warning once when you do something on a home PC. It's quite another when you have to do it 30-40 times per day, every day. :)

Yes, typos happen -- I don't blame the tech for a typo. I blame the organization for not verifying their backups. Most large sites I've worked do a full system restore from archives twice a year, using the hot failover servers. The sole purpose is to make sure that the backups are archiving the information needed to run the business.

I had to restore an entire project once because of a typo in a job script. Mistakes happen, which is why there is no excuse for a data center that does not follow their backup schedule or verify their process.
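
A restore test like that can be rehearsed on a small scale. The sketch below is a toy Python version under stated assumptions: `tree_digests` and `restore_drill` are made-up names, the backup is a gzip-compressed tar written outside the source directory, and only regular files are compared. It backs a directory up, restores it to a scratch location, and compares SHA-256 checksums.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def tree_digests(root):
    """Map each file's relative path to the SHA-256 of its contents."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def restore_drill(source_dir, archive_path):
    """Back up source_dir, restore it elsewhere, and compare checksums.

    Assumes archive_path is NOT inside source_dir. Returns True when the
    restored files match the originals. Hypothetical helper.
    """
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(source_dir, arcname=".")
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive_path, "r:gz") as archive:
            for member in archive:
                if not member.isfile():
                    continue
                # Archive was created just above, so member names are trusted.
                dest = Path(scratch) / member.name
                dest.parent.mkdir(parents=True, exist_ok=True)
                dest.write_bytes(archive.extractfile(member).read())
        return tree_digests(source_dir) == tree_digests(scratch)
```

The point of the comparison step is that a backup job exiting cleanly says nothing about whether the archive holds what the business needs; only a restore does.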

(I probably read more pissed off than I actually am. Comes from being one of the people who has to identify the cause of screwups like that and make sure they don't happen again. Among other high-stress aspects of the job. ;) )
 

Skip

Active member
Veteran
The backups were never set up (someone failed to do that job), so again, that too was human error.

I'm really surprised people lost posts or PMs cause they did recover the database, and there shouldn't be anything missing, unless you managed to post while the system was being restored (and I warned on the home page about not posting).
 
