Heritage Owners Club
Administrator

The Good, The Ugly and The Bad

The Good:  The HOC is back up!

The Ugly: The HOC server dropped a RAID card late last night or some time early this morning. The NOC crew was so kind as to set aside their donuts and coffee before hustling in to the data center to resolve this hardware issue. They even kicked off a restore of the physical drives, out of the kindness of their hearts.

Alas, it was not to be. Apparently, two of the drives in the RAID set were corrupted. But, wait, there's more!

It also turns out that automated backups were running. That's Good! But, they weren't running successfully. That's Bad! But, I was able to recover from the last known good backup. That's Good! But, the last known good run was overnight on 7 JUN 2020. That's Bad! Yes, yes it is. To wit...

The Bad: All data between 7 JUN 2020 and today was lost.


Ouch! Thank you for all your efforts. I still think we should have some kind of annual fee for the site. Think about it.

After thinking about it, the way this year is going, this is not unusual.

Edited by skydog52
  • Upvote 1


The Great and Powerful Oz saves the day again!!

I was just noticing the dates on the posts when I saw this post.   

The beloved Heritage Owners Club is back online, and many, many thanks from those that hold it in the highest esteem...!!

147b.jpg


Thanks, Admin, for all of the firefighting!

  • Like 1


It could've been worse; we could have lost a lot more data and I'd be tully again.

Thanks for your work here. It is appreciated.


Didn't see this until just now. This explains why a few threads disappeared.

Edited by DetroitBlues


thanks Admins


Thanks for getting things up and running, Mr Admin!

I guess it's really true... there are only two types of hard drives: those that have crashed and those that are going to crash!

Glad to see the HOC back up and running, and sorry to see that some of the "pearls of wisdom" were lost.   

Alas,  we shall go on!!!!!

  • Like 1


Which country should we blame for  the raid!!!

Never mind, that would be politics. 

12 minutes ago, ElNumero said:

Which country should we blame for  the raid!!!

Never mind, that would be politics. 

That would also be RAID, as in Redundant Array of Independent Disks; although in this case, it seems that the redundancy turned out to be rather redundant.  


So, unlike the old archived HOC posts, which can still be Googled, did this latest issue wipe out the time frame noted, never to be restored?  It's not in a Cloud somewhere, floating around in cyberspace? 


RAID works great until all the drives are on one controller and the controller itself fails, slowly, leading to corruption of the underlying disks. This is why I don't do technical hands-on crap any more. Or...do I?

@Gitfiddler the HOC database that drives the forum only has (had) data up through 7 JUN. Anything else may or may not have been archived by Google, Bing or the Wayback Machine (archive.org).  Truly, it is regrettable, and I apologize to all those whose postings have been lost, as Roy Batty said, "...like tears in rain."

27 minutes ago, Administrator said:

RAID works great until all the drives are on one controller and the controller itself fails, slowly, leading to corruption of the underlying disks. This is why I don't do technical hands-on crap any more. Or...do I?

@Gitfiddler the HOC database that drives the forum only has (had) data up through 7 JUN. Anything else may or may not have been archived by Google, Bing or the Wayback Machine (archive.org).  Truly, it is regrettable, and I apologize to all those whose postings have been lost, as Roy Batty said, "...like tears in rain."

No worries.  We are lucky to have you!!!  :drink_mini:


Admin,  Thank you so much for the endless effort you put into keeping us connected.  While I visit often, I am not a frequent poster.  I do feel an important connection to the membership here.  I never leave without gaining some kind of insight, and more importantly, wearing a smile.  You and all the good HOC folks are a very special breed.  

Thanks again!

  • Like 2


Sir, you deserve a ton of credit and thank-yous for all the work you do. So I thank you. I very seldom post, but I log in almost every day.


Mr. Admin, sir,

Any chance that mirroring would be a better way to protect the contents of the disk storage?  (And yes, I realize the data isn't necessarily on disk drives any more.)  

Not long ago I searched for offsite storage with some sort of automated backup routine.  The machine I currently use has 2TB of solid-state storage, and I wanted offsite backup for my (thousands of) photo files.  One place I contacted offered 2TB of backup storage for the princely sum of $8 per month.  When I look back on my decades of AS400 hardware management, on which mirroring was absurdly expensive, $8 per month seems like it's pretty well free.  Trouble was, their service was to mirror my storage, with every add/update immediately reflected on their drives.  They had no automated backup method.  Not what I was after, so I passed.  But I was shocked at how inexpensive disk (sorry, data) protection can be these days.

 


@LK155 The backups (monthly and weekly full, daily incrementals) are stored in Amazon S3. That's fairly cost-effective, and transfer rates are fast because the data center we're in is on a dedicated link to a nearby NAP. The data loss was due to the backups apparently running but failing silently (meaning: I never received a notice that they'd failed, which I suspect must somehow be related to the corruption on the underlying disks...perhaps the backup script or the SMTP or IMAP daemon was FUBAR?). All part of the fun and exciting world of system administration, I suppose.
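One way to guard against a silent failure like this is to alert on the absence of a recent verified backup instead of waiting for a failure notice that may never arrive. A minimal sketch in Python follows; it is not the HOC's actual setup. `BackupRun`, `last_good_backup`, `backup_is_stale` and the two-day threshold are all hypothetical names and values chosen for illustration, and the run records would have to come from wherever the backup job logs its results.

```python
from datetime import datetime, timedelta
from typing import NamedTuple, Optional, Sequence


class BackupRun(NamedTuple):
    """One backup attempt (hypothetical record format)."""
    finished_at: datetime
    succeeded: bool  # True only if the run was verified, not merely started


def last_good_backup(runs: Sequence[BackupRun]) -> Optional[datetime]:
    """Return the finish time of the most recent successful run, or None."""
    good = [run.finished_at for run in runs if run.succeeded]
    return max(good) if good else None


def backup_is_stale(
    runs: Sequence[BackupRun],
    now: datetime,
    max_age: timedelta = timedelta(days=2),  # assumed threshold
) -> bool:
    """True if there is no successful run, or the newest one is too old."""
    last = last_good_backup(runs)
    return last is None or (now - last) > max_age


if __name__ == "__main__":
    # Last good run on 7 JUN 2020, then a failed run; "now" is a made-up date.
    runs = [
        BackupRun(datetime(2020, 6, 7, 3, 0), True),
        BackupRun(datetime(2020, 6, 8, 3, 0), False),
    ]
    if backup_is_stale(runs, now=datetime(2020, 6, 20)):
        print("ALERT: no verified backup within the allowed window")
```

The point of the design is that the check runs separately from the backup job and, ideally, from a different machine, so a broken script or mail daemon on the server cannot suppress the warning.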

Once upon a time, in a previous life, I was tasked with setting up business continuity/disaster recovery (BC/DR) for a health care provider. When I asked for the RTO and RPO constraints, I was told, "immediately" and "zero loss". When I asked for the budget allocation, I was told, "No additional budget." So...real time replication of data as it writes in the production data center with zero latency across the network to the remote site (impossible due to the laws of physics) and at no additional cost. Got it. That was all AS400 and RS6000 hardware, right at the dawn of IBM GEO/HACMP. Next time we meet, I'd be happy to share the gory details and predictable outcome over a nice roja or cabernet. 
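For anyone curious about the "laws of physics" part: zero data loss generally means each write has to be acknowledged by the remote site before it counts as committed, so every write waits at least one network round trip. Here is a rough Python sketch of that lower bound. The function name, the 100 km distance and the refractive index of 1.47 for optical fiber are assumptions for illustration, and real links add switching and protocol overhead on top.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # speed of light in vacuum


def min_round_trip_ms(distance_km: float, refractive_index: float = 1.47) -> float:
    """Lower bound on round-trip time over fiber, in milliseconds.

    refractive_index is an assumed typical value for optical fiber;
    light in the fiber travels at c divided by that index.
    """
    if distance_km < 0:
        raise ValueError("distance_km must be non-negative")
    if refractive_index < 1:
        raise ValueError("refractive_index must be at least 1")
    speed_in_fiber = SPEED_OF_LIGHT_KM_PER_S / refractive_index
    return 2 * distance_km / speed_in_fiber * 1000


if __name__ == "__main__":
    # Hypothetical remote site 100 km away: roughly a millisecond per write,
    # before any equipment or software delay is counted.
    print(f"{min_round_trip_ms(100):.2f} ms minimum per acknowledged write")
```

The bound is small but never zero, and it grows linearly with distance.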

 


yay, computers!

:D

and thx again for all the work you put into keeping this place up!


Geez, Mr. Admin, it's pretty obvious I've been out of the picture for too long, as I can only relate to two of the terms you used in your reply:  cabernet and NAP.

 

  • Like 1

15 hours ago, Administrator said:

Once upon a time, in a previous life, I was tasked with setting up business continuity/disaster recovery (BC/DR) for a health care provider. When I asked for the RTO and RPO constraints, I was told, "immediately" and "zero loss". When I asked for the budget allocation, I was told, "No additional budget." So...real time replication of data as it writes in the production data center with zero latency across the network to the remote site (impossible due to the laws of physics) and at no additional cost. Got it. That was all AS400 and RS6000 hardware, right at the dawn of IBM GEO/HACMP. Next time we meet, I'd be happy to share the gory details and predictable outcome over a nice roja or cabernet. 

You weren't by any chance working for this guy, were you? Somehow the "do it right now, perfectly, and with no resources" sounds very familiar.

Pointy-haired_Boss.png


So close to my overposting record.... Oh well!  I’ve had this issue several times over in the telecom world.
