Heritage Owners Club

The Good, The Ugly and The Bad


Administrator


The Good:  The HOC is back up!

The Ugly: The HOC server dropped a RAID card late last night or some time early this morning. The NOC crew was so kind as to set aside their donuts and coffee before hustling in to the data center to resolve this hardware issue. They even kicked off a restore of the physical drives, out of the kindness of their hearts.

Alas, it was not to be. Apparently, two of the drives in the RAID set were corrupted. But, wait, there's more!

It also turns out that automated backups were running. That's Good! But, they weren't running successfully. That's Bad! But, I was able to recover from the last known good backup. That's Good! But, the last known good run was overnight on 7 JUN 2020. That's Bad! Yes, yes it is. To wit...

The Bad: All data between 7 JUN 2020 and today was lost.


Ouch! Thank you for all your efforts. I still think we should have some kind of annual fee for the site. Think about it.

After thinking about it, the way this year is going this is not unusual.


The Great and Powerful Oz saves the day again!!

I was just noticing the dates on the posts when I saw this post.   

The beloved Heritage Owners Club is back online, and many, many thanks from those that hold it in the highest esteem...!!



Thanks for getting things up and running, Mr Admin!

I guess it's really true... there are only two types of hard drives: those that have crashed and those that are going to crash!

Glad to see the HOC back up and running, and sorry to see that some of the "pearls of wisdom" were lost.   

Alas,  we shall go on!!!!!


Which country should we blame for the raid?!

Never mind, that would be politics. 


12 minutes ago, ElNumero said:

Which country should we blame for the raid?!

Never mind, that would be politics. 

That would also be RAID, as in Redundant Array of Independent Disks; although in this case, it seems that the redundancy turned out to be rather redundant.  


RAID works great until all the drives are on one controller and the controller itself fails, slowly, leading to corruption of the underlying disks. This is why I don't do technical hands-on crap any more. Or...do I?

@Gitfiddler the HOC database that drives the forum only has (had) data up through 7 JUN. Anything else may or may not have been archived by Google, Bing or the Wayback Machine (archive.org).  Truly, it is regrettable, and I apologize to all those whose postings have been lost, as Roy Batty said, "...like tears in rain."


27 minutes ago, Administrator said:

RAID works great until all the drives are on one controller and the controller itself fails, slowly, leading to corruption of the underlying disks. This is why I don't do technical hands-on crap any more. Or...do I?

@Gitfiddler the HOC database that drives the forum only has (had) data up through 7 JUN. Anything else may or may not have been archived by Google, Bing or the Wayback Machine (archive.org).  Truly, it is regrettable, and I apologize to all those whose postings have been lost, as Roy Batty said, "...like tears in rain."

No worries. We are lucky to have you!!!


Admin,  Thank you so much for the endless effort you put into keeping us connected.  While I visit often, I am not a frequent poster.  I do feel an important connection to the membership here.  I never leave without gaining some kind of insight, and more importantly, wearing a smile.  You and all the good HOC folks are a very special breed.  

Thanks again!


Mr. Admin, sir,

Any chance that mirroring would be a better way to protect the contents of the disk storage?  (And yes, I realize the data isn't necessarily on disk drives any more.)  

Not long ago I searched for offsite storage with some sort of automated backup routine.  The machine I currently use has 2TB of solid-state storage, and I wanted offsite backup for my (thousands of) photo files.  One place I contacted offered 2TB of backup storage for the princely sum of $8 per month.  When I look back on my decades of AS400 hardware management, on which mirroring was absurdly expensive, $8 per month seems like it's pretty well free.  Trouble was, their service was to mirror my storage, with every add/update immediately reflected on their drives.  They had no automated backup method.  Not what I was after, so I passed.  But I was shocked at how inexpensive disk (sorry, data) protection can be these days.

 


@LK155 The backups (monthly and weekly fulls, daily incrementals) are stored in Amazon S3. That's fairly cost-effective, and transfer rates are fast because the data center we're in is on a dedicated link to a nearby NAP. The data loss happened because the backups were apparently running but failing silently (meaning: I never received a notice they'd failed, which I suspect must somehow be related to the corruption on the underlying disks...perhaps the backup script or the SMTP or IMAP daemon was FUBAR?). All part of the fun and exciting world of system administration, I suppose.
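For anyone curious how a silent failure like this can be caught, here is a minimal sketch of a freshness check that alarms on the absence of a recent good backup instead of waiting for a failure email. It is not the HOC's actual setup: the `BackupRun` record, the function names, and the two-day threshold are all hypothetical, and a real check would read run history from wherever the backup job logs it.

```python
from datetime import datetime, timedelta
from typing import Iterable, NamedTuple, Optional


class BackupRun(NamedTuple):
    """One backup attempt (hypothetical record shape)."""
    finished_at: datetime
    succeeded: bool


def last_good_run(runs: Iterable[BackupRun]) -> Optional[datetime]:
    """Return the finish time of the most recent successful run, if any."""
    good = [run.finished_at for run in runs if run.succeeded]
    return max(good) if good else None


def backup_is_stale(runs: Iterable[BackupRun], now: datetime,
                    max_age: timedelta = timedelta(days=2)) -> bool:
    """True when no successful run finished within max_age of now."""
    latest = last_good_run(runs)
    return latest is None or now - latest > max_age


# Illustrative history: the last good run is 7 JUN 2020 (from the thread);
# the later failed runs and the "now" date are made up for the example.
history = [
    BackupRun(datetime(2020, 6, 6, 3, 0), True),
    BackupRun(datetime(2020, 6, 7, 3, 0), True),
    BackupRun(datetime(2020, 6, 8, 3, 0), False),
    BackupRun(datetime(2020, 6, 9, 3, 0), False),
]
if backup_is_stale(history, now=datetime(2020, 6, 10, 9, 0)):
    print("ALERT: no successful backup in the last 2 days")
```

The point of the design is that the alarm fires when good news stops arriving, so it still works if the backup script or the mail daemon is the thing that broke. Ideally it runs on a different machine from the one being backed up.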

Once upon a time, in a previous life, I was tasked with setting up business continuity/disaster recovery (BC/DR) for a health care provider. When I asked for the RTO and RPO constraints, I was told, "immediately" and "zero loss". When I asked for the budget allocation, I was told, "No additional budget." So...real-time replication of data as it writes in the production data center with zero latency across the network to the remote site (impossible due to the laws of physics) and at no additional cost. Got it. That was all AS400 and RS6000 hardware, right at the dawn of IBM GEO/HACMP. Next time we meet, I'd be happy to share the gory details and predictable outcome over a nice Rioja or cabernet.
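To put a rough number on the "laws of physics" remark, here is a small back-of-the-envelope sketch. It assumes light travels through optical fiber at about 200,000 km/s (roughly two thirds of its speed in a vacuum) and ignores switching, routing, and disk time, so it gives only a lower bound. The 100 km distance is a made-up example, not the distance from the story.

```python
# Assumed propagation speed in optical fiber, about two thirds of c.
FIBER_KM_PER_S = 200_000.0


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time over fiber, in milliseconds.

    Counts propagation delay only; real networks add more on top.
    """
    if distance_km < 0:
        raise ValueError("distance_km must be non-negative")
    return 2.0 * distance_km * 1000.0 / FIBER_KM_PER_S


# Hypothetical remote site 100 km away.
print(min_round_trip_ms(100.0))  # 1.0
```

With synchronous replication, a write is not acknowledged until the remote site confirms it, so every write pays at least this round trip. "Zero loss" is achievable that way, but "zero latency" is not at any distance greater than zero.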

 


15 hours ago, Administrator said:

Once upon a time, in a previous life, I was tasked with setting up business continuity/disaster recovery (BC/DR) for a health care provider. When I asked for the RTO and RPO constraints, I was told, "immediately" and "zero loss". When I asked for the budget allocation, I was told, "No additional budget." So...real-time replication of data as it writes in the production data center with zero latency across the network to the remote site (impossible due to the laws of physics) and at no additional cost. Got it. That was all AS400 and RS6000 hardware, right at the dawn of IBM GEO/HACMP. Next time we meet, I'd be happy to share the gory details and predictable outcome over a nice Rioja or cabernet.

You weren't by any chance working for this guy, were you? Somehow the "do it right now, perfectly, and with no resources" sounds very familiar.

[Image: Pointy-haired Boss]


Archived

This topic is now archived and is closed to further replies.
