Lemmy, what are some of your "oh shit" work stories?

posted 11 months ago

Dio9sys@lemmy.blahaj.zone

asklemmy@lemmy.ml

163 commentshide report

Sort:

Hot Top Controversial New Old

[ - ]

sour@kbin.social

9 points

11 months ago

no work related but am overfill sink with water changer because forgot to remove drain cover

is flood

am get in trouble also ._.

permalink

report

[ - ]

Seven@startrek.website

2 points

11 months ago

Deleted by creator

permalink

report

[ - ]

dan@upvote.au

4 points

11 months ago

Looks like your comment posted twice.

permalink

report

parent

[ - ]

Seven@startrek.website

2 points

11 months ago

Thanks, I’ll give this one the chop!

permalink

report

parent

[ - ]

mkhopper@lemmy.world

35 points

11 months ago

Strap in friends, because this one is a wild ride.

I had stepped into the role of team lead of our IS dept with zero training on our HP mainframe system (early 90s).
The previous team lead wasn’t very well liked and was basically punted out unceremoniously.
While I was still getting up to speed, we had an upgrade on the schedule to have three new hard drives added to the system.

These were SCSI drives back then and required a bunch of pre-wiring and configuration before they could be used. Our contact engineer came out the day before installation to do all that work in preparation of coming back the next morning to get the drives online and integrated into the system.

Back at that time, drives came installed on little metal sleds that fit into the bays.
The CE came back the next day, shut down the system, did the final installations and powered back up. … Nothing.
Two of the drives would mount but one wouldn’t. Did some checking on wiring and tried again. Still nothing. Pull the drive sleds out and just reseat them in different positions on the bus. Now the one drive that originally didn’t mount did and the other two didn’t. What the hell… Check the configs again, reboot again and, success. Everything finally came up as planned.

We had configured the new drives to be a part of the main system volume, so data began migrating to the new devices right away. Because there was so much trouble getting things working, the CE hung around just to make sure everything stayed up and running.

About an hour later, the system came crashing down hard. The CE says, “Do you smell something burning?” Never a good phrase.
We pull the new drives out and then completely apart. One drive, the first one that wouldn’t mount, had been installed on the sled a bit too low. Low enough for metal to metal contact, which shorted out the SCSI bus, bringing the system to its knees.

Fixed that little problem, plug everything back in and … nothing. The drives all mounted fine, but access to the data was completely fucked,
Whatever… Just scratch the drives and reload from backup, you say.

That would work…if there were backups. Come to find out that the previous lead hadn’t been making backups in about six months and no one knew. I was still so green at the time that I wasn’t even aware how backups on this machine worked, let alone make any.

So we have no working system, no good data and no backups. Time to hop a train to Mexico.

We take the three new drives out of the system and reboot, crossing all fingers that we might get lucky. The OS actually booted, but that was it. The data was hopelessly gone.

The CE then started working the phone, calling every next-level support contact he had. After a few hours of pulling drives, changing settings, whimpering, plugging in drives, asking various deities for favors, we couldn’t do any more.

The final possibility was to plug everything back in and let the support team dial in via the emergency 2400 baud support modem.
For the next 18 hours or so, HP support engineers used debug tools to access the data on the new drives and basically recreate it on the original drives.
Once they finished, they asked to make a set of backup tapes. This backup took about 12 hours to run. (Three times longer than normal as I found out later.)
Then we had to scratch the drives and do a reload. This was almost the scariest part because up until that time, there was still blind hope. Wiping the drives meant that we were about to lose everything.
We scratched the drives, reloaded from the backup and then rebooted.

Success! Absolute fucking success. The engineers had restored the data perfectly. We could even find the record that happened to be in mid-write when the system went down. Tears were shed and backs were slapped. We then declared the entire HP support team to be literal gods.

40+ hours were spent in total fixing this problem and much beer was consumed afterwards.

I spent another five years in that position and we never had another serious incident. And you can be damn sure we had a rock solid backup rotation.

(Well, there actually was another problem involving a nightly backup and an inconveniently placed, and accidentally pressed, E-stop button, but that story isn’t nearly as exciting.)

permalink

report

[ - ]

DudeDudenson@lemmings.world

22 points

11 months ago

Imagine the difference trying to get that kind of support these days. Especially from HP

permalink

report

parent

[ - ]

mkhopper@lemmy.world

8 points

11 months ago

No kidding. Where I’m working now, it takes an HP CE over a week just to bring out a new hot swappable drive after we jump through a number of request hoops.

permalink

report

parent

[ - ]

ExLisper@linux.community

14 points

11 months ago

I once pushed a git commit with youtube link as the commit message. Nothing terrible, some completely random video. Still, it looked really weird in the commit history. Turns out you can edit this if you have access to the server and I did have access to the server.

One time in the same company I found a random youtube link in the middle of a java class. Yes, it was still compiling. No I didn’t commit it.

permalink

report

[ - ]

tetris11@lemmy.ml

6 points

11 months ago

Urgh. I sadly do this all the time

Interactive rebase, amend the commit message for your commit, continue the rebase, and force push.

Thank heavens for Magit which simplifies this process.

permalink

report

parent

[ - ]

Skye@lemmy.world

2 points

11 months ago

Emacs!

permalink

report

parent

[ - ]

Rolivers@discuss.tchncs.de

9 points

11 months ago

What’s wrong with that? I’d put a rickroll in there without regrets.

permalink

report

parent

[ - ]

ExLisper@linux.community

4 points

11 months ago

Depends on the company I guess. But yeah, people would probably just laugh at me for being careless.

permalink

report

parent

[ - ]

DudeDudenson@lemmings.world

2 points

11 months ago

I put a “These are not the droids you’re looking for” meme as the architecture documentation for one of our apps on GitHub. My boss approved the PR

permalink

report

parent

Asklemmy

!asklemmy@lemmy.ml

Create post

A loosely moderated place to ask open-ended questions

Search asklemmy 🔍

If your post meets the following criteria, it’s welcome here!

Open-ended question
Not offensive: at this point, we do not have the bandwidth to moderate overtly political discussions. Assume best intent and be excellent to each other.
Not regarding using or support for Lemmy: context, see the list of support communities and tools for finding communities below
Not ad nauseam inducing: please make sure it is a question that would be new to most members
An actual topic of discussion

Looking for support?

Looking for a community?

Lemmyverse: community search
sub.rehab: maps old subreddits to fediverse options, marks official as such
!lemmy411@lemmy.ca: a community for finding communities

_Icon _by _{@Double_A@discuss.tchncs.de}

Community stats

9.2K
Monthly active users
5.9K
Posts
319K
Comments

Community stats

Community moderators