So here's a story of, by far, the weirdest bug I've encountered in my CS career. (threadreaderapp.com)

posted 6 months ago

Along with @maciejwolczyk we’ve been training a neural network that learns how to play NetHack, an old roguelike game (shown in the screenshot). Recently, something unexpected happened.

ArbitraryValue@sh.itjust.works

51 points

6 months ago

Their problem:

So apparently NetHack has a mechanic that slightly changes how the game plays whenever there is a full moon according to your system clock.

The model wasn’t trained on a full moon. They had a system to set up the environment for replicable results but it didn’t include modifying the system time.
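To make the mechanic concrete, here is a rough Python sketch of a moon-phase calculation modeled on NetHack's `phase_of_the_moon` routine. It is reconstructed from memory, so treat the constants as an approximation and not as the game's exact source. The names `moon_phase` and `is_full_moon` and the explicit `today` parameter are assumptions added for illustration.

```python
from datetime import date


def moon_phase(today: date) -> int:
    """Approximate moon phase in the range 0-7 (0 = new moon, 4 = full moon)."""
    day_of_year = today.timetuple().tm_yday - 1  # 0-based, like C's tm_yday
    golden = (today.year - 1900) % 19 + 1        # position in the 19-year lunar cycle
    epact = (11 * golden + 18) % 30
    if (epact == 25 and golden > 11) or epact == 24:
        epact += 1
    return ((((day_of_year + epact) * 6) + 11) % 177 // 22) & 7


def is_full_moon(today: date) -> bool:
    # Assumption: the game treats phase 4 as "full moon".
    return moon_phase(today) == 4
```

Because the date is passed in rather than read from the system clock, a training or evaluation harness could pin it the same way it pins random seeds.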

It reminds me of another bug with the system time, which a friend of mine encountered. He was working on hardware and he was getting a lot of units that worked fine at the factory, immediately failed at the client’s location, and then worked again when they were returned to the factory. It turned out that when these machines were turned on, their embedded OS automatically queried some server to update the current time. The client’s internet connection had such high latency that the server’s response only came back after the machine was already in use. This generated a huge delta-t value that triggered the sanity checks and shut the machine down. The factory had a much lower-latency connection and so the race condition could never be replicated there.
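A small simulation of that failure mode, as I understand it from the description: the sanity check measures delta-t with the wall clock, and a late time sync makes the wall clock jump mid-run. Everything here (`FakeDevice`, `MAX_DT`, `run`, the step size) is hypothetical and only illustrates the race; the real firmware surely looked different.

```python
class FakeDevice:
    """Simulated device: wall clock = uptime + an offset set by time sync."""

    def __init__(self):
        self.uptime = 0.0       # seconds since power-on, never jumps
        self.wall_offset = 0.0  # unsynced wall clock starts at epoch 0

    def advance(self, seconds):
        self.uptime += seconds

    def wall_time(self):
        return self.uptime + self.wall_offset

    def apply_time_sync(self, true_epoch):
        # The server response arrives: the wall clock jumps to the real time.
        self.wall_offset = true_epoch - self.uptime


MAX_DT = 1.0  # hypothetical sanity limit on delta-t per control step


def run(device, clock, sync_arrives_at_step=None):
    """Run five control steps; shut down if delta-t fails the sanity check."""
    last = clock()
    for step in range(5):
        if step == sync_arrives_at_step:
            device.apply_time_sync(1_700_000_000.0)
        device.advance(0.1)
        now = clock()
        if not 0 <= now - last <= MAX_DT:
            return f"shutdown at step {step}"
        last = now
    return "ok"
```

With the wall clock and a sync that lands at step 2 (the high-latency client), `run` reports a shutdown. With a sync applied before the run starts (the factory), or with the uptime counter used as the clock, it completes. Measuring intervals with a monotonic clock such as Python's `time.monotonic` avoids this class of problem.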

As for the weirdest bug I ever encountered myself: a compiler generating bad machine code. I have often said that the worst part of programming is that the computer always does exactly what you tell it to, but that was the one and only time in twenty years that the computer actually didn’t.

tsonfeir@lemmy.world

7 points

6 months ago

Their problem was not understanding the game ;)

TJA!@sh.itjust.works

45 points

6 months ago

That reminds me of The case of the 500-mile email

Doubletwist@lemmy.world

2 points

6 months ago

That was the first thing I thought of.

nomad@infosec.pub

19 points

6 months ago

Reminds me of a production bug we could not replicate for the life of us.

Logically, the condition could not be reached. Impossible.

Turns out, in production we had two threads per process, and one would monkey patch a function in the shared process space using a locking mechanism that was not thread-safe.

That took several days to find.
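For anyone who wants to picture it, here is a minimal sketch of what a thread-safe version of such a patch could look like. This is not our production code: `api`, `call_greet`, and `with_patched_greet` are made-up names, and the idea is simply that the patcher and every caller go through the same real lock.

```python
import threading
import types

# Stand-in for a module shared by all threads in the process.
api = types.SimpleNamespace(greet=lambda: "original")

# One reentrant lock shared by patchers and callers.
_patch_lock = threading.RLock()


def call_greet():
    with _patch_lock:
        return api.greet()


def with_patched_greet(replacement, action):
    """Swap api.greet for the duration of action(), holding the lock throughout."""
    with _patch_lock:
        saved = api.greet
        api.greet = replacement
        try:
            return action()
        finally:
            api.greet = saved  # always restore, even if action() raises


def demo(iterations=200):
    """One thread patches repeatedly while another calls; return what the caller saw."""
    seen = []

    def patcher():
        for _ in range(iterations):
            with_patched_greet(lambda: "patched", call_greet)

    def caller():
        for _ in range(iterations):
            seen.append(call_greet())

    threads = [threading.Thread(target=patcher), threading.Thread(target=caller)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return set(seen)
```

With the shared lock, the calling thread should only ever observe the original function. Replace the lock with a plain boolean flag and the patched version can leak into the other thread, which is roughly the "impossible" state we kept hitting.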

dohpaz42@lemmy.world

15 points

6 months ago

This is a common problem when testing time-based software, and it is similarly why database-driven software is difficult to test. You have to put a lot of effort into setting up a good test environment and genuinely understand the software and its dependencies.
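As a small illustration of pinning time in a test, here is a sketch using the standard library's `unittest.mock.patch`. The function `token_expired` and the helper `check_expiry_at` are hypothetical stand-ins for whatever time-dependent code is under test.

```python
import time
from unittest import mock


def token_expired(issued_at: float, ttl_seconds: float) -> bool:
    """Hypothetical time-dependent function under test."""
    return time.time() - issued_at > ttl_seconds


def check_expiry_at(fake_now: float) -> bool:
    # Pin the clock so the result does not depend on when the test runs.
    with mock.patch("time.time", return_value=fake_now):
        return token_expired(issued_at=1_000.0, ttl_seconds=60.0)
```

The same idea applies to any hidden input: the test only becomes repeatable once the dependency is made explicit and controllable.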

tal@lemmy.today

23 points

6 months ago

Not my bug and not CS, but I think the most difficult bug(s) I’ve read about are those of the American Mark 14 torpedo in World War II. A combination of a constrained budget for testing before the war, extreme inability to meet supply (and thus to spare any for testing), difficulty observing the thing in operation (it’s a torpedo, and the target probably isn’t too amenable to you looking at the thing if it doesn’t work well), secrecy, cutting-edge technology, a number of modes of operation (including both a contact fuze and a magnetic proximity fuze), and multiple bugs that tended to mask or affect each other, including specifically:

A tendency to run deeper than set (and sometimes go too deep and not hit or detect a ship)
A tendency to bend a critical pin on impact if the torpedo hit a ship at something like a right angle, but not at an oblique angle; if the pin bent, the torpedo would not detonate.
Testing that happened in the Atlantic, but with most use in the Pacific. It turns out that Earth’s magnetic field is not uniform, and varies enough to throw off magnetic fuzes and cause premature explosions or non-explosions.

…led to the US fighting a heavily naval war, where the main weapon for sinking major ships was the torpedo…but where that torpedo wasn’t really very functional for something like 18 months of fighting.

Wikipedia has a somewhat longer version.

This long explanation is probably the best I’ve read.

slurpyslop@kbin.social

34 points

6 months ago

it was the WEIRDEST bug in our chess ai you guys

the pawn captured another pawn that was NEXT TO IT

like what’s going on there

This is fine🔥🐶☕🔥@lemmy.world

21 points

6 months ago

Holy hell

yukijoou@lemmy.blahaj.zone

7 points

6 months ago

New response just dropped

SharkAttak@kbin.social

7 points

6 months ago

It makes sense for the pawn: “But it’s right here! Why shouldn’t I kill it?!”

Technology

!technology@lemmy.world
