Morphit

Morphit@feddit.uk
7 posts • 169 comments

You can still catch the error at runtime and do something appropriate. That might mean flagging the update as possibly tampered with and refusing to boot, but more likely it would mean sending an error report back to the developers saying an unexpected condition was hit, then continuing without loading that one faulty definition file.
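
As a rough illustration of "report it and carry on without that one file", here is a minimal user-space sketch in Python. Everything in it is hypothetical: the JSON format, the file names, and the `report` callback are stand-ins, not anything Crowdstrike actually uses, and a real kernel driver would need the equivalent checks in its own language.

```python
import json


def load_definitions(raw_files, report):
    """Parse each definition blob; skip and report any that fail.

    raw_files: dict mapping a (hypothetical) file name to its text content.
    report:    callback taking (file_name, message), standing in for an
               error report sent back to the developers.
    """
    loaded = {}
    for name, blob in raw_files.items():
        try:
            rules = json.loads(blob)
            if not isinstance(rules, list):
                raise ValueError("expected a list of rules")
            loaded[name] = rules
        except ValueError as exc:  # json.JSONDecodeError is a ValueError
            # Do not crash: record the problem and keep going without
            # this one faulty file.
            report(name, str(exc))
    return loaded


errors = []
files = {
    "good.json": '[{"pattern": "abc"}]',
    "bad.json": "\x00\x00\x00",  # garbage content, cannot be parsed
}
definitions = load_definitions(files, lambda n, m: errors.append((n, m)))
```

After this runs, `definitions` holds only the file that parsed and `errors` holds one entry naming the bad file, so the caller can carry on with a reduced rule set.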

A page fault can be what triggers a catch, but you can’t unwind what a loaded module (the Crowdstrike driver) did before it crashed. It could have messed with Windows kernel internals and left them in a state that is not safe to continue. Rather than potentially damage the system, Windows stops with a BSOD. The only solution would be to not allow code to be loaded into the kernel at all, but that would make hardware drivers basically impossible.

The driver is in kernel mode. If it crashes, the kernel has no idea if any internal structures have been left in an inconsistent state. If it doesn’t halt then it has the potential to cause all sorts of damage.

I don’t think the kernel could continue like that. The driver runs in kernel mode and took a null pointer exception. The kernel can’t know how badly it’s been screwed by that, so the only feasible option is to BSOD.

The driver itself is where the error handling should take place. First off it ought to have static checks to prove it can’t have trivial memory errors like this. Secondly, if a configuration file fails to load, it should make a determination about whether it’s safe to continue or halt the system to prevent a potential exploit. You know, instead of shitting its pants and letting Windows handle it.
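
To make the "determination" concrete, here is a minimal sketch of that decision in Python. The field count, the `signature_ok` flag, and the three outcomes are all assumptions made up for illustration, not Crowdstrike's actual format or policy. The point is only that the length check happens before anything indexes into the data.

```python
from enum import Enum


class Action(Enum):
    LOAD = "load the file"
    SKIP = "continue without the file"
    HALT = "halt, possible tampering"


EXPECTED_FIELDS = 4  # hypothetical number of fields the driver reads


def decide(fields, signature_ok):
    """Decide what to do with one config record before touching its contents."""
    if not signature_ok:
        # The file may have been tampered with: fail closed.
        return Action.HALT
    if len(fields) != EXPECTED_FIELDS:
        # Authentic but malformed: most likely a bad build, so fail open
        # and never index past the end of the record.
        return Action.SKIP
    return Action.LOAD


good = decide(["a", "b", "c", "d"], signature_ok=True)
short = decide(["a", "b", "c"], signature_ok=True)
forged = decide(["a", "b", "c", "d"], signature_ok=False)
```

Whether a malformed file should mean skip or halt is a policy call for a security product. Either way it is a choice the driver makes on purpose, not a crash.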

This doesn’t really answer my question but Crowdstrike do explain a bit here: https://www.crowdstrike.com/blog/technical-details-on-todays-outage/

These channel files are configuration for the driver and are pushed several times a day. It seems the driver can take a page fault if certain conditions are met. A mistake in a config file triggered this condition and put a lot of machines into a BSOD bootloop.

I think it makes sense that this was a preexisting bug in the driver which was triggered by an erroneous config. What I still don’t know is whether these channel updates have a staged deployment (presumably driver updates do), and what fraction of machines that got the bad update actually had a BSOD.

Anyway, they should rewrite it in Rust.

Does anyone know how these Crowdstrike updates are actually deployed? Presumably the software has its own update mechanism to react to emergent threats without waiting for Patch Tuesday. Can users control the update policy for these ‘channel files’ themselves?

The switches do suck but they can usually be revived with contact cleaner. If you open the mouse you can spray around the switch plunger or better yet, pop off the top half of the switch case and spray the contact directly. That completely cleared up the double click on my G402 and even revived an old MX510 that was missing clicks.

Or if the government sends you the social security numbers of every teacher in the state. Then you’re a hacker for responsibly disclosing the issue:
Missouri gov. calls journalist who found security flaw a “hacker,” threatens to sue

So build concentrated solar power and store the heat for after the sun sets. Bonus: thermal power plant turbines give inertia to the grid, which photovoltaics don’t.

Privacy Enhancing Technologies. That covers the obvious things that give anonymity and plausible deniability, but also zero-knowledge proofs and such.
