The problem is that we're more or less stuck with this class of problem unless we end up with something that looks like a Xeon Phi without shared resources and runs calculations on many, many truly independent cores, or we accept that worst-case and best-case performance are identical (which I don't foresee anyone really agreeing to).
Or, framed differently, if Intel or AMD announced a new gamer CPU tomorrow that was 3x faster in most games but utterly unsafe against all Meltdown/Spectre-class vulns, how fast do you think they'd sell out?
Larrabee was fun to program, but I think it'd have an even worse time hardening against memory side-channel effects: the barrel processor (which was necessary to get anything like reasonable performance) was humorously easy to use for cross-process exfiltration. Like... it was so easy, we actually used it as an IPC mechanism.
Now you're asking me for technical details from more than a decade ago. My recollection is that you could map one of the caches between cores; there were uncached write-through instructions. By reverse-engineering the cache's hash, you could write to a specific cache line; the UC write would push it up into the correct line, and the "other core" could snoop that line from its side with a lazy read-and-clear. The whole thing was janky-AF, but way the hell faster than sending a message around the ring. (My recollection was that the three interlocking rings could make the longest-range message take hundreds of cycles.)
Sure, absolutely, there are a large number of additional classes of side effects you would need to harden against if you wanted to eliminate everything; I was mostly thinking of something with an enormous number of cores, without the 4-way SMT, as a high-level description.
I was always morbidly curious about programming those, but never to the point of actually buying one, and back in a past life, when we had a few of the cards in my office, I always had more things to do than hours in the day.
"if Intel or AMD announced a new gamer CPU tomorrow that was 3x faster in most games but utterly unsafe against all Meltdown/Spectre-class vulns, how fast do you think they'd sell out"
Well, many people have gaming computers they won't use for anything serious, so I would also buy it. And on restricted gaming consoles, I suppose the risk is not too high?
Also, many games today outright install rootkits to monitor your memory (see [1]) - some Heartbleed-style leak is so far down the list of credible threats on a gaming machine that it's outright ludicrous to trade off performance for it.
They're a pain in the ass all around. Spectre allowed you to read everything paged in (including kernel memory) from JS in the browser.
To mitigate it, browsers did a bunch of hacks, including nerfing the precision of all timer APIs and disabling shared memory, because you need an accurate timer for the exploit - to this day performance.now() rounds to 1 ms on Firefox and 0.1 ms on Chrome.
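You can see the coarsening for yourself with something like the TypeScript sketch below, pasted into a browser console. It reports the smallest non-zero step performance.now() will hand out; observeTimerGranularity is just an illustrative name, and the exact step depends on the browser and on whether the page is cross-origin isolated.

    // Rough sketch: report the smallest non-zero step performance.now() exposes.
    // The exact value depends on the browser and on cross-origin isolation.
    function observeTimerGranularity(samples: number = 100_000): number {
      const steps = new Set<number>();
      let last = performance.now();
      for (let i = 0; i < samples; i++) {
        const now = performance.now();
        if (now !== last) {
          steps.add(Number((now - last).toFixed(4)));
          last = now;
        }
      }
      return Math.min(...steps); // smallest observed tick, in milliseconds
    }

    console.log(`timer step: ~${observeTimerGranularity()} ms`);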
Funnily enough, this 1 ms rounding is a headache for me right now. On, say, a 240 Hz monitor, a game needs to render a frame every ~4.17 ms -- 1 ms precision is not enough for an accurate ticker -- even if you render your frames on time, the result can't be perfectly smooth, because the browser doesn't give you an accurate enough timer by which to advance your physics every frame.
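For the curious, here's roughly where it bites in a standard fixed-timestep loop (names and structure are illustrative, not my actual code): the requestAnimationFrame timestamp only moves in whole milliseconds, so a ~4.17 ms frame period shows up as 4 or 5, and the physics steps land unevenly.

    // Illustrative fixed-timestep loop. With timestamps rounded to 1 ms, the
    // per-frame delta on a 240 Hz display reads as 4 or 5 instead of ~4.1667,
    // so the accumulator drifts and physics steps bunch up.
    const PHYSICS_STEP_MS = 1000 / 240;

    let accumulatorMs = 0;
    let lastTimestamp: number | undefined;

    function stepPhysics(dtMs: number): void {
      // advance the simulation by a fixed amount (placeholder)
    }

    function render(): void {
      // draw the current state (placeholder)
    }

    function frame(timestamp: DOMHighResTimeStamp): void {
      if (lastTimestamp !== undefined) {
        accumulatorMs += timestamp - lastTimestamp; // quantized to whole ms
      }
      lastTimestamp = timestamp;

      while (accumulatorMs >= PHYSICS_STEP_MS) {
        stepPhysics(PHYSICS_STEP_MS);
        accumulatorMs -= PHYSICS_STEP_MS;
      }
      render();
      requestAnimationFrame(frame);
    }

    requestAnimationFrame(frame);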
Isn't it rather about data leaks between any two processes? Whether those two processes belong to different users is a detail of the threat model and the OS's security model. On a console it could well be about data leaks between a game with a code-injection vulnerability and the OS or DRM system.
We already have heterogeneous cores these days, with E-cores and P-cores, and we have a ton of them since they take up little die space relative to cache. The solution, it seems to me, is to have most cores go brrrrrr and a few that are secure.
Given that we have effectively two browser platforms (Chromium and Firefox) and two operating systems to contend with (Linux and Windows), it seems entirely tractable to get the security sensitive threads scheduled to the "S cores".
Also all the TLS, SSH, WireGuard and other encryption - anything with long-persisted secret information. Everything else, even if secret (like displayed OTP codes), is likely too fleeting for a snooping attack to find and exfiltrate, even if an exfiltration channel remains. Until a better exfiltration method is found, of course :-(
I think we're headed towards the future of many highly insulated computing nodes that share little if anything. Maybe they'd have a faster way to communicate, e.g. by remapping fast cache-like memory between cores, but that memory would never be uncontrollably shared the way cache lines are now.
That's a secure enclave aka secure element aka TPM. Once you start wanting security you usually think up enough other features (voltage glitching prevention, memory encryption) that it's worth moving it off the CPU.
Eh, the TPM is a hell of a lot less functional than the security processor on a modern ARM board. You can seal and unseal based on system state, but once things are unsealed, the secret is just sitting in memory.
I agree at a gut / instinct level with that thought.
A SINGLE thread's best-case and worst-case timing have to be the same to avoid leaking through speculation...
However, threads from completely unrelated security domains could be run instead, if ready: most likely the 'next' thread on the same unit, with free slots repacked the next time the scheduler runs.
++ Added ++
It might be possible for operations that don't cross security boundaries - i.e., that stay within a program's own space - to have different performance than operations that do.
An 'enhanced' level of protection might also be offered for threads running VM-like guest code (such as browsers), avoiding the more aggressive speculation.
Any operation that looks like a segmentation fault relative to that thread's allowed memory accesses could result in forfeiture of its timeslice. That would only leak what the thread should already know anyway - what memory it's allowed to access - not the contents of other memory segments.
Itanium allegedly was free from branch prediction issues but I suspect cache behavior still might have been an issue. Unfortunately it's also dead as a doornail.
>if Intel or AMD announced a new gamer CPU tomorrow that was 3x faster in most games but utterly unsafe against all Meltdown/Spectre-class vulns, how fast do you think they'd sell out?
I do realize that gamers aren't the most logical bunch, but aren't most games GPU-bound nowadays?
Not a gamer but I would guess it depends on the graphics settings. At lower resolutions, and with less lighting features, etc. one can probably turn a GPU bound game into a CPU bound game.
Also, a good chunk of these vulnerabilities (Retbleed, Downfall, Rowhammer, there are probably a few I'm forgetting) are either theoretical, lab-only, or targeted exploits that require a lot of setup. And the info leaked by something like Retbleed mostly matters on shared machines, like cloud infrastructure.
Which makes it kind of terrible that the kernel has these mitigations turned on by default, costing somewhere in the neighborhood of 20-60% of performance on older-gen hardware, just because the kernel has to ship with one-size-fits-all defaults.
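For what it's worth, on Linux you can at least inspect which mitigations your kernel applied; the sysfs path below is standard, the script around it is just an illustrative Node/TypeScript sketch.

    // Print the kernel's view of each CPU vulnerability and its mitigation.
    // Linux only; each file contains a line like "Mitigation: ..." or "Not affected".
    import { readdirSync, readFileSync } from "node:fs";
    import { join } from "node:path";

    const dir = "/sys/devices/system/cpu/vulnerabilities";
    for (const name of readdirSync(dir)) {
      const status = readFileSync(join(dir, name), "utf8").trim();
      console.log(`${name}: ${status}`);
    }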
I don’t think you are thinking of this right. One bit of leakage makes it half as hard to break encryption via brute force. It’s a serious problem. The defaults are justified.
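Spelling out the arithmetic behind "half as hard" (plain brute-force counting, nothing specific to these attacks):

    % Each leaked key bit halves the remaining brute-force search space:
    \[
      2^{n} \;\longrightarrow\; 2^{n-1},
      \qquad \text{e.g. } 2^{128} \rightarrow 2^{127} \text{ candidate keys.}
    \]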
I think things will only shift once we have systems that ship with full sandboxes that are minimally optimized and fully isolated. Until then we're forced to assume the worst.
> I don’t think you are thinking of this right. One bit of leakage makes it half as hard to break encryption via brute force.
The problem is that you need to execute on the system, then need to know which application you’re targeting, then figure out the timings, and even then you’re not certain you are getting the bits you want.
Enabling mitigations for servers? Sure. Cloud servers? Definitely. High-profile targets? Go for it.
The current defaults are like foisting iOS's "Lockdown Mode" on all users by default and then expecting them to figure out how to turn it off, except you have to do it by connecting the phone to your Mac/PC and punching in a bunch of terminal commands.
Then again, almost all kernel settings are server-optimal (and even then, 90s-server optimal). There honestly should be some serious effort to modernize the defaults for reasonably modern servers, and then also a separate kernel flavor for desktops (akin to CachyOS, just more upstream).
Maybe so, but I think most users are more likely to under-estimate their security sensitivity than to over-estimate it. On top of that, security profiles can change, and people may not remember to update their settings to match their current security needs.
These defaults are needed and if the loss is so massive we should be willing to embrace less programmable but more secure options.