Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> leveraging fundamental assumptions about high-performance CPU design.

I believe the generalized fix is to restore the entire CPU state after a mispredict. You’d either need to add an extra copy of the entire processor state (tens of megabits) for every simultaneous predict you support ($$$) or keep track of how to revert all changes and revert them one at a time ($, slow).



This is harder than it seems, because once cache is deleted you can't just un-delete it, you'd have to go back to memory and pull it again.

Only the "extra copy of processor state" thing is really viable. You have to have a speculative cache and buffer in reads that only get flushed to the main cache once they're confirmed to be valid, which is enormously complicated. This facility already exists for writes, but now it needs to exist for reads too.

GP is absolutely correct that this is a fundamental assault on processor design as we know it, the speculative execution concept is going back to the drawing board for a major re-think.


I can’t help wondering what igodard’s day is like so far...


"I'm walking on sunshine..."


Wasn't Intel's transactional CPU Memory the solution, but it also failed to to bugs?

Sorry for quoting wikipedia, but I'm not at school, hah! [1]

'''' TSX provides two software interfaces for designating code regions for transactional execution. Hardware Lock Elision (HLE) is an instruction prefix-based interface designed to be backward compatible with processors without TSX support. Restricted Transactional Memory (RTM) is a new instruction set interface that provides greater flexibility for programmers.[13]

TSX enables optimistic execution of transactional code regions. The hardware monitors multiple threads for conflicting memory accesses, while aborting and rolling back transactions that cannot be successfully completed. Mechanisms are provided for software to detect and handle failed transactions.[13]

In other words, lock elision through transactional execution uses memory transactions as a fast path where possible, while the slow (fallback) path is still a normal lock. ''''

[1] https://en.wikipedia.org/wiki/Transactional_Synchronization_...


Wasn't Intel's transactional CPU Memory the solution, but it also failed to to bugs?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: