But wouldn't that novel query optimization still be explained somewhere in a paper using concepts derived from an existing body of work? It's going to ultimately boil down to an explanation of the form "it's like how A and B work, but slightly differently and with this extra step C tucked in the middle, similar to how D does it."
And an LLM could very much ingest such a paper and then, I expect, also understand how the concepts mapped to the source code implementing them.
> And an LLM could very much ingest such a paper and then, I expect, also understand how the concepts mapped to the source code implementing them.
LLMs don't learn from manuals describing how things work; they learn from examples. So a thing being described doesn't let the LLM perform that thing. The LLM needs to have seen many examples of that thing being performed in text to be able to perform it.
This is fundamental to how LLMs work, and you can't get around it without totally changing how they train.