Inufu's comments | Hacker News

For EU / UK: https://geizhals.eu/?cat=hde7s&offset=0&sort=r&hloc=at&hloc=...

Works for an amazingly large set of products!


In that case the CLI only needs to import the plugin that defines that subcommand, not all plugins?


It doesn't know which plugin defines a subcommand until it imports the plugin's module.

I'm happy with the solution I have now, which is to encourage plugin authors not to import PyTorch or other heavy dependencies at the root level of their plugin code.
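
Roughly, the pattern looks like this (a minimal sketch with made-up register/run_embed names, not the actual plugin hook):

    # my_plugin.py - cheap to import, nothing heavy at module level
    def register(cli):
        # Registering the subcommand costs almost nothing.
        cli.add_command("embed", run_embed)

    def run_embed(args):
        import torch  # deferred: only paid when "embed" is actually invoked

        model = torch.nn.Linear(4, 4)
        print(model(torch.zeros(1, 4)))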


> It doesn't know which plugin defines a subcommand until it imports the plugin's module.

That might be considered a design mistake -- one that should be easy to migrate away from.

You won't need to do anything, of course, if the lazy import becomes available on common Python installs some day in the future. That might take years, though.


Author here.

The standard way to do this is Reinforcement Learning: we do not teach the model how to do the task, we let it discover the _how_ for itself and only grade it based on how well it did, then reinforce the attempts where it did well. This way the model can learn wildly superhuman performance, e.g. it's what we used to train AlphaGo and AlphaZero.
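
A toy sketch of that loop (a bare-bones REINFORCE-style policy gradient, nothing like the real AlphaGo/AlphaZero training code):

    # Sample attempts, grade them, reinforce the attempts that scored well.
    import numpy as np

    rng = np.random.default_rng(0)
    n_actions = 4
    logits = np.zeros(n_actions)  # the "policy" being trained

    def grade(action: int) -> float:
        # We only score the outcome; we never say *how* to act.
        return 1.0 if action == 2 else 0.0

    for _ in range(1000):
        probs = np.exp(logits) / np.exp(logits).sum()
        action = rng.choice(n_actions, p=probs)
        reward = grade(action)
        # Push up the log-probability of the sampled action in proportion to its reward.
        grad = -probs
        grad[action] += 1.0
        logits += 0.1 * reward * grad

    print(probs.round(3))  # probability mass ends up on the rewarded action

The only signal the loop ever sees is the grade; it discovers which action to prefer on its own.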


Author here.

The argument is not that it will keep growing exponentially forever (obviously that is physically impossible), rather that:

- given a sustained history of growth along a very predictable trajectory, the highest likelihood short term scenario is continued growth along the same trajectory. Sample a random point on an s-curve and look slightly to the right: what’s the most common direction the curve continues?

- exponential progress is very hard to visualize: it may appear to make hardly any progress while far away from human capabilities, then move from just below to far above human level very quickly


My point is that the limits of LLMs will be hit long before they start to take on human capabilities.

The problem isn’t that exponential growth is hard to visualise. The problem is that LLMs, as advanced and useful as the technique is, aren’t suited for AGI and thus will never get us even remotely close to AGI.

The human like capabilities are really just smoke and mirrors.

It’s like when people anthropomorphise their car; “she’s being temperamental today”. Except we know the car is not intelligent and it’s just a mechanical problem. Whereas it’s in the AI tech firms’ best interest to upsell the human-like characteristics of LLMs, because that’s how they get VC money. And as we know, building and running models isn’t cheap.


>the limits of LLMs will be hit long before they start to take on human capabilities

Against that you have stuff like DeepMind getting gold in the International Collegiate Programming Contest the other week, including solving one problem where "none of the human teams, including the top performers from universities in Russia, China and Japan, got it right" https://www.theguardian.com/technology/2025/sep/17/google-de...

There's kind of a contradiction that they are nowhere near human capabilities while also beating humans in various competitions.


I don’t see that as a contradiction but I do appreciate how some might.

You can train anything to be really good at specialised fields. But that doesn’t mean they’re a good generalist.

For example:

You can train any child to memorise the 10 times table. But that doesn’t mean they can perform long division.

Being an Olympic-class cyclist doesn’t mean you’re any good as an F1 driver, at swimming or at fencing.

Being highly specialised usually means you’re not as good at general things. And that’s as true for humans as it is for computers.


Though in your examples, cyclists can learn to drive, since humans have similar general abilities.

I'll give you that current GPT stuff has its limitations - it can't come fix your plumbing, say, and pre-trained transformers aren't good at learning things after their pre-training - but I'm not sure they are so far from human capabilities that they can't be fixed up.


You cannot use an LLM to solve mathematical equations.

That’s not a training issue; it’s a limitation of a technology that is, at its core, a text prediction engine.


Yet if you look at DeepMind getting gold in the IMO, it seems quite equationish.

Questions and answers: https://storage.googleapis.com/deepmind-media/gemini/IMO_202...


There is no particular reason why AI has to stick to language models though. Indeed if you want human like thinking you pretty much have to go beyond language as we do other stuff too if you see what I mean. A recent example: "Google DeepMind unveils its first “thinking” robotics AI" https://arstechnica.com/google/2025/09/google-deepmind-unvei...


> There is no particular reason why AI has to stick to language models though.

There’s no reason at all. But that’s not the technology that’s in the consumer space, growing exponentially, gaining all the current hype.

So at this point in time, it’s just a theoretical future that will happen inevitably but we don’t know when. It could be next year. It could be 10 years. It could be 100 years or more.

My prediction is that current AI tech plateaus long before any AGI-capable technology emerges.


Yeah, quite possible.


That's a rather poor choice for an example considering Gemini Robotics-ER is built on a tuned version of Gemini, which is itself an LLM. And while the action model is impressive, the actual "reasoning" here is still being handled by an LLM.

From the paper [0]:

> Gemini Robotics 1.5 model family. Both Gemini Robotics 1.5 and Gemini Robotics-ER 1.5 inherit Gemini’s multimodal world knowledge.

> Agentic System Architecture. The full agentic system consists of an orchestrator and an action model that are implemented by the VLM and the VLA, respectively:

> • Orchestrator: The orchestrator processes user input and environmental feedback and controls the overall task flow. It breaks complex tasks into simpler steps that can be executed by the VLA, and it performs success detection to decide when to switch to the next step. To accomplish a user-specified task, it can leverage digital tools to access external information or perform additional reasoning steps. We use GR-ER 1.5 as the orchestrator.

> • Action model: The action model translates instructions issued by the orchestrator into low-level robot actions. It is made available to the orchestrator as a specialized tool and receives instructions via open-vocabulary natural language. The action model is implemented by the GR 1.5 model.

AI researchers have been trying to discover workable architectures for decades, and LLMs are the best we've got so far. There is no reason to believe that this exponential growth on test scores would or even could transfer to other architectures. In fact, the core advantage that LLMs have here is that they can be trained on vast, vast amounts of text scraped from the internet and taken from pirated books. Other model architectures that don't involve next-token-prediction cannot be trained using that same bottomless data source, and trying to learn quickly from real-world experiences is still a problem we haven't solved.

[0] https://storage.googleapis.com/deepmind-media/gemini-robotic...


My problem with takes like this is it presumes a level of understanding of intelligence in general that we simply do not have. We do not understand consciousness at all, much less consciousness that exhibits human intelligence. How are we to know what the exact conditions are that result in human-like intelligence? You’re assuming that there isn’t some emergent phenomenon that LLMs could very well achieve, but have not yet.


I'm not making a philosophical argument about what human-like intelligence is. I'm saying LLMs have many weaknesses that make them incapable of performing basic functions that humans take for granted, like counting and recall.

I go into much more detail here: https://news.ycombinator.com/item?id=45422808

Ostensibly, AGI might use LLMs in parts of its subsystems. But the technology behind LLMs doesn't adapt to all of the problems that AGI would need to solve.

It's a little like how the human brain isn't just one homogeneous grey lump. There are different parts of the brain that specialise in different aspects of cognitive processing.

LLMs might work for language processing, but that doesn't mean it would work for maths reasoning -- and in fact we already know it doesn't.

This is why we need tools / MCPs. We need ways of turning problems LLMs cannot solve into standalone programs that LLMs can cheat with by asking them for the answers.


>the limits of LLMs will be hit long before they start to take on human capabilities.

Why do you think this? The rest of the comment is just rephrasing this point ("LLMs aren't suited for AGI"), but you don't seem to provide any argument.


Fair point.

Basically AGI describes human-like capabilities.

The problem with LLMs is that they’re, at their core, token prediction models. Tokens, typically text, are given numeric values and can then be used to predict what tokens should follow.

This makes them extremely good at things like working with source code and other sources of text where relationships are defined via semantics.

The problem with this is that it makes them very poor at dealing with:

1. Limited datasets. Smaller models are shown to be less powerful, so LLMs often need to be trained on significantly more information than a human would learn in their entire lifetime, just to approximate what that human might produce in any specific subject.

2. Learning new content. Here we have to rely on non-AI tooling like MCPs. This works really well under the current models because we can say “scrape these software development references” (etc) to keep the model up to date. But there’s no independence behind those actions. An MCP only works because its description is included in the prompt, telling the model how to use it and why. Whereas if you look at humans, even babies know how to investigate and learn independently. Our ability to self-learn is one of the core principles of human intelligence.

3. Remembering past content that resides outside of the original model training. I think this is actually a solvable problem for LLMs, but the current behaviour is to bundle all prior interactions into the next prompt. In reality, the LLM hasn’t really remembered anything; you’re just reminding it about everything with each exchange. So each subsequent prompt gets longer and thus more fallible. It also means that context is always volatile. Basically it’s just a hack that only works because context sizes have grown exponentially. But if we want AGI then there needs to be a persistent way of retaining that context. There are some workarounds here, but they depend on tools.

4. Any operation that isn’t semantics-driven. Things like maths, for example. LLMs have to call a tool (like MCPs) to perform calculations -- see the sketch after this list. But that requires a non-AI function to return a result rather than the AI reasoning about maths. So it’s another hack. And there are a lot of domains that fall into this kind of category, where complex tokenisation is simply not enough. This, I think, is going to be the biggest hurdle for LLMs.

5. Anything related to the physical world. We’ve all seen examples of computer vision models drawing too many fingers on a hand or leaving disembodied objects floating. The solutions here are to define what a hand should look like. But without an AI having access to a physical, three-dimensional world to explore, it’s all just guessing what things might look like. This is particularly hard for LLMs because they’re language models, not 3D coordinate systems.
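
To make point 4 concrete, here is a minimal, hypothetical version of the tool-calling pattern (no real MCP or vendor API, just the shape of it): the model emits a structured request, and plain non-AI code computes the answer that gets fed back into the prompt.

    import json

    def calculator(expression: str) -> float:
        # Deliberately tiny "tool": only handles "a <op> b".
        a, op, b = expression.split()
        ops = {"+": lambda x, y: x + y, "-": lambda x, y: x - y,
               "*": lambda x, y: x * y, "/": lambda x, y: x / y}
        return ops[op](float(a), float(b))

    TOOLS = {"calculator": calculator}

    # Pretend the model produced this instead of "reasoning" about the maths itself:
    model_output = json.dumps({"tool": "calculator", "args": {"expression": "12.5 * 8"}})

    call = json.loads(model_output)
    result = TOOLS[call["tool"]](**call["args"])
    print(result)  # 100.0 - computed by ordinary code, not by the LLM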

There’s also the question about whether holding vector databases of token weights is the same thing as “reasoning”, but I’ll leave that argument for the philosophers.

I think a theoretical AGI might use LLMs as part of its subsystems. But it needs to leverage AI throughout, and it needs to handle topics that are more than just token relationships, which LLMs cannot do.


AI services are/will be going hybrid. Just like we have seen in search, with thousands of dedicated subsystems handling niches behind the single unified UI element or API call.


“Hybrid” is just another way of saying “AI isn’t good enough to work independently”. Which is the crux of my point.


The most common part of the S-curve by far is the flat bit before and the flat bit after. We just don't graph it because it's boring. Besides which there is no reason at all to assume that this process will follow that shape. Seems like guesswork backed up by hand waving.


Very much handwaving. The question is not meaningful at all without knowing the parameters of the S-curve. It's like saying "I flipped a coin and saw heads. What's the most likely next flip?"


So it's an argument impossible to counter because it's based on a hypothesis that is impossible to falsify: it predicts that there will either be a bit of progress, or a lot of progress, soon. Well, duh.


That feels like you're moving the goal posts a bit.

Exponential growth over the short term is very uninteresting. Exponential growth is exciting when it can compound.

E.g. if I offered you an investment opportunity at 500% per year compounded daily - that's amazing. If the fine print is that that rate will only last for the very near term (say a week), then it would be worse than a savings account.


Well, growth has been on this exponential already for 5+ years (for the METR eval), and we are at the point where models are very close to matching human expert capabilities in many domains - only one or two more years of growth would put us well beyond that point.

Personally I think we'll see way more growth than that, but to see profound impacts on our economy you only need to believe the much more conservative assumption of a little extra growth along the same trend.


> we are at the point where models are very close to matching human expert capabilities in many domains

This is not true because experts in these domains don't make the same routine errors LLMs do. You may point to broad benchmarks to prove your point, but actual experts in the benchmarked fields can point to numerous examples of purportedly "expert" LLMs making things up in a way no expert would ever.

Expertise is supposed to mean something -- it's supposed to describe both a level of competency and trustworthiness. Until they can be trusted, calling LLMs experts in anything degrades the meaning of expertise.


> we are at the point where models are very close to matching human expert capabilities in many domains

That's a bold claim. I don't think it matches most people's experiences.

If that was really true people wouldn't be talking about exponential growth. You don't need exponential growth if you are already almost at your destination.


Which domains?

What I’ve seen is that LLMs are very good at simulating an extremely well read junior.

Models know all the tricks but not when to use them.

And because of that, you continually have to hand-hold them.

Working with an LLM is really closer to pair programming than it is handing a piece of work to an expert.

The stuff I’ve seen in computer vision is far more impressive in terms of putting people out of a job. But even there, it’s still highly specific models left to churn away at what are ostensibly just long and laborious tasks. Which so much of VFX is.


Yeah exactly!

It’s likely that it will slow down at some point, but the highest likelihood scenario for the near future is that scaling will continue.


> You don’t have to randomize the first part of your object keys to ensure they get spread around and avoid hotspots.

Sorry, this is absolutely still the case if you want to scale throughput beyond the few thousand IOPS a single shard can serve. S3 will automatically reshard your key space, but if your keys are sequential (e.g. a leading timestamp) all your writes will still hit the same shard.

Source: direct conversations with AWS teams.
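
A minimal sketch of the usual workaround (hypothetical key layout, standard library only): derive a short hash prefix so sequential timestamps spread across partitions instead of all landing on one.

    import hashlib

    def sharded_key(timestamp: str, object_id: str) -> str:
        # "2024-06-01T12:00:00/abc123" would sort (and shard) by time;
        # a short hash prefix spreads writes across the key space instead.
        prefix = hashlib.sha256(object_id.encode()).hexdigest()[:4]
        return f"{prefix}/{timestamp}/{object_id}"

    print(sharded_key("2024-06-01T12:00:00", "abc123"))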


Requiring ownership transfer gives up on one of the main selling points of Rust, being able to verify reference lifetimes and safety at compile time. If we have to give up on references then a lot of Rust’s complexity no longer buys us anything.


I'm not sure what you're trying to say, but the compile-time safety requirement isn't given up. It would look something like:

    self.buffer = io_read(self.buffer)?
This isn't much different than

    io_read(&mut self.buffer)?
since Rust doesn't permit simultaneous access when a mutable reference is taken.


It means you can for example no longer do things like get multiple disjoint references into the same buffer for parallel reads/writes of independent chunks.

Or well you can, using unsafe, Arc and Mutex - but at that point the safety guarantees aren’t much better than what I get in well designed C++.

Don’t get me wrong, I still much prefer Rust, but I wish async and references worked together better.

Source: I recently wrote a high-throughput RPC library in Rust (saturating > 100 Gbit NICs)


You can use Claude Code with your own API key if you want to use more tokens than included in the Pro / Max plans.


Yeah I get that. I’m not “stuck”; it’s that I don’t think the comms make sense, and it’s troubling that none of these teams have figured out a pricing model that isn’t a rug pull. If all of these AI LLM coders were priced right, they would be out of the hands of many of the operators that are not dependent users. It’s got a bait-and-switch feel to it. I’ll deal with it. It’s a good product, I just feel like we deserve better and that these guys are smarter than this.

Can you imagine if AWS pulled half of these tricks with cloud services as a subscription not tethered to usage? They wait for you to move all of your infrastructure to them (to the detriment of their competitors) and then … oh we figured out we can’t do business like this, we need to charge based on XYZ… we are all adults and it’s our job to deal with it or move on but… something doesn’t smell right and that’s the problem.


AlphaZero did not run game logic on TPUs (neither for chess nor for other games); implementing it in C++ is more than fast enough and much simpler.

TPUs were used for neural network inference and training, but game logic as well as MCTS was on the CPU using C++.

JAX is awesome though, I use it for all my neural network stuff!


According to the AlphaZero paper (https://arxiv.org/pdf/1712.01815.pdf) they ran game logic on TPUs:

> Training proceeded for 700,000 steps (mini-batches of size 4,096) starting from randomly initialised parameters, using 5,000 first-generation TPUs to generate self-play games and 64 second-generation TPUs to train the neural networks. Further details of the training procedure are provided in the Methods.


This is not a fake imitation, it's streaming results live as they are being generated by the model.

