
Is counting the number of B's vital? Also, I'm pretty sure you can get an LLM to parse text the way you want it; it just doesn't see your text the way you do, so that simple operation is not straightforward. Similarly, are you worthless because you seem like you understand language but are incapable of counting the number of octets in "blueberry"?


Let's say I hire a plumber because of his plumbing expertise and he bills me $35 and I pay him with a $50 bill and he gives me back $10 in change. He insists he's right about this.

I am now completely justified in worrying about whether the pipes he just installed were actually made of butter.


Really? Is it that easy? It happens quite often that you firmly believe something and are wrong. Maybe you're both right and the $5 bill is on the floor?


> Similarly, are you worthless because you seem like you understand language but are incapable of counting the number of octets in "blueberry"?

Well, I would say that if GP advertised themselves as being able to do so, and then confidently gave an incorrect answer, they would be practically useless at serving their advertised purpose.


So ChatGPT was advertised as a letter counter?

Also, no matter what hype or marketing says: GPT is a statistical word bag with a mostly invisible middleman to give it a bias.

A car is a private transportation vehicle but companies still try to sell it as a lifestyle choice. It's still a car.


It is (maybe not directly but very insistently) advertised as taking many jobs soon.

And counting stuff you have in front of you is a basic skill required everywhere. Counting letters in a word is just a representative task for counting boxes of goods, or money, or kids in a group, or rows in a list on some document; it comes up in all kinds of situations. Of course people insist that AI must do this right. The word bag perhaps can't do it itself, but it can call some better tool, in this case literally one line of Python. And that is actually the topic the article touches on.
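For the record, that "one line of python" really is one line; a minimal sketch of the kind of tool call the comment means, using the built-in `str.count`:

```python
# Counting occurrences of a letter in a word, the task LLMs famously fumble.
# str.count counts non-overlapping occurrences of a substring.
word = "blueberry"
print(word.count("b"))  # 2
```

Handing this off to an interpreter sidesteps the tokenizer entirely, since Python sees characters rather than tokens.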


People always insist that any tool must do things right. They likewise insist that people do things right.

Tools are not perfect, people are not perfect.

Thinking that LLMs must get right the things people find simple is a common mistake, and it is common because we readily treat the machine as a person, when it is only acting like one.


> Thinking that LLMs must get right the things people find simple is a common mistake

Show me any publicly visible figure who tries to rectify this. Everyone peddles hype; there's no Babbage anymore, as in the "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" anecdote.


People and tools that don't do things right aren't useful. They get replaced. Making do with a shitty tool might make sense economically but not in any other way.


If you follow that reasoning, no person is useful and no tool is useful.

The little box I'm filling in now is, compared to a lot of other interfaces, a shitty interface. That doesn't mean it isn't useful. It will probably be replaced eventually, only with a slightly better interface.

The karma system is quite simplistic and far from perfect. I'm sure there are ways to go around it. The moderators make mistakes.

That doesn't mean the karma and moderation are not useful. I hope you get my point but it's fine if we disagree as well.


It is advertised as being able to "analyze data" and "answer complex questions" [0], so I'd hope it could reliably determine when to use its data-analysis capabilities to answer questions, if nothing else.

[0] http://web.archive.org/web/20250729225834/https://openai.com...


Here I am, a mind the size of a planet and what am I doing? Parking cars. - Marvin


As shown by the GPT-5 reaction, a majority of people just have nothing better to ask the models than how many times does the letter "s" appear in "stupid".


I think this is a completely valid thing to do when you have Sam Altman going on daily shows describing it as a genius in your pocket that's smarter than any human alive. Deflating hype bubbles is an important service.


Yeah: Like with self-driving vehicles, the characteristics of when and how something breaks are important, not just some average error-rate.

If users cannot anticipate what does or doesn't constitute risky usage or potential damages, things go Extra Wrong.


But the point is, why would you trust it for anything at all, when it can't do an incredibly simple thing reliably at all? (Yes, I understand the tokenizer makes this hard, but still, it's a quick demonstration that it's just bad technology.)


2 time(s)


I mean, I think that anyone who understands UTF-8 will know that there are nine octets in blueberry when it is written on a web page. If you wanted to be tricky, you could have thrown a Β in there or something.
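The octet count is easy to check in Python (a sketch of the parent's point, including the suggested Greek capital Beta trick, U+0392):

```python
# "blueberry" is pure ASCII: one byte per character in UTF-8.
plain = "blueberry"
print(len(plain.encode("utf-8")))   # 9

# Swap the Latin "b" for a Greek capital Beta (Β, U+0392),
# which encodes to two bytes in UTF-8 -- the count changes.
tricky = "\u0392lueberry"
print(len(tricky.encode("utf-8")))  # 10
```

The two strings can look identical on a web page, which is exactly what makes the substitution tricky.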


> anyone who understands UTF-8

So not too many?


If I have to talk to it in a specific way, why not just use programming? That is, effectively, the specific way we talk to computers...



