Hacker News: gbalduzzi's comments

Neat idea, but why should I use AI for a find and replace?

It feels like shooting a fly with a bazooka


it's like hiring someone to come pick up your trash from your house and put it on the curb.

it's fine if you're disabled


Bazooka guarantees the hit

I like LLMs, but guarantees in LLMs are... you know... not guaranteed ;)

I think that was the point

If all you have is a hammer.. ;)

do you know all the spells you're looking for from memory?

You could just, you know, Google the list.

and then the first thing you see will be at least one of its AI responses, whether you like it or not

You're missing the point: it's only a testing exercise for the new model.

No, the point is that you can set up the testing exercise without using an LLM to do a simple find and replace.

It's a test. Like all tests, it's more or less synthetic and focused on specific expected behavior. I am pretty far from LLMs now, but this seems like a very good test of how genuine this behavior actually is (or repeat it 10x with some scrambling to go deeper).

This thread is about the find-and-replace, not the evaluation. Gambling on whether the first AI replaces the right spells just so the second one can try finding them is unnecessary when find-and-replace is faster, easier, and works 100% of the time.
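For reference, the deterministic alternative being argued for is a one-liner; a sketch with hypothetical file and spell names:

```shell
# Hypothetical input: a story whose "spells" must be swapped out.
printf 'He cast Expelliarmus, then Expelliarmus again.\n' > story.txt

# Deterministic find-and-replace: every occurrence, every time.
sed -i 's/Expelliarmus/Accio/g' story.txt

cat story.txt   # -> He cast Accio, then Accio again.
```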

... I'm not sure if you're trolling or if you missed the point again. The point is to test the LLM's contextual ability and correctness when performing actions that are hopefully guaranteed not to be in the training data.

It has nothing to do with the performance of the string replacement.

The initial "find" is to see how well it actually finds all the "spells" in this case, and then replaces them. Then, using a separate context maybe, you evaluate whether the results are the same or skewed in favour of the training data.


> This is different from swagger / OpenAPI how?

In the way that Swagger / OpenAPI is for API endpoints, but most of the "skills" you need for your agents are not based on API endpoints


I mean conceptually.

Why not just extend the OpenAPI specification to skills? Instead of recreating something that's essentially communicating the same information?

T minus a couple years before someone declares that down-mapping skills into a known verb enumeration promotes better skill organization...


> Why not just extend the OpenAPI specification to skills?

Because approximately none of what exists in the existing OpenAPI specification is relevant to the task, and nothing needed for the tasks is relevant to the current OpenAPI use case, so trying to jam one use case into a tool designed for the other would be pure nonsense.

It’s like needing to drive nails and asking why grab a hammer when you already have a screwdriver.


You think indexing skills in increasingly structured, parameterized formats has nothing to do with documenting REST API endpoints?

The language itself was not invented for the purpose: it was the language spoken in Florence, then adopted by the literary movement and then selected as the national language.

It seems like the best tradeoff between information density and understandability actually comes from the deep Latin roots of the language


I was honestly surprised to find it in first place, because I assumed English would be first given its simpler grammar and the huge dataset available.

I agree with your belief, other languages have either lower density (e.g. German) or lower understandability (e.g. English)


English has a ton of homophones, way more sounds that differ slightly (long/short vowels), and major pronunciation differences across major "official" languages (think Australia/US/Canada/UK).

Italian has one official standard (two, if you count it_CH, but the difference is minor), doesn't pay much attention to stress and vowel length, and only has a few "confusable" sounds (gl/l, gn/n, double consonants, the stuff you get wrong in primary school). Italian dialects would be a disaster tho :)


> Single files in our codebase already blow the Copilot query token limit.

This says more about your code quality than about Copilot, and I'm not a fan of Copilot


I disagree.

Sure, it's a dumpster fire. But human engineers work on it just fine without investing man-decades into refactoring it into some shrine to the software engineer's craft.

The whole point of AI, in our parent company's eyes, is for no one to mention "code quality" as something impeding the delivery of features, yesterday, ever.


Claude, with a modicum of guidance from an engineer familiar with your monolith, could write comprehensive unit tests for your existing system, then refactor it into coherent, composable parts, in a day.

Not doing so while senior management demands the use of AI augmentation seems odd.


It's a 25-year-old CAD application written in very non-standard C++. I doubt it.

Certainly I have tried to accomplish tasks giving Claude guidance far outstripping "a modicum".


It baffles me how rarely the discourse over native apps takes this into consideration.

You reduce development effort by a third. It is fair to debate whether a company this big should invest in a better product anyway, but it is pretty clear why they are doing this


That might be true (although you do add in the mess of web frameworks), but I strongly believe that resource usage must factor into these calculations too. It's a net negative to end users if you can develop an app a bit quicker but require the end users to have multiple more times RAM, CPU, etc.

> multiple more times RAM, CPU, etc.

Part of this (especially the CPU) is teams under-optimizing their Electron apps. See the multi-X speedup examples when they look into it and move hot code to C et al.


It might be a cynical take, but I don't think there is a single person in these companies that cares about end user resource usage. They might care if the target were less tech savvy people that are likely to have some laptop barely holding up with just Win11. But for a developer with a MacBook, what is one more electron window?

I agree. I also find it interesting that many developers don't mind using Docker to run Redis / PostgreSQL and other services on Mac that are very simple to install and run directly. That's fine, but then they don't get to complain about Electron.

There are valid use cases for Docker with those types of software, but most users just use Docker for convenience or because "everyone else" does. Maybe influenced by Linux users, for whom Docker has lower overhead. It's convenient for sure, but it also has a cost on Mac/Windows


Especially given how fast things progress, timeline and performance are a tradeoff where I'd say swaying things in favour of the latter is not per definition net positive.

There's another benefit - you don't have to keep refactoring to keep up with "progress"!

Of course you do!

Microsoft makes a new UI framework every couple of years, Apple ships Liquid Glass, and GNOME has a new GTK version every so often.


Microsoft gets largely pilloried on every UI rethink, Apple’s Liquid Glass just annoyed everyone I’ve heard comment on it, and, fwiw, YouTube Music asking if it feels outdated is an unnecessary annoyance.

>You reduce development effort by a third

Done by the company which sells software which is supposed to reduce it tenfold?


> You don't casually give up massive abstraction wins

Value is value, and levers are levers, regardless of the resources you have or the difficulty of the problem you're solving.

If they can save effort with Electron and put that effort into things their research says users care about more, everyone wins.


That's like a luxury lumber company stuffing its showrooms full of ikea furniture.

Every time I read "save effort with Electron", I go back to a Win2K VM, poke around, and realize how much faster everything is than on an M4 Max, just because value is value and Electron saves some effort.

There are cross-platform GUI toolkits out there, so while I am in team web for lots of reasons, generally it's because web apps are faster and cheaper to iterate on.

Cross-platform GUIs might not have the same level of support and distributed knowledge as HTML/CSS/JS. If that vendor goes away or the OSS maintainers go a different direction, now you have an unsupported GUI platform.

I mean the initial release of Qt predates JavaScript by a few months and CSS by more than a year. GTK is only younger by a few years and both remain actively maintained.

Argument feels more like FUD than something rooted in factual reality.


> You reduce development effort by a third

Sorry to nitpick, but this should be "by three" or "by two thirds", right?


The real question is how much better are native apps compared to Electron apps.

Yes, they take more disk space, but whether it's 50 MB or 500 MB isn't noticeable for most users. Same goes for memory: there is a gain for sure, but unless you open your system monitor you wouldn't know.

So even if it's something the company could afford, is it even worth it?

Also it's not just about cost but opportunity cost. If a feature takes longer to implement natively compared to Electron, that can cause costly delays.


It absolutely is noticeable the moment you have to run several of these electron “apps” at once.

I have a MacBook with 16GB of RAM and I routinely run out of memory from just having Slack, Discord, Cursor, Figma, Spotify and a couple of Firefox tabs open. I went back to listening to mp3s with a native app to have enough memory to run Docker containers for my dev server.

Come on, I could listen to music, program, chat on IRC or Skype, do graphic design, etc. with 512MB of DDR2 back in 2006, and now you couldn’t run a single one of those Electron apps with that amount of memory. How can a billion dollar corporation doing music streaming not have the resources to make a native app, but the Songbird team could do it for free back in 2006?

I’ve shipped cross platform native UIs by myself. It’s not that hard, and with skyrocketing RAM prices, users might be coming back to 8GB laptops. There’s no justification for a big corporation not to have a native app other than developer negligence.


On that note, I could also comfortably fit a couple of chat windows (skype) on a 17'' CRT (1024x768) back in those days. It's not just the "browser-based resource hog" bit that sucks - non-touch UIs have generally become way less space-efficient.

I think the comparison between native apps and Electron apps is conflating two things:

- Native apps integrate well with the native OS look and feel and native OS features. I'd say it's nice to have, but not a must have, especially considering that the same app can run on multiple platforms.

- Native apps use much less RAM than Electron apps. I believe this one is a real issue for many users. Running Slack, Figma, Linear, Spotify, Discord, Obsidian, and others at the same time consumes a lot of memory for no good reason.

Which makes me wonder: is there anything that could be removed from Electron to make it lighter, similar to what Qt does?


Also, modern native UIs have started to look like garbage on desktops / laptops, where you usually want high information density.

Just look at this TreeView in WinUI2 (w/ fluent design) vs a TreeView in the good old event viewer. It just wastes SO MUCH space!

https://f003.backblazeb2.com/file/sharexxx/ShareX/2026/02/mm...

And imo it's just so much easier to write a webapp, than fiddle with WinUI. Of course you can still build on MFC or Win32, but meh.


If I understood correctly, the same can be done in VS Code with the GitHub plugins (for GitHub PRs)

It's pretty straightforward: you check out a PR, move around, and either make some edits (which you can commit and push to the feature branch) or add comments.
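That checkout-edit-push loop is plain git underneath; a minimal local sketch (repo, branch, and file names are hypothetical, and the final push is omitted since it needs a real remote):

```shell
# Simulate reviewing a PR branch locally; all names are hypothetical.
git init -q review-demo && cd review-demo
git config user.email reviewer@example.com
git config user.name Reviewer

echo 'base' > app.txt
git add app.txt && git commit -qm 'initial commit'

git checkout -qb feature/pr-123   # stand-in for checking out the PR branch
echo 'fix' >> app.txt
git commit -qam 'Address review comment'
# a real review would end with: git push origin feature/pr-123

git log --oneline
```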


Good to know about its existence. I think I'll have to do my own sleuthing though, since I'm a (neo)vim user who dislikes GitHub.

I don't understand what kind of evidence you expect to receive.

There are plenty of examples from talented individuals, like Antirez or Simonw, and an ocean of examples from random individuals online.

I can tell you that some tasks that would take me a day to complete are done in 2h of agentic coding and 1h of code review, with the added benefit that during the 2h of agentic coding I can do something else. Is this the kind of evidence you are looking for?


Yes, the model runs in C; you just provide the model weights to the program.

The main advantage is that you don't need the Python interpreter to run it.

While not revolutionary, it is definitely not trivial, and its main purpose is to demonstrate Claude Code's abilities on a low-level, non-trivial task.


The problem with jQuery is that, being imperative, it quickly becomes complex when you need to handle more than one thing, because you have to cover all the cases imperatively.
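Not the commenter's code, but the imperative-vs-declarative point can be sketched in plain JavaScript (the "DOM" is a plain object here so the snippet runs under Node; all names are illustrative):

```javascript
// Imperative (jQuery-style): every state transition needs its own handler.
// Declarative (React-style): one render(state) derives the whole view.

const view = { text: "", cls: "" };

// Imperative: each event must update every affected piece by hand.
function onErrorImperative(msg) {
  view.text = msg;     // forget one of these updates and the UI goes stale
  view.cls = "error";
}
function onClearImperative() {
  view.text = "";
  view.cls = "";       // easy to miss once the cases multiply
}

// Declarative: state is the single source of truth.
let state = { error: null };
function render(s) {
  view.text = s.error ?? "";
  view.cls = s.error ? "error" : "";
}
function setState(patch) {
  state = { ...state, ...patch };
  render(state);       // the view can never drift from the state
}

setState({ error: "Oops" });
console.log(view.cls);   // "error"
setState({ error: null });
console.log(view.text);  // ""
```

The imperative handlers scale linearly with the number of transitions; the declarative version scales with the number of state fields, which is the advantage the thread is pointing at.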


Yeah, that's the other HN koan about "You probably don't need React if..." But if you are using jquery/vanilla to shove state into your HTML, you probably actually do need something like react.


It's not really about state but dom updates.


After having some time to think about it, I've seen some really perverse DOM stuff in jquery. Like $(el).next().children(3) type stuff. So I think this stuff really fell-over when there was 'too much state' for the DOM.


I think if you want to go high-dom manipulation a la jQuery, and want some form of complex state, storing the state _on_ the DOM might make sense? Things like data attributes and such, but I also feel like that’s itching for something more like htmx or maybe svelte (I’ve not looked into either enough, so I may be completely off base).

I do agree with the notion that jQuery is easy to mishandle when logic grows beyond a pretty narrow (mostly stateless) scope. It’s fantastic up until that point, and incredibly easy to footgun beyond it.


Yeah, that's the thing: it might make sense in some simple one-dimensional case, but beyond that it turns into spaghetti code (or a homebrew 'framework'). The big thing is that if you want to re-jigger some of the DOM, React is actually a lot nicer than jQuery.


I’ll die on the “give me Vue over react any day” hill in that case, admittedly because I think React’s template/code mix is atrocious. I also _feel_ like React suffers from “why not do everything” syndrome, and that’s from a very naive perspective so grain or mountain of salt

Yeah, I'm not hyping React specifically, just pointing out that vanilla/jQuery or even htmx only works well when the state is not that complicated.

Part of me feels the same way, and ~2015 me was full on SPA believer, but nowadays I sigh a little sigh of relief when I land on a site with the aesthetic markers of PHP and jQuery and not whatever Facebook Marketplace is made out of. Not saying I’d personally want to code in either of them, but I appreciate that they work (or fail) predictably, and usually don’t grind my browser tab to a halt. Maybe it’s because sites that used jQuery and survived, survived because they didn’t exceed a very low threshold of complexity.


Facebook is PHP ironically.


It was once upon a time, hence them moving to HHVM to interpret it, but it’s been all but replaced with a PHP spinoff named Hacklang now.


I think in 2026 Facebook is a conglomeration of a bunch of things... Definitely not just PHP anymore.

