
There are different kinds of tribal knowledge. Some is company-specific, some is role-specific or domain-specific.

The date is just a useful fiction to:

- Create urgency

- Keep scope creep under control

- Prioritize whatever is most valuable and/or can stand on its own

If you just say “I don’t know” and have no target, even if that’s more honest, the project is less likely to ever be shipped at all in any useful form.


I don’t think “ready to merge” necessarily means the agent actually merges. Just that it’s gone as far as it can automatically. It’s up to you whether to review at that point or merge, depending on the project and the stakes.

If there are CI failures or obvious issues that another AI can identify, why not have the agent keep going until those are resolved? This tool just makes that process more token efficient. Seems pretty useful to me.


That's EXACTLY right. Ready to merge is an important gate, but it is very stupid to just merge everything without further checks/testing by a human!


This tool seems agent-oriented, built for agents to merge rather than merely check readiness. In that regard, the page doesn't mention anything about human reviewers, only AI reviewers. Honestly, I wouldn't be surprised if the author, someone seemingly running fully agentic workflows, didn't even consider human reviewers. If it's AI start-to-end*, then yes, it could quite possibly push directly to master without much difference.

Call me pessimistic, but considering [1][2][3] (and other similar articles/discussions), I believe this tool will be most useful to AI PR spammers the moment it is modified to also parse non-AI PR comments.

*Random question: is it start-to-end or end-to-end?

edit: P.S. I do agree that it's a useful tool, given its design goals.

[1]: https://old.reddit.com/r/opensource/comments/1q3f89b/

[2]: https://devansh.bearblog.dev/ai-slop/

[3]: https://etn.se/index.php/nyheter/72808-curl-removes-bug-boun... (currently trending first page)


Humans make subtle errors all the time too though. AI results still need to be checked over for anything important, but it's on a vector toward being much more reliable than a human for any kind of repetitive task.

Currently, if you ask an LLM to do something small and self-contained like solve leetcode problems or implement specific algorithms, it will have a much lower rate of mistakes, in terms of implementing the actual code, than an experienced human engineer. The things it does badly are more about architecture, organization, style, and taste.


But with a software bug, the error rapidly becomes widespread and systematic, whereas human errors often are not. Getting a couple of prescriptions wrong because the doc worked a 12+ hour shift is different from systematically getting a significant number of prescriptions wrong until someone double-checks the results.


An error in a massive hand-crafted Excel sheet also becomes systematic and widespread.

Because Excel has no way of doing unit tests or any kind of significant validation. Big BIG things have gone to shit because of Excel.

Things that would have never happened if the same thing was a vibe-coded python script and a CSV.


I agree about Excel. I don't agree that it can't happen with vibe-coded Python.

I think handling sensitive data should be done by professionals. A lawyer handles contracts, a doctor handles health issues, and a programmer handles data manipulation through programs. This doesn't remove the risk of errors completely, but it reduces it significantly.

In my home, it's me who's impacted if I screw up a fix in my plumbing, but I won't try to do it at work or in my child's school.

I don't care if my doctor vibe codes an app to manipulate their holidays pictures, I care if they do it to manipulate my health or personal data.


Of course issues CAN happen with Python, but at least with Python we have tools to check for the issues.
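To make the "tools to check" point concrete, here's a toy sketch (the function and column names are made up for illustration, not from any real system):

```python
import csv
import io

def total_amount(csv_text: str) -> float:
    """Sum the 'amount' column of a CSV, failing loudly on malformed rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(float(row["amount"]) for row in reader)

def test_total_amount():
    # The kind of check a spreadsheet can't express: a regression test
    # that fails if the column is renamed or rows are silently dropped.
    data = "name,amount\nalice,10.50\nbob,4.25\n"
    assert total_amount(data) == 14.75
```

Even a vibe-coded script can carry a test like this in CI; the equivalent Excel formula has no comparable guard.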

A bunch of your personal data is most likely going through some Excel sheet made by a now-retired office worker somewhere 15 years ago. Nobody understands how the sheet works, but it works, so they keep using it :) A replacement system (a massive SaaS application) has been "coming soon" for 8 years and has cost millions, but it still doesn't work as well as the Excel sheet.


I think it’s just realpolitik grand chessboard strategy. Knocking out an unfriendly/uncooperative leader of a strategically important country. That’s always been the real justification for US foreign policy. It’s a game of risk, without moral considerations beyond optics. There isn’t much more to it than that.

You can be socialist if you cooperate. You can be a dictator if you cooperate. It’s not about political philosophy or forms of government, just playing ball with the hegemon.


Oh man, I relate so hard on the sports conversations.


Did you see that ludicrous display last night?


What was Wenger thinking sending Walcott on that early?


There are dozens of us!


I definitely see your point. I'd just say though that it can put a lot of pressure on the romantic relationship. Some can handle it; others might not. And also it makes it much more difficult to recover if things don't work out.

Social life is a bit like SEO. To get the full benefits, you needed to start on it years ago. Trying to do it just-in-time is generally a very frustrating experience. I think there's wisdom in doing casual cultivation when you don't feel you need it. It's like keeping your skills/résumé up-to-date just in case.


Going further, you don't even need to count your reps or track how much weight you're lifting. Literally just do any exercise with any weight per muscle group to near failure for 2-5 sets. Rest the muscle groups you targeted the next 1-3 days, and be consistent every week. Bodyweight, free weights, machines, bands, kettlebells, etc. are all fine. That gets you 80-90% of the benefit with no stress.


That’s because gods are a mythical/supernatural invention. No technology can ever really be omniscient or omnipotent. It will always have limitations.

In reality, even an ASI won’t know your intent unless you communicate it clearly and unambiguously.


The communication I get from customers is seldom clear and never unambiguous, but I've managed since the '90s.


Right, but you have to do a lot of work, and really most of your work is in this area. Less on the actual building stuff.

Figuring out what to build is 80% of the work, building it is maybe 20%. The 20% has never been the bottleneck. We make a lot of software, and most of it is not optimal and requires years if not decades of tweaking to meet the true requirements.


> In reality, even an ASI won’t know your intent unless you communicate it clearly and unambiguously.

I recently came to this realization as well, and it now seems so obvious. I feel dumb for not realizing it sooner. Is there any good writing or podcast on this topic?


Thanks for the comment.

- On precision vs. noise: yeah, this is a core challenge. Quick answer is the scanner tries to be conservative and lean towards not flagging borderline issues. There's a custom guidance field in the config that lets users adjust sensitivity and severity based on domain/preferences.

- CI times: on a medium-sized PR (say 10k lines) in a fairly large codebase (say a few hundred K lines), it will generally run in 5-15 minutes, and run in parallel with other CI actions. In our case, we have other actions that already take this long, so it doesn't increase total CI time at all.

- Vulnerability types: the post goes into this a bit, but I would look at scanning and red teaming as working together for defense in depth. RAG and tool misuse vulnerabilities are definitely things the scanner can catch. Red teaming is better for vulnerabilities that might not be visible in the code and/or require complex setup state or back and forth to successfully attack.

