Blog posts like this are for SEO. If the text isn't long enough, Google disregards it. Google has shown a strong preference for long articles.
That's why the search results for "how to X" all start with "what is X", "why do X", "why is doing X important" for 5 paragraphs before getting to the topic of "how to X".
You are funny. Anthropic refuses to issue refunds, even when they break things.
I had an API token set via an env var on my shell, and claude code changed to read that env var. I had a $10 limit set on it, so I found out it was using the API, instead of my subscription, when it stopped working.
I filed a ticket and they refused to refund me, even though it was a breaking change with claude code.
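One way to catch this before it costs money is to check the shell for an exported API key before starting the CLI. This is a minimal sketch, not anything Anthropic ships: the helper name `warn_if_api_key_set` is made up, and `ANTHROPIC_API_KEY` is an assumption, since the comment above does not name the variable.

```shell
#!/bin/sh
# Hypothetical helper: warn when an API key is exported in the current shell.
# ANTHROPIC_API_KEY is an assumption; the comment above does not name the variable.
warn_if_api_key_set() {
    if [ -n "${ANTHROPIC_API_KEY:-}" ]; then
        echo "warning: ANTHROPIC_API_KEY is set; usage may be billed to the API" >&2
        return 1
    fi
    return 0
}
```

Assuming the CLI is invoked as `claude`, something like `warn_if_api_key_set && claude` would refuse to start while the key is present.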
I am saying there is no evidence either way: they had contrasting experiences, and one GP concluded from this that the company has no standardized policies. Maybe they do, maybe they don't; I don't think we can definitively conclude anything.
I object to your conclusion that "they have no durable principles": not sure how you get to that from two different experiences documented with a single paragraph.
This is becoming futile: this is not even about proof, but about there not even being a full account of the two cases you are basing your opinion on.
Obviously, you can derive any opinion you want out of that, but while I am used to terms like "probability" being misused like this, I've generally seen a higher standard at HN.
To each their own, though. Thank you for the discourse and have a good day.
It is possible that degradation is an unconscious emergent phenomenon that arises from financial incentives, rather than a purposeful degradation to reduce costs.
FYI the sandbox feature is not fully baked and does not seem to be high priority.
For example, for the last 3 weeks, using the sandbox on Linux has almost always littered your repo root with a bunch of write-protected trash files[0] - there are 2 PRs open to fix it, but Anthropic employees have so far entirely ignored both the issue and the PRs.
Very frustrating, since models sometimes accidentally commit those files, so you have to add a bunch of junk to your gitignore. And with claude code being closed source and distributed as a bun standalone executable it's difficult to patch the bug yourself.
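Until the bug is fixed, one stopgap is to list the write-protected files in the repo root before committing. A rough sketch, assuming GNU or BSD `find` (for `-maxdepth`); the function name is made up, and because the comment above does not name the files, it filters on permission bits only:

```shell
#!/bin/sh
# List regular files directly under a directory that the owner cannot write to.
# Which files the sandbox leaves behind is not specified above, so this only
# filters on permission bits, not on names.
list_write_protected() {
    dir="${1:-.}"
    # -perm -200 matches files with the owner-write bit set; "!" negates it.
    find "$dir" -maxdepth 1 -type f ! -perm -200
}
```

Running `list_write_protected .` from the repo root prints one path per line, which can be reviewed before being added to `.gitignore` or deleted.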
Hmm, very good point indeed. So far it’s behaved, but I also admit I wasn’t crazy about the outputs it gave me. We’ll see, Anthropic should probably think about their reputation if these issues are common enough.
Here is what I do: run a container in a folder that has my entire dev environment installed. No VMs needed.
The only access the container has is to the folders that are bind mounted from the host's filesystem. The container gets network access from a transparent proxy.
This works great for naked code, but it kinda becomes a PITA if you want to develop a containerized application. As soon as you ask your agent to start hacking on a dockerfile or some compose files you start needing a bunch of cockeyed hacks to do containers-in-containers. I found it to be much less complicated to just stuff the agent in a full fledged VM with nerdctl and let it rip.
I did this for a while. It's pretty good, but I occasionally came across dependencies that were difficult to install in containers, and other minor inconveniences.
I ended up getting a mini-PC solely dedicated to running agents in dangerous mode. It's refreshing to not have to think too much about sandboxing.
I totally agree with you. Running a cheapo mac mini with full permissions with fully tracked code and no other files of importance is so liberating. Pair that with tailscale, and being able to ssh/screen control at any time, as well as access my dev deployments remotely. :chefs kiss:
I use a new Ryzen based mini PC instead of Mac mini, but the reasoning is the same. For the amount of compute/memory it pays for itself in less than a year, and the lower latency for ssh/dev servers is nice too.
The idea is to completely sandbox the program, and allow access only to specific bind mounted folders. But we also want the frills of using GUI programs, audio, and network access. runc (https://github.com/opencontainers/runc) allows us to do exactly this.
My config sets up a container with folders bind mounted from the host. The only difficult part is setting up a transparent network proxy so that all the programs that need internet just work.
Container has a process namespace, network namespace, etc and has no access to host except through the bind mounted folders.
Network is provided via a domain socket inside a bind mounted folder. GUI programs work by passing through a Wayland socket in a folder and setting environment variables.
The setup looks like this:
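For the Wayland part, clients look for the compositor socket at `$XDG_RUNTIME_DIR/$WAYLAND_DISPLAY`, so the variables could be set roughly like this inside the container. This is a sketch, not the poster's actual script: the function name, the default mount point `/run/host-wayland`, and the default socket name are all assumptions.

```shell
#!/bin/sh
# Point Wayland clients at a socket that was bind mounted into the container.
# The default mount point and socket name are illustrative assumptions.
use_wayland_socket() {
    socket_dir="${1:-/run/host-wayland}"   # hypothetical bind mount destination
    socket_name="${2:-wayland-0}"          # common default compositor socket name
    export XDG_RUNTIME_DIR="$socket_dir"
    export WAYLAND_DISPLAY="$socket_name"
}
```

Calling `use_wayland_socket` before launching a GUI program should be enough for Wayland-native toolkits. Other programs also read `XDG_RUNTIME_DIR`, so pointing it at a shared folder is a trade-off.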
* config.json - runc config
* run.sh - runs runc and the proxy server
* rootfs/ - runc rootfs (created by exporting a docker container) `mkdir rootfs && docker export $(docker create archlinux:multilib-devel) | tar -C rootfs -xvf -`
* net/ - folder that is bind mounted into the container for networking
Inside the container (inside rootfs/root):
* net-conf.sh - transparent proxy setup
* nft.conf - transparent proxy nft config
* start.sh - run as a user account
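The bind mounts themselves live in the `mounts` array of config.json. A small sketch of generating one entry in the OCI runtime spec format follows. The function name and paths are illustrative and not taken from the config described above, and paths containing quotes or backslashes would need JSON escaping.

```shell
#!/bin/sh
# Print one entry for the "mounts" array of an OCI runtime config.json.
# Paths are passed in by the caller; nothing here is taken from the author's config.
bind_mount_entry() {
    src="$1"    # folder on the host
    dest="$2"   # path inside the container
    printf '{"destination": "%s", "type": "bind", "source": "%s", "options": ["rbind", "rw"]}\n' "$dest" "$src"
}
```

For example, `bind_mount_entry "$PWD/net" /net` prints an entry that could be pasted into the `mounts` array of a config.json generated by `runc spec`.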
I have a version of this without the GUI, but with shared mounts and user ID mapping. It uses systemd-nspawn, and it's great.
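For anyone curious what that might look like, here is a guess at the shape of such an invocation, not the actual config. It prints a systemd-nspawn command rather than running it; the function name is made up, and the two paths are placeholders supplied by the caller.

```shell
#!/bin/sh
# Print (not run) a systemd-nspawn command line for a rootfs with one shared folder.
nspawn_command() {
    rootfs="$1"   # directory holding the container's root filesystem
    shared="$2"   # host folder to share; mounted at the same path inside
    # --private-users=pick lets systemd-nspawn choose a UID/GID range for user ID mapping.
    printf 'systemd-nspawn --directory=%s --bind=%s --private-users=pick\n' "$rootfs" "$shared"
}
```

With user namespacing on, ownership of files in the shared folder may need extra handling, so treat this as a starting point only.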
In retrospect, agent permission models are unbelievably silly. Just give the poor agents their own user accounts, credentials, and branch protection, like you would for a short-term consultant.
The other reason to sandbox is to reduce damage if another NPM supply chain attack drops. User accounts should solve the problem, but they are just too coarse grained and fiddly especially when you have path hierarchies. I'd hate to have another dependency on systemd, hence runc only.
Is it? I've seen us going from obvious skeuomorphism to more and more abstract shapes, until we hit peak Windows 8 hubris where everything is a coloured square with a monochrome symbol in it. Then back to icons where shades of colour and contrast finally start meaning things again, but getting stuck in an endless balancing on the edge where icons are abstract enough to confuse but not clear enough to describe their function. We've never gotten fully back to actual skeuomorphism.
Yeah, that's what I'm trying to say! The furthest right icon of this post is peak skeuomorphism, but we've never actually gotten back to it. Someone's always gone "wait, this icon looks a bit too much like the real thing, we can't have that!". It has never been cyclical!
Your video rates the PI as 10 for support, 10 for ease of use and 7 for performance. Just the support and ease of use is enough. You're paying for a mature ecosystem where you know things work and you don't have to waste time struggling.
At the very least, they'd complain about accuracy, if not time zone, or even how we should all be on UTC (do not get one started on the difference between GMT and UTC if you value your... time)
Seeing that the former president of Harvard was caught plagiarizing and the former Stanford president resigned due to fraudulent data, the chances may be >0! Just need to lie, cheat, or commit fraud to get in!