Hacker News | jbarrow's comments

Watsi is incredibly inspiring!

I’ve been a monthly donor since ~the beginning when I was just an undergraduate, and I still read the stories and emails I receive. I’m glad that you opted for the steady growth path, and that you’ve made it a sustainable thing.


Your support every month in our Universal Fund means the world to us! This consistency and reliability helps us plan ahead, show up faster for patients in need, and grow to reach new hospitals and communities.

That's incredible. Monthly donors are Watsi's lifeblood - it is so impactful to be able to bet on receiving a certain amount each month - thank you!

The whole thing feels AI written, generated from the codebase.*

*this is incorrect per the author’s response, my apologies.

For instance, it goes into (nano)vLLM internals and doesn’t mention PagedAttention once (one of the core ideas that vLLM is based on)[1].

Also mentions that Part 2 will cover dense vs MoE’s, which is weird because nanovllm hardcodes a dense Qwen3 into the source.

Here are better (imo) explainers about how vLLM works:

- https://hamzaelshafie.bearblog.dev/paged-attention-from-firs...

- https://www.aleksagordic.com/blog/vllm

- https://huggingface.co/blog/continuous_batching

Aleksa’s blog is a bit in the weeds for my taste but it’s really worth working through.

A lot of the magic of vLLM happens in the PagedAttention kernels, which are really succinctly implemented in nanovllm. And the codebase is great and readable by itself!

1. https://arxiv.org/abs/2309.06180


Hi jbarrow, thanks for your feedback and the links you shared—they're great readings for me (and likely others too).

That said, I need to clarify: the content was not written by AI, and certainly not generated from the codebase in one shot. If there's some agent + prompt that can produce what I wrote, I'd love to learn it—it would've saved me two weekends :)

Before addressing your questions further, some context: I'm a developer with no ML background but plenty of Cloud Infra experience. I'm currently building an open-source AI Infra project, which is why I studied nano-vllm. So my writing reflects some gaps in ML knowledge.

To your specific points:

> it goes into (nano)vLLM internals and doesn't mention PagedAttention once

I didn't find any explicit "paged attention" naming in nano-vllm. After reading the first article you linked—specifically the "Paged KV Caching" section—I believe the block management logic and CPU/GPU block mapping it describes are exactly what I covered in both posts. It may not be the full picture of paged attention, but I interpreted what I saw in the code and captured the core idea. I think that's a reasonable outcome.

> Part 2 will cover dense vs MoE's, which is weird because nanovllm hardcodes a dense Qwen3 into the source

This reflects my learning approach and background. Same as point 1—I may not have realized the block design was the famous PagedAttention implementation, so I didn't name it as such. For point 2, seeing a dense Qwen3 naturally made me wonder how it differs from the xx-B-A-yy-B MoE models I'd seen on Hugging Face—specifically what changes in the decoder layers. That curiosity led me to learn about MoE and write it up for others with the same questions.

---

I completely understand that in this era, people care more about whether what they're reading is AI-generated—no one wants to waste time on low-effort slop with no human involvement.

But as I explained above—and as my hand-drawn Excalidraw diagrams show (I haven't seen an LLM produce diagrams with logic that satisfies me)—this is the result of learning shaped by my own knowledge background and preferences.


Funny, this reads even more AI written than the article itself.


It really doesn't.


One thing to keep in mind is that a lot of non-native English speakers use LLMs to translate to English, or to polish their English prose; they may not realize that it causes the translation to come out in a very LLM-style tone. Not sure if that's the case here, but it looks like OP is a native Chinese speaker so may be using tools to translate to English.


It looks like you were right about that.

https://news.ycombinator.com/item?id=46858409

But: this was never a problem and now we have to distinguish between LLM generated, human generated, LLM polished and human generated. I'd much prefer it if people just wrote their own text, warts and all.


It does, but what does that say about the state of communication in our industry? I've seen a lot of writing that reads like an AI produced it in contexts where I could be pretty sure no AI was involved. We want to sound professional, so we sanitize how we write so much that it becomes... whatever this current situation is.

No offense intended to @yz-yu, by the way. I miss the times when more people wrote in an eccentric style -- like Steve Yegge -- but that doesn't detract from what you wrote.


The comments here turned out much more interesting than I expected—this has become a great place to discuss the difference between AI-generated, AI-written, and AI-assisted content.

So let me start from @jbarrow's comment: "AI written, generated from the codebase."

My actual learning process looked like this:

1. I walked through the nano-vLLM codebase, asking Claude Code some high-level questions to warm up.

2. Then I asked detailed questions one by one, let it explore, and double-checked the code myself. As someone without an ML background, it sometimes took hours to understand a single concept.

3. Once I felt I understood enough, I started drawing Excalidraw diagrams to explain what I learned.

Does this count as "generated from the codebase"? I don't think so.

Where we might disagree is the writing process.

As a non-native English speaker, my workflow looks like this:

1. Write a short paragraph (<100 words), then ask my writing agent to "fix this for readability and grammar."

2. Review the output. *If it changes any technical meaning, I correct it.* I consider this a responsible way to write a tech blog.

3. Move to the next paragraph.

Is this "AI-written"? I'd call it "AI-assisted." Every idea in every sentence is mine. Honestly, things like "em dashes" never stood out to me when reviewing. I suspect that's common for non-native speakers.

I wrote this comment the same way. The LLM fixed 14 grammar mistakes that I think would distract readers more than any LLM-ish phrasing.

That said, I'm open to suggestions on how to improve my writing process :)


When text is (clearly) non native English I think most native readers don’t even register grammar errors.

To be honest most native readers wouldn’t register grammar errors full stop.

I guess I have more awe for people who speak a foreign language at all than for piping it through some agent malarkey.


> I wrote this comment the same way. The LLM fixed 14 grammar mistakes that I think would distract readers more than any LLM-ish phrasing.

I don't think that assumption is correct. As you can see from the discussion we're having here, the LLM "fixed" text is actually quite distracting, while text written by a reasonably proficient non-native speaker is generally perfectly readable. It's only if your English is extremely poor to non-existent that it makes more sense to use machine translation or editing rather than writing it yourself.

One problem is that people are becoming quite sensitive to slop, where people just post completely unreviewed, AI generated text. It's quite frustrating, because it's asking readers to read something that no one has ever bothered to write, and it frequently crowds out discussion that people are more interested in. So everyone is kind of hyper-sensitive to signs of AI written text right now, which means when you start to see such signs, your brain moves over to trying to interpret whether it's AI generated rather than reading the text itself.


Cool, humans hallucinate too. — AI


The em dashes really aren't helping their case.


Wait—do people here really think the em dash was nonexistent before LLMs? It’s widely used by people like me who care about writing style. The reason LLMs use it is because they reflect care and concern about writing style.


Yeah, people do seem to think that em dashes are an indicator of GenAI. I have been accused of using AI to write my posts on a forum, precisely because of em dashes. That's how I found out about that particular sniff test people use.

Hasn't made me change the way I write, though. Especially because I never actually type an em dash character myself. Back when I started using computers, we only had ASCII, so I got used to writing with double dashes. Nowadays, a lot of software is smart enough to convert a double dash into an em dash. Discourse does that and that's how I ended up being accused of being an AI bot.


Shouldn't a double dash result in an en dash and only a triple in an em dash?


No, people think humans use it a lot less often than AI, because it’s true. Especially for casual writing.

The contrast might become even greater because some humans that did use them have stopped to avoid false accusations.


Not non existent, but rare. And again the presumption was correct, the text was put through an LLM.


Nobody ever said that they were nonexistent before LLMs. When you are investigating and trying to determine if something is AI generated they are the number one indicator.

So if you're being accused of just spewing AI, then double down and spew what looks EVEN MORE like AI. What are you even doing?


Number one indicator? A single punctuation mark that's trivial to make on most keyboards (option-dash on macOS). And generally people who write software are extra fixated on punctuation for obvious reasons: missing semi-colons break your build, etc. Maybe in some other niche message board people will use dash and em dash interchangeably, but here?

Also, if a single character is how you're red-flagging LLM output, do you know how easy it is to avoid? I didn't use it here at all, but how do you know I didn't run this through some slop-machine to tighten my prose? It's a really low-effort take to say "just avoid em dashes so we know you're not an AI".

https://www.mcsweeneys.net/articles/the-em-dash-responds-to-...


Yes, number one indicator. Yes, of course you can go through the output and take out all of the em-dashes. Then the number one indicator will obviously not work.


My guess is it's a translator they're using.


Actually, I thought it was a great example of clarity, focus, and economy of words that AI is not capable of at this point in time.


Not really in the PagedAttention kernels. Paged attention was integrated into FlashAttention so that FlashAttention kernels can be used both for prefill and decoding with paged KV. The only paged attention specific kernels are for copying KV blocks (device to device, device to host and host to device). At least for FA2 and FA3, vLLM maintained a fork of FA with paged attention patches.
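The core indirection behind paged KV caching is easy to picture: a per-sequence block table maps logical token positions to physical cache blocks, so sequences can grow without contiguous preallocation. A toy sketch (names are illustrative, not vLLM's actual classes):

```python
# Toy sketch of PagedAttention-style KV block management.
# Class and method names are made up for illustration, not vLLM's API.

BLOCK_SIZE = 4  # tokens per KV block (vLLM's default is larger, e.g. 16)

class BlockManager:
    def __init__(self, num_blocks):
        self.free = list(range(num_blocks))  # pool of physical block ids
        self.tables = {}                     # seq_id -> list of physical block ids

    def append_token(self, seq_id, pos):
        """Allocate a fresh physical block whenever a sequence crosses
        a block boundary; return where this token's KV entries live."""
        table = self.tables.setdefault(seq_id, [])
        if pos % BLOCK_SIZE == 0:            # first token of a new logical block
            table.append(self.free.pop())
        # the attention kernel reads KV for logical position `pos` from:
        return table[pos // BLOCK_SIZE], pos % BLOCK_SIZE  # (block id, offset)

mgr = BlockManager(num_blocks=8)
for pos in range(6):
    block, offset = mgr.append_token("seq0", pos)
```

After 6 tokens the sequence owns 2 physical blocks, and those blocks need not be adjacent in memory; the block table is the only thing that ties them together.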


Super interesting. Would you be willing to try the Python package (https://github.com/jbarrow/commonforms) or share the PDFs?

For the non-ONNX models there are some inference tricks that generally improve performance, and potentially lowering the confidence threshold could help.


Hey, Benjamin, thanks for the attribution! Happy to field any questions HN users have.

It's really gratifying to see people building on the work, and I love that it's possible to do browser-side/on-device.


Training ML models for PDF forms. You can try out what I’ve got so far with this service that automatically detects where fields should go and makes PDFs fillable: https://detect.semanticdocs.org/ Code and models are at: https://github.com/jbarrow/commonforms

That’s built on a dataset and paper I wrote called CommonForms, where I scraped CommonCrawl for hundreds of thousands of fillable form pages and used that as a training set:

https://arxiv.org/abs/2509.16506

Next step is training and releasing some DETRs, which I think will drive quality even higher. But the ultimate end goal is working on automatic form accessibility.


Congratulations on being featured in the Superhuman newsletter. Trying it out.


Woah, did not realize that, haha. Let me know if it works well!


Existing “auto-fillable” tools are pretty lackluster in my experience. CommonForms is tooling that can automatically detect form fields in PDFs and turn those PDFs into fillable documents. The dataset is ~500k form pages pulled from Common Crawl, which I trained the object detectors on. For being vision only, the results are pretty remarkable!

Releasing the dataset, paper, models, and (imo most importantly) simple/convenient tooling to automatically prepare any PDF.

Links:

- Repo: https://github.com/jbarrow/commonforms

- Paper: https://arxiv.org/abs/2509.16506


I’m personally a huge fan of Modal, and have been using their serverless scale-to-zero GPUs for a while. We’ve seen some nice cost reductions from using them, while also being able to scale WAY UP when needed. All with minimal development effort.

Interesting to see a big provider entering this space. Originally swapped to Modal because big providers weren’t offering this (e.g. AWS lambdas can’t run on GPU instances). Assuming all providers are going to start moving towards offering this?
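For a rough sense of why scale-to-zero pays off for bursty workloads, here's a back-of-the-envelope sketch with made-up prices (placeholders, not any provider's actual rates):

```python
# Back-of-the-envelope: always-on GPU instance vs. scale-to-zero billing.
# All numbers below are illustrative assumptions, not real pricing.

HOURLY_RATE = 4.00          # $/hr for the GPU, same hardware either way
ACTIVE_HOURS_PER_DAY = 1.5  # bursty workload: actual compute time per day
COLD_START_OVERHEAD = 1.10  # assume ~10% extra billed time for spin-up

always_on = HOURLY_RATE * 24 * 30                                # 24/7 for a month
serverless = HOURLY_RATE * ACTIVE_HOURS_PER_DAY * 30 * COLD_START_OVERHEAD

print(f"always-on:  ${always_on:,.2f}/mo")
print(f"serverless: ${serverless:,.2f}/mo")
print(f"savings:    {1 - serverless / always_on:.0%}")
```

The flip side is that at sustained high utilization the overhead makes serverless the more expensive option, which is why it fits bursty rather than steady workloads.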


Modal is great, they even released a deep dive into their LP solver for how they're able to get GPUs so quickly (and cheaply).

Coiled is another option worth looking at if you're a Python developer. Not nearly as fast on cold start as Modal, but similarly easy to use and great for spinning up GPU-backed VMs for bursty workloads. Everything runs in your cloud account. The built-in package sync is also pretty nice, it auto-installs CUDA drivers and Python dependencies from your local dev context.

(Disclaimer: I work with Coiled, but genuinely think it's a good option for GPU serverless-ish workflows.)


I’m also a big fan.

Modal has the fastest cold-start I’ve seen for 10GB+ models.


Thanks for sharing! They even support running HIPAA-compliant workloads, which I didn't anticipate.


Modal documentation is also very good.


If you enjoyed this essay, you should check out the author’s current project, Dynamicland[1]. It is a wonderful expression of what computing and interaction could be. Even the project website — navigating a physical shelf, and every part is hyperlinked — is joyful.

1. https://dynamicland.org/


i wish i could say this looked interesting to me but it doesnt :(


Thanks, I'll pick out something else for your birthday then.


> i wish i could say this looked interesting to me but it doesnt :(

Then, not to be snarky, why say anything?


Editing text in PDFs is _really_ hard compared to other document formats because most PDFs don't really encode the "physics" of the document. I.e. there isn't a notion of a "text block with word wrapping," it's more "glyphs inserted at location X with font Y."

If the PDF hasn't been made accessible, you have to do a lot of inferencing based on the layout about how things are grouped and how they should flow if you want to be able to make meaningful edits. Not impossible (Acrobat does it), but very challenging.

It's part of the legacy of PDF as a format for presentation and print jobs, rather than typesetting.
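You can see this in the raw page content stream: text is just positioning and glyph-painting operators with no notion of paragraphs. A toy fragment, parsed with a regex (heavily simplified; real streams are usually compressed and far messier):

```python
import re

# A toy fragment of a PDF page content stream. Text is "set position,
# set font, paint glyphs" -- nothing encodes word wrap or text blocks.
content_stream = """
BT
/F1 12 Tf
1 0 0 1 72 700 Tm (Editing) Tj
1 0 0 1 120 700 Tm (PDFs) Tj
1 0 0 1 72 680 Tm (is hard) Tj
ET
"""

# Each match is an independent "paint these glyphs at (x, y)" instruction;
# nothing in the stream says the three runs form one flowing sentence.
runs = re.findall(r"1 0 0 1 ([\d.]+) ([\d.]+) Tm \((.*?)\) Tj", content_stream)
for x, y, text in runs:
    print(f"({x}, {y}): {text!r}")
```

To edit "Editing PDFs is hard" as one sentence, a tool first has to infer that those three unrelated draw calls belong together and would reflow as a unit.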


Yes, and alongside formatting challenges, PDFs commonly only include the glyphs from the font that are actually used in the document.

So if you had a PDF with "Hello World" on it, you could feasibly change it to "Hello Hello", but wouldn't be able to change it to "Goodbye World" (as the glyphs for "G", "b", and "y" are not included in the PDF)

Sure, you could do a bit of detective work to figure out which font it was from the glyphs or something and lookup and insert new glyphs into the PDF, but I can't imagine a generic PDF editor being capable of doing this for you.

Some editors get around this by just straight up switching the font(s) for the whole PDF, so it'll look different after saving.
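A minimal sketch of the subsetting problem, modeling glyph coverage as a plain set (real PDFs go through encodings and CMaps, but the idea is the same):

```python
# Sketch: a subsetted embedded font only contains the glyphs the original
# document actually used, so an edit is only renderable if every character
# of the new text maps to an existing glyph.

embedded_glyphs = set("Hello World")  # glyphs present in the subset font

def can_render(new_text, glyphs=embedded_glyphs):
    """Return (renderable?, missing glyphs) for a proposed text edit."""
    missing = set(new_text) - glyphs
    return (not missing, missing)

ok, _ = can_render("Hello Hello")          # reuses only existing glyphs
bad, missing = can_render("Goodbye World") # needs glyphs the subset lacks
```

Here the first edit succeeds, while the second fails because "G", "b", and "y" were never embedded; an editor would have to locate and embed the full font (or a compatible substitute) to make it work.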


Editing existing text is still what a PDF editor, as it says in the title, would be expected to do. With a quick Google I found one I hadn't heard of before, and it let me edit some text and save it for free.


Ask yourself, why would someone spend money on bandwidth for me to download something for free...


PDF editor is used as a broadly encompassing term. Yes, other tools can edit existing text, but they upload your PDF to their servers, so it's not private if that's something you care about.

There isn't anything off the shelf that enables editing existing text in the browser, but it's something I'll build from scratch. So you'll be able to edit existing PDF text without compromising privacy.


This one can, if I remember correctly (I can't check right now), but it's a POC and not a finished product: https://github.com/ShizukuIchi/pdf-editor


Sejda.com does it, though its free tier is severely crippled.


Wonderful! Inserted form-fields show up in Preview and Acrobat, which is not a trivial task. I run a little AI-powered tool that automatically figures out where form fields should go (https://detect.penpusher.app) and robustly adding form fields to the PDF was the hardest part.

Fwiw, I do see the issue with being unable to scroll down across both Safari and Chrome.


Thanks! I fixed the scrolling issue

