This isn't true. It's been explained numerous times on HN how mistaken this view is.
Language models do not work like this. They can reproduce content verbatim, but usually only text that is repeated many times in the training data, such as the GPL license text.
Generally they work token by token (a token is a word or a piece of a word, not a single character), assigning a probability to each possible next token and then picking or sampling from the likeliest ones.
This very rarely results in verbatim copying, and almost never of text that is rare in the training data.
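
To make the "predict the next piece" idea concrete, here is a toy sketch. It is not how a real language model is built: real models use a neural network over subword tokens, while this one just counts which whitespace-separated word follows which. The corpus and the function names (train_bigram, next_token_probs, generate) are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def next_token_probs(counts, prev):
    """Turn the raw follow-counts for `prev` into a probability distribution."""
    total = sum(counts[prev].values())
    return {tok: n / total for tok, n in counts[prev].items()}

def generate(counts, start, max_tokens=10):
    """Greedy decoding: repeatedly append the most likely next token."""
    out = [start]
    for _ in range(max_tokens):
        followers = counts.get(out[-1])
        if not followers:
            break  # nothing ever followed this token in the corpus
        out.append(followers.most_common(1)[0][0])
    return out

# Hypothetical toy corpus, split on whitespace instead of real subword tokens.
corpus = "the cat sat on the mat . the cat ate the fish .".split()
model = train_bigram(corpus)
probs = next_token_probs(model, "the")
print(probs)
print(generate(model, "the", max_tokens=1))
```

Even in this toy, a continuation that shows up often ("the cat") dominates the distribution, while one that shows up once ("the fish") gets a small share. That is the intuition behind why heavily repeated text is what tends to get reproduced and one-off text mostly does not.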