> Oh, we know for sure that it is reposting partially disguised sentences written by a human author. What's impressive in its own right is the fact that it can find sentences that are relevant and meaningful to the context
If you gave 100 well-read, English-speaking humans the following (very commonly found) sentence and asked them to predict the next character, what would they do?
"four score and s"
Most would predict "e", then "v" etc until you get "seven". That's what a language model does.
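To make this concrete, here's a toy sketch of the idea: predict the next character by counting which character most often follows a given context in a small corpus. The corpus, context length, and function names are all made up for illustration; a real language model learns a far richer function than a lookup of counts.

```python
from collections import Counter, defaultdict

def train(corpus, context_len=4):
    """Count which characters follow each context_len-character substring."""
    counts = defaultdict(Counter)
    for i in range(len(corpus) - context_len):
        context = corpus[i:i + context_len]
        counts[context][corpus[i + context_len]] += 1
    return counts

def predict(counts, prompt, context_len=4):
    """Return the most frequent next character for the prompt's last few chars."""
    context = prompt[-context_len:]
    if context not in counts:
        return None
    return counts[context].most_common(1)[0][0]

# Tiny made-up corpus, repeated so the counts are non-trivial.
corpus = "four score and seven years ago " * 3
model = train(corpus)
print(predict(model, "four score and s"))  # prints "e"
```

Even this crude counting model predicts "e" after "four score and s", which is the intuition behind "most likely next character".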
So if you give it (or humans!) a prompt that is less common or has more variation:
"that is no" and ask for the most likely next character you will get a lot of variation from both humans and machines. The heat parameter in language models tune how random it will be. Both will produce English, because English words are more common than gibberish, but which works will be produced is randomish.
In neither case is it doing something that one would really characterise as "reposting partially disguised sentences combining the writings of several human authors".
It is creating sentences that have never been seen before.
How do you calculate the "most likely character to appear next", if not by memorizing lots and lots of existing sentences? ML is in essence a copycat that will regurgitate what it has seen before in a new context, no matter how hard you try to hide it behind the mathematics of per-character probabilities in a sequence.
Now, there is the philosophical question of whether human creators simply do the same. (Which they don't; we have other mental processes for creating ideas than predicting the next letter we are about to utter.) But that doesn't change the fact that the likelihood of each emitted word is determined by what the model has seen most often in relation to the current context, and therefore considers most "valid".
> How do you calculate the "most likely character to appear next", if not by memorizing lots and lots of existing sentences?
Well that's how languages work right? Words are the most common sequence of letters.
But that doesn't mean it's regurgitating parts of sentences it has previously seen any more than I'm regurgitating when I'm typing this.
Mechanically, it has learnt both the syntax of the language and how concepts relate. So when it starts generating, it makes sentences that are syntactically valid but also make sense in terms of concepts.
That's really different from just combining bits of sentences, and it gives rise to abilities you wouldn't expect from something that merely cuts and pastes bits of sentences. For example, few-shot learning is mostly driven by its conceptual understanding and can't be done by something with no way to relate concepts.
No. That's not how this works.