every now and then I’m left to wonder what all these drivebys think we do/know/practice (and, I suppose, whether they consider it at all?)
not enough to try to find out (in lieu of other datapoints). but the thought occasionally haunts me.
do read up a little on how large language models work before coming here to mansplain, would you kindly?
LLMs are what happens when you point a neural network at a lot of text training data.
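Roughly, "point a neural network at a lot of text" means chopping the text into (context, next token) pairs and training the network to predict that next token. A toy sketch of just the data part (illustrative text, no real library):

```python
# Hand-wavy sketch: the raw text becomes (everything so far -> next token)
# training examples; the network is then trained to predict the right-hand side.
text = "the cat sat on the mat because the cat was tired"
tokens = text.split()

# Each training example: all the tokens so far -> the token that follows.
examples = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

for context, target in examples[:3]:
    print(context, "->", target)
# ['the'] -> cat
# ['the', 'cat'] -> sat
# ['the', 'cat', 'sat'] -> on
```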
You can argue that it’s technically accurate to call RNNs statistical pattern generators, but that misses a lot of what makes them interesting.
“Statistical pattern generator” sort of implies that they’re similar to Markov generators, which only condition on the last few tokens. That’s certainly not true, since RNNs can easily maintain state across the whole sequence.
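To make the contrast concrete, here’s a toy sketch (made-up bigram table and random weights, not any real model): the Markov generator only ever looks at the previous token, while the RNN step carries a hidden state vector forward.

```python
import numpy as np

def markov_step(prev_token, table):
    """A bigram Markov generator: the next token depends ONLY on prev_token."""
    nxt = table[prev_token]
    return np.random.choice(list(nxt), p=list(nxt.values()))

def rnn_step(x, h, W_x, W_h, b):
    """One RNN step: the new hidden state depends on the input AND on h,
    which summarises everything seen so far."""
    return np.tanh(W_x @ x + W_h @ h + b)

# The Markov generator forgets everything except the last token...
table = {"the": {"cat": 0.5, "dog": 0.5},
         "cat": {"sat": 1.0},
         "dog": {"sat": 1.0},
         "sat": {"the": 1.0}}
print(markov_step("the", table))

# ...while the RNN's state h accumulates information across the sequence.
rng = np.random.default_rng(0)
W_x, W_h, b = rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), np.zeros(4)
h = np.zeros(4)
for x in rng.normal(size=(5, 3)):   # five input vectors
    h = rnn_step(x, h, W_x, W_h, b) # h carries state from step to step
print(h)
```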
It’s much more accurate to say that RNNs are sets of extremely sophisticated non-linear functions, and that the training algorithms (usually gradient descent) are procedures that minimize (or maximize) an objective function.
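As a toy illustration of that view (a single-unit RNN, a made-up target sequence, and numerical gradients instead of backprop, so nothing here is a real training setup): the recurrence is just a composed non-linear function, and training is plain gradient descent pushing a loss downhill.

```python
import numpy as np

def rnn_forward(params, xs):
    """Run the sequence through tanh(w_x*x + w_h*h + b); return all outputs."""
    w_x, w_h, b = params
    h, outs = 0.0, []
    for x in xs:
        h = np.tanh(w_x * x + w_h * h + b)  # the non-linear recurrence
        outs.append(h)
    return np.array(outs)

def loss(params, xs, targets):
    """Mean squared error between the RNN's outputs and the targets."""
    return np.mean((rnn_forward(params, xs) - targets) ** 2)

# Toy task: map an alternating +0.5 / -0.5 input sequence to its negation.
xs      = np.array([0.5, -0.5, 0.5, -0.5, 0.5, -0.5])
targets = -xs

params = np.array([0.1, 0.1, 0.0])  # w_x, w_h, b
lr, eps = 0.3, 1e-5
print("loss before:", loss(params, xs, targets))

for step in range(1000):
    # Numerical gradient of the loss w.r.t. each parameter.
    grad = np.zeros_like(params)
    for i in range(len(params)):
        bump = np.zeros_like(params)
        bump[i] = eps
        grad[i] = (loss(params + bump, xs, targets)
                   - loss(params - bump, xs, targets)) / (2 * eps)
    params -= lr * grad  # gradient descent = minimising the loss

print("loss after: ", loss(params, xs, targets))  # should be far smaller
```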