Hi Kal,
You might find these threads of interest - I provide a bit of background
on the underlying technology in them :-
https://www.iceinspace.com.au/forum/...d.php?t=192039
https://www.iceinspace.com.au/forum/...d.php?t=193418
https://www.iceinspace.com.au/forum/...d.php?t=197567
At their heart, systems such as GPT-3 and LaMDA are language models.
They are trained on a rich diet of text, and what they say at any one
moment is simply the most likely thing to say given the training set
and the running context of what has most recently been said.
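To make that concrete, here is a deliberately tiny sketch in Python of
the same idea - a toy bigram model that just counts which word tends to
follow which. The training text is made up for illustration; GPT-3 and
LaMDA learn vastly richer patterns, but the "predict the most likely
next word" idea is the same :-

from collections import Counter, defaultdict

# A toy "training set" - a real system would see billions of words.
training_text = (
    "the telescope points at the sky the sky is dark "
    "the telescope is on the mount the mount tracks the sky"
).split()

# Count which word follows which (a bigram model - vastly simpler than
# GPT-3 or LaMDA, but the same basic idea of predicting what comes next).
follows = defaultdict(Counter)
for current_word, next_word in zip(training_text, training_text[1:]):
    follows[current_word][next_word] += 1

def most_likely_next(word):
    """Return the most probable next word seen in training, or None."""
    counts = follows.get(word)
    return counts.most_common(1)[0][0] if counts else None

# Generate a short continuation, always taking the most likely next word
# given the word just produced.
word = "the"
output = [word]
for _ in range(5):
    word = most_likely_next(word)
    if word is None:
        break
    output.append(word)

print(" ".join(output))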
Neural networks are employed to perform this prediction task.
Despite the words "neural network" sounding cool, there is nothing
organic, wet, squidgy or brain-like about them. They are completely
mathematically defined, and one way to think of them is as machinery
for finding the local minima - the valleys - in the multidimensional
error landscape defined by their training set.
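Here is a minimal sketch of that valley-finding idea - plain gradient
descent on an invented two-variable landscape. A real network does the
same kind of downhill walk, but over billions of weights rather than
two :-

import numpy as np

# A toy "landscape" with more than one valley. In a real network the
# landscape is the training error as a function of billions of weights;
# here there are just two, so we can picture it.
def loss(w):
    x, y = w
    return np.sin(3 * x) + 0.1 * x**2 + np.sin(3 * y) + 0.1 * y**2

def gradient(w):
    x, y = w
    return np.array([3 * np.cos(3 * x) + 0.2 * x,
                     3 * np.cos(3 * y) + 0.2 * y])

# Plain gradient descent: keep stepping downhill until we settle in a valley.
w = np.array([2.0, -1.0])      # arbitrary starting point
learning_rate = 0.01
for step in range(2000):
    w = w - learning_rate * gradient(w)

print("settled at", w, "with loss", loss(w))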
Neural networks have been one of those on-again, off-again technologies
in computer science for decades.
As it transpires, they don't work so well unless they support a very
large number of parameters (currently on the order of hundreds of
billions) and the training set is large.
Advances in semiconductors helped give rise to the former and the advent
of the Internet the latter. Rich diets include material such as all of
Wikipedia and more.
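If "parameters" sounds abstract, this little sketch counts them for a
small stack of fully connected layers - every connection weight and
every bias is one number the training run has to learn. The layer sizes
are invented for illustration, and real language models use a different
(transformer) architecture, but it gives a feel for how quickly the
numbers grow :-

# Rough parameter count for a stack of fully connected layers. The layer
# sizes below are made up for illustration; real language models use far
# wider layers, many more of them, and a transformer architecture.
layer_sizes = [10_000, 4_096, 4_096, 4_096, 10_000]

total = 0
for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
    weights = n_in * n_out   # one weight per connection
    biases = n_out           # one bias per output unit
    total += weights + biases

print(f"{total:,} parameters")   # ~115 million - still tiny next to GPT-3's 175 billion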
So in the last few years, they have started to spread their wings.
They've become pretty impressive, and what engineers refer to in
Information Theory as "perplexity" keeps coming down. Perplexity
is a measure of how well a probability model predicts a sample;
the smaller the perplexity, the better. A key goal is to minimize the
probability these systems say something silly or lacking in what we
regard as commonsense. So for example, if you say, "I just painted
the house blue, what color do you think my house is?" and the response
is, "Most houses in Greece are white", that is an example of poor perplexity.
One controversial point about these large experimental systems is the
cost of training them. In energy alone, a training run can cost many
millions of dollars. Not a problem for a company with deep pockets like
Google, but a pretty expensive education all the same. There is a concern
in the professional community that as larger and larger systems are built,
ones with even more parameters, the training costs in energy alone become
harder to justify unless additional breakthroughs are made.
A bit more about LaMDA here :-
https://blog.google/technology/ai/lamda/