Quote:
Originally Posted by oska
Garry, I said fancy pattern matching. Obviously that's not simple grep or even google's search thingy, that's just silly.
In its barest engineering essence without using any hype, buzzwords, overly technical terms or overstating anything, in as few words as possible what is it technically then?
What are "language models" and "deep learning algorithms" if not a bunch of weighted links on how phrases relate and fancy pattern matchers?
Thanks Steve
Look, I know where you are coming from, and I'm following up here not
to point-score, but rather to have a stab at explaining why I would
not refer to ChatGPT as even the fanciest of pattern matchers.
My hope is that any interested reader who takes the time to read this
overly long response may come away with some appreciation that the way
ChatGPT operates is significantly different in many key ways from the
convolutional neural networks presented in the video.
And for the record, all the following text is mine, as tempting as it may
be to just get ChatGPT to do it. Thus any factual errors or terrible
grammar are purely mine.
I note that the video was made in 2016, which at the current pace is very
old. It predates a seminal paper that came out of Google in 2017, a game
changer that made large language models like GPT-2/GPT-3 and ChatGPT
practically possible.
One feature of something that is purely a pattern matcher is that if you
gave it a specific input, it would produce an identical result each time.
The convolutional neural network in the video, processing images, is
precisely that type of system.
If you present identical text to ChatGPT in new chat sessions, there is no
guarantee that the output will be identical.
In fact, if you were to build an exact clone of ChatGPT's hardware and
gave both systems the same input conversation, there is no guarantee that
the outputs would be the same.
I will touch upon why that is a little later.
You mention the word "phrases" here and in your first post, where you
offered an explanation that, quoting, "these phrases are associated with
these other phrases. The filtered phrases are passed on to another pattern
matcher that makes reasonable sentences. Done.".
That is incorrect, because during real-time operation ChatGPT remarkably
doesn't deal with phrases at all.
To understand why, it is useful to appreciate the size of the problem if
it did try to deal directly with phrases.
If you go back to the 2016 video link you posted, the presenter gives
a hypothetical example of a neural network that might be used to
estimate the price of a house given a set of input parameters it derives
from an image: number of windows, width and height of the building and so
on (2m33s in the video). In actual fact, he goes on to say that's not how
his image processing system works, but he does propose to the viewer that
it might be some "smart way of doing it".
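Just to make that hypothetical concrete, a toy version of the kind of
estimator he describes might look like the sketch below. The features,
weights and bias are entirely made up for illustration; a real network
would learn them from data and have vastly more of them.

    # Toy "house price" estimator of the sort described in the video.
    # The features, weights and bias are invented purely for illustration.
    def estimate_price(windows, width_m, height_m):
        # A single linear "neuron": a weighted sum of the inputs plus a bias.
        weights = {"windows": 5000.0, "width": 12000.0, "height": 8000.0}
        bias = 50000.0
        return (weights["windows"] * windows
                + weights["width"] * width_m
                + weights["height"] * height_m
                + bias)

    print(estimate_price(windows=8, width_m=10, height_m=6))  # 258000.0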
Now consider the problem of natural language processing. If one were
to use the phrase approach, there is an impractically enormous number of
inputs to deal with, because there are essentially an infinite number of
possible input phrases.
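A quick back-of-the-envelope figure (the numbers are only there for
scale): with a vocabulary of, say, 50,000 words there are 50,000 to the
power of 10 possible ten-word sequences, around 10 to the 47. No lookup
table of phrases is ever going to cover that.

    vocab_size = 50_000        # a rough English vocabulary, just for scale
    phrase_length = 10         # a modest ten-word phrase
    print(vocab_size ** phrase_length)   # ~9.8e46 possible phrases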
So what does ChatGPT do? Instead of dealing with phrases, it deals with
one word at a time (strictly, sub-word "tokens", but "words" will do here)
and uses probabilities as to which word is most likely to come next.
So far, so good. But if you build a system that relies only on which word
is statistically most likely to come next, it tends to quickly drift off
topic onto some confusing tangent.
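A cartoon of that "always take the most likely next word" idea is below.
The little probability table is invented; a real model learns its
probabilities from mountains of text, but the failure mode is similar.

    # A cartoon of "always pick the statistically most likely next word".
    # The probability table is invented; real models learn it from data.
    next_word_probs = {
        "penguins": {"are": 0.5, "live": 0.3, "eat": 0.2},
        "are":      {"birds": 0.6, "cold": 0.4},
        "birds":    {"that": 0.7, "are": 0.3},
        "that":     {"are": 0.5, "fly": 0.5},
    }

    word = "penguins"
    sentence = [word]
    for _ in range(5):
        candidates = next_word_probs.get(word)
        if not candidates:
            break
        word = max(candidates, key=candidates.get)  # greedy: take the single most likely word
        sentence.append(word)

    print(" ".join(sentence))  # prints "penguins are birds that are birds", it wanders in circles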
So you then think, "Okay, for it to stay on topic, each time I ask a new
question in a dialogue on a subject, I will simply re-parse the entire
conversation up to and including the new question, and so it will be more
likely to stay on topic".
The problem with that approach is that with each new input from the user
and each new output from ChatGPT, if it were to naively push the entire
conversation back through the neural network every time, the amount of
work rapidly blows out. There is just too much data.
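To get a feel for how quickly it blows out (the 200 words per turn is an
invented figure): if every turn means re-reading everything said so far,
the total work grows with the square of the number of turns.

    # Rough cost of re-reading the whole conversation on every turn.
    # Assume each turn adds ~200 words (an invented figure for illustration).
    words_per_turn = 200
    total_processed = 0
    for turn in range(1, 101):
        conversation_length = words_per_turn * turn   # everything said so far
        total_processed += conversation_length        # re-read all of it this turn
    print(total_processed)  # 1,010,000 words processed for a 100-turn chat of 20,000 words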
So what do you do? You keep focus on the conversation by distilling the
whole conversation up to a given point into a smaller form of data:
a "hidden state", something small enough to hold in memory, which can be
thought of as an abstract vector summarizing the information from the
conversation so far.
The hidden state is updated at each time step based on the input (i.e. the
message from the user) and the previous hidden state, allowing the model
to keep track of the context of the conversation and make informed
predictions about the next message.
Neural networks built around this idea are known as Recurrent Neural
Networks (RNNs).
Unlike the convolutional network in the video, an RNN is 'stateful'.
Things don't simply pass in and "filter through" a series of filters or
pattern matchers. The hidden memory component feeding back as one of the
inputs makes the system a state machine.
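A bare-bones sketch of that recurrence is below. The sizes and random
weights are placeholders (a real network learns its weights); the point is
just that each new input is combined with the previous hidden state.

    import numpy as np

    # Minimal recurrent step: new hidden state from the current input plus
    # the previous hidden state.
    hidden_size, input_size = 4, 3
    rng = np.random.default_rng(0)
    W_x = rng.normal(size=(hidden_size, input_size))   # input weights (random, untrained)
    W_h = rng.normal(size=(hidden_size, hidden_size))  # hidden-to-hidden weights
    b = np.zeros(hidden_size)

    def rnn_step(x, h_prev):
        # The hidden state is the "distillation" of everything seen so far.
        return np.tanh(W_x @ x + W_h @ h_prev + b)

    h = np.zeros(hidden_size)                    # empty memory at the start of a conversation
    for x in rng.normal(size=(5, input_size)):   # five dummy "word" vectors
        h = rnn_step(x, h)
    print(h)                                     # the state after "reading" five inputs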
Now RNNs have been around a long time, since the 1980s.
But a key question is: how do you decide what to retain in the hidden
state, and how do you do it? You only have limited memory and limited
processing capability.
So deciding what to keep in this hidden state vector and what to throw
away is the tough part. It is the part that, at any instant, is trying to
remember the things that are important to keep the conversation on track.
Does this sound familiar? Probably, because it feels eerily similar to what
we do as humans when having a conversation with someone or reading
a novel.
As a conversation with another person progresses, we don't record the
entire conversation in our heads but instead pay attention to the details
we are talking about. In a similar way, if we are halfway through a thick
novel and open it up again, we don't run through every word in the book up
to that point in our minds. We have some distillation of the plot and what
the characters were up to, so when we start reading again where we left
off last time, that distillation forms an input into the neural network in
our brains, combined with the new words in the book, and the text makes
sense to us.
Sure, we do pattern matching when we see the printed words on the page.
For example, it might say, "She turned on me with real fury as though I
were a child who had carelessly broken some vase she had cherished over
the years for its beauty and the memories it contained". We do the pattern
matching to pick out the space-delimited words "She", "child", "vase" and
so on, but for an abstract sentence like this, which we are unlikely to
have seen before, we don't do any fancy phrase pattern matching. Nor, for
that matter, does ChatGPT.
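That word-level "pattern matching" part really is mundane, along the
lines of:

    sentence = ("She turned on me with real fury as though I were a child "
                "who had carelessly broken some vase she had cherished")
    words = sentence.split()   # picking out the space-delimited words is the easy bit
    print(words[:6])           # ['She', 'turned', 'on', 'me', 'with', 'real']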
What we do do is retain some distillation of the novel up to that point,
so we know who "she" is and who the person telling the story is, and as we
parse the sentence we process it serially, a word at a time, maintaining
"attention" on what is important. In this case, "She" (who we know from
earlier) is furious with the person telling the story (who we also know
from earlier).
ChatGPT, like us, does the same during its conversation.
Now the really, really tricky part of this small distillation we keep in
our heads, of a conversation or of a novel up to the point where we left
off, is deciding what is important to retain and what is not.
When processing language, we look for what is important: "She", "me",
"fury". That's what we want to keep in short-term memory, at least for the
next paragraph to make sense. Who "she" is and why she is furious will
already have been distilled into our compacted memory of the book.
That this is probably the way we process language as humans is often
demonstrated when we are having a conversation, become distracted and
then say, "What were we talking about again?" Our state vector of the
conversation to date requires a refresh. Sometimes neither party can
recollect what they were talking about.
Now the tough part is knowing what to retain in that distillation
of the conversation. What to keep and what to throw away.
Over the years with RNNs there were several approaches: architectures
with names such as Long Short-Term Memory (LSTM) cells and Gated Recurrent
Units (GRUs). These add "gating" mechanisms, learned switches that decide
at each step what to write into that memory and what to forget. They are
the machinery for staying focused when processing language.
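As a sketch of the gating idea, here is a simplified GRU-style update
(one gate only, random untrained weights, made-up sizes), just to show how
"keep the old memory or overwrite it" is expressed:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    hidden_size, input_size = 4, 3
    rng = np.random.default_rng(1)
    W_z = rng.normal(size=(hidden_size, input_size + hidden_size))  # the "how much to update" gate
    W_c = rng.normal(size=(hidden_size, input_size + hidden_size))  # the candidate new memory

    def gated_step(x, h_prev):
        xh = np.concatenate([x, h_prev])
        z = sigmoid(W_z @ xh)            # per element: near 0 = keep old memory, near 1 = overwrite
        h_candidate = np.tanh(W_c @ xh)
        return (1 - z) * h_prev + z * h_candidate   # blend the old memory with the new candidate

    h = np.zeros(hidden_size)
    x = rng.normal(size=input_size)
    print(gated_step(x, h))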
Then in 2017, a year after that YouTube video, came a breakthrough:
a paper entitled "Attention Is All You Need" by Vaswani et al.
It proposed a brand new architecture called the "Transformer", built
entirely around attention. It not only does a much better job than an RNN
of staying on track during a conversation, it is computationally efficient
and parallelizable, making it much faster to train and run.
The key innovation of the transformer architecture is the use of
"self-attention mechanisms", which allow the network to weigh the
importance of different parts of the input sequence when making
predictions. The self-attention mechanism allows the network to focus on
the most relevant parts of the input when making predictions, rather than
simply processing the entire sequence in a fixed order as in traditional
RNNs or convolutional neural networks.
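A stripped-down sketch of that self-attention calculation is below:
random untrained weights, five stand-in "word" vectors, and none of the
multiple heads, layers or other machinery of the real thing, just the core
query/key/value mechanics.

    import numpy as np

    rng = np.random.default_rng(2)
    seq_len, d = 5, 8                      # five "words", each an 8-dimensional vector
    X = rng.normal(size=(seq_len, d))      # stand-in word vectors
    W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))

    Q, K, V = X @ W_q, X @ W_k, X @ W_v    # queries, keys, values
    scores = Q @ K.T / np.sqrt(d)          # how strongly each word should attend to every other word
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)   # softmax over each row
    output = weights @ V                   # each word's output is a weighted mix of the value vectors

    print(weights.round(2))                # the "attention" each word pays to the others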
So GPT-2/GPT-3/ChatGPT are examples of language models built on this
"Transformer" architecture. In fact, the Transformer did away with the
recurrent hidden state altogether: rather than squeezing the conversation
into one small vector, it attends directly over the whole conversation
(up to a context limit), and does so efficiently and in parallel.
It gets more complex than that. ChatGPT also uses random number
generators to mix things up when it picks each next word. Hence even if
you built an identical clone, it is likely to produce a differently worded
conversation.
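That randomness typically comes from sampling the next word from the
predicted probabilities rather than always taking the top one. The tiny
distribution below is invented, and "temperature" is the usual knob, but
the idea is this:

    import numpy as np

    rng = np.random.default_rng()   # no fixed seed: each run can differ, like separate chat sessions
    words = ["fish", "krill", "squid"]
    probs = np.array([0.5, 0.3, 0.2])    # invented next-word probabilities
    temperature = 0.8                    # below 1 sharpens the distribution, above 1 flattens it

    adjusted = probs ** (1.0 / temperature)
    adjusted /= adjusted.sum()
    print(rng.choice(words, p=adjusted))  # may print a different word on each run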
So ChatGPT does not pattern match on phrases. It generates text a word at
a time, with the "Transformer" attention mechanism continually deciding
which parts of the conversation matter for choosing the next word.
Mechanisms such as "attention" are key to language models and
hence the term "language model" is not a buzzword substitute
for a "fancy pattern matcher". They are two entirely different
concepts.
In fact, it is no more a really fancy pattern matcher than a computer is
a fancy typewriter.
With a typewriter, you press the Q key and a Q is printed, and so on.
Deterministic. Place your finger here, it recognizes which key is pressed,
and predictably the same letter is printed.
By comparison, on the old Enigma machine you would press Q and, depending
on the settings, some other letter would come out, say K. Its gears would
turn, you would press Q again, and some other letter might come out, say
B.
The statefulness of ChatGPT, its ability to effectively change its
internal state on the fly, plus the addition of a random number generator,
make it like an Enigma machine on steroids. Billions of times bigger.
No fancy matching of phrases at all.
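For fun, here is a toy of that statefulness: one made-up "rotor", nothing
like the real Enigma wiring, just enough to show the same key producing a
different letter each press because the internal state has moved on.

    import string

    class ToyRotorMachine:
        # A toy stateful cipher: same input, different output, because the state changes.
        alphabet = string.ascii_uppercase
        rotor = "KZBQWVXJSNMPHOAYCULRGETDIF"   # an invented fixed scramble of the alphabet

        def __init__(self):
            self.offset = 0                    # the machine's internal state

        def press(self, key):
            out = self.rotor[(self.alphabet.index(key) + self.offset) % 26]
            self.offset += 1                   # the "gears turn": the state changes after every press
            return out

    m = ToyRotorMachine()
    print(m.press("Q"), m.press("Q"), m.press("Q"))  # same key, three different letters: C U L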
Let's not beat about the bush. ChatGPT is certainly the most impressive
demonstration of software of any type I have ever seen.
Attached, over two files, is a conversation I had with ChatGPT a short
while ago which demonstrates its "attention" mechanism by way of the
"Transformer" architecture.
I only mention the word "penguin" once, at the start of the conversation.
Despite the fact that I do not use the word "penguin" again, but instead
ask questions such as "What do they eat?" and "How do they withstand the
cold?", notice how ChatGPT understands we are talking about penguins and
not, say, the "someone" who suggested a trip to the zoo.