Patrick Reany
24 August 2023
In the arguments that I have witnessed concerning whether or not LLMs (Large Language Models)
can understand, I do not often hear the participants providing clear and nuanced definitions of the
key terms, 'understanding' being the primary one. In this essay I intend to provide my own definitions
and then answer in the affirmative that LLMs such as ChatGPT and BingChat do understand.
The first thing is that ChatGPT and BingChat -- the two LLMs that I have had frequent interaction with
-- do apparently understand the statements and questions I have posed to them. This is not to say that
they have always correctly understood on my first effort to start a new discussion with them, but they
are capable of grasping the meaning of my discourse with them by their knowledge of grammar and
their training on the meaning of nouns, verbs, adjectives, etc. I should point out that humans are also
prone to misunderstand me. In fact, humans are more likely to misunderstand me because, compared
to LLMs, they are far more biased, emotional, impatient, and closed minded, and usually lack critical
thinking skills.
What do I mean objectively by 'grasping the meaning of my discourse'? I mean that they are able to
provide reasonable responses to it. Let's not underappreciate how significant this is on the part
of LLMs. With most interactions I have had with computer programs (with the exception, to some degree,
of WolframAlpha -- which is still quite primitive compared to ChatGPT, though), if I did not input at the
keyboard precisely in the manner that the program expects the user input to be formatted, then it
replies only with an error message. However, my own experience with both BingChat and ChatGPT
has been that if I construct a statement and/or a question that is understandable to the average
human who knows the technical terms I use, then both of these bots are able to reply with reasonable
responses, even to the point of interpreting my misspelled words correctly. This is already an
amazing accomplishment, and proves to me that these two LLMs do indeed understand what I have
communicated to them, at least at some level.
The main problem I see at this point in the discussion of whether LLMs can understand or not is that
the very notion of 'to understand' is very nuanced. Further aggravating this situation is that I don't
think that we humans fully comprehend how we grasp things at a deep level. I'm going to propose two
modes by which we can 'objectively' evaluate understanding, both for humans and for non-humans,
LLMs in particular. These two modes are not mutually exclusive and are probably interoperable,
though I haven't figured out exactly how yet. In any case, they are both contestable, but they satisfy
my own intuitions on understanding. So let's do it.
There are four kinds of understanding that I recognize in this essay: intuitive, superficial, concrete,
and deep. I'll define them after this prefacing comment: The only way to make the notion of
understanding objective (and thus reasonable to use to analyze the behavior of LLMs) is to formulate
it in terms of objective notions, and of the notions of concern here, it is knowledge, or facts, that are
objective. They can be objectively tested for -- just as is routinely done in school on a test.
In other words, we can only talk objectively about that subset of understanding that can be formulated
objectively. Most human understanding resides in knowledge; and robust understanding requires the
robust acquisition of knowledge.
Intuitive understanding is based on 'knowing' without having an exact rational basis for it.
In other words, there is an intuitive aspect to it. However, this form of understanding won't be
important in this essay, at least so far as is applicable to the LLMs.
Superficial understanding is based on knowing a content domain (case model or category) and
knowing some of its major components, but lacking more than a trivial amount of knowledge of the
relationships between/among the parts/components.
Concrete understanding is based on knowledge of a case model. A case model is a knowledge-
content domain. To use the terminology of category theory, it is knowing some domain of interest (the
'category'), knowing the parts (the 'objects') within it, and knowing how the parts are related to each
other. In category theory the relationships between any two parts are represented by arrows. Now, there
are more defining properties of categories, but I will not state them because they may not always apply
to my case-models/content domains, and the formulation of my 'theory of understanding' strictly by
the rules of category theory would likely be too restrictive.
So, an example of a case model is a natural language, such as English. Once we know some nouns and
verbs and adjectives, we can begin to relate them syntactically by standard rules of the language. At
this point, everything is just knowledge. We can then formally learn to make simple subject-predicate
sentences. This again is pure knowledge, pure facts. The next step is to correlate a large subset of the
natural language and to make it correspond to objects and events in the real world. This is the semantics
of the language. (It can be argued that syntax can provide a syntactical meaning to words, but that is
separate from meaning within the natural world.)
Deep understanding starts with a concrete understanding of a case model and then demonstrates
the capacity to extrapolate 'new' knowledge or inferences from it. This 'new' knowledge, relative to the
model, is a statement that can be logically deduced from the current content of the model but that is not
already explicitly a member of the knowledge contents of the model. An example is the following. A
is less than B and B is less than C. Hence, A is less than C. (Of course, I am assuming that in this
knowledge domain, the transitive property holds generally.) If this knowledge was not already a
registered knowledge fact of the model, it could be added to it as such. An inference is similar in
nature, but less certain. An inference is a claim that is (supposedly) likely true, based on the facts
of the model, but not necessarily true.
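The deduction step described above can be sketched in a few lines of code. This is only a minimal illustration, assuming that the model stores its 'less than' facts as pairs (smaller, larger) and that the transitive property holds throughout the domain. The function name deduce_transitive and the set-of-pairs representation are assumptions made for the sake of the example, not a description of how any LLM works.

```python
from itertools import product


def deduce_transitive(facts):
    """Return the given 'less than' facts plus everything transitivity implies.

    `facts` is a set of (smaller, larger) pairs. The result is their
    transitive closure, so it contains the original facts as well.
    """
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        # Compare every pair of known facts; iterate over a snapshot so that
        # adding to `closure` inside the loop is safe.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))  # a < b and b < d, hence a < d
                changed = True
    return closure


# The registered facts of the model: A < B and B < C.
model = {("A", "B"), ("B", "C")}

# 'New' knowledge: deducible from the model but not yet registered in it.
new_facts = deduce_transitive(model) - model
print(new_facts)  # {('A', 'C')}
```

The fact ('A', 'C') was not in the model to begin with, but it follows with certainty from what was there, so it can be added as a registered fact. An inference, being only likely true, would need some measure of uncertainty that this sketch does not attempt to model.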
Now that I have presented my technical definitions of understanding, I want to be able to connect
the meaning of the word to how it's used in the vernacular. I will say that a person or LLM understands
a case model (at least in part) if they can accurately answer a 'why' question concerning the case model.
Before we decide to what degree chatbots understand, let's review the process by which humans
demonstrate their capacity to understand.
I want to illustrate with a fictional situation: Dr Adam James is a US surgeon who has just
finished treating his emergency patient, Vinod Abrahms (who is on a visit to the US from the
fictional country of East Suiriland), for a herniated muscle. Vinod has a cousin back in his home
country who has tuberculosis and is being treated with the (fictional) drug Haeteperine. However,
he has recently found out that there is a more effective (fictional) drug called Rombibutol
available to treat the disease. Vinod asked Dr James why this was so, but he didn't know. So he
called up his daughter Dr Mellisa James, who is a pathologist who has done field work overseas to
treat diseases such as tuberculosis.
Mellisa: Hi, Dad. What's up?
Adam: Hi, Mellisa. I have a patient who has a cousin in East Suiriland who's being treated
for tuberculosis with Haeteperine, rather than by the more effective Rombibutol. Can
you shed some light on this, so we can understand what's going on?
Mellisa: Sure. It's probably because Haeteperine is legal to be exported to East Suiriland
from the US, but Rombibutol is not.
Adam: What do you mean by 'probably'?
Mellisa: Well, it's definitely true that Rombibutol cannot be legally exported to East
Suiriland at this time, but there are other means to obtain it, such as by the black market,
but I'm supposing that his doctors have rejected that option. I suppose the full answer should
be, his doctors cannot obtain it legally and they refused to obtain it illegally.
Okay, what has Mellisa done to demonstrate her 'understanding'? She has used her concrete
As for LLMs being robust and smart programs that perform sophisticated 'autocomplete', as it is
often explained to us: I don't believe that for a second. Obviously, I'm not an expert in how LLMs
work, but to me, the assumption that ChatGPT can answer my complex questions with complex
-- even insightful -- replies, based on autocomplete, is ridiculous. Sure, autocomplete is a
part of the mechanism, but not all of it.
----------------------------------
Before I attempt to convince the reader that ChatGPT can demonstrate 'superficial understanding' by
employing a mere superficial amount of knowledge in some subject matter, I will first demonstrate
it for humans.
Here's my example: Say that John is a collector of model airplanes. He doesn't care about their
technical aspects; he just thinks they look cool hanging from his ceiling or displayed on his shelves.
He has a friend named Errol who recently went to a US Army kiosk to inquire what courses he should
take to prepare for flying the A-10 Warthog. The kiosk computer replied: 'That does not compute!'
Errol knows that John knows a lot about airplanes -- he even has a model of an A-10 in his room.
So he goes to John and starts a conversation with him:
Errol: Hey, John, why does the Army kiosk tell me, 'That does not compute,' when I ask it about
the A-10 Warthog training preparation? I don't understand!
John: Well, you're in luck, Errol!
Errol: Why?
John: I suppose because you went to the wrong kiosk.
Errol: Are you trying to be funny?!
John: Not really, but I'm playing this for all it's worth. You nerdy guys don't usually
come to me for answers.
Errol: Okay. I'll play your game. What kiosk should I have gone to?
John: The Air Force kiosk.
Errol: No way!
John: Look, I know only one fact about the A-10 and that will make you understand why.
Errol: Well, what is it?!
John: The A-10 is not an Army aircraft; it's a USAF aircraft.
Errol: Really?
John: Really.
Errol: Oh, now I understand. Can you tell me the difference between the US Army and
the USAF?
John: Not really. All I know is that I have models of aircraft from both military subcategories.
I point out that the knowledge needed to answer this why question does not even require one to
This terrible phrase is the perpetrator of a lot of misconceptions, and it's a false dichotomy. It's also
at the heart of the modern hatred of memorization. Tell me one thing you can rationally understand
that isn't based on knowing something first. All rational understanding is founded on factual,
objective knowledge. The more you know, the more you can understand. Understanding begins with
knowing the subject and its many parts. Then it's knowing the many relationships among the many
parts. Then it's the ability to form conclusions and/or inferences based on all that relational knowledge.
So, what do people seem to mean when they disparage 'knowing' while extolling 'understanding'?
By mere 'knowing' they mean something like the memorization of a few formulas in the hope that that
will be enough to do well on a test. What do they mean by 'understanding'? They mean
roughly what I have been extolling so far: They mean at least a robust 'concrete understanding' and,
better still, 'deep understanding' that permits deep deductions and inferences.
I will provide an example of mere knowing or the supposed result of the evil of 'rote' memorization.
Person A can paint a house in TA hours. Person B can paint the same house in TB hours. How long
will it take for the paint job to be completed if both A and B paint the house together and work
until the job is finished? There is a well-known formula for this kind of mixed-rate problem.
The answer is
T = 1/(1/TA + 1/TB).
It's so great to know this formula, you don't even have to know where it comes from to use it,
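For readers who do want to see where the formula comes from, here is a short sketch. Person A paints 1/TA of the house per hour and person B paints 1/TB of it per hour, so together they paint 1/TA + 1/TB of the house per hour. One whole house therefore takes T = 1/(1/TA + 1/TB) hours, which simplifies to TA*TB/(TA + TB). The function name combined_time and the sample numbers below are assumptions chosen purely for illustration.

```python
from fractions import Fraction


def combined_time(t_a, t_b):
    """Hours needed when A (t_a hours alone) and B (t_b hours alone) work together."""
    # Each painter's rate, in houses per hour. Fractions keep the arithmetic exact.
    rate_a = Fraction(1) / Fraction(t_a)
    rate_b = Fraction(1) / Fraction(t_b)
    # Rates add; the time for one house is the reciprocal of the combined rate.
    return 1 / (rate_a + rate_b)


# A needs 3 hours alone, B needs 6 hours alone.
print(combined_time(3, 6))  # 2
```

With TA = 3 and TB = 6, the combined rate is 1/3 + 1/6 = 1/2 house per hour, so the job takes 2 hours. That agrees with TA*TB/(TA + TB) = 18/9 = 2.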
Here are some algebra word-problem heuristics the student should memorize (know):
For some practice doing algebra word problems, go to Algebra word problems.
If you're not memorizing something every day, you're wasting precious time.
Any person or computer program that can receive a query, go to its internal representation
of the correct case model and extract a reasonable response, is demonstrating (concrete)
understanding at some level, even if all one employed was a look-up table. But both humans
and ChatGPT go far beyond just understanding natural languages well.
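To make the look-up-table case concrete, here is a minimal sketch built on John's single fact about the A-10. The table layout, the relation name 'operated by', and the function answer_why are all hypothetical, invented for this illustration. The sketch shows only the bare mechanism of receiving a query, consulting a stored case model, and extracting a reasonable response.

```python
# Hypothetical case model: a tiny look-up table mapping
# (subject, relation) to a stored fact. It holds exactly one fact.
CASE_MODEL = {
    ("A-10 Warthog", "operated by"): "USAF",
}


def answer_why(subject, asked_service):
    """Explain why asking `asked_service` about `subject` got no useful answer."""
    operator = CASE_MODEL.get((subject, "operated by"))
    if operator is None:
        # The model has no fact about this subject at all.
        return "I don't know."
    if operator != asked_service:
        return f"Because the {subject} belongs to the {operator}, not the {asked_service}."
    return f"The {subject} does belong to the {asked_service}."


print(answer_why("A-10 Warthog", "Army"))
# Because the A-10 Warthog belongs to the USAF, not the Army.
```

Like John, the table can answer this one 'why' question and nothing more: ask it about any other aircraft and it replies that it does not know.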
My essay stops here. What follows is an addendum that shows us how intelligent and
understanding ChatGPT really is by how it reasonably responds to my query.
Patrick to ChatGPT (23 August 2023):
Could you please provide for me some bullet points (with brief explanations) that characterize
critical thinking, and then explain how that these points, if used properly, are the antidotes
to biased, emotional, impatient, and closed-minded thinking.
ChatGPT:
Certainly, here are some bullet points that characterize critical thinking, along with explanations
of how they serve as antidotes to biased, emotional, impatient, and closed-minded thinking:
Characteristics of Critical Thinking:
Objective Analysis: Critical thinking involves examining information and situations objectively,
without allowing personal biases or emotions to cloud judgment.
Evidence-Based: It emphasizes using reliable evidence and data to support conclusions and decisions,
rather than relying on assumptions or opinions.
Questioning Assumptions: Critical thinkers are skilled at identifying and challenging assumptions
underlying arguments, claims, or beliefs to ensure they are well-founded.
Logical Reasoning: This involves constructing clear and valid arguments, avoiding fallacies and
inconsistencies in one's thought process.
Open-Mindedness: Critical thinkers are receptive to new ideas and perspectives, willing to reconsider
their own beliefs when presented with compelling evidence.
Problem-Solving: They approach complex issues by breaking them down into manageable parts and
evaluating possible solutions systematically.
Curiosity: A critical thinker possesses a natural curiosity to explore various facets of a topic,
seeking deeper understanding and broader context.
Effective Communication: Critical thinkers can articulate their thoughts clearly and persuasively,
allowing them to engage in constructive discussions and debates.
Antidotes to Biased Thinking:
Objective Analysis: By focusing on objective analysis, critical thinkers can recognize and mitigate
their own biases, enabling them to assess situations more fairly.
Evidence-Based: Relying on evidence instead of personal biases helps to counteract biased
interpretations and decision-making.
Questioning Assumptions: Critical thinkers scrutinize their assumptions, reducing the influence of
biases that might stem from preconceived notions.
Open-Mindedness: Being open-minded permits critical thinkers to consider alternative viewpoints,
challenging and minimizing biased perspectives.
Antidotes to Emotional Thinking:
Objective Analysis: It helps in distancing from emotionally charged situations, enabling more
rational evaluation.
Evidence-Based: Relying on evidence over emotions prevents impulsive decisions driven by feelings.
Logical Reasoning: Emotional decisions often lack logical structure; employing logical reasoning
can temper emotional impulses.
Problem-Solving: A systematic problem-solving approach reduces the impact of emotional reactions
in complex situations.
Antidotes to Impatient Thinking:
Logical Reasoning: Following a logical process encourages patience as each step is methodically
considered.
Problem-Solving: Complex issues require time and patience to dissect and address, discouraging
hasty decisions.
Curiosity: A curious mindset promotes a willingness to explore all aspects thoroughly before
forming conclusions.
Antidotes to Closed-Minded Thinking:
Questioning Assumptions: Critical thinkers constantly challenge their own assumptions, guarding
against closed-mindedness.
Open-Mindedness: Being open to different perspectives counters the tendency to dismiss new
ideas without consideration.
Effective Communication: Engaging in open and constructive discussions with others exposes one
to diverse viewpoints, reducing closed-mindedness.
In summary, critical thinking acts as an antidote to biased, emotional, impatient, and closed-minded
thinking by promoting objective analysis, evidence-based reasoning, questioning assumptions,
open-mindedness, logical reasoning, systematic problem-solving, curiosity, and effective
communication. These qualities collectively enable individuals to approach information and
situations with a balanced, rational, and open perspective, reducing the influence of cognitive
biases and emotional impulses.