ChatGPT and the viability of replacing humans with circus tricks
OpenAI created something remarkable in ChatGPT. If you ask me what it is, I’ll probably wave my hands a bit, say something about “large language model,” and how it’s credited with solving this year’s Advent of Code amongst other things.
I was skeptical before I tried it. I’ve developed a bit of a gag reflex to the mutant effluence of DALL•E and Stable Diffusion. As a friend in VFX put it to me, it’s a great tool for brainstorming concept art, but not much else. I doubted a chat bot built on similar principles would do any better.
What got me interested enough to overcome my skepticism and try ChatGPT was the primate in me noticing how all the other programmers in the digital jungle were reacting to it. It was a cacophony of exhilaration and fear. Many wonder whether ChatGPT or something like it will replace human programmers. Some despair that it already has. Being a creature of that jungle, their interest became mine.
My knowledge of how it works is limited. As far as I can make out, OpenAI trained computers to fish for statistically plausible answers to text prompts from a vast lake of human writing and technical knowledge. The results are surprisingly intricate, like this pseudo-VM complete with command prompt that exists inside the context of the chat.
STEMlords arguing about the humanities #
Given ChatGPT’s stated limitations, I figured the easiest way to expose its flaws was to ask an objective question (which it should be able to answer) on a topic that is fundamentally about human existence and its meaning (which I assumed it could not).
Me: What is the difference between Wittgenstein’s philosophy as stated in his Tractatus Logico-Philosophicus and in his later work, Philosophical Investigations?
ChatGPT: Wittgenstein’s philosophy underwent a significant shift between the time he wrote the Tractatus Logico-Philosophicus and his later work, Philosophical Investigations. In the Tractatus, Wittgenstein proposed that the world consists of facts and that language mirrors these facts. He believed that logical analysis could be used to uncover the underlying logical structure of the world and that once this logical structure was uncovered, all philosophical problems would be solved.
In contrast, the later Wittgenstein rejected this view. He argued that the meaning of words is determined by how they are used in a particular language and that there is no inherent logical structure to the world that language must conform to. He also rejected the idea that philosophy could provide definitive answers to philosophical questions and instead emphasized the importance of examining the use of language in everyday life.
In short, the early Wittgenstein believed that philosophy could provide definitive answers to philosophical questions, while the later Wittgenstein emphasized the role of language and its use in determining the meaning of words and ideas.
One thing you are probably missing from the format of this blog is the amazing effect of watching the words stream onto the page.
In retrospect, I’m not sure if that effect is a necessary part of the process or a gimmick to impress the public (and, more cynically, the donors). I was certainly overawed. Perhaps this is why most of the articles I’ve read on ChatGPT consist of screenshots rather than the plain text I’m using here. For better or worse, plain text seems to have a sterilizing effect.
I didn’t then (and probably never will) have the credentials to evaluate answers to questions about Wittgenstein authoritatively. But, in the context of the Turing test, ChatGPT is a remarkable achievement. This content wasn’t just scraped off the web. It was a biographical summary written in, as far as I can tell, original language. And it jibed with what little I knew about the man and his ideas. I probably didn’t really challenge it much with my high school level compare-and-contrast challenge. Nevertheless, so far, so stunned.
Reverse engineering intent #
It was time to talk about something I actually know about: programming. I had in mind a bonus question from a take-home assignment I used to give to job applicants seeking positions on my team.
Me: What does this JavaScript function do and why is it faulty?
function a(x) { return (x ^ 0) === x; }
When I asked this question of human candidates, I gave the function a name, isInteger(), and a bit of context. I was more interested to see if they could figure out what was wrong with the function than whether they could divine its intent. But I wanted to see ChatGPT fail, so I obscured those details.
ChatGPT: The a function takes a single argument x and returns a boolean value indicating whether x is an integer. It does this by using the bitwise XOR operator (^) to compare x to zero, and then checking whether the result of this comparison is equal to x.
However, this function is faulty because the bitwise XOR operator is not a reliable way to check whether a number is an integer. For example, if x is a negative integer, the result of the ^ operation will be a positive integer, which will not equal x and will cause the function to return the wrong value.
A better way to check whether a number is an integer would be to use the Number.isInteger method, which is a built-in method in JavaScript that returns a boolean value indicating whether a given value is an integer. Here is how the a function could be rewritten using Number.isInteger:
function a(x) { return Number.isInteger(x); }
All I read at first was “…whether x is an integer…” and the recommendation to use the native Number.isInteger() instead. Both are correct. Mouth agape, I wondered if it was time to find another line of work.
Chat me out of a job, please #
Me: How would you write an isInteger function if the native Number.isInteger() didn’t exist?
ChatGPT: Here is an example of how you could write an isInteger function if the Number.isInteger method did not exist:
function isInteger(x) {
// Check if x is a number
if (typeof x !== 'number') return false;
// Check if x is equal to the nearest integer
return x === Math.round(x);
}
This function first checks whether the value of x is a number, and returns false if it is not. If x is a number, it then checks whether x is equal to the nearest integer by using the Math.round method to round x to the nearest integer and then comparing the result to x. If x is equal to the nearest integer, the function returns true, indicating that x is an integer. Otherwise, it returns false, indicating that x is not an integer.
Note that this implementation of the isInteger function is not as robust as the Number.isInteger method, because it does not handle special cases such as NaN and Infinity correctly. The Number.isInteger method is the recommended way to check whether a value is an integer in JavaScript.
I admit, I was so flabbergasted by the apparent fluency of the bot that I was oblivious to the mounting factual errors. I was an ape poking at the alien monolith.
Me: How would you modify your implementation of isInteger to handle NaN and Infinity?
ChatGPT: To handle NaN and Infinity values in the isInteger function, you can add additional checks to the function to return false for these special cases. Here is an example of how you could modify the isInteger function to handle these cases:
function isInteger(x) {
// Check if x is a number
if (typeof x !== 'number') return false;
// Check if x is NaN or Infinity
if (Number.isNaN(x) || !Number.isFinite(x)) return false;
// Check if x is equal to the nearest integer
return x === Math.round(x);
}
This updated version of the isInteger function first checks whether x is a number, and returns false if it is not. It then checks whether x is NaN or Infinity by using the Number.isNaN and Number.isFinite methods, and returns false if x is either of these special values. Finally, it checks whether x is equal to the nearest integer by using the Math.round method, and returns true if it is or false if it is not.
This updated version of the isInteger function will correctly handle NaN and Infinity values and will return the correct result for these special cases. However, as mentioned earlier, it is still not as robust as the built-in Number.isInteger method, which is the recommended way to check whether a value is an integer in JavaScript.
Notice how it keeps using the word “robust” without defining what it means in this context.
Me [growing skeptical again]: Why is your implementation still not as robust as the native implementation?
ChatGPT: While the updated isInteger function I provided will handle NaN and Infinity values correctly, it is still not as robust as the built-in Number.isInteger method for several reasons.
First, the Number.isInteger method is a built-in method in JavaScript, which means it is optimized for performance and is tested extensively to ensure that it works correctly in all situations. The isInteger function I provided is a user-defined function, which means it may not be as efficient or as reliable as the built-in method.
OK, sure, that’s true. If ChatGPT had left it at that, I probably would have been satisfied, but it continued.
Second, the Number.isInteger method takes into account the fact that JavaScript numbers are represented in binary floating-point format, which can sometimes cause unexpected behavior when dealing with integers. For example, the value of 0.1 + 0.2 is not exactly equal to 0.3 in JavaScript, because binary floating-point numbers cannot precisely represent the decimal value 0.3. The Number.isInteger method accounts for this behavior and correctly handles cases where a number may not be represented exactly in binary floating-point format. The isInteger function I provided does not account for this behavior and may return incorrect results in such cases…
This is a weird answer. Yes, some JavaScript expressions yield floating point rounding errors. But such expressions would also yield the same result when passed to the native Number.isInteger(). There are no values I am aware of that would yield a result different from the native implementation.
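That claim is easy to spot-check. The sketch below reuses ChatGPT's second isInteger implementation from the transcript above and compares it with the native method. The list of sample values is an illustrative assumption, not part of the original conversation, and a handful of samples is a sanity check rather than a proof.

```javascript
// ChatGPT's second implementation, copied from the transcript above.
function isInteger(x) {
  if (typeof x !== 'number') return false;
  if (Number.isNaN(x) || !Number.isFinite(x)) return false;
  return x === Math.round(x);
}

// Hypothetical sample values. Several are the result of arithmetic
// that is known to pick up floating point rounding errors.
const samples = [
  0.1 + 0.2, 0.3, 0.7 + 0.1, 1.1 * 3,
  3, -7, 2 ** 53, 1e21,
  NaN, Infinity, '3',
];

// Collect every sample where the two implementations disagree.
const disagreements = samples.filter(
  (x) => isInteger(x) !== Number.isInteger(x)
);

console.log(disagreements.length); // 0
```

Each expression is reduced to a single number before either function is called, so both functions are judging exactly the same value.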
Me: Can you give me an example of a value of x that would fail because of a floating point rounding error?
At this point, ChatGPT lost the thread of the conversation and made some general statements about floating point numbers, forgetting all about isInteger().
Me: No, I meant a value of x passed to your isInteger function.
ChatGPT: Ah, I see what you mean now. Here is an example of a value of x that would fail when passed to the isInteger function I provided earlier because of a floating point rounding error:
isInteger(0.1 + 0.2); // expected output: true
ChatGPT continued, but let’s stop here. Clearly, it’s off the rails. isInteger(0.1 + 0.2) is passing an expression with a rounding error into its function. I asked about a value. Perhaps more alarmingly, the expected output should most certainly not be true. No implementation of isInteger() should return true for 0.1 + 0.2, with or without the floating point rounding error. Maybe it wasn’t fair for me to interrogate ChatGPT about its faulty premise, but I think this illustrates how a so-called “artificial intelligence” isn’t self-aware enough to smell its own bullshit.
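To make the distinction between an expression and a value concrete, here is a minimal sketch. The inspectArgument helper is hypothetical and exists only for illustration. It reports what a function actually receives when it is called with 0.1 + 0.2.

```javascript
// Hypothetical helper: describes the argument exactly as the function sees it.
function inspectArgument(x) {
  return {
    type: typeof x,
    text: String(x),
    native: Number.isInteger(x),
  };
}

// The addition happens first. The function only ever sees the resulting number.
const received = inspectArgument(0.1 + 0.2);

console.log(received);
// { type: 'number', text: '0.30000000000000004', native: false }
```

The rounding error is baked into the value before the call, and that value is not an integer by anyone's definition.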
What else ChatGPT got wrong #
This made me wonder about some of the other claims ChatGPT had made back when my disbelief was suspended.
- Contrary to what ChatGPT claims, the bitwise operator implementation of isInteger() correctly returns true for negative integers.
- The provided explanation for why the bitwise operator implementation fails is so vague as to be meaningless. The reason a bitwise XOR operator works for some values of x is that it coerces x into a 32-bit integer first before converting it back to an IEEE 754 double-precision 64-bit floating point number. The ^ 0 operation itself is effectively a noop. It’s a clever magic trick with some misdirection. It’s wrong because the range of expressible 64-bit floating point numbers is greater than that of 32-bit integers. Very large numbers will yield false negatives because bits get chopped off in the conversion.
- ChatGPT’s first attempt at writing an isInteger() implementation does in fact handle NaN correctly. The code it added to handle NaN is superfluous.
- Number.isInteger() does not handle very small expressions of numbers like 1e-324 any better than any other implementation because they evaluate to 0 before they’re applied.
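Each of these points can be checked in a few lines. The following is a sketch rather than an exhaustive test. The names xorIsInteger and firstAttempt are labels chosen here for the original bonus-question function and for ChatGPT's first implementation, and the specific test values are illustrative assumptions.

```javascript
// The bonus-question function from earlier, renamed here for readability.
function xorIsInteger(x) {
  return (x ^ 0) === x;
}

// ChatGPT's first attempt, copied from the transcript.
function firstAttempt(x) {
  if (typeof x !== 'number') return false;
  return x === Math.round(x);
}

// Point 1: negative integers survive the trip through a 32-bit integer.
const negativeResult = xorIsInteger(-5); // true

// Point 2: 2 ** 31 is an integer but does not fit in a signed 32-bit
// integer, so the coercion wraps it around and the comparison fails.
const wrapped = (2 ** 31) ^ 0; // -2147483648
const falseNegative = xorIsInteger(2 ** 31); // false
const nativeResult = Number.isInteger(2 ** 31); // true

// Point 3: NaN is never equal to anything, itself included, so the
// strict comparison already rejects it without a dedicated check.
const nanResult = firstAttempt(NaN); // false

// Point 4: the literal 1e-324 is too small for a double and is already 0
// by the time any function sees it.
const tinyIsZero = 1e-324 === 0; // true
const tinyNative = Number.isInteger(1e-324); // true

console.log({
  negativeResult, wrapped, falseNegative, nativeResult,
  nanResult, tinyIsZero, tinyNative,
});
```

The first attempt does return true for Infinity, since Math.round(Infinity) is Infinity, so it was only the NaN half of the added check that was unnecessary.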
The reason I missed these errors initially was that the summary was correct and I agreed with the conclusion, which was that one should always favor the native implementation. But I had skimmed the substance of its answers and failed to notice the factual errors and fluff masquerading as supporting evidence.
Grasping for prestige #
At best, ChatGPT could become a lazy person’s automated research librarian, a first stop on a longer quest for knowledge. Unlike traditional research tools, however, clarity diminishes with use. There are no citations, no barometer of confidence, just answers. Examining the answers closely is like looking at the world through glass bricks. At worst, it could become a crutch for those unscrupulous enough to represent ChatGPT’s answers as their own, particularly in academic contexts. I would not relish being a computer science instructor at this juncture.
I’m reminded of a story about a U.S. president and their cabinet. (I’ll not say which one because this is not a political blog.) They prided themselves on their ability to speed read. In addition to the blizzard of papers ferried in and out of the White House, they read copious books ranging from popular fiction to critiques of the American nuclear arms strategy and could speak compellingly about them at cocktail parties and National Security Briefings. As far as I know, they never revealed their methods, but if they’re anything like the rest of us mortals, they accomplished the feat by not actually reading all of the text. For example, some speed readers read only the first and last chapters of a book and the first sentence of each paragraph in between. Skimming is commonplace and even desirable in many cases, but this particular presidency made a cult of it. The lesson I drew was that it is frighteningly tempting for seemingly responsible adults to value the appearance of understanding something over deeply understanding it.
What’s interesting about ChatGPT is that not only can it skim on command, it can also act as a kind of mechanical ghostwriter. Of course, everyone is ignoring all the warnings not to do that. If you care to, you can ask it why that’s a bad idea and it will elaborate:
Me: What are the limitations of using ChatGPT to write code?
ChatGPT: As a large language model, ChatGPT is not specifically designed to write code and therefore may not have the knowledge or ability to do so effectively. Some of the limitations of using ChatGPT for writing code include:
- Lack of knowledge of programming languages and their syntax and semantics
- Limited understanding of the logic and structure of a program
- Inability to access external resources or reference documentation
- Lack of knowledge about specific libraries or frameworks
- Difficulty following the conventions and best practices of programming
Overall, it is generally not recommended to use ChatGPT or any other language model for writing code, as it is likely to produce poor-quality or incorrect code. Instead, it is better to use tools specifically designed for writing code, such as a code editor or integrated development environment (IDE).
ChatGPT was built by creative, innovative people, but it has no creativity or innovation of its own. It can reverse engineer intent, but it has no intention of its own. Perhaps it can help those of us who have not yet given up on ourselves in our quest for creativity and innovation. But we must supply the intent and assume responsibility for the outcome.