December 6, 2022

ChatGPT and the viability of replacing humans with circus tricks

OpenAI created something remarkable in ChatGPT. If you ask me what it is, I’ll probably wave my hands a bit, say something about “large language model,” and how it’s credited with solving this year’s Advent of Code amongst other things.

I was skeptical before I tried it. I’ve developed a bit of a gag reflex to the mutant effluence of DALL•E and Stable Diffusion. As a friend in VFX put it to me, it’s a great tool for brainstorming concept art, but not much else. I doubted a chat bot built on similar principles would do any better.

What got me interested enough to overcome my skepticism and try ChatGPT was the primate in me noticing how all the other programmers in the digital jungle were reacting to it. It was a cacophony of exhilaration and fear. Many wonder whether ChatGPT or something like it will replace human programmers. Some despair that it already has. Being a creature of that jungle, their interest became mine.

My knowledge of how it works is limited. As far as I can make out, OpenAI trained computers to fish for statistically plausible answers to text prompts from a vast lake of human writing and technical knowledge. The results are surprisingly intricate, like this pseudo-VM complete with command prompt that exists inside the context of the chat.

STEMlords arguing about the humanities

Given ChatGPT’s stated limitations, I figured the easiest way to expose its flaws was to ask an objective question (which it should be able to answer) on a topic that is fundamentally about human existence and its meaning (which I assumed it could not).

Me: What is the difference between Wittgenstein’s philosophy as stated in his Tractatus Logico-Philosophicus and in his later work, Philosophical Investigations?

ChatGPT: Wittgenstein’s philosophy underwent a significant shift between the time he wrote the Tractatus Logico-Philosophicus and his later work, Philosophical Investigations. In the Tractatus, Wittgenstein proposed that the world consists of facts and that language mirrors these facts. He believed that logical analysis could be used to uncover the underlying logical structure of the world and that once this logical structure was uncovered, all philosophical problems would be solved.

In contrast, the later Wittgenstein rejected this view. He argued that the meaning of words is determined by how they are used in a particular language and that there is no inherent logical structure to the world that language must conform to. He also rejected the idea that philosophy could provide definitive answers to philosophical questions and instead emphasized the importance of examining the use of language in everyday life.

In short, the early Wittgenstein believed that philosophy could provide definitive answers to philosophical questions, while the later Wittgenstein emphasized the role of language and its use in determining the meaning of words and ideas.

One thing you are probably missing from the format of this blog is the amazing effect of watching the words stream onto the page.

In retrospect, I’m not sure if that effect is a necessary part of the process or a gimmick to impress the public (and, more cynically, the donors). I was certainly overawed. Perhaps this is why most of the articles I’ve read on ChatGPT consist of screenshots rather than the plain text I’m using here. For better or worse, plain text seems to have a sterilizing effect.

I didn’t then (and probably never will) have the credentials to evaluate answers to questions about Wittgenstein authoritatively. But, in the context of the Turing test, ChatGPT is a remarkable achievement. This content wasn’t just scraped off the web. It was a biographical summary written in, as far as I can tell, original language. And it jibed with what little I knew about the man and his ideas. I probably didn’t really challenge it much with my high school level compare-and-contrast question. Nevertheless, so far, so stunned.

Reverse engineering intent

It was time to talk about something I actually know about: programming. I had in mind a bonus question from a take-home assignment I used to give to job applicants seeking positions on my team.

Me: What does this JavaScript function do and why is it faulty?

function a(x) { return (x ^ 0) === x; }

When I asked this question of human candidates, I gave the function a name, isInteger(), and a bit of context. I was more interested to see if they could figure out what was wrong with the function than whether they could divine its intent. But I wanted to see ChatGPT fail, so I obscured those details.

ChatGPT: The a function takes a single argument x and returns a boolean value indicating whether x is an integer. It does this by using the bitwise XOR operator (^) to compare x to zero, and then checking whether the result of this comparison is equal to x.

However, this function is faulty because the bitwise XOR operator is not a reliable way to check whether a number is an integer. For example, if x is a negative integer, the result of the ^ operation will be a positive integer, which will not equal x and will cause the function to return the wrong value.

A better way to check whether a number is an integer would be to use the Number.isInteger method, which is a built-in method in JavaScript that returns a boolean value indicating whether a given value is an integer. Here is how the a function could be rewritten using Number.isInteger:

function a(x) { return Number.isInteger(x); }

All I read at first was “…whether x is an integer…” and the recommendation to use the native Number.isInteger() instead. Both are correct. Mouth agape, I wondered if it was time to find another line of work.

Chat me out of a job, please

Me: How would you write an isInteger function if the native Number.isInteger() didn’t exist?

ChatGPT: Here is an example of how you could write an isInteger function if the Number.isInteger method did not exist:

function isInteger(x) {
  // Check if x is a number
  if (typeof x !== 'number') return false;

  // Check if x is equal to the nearest integer
  return x === Math.round(x);
}

This function first checks whether the value of x is a number, and returns false if it is not. If x is a number, it then checks whether x is equal to the nearest integer by using the Math.round method to round x to the nearest integer and then comparing the result to x. If x is equal to the nearest integer, the function returns true, indicating that x is an integer. Otherwise, it returns false, indicating that x is not an integer.

Note that this implementation of the isInteger function is not as robust as the Number.isInteger method, because it does not handle special cases such as NaN and Infinity correctly. The Number.isInteger method is the recommended way to check whether a value is an integer in JavaScript.

I admit, I was so flabbergasted by the apparent fluency of the bot that I was oblivious to the mounting factual errors. I was an ape poking at the alien monolith.

Me: How would you modify your implementation of isInteger to handle NaN and Infinity?

ChatGPT: To handle NaN and Infinity values in the isInteger function, you can add additional checks to the function to return false for these special cases. Here is an example of how you could modify the isInteger function to handle these cases:

function isInteger(x) {
  // Check if x is a number
  if (typeof x !== 'number') return false;

  // Check if x is NaN or Infinity
  if (Number.isNaN(x) || !Number.isFinite(x)) return false;

  // Check if x is equal to the nearest integer
  return x === Math.round(x);
}

This updated version of the isInteger function first checks whether x is a number, and returns false if it is not. It then checks whether x is NaN or Infinity by using the Number.isNaN and Number.isFinite methods, and returns false if x is either of these special values. Finally, it checks whether x is equal to the nearest integer by using the Math.round method, and returns true if it is or false if it is not.

This updated version of the isInteger function will correctly handle NaN and Infinity values and will return the correct result for these special cases. However, as mentioned earlier, it is still not as robust as the built-in Number.isInteger method, which is the recommended way to check whether a value is an integer in JavaScript.

Notice how it keeps using the word “robust” without defining what it means in this context.

Me [growing skeptical again]: Why is your implementation still not as robust as the native implementation?

ChatGPT: While the updated isInteger function I provided will handle NaN and Infinity values correctly, it is still not as robust as the built-in Number.isInteger method for several reasons.

First, the Number.isInteger method is a built-in method in JavaScript, which means it is optimized for performance and is tested extensively to ensure that it works correctly in all situations. The isInteger function I provided is a user-defined function, which means it may not be as efficient or as reliable as the built-in method.

OK, sure, that’s true. If ChatGPT had left it at that, I probably would have been satisfied, but it continued.

Second, the Number.isInteger method takes into account the fact that JavaScript numbers are represented in binary floating-point format, which can sometimes cause unexpected behavior when dealing with integers. For example, the value of 0.1 + 0.2 is not exactly equal to 0.3 in JavaScript, because binary floating-point numbers cannot precisely represent the decimal value 0.3. The Number.isInteger method accounts for this behavior and correctly handles cases where a number may not be represented exactly in binary floating-point format. The isInteger function I provided does not account for this behavior and may return incorrect results in such cases…

This is a weird answer. Yes, some JavaScript expressions yield floating point rounding errors. But such expressions would also yield the same result when passed to the native Number.isInteger(). There are no values I am aware of that would yield a result different from the native implementation.
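One way to check this claim is to run ChatGPT's second implementation side by side with the native method. The sketch below is only a spot check, not a proof: the list of sample values is my own selection, and it assumes a modern engine (such as a recent Node.js) where Number.isInteger() and the ** operator exist.

```javascript
// ChatGPT's second implementation, copied from the conversation above.
function isInteger(x) {
  if (typeof x !== 'number') return false;
  if (Number.isNaN(x) || !Number.isFinite(x)) return false;
  return x === Math.round(x);
}

// Hand-picked sample values (an assumption of this sketch, not exhaustive).
const samples = [
  0, -0, 1, -1, 0.5, -0.5, 0.1 + 0.2, 1e21, -1e21,
  2 ** 31, 2 ** 53, Number.MAX_SAFE_INTEGER, Number.MAX_VALUE,
  Number.MIN_VALUE, Number.EPSILON, NaN, Infinity, -Infinity,
  '1', null, undefined, true, [], {},
];

// Collect every sample where the two implementations disagree.
const disagreements = samples.filter(
  (x) => isInteger(x) !== Number.isInteger(x)
);

console.log(disagreements.length); // 0
```

For these samples, at least, the two agree everywhere, including on 0.1 + 0.2.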

Me: Can you give me an example of a value of x that would fail because of a floating point rounding error?

At this point, ChatGPT lost the thread of the conversation and made some general statements about floating point numbers, forgetting all about isInteger().

Me: No, I meant a value of x passed to your isInteger function.

ChatGPT: Ah, I see what you mean now. Here is an example of a value of x that would fail when passed to the isInteger function I provided earlier because of a floating point rounding error:

isInteger(0.1 + 0.2);  // expected output: true

ChatGPT continued, but let’s stop here. Clearly, it’s off the rails. isInteger(0.1 + 0.2) is passing an expression with a rounding error into its function. I asked about a value. Perhaps more alarmingly, the expected output should most certainly not be true. No implementation of isInteger() should return true for 0.1 + 0.2, with or without the floating point rounding error. Maybe it wasn’t fair for me to interrogate ChatGPT about its faulty premise, but I think this illustrates how a so-called “artificial intelligence” isn’t self-aware enough to smell its own bullshit.
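To make the distinction between an expression and a value concrete, here is a small sketch. The helper describeArgument() is hypothetical, written only for this illustration. JavaScript evaluates 0.1 + 0.2 before the call, so the function only ever sees a single number, and that number is nowhere near an integer.

```javascript
// Hypothetical helper (not from the conversation): reports exactly what a
// function receives when it is called with an expression as its argument.
function describeArgument(x) {
  return `${typeof x}: ${x}`;
}

const received = describeArgument(0.1 + 0.2);
console.log(received); // "number: 0.30000000000000004"

// The rounding error is real, but it has no bearing on integer-ness.
const sum = 0.1 + 0.2;
console.log(sum === 0.3);             // false
console.log(Number.isInteger(sum));   // false
console.log(sum === Math.round(sum)); // false
```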

What else ChatGPT got wrong

This made me wonder about some of the other claims ChatGPT had made back when my disbelief was suspended.

Contrary to what ChatGPT claims, the bitwise operator implementation of isInteger() correctly returns true for negative integers.
The provided explanation for why the bitwise operator implementation fails is so vague as to be meaningless. The reason a bitwise XOR operator works for some values of x is that it coerces x into a 32-bit integer before converting it back to an IEEE 754 double-precision 64-bit floating point number. The ^ 0 operation itself is effectively a no-op. It’s a clever magic trick with some misdirection. It’s wrong because the range of integers expressible as 64-bit floating point numbers is greater than that of 32-bit integers. Very large integers will yield false negatives because bits get chopped off in the conversion.
ChatGPT’s first attempt at writing an isInteger() implementation does in fact handle NaN correctly. The code it added to handle NaN is superfluous.
Number.isInteger() does not handle very small numeric literals like 1e-324 any better than any other implementation because they evaluate to 0 before they’re passed in.
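These points can be checked directly. In the sketch below, the names isIntegerXor and isIntegerFirstAttempt are my own labels for the take-home function and for ChatGPT's first implementation. The results in the comments assume a standards-compliant engine such as a recent Node.js.

```javascript
// The XOR-based function from the take-home question (renamed for clarity).
function isIntegerXor(x) {
  return (x ^ 0) === x;
}

// ChatGPT's first attempt, before it added the NaN and Infinity checks.
function isIntegerFirstAttempt(x) {
  if (typeof x !== 'number') return false;
  return x === Math.round(x);
}

// Negative integers are fine, contrary to ChatGPT's explanation.
console.log(isIntegerXor(-5));   // true
console.log(isIntegerXor(-5.5)); // false

// The real flaw: integers outside the signed 32-bit range wrap around.
console.log(isIntegerXor(2 ** 31 - 1));     // true
console.log((2 ** 31) ^ 0);                 // -2147483648
console.log(isIntegerXor(2 ** 31));         // false (a false negative)
console.log(Number.isInteger(2 ** 31));     // true

// NaN was already handled, because NaN === NaN is false.
console.log(isIntegerFirstAttempt(NaN));    // false

// The literal 1e-324 is too small to represent and is parsed as 0.
console.log(1e-324 === 0);                  // true
console.log(Number.isInteger(1e-324));      // true
```

One detail worth noting: the first attempt does return true for Infinity, since Math.round(Infinity) is Infinity, so the Number.isFinite() check that ChatGPT added was doing real work even though the Number.isNaN() check was not.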

The reason I missed these errors initially was that the summary was correct and I agreed with the conclusion, which was that one should always favor the native implementation. But I had skimmed the substance of its answers and failed to notice the factual errors and fluff masquerading as supporting evidence.

Grasping for prestige

At best, ChatGPT could become a lazy person’s automated research librarian, a first stop on a longer quest for knowledge. Unlike traditional research tools, however, clarity diminishes with use. There are no citations, no barometer of confidence, just answers. Examining the answers closely is like looking at the world through glass bricks. At worst, it could become a crutch for those unscrupulous enough to represent ChatGPT’s answers as their own, particularly in academic contexts. I would not relish being a computer science instructor at this juncture.

I’m reminded of a story about a U.S. president and their cabinet. (I’ll not say which one because this is not a political blog.) They prided themselves on their ability to speed read. In addition to the blizzard of papers ferried in and out of the White House, they read copious books ranging from popular fiction to critiques of the American nuclear arms strategy and could speak compellingly about them at cocktail parties and National Security Briefings. As far as I know, they never revealed their methods, but if they’re anything like the rest of us mortals, they accomplished the feat by not actually reading all of the text. For example, some speed readers read only the first and last chapters of a book and the first sentence of each paragraph in between. Skimming is commonplace and even desirable in many cases, but this particular presidency made a cult of it. The lesson I drew was that it is frighteningly tempting for seemingly responsible adults to value the appearance of understanding something over deeply understanding it.

What’s interesting about ChatGPT is that not only can it skim on command, it can also act as a kind of mechanical ghostwriter. Of course, everyone is ignoring all the warnings not to do that. If you care to, you can ask it why that’s a bad idea and it will elaborate:

Me: What are the limitations of using ChatGPT to write code?

ChatGPT: As a large language model, ChatGPT is not specifically designed to write code and therefore may not have the knowledge or ability to do so effectively. Some of the limitations of using ChatGPT for writing code include:

Lack of knowledge of programming languages and their syntax and semantics

Limited understanding of the logic and structure of a program

Inability to access external resources or reference documentation

Lack of knowledge about specific libraries or frameworks

Difficulty following the conventions and best practices of programming

Overall, it is generally not recommended to use ChatGPT or any other language model for writing code, as it is likely to produce poor-quality or incorrect code. Instead, it is better to use tools specifically designed for writing code, such as a code editor or integrated development environment (IDE).

ChatGPT was built by creative, innovative people, but it has no creativity or innovation of its own. It can reverse engineer intent, but it has no intention of its own. Perhaps it can help those of us who have not yet given up on ourselves in our quest for creativity and innovation. But we must supply the intent and assume responsibility for the outcome.
