The Problem with AI is "us"
This page is a bit of a working document with some of my thoughts on AI. I dumped all this text out in one long marathon writing session without much editing, so it currently drifts around a little in terms of its argument and general line of thinking. Ironically, that makes it, in its current state, an example of the sort of somewhat poor writing that I'm pointing out in the output of LLMs. I did, however, want to post it as is to "get it out there" and then edit it over time. Expect this document to evolve (and hopefully improve) as I spend more time on it.
The past year or two has seen a massive surge in so-called "AI" [1] applications. The big push happened in the wake of OpenAI's release of ChatGPT in November of 2022, but this was preceded a bit more quietly (at least in many circles) by the growing capabilities of AI art generators in the months leading up to it. This explosion has led to a lot of uncertainty, particularly when it comes to jobs. These new AI systems appear very capable; will they replace us?
On this page, I'm going to give my thoughts about AI and its effects on the three disciplines that I see as most affected right now: writing, programming, and art. At times I may seem disparaging towards AI, so I want to be very clear up front: the capabilities of modern machine learning systems are very impressive. Given where we were not that many years ago, and how far we've come, it's difficult not to be impressed. But remember that for a religion of one adherent, converting a single new member constitutes 100% growth. Even as the growth of the capabilities of these systems continues (assuming it does--I have my doubts), they have a lot of ground to make up. It is possible to be both impressed with the progress and pessimistic about the outlook of a technology.
Writing
For most people, the use of large language models for text generation is the most apparent consequence of the recent AI developments. ChatGPT and co. have been used by many as a writing tool--either to help edit or reword their thoughts, or as a substitute for thinking altogether. Even people who've never used it have seen its effects if they have spent any time on the Internet: LLM-generated articles abound.
In a sense, this is exactly why the technology has the potential to do a lot of good. As the late Dr. Patrick Winston famously remarked in his MIT lectures on public speaking,
Your success in life will be determined largely by your ability to speak, your ability to write, and the quality of your ideas. In that order.

The fact is (and I know this from experience teaching), most people simply cannot write. And this constitutes a massive barrier to their success--even more so as we move deeper into the Internet age, with its focus on written communication. While one might argue that they should just "learn" how to do it, I'm increasingly of the mind that this isn't nearly as simple a solution as one might think, particularly if they still can't write by the time they hit college. What this means, ultimately, is that ChatGPT could democratize writing in a very big way. Even if you cannot clearly express your ideas, ChatGPT probably can.
There is another side to this coin, however. ChatGPT can certainly write better than most people. But this is largely because most people aren't very good at writing. ChatGPT can construct a coherent sentence, and it can even construct mostly coherent paragraphs. But I would not describe its output as "good". Acceptable, perhaps; above average, definitely. But not exactly "good". In fact, I still remember my very first impression of it: the output of ChatGPT in response to a question that I asked struck me as very similar to the text produced by a college student answering a question they didn't know the answer to. It was very wordy and rambling, and never quite got to the point.
ChatGPT 3.5 Example
To show you exactly what I mean, let me analyze some writing from ChatGPT in response to a simple question. I used the GPT 3.5 version, as I don't pay for access to GPT 4. The prompt that I used for this example is: Explain the Church-Turing thesis and why it is important in no more than 6 sentences.
The Church-Turing thesis posits that any function computable by an effective method (algorithm) can be computed by a Turing machine. This theoretical concept, introduced by Alonzo Church and Alan Turing, unifies various models of computation and defines the limits of algorithmic solvability. It serves as the cornerstone of computer science, providing a common framework to understand computation. The thesis is crucial in establishing the notion of universal Turing machines, laying the foundation for the development of modern computers. It defines what is algorithmically computable, shaping the study of algorithms, complexity theory, and the theoretical aspects of computation. Overall, the Church-Turing thesis is fundamental for understanding the scope and boundaries of computation in theoretical computer science.
I went back and forth a few times with the prompt to get something workable. ChatGPT tends to produce very verbose outputs with low information density, and I didn't want to subject you to me commenting on a particularly long block of text for the purposes of this page. I may create a stand-alone document discussing a full-length ChatGPT output at some point. Most of the longer answers hit the same key points as this length-restricted one.
Overall, it isn't too bad. You can see how this technology is appealing. But while this paragraph is passable, probably even above average, I'd hardly call it "good". Let's break it down sentence by sentence,
The Church-Turing thesis posits that any function computable by an effective method (algorithm) can be computed by a Turing machine.
We aren't exactly off to a good start here. Superficially, this statement is correct. But it is actually missing the most important aspect of the thesis. The sentence sets the thesis up as a one-way implication,

algorithmically computable implies Turing machine computable

when the thesis is properly an equivalence: a function is computable by an effective method if and only if it is computable by a Turing machine. That "if and only if" is what lets us take Turing machines as the definition of effective computability, rather than as merely one model that happens to be at least as powerful as effective methods.
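To make the if-versus-iff distinction concrete, here are the two readings written out formally (the shorthand is my own, not anything ChatGPT produced: Alg(f) means "f is computable by an effective method" and TM(f) means "f is computable by a Turing machine"):

% ChatGPT's one-way reading:
\forall f.\ \mathrm{Alg}(f) \implies \mathrm{TM}(f)

% the thesis as usually stated:
\forall f.\ \mathrm{Alg}(f) \iff \mathrm{TM}(f)

The weaker, one-way statement would still be true in a world where Turing machines could compute strictly more than effective methods; only the equivalence rules that out.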
This theoretical concept, introduced by Alonzo Church and Alan Turing, unifies various models of computation and defines the limits of algorithmic solvability.
This sentence also contains some sloppy wording. The Church-Turing Thesis does apply to a wide range of theoretical models of computation (Turing Machines, Lambda Calculus, Combinators, etc.), but not because it is doing the unifying. The thesis applies to all of these models because they are provably equivalent to a Turing machine, but the thesis itself does not prove or require this to be the case.
It serves as the cornerstone of computer science, providing a common framework to understand computation.
The thesis is crucial in establishing the notion of universal Turing machines, laying the foundation for the development of modern computers.
It defines what is algorithmically computable, shaping the study of algorithms, complexity theory, and the theoretical aspects of computation.
These three sentences are largely redundant with each other. Given the size constraint that I placed on the output, this could have been condensed and the extra space used to provide more information, or to clarify some of the inaccuracies I've mentioned above. But ChatGPT really likes to produce low-density output, so this redundancy isn't very surprising to me.
Also, I would argue that the second sentence gets the causality a bit backwards. The Church-Turing Thesis, at least to my eye, stems from the idea of a Universal Turing machine, not the other way around.
Overall, the Church-Turing thesis is fundamental for understanding the scope and boundaries of computation in theoretical computer science.
And then we have a conclusion, which again is basically just a restatement of the previous three sentences. Given that it's a concluding statement, I'm not too beat up about it. But it is amusing given that this point has already been stated two or three times in the space of a very short block of text.
To hammer home how redundant this passage is, let me restate each sentence in fairly simple terms,
- The Church-Turing Thesis defines a computable function as one that can be computed by a Turing machine.
- The Church-Turing Thesis was introduced by Church and Turing and defines the concept of a computable function.
- It is important to computer science because it defines the concept of a computable function.
- It is crucial to defining the concept of a Turing machine, and is therefore important to computer science.
- It defines the concept of a computable function, which is important.
- In conclusion, it is fundamental because it defines computable functions.
This amount of redundancy isn't necessarily a bad thing in a longer piece--it is well known that good communication involves repeating the core idea to ensure that the reader/listener "gets it". I daresay I've repeated myself a lot in the writing of this article. But when the goal is to produce a short paragraph, this redundancy is harmful.
For context, I then asked ChatGPT to describe the Church-Turing Thesis and why it is important in one sentence. It gave me the following,
The Church-Turing thesis asserts that any computable function can be computed by a Turing machine, providing a foundational concept in computer science by defining the theoretical limits and universality of computation.
Notice that this one-sentence answer includes basically the same information (minus the Turing machine definition bit) as the six-sentence answer. I question why it added the comment on "universality", though.
Writing -- Conclusion
I could belabor this by going into more examples of poor writing by ChatGPT, but I think I've made my point. Its output is passable, but it isn't amazing. It misses important nuance in language (if vs. iff) and tends to be incredibly redundant, even when tight size restrictions are put in place. It is true that its output is likely better than a lot of people could do themselves, but it is nowhere near replacing a talented, or even half-decent, writer. And this is only for short texts. Things get even worse (though I won't demonstrate it here) when ChatGPT is used to produce article-length blocks of text.
Programming
Another domain where AI is becoming popular is computer programming. This is, in many ways, similar to writing, and so it isn't surprising that a large language model can perform reasonably well at code generation. If anything, because programming languages are much simpler than English, one might expect an LLM to have a slightly easier time producing syntactically correct (if not semantically correct) code.
However, the fact of the matter is that large language models cannot do anything beyond the most basic and repetitive coding tasks. As an example, I asked ChatGPT to write me a lock-free stack in Python. As expected, it produced a convincing-looking Python implementation of a Treiber stack,
import threading

class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

class LockFreeStack:
    def __init__(self):
        self.head = threading.AtomicReference(None)

    def push(self, value):
        new_node = Node(value)
        while True:
            current_head = self.head.get()
            new_node.next = current_head
            if self.head.compare_and_set(current_head, new_node):
                return

    def pop(self):
        while True:
            current_head = self.head.get()
            if current_head is None:
                return None
            new_head = current_head.next
            if self.head.compare_and_set(current_head, new_head):
                return current_head.value

    def is_empty(self):
        return self.head.get() is None
However, anything more than a cursory glance reveals that this implementation is horribly broken: the model has completely hallucinated how to do atomic operations in Python. The threading.AtomicReference type doesn't even exist! A little bit of searching reveals that this output is basically the Java example code from Wikipedia's article on the Treiber stack, with the variable names and style translated into Python conventions.
And this wasn't even a particularly complicated task--the answer was literally a copy/paste of a canned solution. And it still couldn't do it. To be fair, the results of using ChatGPT for truly simple programming tasks, particularly creating HTML and CSS for websites, can be impressive. But the limitations are enormous.
And no, I didn't cook the above example. It was actually the very first prompt that I had tried. I was all ready to talk about how the solution had small concurrency bugs, or didn't properly apply memory reclamation and suffered from the ABA problem, or that a Treiber stack is really not that great of a lock-free stack implementation... but I didn't even get that far!
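For reference, here is a minimal sketch of what it would take just to make the generated code run. CPython has no atomic reference type, so the AtomicReference class below (a stand-in of my own invention, not a real library class) emulates compare-and-set with an ordinary lock--which, of course, makes any stack built on top of it lock-based rather than lock-free. That is rather the point.

import threading

class AtomicReference:
    # A stand-in for the nonexistent threading.AtomicReference.
    # CPython exposes no compare-and-set primitive, so we emulate
    # one with a lock; the result is correct, but not lock-free.
    def __init__(self, value=None):
        self._lock = threading.Lock()
        self._value = value

    def get(self):
        with self._lock:
            return self._value

    def compare_and_set(self, expected, new):
        # Swap in the new value only if nobody else has changed it
        # since we last read it.
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False

With self.head = AtomicReference(None) substituted into the generated class, the stack at least runs and behaves correctly under CPython's threading module--but calling it "lock-free" at that point would be false advertising.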
Programming -- Conclusion
The story here is largely the same as with writing. Most people cannot program (even those who work as programmers, sadly), and so the basic abilities of ChatGPT are potentially "better" than average. But they are not "good". In programming it's a bit worse than in writing, because there's less space for solutions that "look" good but are actually bad--programs that don't work... won't work. It doesn't matter how convincing a piece of code looks if it doesn't even run.

The fact that there is even a concern about LLMs replacing programmers speaks more to the quality of the average programmer, and the knowledge of the average manager of programmers, than it does to the capabilities of LLMs in this space.
Art
And finally, we get to AI generated art. I will refrain from posting any model outputs in this article, if only because I have artist friends who would probably strangle me for doing so. Of all the domains affected by AI, art seems to be the one that invokes the most murderous rage on the part of those affected. On that note, I won't get into the copyright and intellectual property concerns regarding the training of AI art models (which could just as easily apply to writing and programming). I may publish another piece discussing this specifically at some point.
In any case, AI art generators, more so than anything else, caught me completely off guard. Like many, I always assumed that the more "creative" types would have the most secure positions in the upcoming AI-job apocalypse. As it turns out, they have been hit first, and hardest. How can this be?
I've thought long and hard about this conundrum, and I think I understand why it is happening now. It's actually just a repeat of the same problems I mentioned above regarding writing and programming, but exacerbated by the medium. Namely: most people who "consume" art don't have good taste.
Okay--I need to explain this carefully, because I'm in pretty risky territory here. In writing, to take probably the clearest example, we commonly use the word literature to distinguish between, for lack of a better phrasing, works with high artistic "value" (whatever that means), and the general products of authors that lack this "value" to some degree.
We understand that there is some, possibly hard to define, qualitative difference between the works of Dostoevsky and those of Jim Butcher. It's not that Butcher is a "bad" author (I personally have read everything he has ever written!), it's just that he is trying to do something very different with his work (namely: entertain) than Dostoevsky. While I enjoy reading the adventures of Harry Dresden, I won't say that those adventures speak as deeply to the human condition as those of Myshkin or Raskolnikov.
The visual arts, at least in most popular discourse, don't seem to have this same distinction. From paintings by Monet and Waterhouse, to the (in my opinion) creepy corporate "flat art" that Microsoft/Google/etc. seem so fond of, to the creative sticker designs that you get with orders from a variety of trendy web shops, it's all just "art". If somebody called ChatGPT a "literature generator" rather than a "text generator", they'd probably get laughed at. Yet we happily call its visual counterparts "art" generators, which implicitly ranks them in the same category as Monet.
This is made worse by the fact that there is almost no "floor" when it comes to art "quality". A program that doesn't run is certainly bad. An incoherent and grammatically incorrect piece of text is also bad. These are, mostly, objective metrics of quality. For visual art though, what constitutes a bad piece of art?
Good art isn't necessarily pretty, or evocative of good feelings, or even complex. For example, Goya's Saturn Devouring His Son is none of these things; in fact, thinking about it keeps me up at night. Yet that is part of what makes it great--it is so uncomfortable. Meanwhile, I have filled my apartment with beautiful works by Waterhouse. Yet even I have to admit that he probably isn't one of the "great" artists (he was, after all, both popular and successful while he was still alive: one of the sure signs of a lack of artistic greatness :p).
The fact of the matter is, artistic quality is very difficult to assess, and being able to recognize the qualities that make works "great" is not a natural or innate thing. It requires a very good understanding of cultural context, the technical aspects of creating art, and so on. As a result, most of us simply fall back on liking pleasing arrangements of color. Such works may be cute, or pretty, or otherwise evoke positive emotions; but they don't necessarily have much to say. That is to say, most of us lack good "taste" in art--in the technical sense. We aren't consciously aware of what makes us like art and are, more or less, "along for the ride" when it comes to our opinions (yes, I'm including myself in this--I know a little, but not much, when it comes to the visual arts). It's possible that this is the reason we don't (in popular discourse) distinguish between "high art" and normal art in the way we do with literature and writing.
This general, vague unconsciousness when it comes to art makes it very easy for AI art generators to produce "good" outputs. The output doesn't need to "say" anything, in the way that text or programs do, and doesn't need to meet any particular quality standard, to be useful as "art". It becomes a simple game of mimicking pretty color patterns.
Art -- Conclusion
In the same way that ChatGPT is nowhere near the level of Dostoevsky (or even Butcher), an art generator is nowhere near the level of even a thoroughly middling artist. It splashes around color and images without intent and says nothing. But while we are pretty well tuned to narratives and will immediately pick up on nonsensical writing, our sense of visual art is so immature that we don't notice how bad these pictures are. Them being "pretty" is good enough. In this sense, the success of "AI" "art" generators is simply the same story as writing and programming, but executed on a much grander scale against a far less discerning audience. This might explain why AI art preceded AI text in terms of cultural recognition.
Conclusion
Thus, we see the same basic narrative played out in three different spaces. These AI systems aren't actually very good in an objective sense, but they are either a) better than a lot of people, b) operating in a space where people aren't very good at discerning quality, or c) both at once. In all cases, the success of AI, and the displacement of people, isn't so much a function of the AI being "good" in any real sense as it is the result of poor performance and poor taste on the part of us, the people.
This sounds a bit like victim-blaming: the reason you lost your job to AI is because you were terrible at it to begin with. But that's not really the message here. While I view the situation with annoyance (how could we let it get this bad?), I also view it with hope. These systems really aren't very good, and probably won't be for a long time. We can be better than them. But it will require both a) working to improve our own capabilities at programming, writing, and art, and b) working to improve our artistic taste. If we can, as a society, start to value art as more than just pretty pictures and start to value writing for its clarity, accuracy, and creativity, then these generators will be unable to satisfy us, artistically or informatively. And if we can stop focusing on the most rudimentary of programming problems, the apparent value of LLMs at "replacing" programmers will vanish.
In any case, there are simple, cultural fixes (beyond murderous rage and Twitter mobs) to the problem of AI taking jobs in various fields. We may yet survive this, if we can make the necessary cultural shifts. At least until somebody develops a real artificial general intelligence--at which point jobs will probably be the least of our concerns.
[1]: I place AI in scare quotes because I'm still an old-school sorta guy. Machine learning is interesting, but I don't really consider it "intelligence", certainly not yet anyway. I'm much more interested in classical AI than machine learning shenanigans. Particularly because machine learning isn't difficult--the neural network techniques that we still use today date back to failed models of human cognition from the 1870s (see Mind and Body: The Theories of Their Relation, Bain, 1873). What has changed is that we now actually have the computational power to train and use these models. Nvidia is probably very happy at how this has shaken out--were we to have developed a more intelligent way of doing AI than sheer brute force, their money printing machine wouldn't be nearly as effective!