Book Review: What Is ChatGPT Doing … and Why Does It Work? 

https://www.goodreads.com/book/show/123245371-what-is-chatgpt-doing-and-why-does-it-work

ChatGPT is a sophisticated computational language model capable of producing coherent, semantically meaningful text by adding one word at a time. Rather than literally scanning an extensive corpus of human-authored text for matching passages, the model uses the statistical patterns it absorbed from that corpus during training. Given the text so far, it generates a ranked list of potential next words, each with a corresponding probability. The fascinating thing is that when we give ChatGPT a prompt like "Compose an essay," all it does is ask, "Given the text so far, what should the next word be?" over and over again, appending one word at a time. However, if the model always chose the highest-ranked word, the resulting essay would be flat and repetitive. Instead, the system uses a "temperature" parameter to decide how often lower-ranked words are used. Temperature is one of the model's tunable parameters, typically ranging from 0 to 1: at 0 the model always picks the top-ranked word, producing the flattest essay with no creativity, while values near 1 yield the most varied, "creative" text.
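
To make the word-at-a-time picture concrete, here is a minimal Python sketch of temperature sampling. The candidate words and probabilities are illustrative stand-ins, not actual model output, and the rescaling shown is one common way temperature is applied, not necessarily ChatGPT's exact mechanism:

```python
import numpy as np

def sample_next_word(candidates, temperature=0.8):
    """Pick the next word from a ranked list of (word, probability) pairs.

    temperature near 0 -> almost always the top-ranked word (flat, repetitive text);
    higher temperature -> lower-ranked words are chosen more often (more "creative").
    """
    words, probs = zip(*candidates)
    probs = np.asarray(probs, dtype=float)
    if temperature == 0:
        return words[int(np.argmax(probs))]  # greedy: always the top word
    # Rescale the distribution by the temperature, then renormalize.
    logits = np.log(probs) / temperature
    scaled = np.exp(logits - logits.max())
    scaled /= scaled.sum()
    return np.random.choice(words, p=scaled)

# Hypothetical ranked continuations for some prompt text:
candidates = [("learn", 0.045), ("predict", 0.035), ("make", 0.028),
              ("understand", 0.025), ("do", 0.021)]
print(sample_next_word(candidates, temperature=0.8))
```

Running the sampler in a loop, feeding each chosen word back into the context, is exactly the "add a word, ask again" procedure the book describes.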

The foundation of neural networks is numerical, so to use them for textual analysis we need a method for representing text as numbers. This concept is fundamental to ChatGPT, which employs an embedding to represent entities (here, words) as arrays of numbers, such that words appearing in similar contexts end up with similar numbers. To create this kind of embedding, we must examine vast quantities of text and measure how similar the contexts are in which various words appear. Discovering word embeddings begins with a trainable task involving words, such as word prediction. For instance, consider the problem "the ___ cat." Suppose that, among the 50,000 most common words used in English, "the" is word 914 and "cat" (with a space before it) is word 3542; the input is then {914, 3542}, and the output should be a list of approximately 50,000 numbers giving the probability of each potential "fill-in" word. The embedding itself is obtained by intercepting the values at the embedding layer, just before the neural network reaches its conclusion about which words are appropriate.
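
A toy PyTorch sketch of this fill-in-the-blank setup may help. The architecture and embedding width here are illustrative assumptions (real models use transformers and far larger embeddings), and the network is untrained, so its probabilities are meaningless; the point is where the embedding is read off:

```python
import torch
import torch.nn as nn

VOCAB_SIZE = 50_000   # roughly the common-word vocabulary in the book's example
EMBED_DIM = 128       # illustrative embedding width, not GPT's actual size

class FillInModel(nn.Module):
    """Toy word-prediction net: context token IDs -> probabilities over the vocabulary."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.out = nn.Linear(EMBED_DIM, VOCAB_SIZE)

    def forward(self, token_ids):
        vectors = self.embed(token_ids)   # one vector per context word
        context = vectors.mean(dim=0)     # crude pooling of the context
        return torch.softmax(self.out(context), dim=-1)

model = FillInModel()
# "the ___ cat" with the book's token IDs: "the" = 914, " cat" = 3542
context = torch.tensor([914, 3542])
probs = model(context)                        # ~50,000 fill-in probabilities
# Intercept the embedding layer instead of reading the final answer:
cat_vector = model.embed(torch.tensor(3542))  # the learned vector for " cat"
print(probs.shape, cat_vector.shape)
```

After training on the prediction task, `cat_vector` is the kind of numerical representation the review describes: a by-product of the network learning to fill in blanks, not something designed by hand.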

Human language, and the cognitive processes involved in producing it, have always seemed like a pinnacle of complexity. However, ChatGPT has shown that an entirely artificial neural network, with a large number of connected nodes resembling the connected neurons of a brain, can generate human language with amazing fidelity. The underlying reason, the book argues, is that language is fundamentally simpler than it appears, and ChatGPT effectively captures the essence of human language and the reasoning behind it. In addition, ChatGPT's training has "implicitly discovered" whatever linguistic (and cognitive) patterns make this possible.

Syntax, often represented with parse trees, is a well-known example of what can be considered a "law of language": there are (relatively) clear grammatical norms for how words of various types can be combined, such as nouns may take adjectives before them and verbs after them, while two nouns usually can't sit immediately next to each other. ChatGPT has no explicit "knowledge" of such principles, but through its training it implicitly "discovers" and subsequently applies them. The key point is that a neural net can be trained to generate "grammatically correct" sequences, and there are several ways to handle sequences in neural nets; ChatGPT uses transformer networks. Like Aristotle, who "discovered syllogistic logic" by studying many instances of rhetoric, ChatGPT, with its 175B parameters, may be doing much the same by studying vast quantities of text from the web.
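
For contrast, such grammatical rules can be written down explicitly, which is precisely what ChatGPT never sees. A toy context-free grammar (the rules and vocabulary are invented here for illustration) makes the parse-tree structure the network must instead discover implicitly:

```python
import random

# A toy grammar encoding the kind of rules mentioned above: adjectives may
# precede nouns, a verb follows the noun phrase, two bare nouns never touch.
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"], ["Det", "Adj", "N"]],
    "VP":  [["V"], ["V", "NP"]],
    "Det": [["the"]],
    "Adj": [["black"], ["sleepy"]],
    "N":   [["cat"], ["dog"]],
    "V":   [["chases"], ["sees"]],
}

def generate(symbol="S"):
    """Expand a symbol by picking a random production until only words remain."""
    if symbol not in GRAMMAR:  # terminal: an actual word
        return [symbol]
    production = random.choice(GRAMMAR[symbol])
    return [word for part in production for word in generate(part)]

print(" ".join(generate()))  # e.g. "the sleepy cat chases the dog"
```

Every sentence this generator emits is grammatical by construction; ChatGPT's achievement is producing comparably well-formed text without ever being handed rules like these.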