Chatbots put to the scripting test

Beginner's Mistake

A large language model (LLM) uses what is known as a transformer architecture, hence the name "generative pretrained transformer" (GPT). Put simply, an LLM represents each word used in a sentence as a series of mathematical vectors. The trained model comprises several layers of transformer encoders and decoders that put the vector groups of words from a sentence or word block in a suitable context.

As the model learns, it creates encoders that suggest a connection between the vectors of terms (e.g., "ship" and "water" or "helicopter" and "flying"). During the learning phase, the LLM tries to build correct sentences. It compares these with the correct answers given by the trainer. The difference between the LLM's response and the correct response is transferred to the transformer layers and optimizes the LLM's function. A generic LLM therefore becomes proficient in one or more languages with which it has been trained and has knowledge of the data used in the training.

For example, a model such as ChatGPT 3.5 was trained with 175 billion parameters – its state of knowledge dates back to September 2021. GPT-4 already has more than a trillion parameters. After completing the basic training, an LLM can be improved with further levels of knowledge. These difference models are known as a low-rank adaptation of large language models (LoRAs). For example, a LoRA can help a generic language model learn how Python programming works without having to retrain the entire LLM.

This mode of operation also shows the weaknesses of LLMs: They are not creative and do not generate any new information. They only use existing knowledge and reformulate it to match the question. Of course, they have a massive amount of information that no single person has at their disposal, but the LLM is again limited to the information with which it was trained.