The GPT in ChatGPT stands for “Generative Pre-trained Transformer.” And although this may sound like some sort of ’90s TV cartoon Megazord, it really means that it is a type of language model trained on large amounts of text data. While ChatGPT is a conversational app, a specific tool, GPT is the brain that powers it under the hood.
In essence, ChatGPT is a chatbot designed for conversational interactions with humans. It is trained to respond to a question or instruction (a prompt) and to handle follow-up questions in a detailed manner, admitting mistakes, challenging incorrect premises, and applying content filters to avoid inappropriate use.
ChatGPT, the interface product we use and love (or hate and fear, depending on where you stand), is owned by OpenAI, and many similar competing products are now sprouting up everywhere. GPT, the processing model behind some of them, is a type of Large Language Model (LLM): an artificial intelligence model trained on large amounts of data to understand and generate human language.
Let’s see what it’s all about in more detail.
Generative Pre-Trained Transformers
Now you know that ChatGPT and GPT are two different but complementary things. ChatGPT has used different iterations of GPT throughout its development, currently GPT-3.5 for its free version and GPT-4, the most advanced model so far, for paid subscribers only. But what exactly is this Generative Pre-trained Transformer stuff?
Generative means that it is able to create new text that is coherent and contextually relevant, based on whatever input you feed it. This is what gives it its conversational quality, making it seem like you are talking to a responsive person rather than a predetermined, static robot (like your old-fashioned, standard banking bots, for example).
Pre-trained refers to the large amounts of text data fed to the model so it can learn to respond to prompts in a human fashion. This learning is done with no specific task in mind other than familiarizing the model with how human language works: its grammar, patterns, and semantics.
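As a loose analogy (this is nothing like GPT’s real internals, just the general “learn statistics from text, then generate” idea), here is a toy bigram model in Python. The corpus and function names are invented for illustration:

```python
import random
from collections import defaultdict

def pretrain(corpus):
    """'Pre-train' by counting which word follows which in the text."""
    next_words = defaultdict(list)
    words = corpus.split()
    for current, following in zip(words, words[1:]):
        next_words[current].append(following)
    return next_words

def generate(stats, start, length=5):
    """'Generate' new text by repeatedly sampling a plausible next word."""
    out = [start]
    for _ in range(length):
        candidates = stats.get(out[-1])
        if not candidates:
            break  # no known continuation for this word
        out.append(random.choice(candidates))
    return " ".join(out)

# A tiny made-up "training set"
corpus = "the model reads the text and the model learns the patterns"
stats = pretrain(corpus)
print(generate(stats, "the"))
```

A real GPT replaces the word counts with billions of learned neural-network parameters, but the two-phase shape is the same: absorb patterns from text first, produce new text second.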
Transformer indicates the deep learning architecture that actually powers GPT (and other AI models). It was proposed fairly recently, in 2017, in a research paper by Ashish Vaswani et al. titled “Attention Is All You Need.”
The principle behind it is a self-attention mechanism, which weighs and scores the importance of each word in a sentence and how it relates to and depends on the other words in that same sentence. Its introduction led to more capable language models that better resemble human language and need less time to train than older models.
Large Language Models And Other Forms Of AI
GPTs are just one specific type (developed by OpenAI) within the larger category of Large Language Models. Simply put, LLMs are AI models trained on large amounts of data to understand and generate human language.
GPT is not the only LLM; several companies are designing and developing their own models using similar or different architectures.
BERT and T5, for example, are LLMs developed by Google for predictive and generative language tasks. And RoBERTa is an enhanced version of BERT built by Facebook AI for natural language understanding and sentiment analysis.
In the end, LLMs are just one among many possible applications of Artificial Intelligence systems. AI is a whole computer science field devoted to developing intelligent systems capable of learning, able to perform tasks that would normally require human intelligence.
When Could AI Overtake Human Performance is a whole different question.