This increase in scale drastically changes the behavior of the model: GPT-3 can perform tasks it was not explicitly trained on, such as translating sentences from English to French, with few or no training examples. This behavior was mostly absent in GPT-2. Furthermore, on some tasks GPT-3 outperforms models that were explicitly trained to solve them, while on others it falls short. LLMs predict the next word in a sequence of words, and then the words that follow, much like the familiar autocomplete feature, but on a truly mind-boggling scale.
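The autocomplete analogy can be made concrete with a toy sketch. Real LLMs learn next-word probabilities with neural networks over vast corpora; the tiny bigram model below (a hypothetical illustration, not any real model) just counts which word follows which in a short corpus, then repeatedly appends the most likely next word:

```python
# Toy next-word prediction: a bigram "language model" built from
# word-pair counts. The autoregressive loop (predict, append, repeat)
# is the same idea real LLMs use at vastly larger scale.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran".split()

# Count how often each word follows each other word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent next word seen in the corpus, or None."""
    counts = following[word]
    return counts.most_common(1)[0][0] if counts else None

def complete(word, n=3):
    """Autocomplete: repeatedly append the predicted next word."""
    out = [word]
    for _ in range(n):
        nxt = predict_next(out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)

print(complete("the"))  # "the cat sat on"
```

Each prediction depends only on the text generated so far, which is why the same mechanism that completes a phrase can, at scale, write paragraphs and whole pages.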
This ability allows them to write paragraphs and entire pages of content. But LLMs are limited because they don't always understand exactly what a human being wants. That is where ChatGPT improves on the state of the art, with the Reinforcement Learning from Human Feedback (RLHF) training that we mentioned earlier.

How was ChatGPT trained? GPT-3.5 was trained on massive amounts of code and text from the Internet, including sources like Reddit discussions, to help ChatGPT learn to hold a dialogue and achieve a human style of response. ChatGPT was also trained using human feedback, the RLHF technique, so that the AI would learn what humans expected when they asked a question.
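One core ingredient of RLHF is a reward model trained on human preference data: annotators pick which of two responses they prefer, and the model learns to score the preferred one higher. A common way to express this is a pairwise ranking loss, -log(sigmoid(r_chosen - r_rejected)). The sketch below illustrates that loss with made-up scores; it is a minimal illustration of the idea, not ChatGPT's actual training code:

```python
# Minimal sketch of the reward-model objective used in RLHF.
# Given human-preferred ("chosen") and dispreferred ("rejected")
# responses, the loss is small when the reward model already ranks
# the chosen response higher, and large when it ranks it lower.
import math

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise ranking loss: -log(sigmoid(r_chosen - r_rejected))."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# Illustrative reward scores (hypothetical numbers, not model outputs).
good = preference_loss(r_chosen=2.0, r_rejected=0.5)  # model agrees with humans
bad = preference_loss(r_chosen=0.5, r_rejected=2.0)   # model disagrees
print(good < bad)  # True
```

Once trained, the reward model stands in for the human annotators, providing the feedback signal that reinforcement learning then uses to steer the LLM toward answers people actually want.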
Training the LLM in this way is revolutionary because it goes beyond simply training the LLM to predict the next word. A March 2022 research paper titled "Training Language Models to Follow Instructions with Human Feedback" explains why this is an innovative approach: "This work is motivated by our goal of increasing the positive impact of large language models by training them to do what a given group of humans wants them to do. By default, language models optimize the next-word prediction objective, which is only a proxy for what we want these models to do."