Generative Pre-trained Transformer 3 (GPT-3; stylized GPT·3) is an autoregressive language model that uses deep learning to produce human-like text.

Snippet from Wikipedia: GPT-3

Generative Pre-trained Transformer 3 (GPT-3) is a large language model released by OpenAI in 2020. Like its predecessor GPT-2, it is a decoder-only transformer deep neural network, which uses attention in place of earlier recurrence- and convolution-based architectures. Attention mechanisms allow the model to selectively focus on the segments of input text it predicts to be most relevant. It uses a 2,048-token context window and a then-unprecedented 175 billion parameters, requiring 800 GB of storage. The model demonstrated strong zero-shot and few-shot learning on many tasks.
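The attention mechanism in a decoder-only transformer can be illustrated with a minimal sketch of scaled dot-product attention with a causal mask, which is what lets each position attend only to itself and earlier tokens. This is a simplified single-head illustration in NumPy, not GPT-3's actual implementation; the function name and dimensions are chosen for the example.

```python
import numpy as np

def causal_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product attention with a causal mask.

    x: (seq_len, d_model) input token representations
    w_q, w_k, w_v: (d_model, d_k) projection matrices (illustrative sizes)
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)
    # Causal mask: a position may not attend to positions after it,
    # which is what makes the model autoregressive.
    future = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    # Softmax over each row; masked entries become exactly zero weight.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy usage with a 4-token sequence and 8-dimensional embeddings.
np.random.seed(0)
x = np.random.randn(4, 8)
w_q, w_k, w_v = (np.random.randn(8, 8) for _ in range(3))
out, weights = causal_attention(x, w_q, w_k, w_v)
```

Each row of `weights` is a probability distribution over the tokens that position is allowed to see; in GPT-3 this pattern is repeated across many heads and layers over the 2,048-token window.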

Microsoft announced on September 22, 2020, that it had licensed "exclusive" use of GPT-3; others could still use the public API to receive output, but only Microsoft has access to GPT-3's underlying model.


  • kb/gpt-3.txt
  • Last modified: 2022/08/13 10:59
  • by Henrik Yllemo