More than a million people tested ChatGPT during its first few days online, generating everything from pop lyrics and trivia to news-style articles, with sometimes dubious results – and it continues to be a popular topic of conversation. This is an easy-to-understand description of ChatGPT: what it can do, how it does it and what’s next.
Artwork created by DALL-E, a sister programme to ChatGPT: ‘a Salvador Dalí inspired oil painting of Digital Human chatbot’
What is ChatGPT?
Chatbots are what you see on your screens, often presented as a visual representation of a human (a digital human) or as a chat text box.
GPT, or Generative (able to generate content) Pre-trained (trained) Transformer (model or engine), is an engine that drives what chatbots say. GPT is not the only engine of its kind; there are a number of very good alternatives, e.g. BigScience’s BLOOM, Google’s LaMDA and BERT, DeepMind models like Chinchilla, and AWS’s AlexaTM.
Why are we talking about it?
GPT-3 was released by OpenAI two years ago and demonstrated that AI can write (and speak with the help of a digital human) like a human.
GPT-3.5 followed, trained on data up to late 2021, and ChatGPT was released in November 2022. More than a million people tested it during its first few days online, generating everything from pop lyrics and trivia to news-style articles, with sometimes dubious results.
Its successor, GPT-4, is rumoured to be ready sometime in 2023. With a link-up to Microsoft Azure’s huge computing power, it’s rumoured to add image capability alongside improved language capabilities.
How does it work?
Computers don’t understand language in the same way humans do. To process the sequence of words you typed, GPT converts each word into numbers, allowing it to calculate the context and then predict the best response.
This evaluation is called self-attention: a mathematical operation that works out how important each word is to the meaning of the sentence. The word carrying the topic of the sentence is assigned the highest weight; the other words receive lesser weights and are used to provide context, for example the ‘how’ or the ‘why’.
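As a rough illustration of the idea, self-attention can be sketched in a few lines of Python. This toy uses random made-up word embeddings rather than GPT’s real learned weights, and real transformers project the input through learned query/key/value matrices, which are omitted here for brevity:

```python
import numpy as np

# Toy scaled dot-product self-attention over a 4-word sentence.
# Embeddings are random stand-ins; real models learn them during training.
rng = np.random.default_rng(42)
words = ["how", "does", "chatgpt", "work"]
d = 8
X = rng.normal(size=(len(words), d))  # one embedding vector per word

def self_attention(X):
    # In a real transformer, Q, K and V come from learned projections of X.
    Q, K, V = X, X, X
    scores = Q @ K.T / np.sqrt(X.shape[1])          # relevance of each word to every other
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # softmax: each row sums to 1
    return weights @ V, weights

out, weights = self_attention(X)
# Row i of `weights` shows how much attention word i pays to each word in the sentence.
```

Each output vector is a weighted mix of all the word vectors, which is how the model lets every word’s meaning be shaped by its surrounding context.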
The response is then generated using the patterns GPT learned during pre-training, and within a conversation, the more context you give it, the better its responses become.
GPT was trained on a vast corpus of internet text, which provided billions of question and answer pairs. This training was then optimised using a process called Reinforcement Learning from Human Feedback (RLHF), a method that uses human demonstrations and preferences to guide the model toward desired behaviour. It includes things like multiple-choice question and answer sets where the model’s objective is to score the most correct answers.
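The preference-learning idea at the heart of RLHF can be sketched in miniature. This is a heavily simplified toy, not OpenAI’s actual setup: it uses synthetic feature vectors and a Bradley–Terry-style objective to learn a reward function that scores human-preferred responses above rejected ones:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_reward_model(preferred, rejected, steps=500, lr=0.1):
    """Learn weights w so that preferred responses score higher than rejected ones.

    preferred/rejected: (n, d) arrays of toy response features for n human-ranked pairs.
    Minimises -log sigmoid(score(preferred) - score(rejected)).
    """
    w = np.zeros(preferred.shape[1])
    for _ in range(steps):
        margin = preferred @ w - rejected @ w
        sigma = 1.0 / (1.0 + np.exp(-margin))
        # Gradient of the negative log-sigmoid loss, averaged over the pairs.
        grad = -(preferred - rejected).T @ (1.0 - sigma) / len(margin)
        w -= lr * grad
    return w

# Synthetic data: feature 0 stands in for "helpfulness"; human-preferred
# responses have more of it on average, the other features are noise.
pref = rng.normal(size=(200, 4)); pref[:, 0] += 1.0
rej = rng.normal(size=(200, 4))
w = train_reward_model(pref, rej)
# The learned reward should weight the "helpfulness" feature most heavily.
```

In real RLHF the reward model is itself a large neural network trained on human rankings of model outputs, and its scores are then used to fine-tune the language model with reinforcement learning.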
Limitations
GPT is very upfront about its own limitations – it’s a language model, not a search engine. Its training data ends in September 2021 and it isn’t connected to the internet, so it can’t perform live lookups and some of its information may be out of date or incorrect.
The point is that GPT is a very capable mimic of human language: a tool that can work with your information to provide solutions for your organisation and clients.
Implementations
Chatbots are just one well-known use for GPT. It’s also being used to generate news articles, write summary documents, and write code (see GitHub’s Copilot).
If this has piqued your interest and you would like to learn more about GPT and its alternatives, AI Forum New Zealand has a number of members who can provide expertise and guidance in this space. Get in touch.
In the meantime you can try ChatGPT for free here: https://openai.com/blog/chatgpt/.
Interview with ChatGPT
It seemed obvious that for this article we would ask ChatGPT for its answers. Here are the responses to some of the questions I asked it:
Can you describe ChatGPT?
ChatGPT is a conversational language model developed by OpenAI. It is based on the GPT-2 (Generative Pre-trained Transformer 2) architecture, which is a transformer-based neural network that is trained using a large corpus of text data.
The model is pre-trained on a dataset of conversational text, which allows it to understand and generate human-like responses to text prompts. It can be fine-tuned on specific tasks or domains, such as customer service, e-commerce, or language translation.
ChatGPT is able to generate human-like text that can be used in a variety of applications, such as chatbots, virtual customer service agents, and automated content generation. Because it is pre-trained on a large dataset of conversational text, it has a good understanding of how people communicate and can respond to a wide range of queries and prompts in a natural and coherent way.
One of the main advantages of ChatGPT is its ability to generate text that is highly realistic and coherent, which can be useful in a wide range of applications where human-like text is needed. Additionally, it can generate text in multiple languages, making it a useful tool for language-related tasks.
How often do you provide incorrect responses?
As a language model, I am trained on a large dataset of text, so my responses are based on patterns and associations found in that data. However, I am not perfect and can make mistakes or provide incorrect responses, particularly when faced with out-of-domain questions or tasks that are not well-defined. Additionally, my knowledge cutoff is 2021-09, so information provided after that date may not be accurate.
How would I implement ChatGPT?
To implement ChatGPT, you would need to use the OpenAI API or use the pre-trained model and fine-tune it on your own dataset.
Here’s an overview of the steps you would need to take:
1. Sign up for an API key on the OpenAI website if you plan on using the API.
2. Decide on the programming language you want to use to implement the model. ChatGPT is available in several languages including Python, Java, and C#.
3. Install the necessary libraries such as openai.
4. Use the API endpoint or the pre-trained model and fine-tune it on your own dataset if you have one.
5. Write code to input a prompt and generate a response using the model.
6. Test the model with sample inputs to see how it performs.
It’s worth noting that using pre-trained models like GPT require a good amount of computational resources and the API calls can be expensive, depending on the plan you choose. If you plan on deploying the model in a production environment, you might need to consider the cost and infrastructure requirements.
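The steps ChatGPT lists above can be sketched in Python. This is an illustrative sketch only: the model name `text-davinci-003` and the request shape are assumptions based on the GPT-3-era `openai` library, so check OpenAI’s current documentation before relying on either:

```python
import os

def build_request(prompt, model="text-davinci-003", max_tokens=100):
    """Step 5: assemble the parameters for a completion request.

    The model name is an assumption; substitute whatever your plan offers.
    """
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.7,  # higher values give more varied output
    }

def generate(prompt):
    import openai  # step 3: pip install openai
    openai.api_key = os.environ["OPENAI_API_KEY"]  # step 1: key from the OpenAI site
    resp = openai.Completion.create(**build_request(prompt))  # steps 4-5: call the API
    return resp["choices"][0]["text"]

# Step 6: sanity-check the request locally before spending paid API calls.
req = build_request("Describe ChatGPT in one sentence.")
```

Keeping the request-building separate from the network call makes it easy to test prompts and parameters without incurring any API cost.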
How much does ChatGPT cost to use?
The cost of using ChatGPT depends on the method you choose to implement it. If you use the OpenAI API, the cost will vary based on the usage plan that you choose. The OpenAI website currently offers several plans including a free plan, a development plan, and several enterprise plans, each with different pricing structures. The free plan allows for a limited number of requests per month and the costs for the paid plans increase as the number of requests increases.
If you decide to use a pre-trained model, you will not have to pay for the use of the model itself, but you will need to pay for the computational resources required to run it. The cost will depend on the infrastructure you choose to use, such as cloud-based services like AWS, GCP or Azure, or running it on your own hardware.
It’s worth noting that, OpenAI also provides GPT-3, which is a more powerful version of GPT-2, but also with a higher price tag. You can check their website for more information on the pricing.
What is the main difference between GPT-2 and GPT-3?
GPT-2 and GPT-3 are both large language models developed by OpenAI, but there are some key differences between the two.
One of the main differences is the size of the models. GPT-3 is significantly larger than GPT-2, with 175 billion parameters compared to GPT-2’s 1.5 billion. This means that GPT-3 has been trained on a much larger dataset, and it has the ability to process and understand a wider range of text.
Another difference is the performance of the models. GPT-3 is generally considered to be more accurate and capable of generating more realistic and coherent text than GPT-2. It is also able to perform a wider range of natural language tasks, such as translation, summarization, and question answering.
Finally, GPT-3 is also capable of zero-shot learning, meaning it can answer questions or complete tasks that it has never been explicitly trained on. GPT-2, on the other hand, requires fine-tuning on a specific task to perform well.
In summary, GPT-3 is a more powerful and advanced model than GPT-2, and it is able to perform a wider range of natural language tasks with a higher level of accuracy. However, it’s worth noting that GPT-3 also come with a higher price tag and computational resources requirements, which may not be suitable for all use cases.
Do you have any strong examples of GPT use cases that are live?
Yes, there are several examples of GPT (both GPT-2 and GPT-3) being used in live applications. Some examples include:
1. Content generation: GPT is used to generate high-quality, human-like text for a variety of applications such as news articles, blog posts, product descriptions, and more.
2. Chatbots: GPT-3 can be used to build advanced chatbots that can understand and respond to natural language queries. Some companies have built virtual customer service agents that can handle a wide range of customer queries and complaints with a high level of accuracy.
3. Language Translation: GPT-3 can be used to generate high-quality translations for a wide range of languages, with less need for training data.
4. Text summarization: GPT-3 can be used to summarize long articles or documents, generating a shorter version while keeping the most important information
5. Automated code generation: GPT-3 can be used to write code, such as SQL queries or Python functions, given a natural language prompt.
6. Language model fine-tuning: GPT-3 can be fine-tuned on specific tasks and domains, such as legal documents, medical reports, and more.
These are just a few examples of the many ways that GPT can be used in live applications. The technology is still relatively new, and researchers and developers are constantly finding new ways to use it to improve various industries.
How many times have you been asked if you are sentient?
As a machine learning model, I am not sentient, and I do not have the ability to keep track of how many times I have been asked a certain question. Sentience is the capacity to have subjective experiences, self-awareness, and consciousness, which I don’t possess. I am a program that processes text based on patterns and associations in the data I was trained on.
You might also find this curated list of work, ideas and exploration of ChatGPT useful: GitHub – saharmor/awesome-chatgpt: Selected ChatGPT demos, tools, articles, and more