At the end of last year, I was flabbergasted by the breakthrough we witnessed with the opening of ChatGPT, the free preview of a chat interface backed by a GPT-3.5 model launched by OpenAI. Nothing like it had been seen before, and everybody went mad about it. At Smile, we are getting on that train, and here's why.
The wrong horse
Even though I am very enthusiastic about technology, I always look at it through a business lens and ask myself what impact it could have on our daily business or on our customers'. Technology is only interesting when it actually helps solve a problem, in other words when there is a use case. So rather than rushing, I prefer to sit and think for a while and avoid betting on the wrong horse too fast.
Last year, for example, everyone was talking about the Metaverse. I tried hard to see how it could be interesting business-wise, or simply interesting at all, but I was definitely not convinced. It is not surprising that Meta simply decided to drop it despite the incredible amount of money they put on the table. I am happy I didn't make Smile jump on that boat and lose money as well (or make our customers waste their time and money) wandering in the land of avatars with no legs.
For GPT it's a whole different story.
Monitoring and using AI for a while
I should first say that I have been following AI for quite some time now, in fact since I took on the role of Innovation Director at Smile in 2018 (after that incredible neopixl adventure I started after my boss told me he didn't believe in mobile). Since then, together with Alain Rouen, Thibault Milan and a bunch of other Innovation adventurers like Fabien Gasser or Patrice Ferlet, we have been interested in NLP (Natural Language Processing). We started with voice assistants like Snips (poke Yann Lechelle), then chatbots like Clevy, and continued our journey with Amazon Web Services (AWS) and AI on the edge using Google in our Warhol experiment, and with LSTM models for our Koda project, among many other projects. In all these projects, we learned several things:
- We could only succeed if we had data. A LOT of data. Data that we could classify either ourselves or using services like AWS Mechanical Turk. This means that AI projects were hard to set up.
- Artificial Intelligence is anything but intelligent. I enjoyed the talk by Dr. Luc JULIA at the GEN conference, where he explained how AI works and why it cannot and won't ever take over human intelligence.
- The industry was not ready. Companies dreaming about AI are (were) fantasizing about how 'intelligent' AI was, and are (were) almost always lacking data, which was very often a show stopper.
With LLMs (Large Language Models) like GPT-3.5 and GPT-4, as used by ChatGPT, the amount of data used to train the model is huge, and the fine-tuning and reinforcement learning conducted by OpenAI have led to amazing results never achieved before. And it is progressing fast, real fast.
But let's put the buzz aside and get back to what I said at the beginning: is this something that will be useful and why?
What is an LLM and how does it work?
I cannot pretend to explain all the concepts in a few words, and there is plenty of great learning material out there. But to make sure people understand our vision at Smile, I'll say this: Large Language Models are a type of AI consisting of a Language Model trained on a very large dataset, which is simply text taken from Wikipedia or from Common Crawl, for instance. A Language Model is a probabilistic model that predicts the next word based on the previous words, which constitute the context. The training happens in three stages. It is first unsupervised. The model is then fine-tuned with supervised training, which means humans answer questions and these question-answer pairs are used to train the model. Finally, reinforcement training is performed: the model generates different contents from a given context, and these contents are rated by humans.
What is really important to understand in all this is that there is no understanding of the text. Generation is based purely on the probability that the generated content is what is expected. For the model, there is no truth, just a most probable sequence of words. This explains why it sometimes generates content that is wrong: there is either not enough training content on the topic, or not enough context provided by the user for the model to generate a correct answer. But because the training dataset is very large, there is a good chance that the generated content is correct.
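To make "a most probable sequence of words" concrete, here is a minimal sketch of a toy language model in Python. It is not how GPT works internally (GPT uses a neural network over tokens, not word counts); it is a simple bigram counter, and the corpus and function names are made up for illustration. It shows the same principle at a tiny scale: the model only knows which word tends to follow which.

```python
from collections import Counter, defaultdict

def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers

def next_word_probabilities(model, word):
    """Turn follower counts into probabilities for the next word."""
    counts = model.get(word, Counter())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}

def generate(model, start, length):
    """Greedy generation: always append the most frequent follower."""
    out = [start]
    for _ in range(length):
        counts = model.get(out[-1])
        if not counts:
            break  # the model has never seen anything after this word
        out.append(counts.most_common(1)[0][0])
    return " ".join(out)

# Made-up training text, standing in for a very large dataset.
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."
model = train_bigram_model(corpus)

# 'cat' follows 'the' in 2 of its 6 occurrences, so it is the most probable next word.
probabilities = next_word_probabilities(model, "the")
print(probabilities)
print(generate(model, "the", 1))  # prints "the cat"
```

Nothing in this model knows whether cats actually sit on mats. It outputs "cat" after "the" only because that pair was the most frequent in the training text, which is why too little training content on a topic leads to wrong output.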
Another important thing to understand is what is called the prompt. The prompt is what you provide to the model to guide it in generating content. It will "weight" the various possible follow-up contents. It can be as simple as a question, but it can also be large chunks of content that add to what the trained model already knows. In fact, you can provide up to 32,000 tokens of context to GPT-4, which represents roughly 24,000 words or 80 pages. Furthermore, ChatGPT keeps the history of the conversation (your prompts and its answers) to generate future answers, which makes the context richer and richer and allows the model to refine its answers. So carefully crafting prompts is very important to guide the model in the right direction, and this is called prompt engineering.
The capabilities of the model are in fact quite broad as well, ranging from summarizing text to code generation or sentiment analysis.
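The relationship between tokens, words and conversation history can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the words-per-token ratio is derived from the figures above (24,000 words for 32,000 tokens), real tokenizers count differently, and dropping the oldest messages first is an assumed strategy, not a description of what ChatGPT actually does. The function names are hypothetical.

```python
import math

WORDS_PER_TOKEN = 0.75  # derived from the figures above: 24,000 words / 32,000 tokens
WORDS_PER_PAGE = 300    # derived from the figures above: 24,000 words / 80 pages

def estimate_tokens(text):
    """Rough token estimate from a word count; a real tokenizer will differ."""
    return math.ceil(len(text.split()) / WORDS_PER_TOKEN)

def build_context(history, new_prompt, max_tokens):
    """Keep the newest messages that fit in the token budget, then add the new prompt.

    Assumption: when the conversation is too long, the oldest messages are dropped first.
    """
    budget = max_tokens - estimate_tokens(new_prompt)
    kept = []
    for message in reversed(history):
        cost = estimate_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    return list(reversed(kept)) + [new_prompt]

# Each three-word message costs about 4 tokens with the ratio above.
history = ["first user prompt", "first model answer", "second user prompt"]
context = build_context(history, "third user prompt", max_tokens=12)
print(context)
```

With a budget of 12 tokens, only the two most recent messages fit next to the new prompt, so the first one silently falls out of the context. This is one reason why what you put in the prompt, and in what order, matters.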
What's in it for Smile and our customers
At first, I must admit, I was not sure this was not just another toy that would once again fall short. But after playing with ChatGPT further and further, I understood that using the chat by itself was not the final goal, but rather one part of a larger toolchain serving a larger purpose.
For instance, you can ask ChatGPT to help you generate prompts for another generative AI such as MidJourney, by teaching GPT which parameters you can pass to an /imagine query and giving it some good examples of MidJourney prompts produced from a given GPT prompt.
Or you can ask ChatGPT to generate or correct source code that you can then copy and paste into your project, either to speed things up or to improve your code.
So far so good. But to what extent could we use ChatGPT to actually help with a business case? As you may recall from the beginning of the article, that is my key criterion when deciding whether or not to engage with a technology at Smile.
Auto-GPT to automate prompt engineering
In fact, what I have learned from all the experiments I have made is that it is quite tricky to write the best prompt to have ChatGPT generate exactly what you expect. Of course there are some tricks, but once you get to more elaborate scenarios, such as having ChatGPT write the source code of a whole project, you have to "prime" the chat in a very specific way, and it can take a very long time for potentially no result.
But then I came across Auto-GPT, and this thing literally blew my mind. Auto-GPT is able to (taken from a Weaviate blog post):
independently define objectives needed to complete the assigned task without (or with reduced) human feedback and intervention. This is because of its ability to chain together thoughts.
Chain of thought is a method that is used to help language models improve their reasoning. It breaks down tasks into the intermediate steps that are needed in order to complete it. The program will then run continuously until it completes these tasks. For example, if it is working on a coding project, it will debug the code as it goes. [...] At configuration, Auto-GPT is given a list of tools such as a code executor, google search API, or a calculator
There, GPT is used as part of a continuous chain of thoughts and actions that corrects itself to achieve a goal on its own. This gets interesting because we move from a chat-oriented interaction with the language model to an automated, almost magical and scary bot that could become part of the tooling we use to implement or improve projects for our customers.
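The idea of chaining thoughts and tool calls can be sketched as a simple loop. This is not Auto-GPT's actual code: `fake_model` is a hypothetical stand-in for a call to a language model, with its decisions hard-coded for one example goal, and the only tool is a tiny calculator. The sketch only shows the shape of the loop: ask the model for the next step, run the chosen tool, feed the result back, and repeat until the model says it is done.

```python
import operator

OPERATIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def calculator(expression):
    """Tool: evaluate a simple 'a op b' arithmetic expression."""
    left, op, right = expression.split()
    return OPERATIONS[op](float(left), float(right))

TOOLS = {"calculator": calculator}

def fake_model(goal, observations):
    """Hypothetical stand-in for an LLM call: picks the next step from what is known so far."""
    if not observations:
        return {"tool": "calculator", "input": "12 * 7"}
    if len(observations) == 1:
        return {"tool": "calculator", "input": f"{observations[0]} + 16"}
    return {"done": True, "answer": observations[-1]}

def run_agent(goal, model, tools, max_steps=10):
    """Loop: ask the model for a step, execute the tool, record the result, repeat."""
    observations = []
    for _ in range(max_steps):
        step = model(goal, observations)
        if step.get("done"):
            return step["answer"]
        result = tools[step["tool"]](step["input"])
        observations.append(result)
    raise RuntimeError("step budget exhausted before the goal was reached")

answer = run_agent("compute 12 * 7 + 16", fake_model, TOOLS)
print(answer)  # prints 100.0
```

The `max_steps` limit is a deliberate design choice in this sketch: a program that runs "continuously until it completes these tasks" needs some budget, otherwise a model that never declares itself done would loop forever.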
Use a privately owned corpus to answer questions
Another use case, which my colleague Alain Rouen explored very recently, was to see whether we could feed private data to GPT so that the model can answer questions based on a knowledge base we provide. This poses quite a big challenge. If you read the ChatGPT documentation carefully, you will see that the data you provide to the chat can be used as training data for the model to improve itself (very recently they made it possible to disable this, but it is enabled by default). This means that if you provide ChatGPT with confidential figures, chances are that this confidential information might end up in an answer sent to another user.
To avoid this kind of situation, he deployed his own instance of the GPT-4 model using the Microsoft Azure OpenAI Service. This way he could avoid any traffic slowdowns on the public API instance and keep chat discussions private (and not used for any retraining purpose). The general idea is then the one explained very well by Dataiku in this excellent blog post:
- Index a collection of documents (it can be anything as long as you can extract text) using embeddings (a representation of a text as a long series of numbers)
- Store these embeddings in a vector database such as Chroma, for instance (other popular vector databases are Weaviate and Pinecone, both of which, interestingly enough, raised quite some money during the last few days...)
- Query the database to find relevant facts
- Ask GPT to generate an answer based on a prompt template that must contain a refined question (also produced by GPT) and any relevant document extracts.
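The steps above can be sketched end to end in plain Python. Everything here is a simplified stand-in and not the setup Alain actually built: the "embedding" is a bag-of-words count instead of a real embedding model, `InMemoryVectorStore` is a hypothetical replacement for a vector database such as Chroma, the documents are made up, and the question refinement step and the final call to GPT are left out. The sketch stops at the prompt that would be sent to the model.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: word counts (a real system would call an embedding model)."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Similarity between two count vectors, from 0.0 (unrelated) to 1.0 (same direction)."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database such as Chroma."""

    def __init__(self):
        self.entries = []

    def add(self, document):
        self.entries.append((embed(document), document))

    def query(self, question, top_k=1):
        q = embed(question)
        ranked = sorted(self.entries, key=lambda e: cosine_similarity(q, e[0]), reverse=True)
        return [doc for _, doc in ranked[:top_k]]

PROMPT_TEMPLATE = (
    "Answer the question using only the extracts below.\n"
    "Extracts:\n{extracts}\n"
    "Question: {question}\n"
)

def build_prompt(store, question, top_k=1):
    """Retrieve the most relevant extracts and place them in the prompt template."""
    extracts = "\n".join(f"- {doc}" for doc in store.query(question, top_k))
    return PROMPT_TEMPLATE.format(extracts=extracts, question=question)

# Made-up private documents, for illustration only.
leave_policy = "the vacation policy grants 25 days of paid leave"
expense_policy = "expense reports are due on the first monday of each month"
store = InMemoryVectorStore()
store.add(leave_policy)
store.add(expense_policy)

prompt = build_prompt(store, "how many days of paid leave")
print(prompt)
```

The point of the design is that the private documents never become part of the model's training data. Only the few extracts relevant to the question travel inside the prompt, which is why the privacy of the deployed instance matters.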
This is well explained in this schema from the blog post: