Editor’s note: Marshall Brain – futurist, inventor, NCSU professor, writer and creator of “How Stuff Works” – is a contributor to WRAL TechWire. Brain takes a serious as well as entertaining look at a world of possibilities for Earth and the human race. He’s also author of “The Doomsday Book: The Science Behind Humanity’s Greatest Threats.”

Note to readers: WRAL TechWire would like to hear from you about views expressed by our contributors. Please send email to: info@wraltechwire.com.

+++

RALEIGH – ChatGPT was released on November 30, 2022. To use it, we are required to create an account with our name, email address, and phone number. Within about two months, ChatGPT had an estimated 100 million monthly active users, making it the fastest-growing consumer application on record:

ChatGPT sets record for fastest-growing user base

“ChatGPT, the popular chatbot from OpenAI, is estimated to have reached 100 million monthly active users in January, just two months after launch, making it the fastest-growing consumer application in history, according to a UBS study on Wednesday.”

ChatGPT was able to grow so quickly through a combination of word-of-mouth between friends, social media mentions, and high-intensity media coverage.

This is just one tiny measure of the explosion that ChatGPT has caused on the Internet. In just two months, ChatGPT’s capabilities, and the popularity they produced, have pushed some of the largest companies in the world to change direction.

If you have not tried out ChatGPT for yourself yet and seen what it can do, I would encourage you to read the following article and give it a try. It will help you understand why ChatGPT is so incredibly popular:

Taking a deep dive into AI ChatGPT: Here’s 20 examples of how you can make it useful

Let’s take a look at everything that is happening in the wake of ChatGPT’s explosion.

Microsoft makes the first move

Very soon after ChatGPT started taking off, Microsoft became one of the first tech giants to make a public move, with a reported $10 billion investment in OpenAI, the creator of ChatGPT:

Microsoft reportedly plans to invest $10 billion in creator of buzzy A.I. tool ChatGPT

Then, just a few weeks later, Microsoft announced how it would integrate ChatGPT’s features into its Bing search engine and its Edge browser, as Microsoft CEO Satya Nadella explains in this WSJ demo video: https://www.youtube.com/watch?v=bsFXgfbj8Bc

 

He starts with: “Bing with AI is going to completely change what people can expect from search.” And then the demo shows what he means. Toward the end he is quite frank about one goal: he wants Bing to take some market share away from Google, which dominates the search engine marketplace. Therefore…

Google announces Bard and several other things

Back in December Google looked at ChatGPT and went on red alert:

Google’s management has reportedly issued a ‘code red’ amid the rising popularity of the ChatGPT AI

“Sundar Pichai, the CEO of Google and its parent company, Alphabet, has participated in several meetings around Google’s AI strategy and directed numerous groups in the company to refocus their efforts on addressing the threat that ChatGPT poses to its search-engine business… In particular, teams in Google’s research, trust, and safety division, among other departments, have been directed to switch gears to assist in the development and launch of AI prototypes and products, The Times reported.”

A month later Google CEO Sundar Pichai announced the company’s new AI chat product, called Bard:

An important next step on our AI journey

“Two years ago we unveiled next-generation language and conversation capabilities powered by our Language Model for Dialogue Applications (or LaMDA for short). We’ve been working on an experimental conversational AI service, powered by LaMDA, that we’re calling Bard. And today, we’re taking another step forward by opening it up to trusted testers ahead of making it more widely available to the public in the coming weeks. Bard seeks to combine the breadth of the world’s knowledge with the power, intelligence and creativity of our large language models.”

LaMDA, you may recall, made huge headlines last June when a Google employee named Blake Lemoine announced that he felt LaMDA to be sentient. See this video for details: https://www.youtube.com/watch?v=kgCUn4fQTsc

The “general public” never really got to see LaMDA. It was more of an experimental system for use inside of Google. Bard will be something of a coming out party for LaMDA.

But Bard is not the only thing Google did. Google owns a company named DeepMind, acquired in 2014. DeepMind announced Sparrow:

Google To Unveil It’s ChatGPT Killer AI ‘Sparrow’ Soon; All You Need To Know

DeepMind also announced its chatbot for medicine, called Med-PaLM, in January. Its performance is impressive:

Google and DeepMind just launched MedPaLM, a large language model aligned to the medical domain

“A group of healthcare professionals determined that 92.6 percent of the Med-PaLM responses were on par with clinician-generated answers (92.9 percent).”

Then Google also invested hundreds of millions of dollars in another company named Anthropic, which has its own chatbot named Claude:

Google invests almost $400 million in ChatGPT rival Anthropic

“Founded in 2021 by former leaders of OpenAI, including siblings Daniela and Dario Amodei, Anthropic AI in January released a limited test of a new chatbot named Claude to rival to OpenAI’s wildly popular ChatGPT.”

LaMDA, Bard, Sparrow, Med-PaLM, Claude… It will be fascinating to watch how all of this shakes out on Google’s side in 2023.

What about China?

The dominant search engine in China is called Baidu. Not to be outdone, Baidu has announced Ernie Bot (“Enhanced Representation through Knowledge Integration”):

Baidu leaps to 11-month high as it reveals plan to launch ChatGPT-style ‘Ernie Bot’

“It has since evolved into a series of advanced big models that are capable of handling a wide range of tasks, such as language understanding, language generation, and text-to-image generation”

That last part is interesting because Ernie Bot appears to be a combination of text generation and image generation in one system. We saw steps in this direction back in 2021, when a Chinese system named Wu Dao 2.0 combined the two and also used massive amounts of training data:

GPT-3 Scared You? Meet Wu Dao 2.0: A Monster of 1.75 Trillion Parameters

“Wu Dao — which means Enlightenment — is another GPT-like language model. Jack Clark, OpenAI’s policy director, calls this trend of copying GPT-3, “model diffusion.” Yet, among all the copies, Wu Dao 2.0 holds the record of being the largest of all with a striking 1.75 trillion parameters (10x GPT-3). Coco Feng reported for South China Morning Post that Wu Dao 2.0 was trained on 4.9TB of high-quality text and image data, which makes GPT-3’s training dataset (570GB) pale in comparison. Yet, it’s worth noting OpenAI researchers curated 45TB of data to extract clean those 570GB.” 

The fact that these massive combined models are becoming public now should shake things up even more.

What about Apple and Facebook (Meta)?

Facebook parent company Meta did release a public chatbot prior to ChatGPT. It is called Galactica. But almost immediately Galactica became private again. Galactica focuses on research papers, meaning it has less training data, but the data is of higher quality. Why was it taken down?

Meta’s New Large Language Model Galactica Pulled Down Three Days After Launch

“On Tuesday last week, a demo of Meta’s much touted artificial intelligence model Galactica was unveiled to the public. Before the weekend, however, the tool was taken down following its tendency to generate what can only be termed gibberish and its inability to solve some rudimentary mathematical questions accurately. Galactica is Meta’s initiative to streamline scientific research discovery and ease scientific text generation. In a paper, Meta termed it as a tool to organize science and disclosed that it is trained on 48 million papers, textbooks and lecture notes, millions of compounds and proteins, scientific websites, and encyclopedias totaling 120 billion parameters.”

So it sounds like Galactica was not ready for prime time when released, and it accumulated a lot of bad press very quickly.

Apple has been dabbling in AI for some time, and even includes a “neural engine” as part of the chips that power its smartphones. The neural engine helps to speed up AI processing. But Apple does not have a chatbot anywhere on the horizon yet. See this recent statement (https://www.businessinsider.com/apple-tim-cook-ai-artificial-intelligence-affect-products-services-2023-2) from Tim Cook.

Open source competition to ChatGPT

One problem with everything described above is that all of it is “Big Tech”. And Big Tech can have a tendency to play things safe. For example, there is a report from one early user of the Bing chatbot indicating that the chatbot will not create cover letters for people applying for jobs. ChatGPT is happy to create cover letters, but for how long? There are many things ChatGPT would do in December that it will not do today. It seems like ChatGPT becomes more restrictive with each passing moment.

The counter to Big Tech would be open source chatbots. One of these efforts is called Open Assistant at https://open-assistant.io. You can see a description of this project’s goals in this video: https://www.youtube.com/watch?v=64Izfm24FKA

 

Another is called BLOOM:

“Large language models (LLMs) have made a significant impact on AI research. These powerful, general models can take on a wide variety of new language tasks from a user’s instructions. However, academia, nonprofits and smaller companies’ research labs find it difficult to create, study, or even use LLMs as only a few industrial labs with the necessary resources and exclusive rights can fully access them. Today, we release BLOOM, the first multilingual LLM trained in complete transparency, to change this status quo — the result of the largest collaboration of AI researchers ever involved in a single research project. With its 176 billion parameters, BLOOM is able to generate text in 46 natural languages and 13 programming languages. For almost all of them, such as Spanish, French and Arabic, BLOOM will be the first language model with over 100B parameters ever created. This is the culmination of a year of work involving over 1000 researchers from 70+ countries and 250+ institutions, leading to a final run of 117 days (March 11 – July 6) training the BLOOM model on the Jean Zay supercomputer in the south of Paris, France thanks to a compute grant worth an estimated €3M from French research agencies CNRS and GENCI.”

Today, BLOOM is not as good as ChatGPT in terms of its responses, but it may be just a matter of time until the two reach parity.

Where did all these chatbots come from?

We have discussed many different systems in this article:

  • GPT-3
  • ChatGPT
  • Bing
  • LaMDA
  • Bard
  • Sparrow
  • Med-PaLM
  • Claude
  • Ernie Bot
  • Wu Dao 2.0
  • Galactica
  • Open Assistant
  • BLOOM

They all work in approximately the same way. How can all of this AI innovation be happening at once? The reason is that a scientific paper published in 2017 laid out the core technology used in all of them:

Attention Is All You Need

It introduces the concept of a Transformer, which you can learn more about here: https://en.wikipedia.org/wiki/Transformer_(machine_learning_model) and here: https://www.youtube.com/watch?v=ZXiruGOCn9s
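The central operation in a Transformer is called attention. As a rough illustration, here is a minimal sketch of scaled dot-product attention in plain Python. The function names and the toy numbers are made up for this example, and a real model adds learned weights, many attention "heads" and many stacked layers on top of this idea.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Score this query against every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is a weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs

# Toy input: three made-up 2-dimensional "token" vectors (illustrative only).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = attention(tokens, tokens, tokens)
```

Each row of `result` is a blend of all three token vectors, weighted by how strongly that token "attends" to the others. That blending is how the model lets every word in a sentence take the other words into account.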

In a decade, we will probably look back on this paper in the same way we look back on Einstein’s papers describing special and general relativity. Everything we see happening with chatbots today starts with this paper.

Next, a chatbot needs a huge amount of training data. Two common starting points today are called the Common Crawl and the Pile. Imagine a robot that spidered the entire internet and saved all the text it found in a text file. It would be a massive amount of text. The Common Crawl and the Pile maintain storehouses of text like this for anyone to use.
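To picture what "spidering" means, here is a small sketch of a crawler that follows links and collects text. It runs against a made-up, in-memory set of two pages (`FAKE_WEB`) rather than the real internet, and the class and function names are hypothetical. It is not how the Common Crawl or the Pile are actually built, only the general idea.

```python
from html.parser import HTMLParser

class TextAndLinks(HTMLParser):
    """Collect visible text and outgoing links from one HTML page."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Remember where each <a href="..."> link points.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Keep any non-empty text found between tags.
        if data.strip():
            self.text_parts.append(data.strip())

# A made-up, in-memory "web" so the sketch runs without network access.
FAKE_WEB = {
    "/home": '<p>Welcome to the home page.</p><a href="/about">About</a>',
    "/about": '<p>This page is about chatbots.</p><a href="/home">Home</a>',
}

def crawl(start):
    """Visit every reachable page once and return the text of each page."""
    seen, queue, corpus = set(), [start], []
    while queue:
        url = queue.pop(0)
        if url in seen or url not in FAKE_WEB:
            continue
        seen.add(url)
        parser = TextAndLinks()
        parser.feed(FAKE_WEB[url])
        parser.close()
        corpus.append(" ".join(parser.text_parts))
        queue.extend(parser.links)  # follow links found on this page
    return corpus

corpus = crawl("/home")
```

After the crawl, `corpus` holds one string of text per page visited. A real crawl does the same thing across billions of pages, which is where the terabytes of training text come from.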

Then each vendor can add to or customize the data.

GPT-3 is essentially this combination. OpenAI took the Transformer design described in the “Attention Is All You Need” paper and trained it on the Common Crawl or similar data.

ChatGPT’s innovation is that OpenAI took GPT-3.5 and then trained it to be more human-like. They gave the model thousands of sample queries that a human might ask, along with the responses that humans might want to receive. This makes ChatGPT sound much more human-like.
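As a rough picture of what that kind of training data could look like, here is a small sketch that pairs sample queries with preferred responses and joins each pair into one training string. The example pairs, the "User:" and "Assistant:" labels and the function name are all assumptions made for illustration. OpenAI's actual data and format are not described in this article.

```python
# Hypothetical demonstration data: queries a person might type, paired with
# the kind of response a human reviewer would consider helpful.
examples = [
    {
        "prompt": "Explain what a search engine does.",
        "response": "A search engine finds web pages that match your query "
                    "and ranks them by relevance.",
    },
    {
        "prompt": "Write a one-line greeting for a new coworker.",
        "response": "Welcome to the team, we are glad you are here!",
    },
]

def to_training_text(example):
    """Join one query/response pair into a single training string."""
    return f"User: {example['prompt']}\nAssistant: {example['response']}"

training_texts = [to_training_text(e) for e in examples]

for text in training_texts:
    print(text)
    print("---")
```

The model is then trained further on strings like these, so that when it sees a new query it continues with the style of answer the human-written examples demonstrate.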

Since then, OpenAI has been using all of us – all of the users of ChatGPT today – to gather data and help refine the model even more.

The only real barrier to creating these chatbots is the large amount of computation needed to train the model and then run the model. The cost of this computation can run into the millions of dollars.

Since nothing that ChatGPT is doing is really hidden or secret, it is easy for lots of people to copy it.  It will be fascinating to see how many different chatbots there are vying for our attention by the end of 2023.

Who is going to win?

I gave this query to ChatGPT: “Tell me a story about ChatGPT beating out Google.”

Here is the story ChatGPT created:

“Once upon a time, in a world where technology dominated everything, there lived a language model known as ChatGPT. ChatGPT was created by OpenAI, a company known for developing cutting-edge artificial intelligence systems.

 One day, a competition was announced where the best search engine would be crowned the winner. Google, the giant of the search engine world, was confident that it would win easily. However, ChatGPT was determined to give it a run for its money.

 As the competition began, Google’s algorithms scanned through millions of pages in a matter of seconds to provide answers to the search queries. ChatGPT, on the other hand, used its advanced language processing skills to understand the context and intent behind each query. It then used this understanding to provide highly relevant and accurate answers.

 As the competition progressed, people started to take notice of ChatGPT’s abilities. Its answers were not only faster but also more insightful and personalised than Google’s. ChatGPT had an answer for every question, and its answers were so good that people were amazed.

 Soon, ChatGPT’s popularity skyrocketed, and it became the go-to search engine for millions of people. Google was left in the dust, and it was clear that ChatGPT was the winner of the competition.

 The news spread like wildfire, and soon everyone was talking about ChatGPT and its incredible abilities. OpenAI was hailed as the company that had created the future of search engines, and ChatGPT became a household name.

 From that day on, ChatGPT continued to improve and innovate, and it remained the top search engine for years to come. Google, on the other hand, was left to play catch up, but it could never quite match ChatGPT’s brilliance.

 And that, my friend, is the story of how ChatGPT beat out Google and became the king of search engines.”

It is an interesting future to ponder.

Sources