The Algorithmic Bridge

GPT-4 Rumors From Silicon Valley

People are saying things...

Alberto Romero
Nov 11, 2022
“GPT-4”. Credit: Author via Midjourney

GPT-4 is possibly the most anticipated AI model in history.

In 2020, GPT-3 surprised everyone with a huge performance leap from GPT-2 and set unprecedented expectations for its successor.

But for two years OpenAI has been super shy about GPT-4—letting out info in dribs and drabs and remaining silent for the most part.

Not anymore.

People have been talking in recent months. What I've heard from several sources: GPT-4 is almost ready and will (hopefully) be released sometime between December and February.

Igor Baikov (@Killa_ru), Sep 2, 2022: “OpenAI started to train GPT-4. Release is planned for Dec-Feb.”

A brief review of my GPT-4 predictions

This isn't my first post on GPT-4. Before getting into the new stuff, it's worth briefly revisiting the information we've been given over the past two years and my old predictions.

In May 2021 I wrote an early analysis of GPT-4's likely improvements over GPT-3. There wasn't any public info, so I based my ideas on GPT-3's shortcomings and the general AI trends of the time.

I argued GPT-4 would be significantly larger, better at multitasking, less dependent on good prompting, and equipped with a larger context window than its predecessor.

Although these features sound reasonable today, I look back at them as quite conservative—who would’ve guessed the field would take off as it did since 2020?

In September 2021 I wrote about a possible OpenAI-Cerebras partnership (Cerebras’ WSE-2 is the industry’s largest AI processor—by far). Andrew Feldman, Cerebras’ CEO, told Wired that “from talking to OpenAI, GPT-4 will be about 100 trillion parameters.” People were unsurprisingly excited about this one.

But not much later, Sam Altman, OpenAI’s CEO, denied the 100T GPT-4 rumor in a private Q&A. He asked attendees not to share the info, so I waited until April 2022 to reveal the updates and make a new set of predictions, which I laid out here.

As it turned out, OpenAI was focused on optimizing data and compute (per Chinchilla-like compute-optimal scaling laws) rather than parameter count (GPT-4 would be roughly GPT-3’s size). Also, the model would be text-only and possibly better aligned with human preferences (like InstructGPT).

Much more boring than the 100T model, but more reasonable given the trends that were driving the field at the time.
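For context on what “compute-optimal” implies, here’s a minimal sketch of the widely cited Chinchilla rule of thumb of roughly 20 training tokens per parameter. The figures below are public approximations, not anything OpenAI has confirmed:

```python
# Chinchilla-style rule of thumb: ~20 training tokens per parameter.
# Approximate public figures; nothing here is confirmed about GPT-4.
params = 175e9                # a GPT-3-sized model
gpt3_tokens = 300e9           # tokens GPT-3 was reportedly trained on
optimal_tokens = 20 * params  # ~3.5 trillion tokens under the heuristic

print(f"compute-optimal: ~{optimal_tokens:.1e} tokens "
      f"vs ~{gpt3_tokens:.1e} actually used ({optimal_tokens / gpt3_tokens:.0f}x more)")
```

In other words, at GPT-3’s size the biggest predicted gains come from more (and better) data and compute rather than from more parameters, which is exactly what those April 2022 predictions reflected.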


Well, all that is gone.

Most of this early information (even that which came directly from Altman himself) is outdated. What I’ve read these days is significantly more exciting and, for some, frightening.

Let’s see what public info is being shared in Silicon Valley’s exclusive circles, how plausible it seems, and what I think about it.


An open secret: GPT-4 is almost ready

I first heard OpenAI was giving GPT-4 beta access to a small group close to the company on August 20, when Robert Scoble tweeted these:

Robert Scoble (@Scobleizer), Aug 20, 2022: “A friend has access to GPT-4 and can’t talk about it due to NDAs. Was about to tell me everything about it and then remembered who he was talking to. His emotion alone told me it is next level.” (quote-tweeting Kareem Carr, @kareem_carr: “Finally tried out the GPT-3 model from OpenAI. The green text is unaltered output generated by the AI. https://t.co/NxveJhduGn”)

Robert Scoble (@Scobleizer), later that day: “He said it is just as exciting a Leap as GPT-3 was. Insane.”

This is, of course, anecdotal evidence. It could very well be biased by excitement, cherry-picking, or the lack of a reliable testing method.

But, if GPT-4 is to GPT-3 what GPT-3 was to GPT-2—which isn't at all unreasonable given that OpenAI took its time with this one—this is big news.

Think about it: as models improve, we need larger leaps in performance to feel a similar degree of excitement. Thus, if we assume Scoble’s source relies mainly on perception (in contrast to rigorous scientific assessment), those claims may imply a significantly larger leap than GPT-2 → GPT-3.

That's a >100x improvement: in terms of parameters (a commonly used, if imperfect, metric), GPT-3's ~175 billion is roughly 100 times GPT-2's ~1.5 billion.

On November 8 Scoble did it again:

Robert Scoble (@Scobleizer), Nov 8, 2022: “Disruption is coming. GPT-4 is better than anyone expects. And it is one of several such AIs that will ship next year.”

More hype.

Two days ago, Altman tweeted this not-so-cryptic captioned image:

Sam Altman (@sama), Nov 9, 2022: [image]

An outsider’s anecdotal claim is one thing; the company’s CEO talking about the Turing test as if it were a kids’ game is another. This is interesting for two reasons.

First, the Turing test has historical relevance—it’s a symbol of the limits of machine intelligence. No AI system can pass it reliably, although the most advanced would surely put up a fight. I wouldn’t be surprised if GPT-4 could pass it amply.

The second reason deflates the importance of the first: the Turing test is generally regarded as obsolete. In essence, it’s a test of deception (fooling a person) so an AI could theoretically pass it without possessing intelligence in the human sense. It's also quite narrow, as it's exclusively focused on the linguistic domain.

New proposals could better serve as benchmarks of machine intelligence: the Winograd schema challenge (an updated version of the test built around commonsense pronoun resolution, e.g. deciding what “it” refers to in “the trophy doesn’t fit in the suitcase because it is too big”), unrelated tests like the Coffee test, and the recently suggested embodied Turing test.

Could GPT-4 pass these—arguably more adequate—tests (at least those the model would be naturally prepared to face)? Could GPT-4 force us to reimagine AI benchmarking? Could GPT-4 send us into the fuzzy stage in which discussions over AGI aren’t placed in the future but in the present?

These are the questions we'll have to answer.


As you can tell, neither Scoble nor Altman is sharing (m)any details. Everyone is under NDA. OpenAI doesn’t want killjoys: GPT-4 will be the big news in the field whenever it’s released, and the company plans to, once again, steal the show.

GPT-4: A glimpse of what’s coming

But there's more.

Even with NDAs hiding the good stuff, I had access to a more detailed description of what GPT-4 will (may) be like.

I can’t personally assess the reliability of the source (it was shared in a private subreddit). And neither can Gwern, who shared it (he said: “No idea how reliable”).

It may not be (completely) true but it's the best we've got. Take it with a (big) grain of salt:

Reddit comment screenshot. Credit: Igor Baikov (shared by Gwern)

Baikov highlights these three features:

First, GPT-4 would be very large and sparse (sparsity means that only a fraction of the model’s neurons, or parameters, are active for any given input).

This is surprising given OpenAI’s history of building dense models. GPT-4’s sparsity would make a direct size comparison with the most popular (dense) models, e.g. GPT-3, LaMDA, and PaLM, largely meaningless.

It's nevertheless great news: sparsity is, in my opinion, the future of AI (it takes inspiration from neuroscience; the brain’s activity is also sparse).

Also, if by “colossal” Baikov means ~100T parameters, that’s really big regardless of its sparse nature (other sparse models are on the order of a few trillion parameters, e.g. Switch Transformer, 1.6T, and Wu Dao 2.0, 1.75T).
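To make “sparse” concrete: mixture-of-experts routing is one common way to build sparse Transformers (not necessarily what OpenAI is doing). A router picks a few expert sub-networks per token, so total parameter count and per-token compute become decoupled. A toy sketch with made-up dimensions:

```python
# Toy mixture-of-experts layer: only top_k of n_experts run per token,
# so compute scales with top_k while parameters scale with n_experts.
# Purely illustrative; not based on any disclosed GPT-4 design.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 64, 8, 2

experts = [rng.standard_normal((d_model, d_model)) * 0.02 for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) * 0.02

def sparse_ffn(token):
    """Route one token vector to its top_k experts and mix their outputs."""
    logits = token @ router                      # (n_experts,)
    top = np.argsort(logits)[-top_k:]            # indices of the selected experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                     # softmax over the selected experts
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

print(sparse_ffn(rng.standard_normal(d_model)).shape)  # (64,)
```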

Second, GPT-4 would be multimodal, accepting text, audio, image, and possibly video inputs. Given the already high ability of language models and this year’s wave of audiovisual generative AI (where there's still so much to explore), it makes sense to continue down this avenue.

Like sparsity, multimodality is the future of AI—not just because our brain is multisensory but because the world is multimodal.
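One common way to make a single model multimodal (hypothetical here, not a description of GPT-4) is to map every modality into the same embedding space and let the Transformer attend over one mixed sequence. A rough sketch:

```python
# Illustrative only: project text tokens and image patches into a shared
# embedding space and concatenate them into one sequence for the backbone.
import numpy as np

rng = np.random.default_rng(0)
d_model = 64

def embed_text(token_ids):
    vocab = rng.standard_normal((1000, d_model)) * 0.02   # toy embedding table
    return vocab[token_ids]                               # (n_tokens, d_model)

def embed_image(pixels, patch=8):
    h, w, c = pixels.shape
    patches = pixels.reshape(h // patch, patch, w // patch, patch, c)
    patches = patches.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * c)
    proj = rng.standard_normal((patch * patch * c, d_model)) * 0.02
    return patches @ proj                                 # (n_patches, d_model)

sequence = np.concatenate([
    embed_text([1, 42, 7]),                # 3 text tokens
    embed_image(rng.random((32, 32, 3))),  # 16 image patches
])
print(sequence.shape)  # (19, 64): one sequence the model attends over
```

Audio and video could, in principle, enter the same sequence through their own encoders.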

Third, a training cost of “~$.e6” (i.e. somewhere in the $1-10M range) would be significantly lower than GPT-3’s.

This would mean OpenAI found a way to reduce costs. A few possibilities here: better optimization at the software level, faster chips, and/or less computing power used.

A more intriguing option would be the aforementioned partnership between OpenAI and Cerebras to train GPT-4 on the CS-2 (powered by the WSE-2), freeing developers from distributed-training optimization heuristics. Cerebras’ CS-2 is so large that there’s no need for such performance-hindering tricks, not even to train a “colossal” model.
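For a sense of why a figure in the $1-10M band is plausible for a GPT-3-scale run, here’s a rough back-of-envelope sketch using the common 6 × parameters × tokens FLOPs approximation. Every hardware and pricing number below is an assumption, not a reported figure:

```python
# Back-of-envelope training cost: FLOPs ≈ 6 * N_params * N_tokens.
# Hardware throughput and pricing are assumptions for illustration only.
params = 175e9                      # GPT-3-scale model
tokens = 300e9                      # GPT-3's reported training tokens
flops = 6 * params * tokens         # ~3.15e23 FLOPs

flops_per_gpu_s = 150e12            # assumed ~150 TFLOP/s sustained per modern GPU
usd_per_gpu_hour = 2.0              # assumed cloud price per GPU-hour

gpu_hours = flops / flops_per_gpu_s / 3600
print(f"~{flops:.2e} FLOPs, ~{gpu_hours:,.0f} GPU-hours, ~${gpu_hours * usd_per_gpu_hour:,.0f}")
# ≈ $1.2M under these assumptions; lower utilization or pricier hardware
# pushes the estimate toward the upper end of the rumored $1-10M range.
```

Training on a larger, Chinchilla-style token budget would scale the FLOPs (and cost) up proportionally, which is one reason the rumored range is so wide.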

My thoughts on all this

When Sam Altman revealed in September last year that GPT-4 would be roughly GPT-3’s size and text-only, and that the focus was on optimizing compute and data, I (like many others) felt a pang of disappointment.

You know I'm of the opinion we should progress mindfully, but I’m excited about AI’s future—and another “GPT-3” wasn’t going to cut it.

The news I just shared with you tells a whole different story.

Of course, OpenAI’s execs and their friends (the only sources) have incentives to hype GPT-4 (if anything, OpenAI will need more money to get to AGI). Whether this hype-inducing news is true is yet to be seen.

What I think: I’m always skeptical, but also open-minded. I can’t vouch for these sources first-hand, but I don’t dismiss them either. I don’t know what’s true and what’s an exaggeration—or even plain false—but nothing I’ve shared is too wild as to be unbelievable.

OpenAI has changed course with GPT-4 a few times throughout these two years, so everything is in the air. We’ll have to wait until early 2023—which promises to be another great year for AI.


32 Comments
Fred Hapgood
Nov 12, 2022 (edited)

>> I wouldn’t be surprised if GPT-4 could pass it (the Turing test) amply.

If this is true -- with the provisos of a) doing so as well as humans with an IQ of at least 100, and b) at a cost level of not more than $10/hour -- this will be one of the great technological revolutions in history. The applications are virtually infinite. *If it is true* society needs to start thinking about the consequences immediately. Because they will be enormous.

Dan Greller
Nov 11, 2022

The marketplace has barely absorbed and leveraged the capabilities of existing SOTA models such as GPT-3. If GPT-4 turns out to be a quantum leap in capability, that gap will only widen. VCs, start-ups, and incumbents will be scrambling to implement these untapped abilities. It will be interesting to see how much GPT-4 simplifies the ability to use it, accelerating the innovation that can be quickly rolled out.
