AI Unicorn Anthropic Releases Claude 3, A Model It Claims Can Beat OpenAI’s Best

Forbes

Anthropic today announced a new series of large language models that the artificial intelligence company claims are the world’s most intelligent to date, outperforming rival offerings from OpenAI and Google. Read the full story on Forbes: https://www.forbes.com/sites/alexkonrad/2024/03/04/anthropic-releases-claude-3-claims-beat-openai/?sh=59619f0157bc

Transcript

00:00 Here's your Forbes daily briefing for Wednesday, March 6th.

00:05 Today on Forbes, AI unicorn Anthropic releases Claude 3, a model it claims can beat OpenAI's

00:14 best.

00:15 On Monday, Anthropic announced a new series of large language models that the artificial

00:20 intelligence company claims are the world's most intelligent to date, outperforming rival

00:26 offerings from OpenAI and Google.

00:29 Called Claude 3, Anthropic's new model family comes in three versions, Opus, Sonnet, and

00:36 Haiku, that vary by performance and price.

00:40 The company said that Opus, the most powerful and most expensive version to run, outperformed

00:46 OpenAI's GPT-4 and Google's Gemini 1.0 Ultra across a series of benchmarks that measure

00:53 intelligence.

00:55 Both Opus and Sonnet, the mid-tier offering, were made available Monday, while Haiku will

01:00 be released at a later announced date.

01:03 In an interview, co-founder and CEO Dario Amodei said the model family was designed

01:09 with different business use cases in mind.

01:11 He said, "Claude 3 Opus is, at least according to the evaluations, in many respects the best-performing

01:18 model in the world across a range of tasks."

01:23 On a number of popular test subjects, including undergraduate-level general knowledge, grade

01:27 school math, computer code, and question and answers knowledge, Claude 3 Opus outperformed

01:33 OpenAI's GPT-4 and Google's Gemini 1.0 Ultra, this according to the benchmarks the company

01:39 shared.

01:41 On the general knowledge benchmark, Claude 3 Opus also outperformed Mistral Large, the top-line

01:47 release model from open-source AI unicorn Mistral, released last week.

01:52 The version of Claude 3 that most users will see, however, Claude 3 Sonnet, performed more

01:58 on par with GPT-4, ahead on some benchmarks, behind on others.

02:03 And Amodei conceded that Anthropic's benchmarks did not factor in recent updates from OpenAI's

02:09 GPT-4 Turbo and Google's Gemini 1.5 Pro, as their peers have not yet published corresponding

02:15 test evaluations.

02:17 But Amodei said, "I would be surprised if we did not perform competitively."

02:24 Amodei and co-founder and sister Daniela Amodei told Forbes they expect Opus to be used by

02:29 businesses that need the most cutting-edge performance for functions like complex data

02:33 analysis and biomedical research.

02:37 Formed by seven researchers who quit OpenAI, Anthropic has historically aimed to separate

02:42 itself from its progenitor and other companies in the field through a deeper focus on AI

02:47 safety.

02:49 Some industry insiders have wondered if this has slowed the company down and questioned

02:53 its model performance in recent months, including on social media.

02:57 On a popular crowdsourced leaderboard of human evaluators, Claude 1 currently carries a higher

03:02 rating than its successors, Claude 2.0 and the updated Claude 2.1.

03:08 Dario Amodei shrugged off those ratings as just one human-based evaluation of a finite

03:13 number of consumer tasks.

03:15 He conceded that while Claude 2 was safer than its predecessor in a way that satisfied Anthropic's

03:20 researchers, that came at the cost of higher so-called "incorrect refusals," or rejections

03:25 of prompts that the model believed came too close to its safety guardrails.

03:30 Anthropic claimed that the Claude 3 family performs much better than predecessors in not serving

03:35 those rejections.

03:37 Harmless prompts close in content to its safety limits are refused about 10% of the time,

03:42 compared to 25% for Claude 2.1.

03:46 Amodei said, "Now we're making progress towards more balance between the two, something that

03:51 gets the best of both worlds.

03:53 It's really hard to draw a complex boundary in the right way.

03:56 We're always trying to do that better."

03:59 While companies like Inflection, Character.ai, and even OpenAI have ventured further into

04:04 consumer use cases, Anthropic is focusing on business customers.

04:09 Users of its free consumer chatbot, also called Claude, will now get access to Sonnet, while

04:15 individuals looking to try Opus will need to subscribe to its $20 per month paid version.

04:20 But again, Daniela Amodei reiterated that Claude 3's releases were made more with business

04:25 use cases in mind.

04:28 Claude customers include tech companies GitLab, Notion, Quora, and Salesforce, which is an

04:34 Anthropic investor, financial giant Bridgewater, and conglomerate SAP, as well as business

04:39 research portal LexisNexis, telco SK Telecom, and the Dana-Farber Cancer Institute.

04:47 For full coverage, check out Alex Konrad and Kenrick Cai's piece on Forbes.com.

04:53 This is Kieran Meadows from Forbes.

04:56 Thanks for tuning in.

04:57 [MUSIC PLAYING]
