Anthropic today announced a new series of large language models that the artificial intelligence company claims are the world’s most intelligent to date, outperforming rival offerings from OpenAI and Google.

Read the full story on Forbes: https://www.forbes.com/sites/alexkonrad/2024/03/04/anthropic-releases-claude-3-claims-beat-openai/?sh=59619f0157bc

Forbes Daily Briefing shares the best of Forbes reporting on wealth, business, entrepreneurship, leadership and more. Tune in every day, seven days a week, to hear a new story. Subscribe here: https://art19.com/shows/forbes-daily-briefing


Forbes covers the intersection of entrepreneurship, wealth, technology, business and lifestyle with a focus on people and success.

Category

🤖
Tech
Transcript
00:00 Here's your Forbes daily briefing for Wednesday, March 6th.
00:05 Today on Forbes, AI unicorn Anthropic releases Claude 3, a model it claims can beat OpenAI's
00:14 best.
00:15 On Monday, Anthropic announced a new series of large language models that the artificial
00:20 intelligence company claims are the world's most intelligent to date, outperforming rival
00:26 offerings from OpenAI and Google.
00:29 Called Claude 3, Anthropic's new model family comes in three versions, Opus, Sonnet, and
00:36 Haiku, that vary by performance and price.
00:40 The company said that Opus, the most powerful and most expensive version to run, outperformed
00:46 OpenAI's GPT-4 and Google's Gemini 1.0 Ultra across a series of benchmarks that measure
00:53 intelligence.
00:55 Both Opus and Sonnet, the mid-tier offering, were made available Monday, while Haiku will
01:00 be released at a later announced date.
01:03 In an interview, co-founder and CEO Dario Amodei said the model family was designed
01:09 with different business use cases in mind.
01:11 He said, "Claude 3 Opus is, at least according to the evaluations, in many respects the best-performing
01:18 model in the world across a range of tasks."
01:23 On a number of popular test subjects, including undergraduate-level general knowledge, grade
01:27 school math, computer code, and question-and-answer knowledge, Claude 3 Opus outperformed
01:33 OpenAI's GPT-4 and Google's Gemini 1.0 Ultra, this according to the benchmarks the company
01:39 shared.
01:41 On the general knowledge benchmark, Claude 3 Opus also outperformed Mistral Large, the top-line
01:47 release model from open-source AI unicorn Mistral, released last week.
01:52 The version of Claude 3 that most users will see, however, Claude 3 Sonnet, performed more
01:58 on par with GPT-4, ahead on some benchmarks, behind on others.
02:03 And Amodei conceded that Anthropic's benchmarks did not factor in recent updates from OpenAI's
02:09 GPT-4 Turbo and Google's Gemini 1.5 Pro, as their peers have not yet published corresponding
02:15 test evaluations.
02:17 But Amodei said, "I would be surprised if we did not perform competitively."
02:24 Amodei and co-founder and sister Daniela Amodei told Forbes they expect Opus to be used by
02:29 businesses that need the most cutting-edge performance for functions like complex data
02:33 analysis and biomedical research.
02:37 Formed by seven researchers who quit OpenAI, Anthropic has historically aimed to separate
02:42 itself from its progenitor and other companies in the field through a deeper focus on AI
02:47 safety.
02:49 Some industry insiders have wondered if this has slowed the company down and questioned
02:53 its model performance in recent months, including on social media.
02:57 On a popular crowdsourced leaderboard of human evaluators, Claude 1 currently carries a higher
03:02 rating than its successors, Claude 2.0 and the updated Claude 2.1.
03:08 Dario Amodei shrugged off those ratings as just one human-based evaluation of a finite
03:13 number of consumer tasks.
03:15 He conceded that while Claude 2 was safer than its predecessor in a way that satisfied Anthropic
03:20 researchers, that came at the cost of higher so-called "incorrect refusals," or rejections
03:25 of prompts that the model believed came too close to its safety guardrails.
03:30 Anthropic claimed that the Claude 3 family performs much better than predecessors in not serving
03:35 those rejections.
03:37 Harmless prompts close in content to its safety limits are refused about 10% of the time,
03:42 compared to 25% for Claude 2.1.
03:46 Amodei said, "Now we're making progress towards more balance between the two, something that
03:51 gets the best of both worlds.
03:53 It's really hard to draw a complex boundary in the right way.
03:56 We're always trying to do that better."
03:59 While companies like Inflection, Character.ai, and even OpenAI have ventured further into
04:04 consumer use cases, Anthropic is focusing on business customers.
04:09 Users of its free consumer chatbot, also called Claude, will now get access to Sonnet, while
04:15 individuals looking to try Opus will need to subscribe to its $20 per month paid version.
04:20 But again, Daniela Amodei reiterated that Claude 3's releases were made more with business
04:25 use cases in mind.
04:28 Claude customers include tech companies GitLab, Notion, Quora, and Salesforce, which is an
04:34 Anthropic investor, financial giant Bridgewater, and conglomerate SAP, as well as business
04:39 research portal LexisNexis, telco SK Telecom, and the Dana-Farber Cancer Institute.
04:47 For full coverage, check out Alex Konrad and Kenrick Cai's piece on Forbes.com.
04:53 This is Kieran Meadows from Forbes.
04:56 Thanks for tuning in.
04:57 [MUSIC PLAYING]
