The Entrepreneurs Weekly
Friday, July 25, 2025
Model From OpenAI Rival Anthropic Shows ‘Metacognition’: Report | Entrepreneur

by Brand Post
March 7, 2024
in Business


A developer at Anthropic, an OpenAI rival reportedly in talks to raise $750 million in funding, revealed this week that the company's latest AI model appears to recognize when it is being tested.

The capability, which has never before been seen publicly, sparked a conversation about “metacognition” in AI, or the potential for AI to monitor what it is doing and one day even self-correct.

Anthropic announced three new models: Claude 3 Sonnet and Claude 3 Opus, which are available to use now in 159 countries, and Claude 3 Haiku, which will be “available soon.” The Opus model, which packs in the most powerful performance of the three, was the one that appeared to display a type of metacognition in internal tests, according to Anthropic prompt engineer Alex Albert.

“Fun story from our internal testing on Claude 3 Opus,” Albert wrote on X, formerly Twitter. “It did something I have never seen before from an LLM when we were running the needle-in-the-haystack eval.”

For background, this tests a model’s recall ability by inserting a target sentence (the “needle”) into a corpus of… pic.twitter.com/m7wWhhu6Fg

— Alex (@alexalbert__) March 4, 2024

The evaluation involves placing a sentence (the “needle”) into a “haystack” of random, unrelated documents and then asking the AI about information contained only in the needle sentence.
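As a rough illustration of that setup, here is a minimal Python sketch. It is not Anthropic's evaluation code; the function names, the prompt wording, and the keyword-based scoring are all assumptions made for the example.

```python
from typing import List


def build_haystack(documents: List[str], needle: str, depth: float) -> str:
    """Insert the needle among the documents at a relative depth.

    depth=0.0 puts the needle first, depth=1.0 puts it last.
    (Hypothetical helper, not part of any published eval harness.)
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    # Insert between whole documents so no document is split in half.
    position = int(depth * len(documents))
    parts = documents[:position] + [needle] + documents[position:]
    return "\n\n".join(parts)


def build_prompt(haystack: str, question: str) -> str:
    """Combine the haystack with a question only the needle can answer."""
    # The instruction wording here is an assumption for illustration.
    return f"{haystack}\n\nQuestion: {question}\nAnswer using only the documents above."


def recalled(answer: str, expected_phrase: str) -> bool:
    """Crude scoring: did the model's answer mention the needle's key phrase?"""
    return expected_phrase.lower() in answer.lower()
```

To use it, one would call `build_haystack` with a list of unrelated documents and a made-up needle sentence, send the result of `build_prompt` to the model under test, and pass the model's reply to `recalled`. Repeating this across several `depth` values shows whether recall depends on where the needle sits.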

“When we ran this test on Opus, we noticed some interesting behavior – it seemed to suspect that we were running an eval on it,” Albert wrote.

According to Albert, Opus went beyond what the test was asking for by noticing that the needle sentence looked remarkably different from the rest of the documents. The AI hypothesized that the researchers were conducting a test, or that the fact they asked about might have been inserted as a joke.


“This level of meta-awareness was very cool to see,” Albert wrote.

Users on X had mixed feelings about Albert’s post, with American psychologist Geoffrey Miller writing, “That fine line between ‘fun story’ and ‘existentially terrifying horrorshow.’”

AI researcher Margaret Mitchell wrote: “That’s fairly terrifying, no?”

Anthropic is the first to publicly speak about this particular kind of AI capability in internal tests.

According to Bloomberg, the company tried to cut hallucinations, or incorrect or misleading results, in half with its latest Claude rollout and inspire user trust by having the AI cite its sources.

Anthropic stated that Claude Opus “outperforms its peers” when compared to OpenAI’s GPT-4 and GPT-3.5 and Google’s Gemini 1.0 Ultra and 1.0 Pro. According to Anthropic, Opus shows “near-human” levels of understanding and fluency on tasks like solving math problems and reasoning on a graduate-school level.


Google made similar comparisons when it launched Gemini in December, placing Gemini Ultra alongside OpenAI’s GPT-4 and showing that Ultra’s performance surpassed GPT-4’s results on 30 of 32 academic benchmark tests.

“With a score of 90.0%, Gemini Ultra is the first model to outperform human experts on MMLU (massive multitask language understanding), which uses a combination of 57 subjects such as math, physics, history, law, medicine and ethics for testing both world knowledge and problem-solving abilities,” Google stated in a blog post.








© 2024 Entrepreneurs Weekly.  All Rights Reserved.
