Anthropic rolls out Claude 3, says it outperforms generative AI rivals


Dive Brief:

Anthropic is launching its latest large language model family, Claude 3, which it says outperforms its generative AI rivals, according to a blog post Monday.
Claude 3 Opus and Sonnet, the two more advanced models in the set, are now available via the Claude API as well as the online chatbot service claude.ai. Anthropic said the most compact Claude 3 model, called Haiku, will become available soon.
According to Anthropic’s research, the family of models shows stronger analysis, forecasting, code generation and content creation capabilities than competing models. Claude 3 Opus scored 84.9% on the HumanEval coding test, ahead of Google’s Gemini Ultra at 74.4% and OpenAI’s GPT-4 at 67%.

Dive Insight:

As newer versions of LLMs hit the enterprise technology market, it’s up to tech leaders to sidestep the hype and assess their capabilities.

Anthropic launched Claude a year ago and quickly followed up with Claude 2 in July as the startup worked to enhance the models’ capabilities and accuracy. Meta released Llama last February and unveiled its successor, Llama 2, in July.

OpenAI got an earlier start, releasing GPT-3.5 in November 2022 with the launch of ChatGPT. The company released GPT-4 last March and has been tweaking GPT-4 Turbo since its Dev Day in November. The leading AI startup is reportedly working on its latest iteration, called Q*.

The pace of innovation has complicated matters for enterprise leaders.

CIOs want to bring the best capabilities to their organizations, which makes updated model versions appealing. But newer versions don’t always guarantee better results.

As tech leaders continue to search for the best solutions, CIOs will need to monitor tools on an ongoing basis to protect the business from risks associated with degraded performance.