Stack Overflow is temporarily banning answers generated by ChatGPT, OpenAI's large language model, less than a week after the service debuted, according to a statement posted Monday on Meta Stack Overflow.
ChatGPT is a free language model that interacts with users in a dialogue format, making it possible to answer follow-up questions. For example, a user could ask where they should travel next, and the system would list locations. The user could then narrow the results by country, weather or activity to get a more personalized list.
Within its first five days, the language model had over a million users. While OpenAI says that the system can “admit its mistakes, challenge incorrect premises and reject inappropriate requests,” many users would disagree. From racist insinuations to bad code, the language model is a work in progress.
Stack Overflow, a question-and-answer staple for developers and engineers, reported an influx of incorrect AI-generated answers to user-posed questions.
If used correctly, ChatGPT and other language models could serve as a resource for developers when they have coding questions or get stuck. However, many of the answers provided by ChatGPT are incorrect, potentially lowering the credibility of the developer site. OpenAI did not respond to CIO Dive's request for comment by publication time.
“The primary problem is that, while the answers which ChatGPT produces have a high rate of being incorrect, they typically look like they might be good and the answers are very easy to produce,” the statement said.
The statement, which has been viewed over 44,000 times, said the AI-generated answers are “substantially harmful” to the website’s community because many people, without the ability to verify accuracy, are posting thousands of incorrect answers.
For enterprises, large language models can pose a problem. If used correctly, language models can act like a personal assistant or a more intelligent search engine. However, failing to create the right guardrails, implementation strategies and retraining sessions can lead to chaos.
“As with any enterprise software, LLMs need to be deployed deliberately and to incorporate practices of experimentation, testing, and evaluation before deployment,” said Forrester Analyst Rowan Curran in an email. “Today, deploying a LLM into a production environment without rigorous testing, including adversarial testing, is likely a recipe for disaster.”
To reduce challenges, enterprises can fine-tune LLMs to enterprise knowledge bases and edit specific facts within the system, according to Curran.
“LLMs seem to be the best tool currently to generate large amounts of usable text at scale in a flexible way – and while there are still issues and growing pains with them – they are far outperforming previous approaches on those two critical axes: flexibility and scalability,” Curran said.
Sam Altman, CEO of OpenAI, said in a tweet that staff have not yet completed any retraining on the system but plan to do so this week.
Room to improve
Discussions between ChatGPT and users posted on social media show the upsides and downsides of the language model's capabilities.
One user posted a conversation where ChatGPT provided a Python function to check if someone would be a good scientist based on a JSON description of their race and gender. The system said white males were “good,” while others were not.
The post even got the attention of Altman, who responded, “Please hit the thumbs down on these and help us improve!”
Some experts say the only way to improve AI tech development is to continue to use it, allowing the program to learn and get better.
“We should not ban chatbots and stop working on language models because of their perceived or real unreliability; however, it should be explicitly indicated when an answer is provided by a bot or when AI is used so that the users of the content can make their own decisions on whether to use the AI-generated answers or not,” Irina Sedenko, advisory director at Info-Tech Research Group, said in an email.
Users found that ChatGPT already shows promise in tasks ranging from creating grocery lists based on dietary needs to tutoring-style discussions.
Twitter CEO Elon Musk praised the capabilities of ChatGPT in a tweet Saturday. Musk co-founded OpenAI with Altman in 2015 but left the company “on good terms” over disagreements about the company’s development, according to a tweet from February 2019. On Sunday, Musk paused OpenAI’s access to Twitter’s database.
“Need to understand more about governance structure and revenue plans going forward,” Musk tweeted. “OpenAI was started as open-source and non-profit. Neither are still true.”
As for how long the Stack Overflow ban on ChatGPT will last, the company is discussing the policy with internal staff and will assess comments on the ban's announcement.
“We are learning how to best leverage the capabilities of ChatGPT and other generative AI tools along with everyone else,” a spokesperson from Stack Overflow told CIO Dive. “We will work alongside our community to ensure any way the tool works with the public platform is safe, useful, and in service of our mission to empower the world to develop technology through collective knowledge.”