Why Stack Overflow Banned ChatGPT
The Q&A site has been flooded with ChatGPT coding answers that look correct but often aren't
Hey Everyone,
In my article The Dark Side of Generative A.I., I argue (among other things) that unmitigated spam could become a real problem with tools like ChatGPT. Where A.I. intersects with coding, ChatGPT's lack of fact-checking and objective verification is seriously problematic.
Some sites are having to ban ChatGPT outright, at least temporarily. Stack Overflow has been flooded with ChatGPT coding answers that look correct but often aren't, and its moderators have called for a halt. This is what happens when you let the public test a demo product that hasn't been tested enough internally.
One week ago, on December 7th, 2022, Stack Overflow said it was withholding a permanent decision on AI-generated answers until after a larger staff discussion, but was taking action now due to fears that ChatGPT could be "substantially harmful" to both the org and its users.
After the honeymoon phase with ChatGPT's not-so-open trial ends, we'll have to take a hard look at whether our A.I. ethics and legal systems are ready for Generative A.I. conversational agents like ChatGPT. From sites like DeviantArt having to deal with text-to-image "creations" to GitHub Copilot running amok with copyright, we clearly seem to be rushing things.
Is it quality content? Is it safe? Clearly the mods at Stack Overflow had to do something. As the mods explained, ChatGPT simply makes it too easy for users to generate responses and flood the site with answers that seem correct at first glance but are often wrong on close examination. For OpenAI's demo, this is a misinformation stain. Microsoft will ultimately have to be accountable for the fledgling A.I. lab, which is effectively its foundation-model supplier.
"The average rate of getting correct answers from ChatGPT is too low," Stack Overflow said in a policy statement entitled "ChatGPT is banned."
ChatGPT is Breaking Trust & Safety
Stack Overflow is a community built upon trust. The community trusts that users are submitting answers that reflect what they actually know to be accurate and that they and their peers have the knowledge and skill set to verify and validate those answers.
I actually think the Stack Overflow example best demonstrates the relationship between ChatGPT and misinformation at scale. And this is important, guys: "the primary problem is that while the answers which ChatGPT produces have a high rate of being incorrect, they typically look like they might be good and the answers are very easy to produce," wrote the mods (emphasis theirs).
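To make that concrete, here's a hypothetical illustration of the kind of answer the mods are describing (my own sketch, not an actual ChatGPT output): code answering "how do I remove duplicates from a list while preserving order?" that reads plausibly and runs without error, yet silently violates the requirement.

```python
def dedupe_plausible(items):
    # Looks reasonable and runs fine, but set() discards ordering,
    # so the "preserving order" requirement is silently broken.
    return list(set(items))

def dedupe_correct(items):
    # A correct approach: dict keys keep insertion order in Python 3.7+.
    return list(dict.fromkeys(items))

print(dedupe_plausible([3, 1, 3, 2]))  # order is arbitrary, e.g. [1, 2, 3]
print(dedupe_correct([3, 1, 3, 2]))    # [3, 1, 2]
```

A reviewer skimming the first version would likely upvote it; only checking it against the actual requirement reveals the flaw. Multiply that by thousands of auto-generated posts and you see the moderators' problem.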
We know that on the censored and moderated internet, the misinformation that slips through tends to go viral much more easily, as has been the case with the ChatGPT hype, even when the tool was basically fooling users into feeling like it was telling the truth. I've lost count of how many times I witnessed this in tweets and use cases, but with over 1 million users of the demo, you can imagine how much misinformation, courtesy of OpenAI, has been flying around without any legal responsibility attached.
The mods at Stack Overflow continued:
Overall, because the average rate of getting correct answers from ChatGPT is too low, the posting of answers created by ChatGPT is substantially harmful to the site and to users who are asking or looking for correct answers.
ChatGPT was released about two weeks ago and touted by OpenAI in a blog post as a conversational AI that can provide detailed answers to questions, as well as "answer follow-up questions, admit its mistakes, challenge inappropriate premises and reject inappropriate requests." The problem is that OpenAI doesn't seem to have done its due diligence on how the tool would be used and misused, or how it would create spam and confusion. OpenAI has seemed very proud of its number of users (hint: free testers).
It’s all about the RLHF in GPT-3.5
The team at Hugging Face recently wrote a nice piece about the importance of reinforcement learning from human feedback (RLHF), the technique that gives GPT-3.5 its conversational polish.
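For a rough sense of what RLHF involves: human raters rank model outputs, a reward model is trained to reproduce those preferences, and the language model is then fine-tuned against that reward. Here's a minimal sketch of the reward-modeling step (a Bradley-Terry-style pairwise loss; the tensors and numbers are placeholders, not OpenAI's code):

```python
import torch
import torch.nn.functional as F

def reward_model_loss(score_preferred: torch.Tensor,
                      score_rejected: torch.Tensor) -> torch.Tensor:
    # For each (preferred, rejected) answer pair labeled by a human rater,
    # train the reward model to score the preferred answer higher:
    # loss = -log(sigmoid(r_preferred - r_rejected)), averaged over the batch.
    return -F.logsigmoid(score_preferred - score_rejected).mean()

# Stand-in scalar scores; in practice these come from a scalar head
# on top of a language model reading (prompt, answer) pairs.
preferred = torch.tensor([1.2, 0.7])
rejected = torch.tensor([0.3, 0.9])
print(reward_model_loss(preferred, rejected))
```

The language model is then tuned to maximize that learned reward (typically with PPO), which is part of why ChatGPT's answers sound so agreeable, whether or not they're correct.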
However, OpenAI is also getting clever at trying to protect itself legally: they had Sam Altman publicly caveat that ChatGPT shouldn't be relied upon in serious contexts. If you create a flawed demo, promote it, and let tweets that are basically misinformation proliferate, then yes Sam, it will indeed be misused in more serious contexts. But hang on a second, the owner of Twitter was also a co-founder of OpenAI. Major conflict of interest! Oh well, not legally binding, fair use, right?!
Remember, guys, even before Generative A.I. hits its stride, Web 3.0 is basically unable to moderate itself, even with A.I. doing the heavy lifting. LinkedIn alone detected and removed more than 21 million fake accounts between January 1 and June 30, according to the company's community report (more on that below). That doesn't stop me from encountering dozens each week. We simply won't be able to moderate, regulate or enforce against the spam of tools like ChatGPT. There's simply no way! And it's a problem.
While we hype Generative A.I. on one hand, we won't be able to moderate it properly on the other. So what will be the result? The Western internet will continue to degrade. Free speech will turn into an A.I. playground. The dials of censorship and spam won't really work any longer in a way that doesn't feel like total chaos, and even in 2022 we've had a dose of what that might feel like moving forward.
For Silicon Valley tycoons, it's their job to promote A.I. at scale. Whether you are related to the PayPal mafia is beside the point.
The mods at Stack Overflow, a serious site for coders, quickly realized that ChatGPT was a spam problem. They needed to reduce the volume of these posts and to deal quickly with the ones that do get posted, which means dealing with users rather than individual posts. So, for now, the use of ChatGPT to create posts on Stack Overflow is not permitted. If a user is believed to have used ChatGPT after this temporary policy was posted, sanctions will be imposed to prevent them from continuing to post such content, even if the posts would otherwise be acceptable.
Can you imagine the nightmare this is going to create for everyone from Canva to Adobe? Who can integrate text-to-image best and fastest? Generative A.I. in tools, coding and text-to-video on the mobile, short-form internet becomes even murkier, and gets even more dystopian with misinformation, as we head deeper down the rabbit hole of deepfakes.
Curiously, the fact that Generative A.I. is a spam problem goes mostly unaddressed in the mainstream media and by the actors themselves. But my Twitter, Reddit, Discord and LinkedIn feeds are full of very bad examples of how ChatGPT is being used, even as a flawed demo today, that highlight the scale of the issues being created.
In early November 2022, DeviantArt, a Wix-owned artist community, announced a new protection for creators to prevent their work from being scraped for AI development. It's not just artists being ripped off, and who gets to benefit? Microsoft recently announced that GitHub Copilot for Business would go live at $19 per user per month. I was surprised to see the price so much higher than an individual account, and at what it actually includes. The legal grey areas here are significant, and BigTech can bulldoze A.I. adoption at scale before regulation and moderation are even remotely ready.
Stack Overflow really put it well when it said:
The objective nature of the content on Stack Overflow means that if any part of an answer is wrong, then the answer is objectively wrong. In order for Stack Overflow to maintain a strong standard as a reliable source for correct and verified information, such answers must be edited or replaced.
The Internet is About to Enter a Deepfakes and Misinformation Era
Increasingly, as users of the internet, we have to trust our sources less and do more fact-checking ourselves. But with Generative A.I. and tools like ChatGPT, the line between truth and truthful-sounding content begins to blur, just as deepfakes get better at producing believable fake accounts that even Microsoft's LinkedIn A.I. cannot catch at account creation.
Between January 1 and June 30, more than 21 million fake accounts were detected and removed from LinkedIn, according to the company's community report. While 95.3% of those fake accounts were stopped at registration by automated defenses, according to the company, that leaves nearly 5 percent that get through registration, and that turns out to be a lot. There was also a nearly 28% increase in fake accounts caught compared to the previous six-month period. LinkedIn says it currently has more than 875 million members on its platform. The deepfake internet is clearly just beginning.
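A quick back-of-the-envelope calculation using the report's own figures (the arithmetic is mine, not LinkedIn's) shows what that small percentage means in absolute terms:

```python
# Figures from LinkedIn's community report, cited above.
total_fake_removed = 21_000_000   # fake accounts detected and removed, Jan 1 - Jun 30
stopped_at_signup = 0.953         # share blocked automatically at registration

# Fake accounts that made it past registration before being caught later.
slipped_past = total_fake_removed * (1 - stopped_at_signup)
print(f"{slipped_past:,.0f}")  # ~987,000 accounts in six months
```

Roughly a million fake accounts getting past sign-up defenses in half a year, on one platform, before Generative A.I. makes them cheaper and more convincing to produce.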
Even the internet doesn't understand what's coming. Generative A.I. tools like Lensa A.I., which can sexualize users without their consent, are already becoming controversial. But what happens when you believe the source to be accurate, as is so often the case with ChatGPT? Who is liable? OpenAI doesn't seem to think it is liable for much of anything, and that's a problem.