Tales From the Trail

5th March, 2023 The Human Cost of ChatGPT

Everywhere I look, I see online chatter from every one of my sources buzzing about ChatGPT. Twitter, local Slack organizations, and even some of my clients are full of excitement about experimenting with it and finding ways to harness it to their advantage. I experimented with it myself to see if it could help me write blog articles (like this one) more frequently and with less effort. Given regular practice, I expect it probably could. I expect the more I use it, the more applications I could find for it to help me in some way. But I won’t be doing that.

One of the downsides of the training data that OpenAI’s models use to power things like ChatGPT is that it’s generated from the open Internet. Any and all information found on the web is used to help train the language models that power these fascinating tools. While it is true that the best of the content and commentary found online can be found among the information that powers ChatGPT, it also includes much more. Some of the worst, vilest, most reprehensible content - full of hate, bigotry, racism, violence, misogyny, and more - is represented in ChatGPT’s “personality.” Some of the darkest, basest corners of the global online community are a part of OpenAI’s training data. The brilliant, insightful commentary of an anonymous layperson on economics, technology, or government stands on equal footing in this training data with a hate-driven neo-Nazi’s. There is enough awful content present on the Internet - and, by extension, in OpenAI’s training data - that a model like ChatGPT, trained on such data and simply released to the general public, would consistently produce vile and disgusting content.

OpenAI, therefore, recognized the need to moderate its AI models. In other words, humans were required to further train these models to prevent them from producing hateful, bigoted, racist, homophobic, or otherwise abhorrent responses. That meant employing people to label content that contained violence, sexual abuse, racism, and so on, so that those labels could further train ChatGPT to avoid producing responses containing such content. OpenAI recruited Sama, a San Francisco-based firm, to supply a human workforce to apply these labels. Sama’s solution employed workers in a developing nation (Kenya) to spend their days exposed to the worst kinds of content in order to produce the labels necessary to train ChatGPT to produce palatable content instead of reflecting the abominable bowels of the open Internet. These workers earned between $1.32 and $2 per hour, and were often traumatized by the graphic nature of the content to which they were exposed.

It is difficult for me to view ChatGPT in any other way than this: it is a technology that is powered, in part, by human suffering. I cannot, in good conscience, allow myself to continue using a technology that traumatizes people in the developing world for the sake of making its content palatable for general usage. I have, therefore, made a decision not to personally use ChatGPT in its current form.

This does put us in an uncomfortable position at Canyon Trail, and we’re evaluating policy decisions about how we will or won’t engage with ChatGPT. We can neither force our clients to avoid ChatGPT nor cut ties with them when they do use it. However, I want us to be a company that is ethically responsible to the global technology community. Our mission is to elevate the craft of software development, and our impact is diminished if we knowingly build our success on tech that causes such harm. We intend to stay true to our mission, even if it may require some uncomfortable conversations with our clients.