With ChatGPT making waves in the popular media and the much-anticipated GPT-4 release last week, the GPT-mania seems to be utterly inescapable. Even South Park, one of my favorite escapes from work, wouldn’t let me get a break. Episode four of the current season is titled “Deep Learning” and features characters using ChatGPT to cheat on school assignments and reply automatically to their girlfriend’s texts—with, of course, hilarious consequences. In their characteristic satirical, self-referential style, South Park’s writers even use text output by ChatGPT to resolve the episode’s climax.
With all of this ChatGPT and GPT-4 news, I was wondering whether these generative A.I. tools actually deliver the productivity gains everyone supposes they do. Well, wonder no more.
In a hot-off-the-press study titled “Experimental Evidence on the Productivity Effects of Generative A.I.,” economics researchers at MIT asked several hundred white-collar workers to do writing and editing tasks. These workers worked in a wide variety of fields—from data analysis to marketing, from grant writing to human resources—and they were assigned writing and editing tasks that were specific to their niche and that took about half an hour to complete. Evaluators from their niche then graded the quality of the assignment.
The workers were randomly split into two groups: an experimental group that used ChatGPT while they worked on their assignment, and a control group that simply worked without assistance. I expected ChatGPT usage to make an impact, but I am blown away by the magnitude of the impact: The experimental group that had access to ChatGPT completed their assignment in 17 minutes on average, while the control group that didn’t have ChatGPT took 27 minutes. That ten-minute delta corresponds to a 37% speed-up thanks to ChatGPT usage and, for you statistics buffs out there, the difference is highly significant, with a p-value of less than .001. In lay terms, this means that there’s a less than one-in-a-thousand probability of this experimental result being observed by chance alone. And because workers were randomly assigned to the two groups, ChatGPT appears to actually cause the 37% speed improvement.
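For anyone who wants to see where that 37% figure comes from, here’s a quick back-of-the-envelope check in Python, using only the average completion times reported above (the study’s per-worker data aren’t reproduced here):

```python
# Average completion times reported in the MIT study (minutes)
control_minutes = 27
chatgpt_minutes = 17

# The ten-minute delta, expressed as a fraction of the control group's time
delta = control_minutes - chatgpt_minutes
speedup = delta / control_minutes

print(f"Time saved: {delta} minutes ({speedup:.0%} faster)")
# Time saved: 10 minutes (37% faster)
```

In other words, the 37% is the time saved relative to the unassisted group’s 27-minute baseline.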
But it wasn’t just speed that improved, and this is where the results get really interesting. Not only were users in the ChatGPT group much faster, the evaluators’ ratings of their work were dramatically higher as well—again with a p-value of less than .001, meaning a less than one-in-a-thousand probability that this quality effect happened by chance alone.
You may have hunches from your own experience using ChatGPT, but how exactly did the generative A.I. tool improve the speed and quality of the white-collar workers’ output? Well, the study showed that ChatGPT:
● Somewhat reduces the time required to brainstorm
● Greatly reduces the time required to create rough drafts
● Is used extensively, and most actively, during the final editing process
If you’re wondering whether ChatGPT was only so impactful because the white-collar workers didn’t have very good baseline writing skills, well, that was evaluated in this study too. While relatively poor writers did see big improvements in their assignments, the strong writers also became faster and produced higher-quality output. Indeed, the poor writers and strong writers gave almost equal assessments of the value they received from ChatGPT and of their willingness to pay for access to it.
Given all this, if you aren’t already making heavy use of ChatGPT in your work, you may be curious how you could augment your intelligence with it even more. Check out:
● Episode #660 for five data scientist-specific tips
● Episode #646 for more general tips that anyone can make use of
One item to watch out for: The study hasn’t yet been peer-reviewed, so you should take these findings with a grain of salt.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.