Behold, a Database That Tracks More Than 500 Episodes of The Simpsons

The project, created by a professor of the digital humanities, is nerdy and eeeeeeeexcellent.
The Simpsons visit New York in Season 24's "Moonshine River" episode. (Fox)

What is The Simpsons? It's a television show, certainly—specifically, the longest-running American sitcom of all time. It's a cultural touchstone. It's a delight. But it's also an archival collection—25 years' worth of characters, themes, stories, and scripts.  

To celebrate the show's quarter-century of existence, fans are being treated to projects that capitalize on this documentary breadth. There's the marathon of the show that's been airing on the cable network FXX; the social media conversation that has accompanied the marathon; the new app, Simpsons World, that will function like a DVD box set for the show, with even more extras. But there's another Simpsons project Fox isn't responsible for: a searchable database. One that has taken every episode of The Simpsons and made it, in its way, interactive. As Homer might put it: "Mmmmmm, searchability."  

As Homer could also put it, though: "Mmmmmm, digital humanities." The project is the work of Ben Schmidt, a professor of digital and intellectual history at Northeastern University. Schmidt works in the field of the digital humanities, meaning he focuses on applying computational approaches to things like books, newspapers, and other pieces of literature. 

So how do you turn The Simpsons, the show, into The Simpsons, the textual corpus? You take advantage of the fact that the series' episodes, all 552 of them, have been closed-captioned. You treat the show's subtitles, essentially, as its text. Which isn't a foolproof method ("it's often very quickly done," Schmidt says of the transcript-creation process), but it does allow for an overall, text-based reading of the show. And, because subtitles are plotted by time, they allow you to understand the show as it moves forward, minute by minute as well as season by season. So they allow you to compare the over-time appearances of, say, Mrs. Krabappel with those of, say, Mayor Quimby. They allow you to plot the writers' relative reliance on particular catchphrases ("D'oh!," "Release the hounds!," "Ay, caramba!") over the show's evolution.

They allow you to treat The Simpsons as, effectively, a single book. A single, enormous, unapologetically four-fingered book. 
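For the curious, here is roughly what "plotted by time" buys you. This is a minimal sketch in Python, not Schmidt's code and not how Bookworm works internally: it assumes the captions arrive in the common SRT subtitle format, and the function name `phrase_counts_by_minute` and the sample captions are invented for illustration.

```python
import re
from collections import Counter

# Matches an SRT-style cue timing line such as "00:01:15,000 --> 00:01:17,500".
# (Assumption: captions are in SRT format; the article does not say which format was used.)
CUE_TIME = re.compile(r"(\d{2}):(\d{2}):(\d{2})[,.]\d{3}\s*-->")


def phrase_counts_by_minute(srt_text, phrase):
    """Count how often `phrase` is spoken in each minute of one episode.

    Hypothetical helper: returns a dict mapping minute number to hit count.
    """
    counts = Counter()
    needle = phrase.lower()
    # Cues in an SRT file are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = CUE_TIME.match(line.strip())
            if match:
                hours, minutes, _seconds = (int(g) for g in match.groups())
                minute = hours * 60 + minutes
                # Everything after the timing line is the spoken text of the cue.
                text = " ".join(lines[i + 1:]).lower()
                hits = text.count(needle)
                if hits:
                    counts[minute] += hits
                break
    return dict(counts)


# Invented sample captions, for illustration only.
SAMPLE = """1
00:00:05,000 --> 00:00:07,000
I'm Kent Brockman.

2
00:03:10,000 --> 00:03:12,000
Time for school, Bart.

3
00:03:40,000 --> 00:03:42,500
School? D'oh!
"""

print(phrase_counts_by_minute(SAMPLE, "school"))  # {3: 2}
```

Run the same count over every episode, add up the minutes, and you have the raw material for a chart of when in an episode a word tends to turn up.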

Once Schmidt had gathered the show's subtitles, it was a quick process to convert those into a database. He and a research partner, Erez Lieberman Aiden, had already created the Bookworm project, which turned large corpora of books, scientific papers, newspapers, and legal documents into searchable databases. (It's similar in that way to Google's extensive Ngram database.) The Simpsons-specific Bookworm was built on Bookworm's pre-existing architecture, which allowed Schmidt to put it together quickly. As he told me: "I just, over an evening, threw it together."

Searching the resulting database, you get findings like this, which charts the minutes within each episode that characters talk about "school": 

And like this—which suggests that, as Schmidt puts it, "'I'm Kent Brockman' seems to be overwhelmingly a gag from the opening scene":

And what were some of Schmidt's broader findings? "I think the show just got more self-referential over time," he says. "Everybody's been talking about the Family Guy shift"—meaning the shift in TV comedy to include in its plot line external pop cultural references—"and the thing The Simpsons can do that Family Guy can't do is refer to something that we actually care about in its own universe. So I think there are some of those jokes."

One other finding? The Simpsons was so creative that its catchphrases often defy common spellings—not only among the show's caption-writers, but among the people who have so far searched the Simpsons database. Take Ned Flanders' signature catch phrase. "Nobody has any idea how to spell 'okeley dokely,'" Schmidt says. "It sort of defies measurement for the time being."


Megan Garber is a staff writer at The Atlantic. She was formerly an assistant editor at the Nieman Journalism Lab, where she wrote about innovations in the media.
