Internet/Web vs. internet/web

5 April 2016

I like fivethirtyeight.com. Nate Silver and his crew have pioneered a new form of journalism, one based on data rather than punditry, but like anyone else, they get into trouble when they stray outside their wheelhouse. Most recently, a blog post on their website took on the announcement that the Associated Press (AP) has changed their stylebook to use lowercase letters when writing internet and web. Formerly, the news organization had advocated for Internet and Web. In taking on the announcement, the fivethirtyeight.com writers not only demonstrated a misunderstanding of how language works, but they also screwed up their analysis of the data.

This past week the AP announced that they will no longer be capitalizing internet or web. The change is significant because many journalistic organizations in the US follow the AP style.

But the AP is hardly on the leading edge of this trend. I discussed the capitalization of internet back in 2004 when Wired magazine made the switch. The fivethirtyeight.com blog post says the AP is a “cultural bellwether for writer types.” Yet as it is on most stylistic points, the AP style is inherently a conservative one, not one that rides the edge of linguistic change. Nor do most “writer-types” give a crap what the AP says. (In fact, AP style is something of a joke among most writer-types, or rather among writers who are aware that it exists at all.) Fivethirtyeight is viewing the situation from the perspective of a journalist, assuming that everyone else is a journalist too. They’ve moved from data analysis to punditry.

And what’s worse, the blog post misrepresents the data. Like any good fivethirtyeight.com story, it presents data, this time from Google Ngrams, but the words in the story don’t match what the data tells us. First, the data is from the wrong source for their purposes. It’s from Google Books. If you want to track journalistic style—this is an article about the AP stylebook after all—then Google Books is the wrong corpus to use. You want a corpus of news stories for that, not one that tracks usage in books.

Google_Internet.jpg

The article does say that “we passed peak ‘Internet’ and ‘Web’ sometime around 2002.” That is correct, as far as it goes, but it ignores the fact that the capitalized Internet and Web are still far more common in Google Books than their lowercased counterparts. Internet and Web started their rise in the Google Books data around 1989 and 1991 respectively. Both rose at nearly exponential rates until 2002, when their usage began to decline. Even so, as of 2008 (the latest available year in the data set) Web was over twice as popular as web, and Internet some six times as popular as internet. The lowercase forms showed growth over the same period, but at a much slower rate; in 2004 web showed a decrease in usage as well, and internet slowed its steady growth. What the data shows, therefore, is not a switch from the uppercase to the lowercase forms, but rather that people were writing less about the internet and the web. What had happened was the dot-com crash and the switch from the internet being a trending topic of discussion to simply being part of the background noise of our lives. (Again, this is Google Books. Journalistic usage may show something entirely different.)
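For readers who want to see the distinction in miniature, here is a rough Python sketch of the two questions being confused: when a form peaked, and what share of usage it holds. The numbers are made up for illustration and are not actual Ngram values; the function names are my own, not anything from Google's tools.

```python
from typing import Dict


def peak_year(series: Dict[int, float]) -> int:
    """Return the year in which a frequency series reaches its maximum."""
    return max(series, key=series.get)


def capitalized_share(upper: Dict[int, float],
                      lower: Dict[int, float],
                      year: int) -> float:
    """Fraction of all uses in `year` that are the capitalized form."""
    return upper[year] / (upper[year] + lower[year])


# Hypothetical relative frequencies, NOT real Google Books data.
# They only mimic the shape described above: the capitalized form
# peaks in 2002 and declines, while the lowercase form grows slowly.
upper = {2000: 30.0, 2002: 40.0, 2004: 36.0, 2008: 30.0}
lower = {2000: 2.0, 2002: 3.0, 2004: 4.0, 2008: 5.0}

print(peak_year(upper))                       # year of "peak Internet"
print(capitalized_share(upper, lower, 2008))  # share held by the capitalized form
```

With these toy figures the capitalized form is past its peak and yet still accounts for the large majority of uses in the final year. Passing a peak says nothing, by itself, about which form is preferred.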

Twelve years ago I said, “the significance of the Wired News style change should not be underestimated. The practice of capitalizing [internet] is clearly on the way out.” While that statement was not wrong, it was overly optimistic about the pace of that change. The capitalized forms are still very much preferred. But it was Wired that was the “cultural bellwether,” not the AP, which is in the middle of the flock.