Lodestar

6 September 2018

This week the New York Times took the unusual step of publishing an anonymous op-ed piece by someone identified as “a senior official in the Trump administration” that was sharply critical of Trump. The writer described the president as incompetent and out of his depth and said that they and other senior administration officials actively worked to keep Trump from making decisions. Needless to say, it was a rather explosive article and speculation about who wrote it began immediately.

One speculative claim, however, is of particular interest to this blog because of its linguistic nature. A certain Dan Bloom took to Twitter to assert that the piece was written by Vice President Mike Pence and that the giveaway was the piece’s use of the word lodestar. The anonymous op-ed had praised the recently deceased Senator John McCain as being “a lodestar for restoring honor to public life and our national dialogue.” Bloom points to the fact that Pence has used lodestar on numerous occasions in the past, dating back to 2001, and that it is an unusual word. But the way he conducts his analysis is simply wrong.

Now regardless of what one’s political leanings are, the idea that Pence scribed the op-ed piece is rather delicious. His authorship would raise all sorts of constitutional and political questions and issues. But, linguistically, the theory is a load of hooey. That’s simply not how one goes about ascribing authorship to an anonymous piece.

To start, lodestar is not all that unusual a term. The Oxford English Dictionary says it appears in current usage about 0.1 to 1.0 times per million words. That seems low at first glance, but that’s the same range as overhang, life support, register, rewrite, nutshell, candlestick, rodeo, embouchure, and insectivore. The word is also quite familiar to lawyers, as the lodestar standard is a method courts use to estimate legal fees in a lawsuit, and there are a lot of lawyers working in the White House. (Most of the hits for lodestar in the Corpus of Contemporary American English are references to this method of legal fee calculation.)

Even more problematic is that ascription of authorship to an anonymous text does not rely on single, uncommon, content words (nouns, non-copulative verbs, adjectives, adverbs) like lodestar, and it doesn’t do it for one very good reason. The choice of these words is largely dependent on the topic being written about. One simply does not use the same content terms when writing about economics as opposed to biology, or about linguistics as opposed to a newspaper op-ed about the White House. Measuring the use of such words tells you the topic, not who wrote it.

Instead, a legitimate stylistic analysis relies on function words (prepositions, copulatives, pronouns, conjunctions), very common content words that do not rely on topic, and repeated collocations of words. Not only do patterns of use of these words not change depending on the topic, they are harder to fake or mask, which is something the writer of a politically explosive op-ed who wished to remain anonymous would be likely to attempt. Stylistic analysis looks for the relative frequency of these words in an author’s writing and creates a “signature” that can be compared to the anonymous text. Needless to say, such analysis must be computerized.
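As a rough illustration of what such a signature might look like, the sketch below computes the relative frequency of a handful of function words in a text and compares two texts by cosine similarity. The word list, the function names, the sample sentences, and the choice of cosine similarity are all assumptions made for the sake of the example. Real authorship studies use far larger word lists, much longer texts, and more careful statistics.

```python
import math
import re
from collections import Counter

# A tiny, illustrative list of function words (an assumption, not a standard list).
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it", "but", "not"]


def signature(text):
    """Return the relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine_similarity(a, b):
    """Cosine of the angle between two frequency vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a text with none of the listed words has no usable signature
    return dot / (norm_a * norm_b)


# Two made-up snippets standing in for a known text and an anonymous one.
known = "The idea that it is the word and not the pattern of use is wrong."
anonymous = "It is not the topic but the habit of the writer that matters."
print(round(cosine_similarity(signature(known), signature(anonymous)), 3))
```

With snippets this short the resulting number is meaningless, which is rather the point: the signature only stabilizes over thousands of words.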

And there are fundamental problems with applying stylistic analysis to this particular op-ed piece. For one thing, at less than a thousand words, it is simply too short to create a reliable signature. One needs a text of several thousand words before reliable results can be generated. Then there is the problem of ghostwriters. Undoubtedly, many of the speeches and articles ascribed to a politician of Pence’s stature are written by staffers. One needs a large corpus of material known to be written by the person in question. Lack of that would frustrate any stylistic analysis.

Now, I have no idea who wrote the Times op-ed piece, but the idea that its use of lodestar demonstrates anything is just plain wrong. Such armchair linguistic analysis is simply not valid.


Sources:

Anonymous. “I Am Part of the Resistance Inside the Trump Administration.” New York Times, 5 September 2018.

Craig, Hugh. “Stylistics and Authorship Studies.” In Susan Schreibman, Ray Siemens, and John Unsworth, eds., A Companion to Digital Humanities, Blackwell, 2004.

Moye, David. “One Word Has People Convinced Mike Pence Wrote Anonymous New York Times Op-Ed.” Huffington Post, 5 September 2018.

Oxford English Dictionary, second edition, 1989, s. v. lodestar, n.

Major League Team Names

16 May 2018

It’s called the “great American pastime,” and baseball has been an integral part of life in the United States for, give or take, the last 160 years. So here are the origins of the names of the Major League Baseball teams, past and present.

For those not familiar with the structure of American professional baseball, Major League Baseball consists of, and has consisted of since the early days of the twentieth century, two leagues, the National League (founded 1876) and the American League (founded 1901). But at various times, particularly in the nineteenth century, other leagues existed, and I make reference to them below when needed. There are also a number of minor leagues, which now exist primarily as “farm” teams to develop player talent for the majors. And until the middle of the twentieth century, professional baseball in the U.S. was segregated, with African-Americans not permitted to play in the two major leagues. There were separate Negro leagues, with the best teams every bit the equal in player quality of the white, major league teams. Following the integration of baseball with Jackie Robinson playing for the Brooklyn Dodgers in 1947, the Negro leagues folded. Where I could find the information, I’ve included the origins of Negro league team names.

The dates listed after the team names are the dates the name came into baseball use, not the date the modern organization that currently uses the name was founded.

Oakland Athletics (1859). Athletics is probably the oldest sports team name still in use, dating to 1859 when an amateur Philadelphia team dubbed themselves the Athletics. The name was used, off and on, throughout the latter half of the nineteenth century for a number of professional, Philadelphia teams. The modern American League franchise played in Philadelphia (1901–54) and Kansas City (1955–67) before landing in Oakland in 1968. Since moving to the Bay Area, the name has alternated back and forth between Athletics and A’s, depending on the whim of the moment.

Cincinnati Reds / Red Stockings (1869). The Cincinnati Red Stockings were the first professional baseball team, playing 1869-71, before reverting to amateur status. Named for the color of their socks, they were also known as the Red Legs and simply as the Reds. The modern National League team traces its lineage to 1890 and is named after that earlier Cincinnati team. During the Cold War, some admonished the Reds for having an “unpatriotic” name. Lou Smith, sports editor of the Cincinnati Enquirer, responded, “Let the Russians change, we had it first.”

Chicago White Sox / White Stockings / Black Sox (1870). The original Chicago White Stockings were a professional team that existed, off and on, in the first half of the 1870s, until the formation of the National League in 1876, when the team became a permanent fixture, playing under the White Stockings name until 1889. That National League team would, in 1902, become the Cubs. The National League team abandoned the name in the 1890s, and when Charles Comiskey moved the minor league St. Paul Invaders to Chicago in 1900 and joined the American League, the team took to calling themselves the White Stockings. The National League objected to the name because of its earlier use by one of their teams, so in 1904 the nickname was shortened to White Sox. The derogatory and very unofficial nickname Black Sox was applied to the 1919 team and more specifically to the eight players on that team who threw the World Series in that year. There is a belief that the Black Sox nickname doesn’t stem from this scandal but instead comes from Comiskey’s refusal to wash the team’s uniforms more than once a week. But there is no evidence that this was actually the case, although Comiskey was famously tight-fisted, an attitude that had a great deal to do with the players’ willingness to accept the gamblers’ money and throw the World Series.

Washington Nationals (1872). The name Nationals was adopted by several short-lived, late nineteenth-century, professional teams that played in Washington, D.C. National Association teams in 1872 and 1875 used the name, as did two teams in 1884, one in the Union Association and another in the American Association. In the next century, the American League Senators officially changed their name to the Nationals in 1905 and continued to officially use the name through the 1956 season, but the team never shook the Senators name, which continued to be used by sportswriters and fans. That franchise left the city for Minnesota following the 1960 season. The current Nationals team are the old Montreal Expos, who moved to Washington in 2005, changing their name to the Nationals in the process.

Philadelphia Phillies (1874). The name Reds may be older, but the National League Phillies are the oldest, continuously operating, one-name, one-city franchise in all of professional sports, playing since 1883. But the nickname Phillies is even older than the current team, originally belonging to a National Association team that played from 1873-75. The name, obviously, derives from the name of their home city.

Milwaukee Brewers (1878). Milwaukee is famed for its beer industry, so it’s no surprise that a number of teams from Milwaukee have adopted the name Brewers. Three teams, in three different leagues, used the name in the nineteenth century. In 1901, one of the charter teams in the American League also began life with that name, although they only played in Wisconsin for one year, before moving to St. Louis and becoming the Browns, and after another move becoming the modern-day Baltimore Orioles. In 1970, the Seattle Pilots, an American League expansion team, moved to Milwaukee after one year of play and were rechristened the Brewers. In 1998 that team switched to the National League.

Homestead Grays (1879). A Providence, Rhode Island, National League team played under the name Grays from 1879–85. But the more famous team of that name was the Negro league Homestead Grays. Founded in 1912, the team is named for Homestead, Pennsylvania, a steel-mill town outside Pittsburgh. The team played in the Negro National League from 1935–48. They primarily played in Pittsburgh’s Forbes Field when the Pirates had away games, but in the 1940s they played most of their games in Washington, D.C. Featuring greats like Josh Gibson and Buck Leonard, the team won nine consecutive Negro National League titles starting in 1937. The team disbanded in 1950 with the integration of the major leagues.

St. Louis Browns (1883). Three different St. Louis teams have gone by this name over the years, so called because of the brown trim of their uniforms. The first was an American Association franchise from 1883–91. Next up was the National League franchise that would later be renamed the Cardinals. They played under the name Browns from 1892–98. The third was the American League team that played in the city from 1902–53, before moving to Baltimore and becoming the modern-day Orioles.

Baltimore Orioles (1883). The Orioles name gets its start in 1883 with a now defunct American Association team. The name comes from the Maryland state bird, the Baltimore Oriole, which is allegedly named after Cecilius Calvert, the 2nd Lord Baltimore. When the present-day American League was founded in 1901, the name was revived for the new Baltimore team. In 1903, the Orioles moved to New York and became the Highlanders and later the Yankees. The current Orioles team dates to 1954 when the St. Louis Browns moved to Baltimore.

New York Metropolitans / Mets (1883). The original New York Metropolitans played from 1883–87. When the National League expanded back into New York in 1962, following the departure of the Giants and Dodgers for California, it revived the old Metropolitans name, shortening it to Mets.

Los Angeles Dodgers (1884). This team got its start in Brooklyn in 1884 as the Trolley Dodgers. In the late nineteenth century, Brooklyn was crisscrossed with streetcar lines, and Brooklynites were so called from their need to avoid being hit by one. The team played under a variety of names over the years: Bridegrooms (1890–98); Superbas (1899–1910); Dodgers (1911–13); and Robins (1914–31); before officially and permanently becoming the Dodgers once again in 1932. The Dodgers moved to Los Angeles in 1958. Brooklyn has never been the same since.

Louisville Colonels (1885). The name Colonels has been applied to a number of teams that have played in Louisville, Kentucky. The name was first used by an American Association team from 1885–89 and in 1890. The National League had a team in Louisville that played under that name from 1892–99. And various minor league teams have used the name in the twentieth century. Colonel is an honorific title that the state of Kentucky has been bestowing on citizens since 1885, but the term had been an unofficial honorific before that.

San Francisco Giants (1885). The current National League franchise, and the first team to use the name, began life in 1879 as the Troy (NY) Trojans. The team moved to New York City in 1883, becoming the Gothams. The name Giants began being applied to the team by sportswriters in 1885. The team moved to San Francisco in 1958, taking the name with them. The story that the name was coined by manager Jim Mutrie in 1885 when he referred to his players as “My big fellows! My giants!” has little evidence to support it; there are second-hand and decades-after-the-fact reports of Mutrie using the name, but no direct evidence. Other teams that have used the name Giants include:

  • Cuban Giants, first professional black team, 1885–89

  • Chicago American Giants (Negro National League, 1920–31 and Negro American League, 1937–48 and 1950)

  • Baltimore Elite Giants (NNL, 1937–48; NAL, 1949–50)

  • Atlantic City Bacharach Giants (Eastern Colored League, 1923–28)

  • Brooklyn Royal Giants (ECL, 1923–27)

  • Yomiuri (Tokyo) Giants, Japanese Central League

Washington Senators (1886). The name Senators, after the upper house of the U.S. Congress, has been applied to a number of teams over the years. Two National League franchises in the nation’s capital used the name from 1886–89 and 1892–99. But the team with whom the name is most closely associated is the American League team that played in the city from 1901–60. Senators was the popular nickname for that team, although the official nickname was the Nationals for most of that period, 1905–56. That team decamped to Minnesota and became the Twins in 1961. The name was used by another American League expansion team in the city from 1961–71, but that team moved to Texas in 1972, becoming the Rangers. None of these teams had much success on the field, giving rise to the phrase, “first in war, first in peace, and last in the American League.”

Cleveland Spiders (1889). Spiders was a name for a National League franchise that played in Cleveland in 1889–99. They were so called because their skin-tight uniforms gave the players a spidery appearance.

Pittsburgh Pirates (1891). The Pittsburgh team joined the National League in 1887 under the name Alleghenies. They were redubbed the Innocents in 1890. In 1891, the team signed second baseman Lou Bierbauer away from the Philadelphia Athletics with a lucrative contract. Bierbauer’s old club saw this as theft and dubbed the Pittsburgh team the Pirates. The name stuck.

St. Louis Cardinals (1900). Founded in 1892, this National League team was originally named the Browns after the color of the trim on its uniforms. In 1899, the team changed both the color of its uniform trim to a bright red and its name to the Perfectos. That year an unknown woman in the stands remarked that the new color was “a lovely shade of cardinal.” A reporter overheard the remark and suggested in the next day’s paper that the team change its name to Cardinals and adopt the bird as its mascot. The team did just that the following year.

Detroit Tigers (1901). The American League Tigers are yet another team that derives its name from the color of their socks. The original colors of their uniform socks were black and yellow stripes, evoking the image of a tiger. There is some dispute over who coined the name, though, the team’s first manager George Stallings or Detroit sportswriter Philip J. Reid.

Chicago Cubs (1902). Founded in 1876 as the White Stockings, this National League team was also known as the Colts and the Orphans (after Cap Anson was fired as manager in 1898). In 1902, the Chicago Daily News suggested that the team be called the Cubs because of the number of young players on the team. The name became official in 1907.

New York Yankees (1904). The greatest franchise in professional sports got its start as the American League Baltimore Orioles in 1901. They moved to New York in 1903, taking on the name Highlanders and later the Hilltoppers after the elevated location of their Manhattan ballpark. The press started dubbing them the Yankees in 1904. Allegedly, the shorter name made for easier headlines. All three names were used until 1913, when the Yankees name became official. (The name Yankees was applied to the Boston Red Stockings in 1875 by at least one sportswriter, but the name doesn’t seem to have stuck to that team.)

Boston Red Sox (1907). When the original Cincinnati Red Stockings team disbanded in 1871, many of the players headed to Boston and formed a new Red Stockings team. That name did not last long in Boston, however. When an American League team was formed in the city in 1901, it played under a variety of names in the first few years: the Pilgrims, Puritans, Plymouth Rocks, and Somersets. Finally in 1907, the stockings reference was revived and the shortened Red Sox applied to the new team.

Atlanta Braves (1912). Few franchises have played under as many names (or cities) as has the National League Braves. Harry Wright and three Cincinnati Red Stockings teammates founded the Boston Red Stockings in 1871 when the Cincinnati team reverted to amateur play. In 1876, the team changed its name to the Red Caps, before becoming the Beaneaters from 1883-1907, after Boston’s long dietary association with beans. In 1907, the Dovey brothers bought the team and dubbed them the Doves. The team played under the name Rustlers in 1911, before being sold again in 1912, to Tammany Hall politician and building contractor James E. Gaffney. One of Gaffney’s partners, John Montgomery Ward, suggested the name Braves, because members of the Tammany Society were often referred to as such—the Democratic Party machine in New York City was named after Tammany or Tammend, a Delaware chieftain. The Braves moved to Milwaukee in 1953 and then on to Atlanta in 1966. The Braves name has remained with the team in all three cities.

Cleveland Indians (1915). The franchise dates to the founding of the American League in 1901. Originally known as the Bluebirds or Blues, they became the Broncos in 1902. The following year they became the Naps, after Napoleon Lajoie, their star second baseman and later manager. When Lajoie left the team in 1915, owner Charles W. Somers sought out the opinions of several sportswriters regarding a new name, finally settling on Indians. There is a myth that the team ran a contest to determine the name and fans chose Indians in honor of Louis Sockalexis, a Penobscot Indian who had played for the old Cleveland Spiders and who had died in 1913, two years before. But this myth is untrue.

Kansas City Monarchs (1920). The Monarchs were a charter member of the Negro National League, playing in that league from 1920–31. It was an independent club from 1932–36, before becoming a charter member of the Negro American League in 1937. The team continued playing until 1959.

Baltimore Black Sox (1923). No relation to the infamous Chicago team of 1919, a number of Baltimore Negro league teams played under this name from 1923–34.

Kansas City Royals (1928). The International League franchise that played in Montreal from 1928–60 were the Royals. But the more famous team of this name is the American League expansion team that started playing in Kansas City, Missouri in 1969. The name was chosen from over 17,000 entries in a fan contest. Royals is a reference to the American Royal Association, which runs an annual livestock event, parade, rodeo, and barbecue in the city.

Pittsburgh Crawfords (1931). Named after the Crawford Bath House in a historically African-American section of Pittsburgh, the Crawfords were a Negro league team. Originally an independent club founded in 1931, they joined the Negro National League in 1933. They moved to the Negro American League and Toledo, Ohio in 1939, and then on to Indianapolis in 1940. The team folded the following year. The team was briefly revived back in Pittsburgh in 1945–46 as part of the Negro United States League. The 1935 Crawfords are widely considered the greatest Negro League team of all time, and by some the greatest baseball team, period. That team featured stars Cool Papa Bell, Oscar Charleston, Josh Gibson, Judy Johnson, and Satchel Paige.

San Diego Padres (1936). The Padres were originally a minor league team in the Pacific Coast League. The name evokes the history of Spanish Catholic missions that first brought European settlers to California. In 1969, when the National League expanded yet again, the team moved up to the big league. 

Toronto Blue Jays (1943). The National League Phillies vainly tried to change their name to the Blue Jays in 1943–44, but fans and the press never accepted the name, and they went back to being the Phillies for the 1945 season. In 1977, the American League expansion team in Toronto picked up the name. The team’s original owner was Labatt’s brewery, which named the team after its Labatt’s Blue beer and which used a blue jay in many of its ads. The Blue Jays are currently the only non-U.S., major league franchise.

Minnesota Twins (1961). This American League team had originally been the venerable Washington Senators before moving to Minneapolis-St. Paul in 1961. The name is from the toponymic sobriquet Twin Cities.

Los Angeles Angels of Anaheim (1961). This American League team originally played in Los Angeles, hence the name. They moved to new digs in nearby Anaheim in 1965, changing their name to the California Angels. In 1996, they formally adopted the name of their home city, becoming the Anaheim Angels. But they changed it again in 2005, taking on the mouthful that is their current name.

Houston Colt .45s (1962). From 1962–64, the National League expansion team in Houston was officially named the Colt .45s, after the firearm that “won the West.” The Colt Arms Company, which was not associated with the team, objected, as did many fans who didn’t like the violent association and simply called the team the Colts.

Houston Astros (1965). But in the early 60s, Vice President Lyndon Johnson brought NASA’s manned space program to Houston and the city became space central. In 1965 when the Houston team moved into its new indoor digs in the Astrodome, the team changed its name to the Astronauts, which was quickly clipped to Astros, and popularly even further to simply the ’Stros.

Montreal Expos (1969). The first non-U.S., major league team is named after the Montreal World’s Fair, Expo ’67. The team moved to Washington, D.C. in 2005, becoming the Nationals.

Seattle Pilots (1969). Before Microsoft, Amazon, and Starbucks, Seattle was most famous for its aerospace industry, most notably represented by Boeing. Hence the name Pilots, a name for a short-lived American League expansion team that played in Seattle for one year before bankruptcy resulted in it being sold and moved to Milwaukee to become the Brewers.

Texas Rangers (1971). In 1971 when the Washington Senators moved to Arlington, Texas, they changed their name to Rangers, after the famed, nineteenth-century, Texas, law-enforcement organization.

Seattle Mariners (1977). The American League Mariners were named as the result of a newspaper contest. The name is a reference to Seattle’s seafaring tradition.

Colorado Rockies (1993). This National League expansion team is named, obviously, for the mountain range. They play in Denver, but chose to go with Colorado to increase the regional appeal.

Miami Marlins (1993). A National League expansion team named for the game fish found in Florida waters, the team began life as the Florida Marlins, changing the name to Miami in 2011.

Arizona Diamondbacks (1998). Yet another National League expansion team, this one is named for the diamondback rattlesnake, common to the deserts of the American Southwest.

Tampa Bay Rays (1998). This American League team is named for the fish (also known as a manta ray) native to Florida waters. The team was originally the Devil Rays, from 1998–2007.


Source:

Dickson, Paul. The Dickson Baseball Dictionary, third edition. 2009.

Is Two Better Than One?

8 May 2018

My Facebook feed has filled with people posting about this Washington Post article reporting that “science” has purportedly shown that typing two spaces after a period is superior to typing just one. The number of spaces that should follow a period is one of those eternal topics of debate, with peevers and pedants on both sides assuredly proclaiming that their position is the correct one, but almost never with any evidence to show that they are, in fact, correct. So a study that definitively settled the question would be a welcome relief. The trouble is, the study in question does no such thing.

The study in question is “Are Two Spaces Better than One? The Effect of Spacing Following Periods and Commas During Reading” published last month in the journal Attention, Perception, and Psychophysics by Rebecca L. Johnson, Becky Bui, and Lindsay L. Schmitt. Unfortunately, the journal, like most academic publications, is behind a paywall, so most people don’t have access to it unless they want to fork over forty dollars. (One of the benefits of working for a major research university is that I have access to it through Texas A&M’s library.) The general public has to rely on what reporters say about it.

Now, when there is a mistake of this nature, the problem is most often the fault of either the reporter or the university public relations department, whose press release tries to make the study’s conclusion sound sexier than it really is. But in this case, the problem is with the researchers, whose conclusion does not follow from the data they present, and indeed, their experimental design precludes this study from making a definitive contribution to the question of whether or not two spaces are better than one. Blame should also go to the peer reviewers, because this paper should never have been published as it is currently written. Which is a shame, because the study does have other surprising and interesting things to say, even if it doesn’t answer the question that everyone seems to care about.

The experiment was conducted in two phases. In the first phase the participants were asked to type a short paragraph of five sentences (97 words). From the results, the participants were classified as either “one-spacers” or “two-spacers,” depending on the number of spaces they put at the end of each sentence. (All the participants used only one space following commas.) The second phase was an eye-tracking study. The participants were asked to silently read one practice and twenty test paragraphs of 71–166 words each. Each of the test paragraphs fell into one of four categories:

  • one space after both periods and commas;

  • one space after periods and two spaces after commas;

  • two spaces after periods and one space after commas;

  • two spaces after both periods and commas.

After reading each paragraph, the participants were tested on their reading comprehension and the researchers collected and analyzed data on reading speed and comprehension.
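To make the four categories concrete, here is a minimal sketch of how the same paragraph could be rendered under each spacing condition. It is not the researchers’ software. The function name respace, the CONDITIONS list, and the sample sentence are hypothetical, and the pattern treats every period followed by a space as a sentence end, which would mishandle abbreviations.

```python
import re
from itertools import product


def respace(text, after_period, after_comma):
    """Rewrite the run of spaces after each period and comma to a fixed width."""
    # Simplification: any period followed by spaces is treated as a sentence end.
    text = re.sub(r"\. +", "." + " " * after_period, text)
    text = re.sub(r", +", "," + " " * after_comma, text)
    return text


# The four conditions as (spaces after periods, spaces after commas).
CONDITIONS = list(product([1, 2], repeat=2))

sample = "First, we read. Then, we answer questions. Finally, we stop."
for after_period, after_comma in CONDITIONS:
    # repr() makes the doubled spaces visible in the printed output.
    print((after_period, after_comma), repr(respace(sample, after_period, after_comma)))
```

The order of CONDITIONS matches the order of the list above: one space after both, two after commas only, two after periods only, and two after both.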

Now here is the important part. The paragraphs were presented in 14-point Courier New font with quadruple spacing between lines. Courier New is a monospaced font, like that on a typewriter, where each letter takes up the same amount of horizontal space, e.g., an < i > takes up as much space as an < m >. Most word processor fonts, and all fonts used by professional publishers, are proportionally spaced, where the horizontal space used by each letter varies with the size of the letter, e.g., an < i > takes up less space than an < m >. The published article does not specify whether the paragraphs were left-aligned or justified, how long each line was, the color of the type and background, or the resolution the monitors were set at. All of these factors can affect readability, perhaps more dramatically than post-period spacing. So, the description of the experimental design is inadequate for replication.

These choices were made for the sake of the eye-tracking software, which works best with monospaced font and quadruple spacing, despite the fact that almost no document anywhere is actually written in monospaced font with quadruple spacing. At the core of the argument of those who advocate for single spacing following periods is the claim, unsubstantiated by any good evidence but almost universally accepted, that two spaces are better with monospaced fonts, like those of typewriters, but that one space is better with proportional fonts. If you want to test which spacing practice is superior, you must test using a proportional font and single line spacing, which is how the vast majority of real-world documents are published. Also, whether the paragraphs were left-aligned or justified might make a difference. Two spaces can result in very odd and disruptive spacing when the text is justified, and most professionally printed documents are indeed justified. The monitors used in the study were also of 2002 vintage, and hardly representative of the present-day, digital reading environment, much less that of printed works. In essence, the researchers were on a proverbial hunt for lost car keys under the street lamp because that’s where the light was best.

The researchers recognize this problem, but they hand wave it away with the unsubstantiated statement:

If the facilitation from two spaces is due in whole or in part to increasing the space relative to other spaces (e.g., to indicate not only the end of a word, but also the end of a sentence), then two spaces should facilitate reading even when text is presented in a proportional font where a single space is the same size regardless of whether it follows a punctuation mark or not.

But they present no evidence to support that belief, and in fact the evidence that the effect claimed by the study exists at all is equivocal, at best. A good study does not go places the data doesn’t.

What does the study actually show? First, there was no significant difference in reading comprehension for any of the four types of paragraph. In short, it appears that the number of spaces following a period or comma has no impact on reading comprehension. That’s a result that provides evidence that it may not really matter whether you use one or two spaces. It’s not definitive evidence, because the test is in no way similar to real-world reading conditions, but it is evidence nonetheless.

It’s with reading speed that the results get interesting. First, the study showed that for all readers, two spaces following commas significantly slowed reading speed. This isn’t terribly surprising or controversial—after all, no one advocates for two spaces following commas—but it does show that spacing can, at least in some circumstances, impact reading speed.

There was a small increase in reading speed, around 3%, for paragraphs that had two spaces after periods and one space after commas, but this effect only applied to the two-spacers. The one-spacers had no significant difference in reading speed between the four types of paragraphs. So it would seem that if you are accustomed to typing two spaces after periods, you read paragraphs typed in this manner a bit faster. But if you are accustomed to typing only one space after a period, then there is no difference. This is an odd result and deserves further looking into, although the effect size is quite small, and it may be that when the experiment is replicated this effect disappears.

The eye-tracking also showed that readers tended to dwell longer on the spots where there was only one space following a period, which indicates that readers’ brains take longer to process those characters, but this effect did not impact overall reading speed. The conclusion that seems most likely is that text with two spaces following periods is easier to read, but this effect, while measurable, is insignificant for any practical purpose.

The really interesting result, though, is that the two-spacers had an overall faster reading speed than the one-spacers, regardless of the type of paragraph. Again, that’s a really surprising result and deserves further investigation. At the least, the experiment should be independently replicated to see if this result appears again.

So what do we have? The study, as written, is flawed. Its primary conclusion is not supported by the results, and indeed the experimental design used cannot produce the data necessary to reach the conclusion that two spaces is better than one. Also, the description of the experimental design is missing critical parameters that are required for replication. But, nonetheless, it does produce some interesting results, even if those aren’t what it is touted to produce.

As far as I know, all the good evidence available, which includes some of the evidence in this study, indicates that there is no reason to believe that either one or two spaces after periods is superior in any measurable way. That is, it really makes no difference. Given that, the best practice is, therefore, to follow convention. And since, pretty much without exception, all professional typesetters and publishers use only one space after periods, that’s what you should do too, at least until someone comes up with a study that provides solid evidence to the contrary.

For further reading, I recommend Matthew Butterick’s article on this subject, which reaches pretty much the same conclusions that I did. His website is also a great resource for all things typographical.


Sources:

Butterick, Matthew. “Are Two Spaces Better Than One? A Response To New Research.” Butterick’s Practical Typography, 30 April 2018, https://practicaltypography.com/are-two-spaces-better-than-one.html.

Johnson, Rebecca L., et al. “Are Two Spaces Better than One? The Effect of Spacing Following Periods and Commas During Reading.” Attention, Perception, and Psychophysics, published online 24 April 2018.

Selk, Avi. “One space between each sentence, they said.    Science just proved them wrong.” Washington Post, 4 May 2018, https://www.washingtonpost.com/news/speaking-of-science/wp/2018/05/04/one-space-between-each-sentence-they-said-science-just-proved-them-wrong-2/.

The Oxford Comma and the Law

The legal dispute between the Oakhurst Dairy and its drivers has been settled. As widely reported in the media, the dispute hinged on the use, or omission, of the Oxford comma. But the media, or at least the New York Times, is still getting it wrong. The ambiguity in the law was never just about the Oxford comma. The court ruled that the law as a whole was badly worded and ambiguous and made its ruling based on the legislative intent of the law, not the punctuation.

The latest New York Times article says that because of the settlement we’ll never get a legal ruling on the Oxford comma, but again, that’s wrong. The court had resolved the ambiguity in the law in favor of the drivers, and the ongoing proceedings were to determine the facts of the case and what damages, if any, were to be awarded the drivers. The settlement puts an end to that process.

The story, in all its grammatical detail, as I wrote it on 17 March 2017:

The Oxford comma was in the news recently when a federal court interpreted a Maine statute regarding overtime pay for dairy truck drivers. In the case of O’Connor, et al. v. Oakhurst Dairy, the lack of a comma, or so the news stories would have it, resulted in a victory for workers’ rights. The Oxford comma (serial comma) is the comma after the penultimate item in a list, as in me, myself, and I; the Oxford comma is the one after myself.

The problem with the news reporting on this case is that the ambiguity does not rest solely with the lack of a comma. And, more importantly, the decision of the circuit court did not rest on the punctuation but rather relied on other methods to interpret the statute in question.

First, let’s get this out of the way: the Oxford comma is a style choice. It’s not a hard rule of punctuation. Using it is optional. Those in favor of using it often argue that its use removes ambiguity, but that’s not necessarily the case. Its use can create ambiguity just as often as its omission. For every instance of ambiguity created by its omission, as in I’d like to thank my godparents, Jane Doe and John Smith (is the speaker thanking two people—Jane and John are the godparents—or four?), there is an instance created by its use, like I’d like to thank Jane Doe, my aunt, and John Smith (again, is it two people—Jane Doe is the aunt—or three?). Both consistent use and consistent non-use of the Oxford comma will result in ambiguity at some point, and when faced with such ambiguity one must either be inconsistent, adding or deleting it as appropriate, or rephrase the sentence.

But back to the case at hand.

Maine law, 26 M.R.S.A. § 664(3), requires that workers be paid time and a half for work in excess of forty hours per week. But that law has several exceptions, one of them, subsection (F), being that the overtime rule doesn’t apply to:

The canning, processing, preserving, freezing, drying, marketing, storing, packing for shipment or distribution of:
(1) Agricultural produce;
(2) Meat and fish products; and
(3) Perishable foods.

Oakhurst Dairy in Portland, Maine did not pay its delivery drivers overtime, and the drivers sued for the overtime wages they felt they were entitled to under the law. The lack of a comma after shipment created ambiguity—are packing for shipment and distribution one or two distinct activities? If they are one activity—the packing in preparation for delivery—then the drivers, who only deliver and do not pack, are owed overtime because the actual delivery is not exempt from the overtime rule. If they are two activities, so that the delivery itself (i.e., the distribution) is exempt, then the drivers are not owed the money.

The Maine Legislative Drafting Manual dictates that statutes not use the Oxford comma and advises that the sentence should be rephrased if ambiguity results from its omission. And the practice in Maine follows this guideline; Oxford commas are just not found in its laws. This preferred style practice would lead one to conclude that packing for shipment and distribution are two distinct activities, both exempt from the overtime requirement. This interpretation is reinforced by the use of the synonyms shipment and distribution. If they are not distinct activities then the use of both is redundant.

Furthermore, there is no conjunction preceding packing. In such lists we normally expect a conjunction before the last item. If the law intended shipment and distribution to be a single activity, it would have read ...storing or packing for shipment or distribution.

And indeed, the federal district court for Maine followed this logic and interpreted shipment and distribution to be distinct and denied the drivers’ suit.

But it’s not that simple. The drivers appealed and argued before the circuit court that shipment and distribution are not redundant, that shipment refers to use of third-party carriers, while distribution refers to delivery by the employer’s own drivers. To support this distinction, the drivers cited dictionary definitions, Oakhurst Dairy’s usage in its in-house communications, and other Maine statutes that treat shipment and distribution as distinct, non-redundant activities.

The drivers also argued that since all the other activities in the list, aside from shipment and distribution, are gerunds (e.g., canning, processing, preserving), the grammatical rule of parallel construction would indicate that shipment or distribution is a modifier of packing and that the two are not themselves activities subject to the exemption. If they were intended to be distinct activities, the law would read ...storing, packing for shipment or distributing, rather than distribution.

And as for the lack of a conjunction before packing, the drivers point to the use of asyndeton, the deliberate omission of a conjunction from a list, as in I came, I saw, I conquered. But the use of asyndeton in law, and in Maine statute in particular, is rare, so the circuit court concluded that this particular rejoinder was not very convincing.

After all this, the circuit court concluded that any plain reading of the text was ambiguous, and that methods other than grammar had to be used to determine its meaning. Instead, the court relied on the general principle in interpreting Maine law that ambiguous provisions “should be liberally construed to further the beneficent purposes for which they are enacted.” And since the overtime rule was enacted to further the health and well-being of workers, the circuit court ruled in the drivers’ favor and reversed the district court’s ruling.

Now that the decision has been made that the drivers are not exempt from the overtime law, the case goes back to the district court for trial to determine if they did in fact work the overtime hours, and, presumably, that any such overtime did not involve packing. Or, of course, it could be settled out of court.

This case is far from the first where the comma has played a role in how a statute is interpreted. In 1872, the Tariff Act of 1870 was revised with devastating consequences for U.S. federal tax revenue. The original act had exempted from tariffs fruit plants, tropical and semi-tropical for the purpose of propagation or cultivation. But the 1872 revision inserted a comma between fruit and plants. In the original act, the only things exempt from tariffs were the fruit plants themselves. The fruit itself and other types of plants were subject to taxes. The addition of the comma, reading fruit, plants, tropical and semi-tropical, meant that all fruit as well as all plants were exempt from the tariff.

In 1989, in the case of United States v. Ron Pair Enterprises, Inc., the U.S. Supreme Court ruled that the comma did have interpretive weight. At question was the meaning of § 506(b) of Chapter 11 of the 1978 Bankruptcy Code. That law reads:

...there shall be allowed to the holder of such claim, interest on such claim, and any reasonable fees, costs, or charges provided for under the agreement under which such claim arose.

The court ruled that the comma after interest on such claim meant that the interest did not have to be part of the original agreement and that the holder of the claim could receive interest that accrued after the bankruptcy filing. The interest stands in contrast to reasonable fees, costs, or charges, which are only due if they arose as part of the original agreement prior to bankruptcy.

And, of course, the commas in the Second Amendment to the U.S. Constitution have been the subject of endless controversy—which I don’t care to re-litigate here. That amendment reads:

A well regulated Militia, being necessary to the security of a free State, the right of the people to keep and bear Arms, shall not be infringed.

But these other cases do not involve the Oxford comma. To my knowledge, O’Connor, et al. v. Oakhurst Dairy is the first legal case that involves this particular use of the comma, and the courts ruled here that the punctuation is not dispositive. In other words, the Oxford comma cannot, at least in and of itself, be taken as determinative of meaning. In this, the news reports have been getting it wrong—whether or not one uses the Oxford comma just doesn’t matter all that much, in law or anywhere else.