A New Type of Turing Test

25 July 2016

In 1950, computer pioneer Alan Turing formulated his famous test for determining whether or not a computer exhibited true artificial intelligence (AI). It involved discourse between humans and a computer, and if the humans could not tell whether they were speaking to another person or to a machine, then the machine was “intelligent.” A neat idea, but when put into practice it’s been found to be too easy to fake.

Over the years various improvements to the Turing test have been suggested, and one recent AI challenge used a rather nifty linguistic approach, outlined by this article in the Neurologica blog. At its core, the test, known as the Winograd schema, asks the AI to determine the referent of a pronoun in a sentence. The pronoun would be ambiguous except for one word that provides the necessary context. For example:

The trophy would not fit in the brown suitcase because it was too big.

What does it refer to, the trophy or the suitcase?

In the sentence, big can be replaced with small, which alters the context and the identity of the referent. Humans have no difficulty getting the correct answer (it refers to the trophy when the adjective is big and the suitcase when the adjective is small), but in the challenge the AI performed dismally, with even the best scores no better than chance guessing.

While I suspect that there are probably as many issues with the Winograd schema as there are with the original Turing test, it’s a neat use of language to test reasoning ability.

Spelling Reform

22 July 2016

Anatoly Liberman is one of the leading etymologists out there, author of Word Origins and How We Know Them and the Analytical Dictionary of English Etymology. I did not know until recently, however, that he is also an advocate of English spelling reform. Linguist John McWhorter recently interviewed him regarding that subject for Slate’s Lexicon Valley podcast.

Now there is no denying that English spelling is a mess. There is an s in island for no phonological or etymological reason. Whole and hole are pronounced the same but spelled differently, and even the most skilled writers occasionally slip when it comes to lead/led and principle/principal. It would be nice if we could fix English spelling, but is such a project possible? And even if we could reform it, would it be worth the effort? The answer, to my mind, is no.

Liberman, however, thinks the opposite. He contends that an effort, if it is modest in its goals, has a chance to make a real difference. He points to several past efforts at spelling reform that have been successful: the American spelling reform led by Noah Webster in the early nineteenth century, the reform of Icelandic spelling later in that century, the post-revolution reform of Russian spelling, and the 1990s spelling reform in German-speaking countries. But only the last of these supports his position, and then only weakly. The American and Russian reforms came in the wake of political revolution and a deep-seated desire to split from the old regime. (Plus, in the case of the Soviet Union, the power of the totalitarian state was invoked to enforce the changes.) Furthermore, the nineteenth-century American reforms took place in a much smaller nation, in both population and area, and they were only partially successful; most of Webster’s proposed reforms never caught on. As for Iceland, one cannot compare that language with English. What can be accomplished with a language spoken by a small, homogeneous population has no bearing on what can be accomplished with a global language like English. None of these situations obtain with the English language today.

On the face of it, though, the German reform would seem to provide a nice model. It was an international effort, including Germany, Austria, Liechtenstein, and Switzerland. Despite some pushback and grumbling, the German reform has largely been successful, but its goals were quite modest. The reforms standardized the use of doubled and tripled consonants and the capitalization of nouns, and split some compound words by inserting a space. Furthermore, it was a multinational, government-led effort, not a grass-roots project such as Liberman is advocating. As with the Russian reforms, you can get a lot done with the power of the government behind you. Liberman takes this as evidence that a modest push for reform can fix some of the most egregious problems with English spelling. But even if he is right, and we can corral the leading English-language publishers, dictionaries, educators, and other authorities around the world and get them to agree on a handful of sensible changes, would it make any discernible difference? English spelling is in such a state that it wouldn’t.

The German reform focused on a few systemic problems. Basically, it set a standard in a few areas where there was none. The rest of German spelling is, and always has been, rather regular and predictable; even the inconsistencies were systematic (very German, that). That’s not the situation with English, where there is little rhyme or reason to the inconsistencies. If one focused on a few fixes, there would be scores of others, just as important, left unattended. Then there is the problem of dialect. What spelling standard do we use? British? American? Do we split the difference and go Canadian? What about India, where there are more English speakers than in Britain, Canada, Australia, and New Zealand combined? And let’s not even mention the myriad small, but vibrant and highly idiosyncratic, communities of English in places like Singapore.

And while we would be focusing on fixing a few problems, more would be arising. The language would continue to change. We would be borrowing more words from other languages, with spellings that defy English orthographic conventions.

English spelling is in the mess that it is in for four main reasons.

  • The first is historical. English spelling was standardized with the introduction of printing, but this was also a time when pronunciation was rapidly changing, the so-called Great Vowel Shift. As a result, some words were standardized using the old pronunciation and some the new (hence quirks like similar spellings but different pronunciations for words like police and policy).

  • Another problem is that we don’t have enough letters to represent all the sounds, especially vowels. There are some twenty vowels in British English (slightly fewer in American English), but only five letters that are used to represent them (six if you count Y). The resulting doubling and tripling up of phonemes to letters inevitably leads to problems.

  • Pronunciation changes faster than spelling. For the most part, English spelling was standardized around the London dialect of the sixteenth through eighteenth centuries. But most English speakers, including modern Londoners, use dialects that don’t pronounce the words in that manner. So global English has multiple pronunciations for the same word, many of which don’t match the spelling.

  • English is a great borrower of words. When it Anglicizes a word, the language usually retains the foreign spelling, which often uses a different scheme for matching pronunciation to spelling.

Finally, this may be the worst time to attempt to standardize and fix English spelling. The primary reason that English spelling is a mess is that the standardization came with printing: the new medium called for a standard. And just as when English spelling was first standardized during the shift from manuscripts to printing, we’re in the midst of another great media revolution, the shift from print to digital. Who knows what the fallout of that will be? If we were to change the rules now, it’s just as likely that the new spellings would be quickly outmoded.

The business world recognizes the concept of switching cost. Often, old and inefficient technologies and products continue to dominate the market because it’s simply too costly to switch to an alternative. The QWERTY keyboard is difficult to learn and slow to type on, but since everyone knows how to use it, more efficient competitors can’t get a purchase on the market. Similarly, a competitor to Facebook is unlikely to appear because so many people already use that social media platform. Google Plus is a superior product, but no one uses it because everyone you want to talk to is on Facebook. The same is true with spelling reform. It would be too costly to implement. Liberman is probably right that a handful of modest reforms could be implemented successfully if we tried hard enough, but those few modest reforms won’t make a dent in the problem.

Man vs. Marine

30 June 2016

The Washington Post reports that the U.S. Marine Corps is eliminating the word man from nineteen of its job titles. An infantryman will now be called an infantry marine, and what was once a field artillery man is now a field artillery marine. Some job titles are retaining the man, however. A marine can still be a rifleman. (How a rifleman differs from an infantry marine I don’t know. Perhaps someone with experience in the Corps can enlighten us.) And manpower officers and marksmanship instructors keep their existing titles.

The move is a result of combat arms positions becoming open to women and is in line with similar shifts in civilian nomenclature that happened decades ago, like policeman to police officer and fireman to firefighter. The changes are quite sensible and in a reasonable world would be uncontroversial. Even the retention of man in some job titles generally follows a logic: it is retained where it is part of a larger term with no clear gender-neutral replacement (e.g., marksmanship, unmanned) or in places where man is used to refer to staffing (e.g., manpower). Rifleman remains the anomaly. Perhaps it’s being retained for historical and cultural reasons: the identity of the rifleman is so central to the Corps’ vision of itself that it would be anathema to change the word. Or perhaps it was a bureaucratic sop thrown to those on the committee who resisted the changes.

The move is not without its detractors, though. The Post article includes the usual complaints about political correctness, but I haven’t seen any reasoned responses against the move. They all seem to be knee-jerk reactions against change. And if anything, by replacing man with marine, the Corps is further strengthening its aura of being a breed apart. You’re not just a man, you’re a marine.

Woody Words

25 June 2016

This classic popped up on my Facebook feed today:


Wrong.

11 June 2016

Journalists love to write articles on language. Since they make their livings with words, they have a professional interest in the topic, and language is also a popular subject. People, at least those who read newspapers, love to read about it. The problem is that journalists often get it completely wrong.

A case in point is an article by Dan Bilefsky that appeared on the front page of the New York Times on 9 June about how use of the period, that staid and boring punctuation mark, is changing. In some forms of discourse, the period does not simply mark the end of a sentence; it conveys urgency or emotion. Bilefsky gets the facts right, but he utterly mischaracterizes what is happening, framing the period as “going out of style” and “being felled.” Nothing could be further from the truth.

What is actually happening is that in short, electronic forms of communication, such as texts and tweets, the period is not really needed to mark the end of the sentence, much as it isn’t needed on street signs (“Stop” not “Stop.” Or “Exit” not “Exit.”) or in newspaper headlines. Since the period isn’t needed to signal the end of a complete thought, it is available for other purposes, and that’s how texters and tweeters have used it. In short digital messages, the period can convey that the writer is not happy about the statement that was just made. So if you arrange to meet a friend at Starbucks and she replies “OK” that signals agreement. If she replies “OK.” you had better find a locally owned, fair-trade coffee shop in which to meet; she is coming, but she’s not happy about it. This type of change is a natural, and useful, adaptation to changing conditions.

But the period is not disappearing from standard prose. While the linguists that Bilefsky quotes (David Crystal and Geoffrey Nunberg) take pains to note that this shift in orthographic convention is restricted to short, electronic messages, Bilefsky frames it as occurring in all forms of prose, even going so far as to write his entire article without using any sentence-ending periods—a cute device, but not at all an illustration of the phenomenon. Crystal even went so far as to pen a blog post pointing out that Bilefsky misunderstood what he was saying.

Bilefsky is not only wrong, he’s late to the party. He was scooped by his own paper. Jessica Bennett wrote a much more accurate piece on the changing roles of punctuation marks in digital communications in the Times over a year ago. Ben Crair had a piece on the changing role of the period in the New Republic back in 2012. (Mark Liberman wrote a Language Log post about the comments to Crair’s piece that is well worth a read.) A week or two before Crair’s piece, PhD Comics took the subject on. All of these other articles got the subject essentially right.

So late and wrong. I think we can expect better from the New York Times.