Typos and Digital Publishing

20 July 2011

There’s a recent opinion piece in the New York Times Online about how digital publishing has created a boom in typos and bad spelling making their way through to appear in the final versions of books and other publications. It’s interesting, but I don’t buy the arguments put forth by Virginia Heffernan, the article’s author. The cause isn’t digital technology, it’s corporate economics.

As Ms. Heffernan points out, there have always been bad spellers, and the ability to spell correctly does not correlate with excellence in writing. She rightly gives the example of F. Scott Fitzgerald, who was a notoriously incompetent speller. There has always been pressure to publish quickly. What has changed? Ms. Heffernan says:

Before digital technology unsettled both the economics and the routines of book publishing, they explained, most publishers employed battalions of full-time copy editors and proofreaders to filter out an author’s mistakes. Now, they are gone.

To which I reply, post hoc ergo propter hoc. Digital technology did not fire the copy editors, management did. The cause is the surge in Wall Street mergers and acquisitions that began in the 1980s. Publishing was always a low-profit enterprise, with expected returns of 5–8%. But once assembled into huge media conglomerates, book divisions had to compete with more profitable divisions and owners demanded returns of 20–25%. To accomplish this, publishers churned out more product and cut overhead—all those copy editors and their princely salaries.

(The same economics is killing newspapers. Yes, the decline in ad revenue is unsustainable in the long term, but for the moment, most newspapers are still highly profitable. They’re cash machines. What is killing newspapers is the enormous debt they have accumulated in acquiring other newspapers and becoming media conglomerates.)

Ms. Heffernan also blames “writerly inattention.” Manuscripts are longer and more carelessly assembled on the word processor, says Ms. Heffernan. I find this hard to believe. The same economic incentives that kept published books to a certain length in the typewriter era still obtain, and successful online writers know that brevity is important to retaining a reader’s attention, perhaps more so than in print. And I can’t believe that manuscripts were better organized and structured in the days of the typewriter. Word processors allow a writer to edit and structure a text much more easily and consistently than is ever possible with a typewriter. Although it may be true that the apparent ease with which authorial editing can be done electronically encourages sloppy writers who would have been daunted by the prospects of doing it in the typewriter era. So editors may be seeing more terribly constructed manuscripts in their slush piles.

There is one area where digital technology does make a difference in spelling, but Ms. Heffernan doesn’t touch on it. That is poorly scanned e-books. Amazon and Apple’s iStore are filled with cheap editions of public domain books that have been hastily scanned and converted to text with optical character recognition software. While OCR software has gotten pretty good, its error rate is still considerable, and any OCR’d text needs a thorough proofreading before it is worthy of publication. But again, economics steps in, and the low prices these e-books command, typically around one dollar, don’t make it feasible to hire proofreaders. As a result, these texts are rife with errors, some to the point that they are unreadable. Even here, it is economics and not technology that is creating the problem.

(Hat Tip: Barbara Need)

Anachronistic Scientists

18 July 2011

In a post that makes some nice points about semantic change, Pete Langman, writing in the Guardian’s “Mind Your Language” blog, makes a whopping error:

Old words evolve, too, by stepping out of the dictionary and back into oral culture. Johnson’s use of the word “science” perfectly illustrates his point. “Science” now means a specific mode of inquiry; indeed, it presents a certain type of knowledge guaranteed, so to speak, by the method that underpins it. It was first used in the modern sense in 1834. But when Johnson used it, he meant scientia, or knowledge in the broader sense. The use of the word “scientist” to describe anyone before 1834 is not only anachronistic, but erroneous.

Now the main point of this paragraph is absolutely correct and worth saying. We have to be careful when applying current definitions to works written in the past. But the example is problematic, and the final sentence is ludicrous.

First, Langman conflates the words science and scientist. It is scientist that is first attested in 1834, not science. Second, the invention of the modern concept of science cannot be pinned down to any particular date. The OED entry for science is problematic; it’s one of those entries that has been haphazardly updated over the decades and needs a thorough scrubbing, and you can’t tell when the modern sense of the word emerges. I’ll define the modern concept using Merriam-Webster: “knowledge or a system of knowledge covering general truths or the operation of general laws especially as obtained and tested through scientific method [i.e., systematic use of empirical and controlled observation].” There are plenty of examples of excellent science from before 1834. The names Galileo, Kepler, Newton, Jenner, and Davy spring immediately to mind. The Royal Society was founded in 1660. I can even point to flashes of scientific methodology by the seventh/eighth-century Bede and the tenth/eleventh-century Ælfric. Now it is true that most of the modern institutions of science-as-we-know-it-today (e.g., journals, professional/university laboratories) did not come into being until the first half of the nineteenth century, but that’s a historical or sociological issue, not a linguistic one, and it does not mean that there weren’t earlier examples of good science.

Langman seems to be saying that we shouldn’t call anyone who worked before 1834 a scientist because the word didn’t exist until then. He isn’t the first to make this claim. I’ve heard others make it, but it is just patently absurd. We can certainly apply words anachronistically. Just because Humphry Davy (d. 1829) didn’t have the convenient label to denote his profession doesn’t mean that we can’t look back today and describe him as a superb scientist. Are we to say that there is no such thing as the Middle Ages because the term wasn’t coined until 1605, and no one living in the period would have described it as such? Now, people will point to the fact that Newton toyed with mysticism and alchemy in addition to his scientific pursuits, but Linus Pauling (1901–94), who is one of only two people to win Nobel Prizes in two different fields (the other is Marie Curie), also engaged in crank medical research involving vitamins. Great scientists are not immune from bad ideas and don’t always apply the scientific method to everything they do.

Now Langman is correct that we should be careful when applying terms like science and scientist to pre-modern eras. I mentioned Bede and Ælfric; now while I can point to examples of them using the empirical method in their work, I would by no stretch of the imagination label these medieval monks as scientists or describe their general approach to discovery as scientific. Which is why I wrote “flashes of scientific methodology” and not “science.” Use of modern senses of such words is indeed anachronistic, but that does not mean they are “erroneous.”

[Edit: Upon rereading Mr. Langman’s post and engaging in a lengthy discussion with him on the forums here, I realize my initial assessment of the piece is a bit unfair. So I’ve changed the opening paragraph above. — dw]

Vocabulary Test

17 July 2011

This site has a rather fun and quick vocab test that purports to give you an estimate of your total vocabulary size. I can’t vouch for the accuracy of their vocabulary estimates, but even if not accurate, it’s fun.

It’s supposedly part of a linguistic research project, and your participation will help some linguists somewhere, although the site is vague about who is doing the research.

My estimated vocabulary is 34,400, which evidently is very respectable, near the top of the range within which most adult native speakers fall. I was surprised by the number of words I recognized but couldn’t define. Context matters. Seeing them in a paragraph is a very different experience from seeing them in a list of words devoid of context. There was one word that I don’t think I’ve ever actually seen, but I knew the meaning right away: uxoricide.

[Tip o’ the hat to Languagehat.]

The iPad as a Grad Student Tool

15 July 2011

It appears we’re on the cusp of tablet devices really breaking out and becoming truly ubiquitous. In past weeks I have fielded numerous inquiries from fellow grad students and professors who are considering whether or not to get a tablet. So, I’m going to try and capture my thoughts on the subject all in one place.

First, my experience is with the iPad. There are other devices like Kobo and Kindle. I haven’t really used these others, so I’m not going to do a comparison, and what I say about the iPad may or may not apply to other devices. I have the original iPad, wireless only. I decided not to plunk down the cash for a 3G model and plan, and I haven’t missed it. But then, between home and campus, I’m rarely out of reach of a wireless network, and if I really need one there is always a coffee shop nearby. Those in non-urban environments may find no-3G limiting, though. Since getting my iPad, I’ve pretty much stopped using my iPhone for anything other than as a phone, music/podcast player, and for Google Maps. (Both music and maps are more handy on the smaller device.)

I’ve had my iPad for some nine months now, so I’ve got considerable experience in using it and have settled into methods and routines that work for me. I haven’t exploited the full potential of the device and my comments are based on my idiosyncratic use. Finally, I have no personal stake in or affiliation with Apple. While I am generally impressed with the quality of the company’s products, I’m not an Apple fan boy, and even when living in San Francisco I never had any desire to attend Macworld, and my PCs have always been Windows-based. So I believe I’m objective on that score.

My bottom line assessment is that every grad student in the humanities needs a tablet. Keep that in mind when I make a negative comment or outline the device’s limitations. They are great, but an iPad is not a complete replacement for a laptop.

What the iPad is really good at:

  • Reading. Untethering from the PC and reading on a tablet is simply wonderful. The iPad is easy on the eyes and can be read in any indoor lighting conditions. And the ability to take and especially search digital notes is priceless. More on this below.

  • Web surfing.

  • Reading Email.

  • Social media apps, like Facebook and Twitter.

  • Video. While the iPad doesn’t do Flash video, it can play other formats. The resolution is superb, and it’s much more relaxing to watch on the tablet than it is sitting at a computer. A television is still the best device for video, though.

  • Storing massive amounts of data. The storage capabilities are truly impressive. I use only a small fraction of the available storage space. In fact, I can’t imagine that I’ll ever have to cull files to make room for more, unless I start using it as my preferred video device over my television. But I don’t have any music or many movie files on my iPad. If I did, storage would be tighter.

  • Inexpensive software. Say what you will about them, but iPad apps are very reasonably priced. A $5 or $10 iPad app would cost $70 on a PC, and most are under $5, if not free.

Functions I’m on the fence about:

  • No Flash movies or animation. This is always brought up as the big iPad drawback, but frankly I don’t miss it all that much. (I only mention it because everyone else does.) It’s a minor annoyance to find a website with a video that won’t play, but then I realize that I should be working and not watching videos.

  • Presentations. My assessment here is limited because I haven’t used the presentation apps much, and it may move up to “really good” as I use it more. (Back when I worked in Silicon Valley, I would have worked this function to death. That’s one difference between corporate and academic culture.) I’ve used this function exactly once, and, after a bit of fumbling because I’d never used the prezo apps before, it worked fine. The iPad requires a special dongle for the video interface, another $30.

  • Games and distractions. They’re fun, but it’s too easy to slip from reading journal articles to playing solitaire or Scrabble.

  • Lack of wireless syncing. Having to physically plug into your PC to sync is a pain. But there is probably a good security reason for not allowing wireless sync. (I can’t imagine any other reason why they wouldn’t have this.)

What the iPad sucks at:

  • Reading critical editions of literary works. More below.

  • Finding the right edition of ebooks to buy. More below.

  • Reliance on iTunes as PC interface. It may work better on a Mac, but iTunes for Windows is the absolute suckiest piece of software ever put out by a major firm. It’s a train wreck. More below.

  • Multi-tasking. Switching between apps is slow and cumbersome. Other than the basic apps (e.g., the clock/timer, music player), you can’t run apps in the background. Switching from reading a book to look something up on the web is a pain.

  • Word processing. More on this below.

  • Uploading docs to the device. It may be easier with a Mac, but it’s a pain on a PC.

  • Sharing docs between apps. It simply can’t be done. You load documents into individual applications and there they stay. Edit a doc in your word processor and the new version is not available in your reading app.

Reading
The iPad is a fantastic reading device. I’ve read all of Moby Dick, Middlemarch, Moll Flanders, Gulliver’s Travels, and a mess of Shakespeare and other works on the iPad, and it is much better than bulky books. The screen is easy on the eyes and you can use it all day without strain. For those who don’t like “reading on a computer,” the experience is entirely different. It’s much more like a book than a screen. In addition, it is ideal for journal articles, which are usually available in PDF format. Now all your reading is available on one easily portable device. The reader apps all allow you to highlight and make notes, and you can search your notes to quickly find that passage later on, a killer function that makes e-reading far superior to print, where you can spend hours searching for that annotation that you know you made somewhere. That’s something you can’t do with penciled notes in a margin of a physical book. The iPad is not, however, great in direct sunlight; there is too much glare. But while outdoor reading isn’t optimal, I’ve yet to find an indoor environment where the device doesn’t work like a champ.

But not all is perfect. While the iPad is great for general reading, it has its limits.

The reader apps are simply not up to handling critical editions of works. Because reader apps like iBooks and Kindle continually reformat the pages, you don’t have ready access to textual apparatuses, marginal glosses, or notes. Citing page numbers from e-editions is also a problem; they’re never the same. (Although I’ve noted that some Penguin e-editions also display the print page a passage corresponds to. But you still can’t tell where the page breaks in the print edition are, so finding the precise location of a particular passage is still problematic, although you can get it down to within a page.) For instance, I plunked down the $13 for Jill Mann’s Penguin edition of The Canterbury Tales, even though I own the print edition, because I thought it would be handy to have on the device. It was a mistake. The marginal glosses were converted to notes, turning the text into an unreadable alphanumeric soup. Another factor is cost and obsolescence. Critical editions are expensive, and for ones that I really care about I don’t want to have to repurchase every few years. A print edition will still be readable in a few years, but you can’t say that for an e-book. Now if academic publishers would offer free e-editions to those who pay full price for the print version, it would be the best of both worlds.

There are some experiments to produce stand-alone apps for critical editions. As an experiment, I paid the $14 for the iPad app for T. S. Eliot’s The Waste Land to see if it handles the critical functions better. (It’s not medieval, but it’s on my comps reading list, so it is useful to me.) The app is fantastic. The software provides ready access to annotations, and you can flip back and forth between the edited version and images of Eliot’s typed manuscript with Pound’s comments scribbled on it. The app also includes videos of Fiona Shaw reading the poem and critical commentary by Seamus Heaney and others. It works great, but it still has portability (between devices) and obsolescence problems.

The iPad also cannot display PDFs whose page images use the JPEG2000 format. This is not a problem with journal articles, which are generally text based and don’t use images of the pages, but many Google Books files are actually PDFs of scanned page images in this format, not text. The iPad handles normal JPEG images just fine, but the JPEG2000 scans are unreadable on the device. To read these, you need to convert them, and this requires the full version of Acrobat, hefty computing power, and a lot of time. (My PC, which has a lot of horsepower, crashed trying to convert some books.)

Another problem for those who are serious about their literature is that the Amazon and Apple store fronts are terrible. It is essentially impossible to tell what edition the e-edition you are acquiring is based on. This is not a problem with free versions because you lose nothing by downloading, but if you’re paying money, even if it’s only $0.99, the cost can add up. Many of the free editions are based on poorly scanned nineteenth-century editions and are rife with errors, some so much so that they are essentially unreadable. Even my Penguin e-edition of Moll Flanders had the consistent error of goal for gaol (although that’s a spell-check/editorial problem, not scanning). The store fronts simply don’t give the necessary information about the edition being proffered, and what info they do give is often wrong. Many of the links from the Kindle version to the hardcopy versions on the Amazon.com site link to different editions. (The problem is the Silicon Valley mentality where engineers think they know it all: “Who needs librarians or people who study how books are made and used? I can read, therefore I can design a site/application for readers.”)

As far as reader apps go, both the Kindle and iBooks apps are free and generally excellent (aside from the critical edition problems). Functionality is essentially the same on both, so there’s no reason not to have both. Buying e-books and downloading them to the device is a breeze. I would also recommend the Goodreader app. It’s cheap and great for PDFs and generic docs. One drawback is that it’s a pain to move docs from your PC to the iPad. Downloading directly is easy, but uploading from a PC is clumsy and time consuming. You can drag and drop from a directory to iBooks, but you can’t to Goodreader, which is generally a better app for PDFs. And the iPad architecture doesn’t allow apps to share documents. You have to load them separately for each app.

Word Processing
This is the main reason the iPad is not a replacement for a PC. The iPad is fine for typing quick notes and annotations, but I wouldn’t want to use it for any serious word processing. The virtual keyboard is very limited and takes up screen space that really should belong to the document. I also find that I need multiple windows for moving between my text and source documents and web pages when I’m writing, and you simply can’t do this with the iPad. And a proper desk and chair are really required for writing; setting up an ergonomic workstation around an iPad is difficult. (There is an external keyboard for the iPad, but I haven’t tried it. The extra device is also one more thing to lug around and makes the iPad as cumbersome as a full laptop.) All of which leads you to the conclusion that you need a real PC for any serious writing.

iTunes
Finally, iTunes. You use iTunes for managing files on your iPad and iPhone. Frankly, this is the worst piece of crap ever produced by a major computer company. Its Windows implementation is unbearably slow, taking 2–3 minutes to load and up to 30 seconds to register key strokes or mouse clicks. It’s a bandwidth hog, slowing down your computer’s internet access to the point where it is impossible to work with iTunes running in the background. I have seriously thought of abandoning my iPhone and iPad simply because this piece of software is so bad. I suspect the experience is different for Mac users, but if you use a Windows PC, think long and hard about going the iPad route simply because this essential piece of software is such a piece of crap.

Recommended Apps
I’m not going to list a wide spectrum of apps I like, but only the ones that I find relevant to my academic work. Prices are in U. S. dollars.

  • Goodreader. Indispensable for reading PDFs and other documents. $4.99.

  • Kindle. Amazon’s reading software. Free. (You pay for the books, of course.)

  • iBooks. Apple’s reading software. Free. (Not including books.)

  • Pages. Apple’s word processor for the iPad. I said the iPad is not ideal for word processing, but having the capability when you need it is worth ten bucks. $9.99.

  • Numbers. Apple’s spreadsheet for iPad. $9.99.

  • Keynote. Apple’s presentation software for iPad. I haven’t used it much, but I imagine once I start lecturing or delivering conference presentations, this will be more useful. $9.99.

  • 2Screens. An independent presentation app that is more flexible than Keynote, allowing you to easily play video, show websites, and other material you haven’t placed in a pre-prepared presentation. $4.99.

  • Lexidium. The ultimate Latin dictionary. It includes the full text of Lewis and Short and can parse words using Whitaker and Parsimonious. If you do Latin, you need this. $3.99, and the Parsimonious plug-in is another $3.99.

  • Old English Dictionary. Based on Bosworth-Toller, this isn’t nearly as good as its Latin cousin, but it is handy. $1.99.

British Library to Acquire Oldest European Book

15 July 2011

Well, sort of. They already have it on long-term loan from Durham Cathedral, which is the owner. If you’ve got a couple million pounds to spare, you can help the British Library complete the acquisition.

The book is St. Cuthbert’s gospel, a late-seventh-century copy of the Gospel of St. John in Latin. It’s believed to be the oldest European book that survives intact. Cuthbert, the bishop of Lindisfarne, is an enormously important figure in early English history.

The British Library press release and accompanying materials are a good example of how to produce a good press kit. Click on the video link to see the inside of the book.

Hat tip to the Medieval and Earlier Manuscripts Blog