Classifying Human Knowledge, Part 1

1 September 2006

I’ve spent the last week organizing my library, a task that, surprisingly, has turned out to be quite interesting. In an effort to find a classification scheme that works for me, I’ve been looking at and learning about the various systems in use in libraries around the world.

The most famous is perhaps the Dewey Decimal System. Invented by Melvil Dewey in 1876, it is the most widely used library classification in the United States, used primarily by public and primary school libraries. The DDS divides all human knowledge into ten major divisions. Each of these has ten possible subdivisions, each of those has ten more, and so on. Hence the decimal.

The top level domains are:

  • 000 – Computer science, information, general works

  • 100 – Philosophy and psychology

  • 200 – Religion

  • 300 – Social sciences

  • 400 – Language

  • 500 – Science

  • 600 – Technology

  • 700 – Arts and recreation

  • 800 – Literature

  • 900 – History and geography

In the language category, for example, the subdivisions are:

  • 400 – General

  • 410 – Linguistics

  • 420 – English

  • 430 – Other Germanic languages

  • 440 – French, Provencal, and Catalan

  • 450 – Italian and Romanian

  • 460 – Spanish and Portuguese

  • 470 – Latin

  • 480 – Greek

  • 490 – Other languages

English, again for example, is broken into:

  • 421 – Writing system and phonology

  • 422 – Etymology

  • 423 – Dictionaries

  • 424 – Not used

  • 425 – Grammar

  • 426 – Not used

  • 427 – Language variations (dialects and slang)

  • 428 – Usage

  • 429 – Old English

The same numbers are used across the various categories to denote similar subdivisions. So 432 is German etymology and 482 is Greek etymology.

These categories can be further extended by numbers following a decimal point to further classify the work. The number .73, for example, denotes the United States. So the call number 427.73 is a book about American dialect. This consistent use of the same numerical combinations across all subdivisions (e.g., 973 is history of the United States) makes it easy for those familiar with the system to see how a book is classified.
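The way the digits stack up can be sketched in a few lines of Python. This is only an illustration built from the partial lists in this post, not the official schedules: the table names and the describe function are my own, and it assumes the third digit means the same thing in every language division, which is stated above only for etymology.

```python
# Labels copied from the lists in this post; the tables are deliberately incomplete.
DIVISIONS = {
    "40": "General",
    "41": "Linguistics",
    "42": "English",
    "43": "Other Germanic languages",
    "44": "French, Provencal, and Catalan",
    "45": "Italian and Romanian",
    "46": "Spanish and Portuguese",
    "47": "Latin",
    "48": "Greek",
    "49": "Other languages",
}

# Third-digit meanings, taken from the English (420) breakdown.
# Assumption: they carry over unchanged to the other language divisions.
SECTIONS = {
    "1": "Writing system and phonology",
    "2": "Etymology",
    "3": "Dictionaries",
    "5": "Grammar",
    "7": "Language variations (dialects and slang)",
    "8": "Usage",
}

# Digits after the decimal point; only the one example from the post.
PLACES = {"73": "United States"}


def describe(number: str) -> list[str]:
    """Spell out a 4xx Dewey number one level at a time."""
    whole, _, suffix = number.partition(".")
    if len(whole) != 3 or not whole.isdigit() or whole[0] != "4":
        raise ValueError("this sketch only covers the 400s")
    parts = ["Language"]
    if whole[:2] in DIVISIONS:
        parts.append(DIVISIONS[whole[:2]])
    if whole[2] in SECTIONS:
        parts.append(SECTIONS[whole[2]])
    if suffix in PLACES:
        parts.append(PLACES[suffix])
    return parts


print(describe("427.73"))
print(describe("482"))
```

Each extra digit just narrows the category above it, which is why someone who knows the pattern can read 482 as Greek etymology without looking it up.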

Since there are many different books in these broad categories, the category number is usually followed by a Cutter number (see below) that denotes the author’s name, e.g., T911 is Mark Twain and the category 813 T911 contains fiction by Twain (81 American Literature, 3 Fiction). For prolific authors, like Twain, this Cutter number is often followed by an alphabetic sequence that represents either the title or the order in which the library acquired the book, so that new acquisitions can simply be put at the end of the appropriate shelf. So The Adventures of Huckleberry Finn might have a call number of 813 T911 Ad, or 813 T911 Fi, or, as it is shelved in the Berkeley Public Library, 813 T911zb.
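To make the three parts of such a call number concrete, here is a minimal Python sketch that splits one apart. The CallNumber type and parse_call_number function are hypothetical names of my own, and the pattern assumes the simple shape of the examples above (class number, one capital letter plus digits, then optional trailing letters); real library catalogs vary more than this.

```python
import re
from typing import NamedTuple


class CallNumber(NamedTuple):
    classification: str  # Dewey class, e.g. "813"
    cutter: str          # author mark, e.g. "T911"
    work_mark: str       # title or acquisition letters, may be empty


def parse_call_number(text: str) -> CallNumber:
    """Split a call number shaped like '813 T911 Ad' or '813 T911zb'."""
    parts = text.split()
    if len(parts) < 2:
        raise ValueError(f"expected a class number and a Cutter number: {text!r}")
    classification = parts[0]
    # Join the rest so 'T911 Ad' and 'T911zb' are handled the same way.
    rest = "".join(parts[1:])
    match = re.fullmatch(r"([A-Z]\d+)(.*)", rest)
    if match is None:
        raise ValueError(f"unrecognized Cutter number: {rest!r}")
    return CallNumber(classification, match.group(1), match.group(2))


for example in ("813 T911 Ad", "813 T911 Fi", "813 T911zb"):
    print(parse_call_number(example))
```

Sorting the resulting tuples puts every book by the same author in the same class next to each other, with the trailing letters deciding the order within that run.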

It’s often thought that the Dewey system is for non-fiction only. This erroneous notion arises because many libraries don’t use Dewey to classify fiction. Instead they use the author’s last name alone. This is helpful to general readers who just want to find the book and don’t care if T.S. Eliot is classified as American or British. So Huck Finn is classified as Fic Tw in many libraries. Similarly, biography is often not filed in the Dewey category of 920 and instead a book of Twain’s life is filed under B Tw.

The chief problem with the Dewey Decimal system is that it is heavily focused on America and Europe. For example, most of the languages of the world are crammed into the 490 category. Arabic, Native American languages, and Finnish can all be found here. It is kept up to date by the Online Computer Library Center, which owns the rights to the system, with new categories, like computer science, added from time to time. But it is very much captive to a 19th-century American view of the relative importance of various classes of knowledge.

An improvement over Dewey is the Universal Decimal Classification or UDC. Invented by Belgian bibliographers Paul Otlet and Henri La Fontaine at the beginning of the 20th century, it is a variation on Dewey’s original system. It is rarely used in the United States, but is the primary system for library classification in Britain and other English-speaking countries and can frequently be found classifying libraries in non-English-speaking countries as well. Like the Dewey system, it is kept up-to-date by a consortium of libraries.

The high level categories are similar to Dewey, except that 4 is not used and language and linguistics are grouped with literature in 8. The subcategories are organized so they are more easily extensible. You can keep adding digits to become more specialized.

The UDC also includes a notation system for denoting the relationship between categories in a book. This is especially powerful.

  • + plus sign, means that the book is about the two categories

  • / slash, means the book covers all the categories between the two numbers given

  • : colon, means the book is concerned with the relationship between the two categories

  • [ ] brackets, combines categories into a single unit

  • = equals sign, denotes the language in which the book is written.

So 31:[622+669](485)=20 is a book of statistics on mining and metallurgy in Sweden that is written in English, i.e., Statistics:[Mining+Metallurgy](Sweden)=English.
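The notation is regular enough that a short Python sketch can expand it. The lookup tables below hold only the codes from the example above, the expand function is a hypothetical name of my own, and reading a number in parentheses as a place is inferred from that example rather than from the full UDC rules.

```python
import re

# Only the codes that appear in the example; not real UDC tables.
SUBJECTS = {"31": "Statistics", "622": "Mining", "669": "Metallurgy"}
PLACES = {"485": "Sweden"}
LANGUAGES = {"20": "English"}

# One alternative per kind of token: (place), =language, subject number, symbol.
TOKEN = re.compile(
    r"\((?P<place>\d+)\)"
    r"|=(?P<language>\d+)"
    r"|(?P<subject>\d+)"
    r"|(?P<symbol>[+/:\[\]])"
)


def expand(expression: str) -> str:
    """Replace each code in a UDC expression with its label, keeping the symbols."""
    pieces = []
    position = 0
    for match in TOKEN.finditer(expression):
        if match.start() != position:
            raise ValueError(f"unexpected text at position {position}")
        position = match.end()
        kind = match.lastgroup
        code = match.group(kind)
        if kind == "place":
            pieces.append("(" + PLACES.get(code, code) + ")")
        elif kind == "language":
            pieces.append("=" + LANGUAGES.get(code, code))
        elif kind == "subject":
            pieces.append(SUBJECTS.get(code, code))
        else:
            pieces.append(code)  # connecting symbols pass through unchanged
    if position != len(expression):
        raise ValueError(f"unexpected text at position {position}")
    return "".join(pieces)


print(expand("31:[622+669](485)=20"))
```

Codes that are missing from the tables are left as numbers, so a partly known expression still comes out readable.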

A third system is the Cutter Expansive Classification. Invented by Charles Cutter in the 1880s and 1890s for Boston’s Athenaeum library, it is used by only a few libraries, mostly in New England. The top level domains of the Cutter system are:

  • A – General works

  • B-D – Philosophy, psychology, religion

  • E-G – Biography, history, geography

  • H-J – Social sciences, law

  • L-T – Science, technology

  • U-Vs – Military, sports, recreation

  • Vt-W – Theater, music, fine arts

  • X – Philology, language

  • Y – Literature

  • Z – Book arts, bibliography

The Cutter system also denotes the size of the volume in its call number, using points (.), pluses (+), and slashes (/) to denote books of small to large size. This is very useful if oversized or undersized works are stored separately, or for quickly locating books on the shelves.

Cutter also devised an ingenious system for classifying authors’ names. He created tables in which two or three digits stand for the rest of an author’s name after its initial letter. A214, for example, is John Adams. These tables are used in most libraries as the basis of the author’s name portion of call numbers.

Next week: Library of Congress Classification and tags