Annotation requires writing for people and machines
(Editor’s Note: KHS staff working on the Civil War Governors of Kentucky Digital Documentary Edition recently annotated their 1,000th document. They have identified and written short biographies of each person who appears in 1,000 of the more than 10,000 documents that make up CWGK. Although their work continues, they are sharing their thoughts at this milestone in three blog posts.)
During grant-writing season, I drive to work listening to Lin-Manuel Miranda for inspiration as I write the financial systems of Civil War Governors of Kentucky (CWGK) into existence. I like intricate, playful prose.
One of my mentors told me that I had to watch myself because I liked to build clauses and metaphors upon one another (palaces of paragraphs, cathedrals), layer upon layer, until a section took flight or collapsed ignominiously under its own weight. I took it as a compliment, mostly. I like a high upside.
Over the past few years, though, I’ve developed and taught a very different school of writing. CWGK’s biographies are not great literature, but they have buried within their short, predictable statements and prescribed order an interpretative potential far beyond the sum of their sentences. Our biography style is clunky and often repetitive.
Yet it converts tax records, obituaries and government documents into structured data, pinned to specific geographical coordinates and bounded by start and stop dates. The biographies are not a narrative, per se, but they are meant to be structured statements that both people and machines can read. Our biographies work. They build upon one another to create hundreds of thousands of discrete records of historical events large and small. Each treated equally. The world we capture in our biographies can be set into motion and viewed from the perspective of a town, of a day, of people who journey together on a specific steamboat. The number of these stories that we encode, both the mundane and the world-changing, is almost limitless. The scale of this data is so great that we can’t yet fully imagine how we are going to use it.
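For readers who think in code, here is a minimal sketch of what a statement "pinned to specific geographical coordinates and bounded by start and stop dates" might look like, and how a set of them could be viewed from the perspective of a town or a day. It is an illustration only, not CWGK's actual data model: the `Statement` fields, the `view` helper, and every record in `RECORDS` are hypothetical placeholders invented for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Statement:
    """One structured biographical statement (hypothetical schema)."""
    person: str
    event: str
    place: str
    latitude: float
    longitude: float
    start: date
    end: Optional[date] = None  # None means a single-day event

    def covers(self, day: date) -> bool:
        """True if the given day falls within the start and stop dates."""
        last = self.end or self.start
        return self.start <= day <= last


def view(statements, place=None, day=None):
    """Return the statements matching a place, a day, or both."""
    return [
        s for s in statements
        if (place is None or s.place == place)
        and (day is None or s.covers(day))
    ]


# Placeholder records, invented for illustration; not real CWGK data.
RECORDS = [
    Statement("Person A", "resided", "Town X", 38.0, -84.5,
              date(1861, 1, 1), date(1863, 12, 31)),
    Statement("Person B", "paid taxes", "Town X", 38.0, -84.5,
              date(1862, 6, 1)),
    Statement("Person A", "traveled by steamboat", "Town Y", 38.2, -85.7,
              date(1864, 3, 15)),
]

# Everyone recorded in a town, or everything recorded on a single day.
in_town = view(RECORDS, place="Town X")
on_day = view(RECORDS, day=date(1862, 6, 1))
```

Because every statement has the same shape, the same small filter answers questions about a town, a day, or both at once, which is the sense in which each record is treated equally.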
Has the historian who will build the system that starts and stops this network in time or peeks into the totality of a community on a critical month or day been born yet?
New staff don’t like our biographical style. It’s nothing like what they’ve been trained to write. We have to sell them on the vision before we can sell them on the structure. But I can usually convince them, because I’ve convinced myself. Our verbal architecture is space age, not Baroque. We are building a machine out of words, not painting a picture.
CWGK stares ahead into a new future for historical research. Although we didn’t quite understand it at the time the project launched, we were sketching the outlines of a world just over our technological horizon.
Ours is a project of radical potential and scale. It’s big, probably bigger than we can complete with the technology and staff we currently have. But we’ve built it to grow into a new era where new software finds connections and patterns between our people and our texts at a rate far beyond what we can do with mouse and keyboard today. That is both comforting and terrifying. But it brings me immense pride to think about Kentucky standing at the forefront of this new era of historical research. We pioneer.
(The visual elements at the top of this page come from the biography of Moses Block found at discovery.civilwargovernors.org/document/N00007370. Check it out to see the types of information CWGK provides.)