Chronicling America: They gave us an API. What do we do now?
May 13th, 2010 | Douglas Knox
Chronicling America is a brilliantly engineered digital collection of historically important material that, because of its API, and because of the understanding that “the Web is the API,” could be an exemplary part of an open digital infrastructure for American history. But for the API to matter, the rest of us have to actually build on it. The scale of what is already digitized and accessible is enough to make it a major new resource for American history. So how do we follow through on that, in a distributed way, at large and small scales?
I have played with making network graphs of newspaper business genealogies from the bibliographic data, without yet trying to do much with the images and OCR text, and could share bits of that exercise to get some modest hackery into the mix. But there is a lot more that could be done. Certainly we could speculate about text mining and high-performance computing, but I would hope we could also brainstorm where the opportunities for consequential innovation are as much social as technological.
Even with the methods and technologies available now and yesterday, Chronicling America is finished enough to start changing history already. They flipped the switch on the API. So what? If we think it matters, then how do we start hacking the interface between the API and the world to demonstrate how it matters?
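
To make the genealogy-graph idea a little more concrete, here is a minimal sketch in Python. It assumes each title's bibliographic record has already been reduced to an identifier, a name, and a list of the titles that succeeded it. The field names (`id`, `name`, `succeeded_by`) and the sample records are hypothetical, invented for illustration; they are not the API's actual schema, and no network calls are made. The sketch builds a directed graph of successions, finds mergers (titles with more than one predecessor), and emits Graphviz DOT text for drawing.

```python
from collections import defaultdict

# Hypothetical sample records. The titles, identifiers, and field names are
# invented for illustration and do not come from Chronicling America.
TITLES = [
    {"id": "t1", "name": "Example Weekly Gazette", "succeeded_by": ["t3"]},
    {"id": "t2", "name": "Example Daily Herald", "succeeded_by": ["t3"]},
    {"id": "t3", "name": "Example Gazette-Herald", "succeeded_by": ["t4"]},
    {"id": "t4", "name": "Example Morning Herald", "succeeded_by": []},
]


def build_graph(titles):
    """Map each title id to the list of ids that directly succeeded it."""
    graph = {t["id"]: [] for t in titles}
    for t in titles:
        for succ in t["succeeded_by"]:
            graph.setdefault(succ, [])  # keep successors we have no record for
            graph[t["id"]].append(succ)
    return graph


def descendants(graph, start):
    """Return every title reachable from `start` via succession links."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def mergers(graph):
    """Return titles with more than one predecessor, with those predecessors."""
    preds = defaultdict(set)
    for src, dsts in graph.items():
        for dst in dsts:
            preds[dst].add(src)
    return {t: sorted(p) for t, p in preds.items() if len(p) > 1}


def to_dot(graph, names):
    """Render the graph as Graphviz DOT text, labelling nodes by title name."""
    lines = ["digraph genealogy {"]
    for node in sorted(graph):
        label = names.get(node, node)
        lines.append(f'  "{node}" [label="{label}"];')
    for src in sorted(graph):
        for dst in graph[src]:
            lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    graph = build_graph(TITLES)
    names = {t["id"]: t["name"] for t in TITLES}
    print(mergers(graph))
    print(to_dot(graph, names))
```

Feeding the DOT output to Graphviz would draw the family tree. With real records, the interesting part would be seeing which titles sit at the junctions of many mergers and splits.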
May 17th, 2010 at 12:28 pm
I would love to discuss this. The Center for Digital Research in the Humanities (where I work) was one of the participants in Chronicling America, and I’ve been fascinated with what we might be able to do with the API.
May 19th, 2010 at 10:50 pm
The NEH is the major funder behind Chronicling America, so I was very jazzed to see the Library of Congress release the API to the public last year. Doug makes some great points here and he shows off some cool uses of the API.
In the past, I’ve argued that, in the long run, the Chronicling America API might prove to be more valuable than the standard UI most people use to search the site. But I haven’t yet seen many really cool uses of the API. I wonder if a few cool demo hacks showing off the API might inspire people?
It seems to me that there is all kinds of untapped potential for numerous audiences. Yes, of course I can see text mining applications and such. But I could also see amateur historians using the API. (Imagine a website on, say, the history of baseball that taps into the API to present news articles on baseball players. Say, local news coverage from every state of every World Series?) These are hometown newspapers — the potential is crazy.
May 20th, 2010 at 8:29 am
By the way, the next Digital Humanities Start-up Grant deadline is October 5, 2010. (Jus’ sayin’).