By Susie Allen, AB’09
Photo by Jason Smith

“I’m hoping to see it continue as a locus of research driven by humanistic concerns, but with access to intellectual preoccupations we see across many disciplines.”
—Mark Olsen, assistant director
ARTFL Project

How do you make sense of a million books? What about two million books?

Those are the kinds of questions that occupy the international team headed up by French literature professor Robert Morrissey, director of the ARTFL Project, short for American and French Research on the Treasury of the French Language. Founded in 1981 to create a database of French texts, the collaborative project has grown into a hub of innovation and is now a key player in the burgeoning field of the digital humanities.

Now, the ARTFL database boasts an ever-expanding collection of digitized texts in multiple languages. PhiloLogic, a powerful search engine developed at ARTFL for humanities research, is in use at academic institutions worldwide. The collaboration’s flagship project, a digitized version of philosopher Denis Diderot’s massive Encyclopédie, has given scholars new insights into a text once seen as impenetrable.

As more texts are digitized and become available online, ARTFL is forming new partnerships and seeking new ways to help scholars navigate the thicket of texts. At the same time, the project is making strides in interacting with users at its 350 subscribing institutions.

New technology “allows you to investigate certain hypotheses in ways you would never have been able to do,” says Morrissey, the Benjamin Franklin Professor of French Literature. “It allows you to adduce new evidence with greater force.”

Yet Morrissey believes the new methods are only strengthening scholars’ core intellectual ambitions. “Humanists have always asked big questions,” he argues.

Rediscovering the Encyclopédie

The research implications of ARTFL’s approach to digitized texts are powerful. A scholar interested in the development of French nationalism might search for occurrences of the word “nation” during a particular historical period or in the work of a particular author. The same scholar could find words that occur near the word “nation,” or determine when the words “nation” and “God” appear near one another.
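To make the idea of a proximity search concrete, here is a minimal sketch in Python. It is not PhiloLogic's code or interface; the function names, the window size, and the sample sentence are all assumptions made up for illustration. It simply counts the words that fall within a few tokens of a target word and reports where two chosen words appear close together.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into runs of letters (accents included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def neighbors(text, target, window=5):
    """Count the words found within `window` tokens of each occurrence of `target`."""
    tokens = tokenize(text)
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


def cooccurrences(text, target, other, window=5):
    """Return (target_index, other_index) pairs where the two words are near each other."""
    tokens = tokenize(text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i and tokens[j] == other:
                hits.append((i, j))
    return hits


# Invented sample sentence, used only to exercise the functions.
sample = "The nation trusts in God. A nation of laws stands apart from any king."
print(neighbors(sample, "nation", window=2).most_common(3))
print(cooccurrences(sample, "nation", "god", window=3))
```

A real research tool would also need to filter by date or author and to handle spelling variants, which this sketch leaves out.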

Morrissey’s own research has been shaped by his work at ARTFL. Recently, Morrissey and Stanford professor Dan Edelstein teamed up to study the thinkers who influenced the authors of the Encyclopédie. Using sequence alignment—the same technique used to study genomes in biology—Morrissey and Edelstein were able to determine when language and ideas had been borrowed from other thinkers without citation.

They hypothesized that this “loose attribution” was a means of including subversive ideas without attracting the ire of the French government. Readers who were “in the know” realized what the authors were doing, “but it didn’t raise the eyebrows of the censors,” Morrissey explains. As a result of this research, “we’re getting a more balanced picture of what [in the Encyclopédie] was subversive, who the sources were, and how subversive they were.”
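For readers curious about how sequence alignment can surface unattributed borrowing, the following is a small Python sketch of a textbook local-alignment method (Smith-Waterman) applied to words rather than to genetic letters. It is an assumption-laden illustration, not the procedure Morrissey and Edelstein actually used: the scoring values, the function name `local_alignment`, and the two sample sentences are invented for this example.

```python
def local_alignment(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment over two lists of word tokens.

    Returns (best_score, aligned_a, aligned_b); "-" marks a gap.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring table; a cell never drops below zero, which is
    # what lets the alignment start and stop anywhere in either text.
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + step,  # align the two words
                score[i - 1][j] + gap,       # skip a word in a
                score[i][j - 1] + gap,       # skip a word in b
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Walk back from the best cell until the score falls to zero.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        step = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + step:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, out_a[::-1], out_b[::-1]


# Invented sentences standing in for a source text and a later borrowing.
source = "all men are born free and equal in rights".split()
later = "the author notes that men are born free and equal although some disagree".split()
best_score, shared_a, shared_b = local_alignment(source, later)
print(best_score, " ".join(shared_a))
```

Run on the two sample sentences, the function picks out the run of words the texts share even though neither sentence cites the other. Scaling this up to an entire library would require indexing and approximate matching, which are beyond this sketch.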

It was the kind of study that simply wouldn’t have been possible without a searchable, digitized Encyclopédie. Because of its sprawling, labyrinthine structure, “it was impossible to know [the Encyclopédie] in the way we know it now,” Morrissey says.

“This larger vision of the work … has allowed us to ask questions regarding its general structure and the internal relationships or networks of relationships between articles, questions that would have been impossible to ask without a digitized version of the Encyclopédie,” agrees Glenn Roe, PhD’10, a senior project manager at ARTFL.

The Early Days

The idea of “lighting the labyrinth,” as Morrissey describes it, wasn’t on anyone’s mind in the late 1970s, when the French Center for National Scientific Research began to painstakingly digitize classic French texts. At the time, the aim was to create a resource for lexicographers interested in writing a new French dictionary.

Morrissey realized the digitized texts had enormous potential for literary critics. “I was doing some research, and I said, ‘This could really be useful to people other than lexicographers, why don’t we try to make this available to a larger research community?’” he recalls.

Morrissey spearheaded the effort to bring the database to Hyde Park, where it found an appreciative community of users. As both the number of texts and ease of access increased, the ARTFL team turned its attention to improving the methods of searching texts for users.

The result was PhiloLogic, a fast, flexible, and easy-to-use search engine developed under the direction of ARTFL’s assistant director Mark Olsen. PhiloLogic, which functions across many languages, is designed to allow humanists to undertake sophisticated searches and return results that are customized to their interests.

“PhiloLogic understands how humanists need to search, and how humanists need to be able to navigate complex documents,” says Olsen, a historian. “It represents in many ways the kinds of things I wanted for my own research environment.”

Looking to the Future

As it has always done, ARTFL is looking forward. The collection of texts in French and many other languages continues to grow. ARTFL recently joined the Computation Institute and became involved with Project Bamboo, a new initiative aimed at improving technological resources for humanities scholars.

ARTFL developers began to work on the Dictionnaire Vivant de La Langue Française, a new French dictionary that allows users to contribute definitions and select useful examples.

The dictionary project was a natural fit for the ARTFL team, which wanted to develop new ways of reaching users and had a large collection of historical dictionaries at its disposal. “We had the peanut butter and the chocolate and we just mixed them together,” jokes Charles Cooney, PhD’04, one of the DVLF developers.

For Olsen, these new efforts and partnerships are vital for the future of ARTFL. He’s proud that their eclectic team of linguists, literary critics, and historians is finding common areas of interest with scholars and users in so many fields.

“I’m hoping to see it continue as a locus of research driven by humanistic concerns, but with access to intellectual preoccupations we see across many disciplines,” he says. “That, I think, will continue to make it very powerful.”

Originally published on January 31, 2011.