So how do you turn a comic book, or a collection of comic books, into a research object using TEI, an XML markup language?

 

In this piece, Brian Flota and Steven Holloway from Libraries & Educational Technologies explain how digital approaches open up new insights into comic books, allowing scholars to identify trends and provide quantifiable answers to qualitative questions such as how gender and racial attitudes are expressed in popular culture.   

TEI (Text Encoding Initiative) markup can help humanities scholars pay greater attention to the details of a comic book’s text and art. Marking up a comic book in this fashion necessitates a very close reading. TEI tags can be modified to capture specific verbal or visual details, such as gender or race markers, narrative style, weapons, clothing, body posture, or even the types of transitions between comic book panels. Here is a panel from Phantom Lady #17 (1948):

This is what the TEI markup of this panel in Atom, an open source text editor, can look like:

 

One thing we can focus on is the gender markers used by the creators of this particular comic, Matt Baker and Ruth A. Roche. Both were unusual for the period: Baker was one of the few African American comic book artists, and Roche was one of the few female comic book writers. In this example, the “cbml:panel” tag supplies a unique ID for the panel and identifies the characters involved in the action. The note tags describe the panel action as a whole and record the gender markers for the characters, all using a controlled vocabulary specified elsewhere in the TEI document; one note describes a type of weapon, a knife. The three balloon tags capture what the protagonists say or think: two are spoken aloud, and one is a silent “thought” balloon.
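The structure described above can be sketched in Python. This is not the authors' actual encoding: the element text, the IDs, and the attribute names "type", "corresp", and "who" on the note and balloon tags are placeholders assumed for illustration, and the namespace URIs are assumed to be the usual TEI and CBML ones. The sketch only shows how a panel with notes and balloons might be read back out of the markup.

```python
import xml.etree.ElementTree as ET

# Namespace URIs assumed for TEI and CBML; adjust to match the real document.
TEI_NS = "http://www.tei-c.org/ns/1.0"
CBML_NS = "http://www.cbml.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical panel: all IDs, attribute values, and text are placeholders,
# not a transcription of the Phantom Lady #17 panel.
PANEL_XML = f"""
<cbml:panel xmlns="{TEI_NS}" xmlns:cbml="{CBML_NS}"
            xml:id="panel_example" characters="#hero #villain">
  <note type="panelDesc">Placeholder description of the panel action.</note>
  <note type="genderMarker" corresp="#hero">marker_a</note>
  <note type="genderMarker" corresp="#villain">marker_b</note>
  <note type="weapon">knife</note>
  <cbml:balloon type="speech" who="#hero">Placeholder speech.</cbml:balloon>
  <cbml:balloon type="speech" who="#villain">Placeholder reply.</cbml:balloon>
  <cbml:balloon type="thought" who="#hero">Placeholder thought.</cbml:balloon>
</cbml:panel>
"""


def summarize_panel(xml_text):
    """Return the panel ID, its characters, its notes, and its balloons."""
    panel = ET.fromstring(xml_text)
    ns = {"tei": TEI_NS, "cbml": CBML_NS}
    return {
        "id": panel.get(f"{{{XML_NS}}}id"),
        "characters": panel.get("characters", "").split(),
        "notes": [(n.get("type"), n.text) for n in panel.findall("tei:note", ns)],
        "balloons": [
            (b.get("type"), b.get("who")) for b in panel.findall("cbml:balloon", ns)
        ],
    }


summary = summarize_panel(PANEL_XML)
print(summary["id"], summary["characters"])
for balloon_type, speaker in summary["balloons"]:
    print(balloon_type, speaker)
```

Keeping the gender markers and the weapon in typed note tags, rather than in free text, is what makes them countable later.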

 

Once we have created the TEI markup, we can formulate research questions using tools such as XQuery, a query language for XML data. For this example, we put this question to our markup for the entire comic book: how many gender markers have been assigned to each character (statistics ordered by character), and which ones are they (a list of gender markers)? And here are our results: [run XQuery “gender-markers-characters.xql” in Atom]

Unsurprisingly, the heroine Phantom Lady garners the most face-time, with 10 different binary gender markers and 259 aggregate hits. But Sandra Knight, her alter ego, has twice as many distinct markers (20), since she is portrayed as a “girly-girl.” The closest male figure has 8 distinct markers and 115 occurrences of them. Even though male figures vastly predominate numerically in this comic book, our artist lavished attention on gendering the female characters. This is an example of small-scale research that can be conducted by analyzing just one comic book. Imagine how much more could be accomplished by marking up and analyzing an entire corpus of comics!
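The tally behind figures like these (distinct markers and total occurrences per character) can be sketched in a few lines. The sketch below uses Python rather than the XQuery the authors ran, and it runs on invented sample markup: the attribute names "type" and "corresp", the character IDs, and the marker names are all assumptions, not the project's controlled vocabulary.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Namespace URIs assumed for TEI and CBML.
TEI_NS = "http://www.tei-c.org/ns/1.0"
CBML_NS = "http://www.cbml.org/ns/1.0"

# Invented sample markup; characters and markers are placeholders.
SAMPLE = f"""
<div xmlns="{TEI_NS}" xmlns:cbml="{CBML_NS}">
  <cbml:panel n="1" characters="#char_a #char_b">
    <note type="genderMarker" corresp="#char_a">marker_x</note>
    <note type="genderMarker" corresp="#char_a">marker_y</note>
    <note type="genderMarker" corresp="#char_b">marker_z</note>
  </cbml:panel>
  <cbml:panel n="2" characters="#char_a">
    <note type="genderMarker" corresp="#char_a">marker_x</note>
  </cbml:panel>
</div>
"""


def tally_gender_markers(xml_text):
    """Count each gender marker per character across all panels."""
    root = ET.fromstring(xml_text)
    tallies = defaultdict(Counter)
    for note in root.iter(f"{{{TEI_NS}}}note"):
        if note.get("type") != "genderMarker":
            continue
        character = note.get("corresp", "").lstrip("#")
        marker = (note.text or "").strip()
        if character and marker:
            tallies[character][marker] += 1
    return tallies


def report(tallies):
    """Rows of (character, distinct markers, total occurrences, marker list),
    ordered by total occurrences, highest first."""
    rows = [
        (character, len(counts), sum(counts.values()), sorted(counts))
        for character, counts in tallies.items()
    ]
    rows.sort(key=lambda row: (-row[2], row[0]))
    return rows


rows = report(tally_gender_markers(SAMPLE))
for character, distinct, total, markers in rows:
    print(f"{character}: {distinct} distinct, {total} occurrences, {markers}")
```

On the sample above, char_a comes out with 2 distinct markers and 3 occurrences, and char_b with 1 of each. Run over a full issue, the same two numbers per character are what allow comparisons like the one between Phantom Lady and Sandra Knight.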

What can this methodology accomplish for researchers?

It can generate instantaneous bird’s-eye scans of copious amounts of material; identify trends and provide quantifiable answers to qualitative questions, such as how gender and racial attitudes are expressed in popular culture; plot the penetration of linguistic registers, like mid-20th-century criminal argot, in crime-noir comics, using corpus linguistic analysis; and provide hard data on one of the abiding mysteries of sequential art: how character and plot develop from one comic book panel to the next.

The applied art of TEI comic book markup makes one a more deliberate and astute reader. Creating verbal “digests” of the visual action in a comic book, panel by panel, could also make comics more accessible to readers who are visually impaired. TEI markup enables researchers, both faculty and students, to make their humanities research richly collaborative, and when the comic books are in the public domain, that same markup can be freely published and remixed by other researchers.

 

How else can “big data” methodology be applied to comic books?

Commercial Artificial Intelligence tools like IBM Watson can perform intriguing personality assessments of invented characters, like Phantom Lady circa 1948.

Reverse image lookup tools (e.g., Google, Bing, Yandex) can help identify the cultural bleed of a visual meme, character, or fictional entity, like Captain Nemo’s submarine from Jules Verne’s Twenty Thousand Leagues Under the Sea.

All of these complex operations, including TEI markup, can be accomplished indirectly by using spreadsheet data entry, Artificial Intelligence demos, and simple online image editors.

What’s next?