When I first saw Gephi (in a talk by Micki Kaufman on Kissinger), it completely blew my mind. Like many, I was wowed by the pretty graphs. There were shapes, colors, and who doesn’t like to see a lot of important-looking circles connected by all sorts of lines? Although I had little to no experience in Digital Humanities, I wanted to do that. Badly. And I did: I found out the first rule of Gephi: it’s easy to make a pretty visualization that signifies almost nothing.
Well, there was a dissertation to finish, work to pursue, and a million distractions, so, while I concentrated on Digital Pedagogy, I didn’t get back to Network Analysis for a while. Now, after a lot of reading in Network Theory and Social Network Analysis, and a lot of experimenting with software, I am beginning to use network visualization for good. This is the beginning of a research/visualization project that I’m currently working on to answer some historical questions about nineteenth-century New York theatre.
I’m a bit of a Jekyll and Hyde with nineteenth-century theatre. Sometimes, I love the 1890s because it feels very much like today. There’s a lot of big business, fan clubs, merchandizing, and gossip magazines (I have a Twitter account that circulates some news stories and chatter from the 1880s and 1890s @c19theatrenews). On the other hand, the antebellum period is wild by contrast, especially its audiences.
Histories of antebellum New York theatre tend to focus on the contrasts between houses like the Park (mostly elite) and the Bowery (mostly working class). While there’s been a good bit of work distinguishing the differences between theatres, and it has long been noted that all classes went to both, there is very little specific work on what theatres had in common in the period. So, for the past couple of months, I have been assembling data on 1840s New York City productions to find what kinds of networks comprised the New York Theatre. In this task, I have been greatly helped by a former student, Irene Lazaridis, who has taken time out from directing a production of Jean Anouilh’s Antigone to contribute her impressive skills at grinding out data (note: in using student labor, I’m strictly following the UCLA Student Collaborators’ Bill of Rights).
This is very much a work in progress: overall, I have transcribed about 1000 entries of actors performing in New York, along with Irene’s contribution of 600. As a disclaimer, I’d consider myself a novice with Network Analysis and Gephi, so this is in part a learning process of creating one type of meaningful data out of another.
The 1840s is often referred to as a time of transition from the heyday of Jacksonian America to a more docile, respectable audience, perhaps culminating in the Astor Place Riots of 1849. Authors like Walt Whitman opined that “Awhile after 1840 the character of the Bowery as hitherto described completely changed.” I’m curious whether subjective opinions like this, and the historical process they describe, could be measured with other methods.
In this project, I’m looking to test the extent to which actors or theatres were “branded” with particular genres. For example, I plan to look at what plays the “elite” Park and the “rowdy” Bowery staged. Was there a distinct difference over time between the two, or did they share plays, stars, and genres? More importantly, there were several other major theatres in the city at the time. How related to each other were their repertoires and companies? Ultimately, there are relatively few playscripts from the period surviving, so I’m curious whether there can be said to be a canon of 1840s New York performance.
The Source Data (and its Discontents)
For most of my data, I have relied on volume 4 of George Clinton Densmore Odell’s Annals of the New York Stage (1927). For anyone unfamiliar with this work, it’s a 15-volume chronicle of the history of New York theatre from the earliest mentions to 1894. For anyone who has worked in the period, it’s a blessing and a curse. Odell has done an impressive amount of work. He’s included the information from previously written chronicles, as well as gone through what could only have been immense stacks of period newspapers. His data is just about the best that’s out there pre-made, but it’s also got its share of flaws:
- It can be woefully incomplete. The nineteenth century is a big place and there isn’t a complete record surviving anywhere.
- His data is often limited to theatres that advertised in newspapers.
- Odell had opinions. He favored the big theatres, so he followed them with more care than others. In particular, if he thought a production was garbage, he might ignore it or not bother naming it.
- Odell’s organization is confusing even to someone who is used to it. He goes theatre-by-theatre, year-by-year. Except when he doesn’t.
- There is no consistent methodology. Odell often assumes you know exactly what he is talking about. For example, he may just mention the name of a character in a play and assume you know the title, author, and main points. This is all well and good, but the U.S. in the nineteenth century has an abysmal record of surviving plays.
- It’s inconsistent. Although Odell sometimes includes cast lists, he will often only mention a star or stars in the production. This sucks for someone looking at social networks, as you are left guessing who the members of the company were.
That said, there’s nothing else like Odell. He is a vast, untapped resource.
The Work (so far)
The visualizations below are based on the records I’ve collected of 135 productions from August to December 1839 in five New York theatres: the National, the National at Niblo’s (it moved after a fire and had a slight change in repertoire and company), the Park, the Bowery, and the New Chatham. To the best of my ability, I’ve recorded play titles, authors, casts, and genres. In creating this visualization, I was looking at whether theatres had distinct genre identities, i.e. if you wanted to go out to see a romantic melodrama, would you know where to go without having to read a newspaper?
In preparing this version of the data, I have counted each individual play done in that genre, not the number of times it was performed. I did this for two reasons. First, I’m not interested in what was popular, but in how diverse the theatres were in their repertoire. Second, Odell is highly inconsistent in how he reports the number of performances. Basically, he never gives you a straight answer on how many performances there were, so before I go that route, I have to complete a rubric for how to weight his notes that a play was “done frequently” or “a regular visitor in the middle of the month,” etc.
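For anyone who wants to try the same kind of count, here is a minimal Python sketch of the idea: count distinct play titles per theatre and genre, so that repeat performances do not inflate the total. The records, play titles, and dates below are invented placeholders (not taken from Odell or from my spreadsheet), and the function name `distinct_play_counts` is just an illustrative choice.

```python
from collections import defaultdict

# Hypothetical records: (theatre, play title, genre, performance date).
# Titles and dates are placeholders, not transcriptions from Odell.
records = [
    ("Park", "Play A", "British Comedy", "1839-09-02"),
    ("Park", "Play A", "British Comedy", "1839-09-03"),  # repeat night
    ("Park", "Play B", "British Comedy", "1839-09-05"),
    ("Park", "Play C", "Farce", "1839-09-02"),
    ("Bowery", "Play D", "Shakespearean Tragedy", "1839-09-02"),
    ("Bowery", "Play C", "Farce", "1839-09-04"),
]


def distinct_play_counts(rows):
    """Count distinct play titles per (theatre, genre), ignoring repeat nights."""
    titles = defaultdict(set)
    for theatre, title, genre, _date in rows:
        titles[(theatre, genre)].add(title)
    return {key: len(value) for key, value in titles.items()}


counts = distinct_play_counts(records)
for (theatre, genre), n in sorted(counts.items()):
    print(f"{theatre:8} {genre:22} {n}")
```

Using a set per theatre-and-genre pair means "Play A" at the Park counts once no matter how many nights it ran, which sidesteps the problem of Odell's vague performance counts.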
A link to the data used for this post is here.
A note on genre: Figuring out nineteenth-century genre is a tricky prospect that I still haven’t quite solved. We tend to label a lot of plays “melodrama” but the audiences at the time would have seen them as “tragedy” or just “drama.” I am still fine-tuning my system as I discover what is out there and try to find copies of many of the plays mentioned. For this visualization, I made some rough categorical choices. Every evening of theatre that I looked at included at least one main performance and an afterpiece. Although the afterpieces were sometimes farces and sometimes comedies, I folded the terms “afterpiece” and “farce” together, as audiences at the time would have seen those terms as fairly interchangeable. This is to be distinguished from “Comedy,” which I have only used to denote the main pieces of evenings.
Likewise, I invented some of the terminology, e.g. “Classical Tragedy” denotes a tragedy that was set in antiquity, even if we might sometimes call such plays “melodrama.” For this, I’m thinking of something like the popular Damon and Pythias by John Banim.
My data has a lot of applications, so before I headed to Gephi, I tried almost everything else first. Gephi is great for looking at connections, but it’s excessive to try to use it for basic calculations. Here are some simple graphs of genre representation in my data. First, a chart of all the productions, irrespective of theatre, to see what genres were performed the most during the opening of the 1839 season.
This is fairly unsurprising for someone who has studied the period. Farce is by far the most common because everyone did farces after the main pieces. Next is Shakespearean tragedy (12%), which, as we will see, was popular at a few theatres. This is pretty much received wisdom on the period, but I found the equal presence of Classical Tragedy (12%) to be unexpected. Following that, there’s British Comedy (9%) and a pretty steep drop off to genres around the 3-5% mark.
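The percentages behind a chart like this are simple to compute outside Gephi. Here is a rough Python sketch, assuming one row per distinct production; the rows are made-up placeholders, so the resulting shares are not my actual figures, and `genre_shares` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical (play title, genre) pairs, one per distinct production.
productions = [
    ("Play A", "Farce"), ("Play B", "Farce"),
    ("Play C", "Farce"), ("Play D", "Farce"),
    ("Play E", "Shakespearean Tragedy"), ("Play F", "Shakespearean Tragedy"),
    ("Play G", "Classical Tragedy"), ("Play H", "Classical Tragedy"),
    ("Play I", "British Comedy"), ("Play J", "Burletta"),
]


def genre_shares(rows):
    """Return each genre's share of all productions, as a percentage."""
    counts = Counter(genre for _title, genre in rows)
    total = sum(counts.values())
    # most_common() orders genres from most to least frequent.
    return {genre: 100 * n / total for genre, n in counts.most_common()}


shares = genre_shares(productions)
for genre, pct in shares.items():
    print(f"{genre:22} {pct:5.1f}%")
```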
And genre by theatres:
This gets a bit more interesting. We can see that the high representation of British Comedy is due almost exclusively to the Park theatre doing that genre. Also, while the Bowery performed some classical tragedy, the National was really the focal point of that genre (probably due to a few engagements by Edwin Forrest in the period). The Bowery tried an almost equal range of genres, while the Park was the least diverse. That is particularly interesting in that the Park was struggling in this season. It is telling that while they tried more individual plays, they remained within a fairly narrow genre. By contrast, the Bowery had fewer new plays than any of the others, yet it spanned a wider genre spectrum. Perhaps unsurprisingly, the New Chatham season was chaotic and almost haphazard.
On to Gephi. Here is the visualization for the data above. In more technical terms, this is a directed, bi-modal network where the nodes are weighted by outdegree. That is, it is a network based on two main categories: what genre was produced at what theatre (see data link). As opposed to some more straightforward social network analysis, which would draw connections between individual nodes (people), this is using two different types of nodes, “Genre” and “Theatre,” which makes it a bit messier. Here, if a genre was produced in a theatre, a line will connect the two. You can’t see how everything is connected to everything else, but you can see which theatre nodes (in turquoise) each genre node (various colors) is connected to. In other words, one can see which theatres share which genres.
So that’s a lot to take in at once!
This image is also about node and line size. The lines connecting theatres to genres are thicker depending on their weight, i.e. how many different productions of that genre each theatre staged. The bigger the node for each genre, the more it was produced overall. In addition, I’ve colored the nodes by what I’ve considered their genre families, so Classical, Shakespearean, and Verse tragedy are all one color, while Romantic and Historical Melodrama are another, but British Comedy and Farce are not since they were staged for very different purposes.
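To make the structure concrete, here is a small Python sketch of a two-mode edge list like the one described above. It assumes edges run from genre to theatre (one reading of "weighted by outdegree," since genre nodes grow with how much they were produced), and the weights are invented for illustration. The CSV header uses the Source, Target, Type, and Weight columns that, as far as I know, Gephi's spreadsheet importer recognizes; the function names are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical edge weights: (genre, theatre) -> number of distinct plays.
# Assumption: edges point from genre to theatre.
edge_weights = {
    ("Farce", "Park"): 5,
    ("Farce", "Bowery"): 2,
    ("British Comedy", "Park"): 4,
    ("Shakespearean Tragedy", "Bowery"): 3,
}


def weighted_out_degree(edges):
    """Sum outgoing edge weights per source node (here, per genre)."""
    out = defaultdict(int)
    for (source, _target), weight in edges.items():
        out[source] += weight
    return dict(out)


def to_edge_csv(edges):
    """Serialize the edges as a CSV edge table with a Weight column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Type", "Weight"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, "Directed", weight])
    return buffer.getvalue()


sizes = weighted_out_degree(edge_weights)  # drives genre node size
edge_csv = to_edge_csv(edge_weights)       # edge thickness comes from Weight
print(sizes)
print(edge_csv)
```

With this direction, theatre nodes have no outgoing edges, so only genre nodes get a size from out-degree, while line thickness comes from each edge's weight.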
By standards of good visualization, this image is pretty hard on the eyes. I have kept the outliers in here (Burletta, Jacobean Tragedy, etc.) because in this smallish dataset, those make the theatres they are connected to more unique. Since a visualization probably shouldn’t be the end of inquiry, noticing those single-production nodes makes me want to dive in and figure out how and why they were only done once.
A cool thing about Gephi is its dynamic qualities. If there’s a way to host the interactive visualizations, I don’t know it, but you can also look within the graph to find out more. In this case, hovering over a node will highlight just that node and the other nodes it is connected to (i.e. the genre and the theatres that staged it).
Farce was performed at each theatre, but with less variety at the Bowery. The Park and National (by Niblo’s, on Broadway), by contrast, relied a great deal on changing up the Farces they performed, perhaps because both were not doing very well in these months.
On a similar note, this shows that there are many ways to be well-connected. While British Comedy appeared at each theatre, it was central at the Park and played only a subordinate role elsewhere.
One emerging pattern is that the Bowery partakes relatively little of well-connected genres. Looking back at the pie graph, this is due in part to its diversity of genres, but also its reliance on Shakespearean tragedy and various melodramas. This was the only theatre in this time period that was not struggling, according to Odell, so perhaps the way to succeed in 1839 was to dabble in everything, but favor a wide variety of tragedy and melodrama.
Here’s the Bowery node selected. Keep in mind, the size of the genre nodes is tied to the graph as a whole, not to the Bowery (where it’s the size of the arrows that matters). This basically tells us the same thing as the pie graph, just in a different way.
One small, unexpected point to conclude with: since the Park and Bowery are supposed to be the big rivals in terms of repertoire, audience class, and ideology, I was somewhat surprised to find that they are the only theatres engaging Comic Opera at the time (different pieces and singers, but same genre).
This is just the beginning of my work, with only a few key theatres in a short time period, so there’s more to come. If you’re reading this and have any suggestions, I’d love to hear from you!