Classification of Objects by Artists in the Harvard Art Museums Collection

by Carey Gibbons @careygibb

These stacked bar charts and pie charts show the percentage of drawings, prints, photographs, and paintings produced by specific artists in the Harvard Art Museums collection (based on the first 50 results for each artist). The top two charts show the classifications by artist, while the bottom two show the percentage of works by each artist within each classification. I made the charts using Datawrapper after using Integromat to pull data from the Harvard Art Museums API and generate spreadsheets in Google Sheets.

The screenshot above shows the scenario I created in Integromat, which includes four modules: Tools (“Set Variable”), HTTP (“Make a request”), Iterator, and Google Sheets (“Add a row”).

The screenshot above captures what appears when you click on “Tools.” As you can see, the Variable name is “search_query,” the Variable lifetime is “One cycle” (the default setting), and the Variable value is the name of the artist (in this case “James Abbott McNeill Whistler”). I changed the Variable value depending on which of the 6 artists I was searching for.

The screenshot above captures what appears when you click “HTTP.” I followed these instructions in order to figure out the URL, limiting my results to 50 (the number of results that appears on a page) and editing the link to include my Harvard API key (blacked out above for privacy), which I obtained by following the instructions here. I used the “GET” Method, entered “q” for the Name of the Query String, and selected “1. search_query” for the Value. I also clicked “Parse response.”
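For anyone who would rather script this step than configure it in Integromat, here is a minimal Python sketch of the same GET request. The endpoint URL and the parameter names "apikey" and "size" are my assumptions about the Harvard Art Museums API (only "q" is named above), the function names are made up for illustration, and "YOUR_API_KEY" is a placeholder, so please check everything against the API instructions before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint for object searches; confirm it in the API instructions.
BASE_URL = "https://api.harvardartmuseums.org/object"


def build_search_url(search_query, api_key, size=50):
    """Build the request URL: API key, page size, and the "q" query string."""
    # "apikey" and "size" are assumed parameter names; "q" matches the
    # Query String name entered in the HTTP module.
    params = {"apikey": api_key, "size": size, "q": search_query}
    return f"{BASE_URL}?{urlencode(params)}"


def fetch_records(search_query, api_key, size=50):
    """Send the GET request and return the "records" list from the parsed JSON."""
    with urlopen(build_search_url(search_query, api_key, size)) as response:
        data = json.load(response)  # plays the role of "Parse response"
    return data.get("records", [])


if __name__ == "__main__":
    # Prints the URL only; no request is sent until fetch_records is called.
    print(build_search_url("James Abbott McNeill Whistler", "YOUR_API_KEY"))
```

Changing the first argument plays the same role as changing the Variable value for each artist.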

The screenshot above captures what appears when you click “Iterator.” The Map setting was on by default, and I entered “2. data: records[]” in the Array field in order to target the list of object records in the parsed response, which was the information most useful to me.

The screenshot above captures what appears when you click “Google Sheets.” I connected to my Google account and created a new spreadsheet titled “DAHSS,” which I selected in the Spreadsheet field. “Select spreadsheet and sheet” was the default in the Mode field. The spreadsheet contained a separate sheet for each artist. In this case, I selected the “Whistler” sheet so that the Whistler search results would appear there.

The screenshot above shows the data that I chose to pull from the Harvard Art Museums API and place in the columns of the sheets that I generated.

The screenshot above captures the sheet that was generated automatically for Whistler when I clicked “Run” in Integromat. The screenshot doesn’t capture the full sheet: there are 50 rows in total, reflecting the first 50 results on the Harvard Art Museums website. Each object has a row, with columns showing artist, classification, title, date, URL, and search query. I ended up with a sheet like this for each of the 6 artists I chose.
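As a rough illustration of what the Iterator and “Add a row” steps do together, the Python sketch below turns one object record into a row with the same six columns. The field names ("people", "name", "classification", "title", "dated", "url") are my assumptions about how the API labels this information, and the function names and the sample record are invented for the example.

```python
def record_to_row(record, search_query):
    """Map one object record to a row:
    artist, classification, title, date, URL, search query."""
    # Assumed structure: "people" is a list of dicts that each carry a "name".
    people = record.get("people") or []
    artist = "; ".join(person.get("name", "") for person in people)
    return [
        artist,
        record.get("classification", ""),
        record.get("title", ""),
        record.get("dated", ""),  # assumed name of the date field
        record.get("url", ""),
        search_query,
    ]


def records_to_rows(records, search_query):
    """Build one row per record, as the Iterator does for each array item."""
    return [record_to_row(record, search_query) for record in records]


if __name__ == "__main__":
    # Invented record, for illustration only.
    sample = {
        "people": [{"name": "Example Artist"}],
        "classification": "Prints",
        "title": "Example Title",
        "dated": "1880",
        "url": "https://example.org/object/1",
    }
    print(records_to_rows([sample], "Example Artist"))
```

Missing fields come back as empty cells rather than raising an error, so an incomplete record still gets a row.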

The screenshot above captures the tally of drawings, photographs, prints, and paintings that I made at the bottom of each sheet. I used the formula “=UNIQUE(B1:B50)” to list each distinct classification in the column above, and then used the “COUNTIF” function to count the number of times each classification appeared.
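The same tally can be sketched in a few lines of Python, where a Counter does the work of UNIQUE and COUNTIF in one pass. The function name is made up and the classification values below are invented for illustration; they are not my actual results.

```python
from collections import Counter


def tally_classifications(classifications):
    """Count how often each classification appears, skipping empty cells."""
    return Counter(value for value in classifications if value)


if __name__ == "__main__":
    # Invented column of classifications, standing in for cells B1:B50.
    column_b = ["Prints", "Drawings", "Prints", "Paintings", "Prints", ""]
    for classification, count in tally_classifications(column_b).most_common():
        print(classification, count)
```

The keys of the Counter correspond to the UNIQUE list, and each value corresponds to one COUNTIF result.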

I then made one final sheet that gathered all the results together in a format that would be easy for Datawrapper to use. The information was then fed to Datawrapper to generate the charts at the top of this page.
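To make the arithmetic behind the two kinds of chart concrete, here is a small Python sketch that converts a combined table of counts into percentages, first by artist and then by classification. The function names, the artist labels, and the counts are all hypothetical, and this is not meant to describe the exact layout Datawrapper expects.

```python
def to_percentages(table):
    """Convert {row: {column: count}} into {row: {column: percent of row total}}."""
    result = {}
    for row_label, counts in table.items():
        total = sum(counts.values())
        result[row_label] = {
            column: (round(100 * count / total, 1) if total else 0.0)
            for column, count in counts.items()
        }
    return result


def transpose(table):
    """Swap rows and columns: {artist: {classification: n}} becomes
    {classification: {artist: n}}."""
    flipped = {}
    for row_label, counts in table.items():
        for column, count in counts.items():
            flipped.setdefault(column, {})[row_label] = count
    return flipped


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    counts = {
        "Artist A": {"Prints": 30, "Drawings": 20},
        "Artist B": {"Prints": 10, "Drawings": 40},
    }
    print(to_percentages(counts))             # classifications within each artist
    print(to_percentages(transpose(counts)))  # artists within each classification
```

The first call matches the "classifications by artist" view, and the second matches the "works by each artist within each classification" view.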

The screenshot above captures one problem I encountered when searching the Harvard Art Museums collection. The results for Albrecht Dürer (1471–1528) included photographs, which Dürer clearly did not create! These photographs appear to be conservation-related.

Overall, however, I found the Harvard Art Museums website to be user-friendly. I started out working with the Rijksmuseum website but found that the Harvard Art Museums provided more options/categories of data. (Harvard provides 94 options/criteria to pull vs. the Rijksmuseum’s 28.) I was also more familiar with the Harvard Art Museums collection, which was helpful when selecting artists to search for.

I found Integromat challenging to use initially, but as the DAHSS went on, I became more familiar with it and ended up generating some interesting information. I am looking forward to exploring it more in the future. I would like to experiment with larger datasets and search for more specific information (e.g. lithographs by women or the most viewed/searched-for nineteenth-century paintings). Additionally, Integromat could be useful for pointing out strengths or gaps in museum collections and/or showing collecting trends.

I’d love to hear from anyone who has used Integromat for art history research or who has a broader interest in digital art history. Say hello on Twitter: @careygibb!