Automatic discovery and recommendation systems are often designed with one of two audience groups in mind: in academia, the target is the dedicated researcher who actively seeks out particular sources, whereas companies like Amazon or Netflix design recommendations for the casual, passive browser, with convenience as the top priority. Often, however, a user is both browser and researcher in separate tabs; while diving into research in a scholarly database, a user can simultaneously peruse news aggregators or Amazon. For-profit companies often recommend cultural products such as books and movies, but do so with a single goal—increasing the company’s profit. As digital humanists, we should rethink the structure of recommendation algorithms to make them more appropriate for audiences interested in deeper explorations of cultural heritage.
At HyperStudio, we are investigating how digital tools can encourage discovery and serendipity in the humanities, with a focus on art objects and museum collections. For this short paper session, we propose to share our research on the process of discovery, assessing algorithms used in research and recommendation tools on both scholarship and industry platforms. We will survey existing projects that allow scholars and casual users alike to discover new art. We will also discuss a tool that we are building, tentatively titled ArtX, that empowers users to discover cultural events, exhibitions, and art objects in the Boston area. Informed by our theoretical research into cultural recommendation systems, we are prototyping and testing this tool this spring and will be sharing our results at DH2014.
Recommendation systems are typically divided into two approaches: collaborative filtering and content-based filtering.1 While many digital tools use these in combination, here we outline the approaches and their limitations separately. Content-based filtering approaches, such as traditional tagging systems, look at the properties of the content rather than the user. Whether human- or machine-powered, tagging involves inferring what an object is “about” and how one might search for it, and assigning keywords such as names, topics, or entities. The act of classifying culture is by its nature restrictive; when an art object is called “surrealist” or “American,” it is placed in a particular discourse and others are implicitly excluded. Even outside-the-box descriptions such as “hazy” are just different boxes. Artsy’s “Art Genome Project” offers a more nuanced approach to tagging (with gradients from 0 to 100, rather than 0 to 1), but it runs into the same problem.2 When an authoritative institution such as a museum produces tags, the tagging system lacks dynamism. User-generated tagging, or folksonomy, adds a dynamic element but requires that users actively and continually contribute to building up the tags, a process that is difficult to maintain.
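To make the limitation of content-based filtering concrete, the following is a minimal sketch, assuming a toy catalog with graded tags on a 0 to 100 scale. The object names, tag weights, and the choice of cosine similarity are illustrative assumptions, not a description of Artsy's or any museum's actual system.

```python
import math

# Hypothetical graded tags (0-100). All object names and weights are
# invented for illustration only.
CATALOG = {
    "painting_a": {"surrealist": 90, "hazy": 40},
    "painting_b": {"surrealist": 70, "american": 60},
    "photo_c": {"american": 95, "documentary": 80},
}

def cosine(a, b):
    """Cosine similarity between two sparse tag-weight dictionaries."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def similar_objects(object_id, catalog):
    """Rank every other object by tag similarity to object_id."""
    target = catalog[object_id]
    scores = [
        (other, cosine(target, tags))
        for other, tags in catalog.items()
        if other != object_id
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

ranking = similar_objects("painting_a", CATALOG)
for object_id, score in ranking:
    print(f"{object_id}: {score:.2f}")
```

In this sketch an object that shares no tags with the starting point scores zero and can never surface, however finely the tags are graded, which is the sense in which gradient tags remain "boxes."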
Collaborative filtering attempts to sidestep these limitations, focusing instead on the user and their online behavior, similar users, and social networks. User history-based approaches like Amazon’s maximize efficiency at the expense of variety, assuming that a user has no desire to try something new. Social curation tools such as Curiator, ArtStack, Pinterest and Tumblr allow users to build their own collections and share with others, but they perpetuate what is already popular or the most reblogged. Collaborative filtering may work when shopping for a product, but risks creating a filter bubble for art. It shepherds audiences into identical routes of understanding, stifling productive conversation and leaving treasures undiscovered in the process. At the heart of these approaches is the notion that more personalization leads to higher quality, and that existing networks and canons should be reinforced; these are meaningful signals, but they should not be the only ones.
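The popularity effect described above can be illustrated with a minimal sketch of user-based collaborative filtering. The user names, collections, and the overlap-count similarity are hypothetical simplifications; production systems such as Amazon's are proprietary and far more elaborate.

```python
from collections import Counter

# Hypothetical user collections. Every name here is invented for illustration.
COLLECTIONS = {
    "ana": {"water_lilies", "starry_night", "the_scream"},
    "ben": {"water_lilies", "starry_night"},
    "cy": {"starry_night", "the_scream", "obscure_print"},
    "dee": {"water_lilies", "starry_night"},
}

def recommend(user, collections, top_n=2):
    """Suggest unseen items, weighting each by overlap with other users."""
    mine = collections[user]
    votes = Counter()
    for other, theirs in collections.items():
        if other == user:
            continue
        overlap = len(mine & theirs)  # shared items act as the similarity score
        for item in theirs - mine:
            votes[item] += overlap
    return [item for item, _ in votes.most_common(top_n)]

suggestions = recommend("ben", COLLECTIONS)
print(suggestions)
```

Under these assumptions the widely collected work outranks the obscure one, because it is held by more users who resemble the target user. Nothing in the scoring rewards novelty.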
One alternative approach is to introduce an element of chance into the discovery process. The role of serendipity in scholarly research has been a growing topic of investigation in recent years.3 Serendipity has historically played a significant role in science, mathematics, and the humanities. As resources are increasingly digitized, an oft-cited lament is the loss of serendipity: scholars yearn for the days when one would go to the library stacks looking for one book and happen upon another that sparked one's thinking in new directions.
While serendipity is chance-based and cannot be controlled, perhaps it can be engineered. A few existing digital humanities and cultural heritage projects experiment with engineering serendipity. Serendip-o-matic, launched in August 2013, aims to re-incorporate chance into the scholarly research process. On the website, users input a text; the tool identifies key words in the text and responds with primary source images from several online collections. The goal of Serendip-o-matic is to yield happy accidents for a wide range of users, whether students in search of inspiration for a paper topic or scholars looking for materials to enliven a current project.4 Another example is Magic Tate Ball, a mobile application designed by digital studio Thought Den to encourage a general audience to discover works of art in the Tate’s collection. Using GPS location, time of day, weather, and analysis of ambient noise, the application returns an artwork, explaining why this work was selected and providing content that allows the user to learn more.5 Magic Tate Ball enables users to engage with works they would not have sought out otherwise while infusing play in the discovery process.
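As a rough illustration of what "engineering" serendipity could mean, the sketch below extracts frequent words from an input text and then chooses at random among collection objects that share at least one of them. It is not how Serendip-o-matic or Magic Tate Ball is implemented; the stopword list, the collection, and the function names are all hypothetical.

```python
import random
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "for", "on", "with", "is", "at"}

# Hypothetical mini-collection: object id -> descriptive terms.
COLLECTION = {
    "harbor_etching": {"harbor", "ships", "boston"},
    "mill_photograph": {"mill", "workers", "textile"},
    "storm_painting": {"storm", "ships", "sea"},
}

def keywords(text, top_n=5):
    """Return the most frequent non-stopword terms in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return {w for w, _ in counts.most_common(top_n)}

def serendipitous_pick(text, collection, rng):
    """Pick at random among objects sharing at least one keyword with the text."""
    terms = keywords(text)
    # Sort candidates so a seeded generator gives reproducible picks.
    candidates = sorted(oid for oid, tags in collection.items() if tags & terms)
    return rng.choice(candidates) if candidates else None

pick = serendipitous_pick(
    "Ships in the harbor and ships at sea", COLLECTION, random.Random(0)
)
print(pick)
```

The design choice here is to constrain chance rather than remove it: keywords narrow the field to loosely related objects, and the random draw decides which of them the user sees.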
At HyperStudio, we hope to incorporate a similar sense of serendipity in ArtX. Serendipity has the dual advantage of skirting traditional boundaries and adding a playful element to the user experience, which serves both browser and researcher. As we aim to make meaningful and creative connections between the art objects that comprise our past and the events of the present, we believe we can incorporate both audience groups without sacrificing archival rigor. To do so, we will need a holistic, audience-centered approach to digital curation and recommendation.
To achieve this goal, we plan to start small. Through specific partnerships with museums in Boston, we are building a closed and controlled system that can serve as a testing ground for new models of recommendation. Free from industry demands such as growth and scale, we can refine our schemas and test our assumptions before expanding to other institutions. We are also hopeful about creating a collaborative, open-source approach to art recommendation, particularly given how closely proprietary recommendation algorithms are guarded. By encouraging open conversation around the ways we recommend art, we may find new approaches and identify ways in which current recommendation systems are insufficient or misleading.
We have many questions and challenges ahead. It will be important to understand our audience: How much control over the discovery process do users want, and how can we best balance the sliding scale between browser and researcher? We expect our primary audience to be Boston-area residents and university communities—a casual but informed audience that bridges aspects of both. We hope to instill a scholar’s depth of interest and rigor in the casual user and we hope scholars too can employ the tool as serendipitous inspiration for their own work. But how transparent can we be about the logic behind our recommendations? How can we scale such a strategy, connecting artworks to books, lectures, music, movements and ideas?
Perhaps most importantly, while we have explained “why serendipity,” we must address the “how.” Serendipity involves more than simply selecting objects at random, but what signals are important? How can we prime a user for the mindset of serendipitous discovery, rather than rote research? Moreover, is it truly serendipitous if we are closely engineering the suggestion? We look forward to addressing these questions, while taking care not to create faulty algorithms of our own. One of the challenges in this process is to avoid reducing cultural objects to the level of products, and museum audiences to consumers. Looking past the current limitations of discovery will be vital for generating new connections and ideas.
1. A.A. Kardan and M. Ebrahimi, A novel approach to hybrid recommendation systems, Information
2. Interview: Matthew Israel on The Art Genome Project, September 21, 2012, Museum Geek, museumgeek.wordpress.com/2012/09/21/interview-matthew-israel-on-the-art-genome-project.
3. Scholarship includes Allen Edward Foster and Nigel Ford, “Serendipity and Information Seeking: An Empirical Study,” Journal of Documentation, 59 (2003): 3, pp. 321-340; Sebastian Chan, “Tagging and Searching – Serendipity and museum collection databases” (paper presented at the annual meeting for Museums and the Web, San Francisco, California, April 11-14, 2007); and Anabel Quan-Haase and Kim Martin, “Digital Humanities: The Continuing Role of Serendipity in Historical Research” (paper presented at the annual meeting for iConference, Toronto, Canada, February 7-10, 2012).
4. One Week | One Tool Team Launches Serendip-o-matic, Roy Rosenzweig Center for History and New Media, Friday, August 2, 2013, chnm.gmu.edu/news/one-week-one-tool-team-launches-serendip-o-matic.
5. Ben Templeton (2012), Mobile Culture and the Magic Tate Ball, The Guardian, July 16, www.theguardian.com/culture-professionals-network/culture-professionals-blog/2012/jul/16/mobile-culture-magic-tate-ball-app.