Meet Carlos Acevedo, Information Sciences Intern on The Art Genome Project, Spring 2015

The Art Genome Project
Apr 27, 2015 4:23PM

Carlos joined The Art Genome Project during his second semester at the Pratt Institute, where he is pursuing a Master of Science in Library and Information Science. Before joining Artsy, he gained experience in the contemporary art world as the Associate Director of a project gallery space in Austin, Texas. He also worked at the Austin Public Library and The Wittliff Collections, the special collections of his alma mater, Texas State University. Read what Carlos has to say about his experience on The Art Genome Project below.

I joined The Art Genome Project because of the groundbreaking vocabulary and cataloging process used to power the site. From an information science perspective, I can appreciate the complexity and nuance involved in recommending related artworks and artists based on the content and concepts of an artwork. There is truly nothing else like it, so I was very motivated to join the team and contribute!

Artsy, and especially The Art Genome Project, represents an incredible intersection of art and technology. Seeing software development methodology applied to developing and managing every aspect of The Art Genome Project was a revelation. Daily stand-ups, monthly sprints, and various project management tools were all new to me, and they contributed to a culture of openness and sharing that I found admirable. I genuinely felt like I was part of the team every day I went to the office, and that my contribution really counted.

During my tenure, I was able to work on immensely challenging individual projects, as well as projects that encouraged collaboration with other teams within Artsy. As a Genome team member, I had the responsibility of contributing to the application of descriptive metadata to artists and artworks based on Artsy’s internally developed vocabulary—or, more simply put, the process of “genoming.” It involved a lot of research on artists and galleries to make sure I fully captured the context and concepts in the artworks. Along the way, I found myself more in tune with current trends in the global contemporary art world, and even discovered new artists and galleries in New York that I now love and follow (Jordan Tate, whom I came across while genoming, is one of my new favorite artists).

The internship also allowed me to apply my information science education to the projects I worked on. In addition to running The Art Genome Project, the Genome Team is responsible for metadata related to the 40,000 artists and over 250,000 works on the site (this includes most of the non-editorial additional information you see around artists and works, such as an artist’s nationality or exhibition history). I worked with large datasets, cleaning and normalizing some of this metadata, which translated into a better, richer experience for users of the site. One specific example involved adding collection, exhibition, and publication histories to artists on the site. Using OpenRefine, an open-source tool for cleaning messy data, I was able to apply metadata best practices to the information that would eventually make its way to the artist pages. I normalized ambiguous gallery and institutional names based on the Library of Congress Name Authority File and the Virtual International Authority File (VIAF), and standardized city and country names for consistency on the site. For example, all the separate entries for “NYC,” “Manhattan,” “New York City,” etc. were changed to “New York, NY.” This helped ensure the accuracy of the information we provide and standardize collection entries across artists.
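To make the city-name cleanup concrete, here is a minimal Ruby sketch of the same idea: map known variant spellings onto one canonical form and leave anything unrecognized alone. It is only an illustration, not the OpenRefine workflow itself; the lookup table, the method names, and the sample entries are all hypothetical.

```ruby
# Hypothetical lookup table: lowercased variant spellings => canonical form.
# A real table would be built up while reviewing clusters of similar values.
CITY_VARIANTS = {
  "nyc"           => "New York, NY",
  "manhattan"     => "New York, NY",
  "new york city" => "New York, NY",
  "new york"      => "New York, NY"
}.freeze

# Trim the value and collapse runs of whitespace into single spaces.
def clean_whitespace(raw)
  raw.to_s.strip.gsub(/\s+/, " ")
end

# Build the lookup key: cleaned value, periods removed, lowercased,
# so that "N.Y.C." and " nyc " both resolve to the key "nyc".
def lookup_key(raw)
  clean_whitespace(raw).delete(".").downcase
end

# Return the canonical form if the variant is known; otherwise return
# the whitespace-cleaned original so unknown cities are never altered.
def normalize_city(raw)
  CITY_VARIANTS.fetch(lookup_key(raw), clean_whitespace(raw))
end

entries = ["NYC", " New York  City ", "N.Y.C.", "Manhattan", "Austin, TX"]
normalized = entries.map { |entry| normalize_city(entry) }
# normalized == ["New York, NY", "New York, NY", "New York, NY",
#                "New York, NY", "Austin, TX"]
```

Falling back to the cleaned original rather than raising an error keeps the pass safe to run over a whole column: only values that were explicitly reviewed get rewritten.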

OpenRefine screenshot working with Cindy Sherman collection entries

Collections page for Cindy Sherman

The Art Genome Project also gave me the unexpected opportunity to learn and practice Ruby, regular expressions, and SPARQL. For a long-term project, I queried linked open datasets and wrote over a hundred Ruby scripts to make harder-to-discover works of art and artifacts (generally by anonymous or unknown creators) more visible on the site, a perfect example of art and technology intersecting to enrich the experience for users. This project allowed us to populate category pages such as Middle Africa, Hopi Art, or Benin art more systematically and comprehensively. It’s worth noting that we use such processes to supplement genoming, not to replace it. Doing this initial categorization allows us to better assign genoming to specific experts, while also making the works more discoverable before a genomer can fully annotate them.
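As a rough sketch of how Ruby, regular expressions, and SPARQL can fit together for this kind of work, the example below builds a query, parses a response in the standard SPARQL JSON results format, and suggests categories by matching labels against patterns. It is not one of the actual scripts: the vocabulary used in the query (dcterms:subject, rdfs:label), the example.org URIs, the sample response, and the category patterns are all assumptions made for illustration, and no real endpoint is contacted.

```ruby
require "json"

# Build a SPARQL query for works whose subject label contains a term.
# Assumption: the dataset links works to subjects via dcterms:subject;
# a real linked open dataset may model culture or region differently.
def build_query(term, limit: 100)
  escaped = term.downcase.gsub("\\") { "\\\\" }.gsub('"') { '\\"' }
  <<~SPARQL
    PREFIX dcterms: <http://purl.org/dc/terms/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?work ?label WHERE {
      ?work dcterms:subject ?subject ;
            rdfs:label ?label .
      ?subject rdfs:label ?subjectLabel .
      FILTER(CONTAINS(LCASE(STR(?subjectLabel)), "#{escaped}"))
    }
    LIMIT #{Integer(limit)}
  SPARQL
end

# Turn a SPARQL JSON results document into simple Ruby hashes.
def parse_bindings(json_text)
  JSON.parse(json_text).fetch("results").fetch("bindings").map do |row|
    { uri: row.fetch("work").fetch("value"),
      label: row.fetch("label").fetch("value") }
  end
end

# Hypothetical category patterns, matched case-insensitively on whole words.
CATEGORY_PATTERNS = {
  "Hopi Art"  => /\bhopi\b/i,
  "Benin art" => /\bbenin\b/i
}.freeze

# Return the names of every category whose pattern matches the label.
def suggest_categories(label)
  CATEGORY_PATTERNS.select { |_name, pattern| label.match?(pattern) }.keys
end

# Made-up response standing in for what an endpoint might return.
SAMPLE_RESPONSE = <<~RESPONSE
  {
    "head": { "vars": ["work", "label"] },
    "results": { "bindings": [
      { "work":  { "type": "uri", "value": "http://example.org/work/1" },
        "label": { "type": "literal", "value": "Hopi katsina figure" } },
      { "work":  { "type": "uri", "value": "http://example.org/work/2" },
        "label": { "type": "literal", "value": "Benin bronze plaque" } },
      { "work":  { "type": "uri", "value": "http://example.org/work/3" },
        "label": { "type": "literal", "value": "Untitled vessel" } }
    ] }
  }
RESPONSE

works = parse_bindings(SAMPLE_RESPONSE)
suggestions = works.map { |w| [w[:uri], suggest_categories(w[:label])] }.to_h
```

Works that match no pattern come back with an empty list, which fits the point above: the script only proposes an initial categorization, and anything it cannot place is left for a genomer to annotate.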

Page for Middle Africa, with artworks pulled in by my Ruby script

However, the most rewarding aspect of my internship was the experience of working with the Genome Team members. They are all incredibly intelligent, thoughtful, and friendly, and I find their passion for the project and the work that they do inspirational. They treated me as an equal, let me propose and tackle my own projects, respected my background and subject expertise, and always considered my input and opinions. Their support and guidance throughout my tenure at Artsy were vital in making the internship one of the most distinctive and gratifying experiences I’ve ever had.



—Carlos Acevedo
