Python Metadata Validator

Output of the validator using the sample metadata
(as a txt file)

The project built on Python programming skills I developed over a semester in the course Programming for Cultural Heritage, and incorporated work I completed concurrently in another class, Metadata Design. In Metadata Design, I worked with a teammate to develop a metadata application profile for a collection of oral histories held at NYU; the profile included cataloging guidelines, a domain model, an element set, and an entry mechanism. As part of that project, a group of our peers tested our guidelines and entry mechanism. In Programming for Cultural Heritage, I wrote a metadata validator and used it to test the quality of the metadata our peers wrote. The project also served as an experiment to explore which components of metadata a machine can and cannot validate.

The final project is available as a Google Colab notebook designed to run a series of tests on an accompanying CSV file. In practice, the guidelines and entry mechanism I developed in Metadata Design resulted in very high-quality metadata, so I added several errors to the sample dataset to assess the validator's ability to flag them. Throughout the project, I developed skills in applying Python to a practical context and built capacity working with regex and with Python libraries including pyspellchecker and pandas. I developed the script iteratively, addressing one part of the problem at a time, and in doing so practiced breaking a complex technical task into achievable components.
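To give a sense of what this kind of test looks like, here is a minimal sketch of regex-based checks run over CSV rows. It is not the project's actual script: the column names (`identifier`, `title`, `date_created`), the sample rows, and the patterns are all hypothetical, and it uses the standard-library `csv` module in place of pandas so that it runs without any installation.

```python
import csv
import io
import re

# Hypothetical sample data: column names and values are invented for illustration.
SAMPLE_CSV = """identifier,title,date_created
oh_001,Interview with a longtime resident,2019-04-12
oh_002,,2019-13-40
oh 003,Interview with a shop owner,April 2019
"""

# Hypothetical rules: each column maps to a pattern its whole value must match.
RULES = {
    "identifier": re.compile(r"oh_\d{3}"),
    "title": re.compile(r"\S.*"),  # non-empty, starts with a non-space character
    # Checks the YYYY-MM-DD shape and plausible ranges only,
    # not calendar validity (2019-02-31 would still pass).
    "date_created": re.compile(r"\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])"),
}


def validate_rows(rows, rules):
    """Return (row_number, column, value) for every value that fails its rule."""
    errors = []
    # Start at 2 so row numbers match a spreadsheet view (row 1 is the header).
    for row_number, row in enumerate(rows, start=2):
        for column, pattern in rules.items():
            value = (row.get(column) or "").strip()
            if not pattern.fullmatch(value):
                errors.append((row_number, column, value))
    return errors


def validate_csv_text(text, rules=RULES):
    """Parse CSV text and run every rule against every row."""
    return validate_rows(csv.DictReader(io.StringIO(text)), rules)


if __name__ == "__main__":
    for row_number, column, value in validate_csv_text(SAMPLE_CSV):
        print(f"Row {row_number}: invalid {column!r}: {value!r}")
```

Checks like these can confirm that a value has the expected shape, but not that a well-formed title or date actually describes the recording, which is one way to picture the boundary between what a machine can and cannot validate.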

I researched, wrote, and modified the Python scripts for this project independently, and presented the project in class.

Python (libraries: pyspellchecker, pandas, re), Google Colab, Regex, PBCore