Projects
Research Projects
T5 meets Tybalt: author attribution in Early Modern English drama using large language models
Advised by Professor David Mimno @ Cornell University
Winter 2023
This project explores the ability of large language models like T5 to recognize authorial style. In particular, we ask T5 to attribute 1-450 word excerpts from Early Modern drama (plays written in the 1500s and 1600s) to 28 authors. The fine-tuned model attains an accuracy score of ~34%, vastly better than randomized baselines and noticeably better than logistic regression. In addition, variations in accuracy across individual authors and plays reveal interesting information about T5's ability to analyze style and about histories of disputed authorship. This work is currently in submission.
Tools for Tracking Police Misconduct Data
Advised by Professor Sarah Chasins and Professor Aditya Parameswaran @ UC Berkeley
Summer 2021 – Spring 2022
This project is building a database to help the National Association of Criminal Defense Lawyers track police officer misconduct. As part of this project, I iteratively designed MongoDB database queries to find officer misconduct cases, wrote code to output the results of those queries to a Postgres database, and created a website hosted with Flask that allows users to validate those results and then write them to the database. I have also worked on extracting information from difficult-to-process PDFs. During the summer, I also participated in the PLAIT Lab reading group and social events.
Word Clouds in the Wild
Advised by Professor Eric Alexander @ Carleton College
Summer 2020 – Summer 2022
This project aims to discover how word clouds are being used, with the eventual goal of designing guidelines for creating visually pleasing but effective word clouds. Along with Professor Alexander and another undergraduate researcher, I used grounded theory, a method of qualitative data analysis that involves hypothesis formation through data collection, to study word cloud usage in digital humanities academia and journalism. I also analyzed the collected data. A paper describing these results has been accepted for publication at the 2022 Vis4DH workshop at IEEE Vis.
Senior Thesis Projects
Practicum 2.0: An Interactive Tool for Practicing Introductory CS Topics
Advised by Professor Aaron Bauer @ Carleton College – Link to Project
Received the grade of Distinction
Winter 2022 – Spring 2022
This was the senior thesis project for my Computer Science major. Along with four other senior CS majors, I extended existing educational CS software called Practicum to include problems on classes and objects. To do this, we changed the underlying structure of Practicum, including its two parsers and simulators, to handle classes; designed and coded visualizations and practice problems to teach these concepts; and ran an experiment to test the effectiveness of the tool. We found that Practicum was a more effective study tool than paper worksheets, although these results were only significant for students still in Introduction to Computer Science. My focus throughout this project was designing, implementing, and integrating educational visualizations of classes and objects, although I was also involved in other aspects of the project.
"Let every work weigh heavy of her worth": Examining How Women Enact Power in Shakespeare’s Comedies through Interactive Speech Pattern Visualizations
Advised by Professor George Shuffelton @ Carleton College – Link to Project
Received the grade of Distinction
Fall 2022 – Winter 2022
This was my senior thesis project for the English major. In it, I analyzed how women enact power in Shakespeare’s comedies by visualizing their speech patterns. I hand-annotated each Shakespearean comedy for who talks to whom, designed and coded visualizations to present the speech pattern data, and then wrote an essay analyzing the visualizations and contextualizing my work within previous literary and DH analysis of Shakespeare. The results of this work have been accepted for publication through the 2022 Workshop on Computational Drama Analysis.
Portfolios
Digital Arts and Humanities Minor Capstone Portfolio
Advised by Professor Austin Mason @ Carleton College
Spring 2022
I created this online portfolio of my digital arts and humanities experience as part of the capstone class for the minor.
Class Projects
Using Agglomerative Clustering to Group PDFs by Format
Fall 2021
As the final project for my Artificial Intelligence course, I wrote an algorithm that uses agglomerative clustering to group PDFs by format. The inputs to the algorithm were abstracted images of the first page of each PDF.
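For readers unfamiliar with the technique, here is a minimal sketch of agglomerative clustering. It is illustrative only and not the project's code: the average-linkage merge rule, the Euclidean distance, and the tiny 2-D feature vectors standing in for abstracted page images are all assumptions.

```python
from itertools import combinations


def euclidean(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def average_linkage(c1, c2, points):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(points[i], points[j]) for i in c1 for j in c2]
    return sum(dists) / len(dists)


def agglomerative_cluster(points, num_clusters):
    """Start with one cluster per point, then repeatedly merge the two
    closest clusters until only num_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > num_clusters:
        # combinations() yields index pairs with a < b, so deleting b
        # after the merge never shifts the position of a.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(
                clusters[pair[0]], clusters[pair[1]], points
            ),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical features: each tuple stands in for one PDF's first page.
pages = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 5.0), (0.05, 0.05)]
groups = agglomerative_cluster(pages, num_clusters=2)
```

Each entry of `groups` is a list of indices into `pages`; with real data, the feature vectors would come from whatever abstraction of the page image is used.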
Scheme Interpreter in C++
Created in partnership
Spring 2021
Over the course of a term, my partner and I created an interpreter for Scheme in C for the Programming Languages class. We wrote a tokenizer, parser, and interpreter.
"What care I for words?": Visualizing Characters' Speech in As You Like It
Winter 2021
I created this digital humanities project for Shakespeare II. I hand-annotated Shakespeare’s As You Like It for who talked to whom and processed the data using a Python script. I then created two interactive visualizations in the computational notebook Observable. One allowed users to examine whom a character talked to throughout the play, and the other showed how two characters talked to each other. I also made an interactive version of my annotations so users could validate my work.
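As a rough illustration of this kind of processing, the sketch below tallies who talks to whom. It is not the original script: the annotation format (one speaker/addressee pair per speech), the sample data, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical annotation format: one (speaker, addressee) pair per speech.
annotations = [
    ("Rosalind", "Celia"),
    ("Celia", "Rosalind"),
    ("Rosalind", "Orlando"),
    ("Rosalind", "Celia"),
]


def count_speech_pairs(pairs):
    """Count how many speeches each speaker directs at each addressee."""
    return Counter(pairs)


def addressees_of(speaker, counts):
    """Return {addressee: speech count} for a single speaker."""
    return {to: n for (frm, to), n in counts.items() if frm == speaker}


counts = count_speech_pairs(annotations)
rosalind_targets = addressees_of("Rosalind", counts)
```

Counts in this shape could then be exported (for example as JSON) for a visualization layer to read.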
Biodiversity in National Parks Visualization
Created in partnership
Winter 2021
The Biodiversity in National Parks Visualization is an interactive visualization that presents comparative data on biodiversity in coastal and inland national parks. It was created with JavaScript in Observable and allows users to select what data to compare, click to see more detailed comparisons, and hover over categories to see more information. It was the final project for Data Visualization.
Seattle Library Explorer
Created in partnership
Fall 2020
The Seattle Library Explorer was a full-stack project created for my Software Design class. My partner and I used data from Kaggle on book checkouts in all the Seattle library branches to populate a database we created with Postgres. We also wrote a website hosted with Flask that uses psycopg2 to query the database.