Posts

Showing posts with the label Open Access

2024-10-02: MS Thesis: Surfacing Text Changes in Archived Webpages

Thesis defense, July 29, 2024. Picture courtesy of Dr. Michele Weigle.

My master’s thesis, “Surfacing Text Changes in Archived Webpages,” explores how users can better find and view changes on webpages in web archives. The thesis contributes to the area of information-seeking behavior in web archives and addresses three research questions.

1. How can we make changes in webpages discoverable and understandable? We presented a change-text search interface for web archives that allows users to find changes in webpages. This interface also includes an animated deletion tool and a sliding difference tool, which help users view the changes in context. This part of the thesis was informed by our formative investigation, “User Tasks of Journalists.” We presented this work in our paper, “Making Changes in Webpages Discoverable: A Change-Text Search Interface for Web Archives,” at JCDL 2023, where it earned the best student paper award.

2. How can we increase efficiency in web archi...

2022-09-13: A Hybrid Classifier to Extract URLs linking to Open Access Datasets and Software for Computational Reproducibility Study

It has become common practice to include open access datasets and software (OADS) in computational research publications. Emily Escamilla discusses the increasing trend of including web and software repository platforms in scholarly publications in her blog post “One in Five arXiv Articles Reference GitHub.” OADS are essential resources for replicating computational experiments and making the work more transparent. They are also crucial for building repositories that support computational reproducibility. Manually examining a large number of research papers to extract URLs linking to OADS is time-consuming and labor-intensive, so an automatic approach is needed to identify and extract OADS-URLs (URLs linking to OADS) from scientific papers. We proposed a hybrid OADSClassifier, consisting of a heuristic and a supervised learning model, to identify OADS-URLs in a research paper automatically. The classifier achieves a best F1 of 0.92. The source code is av...