
Showing posts with the label timemap

2021-10-20: Evolution of a Childhood Newspaper

Cover pages of Wijeya newspapers.
When we (Gavindya and Yasith) were young adults, we read the Wijeya newspaper weekly back in Sri Lanka. Wijeya is a native-language (Sinhala) weekly newspaper in Sri Lanka, published by Wijeya Newspapers Ltd. Back in the day, the Wijeya newspaper was published only on paper. Today, however, most newspapers are also published online. We remember reading the Wijeya newspaper every week, whenever we had free time. Since our parents did not allow us to use computers much, we devoted most of our leisure time to reading. The Wijeya newspaper included sections for science, fiction, children's drawings and creative writing, school news, and general news (local and international). Sometimes the Wijeya newspaper was sold out, and our parents had to go to a different store to buy it for us. It was a very popular newspaper among young adults. We loved to read it mostl...

2021-10-19: Proving I was an Etown ILA using the Internet Archive

To commemorate the 25th anniversary of the Internet Archive, I decided to use the Wayback Machine to dig through my past at Elizabethtown College, PA, mainly when I was featured as an International Leadership Assistant (ILA) on their international student website. I graduated from Elizabethtown College (Etown) in 2018 with a Bachelor of Science in Computer Engineering. In 2016, I had the opportunity to join the International Leadership Team to help international students promote global culture through various activities. I organized Culture Through ART every Fall semester and assisted my colleague Alexandria Takahashi in developing US Culture & Slang teaching materials. This is a good example of international students not only expressing their culture through crafts (e.g., origami) and paintings but also learning to adopt US culture (Figure 1). At present, as a Ph.D. student researching in the area of Digital Libraries,...

2020-05-21: Visualizing Webpage Changes Over Time With TMVis

Embed code for the Image Grid. Home page of tmvis.cs.odu.edu.
This work has been supported by an NEH/IMLS Digital Humanities Advancement Grant (HAA-256368-17). The web is dynamic, meaning webpages that exist today may not exist tomorrow. Even if a webpage continues to exist, it could display completely different content than it used to. Web archives, such as the Internet Archive (IA), Archive-It (AIT), and many others, preserve past versions of webpages for use by scholars, researchers, and the general public. Using Memento terminology, an archived version of a webpage at a particular time is called a memento, or URI-M, and the list of all mementos for a particular webpage is called a TimeMap. TimeMap sizes vary widely from webpage to webpage. For example, the TimeMap for odu.edu contains over 2000 mementos, while the TimeMap for cnn.com contains around 300,000. Analyzing such large TimeMaps is nearly impossible to do manually. Bas...
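To make the memento and TimeMap terminology in the excerpt above concrete, here is a minimal Python sketch that counts the mementos listed in a link-format TimeMap. The sample TimeMap, its `example.org` and `archive.example.net` URIs, and the helper name `parse_timemap` are all hypothetical and used only for illustration; this is not how TMVis itself processes TimeMaps, and real TimeMaps may be paginated or contain attributes this simple parser does not handle.

```python
import re
from email.utils import parsedate_to_datetime

# One link-format entry: <URI> followed by ;name="value" attributes.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)')
ATTR_RE = re.compile(r'([\w-]+)="([^"]*)"')

# Hypothetical TimeMap, modelled on the link-format layout used by Memento.
SAMPLE_TIMEMAP = """\
<http://a.example.org>; rel="original",
<http://archive.example.net/timemap/http://a.example.org>; rel="self"; type="application/link-format",
<http://archive.example.net/timegate/http://a.example.org>; rel="timegate",
<http://archive.example.net/web/20000620180259/http://a.example.org>; rel="first memento"; datetime="Tue, 20 Jun 2000 18:02:59 GMT",
<http://archive.example.net/web/20091027204954/http://a.example.org>; rel="memento"; datetime="Tue, 27 Oct 2009 20:49:54 GMT",
<http://archive.example.net/web/20101121000243/http://a.example.org>; rel="last memento"; datetime="Sun, 21 Nov 2010 00:02:43 GMT"
"""


def parse_timemap(text):
    """Return a sorted list of (datetime, URI-M) pairs found in a link-format TimeMap."""
    mementos = []
    for uri, attrs in LINK_RE.findall(text):
        params = dict(ATTR_RE.findall(attrs))
        rels = params.get("rel", "").split()
        # Only entries whose rel contains "memento" are archived captures.
        if "memento" in rels and "datetime" in params:
            mementos.append((parsedate_to_datetime(params["datetime"]), uri))
    return sorted(mementos)


if __name__ == "__main__":
    found = parse_timemap(SAMPLE_TIMEMAP)
    print(f"{len(found)} mementos")
    for when, uri_m in found:
        print(when.isoformat(), uri_m)
```

The same count applied to a TimeMap with hundreds of thousands of entries shows why manual inspection does not scale.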

2018-07-02: The Off-Topic Memento Toolkit

Inspired by AlNoamany's work from "Detecting off-topic pages within TimeMaps in Web archives", I am pleased to announce an alpha release of the Off-Topic Memento Toolkit (OTMT). The results of testing with this software will be presented at iPres 2018 and those results are now available as a preprint. Web archive collections are created with a specific purpose in mind. A curator will supply seeds for the collection and create multiple versions of these seeds in order to study the evolution of a web page over time. This is valuable for following the changes in an organization or the events in a news story. Unfortunately, depending on the curator's intent, sometimes these seeds go off-topic. Because web archive crawling software has no way to know that a page is off-topic, these mementos are added to the collection. Below I list a few examples of off-topic pages within Archive-It collections. This memento from the Human Rights collection at Archive-It create...

2018-03-12: NEH ODH Project Directors' Meeting

Michael and I attended the NEH Office of Digital Humanities (ODH) Project Directors' Meeting and the "ODH at Ten" celebration (#ODHatTen) on February 9 in DC. We were invited because of our recent NEH Digital Humanities Advancement Grant, "Visualizing Webpage Changes Over Time" (described briefly in a previous blog post when the award was first announced), which is joint work with Pamela Graham and Alex Thurman from Columbia University Libraries and Deborah Kempe from the Frick Art Reference Library and NYARC. The presentations were recorded, so I expect to see a set of videos available in the future, as was done for the 2014 meeting (my 2014 trip report). Update: 2018 Lightning Round Talk Videos. The afternoon keynote was given by Kate Zwaard, Chief of National Digital Initiatives at the Library of Congress. She highlighted the great work being done at LC Labs. Kate Zwaard is today's #ODHatTEN ke...

2017-10-16: Visualizing Webpage Changes Over Time - new NEH Digital Humanities Advancement Grant

In August, we were excited to be awarded an 18-month Digital Humanities Advancement Grant from the National Endowment for the Humanities (NEH) and the Institute of Museum and Library Services (IMLS). Our project, "Visualizing Webpage Changes Over Time", was one of 31 awards made through this joint NEH/IMLS program (award announcement). Michele C. Weigle and Michael L. Nelson - ODU; Deborah Kempe - Frick Art Reference Library and New York Art Resources Consortium; Pamela Graham and Alex Thurman - Columbia University Libraries. Oct 2017 – Mar 2019, $75,000. As web archives grow in importance and size, techniques for understanding how a web page changes through time need to adapt from an assumption of scarcity (just a few copies of a page, no more than a few weeks or months apart) to one of abundance (tens of thousands of copies of a page, spanning as much as 20 years). This project, a joint effort among ODU, the New York Art Resources Consortium (NYARC), and...

2017-03-24: The Impact of URI Canonicalization on Memento Count

Mat reports that relying solely on a Memento TimeMap is not sufficient to evaluate how well a URI is archived. We performed a study of very large Memento TimeMaps to evaluate the ratio of representations versus redirects obtained when dereferencing each archived capture. Read along below or check out the full report. Memento represents a set of captures for a URI (e.g., http://google.com) with a TimeMap. Web archives may provide a Memento endpoint that allows users to obtain this list of URIs for the captures, called URI-Ms. Each URI-M represents a single capture (memento), accessible when dereferencing the URI-M (resolving the URI-M to an archived representation of a resource). Variations in the "original URI" are canonicalized (coalescing https://google.com and http://www.google.com:80/, for instance)...
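As a rough illustration of the canonicalization mentioned in the excerpt above, the following Python sketch maps URI variants to a shared lookup key. The rules here (ignore the scheme, drop a leading `www.`, drop default ports, treat an empty path as `/`) and the function name `canonical_key` are assumptions chosen for the example; they are not the actual rules any particular web archive applies.

```python
from urllib.parse import urlsplit

# Ports that are implied by the scheme and can be dropped.
DEFAULT_PORTS = {"http": 80, "https": 443}


def canonical_key(uri):
    """Return a simplified lookup key for a URI (illustrative rules only)."""
    parts = urlsplit(uri)
    host = (parts.hostname or "").lower()
    # Assumption: "www." variants are treated as the same site.
    if host.startswith("www."):
        host = host[4:]
    # Keep the port only when it differs from the scheme's default.
    port = parts.port
    if port is not None and port != DEFAULT_PORTS.get(parts.scheme):
        host = f"{host}:{port}"
    # The scheme is deliberately left out, so http and https coalesce.
    key = host + (parts.path or "/")
    if parts.query:
        key += "?" + parts.query
    return key


if __name__ == "__main__":
    for variant in ("https://google.com", "http://www.google.com:80/"):
        print(variant, "->", canonical_key(variant))
```

Under these assumed rules both variants from the excerpt map to the same key, which is why a single TimeMap can list captures of several distinct original URIs.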