ePADD is an open source and freely available software package, funded by the Institute of Museum and Library Services (IMLS), that allows individuals and institutions to analyze and provide access to email of potential historical or cultural value. The software primarily accomplishes this goal by incorporating techniques from computer science, including natural language processing, named entity recognition, and other algorithmic processes. We will demonstrate and discuss the software in the context of personal digital archiving.
The task of the publisher in the digital era is to transmit all the features of a historical source faithfully. At the same time, work with texts from family collections needs to strike a balance between the personal and the public, to avoid publishing information that could compromise third parties. This is one of the most difficult tasks in publishing private archives, and the key to it can only be found in close cooperation with the heirs and administrators of family archives. We offer some solutions to these problems within "Prozhito" (prozhito.org), the first global database of 400 diverse non-authorized private diaries (150,000 entries), tied to a chronological line and representing personal narratives from the 19th and 20th centuries in Russian and Ukrainian. Prozhito blends the structural experience of blog platforms with the archival tradition of curating personal writings. Users can work not only with particular texts but with the whole collection of diary entries, building complex search queries by author's gender and age and by journal type (e.g., war, tourist, dream), and filtering results by exact dates and places of records. In Prozhito, the manuscript owners (a person or a family) continue to participate in its preparation for publication and control the text at every step of its transformation from manuscript to machine-readable database unit. They have the right to exclude fragments they consider inappropriate for ethical reasons. Working with a family history often activates communication within the family, but the information stored in family archives is of interest not only to family members. The Prozhito project allows any user to explore the diary data and offers extensive material for researchers of everyday life.
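The kind of combined query described above (author's gender and age, journal type, dates, places) can be illustrated with a minimal sketch. This is not Prozhito's actual data model or search code: the `Entry` fields, the `search` function, and the sample records are all hypothetical, invented only to show how such filters might compose.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Entry:
    """Hypothetical diary entry record; field names are assumptions."""
    author: str
    author_gender: str
    author_birth_year: int
    journal_type: str
    written_on: date
    place: str


def search(entries, gender=None, min_age=None, journal_type=None,
           start=None, end=None, place=None):
    """Return entries matching every filter that is not None."""
    results = []
    for e in entries:
        # Approximate age: calendar-year difference only.
        age = e.written_on.year - e.author_birth_year
        if gender is not None and e.author_gender != gender:
            continue
        if min_age is not None and age < min_age:
            continue
        if journal_type is not None and e.journal_type != journal_type:
            continue
        if start is not None and e.written_on < start:
            continue
        if end is not None and e.written_on > end:
            continue
        if place is not None and e.place != place:
            continue
        results.append(e)
    return results


# Invented sample records, for illustration only.
entries = [
    Entry("Author A", "f", 1900, "war", date(1942, 3, 1), "Leningrad"),
    Entry("Author B", "m", 1920, "tourist", date(1965, 7, 10), "Crimea"),
    Entry("Author C", "f", 1925, "war", date(1943, 11, 5), "Kyiv"),
]

war_1943 = search(entries, gender="f", journal_type="war",
                  start=date(1943, 1, 1), end=date(1943, 12, 31))
```

Each filter is optional, so the same function serves both a narrow query (one gender, one journal type, one year) and a broad one (for example, all entries written by authors aged 40 or over).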
ePADD is an open source and freely available software package, funded by the Institute of Museum and Library Services (IMLS), that supports the ability of individuals and institutions to analyze and evaluate email of potential historical or cultural value. The software primarily accomplishes these tasks by incorporating techniques from computer science, including the fields of natural language processing and named entity recognition. The software also supports the creation and use of customizable lexicons, attachment browsing, regular expression search, and other related features.
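As a rough illustration of what lexicon-based review and regular expression search over email can look like, here is a minimal sketch. It does not reproduce ePADD's implementation or interface: the `Message` class, the `LEXICON` categories and terms, the helper functions, and the sample messages are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical, simplified email message."""
    sender: str
    subject: str
    body: str


# Hypothetical lexicon: category name -> terms to flag for review.
LEXICON = {
    "health": ["diagnosis", "hospital", "surgery"],
    "finance": ["invoice", "salary", "loan"],
}

# Pattern for strings shaped like US Social Security numbers.
SSN_PATTERN = r"\b\d{3}-\d{2}-\d{4}\b"


def lexicon_hits(message, lexicon):
    """Map each lexicon category to the terms found in the message."""
    text = f"{message.subject} {message.body}".lower()
    hits = {}
    for category, terms in lexicon.items():
        # Whole-word match; terms are escaped so they are treated literally.
        found = [t for t in terms
                 if re.search(rf"\b{re.escape(t)}\b", text)]
        if found:
            hits[category] = found
    return hits


def regex_search(messages, pattern):
    """Return the messages whose body matches the regular expression."""
    compiled = re.compile(pattern)
    return [m for m in messages if compiled.search(m.body)]


# Invented sample messages, for illustration only.
messages = [
    Message("a@example.org", "Visit", "I was at the hospital on Monday."),
    Message("b@example.org", "Form",
            "My number is 123-45-6789, please file the invoice."),
]

flagged = regex_search(messages, SSN_PATTERN)
```

A lexicon groups terms by theme so that an archivist can review, for example, all messages touching on health, while a regular expression targets a fixed textual shape such as an identification number.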
This workshop will provide participants with the knowledge and experience to use ePADD to analyze and evaluate personal email archives, including their own email. The workshop will include discussion on overcoming potential implementation challenges, as well as opportunities to participate in ePADD's development.
Peter Chan (ePADD Project Manager, Digital Archivist, Stanford University) and Josh Schneider (ePADD Community Manager, Assistant University Archivist, Stanford University) will co-lead the workshop, orienting participants to the software, demoing its capabilities, and taking participants through the steps of using ePADD to analyze and evaluate personal email archives. A reading list and agenda will be distributed to participants in advance of the workshop.
Attendees will need to bring a laptop meeting the following minimum specifications (which may be updated prior to the workshop):
OS: Windows 7 SP1 / 10, Mac OS X 10.10 / 10.11
Memory: 4096 MB minimum (2048 MB RAM allocated to the application by default)
Browser: Chrome 50/51, Firefox 47/48
Windows installations: Java Runtime Environment 8u101 or later required.
Please note that attendees will also need administrative privileges for their machine to be able to run the software. Flash drives containing the latest ePADD release, as well as a test email archive, will be provided by the presenters.
About the Presenters
Peter Chan is Digital Archivist at Stanford University. He is also Project Manager for ePADD, an open-source software package that supports archival processes around the appraisal, processing, discovery, and delivery of email archives.
Josh Schneider is Assistant University Archivist at Stanford University, where he acquires and provides access to Stanford University records, faculty papers, and collections documenting campus and student life. He is also Community Manager for ePADD, an open-source software package that supports archival processes around the appraisal, processing, discovery, and delivery of email archives. He is an advisory board member of BitCurator NLP, and an editorial board member of American Archivist and Journal of Western Archives.