An archivist in the lab with a codebook: Using archival theory and “classic” detective skills to encourage reuse of personal data (Carly Dearborn, Purdue University Libraries)
Presentation Details:
Name that File! An Active Learning Approach to Promoting Thoughtful Filenaming Practices in Personal Digital Archives
While developing filenaming schemes can be a mundane activity, sometimes the long-term usability and findability of personal digital assets depends solely on the naming of files. So how might personal digital archives (PDA) instructors teach the importance of filenaming strategies in an engaging manner? “Name That File” is a brief (approximately 15 minute) group activity that PDA instructors can use to teach filenaming concepts such as description and choosing the data elements by which to organize. Using everyday items like printed photos and manila folders to stand in for abstract notions like computer files and directories, the activity aims to promote more engaged thoughtfulness about how one organizes and names their personal digital assets. This poster will describe the components and learning objectives for the "Name That File" activity that can be included in personal digital archiving workshops and other related programming.
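As an illustration of choosing data elements for a filename, the following is a minimal sketch and not part of the "Name That File" activity itself. The chosen elements (date, event, subject, sequence number) and the function names `slugify` and `build_filename` are assumptions made for this example.

```python
import re
from datetime import date


def slugify(text: str) -> str:
    """Lowercase the text and replace runs of non-alphanumerics with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(taken: date, event: str, subject: str, seq: int, ext: str) -> str:
    """Join the chosen data elements into one sortable filename.

    The element order (date first) is an assumption: an ISO date at the
    front makes files sort chronologically in a plain directory listing.
    """
    parts = [taken.isoformat(), slugify(event), slugify(subject), f"{seq:03d}"]
    return "_".join(parts) + "." + ext.lower().lstrip(".")


print(build_filename(date(2017, 3, 29), "Family Reunion", "Grandma & cake", 7, "JPG"))
# prints: 2017-03-29_family-reunion_grandma-cake_007.jpg
```

Underscores separate the elements and hyphens separate words within an element, so each element can be recovered later by splitting on the underscore.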
ePADD is an open source and freely available software package, funded by the Institute of Museum and Library Services (IMLS), that allows individuals and institutions to analyze and provide access to email of potential historical or cultural value. The software primarily accomplishes this goal by incorporating techniques from computer science, including natural language processing, named entity recognition, and other algorithmic processes. We will demonstrate and discuss the software in the context of personal digital archiving.
The online interactive digital archive to be demonstrated is an exhibit of the Stanford University Libraries: the “Edward A. Feigenbaum Papers.” Although a finished project and product, it is really a prototype of what libraries should be building and hosting to make available materials that record the intellectual life of eminent scholars. The target audiences are people doing historical research, and students examining the history of particular people and ideas. The “Edward A. Feigenbaum Papers” collection primarily concerns his work in artificial intelligence (AI) at Stanford University, and in his public service. It includes administrative and project files, correspondence, proposals, reports, reprints, AI Lab preprints, audio tapes, video tapes, and files on computer programs, including EPAM, DENDRAL, MOLGEN, MYCIN, the language IPL-V, and others. The collection includes papers documenting the histories of the main laboratories in which he did his collaborations: Heuristic Programming Project, Knowledge Systems Laboratory, and SUMEX-AIM. Finally, there are documents related to Feigenbaum's public service to the US Air Force (as Chief Scientist), the National Institutes of Health, the National Library of Medicine, the National Science Foundation, and the Defense Advanced Research Projects Agency. The physical materials are stored in 78 boxes, with access delays of days. In the online version, all materials have been scanned into PDF files with OCR backing, so that every word of most materials is searchable by keywords, using a Google-like search. Other navigation tools offer alternate paths of access, including a “similarity” search based on word frequencies in documents. Every item is downloadable by the user. This digital archive was built using the Zotero software for the editing and annotation of metadata (done by Feigenbaum); and collection management software developed by Stanford Libraries’ Digital Libraries Systems & Services (DLSS).
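The abstract does not say how the exhibit's “similarity” search is computed. As one common way to compare documents by word frequencies, the sketch below uses cosine similarity over raw word counts. It is an assumption for illustration, not the Stanford implementation, and the document ids and texts are made up.

```python
import math
import re
from collections import Counter


def word_counts(text: str) -> Counter:
    """Count lowercase alphanumeric tokens in a document."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two word-frequency vectors (0.0 to 1.0)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_id: str, docs: dict) -> list:
    """Rank every other document by similarity to the query document."""
    query = word_counts(docs[query_id])
    scores = [
        (cosine_similarity(query, word_counts(text)), doc_id)
        for doc_id, text in docs.items()
        if doc_id != query_id
    ]
    return sorted(scores, reverse=True)


# Hypothetical documents, standing in for OCR text of scanned items.
docs = {
    "memo-1": "expert systems for chemistry",
    "memo-2": "expert systems for medicine",
    "memo-3": "budget report for the laboratory",
}
print(most_similar("memo-1", docs)[0][1])  # prints: memo-2
```

A production system would more likely weight terms (for example with TF-IDF) so that common words such as "for" count for less, but the ranking idea is the same.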
Personal Web archiving requires enabling individuals to preserve Web content at will. In previous work, we introduced Web Archiving Integration Layer (WAIL), a tool that integrates an archival crawler (Heritrix) and replay system (Wayback) to facilitate individuals' preservation. In this work, we have vastly revised WAIL using modern Web technologies and introduced the concept of collection-based personal Web archiving that can be accomplished on a user's machine. Unlike subscription-based Web archiving services, like Archive-It, WAIL provides an interoperable mechanism to accomplish this without reliance on an external service. We rebuilt WAIL using its core concept into an Electron-based native application for a more consistent and accessible interface with better integration with Heritrix, OpenWayback, and other personal Web Archiving tools. Electron has allowed WAIL to be more consistent across the Mac, Windows, and Linux platforms. A central focus of this update has been on social media, mainly the preservation of users' Twitter feeds. In contrast to the previous versions of WAIL, our revision leverages a native Chromium browser via Electron to surface content specific to sites like Twitter for more accurate preservation. This additional functionality will allow users to focus on preservation of personal Web content that may have previously been difficult to archive.
Personal, Political, Public: Digital Archiving the Present
Introduction by Karen Biestman, Associate Dean and Director, Native American Cultural Center, Stanford University.
This panel will examine the narratives and biases that have impacted PDA in the past, and approaches that some are taking to push the needle towards social justice, including citizen documentation.
Citizen documentation is increasingly becoming inextricable from the work of many activist and social justice communities. This documentation can act as counter-evidence, a way of speaking back to the means and methods of evidence gathering by the state. Over the past few years, citizen documentation of fatal encounters with police has served as a catalyst for many overlapping communities, exposing the urgency of confronting police violence, the techno-utopian allure of surveillance technologies and the speed with which technologies of dissemination can disperse, disconnect and re-contextualize.
Building tools for collecting, authenticating, organizing, storing and accessing myriad forms and formats of documentation poses both challenges and opportunities. With instances of police violence, we must ask whether calls for authentication force us to operate within the juridical framework of evidentiary value, or whether we need to redefine evidentiary value in community terms. Recordings made with a smart phone are certainly personal digital records, but they immediately become embroiled in a network of legal and technological issues, for example through the use of corporately owned infrastructures and through evidence law.
Andrea Pritchett, co-founder of Berkeley Copwatch, Robin Margolis, UCLA MLIS in Media Archives, and Ina Kelleher, PhD student in Comparative Ethnic Studies with a Designated Emphasis in Gender, Women and Sexuality Studies, will present a proposed design for a digital archive aggregating different sources of documentation toward the goal of tracking individual officers. Copwatch chapters operate from a framework of citizen documentation of the police as a practice of community-driven accountability and de-escalation.
We designed “Publishat” based on personal data as a paradigm. We developed a framework for managing personal data for a lifetime. We organised personal data into 6 major areas: Academics, Personal, Professional, Health, Financial and Legal. There is another dimension, Communication, which is mandatory for all the major areas. The content is created using the Personal Lifecycle Management framework, is context-based, and is therefore effective in aiding decision making.
The purpose of the publisher in a digital era is to follow the principles of correct transmission of all the features of the historical source. At the same time, work with texts from family collections needs to strike a balance between personal and public to avoid publishing information that could compromise third parties. This is one of the most difficult tasks of publishing private archives, and the key to it can only be found in close cooperation with the heirs and administrators of family archives. We offer some solutions to these problems within “Prozhito” (prozhito.org), the first global database of 400 diverse non-authorized private diaries (150,000 entries), tied to a chronological line, representing personal narratives from the 19th and 20th centuries in Russian and Ukrainian. Prozhito blends the structural experience of blog platforms with the archival tradition of curating personal writings. Users can work not only with particular texts but with the whole collection of diary entries, building complex search queries by author’s gender and age and by journal type (e.g., war, tourist, dream), and filtering results by exact dates and places of records. In Prozhito, the manuscript owners (a person or family) continue to participate in preparing the manuscript for publication and control the text at every step of its transformation from manuscript to machine-readable database unit. They have the right to exclude fragments considered inappropriate for ethical reasons. Working with a family history often activates intrafamily communication, but the information stored in family archives is of interest not only to family members. The Prozhito project allows any user to explore the diary data and offers a large body of material for researchers of everyday life.
Webrecorder is a free online tool that allows users to create their own high-fidelity archives of the dynamic web. Current digital preservation solutions involve complex, automated processes that were designed for a web made up of relatively static documents. Webrecorder, in contrast, can capture social media and other dynamic content, such as embedded video and complex JavaScript, while putting the user at the center of the archiving process.
Over the past decade, broad public understanding of and appreciation for personal digital archiving has increased in surprising ways. Facebook’s Timeline is in use by nearly 2 billion monthly users, Prince’s digital archive recently went on the market for $35 million, billions of smart phones have been put into use to record daily experience, and concern about the future of personal digital belongings has become a staple of mainstream news reporting.
This discussion between early observers and practitioners of personal digital archiving will look back on the last decade, and forward to the next, covering changing social norms about what is saved, why, who can view it, and how; legal structures, intellectual property rights, and digital executorships; institutional practices, particularly in library and academic settings, but also in the form of new services to the public; market offerings from both established and emerging companies; and technological developments that will allow (or limit) the practice of personal archiving.
Digital audio and video permeate every aspect of our social experience - capturing familial memories, communicating news, documenting civil rights abuses, and weaponizing political campaigns. To many individuals, archiving is a mystery. To many organizations, it is a challenge. Born digital audiovisual collections are increasingly at risk of loss due to rapid format obsolescence, proliferation of content without sufficient preservation planning, and popular software-as-a-service archiving models limiting the public's knowledge and participation in the archiving process.
Three fundamental questions will be addressed: What constitutes an archive of born digital AV? How can a person or small organization curate and steward a born digital AV collection for preservation and access? How can archiving born digital AV be practical, efficient, and cost-effective? The workshop curriculum will include characteristics of born digital AV, how methods of generating born digital AV influence archiving, efficient and practical digital tools, preservation strategies for retention and access, and ethics and privacy.
Tools and services that will be covered or referenced include but are not limited to VLC, QuickTime, iTunes, Windows Media Player, MediaInfo, HandBrake, MPEG Streamclip, FFmpeg, Adobe Media Encoder, Exact Audio Copy, VOB2mpeg, Guymager, Forensic Toolkit, IsoBuster, Toast, cloud-based archiving services, and online video platforms for access.
Instructors will support Mac/PC attendees.
About the Presenters
Stefan Elnabli is UC San Diego Library's Media Curation Librarian and digital reformatting operations supervisor, providing strategic direction in the Library's development, management, and preservation of moving image collections. Elnabli's engagement with visual culture spans the areas of cinema studies, archival preservation, and film programming/projection. His past appointments include positions with WNET Channel 13 Digital Archive, Anthology Film Archives, Doc Films at the University of Chicago, and preservation units within major university libraries including New York University, Stanford University, and Northwestern University. Elnabli holds an MA in Moving Image Archiving and Preservation from New York University.
Annalise Berdini is UC San Diego Library’s Digital Archivist, providing library-wide expertise on workflows, tools, and best practices to support the management and preservation of born-digital content. She previously served as Manuscripts/Archives Processor for Special Collections and Archives at UCSD, Project Assistant and Processor for the PACSCL/CLIR Hidden Collections Project in Philadelphia, and worked on the Z. Taylor Vinson Collection at Hagley Museum and Library.
ePADD is an open source and freely available software package, funded by the Institute of Museum and Library Services (IMLS), that supports the ability of individuals and institutions to analyze and evaluate email of potential historical or cultural value. The software primarily accomplishes these tasks by incorporating techniques from computer science, including the fields of natural language processing and named entity recognition. The software also supports the creation and use of customizable lexicons, attachment browsing, regular expression search, and other related features.
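To give a rough sense of what lexicon-based and regular expression search over email can look like, here is a minimal sketch using only the Python standard library. It does not use or reproduce ePADD's code or interfaces. The lexicon categories, terms, and sample message are hypothetical.

```python
import re
from email import message_from_string

# Hypothetical lexicon: category name -> terms of interest.
LEXICON = {
    "finance": ["budget", "invoice"],
    "travel": ["flight", "hotel"],
}


def compile_lexicon(lexicon: dict) -> dict:
    """Turn each category's terms into one case-insensitive whole-word regex."""
    return {
        category: re.compile(
            r"\b(?:" + "|".join(re.escape(term) for term in terms) + r")\b",
            re.IGNORECASE,
        )
        for category, terms in lexicon.items()
    }


def message_text(raw: str) -> str:
    """Return the subject plus all text/plain parts of a raw email message."""
    msg = message_from_string(raw)
    parts = [
        part.get_payload()
        for part in msg.walk()
        if part.get_content_type() == "text/plain"
    ]
    return "\n".join([msg.get("Subject", "")] + parts)


def tag_message(raw: str, patterns: dict) -> list:
    """List the lexicon categories whose terms appear in the message."""
    text = message_text(raw)
    return sorted(cat for cat, pat in patterns.items() if pat.search(text))


raw = "From: a@example.org\nSubject: Grant budget\n\nPlease send the invoice by Friday.\n"
print(tag_message(raw, compile_lexicon(LEXICON)))  # prints: ['finance']
```

The sketch reads message bodies as-is and does not decode base64 or quoted-printable parts, which a real archive would need to handle.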
This workshop will provide participants with the knowledge and experience to use ePADD to analyze and evaluate personal email archives, including their own email. The workshop will include discussion on overcoming potential implementation challenges, as well as opportunities to participate in ePADD’s development.
Peter Chan (ePADD Project Manager, Digital Archivist, Stanford University) and Josh Schneider (ePADD Community Manager, Assistant University Archivist, Stanford University) will co-lead the workshop, orienting participants to the software, demoing its capabilities, and taking participants through the steps of using ePADD to analyze and evaluate personal email archives. A reading list and agenda will be distributed to participants in advance of the workshop.
Attendees will need to bring a laptop meeting the following minimum specifications (which may be updated prior to the workshop):
OS: Windows 7 SP1 / 10, Mac OS X 10.10 / 10.11
Memory: 4096 MB minimum (2048 MB RAM allocated to the application by default)
Browser: Chrome 50/51, Firefox 47/48
Windows installations: Java Runtime Environment 8u101 or later required.
Please note that attendees will also need administrative privileges for their machine to be able to run the software. Flash drives containing the latest ePADD release, as well as a test email archive, will be provided by the presenters.
About the Presenters
Peter Chan is Digital Archivist at Stanford University. He is also Project Manager for ePADD, an open-source software package that supports archival processes around the appraisal, processing, discovery, and delivery of email archives.
Josh Schneider is Assistant University Archivist at Stanford University, where he acquires and provides access to Stanford University records, faculty papers, and collections documenting campus and student life. He is also Community Manager for ePADD, an open-source software package that supports archival processes around the appraisal, processing, discovery, and delivery of email archives. He is an advisory board member of BitCurator NLP, and an editorial board member of American Archivist and Journal of Western Archives.
Embark on a self-guided tour of Stanford University Libraries' cross-disciplinary exhibition, Terraforming: Art and Engineering in the Sacramento Watershed.
The exhibition, co-curated by Laura Cassidy Rogers (PhD Candidate in Modern Thought and Literature, Stanford University) and Emily Grubert (PhD Candidate in the Emmett Interdisciplinary Program in Environment and Resources, Stanford University), is on view in the Peterson Gallery and Munger Rotunda in the Green Library Bing Wing, from January 26 to April 30, 2017.
Terraforming: Art and Engineering in the Sacramento Watershed examines the history of freshwater in the Sacramento Watershed, juxtaposing materials from the archive of California artists Helen and Newton Harrison with materials from local, state, and national archives that document the development of water resources in California’s Central Valley and the West. Presented as discrete, parallel displays—with Art on one side of the gallery and Engineering on the other—the exhibition demonstrates that social and environmental consciousness has manifest in both professions, and that artists and engineers can work together to rethink and reimagine freshwater landscapes and ecology in a sustainable way.
The Rumsey Map Center, named for its leading donors, David and Abby Rumsey, complements Stanford Libraries’ long history of working with cartographic materials. The combined holdings include the David Rumsey Map Collection of some 150,000 maps and their digital surrogates as well as other cartographic collections and materials long held at Stanford, including the Glen McLaughlin Collection of maps of California as an Island, the Dr. Oscar I. Norwich Collection of Maps of Africa and over 10,000 antiquarian maps collected over the years by Special Collections.
Location
The entrance is located off of the Rotunda area in the Bing Wing of Green Library (second floor). Proceed through the door to the stairwell and up to the fourth floor entrance.
If you are unable to join the tour, the DRMC is also accessible to visitors on Friday from 1-5 PM.
Are you a podcaster with a hard drive full of files? Have you considered how future historians, researchers, archivists, and audiophiles will find and listen to your work?
This hands-on workshop will demonstrate low-cost, easy-to-use storage and media asset management tools and techniques to ensure the longevity of your digital audio files. Since podcasting “best practices” have not yet been developed, this workshop also aims to publish a set of basic guidelines that can be re-purposed for future workshops, or be used by individuals or groups to archive a collection of audio files. Facilitators will guide participants through basic principles of audio file formats, metadata and checksum generation, and cloud vs. physical storage solutions. In addition to this, we will discuss advocacy techniques to promote and make unique content discoverable. The focus will be on low-cost, user-friendly tools. Participants will be encouraged to bring laptops, as well as their podcast audio files for a hands-on experience (with setup instructions provided prior to the workshop); however, laptops are not a requirement and participants may also follow along with the demos.
Although this workshop will be targeted towards podcasters and independent audio producers, it will be suitable, useful and fun for anyone working with a personal collection of digital audio files.
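For readers unfamiliar with checksum generation, the sketch below shows one way to do it with Python's standard library. It is an illustration only, not the workshop's own tooling. The manifest filename and the function names are assumptions.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large audio files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(folder: Path, manifest_name: str = "manifest-sha256.txt") -> Path:
    """Record one 'checksum  relative/path' line per file in the folder."""
    manifest = folder / manifest_name
    lines = []
    for path in sorted(folder.rglob("*")):
        if path.is_file() and path != manifest:
            lines.append(f"{sha256_of(path)}  {path.relative_to(folder).as_posix()}")
    manifest.write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return manifest


def verify_manifest(manifest: Path) -> List[str]:
    """Return relative paths that are missing or whose checksum has changed."""
    changed = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, rel = line.split("  ", 1)
        target = manifest.parent / rel
        if not target.is_file() or sha256_of(target) != expected:
            changed.append(rel)
    return changed
```

A typical use would be to call `write_manifest(Path("my-podcast"))` once after an episode is finalized, then run `verify_manifest` on each storage copy from time to time. An empty list means every file still matches its recorded checksum.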
About the Presenters
Mary Kidd currently works at New York Public Library’s Special Collections Division, and was an NDSR resident at New York Public Radio. She is also an active member of the XFR Collective. XFR is a non-profit organization that partners with artists, activists, individuals, and groups to lower the barriers to preserving at-risk audiovisual media.
Dana Gerber-Margie is an A/V and Digital Archivist for Recollection Wisconsin’s Listening to War: Uncovering Wisconsin’s Wartime Oral Histories grant project. She is also a founding member of the Bello Collective, a publication about podcasts, and routinely asks probing questions about producers’ digital preservation habits.
Anne Wootton is the co-founder of Pop Up Archive, a platform for making sound searchable. She holds a Master’s in Information Management and Systems from the University of California Berkeley. She is a winner of the 2012 Knight News Challenge: Data and has spoken internationally about audio search and discoverability, including SXSW Interactive, the Radcliffe Institute for Advanced Study at Harvard University, and the Aspen Institute.
Danielle Cordovez is the Audiovisual Librarian at the New York Public Library’s Rodgers and Hammerstein Archives of Recorded Sound. She is currently serving on the Board of Directors of the Association for Recorded Sound Collections (ARSC) as well as the Steering Committee of the Society of American Archivists (SAA).
The goal of this workshop is to engage the local community in thinking about personal digital archives, and to share ideas, techniques, and easy conservation practices I have developed as a professional photographer and photography archivist for families. It is important to start a conversation around quantity vs. quality of image production, to build an understanding of pixels, and to open a debate and dialogue around the pros and cons of a 100% digital photographic archive, and around the costs and priorities that need to be weighed to define personal conservation practices. The workshop will raise questions and introduce simple ideas and techniques for preserving family history longer.
Main topics to be covered:
About the Presenter
Júlia Pontés is a Brazilian/Argentinian photographer and personal photographic archive consultant currently living and working in NY. She holds master's degrees in Business from Sorbonne, Paris I and in Law and Economics - Public Policies from Universidad Torcuato di Tella, in Argentina. Photography didn’t become the main focus of her professional life until 2013, when, among other things, she inherited an important analog photographic archive that had been untouched for almost 20 years. That led her to pursue a strong photographic education at the International Center of Photography in New York, where she graduated from the general studies in photography program and later became an Exhibition Coordinator. She was also a teaching assistant there for the classes “What is an archive,” taught by Claudia Sohrens, and “Digital Seminar” and “Images and Ideas,” taught by Fred Ritchin, one of the greatest minds in contemporary digital image making. In addition, she was chosen as an Emerging Immigrant Artist by the New York Foundation for the Arts, where, through a competitive process, she was chosen to attend a free mentoring program for artists with social practices.
In 2016 an opportunity arose to focus a large part of her professional practice on archives. She began helping photographers and families think about their personal archives, both analog and digital. This work led to her current project, called “Saveit.Photo,” in which she tries to introduce the principles of personal archives in the digital era to the general public. By spreading simplified archival techniques she aims to contribute to the conservation of photographs, as they are, undoubtedly, an important element of collective memory and family histories.
Have you ever wondered what will happen to all the data you will create over your lifetime?
Consider your email, social media data, websites, photos, videos, documents, and all of the other files and traces you create and interact with on an ongoing basis. Do you know how you will preserve the data, understand it, and use it in the future?
Join friends and fellow coders to create innovative solutions to the ongoing challenges facing individuals (including digital humanists and cultural heritage researchers) in the digital age, including how we can best analyze, visualize, and use the immense variety of personal data that we are all creating.
How can we best make sense of the digital strands and data that comprise a 21st century life?
Join your friends! Eat our food! Win our prizes! Keep the present safe for the future! Read about requirements, judges, prizes, and more at our Devpost page.
Note: In order to participate in the Hackathon, you must register via Devpost as well as via Eventbrite by 4 PM on March 31.
Sponsors for this hackathon include Stanford University Libraries' Department of Special Collections and University Archives, Center for Interdisciplinary Digital Research (CIDR), and ePADD.