

Aug 15, 2012

Solving Problems with Authority and Sharing: Developments and Prospects #saa12

The Social Networks and Archival Context (SNAC) project is an ambitious one that seeks to locate records of historical importance across repositories and make them available to patrons on a massive scale. Our panel updated us on its fascinating progress. Look at what we, as records and information management professionals, can do.

The Society of American Archivists (SAA) 2012 annual meeting, “Beyond Borders,” concluded Saturday, August 11, 2012, in San Diego.

Tammy Peters of the Smithsonian Institution introduced her panel:

  • Ray R. Larson (University of California, Berkeley)
  • Daniel Pitti (University of Virginia, Institute for Advanced Technology in the Humanities)
  • Jerry Simmons (National Archives and Records Administration).

The Social Networks and Archival Context Project: Status Report

Ray R. Larson

Mr. Larson delivered an update on SNAC. Officially, the goals of the project are to further the transformation of archival description and to separate the description of records from the description of the people documented in them. Translation: the project is meant to make records of historical importance available and

  • enhance access to archives resources, through all cultural heritage resources; and
  • enhance understanding of those resources.

We’re talking big data: a sample of 150,000 finding aids encoded in EAD (Encoded Archival Description), contributed from around the world by national libraries and others, including:

  • Library of Congress
  • National Archives and Records Administration
  • Smithsonian Institution
  • British Library
  • Archives nationales (France)
  • Bibliothèque nationale de France
  • OCLC WorldCat and VIAF 
  • Getty Vocabulary Program.

Institutions like the Getty Vocabulary Program have contributed the Union List of Artist Names (make that: 293,000 personal and corporate names).

The problem: a proliferation of the forms of names (for example, different people with the same names). EAD records are full of family names, and within its structure each record notes the creator of the archive (typically a complete biography is provided). This biography is extracted into an Encoded Archival Context for Corporate Bodies, Persons, and Families (EAC-CPF) record.
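To make the extraction step concrete, here is a minimal sketch in Python. It is not SNAC's actual code. It assumes a DTD-style EAD file with no XML namespace, where the creator sits in an `<origination>` element and the biography in `<bioghist>`. The sample fragment, the `extract_creator` function and the output dictionary keys are all invented for illustration, and a real EAC-CPF record is XML with far more structure than this.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed EAD fragment (not taken from a real finding aid).
SAMPLE_EAD = """
<ead>
  <archdesc level="collection">
    <did>
      <origination label="Creator">
        <persname>Doe, Jane, 1870-1940</persname>
      </origination>
    </did>
    <bioghist>
      <p>Jane Doe was a botanist and collector.</p>
    </bioghist>
  </archdesc>
</ead>
"""

# EAD name elements mapped to the three entity types EAC-CPF describes.
NAME_TAGS = {"persname": "person", "corpname": "corporateBody", "famname": "family"}


def extract_creator(ead_xml):
    """Pull the creator's name, entity type and biography out of an EAD string.

    Returns a plain dict standing in for an EAC-CPF record (an assumption made
    for this sketch), or None when no creator name is found.
    """
    root = ET.fromstring(ead_xml)
    for tag, entity_type in NAME_TAGS.items():
        name_el = root.find(f".//origination/{tag}")
        if name_el is not None:
            break
    else:
        return None

    # Join the text of every biography paragraph into one string.
    paragraphs = root.findall(".//bioghist/p")
    biography = " ".join("".join(p.itertext()).strip() for p in paragraphs)

    return {
        "entity_type": entity_type,
        "name": "".join(name_el.itertext()).strip(),
        "biography": biography,
    }


if __name__ == "__main__":
    print(extract_creator(SAMPLE_EAD))
```

Schema-based EAD files declare a namespace, so the element paths above would need a namespace prefix before they matched anything in such a file.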

“We’re given names, sometimes multiple names. Identical names mean a complete Library of Congress record with attributes is available. If it’s an exact match, it’s marked. But marking doesn’t work for everything. Abbreviations are troublesome (think transliteration of non-Roman characters). We take names where we didn’t get an exact match, then test against library authority files. Do we find an exact match? We flag it as a potential merge. Is nothing matched by this stage? We create overlapping segments of three characters. Finally, we take all flagged as potential matches, do a find, make sure these are the ones we want. With the authoritative form of the name, we combine all EAC-CPF records. To give you an idea of volume, a recent test merged 93,033 person names from 114,639 person records,” said Larson.
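The cascade Larson describes (exact match, then authority-file match, then overlapping three-character segments) can be sketched roughly as follows in Python. This is an illustration under assumptions, not the project's implementation: the talk did not say how the three-character segments are compared, so the sketch uses Jaccard similarity over character trigrams, and the function names, status labels and 0.5 threshold are all hypothetical.

```python
def normalize(name):
    """Lowercase, drop commas and periods, and collapse whitespace."""
    cleaned = name.lower().replace(",", " ").replace(".", " ")
    return " ".join(cleaned.split())


def trigrams(name):
    """Return the set of overlapping three-character segments of a name."""
    s = normalize(name)
    return {s[i:i + 3] for i in range(len(s) - 2)}


def similarity(a, b):
    """Jaccard similarity of two names' trigram sets (an assumed measure)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def classify(name, seen_names, authority_names, threshold=0.5):
    """Run one name through the three stages and return (status, candidate)."""
    # Stage 1: identical to a name already extracted, so mark it.
    if name in seen_names:
        return ("exact", name)
    # Stage 2: exact hit in the authority file, so flag a potential merge.
    if name in authority_names:
        return ("potential_merge", name)
    # Stage 3: fall back to trigram overlap against the authority file.
    best = max(authority_names, key=lambda c: similarity(name, c), default=None)
    if best is not None and similarity(name, best) >= threshold:
        return ("potential_merge", best)
    return ("unmatched", None)


if __name__ == "__main__":
    authority = {"John Smith", "Twain, Mark, 1835-1910"}
    for candidate in ["John Smith", "Jon Smith", "Zhang Wei"]:
        print(candidate, classify(candidate, set(), authority))
```

Everything flagged as a potential merge would still go through the final check Larson mentions before any records are combined. The threshold only decides what lands in that review pile.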

In other words, the names are extracted from EAC-CPF and from existing EAD. If the EAC-CPF records match against one another and against existing authority records (for example, VIAF), then prototypes of historical resources and accessibility are created.

 


 
 

Source : cmswire[dot]com