Reindexing Project

This page lists changes we might like to make if we undertake a reindexing project.

Please email Lloyd with any additional ideas for reindexing to add to this page.

1) Remove 074 from ISN index

  • Many government documents share the same 074, so it is not a unique identifier. This produces many unhelpful hits in the Headings Report duplicate checker. Removing the 074 from the ISN index would unclutter Headings Reports, making them easier to use.

2) Move 019 from ISN index to BIB UTIL index

  • The 019 field holds old OCLC numbers. Currently, when we download a new version of an OCLC record with a new 001 field, the system does not recognize that we already have an old version of the record whose number appears in the new record's 019. This both creates a duplicate and leaves an outdated record in the system. Putting the 019 in the BIB UTIL index would prevent that by making the new record overlay the old one.
  • Currently the 019 is indexed in such a way that only the first $a gets into the index. The 019 often contains several $a subfields, and all of them should be indexed.
  • Putting the 019 into the BIB UTIL index will also make duplicates between the 001 and 019 appear in Headings Reports.
  • A danger of this is that it becomes possible to overlay a new version of a record with an old one. That would only happen if someone loaded an old version of a record, perhaps one supplied by a vendor. A loader for vendors that supply records not taken directly from OCLC could be set to ignore matches on the 019, which would prevent the most likely source of this problem.
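
The matching behavior described in the bullets above can be sketched roughly as follows. This is only an illustration in Python, not how Sierra builds or queries its indexes: the record shape (a dict with "id", "001" and "019a" keys), the function names and the match_on_019 switch are all hypothetical.

```python
from collections import defaultdict

def build_bib_util_index(records):
    """Map each control number to (record id, source tag) pairs.

    Hypothetical record shape: {"id": "b1", "001": "222", "019a": ["111"]}.
    Every 019 $a is indexed, not only the first one.
    """
    index = defaultdict(set)
    for rec in records:
        index[rec["001"]].add((rec["id"], "001"))
        for old_number in rec.get("019a", []):
            index[old_number].add((rec["id"], "019"))
    return index

def find_overlay_targets(incoming, index, match_on_019=True):
    """Return ids of existing records the incoming record would overlay.

    With match_on_019=False (the imagined vendor-loader setting), any match
    that relies on a 019 on either side is ignored.
    """
    keys = [incoming["001"]]
    if match_on_019:
        keys.extend(incoming.get("019a", []))
    targets = set()
    for key in keys:
        for rec_id, source_tag in index.get(key, ()):
            if source_tag == "019" and not match_on_019:
                continue
            targets.add(rec_id)
    return targets
```

In this sketch, a new OCLC record with 001 "222" and 019 $a "111" finds the old record whose 001 is "111". An old vendor copy with 001 "111" finds the newer record only when 019 matching is switched on, which is the overlay risk noted above.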

3) Add the 035 to the BIB UTIL index

  • The 035 field is where control numbers from other systems are sometimes stored. SkyRiver now includes some 035 fields with OCLC numbers in their database. Theoretically, indexing the 035 in the BIB UTIL index would allow records loaded from SkyRiver to match their OCLC doppelganger in our system, thus preventing a duplicate.
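
For a 035 to match an OCLC 001, both values would need to reduce to the same number. The sketch below shows one way that comparison could work, assuming the 035 $a carries the usual "(OCoLC)" prefix and that the 001 may carry an "ocm", "ocn" or "on" prefix with leading zeros. The function names are hypothetical, and how Sierra actually normalizes these values would need to be confirmed.

```python
import re

# 035 $a such as "(OCoLC)ocm00012345" or "(OCoLC)987654321"
_OCLC_035 = re.compile(r"^\(OCoLC\)\s*(?:ocm|ocn|on)?0*(\d+)$", re.IGNORECASE)
# 001 such as "ocm00012345", "ocn123456789" or "on1234567890"
_OCLC_001 = re.compile(r"^(?:ocm|ocn|on)?0*(\d+)$", re.IGNORECASE)

def oclc_from_035(value):
    """Return the bare OCLC number from a 035 $a, or None if it is not an OCLC number."""
    match = _OCLC_035.match(value.strip())
    return match.group(1) if match else None

def oclc_from_001(value):
    """Return the bare OCLC number from a 001, or None if it does not look like one."""
    match = _OCLC_001.match(value.strip())
    return match.group(1) if match else None
```

A 035 from another system, for example one beginning with "(Sirsi)", returns None here and so would never match an OCLC record.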

4) Move 020$z into the keyword index instead of the ISN index

  • Currently, records carry ISBNs for many different versions of a work (paper, audio, ebook, etc.), so a record can be overlaid by a match on the ISBN of the wrong format. People want to keep all the ISBNs to improve searching despite the risk of overlay errors. However, ISBNs for other formats are commonly put into $z of the 020 field. What if we put 020$z into the keyword index instead of the ISN index? That way it would not be used for overlaying. Pika does not depend on Sierra indexes, so it could be set to treat the 020$z like any other ISBN for searching in the public catalog. The change would only limit ISBN searching in the Classic catalog and in Sierra, and all the ISBNs could still be searched as keywords there.
  • Pika was previously set up specifically not to index the 020$z because it created problems. We need to decide which set of problems is worse.
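
As a rough illustration of the proposed split, the sketch below sends 020 $a to a list standing in for the ISN index (used for overlay matching) and 020 $z to a list standing in for the keyword index (search only). The function names and the (code, value) subfield representation are hypothetical, and the sample numbers in the usage note are for illustration only.

```python
def isbn_only(value):
    """Keep the number and drop any trailing qualifier such as "(ebook)"."""
    parts = value.split()
    return parts[0].replace("-", "") if parts else ""

def route_020_subfields(subfields):
    """Split 020 subfields into ISN-index and keyword-index entries.

    subfields is a list of (code, value) pairs, e.g. [("a", "9780131103627 (pbk.)")].
    """
    isn_index, keyword_index = [], []
    for code, value in subfields:
        number = isbn_only(value)
        if not number:
            continue
        if code == "a":
            isn_index.append(number)      # still available for overlay matching
        elif code == "z":
            keyword_index.append(number)  # searchable, but never used to overlay
    return isn_index, keyword_index
```

For example, route_020_subfields([("a", "9780131103627 (pbk.)"), ("z", "978-0-13-110370-2 (ebook)")]) puts the first number in the ISN list and the second in the keyword list.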

5) Reindex patron data

  • Add separate indexes for city, state and ZIP code.
  • Possibly add indexes for telephone number and email address as well.

6) Create separate index for 995 field

  • This is the field where we note the load profile used to load a record and the month and year it was loaded. Putting this in an index would make this information more useful.

Last updated on 06/08/2017