This article analyzes the procedures used to build the subject headings of the Trieste pole of the SBN national union catalog. The conversion of the Nuovo Soggettario thesaurus to the SKOS/RDF format made it easier to update the obsolete terms in the authority file. It also made it possible to add Nuovo Soggettario terms that had not previously been present in the Trieste pole, and to incorporate the references, hierarchical links, additions, and new terms, as well as corrections to terms already present in the Trieste pole.
The TSA Polo of Trieste and Friuli Venezia Giulia, established in 1993, is one of the two poles of the National Library Service (SBN) currently present in Friuli Venezia Giulia, and is coordinated by the University of Trieste.
The current results display in Biblioest also lets users filter results by descriptive facets and a semantic tag cloud.
To allow this, the Polo has for some time offered operators the option of recording optional data alongside the mandatory data, thus enriching the search fields associated with a document.
Subject is certainly the form of semantic access most used by the libraries of our Polo. In SOL, subjects are collected in a dedicated authority file, distinct from that of the terms; each subject therefore gives rise to one or more terms, which are generated automatically.
Over the years, the terms entered and the construction of the subject strings have followed different rules, producing a heterogeneous environment that is difficult for operators to use.
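As a rough illustration of the relationship between subjects and terms, the sketch below derives terms from subject strings and builds a reverse index. The " - " separator, the function names, and the sample strings are assumptions made for this example; the rules that SOL actually applies are not described here.

```python
from collections import defaultdict

def terms_from_subject(subject, separator=" - "):
    """Split a subject string into its distinct terms, keeping their order."""
    terms = []
    for part in subject.split(separator):
        term = part.strip()
        if term and term not in terms:
            terms.append(term)
    return terms

def build_term_index(subjects):
    """Map each term to the list of subject strings in which it occurs."""
    index = defaultdict(list)
    for subject in subjects:
        for term in terms_from_subject(subject):
            index[term].append(subject)
    return dict(index)

# Invented sample subject strings (not real catalog data).
subjects = ["Biblioteche - Trieste - Storia", "Biblioteche - Cataloghi"]
term_index = build_term_index(subjects)
```

Keeping the two structures separate, as the text describes, means a term can be corrected once and the correction applies to every subject that uses it.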
The operation, carried out manually, had produced minimal results over the years: in February 2017, only 6,400 of the 82,000 accepted terms had been remediated, and only 2,800 of the 7,800 reference terms. According to the calculations of the Polo center, at least 50,000 terms still had to be checked and remediated. The problem of the links between terms also remained: to keep the work, already titanic in itself, as agile as possible, it had been decided to work only on the accepted forms of the terms and on any references (non-preferred forms), temporarily setting aside all the hierarchical and associative relationships (BT, NT, RT) that are present in the Tesauro di Firenze.
The situation of the Polo's subject terms in February 2017 was therefore the following:
This situation, besides being confusing for the back-office operator, brought no benefit to users, since in the OPACs they could not search and navigate within the broadest possible network of relationships between terms.
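To put the manual progress in proportion, the figures quoted above can be turned into percentages. This is only a small worked calculation on the numbers given in the text; the variable and function names are chosen for the example.

```python
def share(done, total):
    """Return the percentage of `total` represented by `done`, to one decimal."""
    return round(100 * done / total, 1)

# Figures quoted in the text for February 2017.
accepted_total, accepted_done = 82_000, 6_400
reference_total, reference_done = 7_800, 2_800

accepted_share = share(accepted_done, accepted_total)
reference_share = share(reference_done, reference_total)

print(f"Accepted terms remediated: {accepted_share}%")
print(f"Reference terms remediated: {reference_share}%")
```

On these figures, fewer than one accepted term in ten had been remediated by hand.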
Starting in March 2017, an automatic remediation of the subject terms was launched, made possible by a feature introduced in release 3.1 of SOL (currently in use in the Polo TSA). This feature made it possible to transfer the terms of the Nuovo Soggettario of Florence automatically, by importing its metadata, downloadable in SKOS/RDF format, directly into the catalog.
The operation allowed us to transform the subject archive of the Polo TSA into the Polo TSA thesaurus, including all the hierarchies provided for in the Nuovo Soggettario of Florence and equipping the terms with all the elements it provides for. On the one hand, therefore, it was possible to clean up the archive of subject terms of the Polo TSA, making it uniform, coherent, and correct, and easy for operators to consult; on the other hand, a richer and more satisfying search environment has been made available to end users.
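To give an idea of what a SKOS/RDF record carries, the sketch below reads preferred labels, non-preferred labels, and BT/NT/RT links from a small RDF/XML fragment using only the Python standard library. The sample record, its example.org URIs, and the function name are invented for illustration; this is not the import procedure implemented in SOL, and real thesaurus exports may use other SKOS properties or RDF serializations.

```python
import xml.etree.ElementTree as ET

# Namespace URIs defined by the RDF and SKOS specifications.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Invented sample record: URIs and labels are placeholders, not real data.
SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/thes/1">
    <skos:prefLabel xml:lang="it">Biblioteche</skos:prefLabel>
    <skos:altLabel xml:lang="it">Librerie pubbliche</skos:altLabel>
    <skos:broader rdf:resource="http://example.org/thes/2"/>
    <skos:related rdf:resource="http://example.org/thes/3"/>
  </skos:Concept>
</rdf:RDF>
"""

def parse_skos(xml_text):
    """Return {concept URI: labels and links} for each skos:Concept element."""
    root = ET.fromstring(xml_text)
    concepts = {}
    for concept in root.findall("skos:Concept", NS):
        pref = concept.find("skos:prefLabel", NS)
        record = {
            # Accepted (preferred) form of the term, if present.
            "pref": pref.text if pref is not None else None,
            # References (non-preferred forms).
            "alt": [e.text for e in concept.findall("skos:altLabel", NS)],
        }
        # Hierarchical (BT, NT) and associative (RT) links, as target URIs.
        for relation in ("broader", "narrower", "related"):
            record[relation] = [
                e.get(RDF_RESOURCE)
                for e in concept.findall("skos:" + relation, NS)
            ]
        concepts[concept.get(RDF_ABOUT)] = record
    return concepts

concepts = parse_skos(SAMPLE)
```

Each concept carries its own links, which is why an import of this kind can bring in the hierarchies together with the terms.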
Although the data in the two environments do not coincide (since the data in the real environment are updated every day), their internal features are the same. SOL Trial therefore contains an authority file of subject terms with the same structure as the one in the real SOL, and with the same problems.
This test authority file was used to verify the feasibility of the project. All the operations fundamental to its success were carried out within it: the start and end statistics, the automatic import, and a partial remediation. These operations made it possible to verify the number of terms to be imported and the time needed to carry out the work.
It was also possible to test the usefulness of the work from the user's point of view, thanks to a test OPAC (linked to the test SOL), which made it possible to view the data entered and modified.
The operations (in the test environment first, and then in the real environment) were carried out in the following order:
3. Once the import of each category was completed, the numbers of imported, problematic, and blocked records, which the program makes available in a list, were checked.
4. The blocked records (which the procedure could not import because of some discrepancy in the incoming data) underwent an initial manual remediation.
5. Sample tests were carried out, in both the back office and the front office, to verify that the hierarchical structure of the terms created by the import had been generated correctly.
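The check of the import results described in the steps above can be sketched as a simple tally. The report layout (pairs of term and outcome), the outcome names, and the sample rows are assumptions made for this example; the actual list produced by the program is not reproduced in the text.

```python
from collections import Counter

# Hypothetical report rows: (term, outcome). Invented for illustration.
REPORT = [
    ("Biblioteche", "imported"),
    ("Archivi", "imported"),
    ("Musei", "problematic"),
    ("Cataloghi", "blocked"),
]

def summarize(report):
    """Count records per outcome and list the terms needing manual work."""
    counts = Counter(outcome for _, outcome in report)
    blocked = [term for term, outcome in report if outcome == "blocked"]
    return counts, blocked

counts, blocked = summarize(REPORT)
for outcome in ("imported", "problematic", "blocked"):
    print(f"{outcome}: {counts[outcome]}")
print("To remediate manually:", blocked)
```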
Thanks to this import, it was also possible to determine the number of terms used in G but not present in the Nuovo Soggettario: about 68,000. The vast majority of these are proper names, and only a portion will undergo manual remediation.
The library community that makes up the Polo TSA had been waiting patiently, for many years, for a satisfactory cleanup of the thesaurus term archive. Until now, the central office of the Polo had carried out only targeted remediation, based on specific requests from operators; this made it possible only to resolve the most problematic situations or to normalize the most commonly used terms. The completion of this work has therefore certainly benefited the work of many librarians.
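Identifying the terms that exist locally but not in the thesaurus amounts to a set difference. The sketch below assumes that labels are compared after trimming whitespace and ignoring case; the normalization rule, the function names, and the sample labels are assumptions for this example, not the matching criteria used by the import procedure.

```python
def normalize(label):
    """Collapse whitespace and ignore case so that trivial variants match."""
    return " ".join(label.split()).casefold()

def local_only(local_terms, thesaurus_labels):
    """Return the local terms that have no match among the thesaurus labels."""
    known = {normalize(label) for label in thesaurus_labels}
    return sorted(t for t in local_terms if normalize(t) not in known)

# Invented sample data: a proper name is typically absent from the thesaurus.
local_terms = ["Biblioteche", "Svevo, Italo", "  archivi "]
thesaurus_labels = ["biblioteche", "Archivi"]
unmatched = local_only(local_terms, thesaurus_labels)
```

A stricter or looser normalization would change the count, so the comparison rule needs to be fixed before any totals are read off.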
But the realization of this project has also made it possible to offer users of the catalog an enhancement of the search methods.
Users can now carry out searches that respond better to their needs, navigating from a term to the terms linked to it; it will also be possible to access reading suggestions related to the topic of the document sought, thanks to the "Maybe you might be interested" module integrated in Biblioest.
This optional feature, planned in the Biblioest evolution package for 2017, will soon be activated; after carrying out a search, the user will be able to find reading suggestions based on subject terms similar or linked to those of the source document.
Once the first, onerous phase of importing the Nuovo Soggettario terms is completed, it will be necessary to continue importing the new terms that are made available, with six-monthly updates, on the Nuovo Soggettario thesaurus website.
Furthermore, procedures must be set out for the manual remediation of the terms that fall outside the Nuovo Soggettario (proper names of persons and bodies, geographical locations, works …), so that duplicate or incorrect terms and subject strings can be corrected by hand.
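One simple way to derive suggestions from linked subject terms is sketched below: the terms of the source document are expanded with their linked terms, and other documents are ranked by how many of those terms they share. The data structures, names, and ranking rule are assumptions for illustration; the text does not describe how the Biblioest module works internally.

```python
def suggest(source_title, documents, links):
    """Rank other documents by overlap with the source's expanded term set."""
    source_terms = documents[source_title]
    # Expand the source terms with the terms linked to them in the thesaurus.
    expanded = set(source_terms)
    for term in source_terms:
        expanded |= links.get(term, set())
    scored = []
    for title, terms in documents.items():
        if title == source_title:
            continue
        overlap = len(terms & expanded)
        if overlap:
            scored.append((overlap, title))
    # Highest overlap first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored]

# Invented sample data: thesaurus links and subject terms per document.
LINKS = {"Biblioteche": {"Archivi", "Cataloghi"}}
DOCS = {
    "A": {"Biblioteche"},
    "B": {"Archivi", "Cataloghi"},
    "C": {"Musei"},
    "D": {"Biblioteche"},
}
suggestions = suggest("A", DOCS, LINKS)
```

Document B shares no term with A directly and is reached only through the links, which is the kind of result a flat list of terms could not produce.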
Therefore, although the work certainly cannot be considered completed, this first phase has already brought benefits to both the front office and back office environments.