Reengineering Thesauri for New Applications: The AGROVOC Example

Dagobert Soergel*, Boris Lauser, Anita Liang, Frehiwot Fisseha, Johannes Keizer and Stephen Katz

*College of Library and Information Services, University of Maryland, College Park


Food and Agriculture Organization (FAO) of the United Nations,
Library & Documentation Systems Division,
00100 Rome, Italy
Email: {frehiwot.fisseha; boris.lauser; anita.liang; johannes.keizer; stephen.katz}

Abstract

Existing classification schemes and thesauri lack well-defined semantics and structural consistency. Empowering end users to search ever-larger collections with performance far exceeding that of plain free-text searching (as used in many Web search engines), and developing systems that not only find but also process information for action, require far more powerful and complex knowledge organization systems (KOSs). This paper presents a conceptual structure and a transition procedure to support the shift from a traditional KOS to a full-fledged, semantically rich KOS. The proposed structure also complies with interoperability approaches used in the Web environment, such as RDFS and XML. AGROVOC, a traditional thesaurus developed and maintained by the Food and Agriculture Organization (FAO) of the United Nations, serves as a case study for exploring the reengineering of a traditional thesaurus into a full-fledged ontology. We begin developing an inventory of specific relationship types with well-defined semantics for the agricultural domain and explore the rules-as-you-go approach to streamlining the reengineering process.

Table of Contents


1 From thesauri to rich ontologies

1.1 The problem
1.2 The relationship of traditional KOS to ontologies
1.3 Potential benefits of future generation KOSs
1.4 The process of reengineering: The rules-as-you-go approach

2 AGROVOC: A multilingual agricultural thesaurus

2.1 Background
2.2 Applications and related terminologies
2.3 Conceptual structure of AGROVOC

2.3.1 Equivalence relationships
2.3.2 Hierarchical relationships
2.3.3 Associative relationships
2.3.4 Scope notes
2.3.5 Top level structure

2.4 Semantic problems of AGROVOC

2.4.1 Ambiguous descriptor to non-descriptor relationship
2.4.2 Ambiguous hierarchical definitions
2.4.3 Ambiguous associative relationships

2.5 The need for reengineering AGROVOC into an ontology

3 Conceptual model: Combining thesauri and ontologies

3.1 The basic model
3.2 Model extensions
3.3 Limitations
3.4 Implementation
3.5 Related approaches

4 The AGROVOC case: Exploring conceptual relationships in the agricultural domain

4.1 The logical generic relationship
4.2 The part-whole family of relationships

4.2.1 X <containsSubstance> Y/Y <substanceContainedIn> X and X <hasIngredient> Y/Y <ingredientOf> X
4.2.2 X <yieldsPortion> Y/Y <portionOf> X
4.2.3 X <spatiallyIncludes> Y/Y <spatiallyIncludedIn> X
4.2.4 X <hasComponent> Y/Y <componentOf> X
4.2.5 X <includesSubprocess> Y/Y <subprocessOf> X
4.2.6 X <hasMember> Y/Y <memberOf> X

4.3 Further relationship examples (some from Schmitz-Esser 1999)

4.3.1 X <causes> Y/Y <causedBy> X
4.3.2 X <instrumentFor> Y/Y <performedByInstrument> X
4.3.3 X <processFor> Y/Y <usesProcess> X
4.3.4 X <beneficialFor> Y/Y <benefitsFrom> X
4.3.5 X <treatmentFor> Y/Y <treatedWith> X
4.3.6 X <harmfulFor> Y/Y <harmedBy> X
4.3.7 X <growsIn> Y/Y <growthEnvironmentFor> X
4.3.8 X <hasProperty> Y/Y <propertyOf> X
4.3.9 X <similarTo> Y/Y <similarTo> X
4.3.10 X <oppositeTo> Y/Y <oppositeTo> X
4.3.11 Concluding comment

5 Exploring the rules-as-you-go approach for the case of AGROVOC

6 Implications and further work



Appendix: Towards an XML/RDF specification for KOS