FAO/COAIM-2/Inf.7


Second Consultation on Agricultural Information Management

Information Document

Rome, Italy, 23 - 25 September, 2002

Towards better Semantic Standards for Information Management: AGROVOC and the Agricultural Ontology Service (AOS)

Table of Contents


Background
Report on the maintenance and further development of AGROVOC
Brief overview of current and next generation semantic systems
Some limitations and drawbacks with the current vocabulary systems and tools
What benefits do we expect from the next generation knowledge organisation systems?
The way forward
Progress so far with the AOS
The AOS Prototype
The Fishery Ontology
The Food Safety Ontology
The Crop-Pest Ontology
The Anti-Microbial Ontology
Towards a Consortium for semantic standards
Glossary
References


Background

1. The First Consultation on Agricultural Information Management (COAIM) recommended that FAO assume a leading role as a clearing-house for internationally agreed information management standards in the agricultural sector. The Consultation further recommended that FAO co-ordinate and facilitate the promotion and adoption of such standards with Member Nations and in collaboration with other stakeholders in the agricultural sector.

2. Following these recommendations, FAO has been committed to improving existing information management methodologies, developing strategies and guidelines for agricultural information management and closely following new developments in the area.

3. A key area for the furtherance of standards in web information management has been the development of semantics for knowledge description and organisation.

4. This document reports on FAO’s current activities in this area and puts forward suggestions and recommendations for the further development and strengthening of knowledge organisation systems, which the Consultation may wish to consider.

Report on the maintenance and further development of AGROVOC

5. AGROVOC is one of the semantic tools used in the agricultural sector, and its usage continues to develop and expand. AGROVOC has been in use for decades for document indexing and for facilitating information retrieval. Its further development in terms of subject coverage, language coverage and the addition of new concepts is one of FAO’s priorities in its efforts to deliver better technology for agricultural information management. It is with this responsibility in mind that FAO strives to improve AGROVOC continuously. This continuous evolution brings a number of enhancements, including the addition of new concepts, the mapping of concepts to subject categories, and expanded language coverage.

6. Recent enhancements of AGROVOC include Arabic and Chinese translations and many additional concepts in the vocabulary. The Chinese and Arabic translations were done in close collaboration with the Chinese Academy of Agricultural Sciences (CAAS) and the International Centre for Agricultural Research in the Dry Areas, Library and Information Services (ICARDA) in Syria. Proposals for additional AGROVOC concepts were received from numerous institutions: the technical departments of FAO, Le Centre international de hautes études agronomiques méditerranéennes, Institut agronomique méditerranéen de Montpellier (CIHEAM, France), the International Food Policy Research Institute (IFPRI), the Institut National de la Recherche Agronomique (INRA, France), Service de documentation, Swiss Federal Research Station for Agricultural Economics and Engineering (Switzerland) and the International Service for National Agricultural Research (ISNAR). The proposed terms were published for comments before being added to the AGROVOC database.1

Brief overview of current and next generation semantic systems

7. Agricultural thesauri like AGROVOC play a substantial role in helping information managers and information users with document indexing and information retrieval tasks. With the advancement of the web, information management requires still more versatile and flexible tools that make it easier to organise and find information on the web systematically and logically.

8. The need for better semantic tools can be analysed from two problem perspectives: the ‘Information Organisation’ problem faced by Information Managers and the ‘Information Retrieval’ problem faced by Information Users. At present most information management tasks are performed by humans, with little automation. Cataloguing and indexing are good examples: they are labour-intensive processes that require highly skilled specialists. Tools for automating or semi-automating the coding, cataloguing, indexing and classification of documents in a dynamic way are much in demand. Developing highly specialised vocabularies and representing them in a machine-understandable way is the main prerequisite for machine-assisted text classification and document indexing.
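As a rough illustration of machine-assisted indexing, the following minimal Python sketch suggests descriptors for a document by matching its text against a small controlled vocabulary. The vocabulary entries, the function name `suggest_descriptors` and the sample sentence are hypothetical and are not taken from AGROVOC; a real system would need linguistic processing well beyond this simple term matching.

```python
import re

# Hypothetical controlled vocabulary: preferred descriptor -> entry terms
# (the descriptor itself plus non-preferred synonyms). Not real AGROVOC data.
VOCABULARY = {
    "maize": ["maize", "corn"],
    "irrigation": ["irrigation", "irrigated"],
    "soil fertility": ["soil fertility"],
}


def suggest_descriptors(text):
    """Return the sorted descriptors whose entry terms occur in the text."""
    # Normalise to lower-case words separated by single spaces, padded so
    # that only whole words (or whole phrases) can match.
    words = re.findall(r"[a-z]+", text.lower())
    padded = " " + " ".join(words) + " "
    found = set()
    for descriptor, entry_terms in VOCABULARY.items():
        if any(f" {term} " in padded for term in entry_terms):
            found.add(descriptor)
    return sorted(found)


sample = "Corn yields under irrigated conditions in the dry areas."
print(suggest_descriptors(sample))
```

In this sketch the synonym "corn" leads to the preferred descriptor "maize", which is the kind of vocabulary control a human indexer applies by hand.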

9. From the information retrieval perspective, it is important to develop tools that allow search engines to perform semantically rich, context-sensitive, concept-based searches instead of mere keyword matching.
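The contrast between keyword matching and concept-based search can be sketched in a few lines of Python. The mini-thesaurus, the documents and the function names below are hypothetical and purely illustrative; the substring matching used here is a deliberate simplification.

```python
# Hypothetical mini-thesaurus: each concept lists its synonyms and its
# narrower concepts. Illustrative only, not real AGROVOC data.
THESAURUS = {
    "cereals": {"synonyms": ["grain crops"], "narrower": ["maize", "rice"]},
    "maize": {"synonyms": ["corn"], "narrower": []},
    "rice": {"synonyms": ["paddy"], "narrower": []},
}

# Hypothetical document collection: identifier -> text.
DOCUMENTS = {
    "doc1": "storage losses in corn after harvest",
    "doc2": "paddy cultivation and water use",
    "doc3": "cereals trade statistics",
    "doc4": "goat milk production",
}


def expand(concept):
    """Collect the concept, its synonyms and everything narrower than it."""
    entry = THESAURUS.get(concept, {"synonyms": [], "narrower": []})
    terms = {concept, *entry["synonyms"]}
    for child in entry["narrower"]:
        terms |= expand(child)
    return terms


def keyword_search(query):
    """Plain keyword matching: the query string must occur in the text."""
    return sorted(doc for doc, text in DOCUMENTS.items() if query in text)


def concept_search(query):
    """Concept-based search: match any term the concept expands to."""
    terms = expand(query)
    return sorted(doc for doc, text in DOCUMENTS.items()
                  if any(term in text for term in terms))


print(keyword_search("cereals"))
print(concept_search("cereals"))
```

Under these assumptions the keyword search finds only the document that literally contains "cereals", while the concept-based search also retrieves the documents about corn and paddy.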

10. Solving this twofold problem from both the information organisation and the information retrieval standpoint requires well-developed approaches, methodologies and tools. More importantly, the standard vocabularies used for information management purposes should be multilingual, domain-specific and cross-disciplinary at the same time. The standards should also be developed in a non-proprietary, application-independent and machine-processable format to ensure interoperability among different systems.

11. Recently the terms ‘knowledge organisation systems’ (KOS) and ‘ontologies’ have come into wide use to refer to methodologies and approaches that improve information management on the web. These methodologies and approaches promise to provide semantically rich vocabularies and metadata for describing and discovering information resources. The difference between conventional KOS, such as thesauri and categorisation schemes, and the emerging knowledge systems, such as ontologies, lies in the richer relationships contained in the latter. Ontologies promise to provide the capacity to explicitly specify the semantics needed for information categorisation, integration and retrieval. The semantics are developed by unambiguously defining a concept, by associating descriptive metadata with the concept being described, and by specifying elaborate relationships among the concepts of a specific domain.
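The difference in the richness of relationships can be illustrated with a minimal Python sketch. The thesaurus record uses the conventional generic relation types, while the ontology-style statements name each relationship explicitly. All terms, relationship names (`is_a`, `is_disease_of`, `is_produced_from`) and the helper `subjects` are hypothetical examples, not part of AGROVOC or the AOS.

```python
# Thesaurus-style record: only generic relation types are available
# (BT = broader term, NT = narrower term, RT = related term).
thesaurus_entry = {
    "term": "citrus",
    "BT": ["fruit crops"],
    "NT": ["oranges", "lemons"],
    "RT": ["citrus canker", "orange juice"],
}

# Ontology-style statements: (subject, relationship, object) triples in
# which the kind of relationship is stated explicitly. Hypothetical names.
ontology = [
    ("oranges", "is_a", "citrus"),
    ("lemons", "is_a", "citrus"),
    ("citrus canker", "is_disease_of", "citrus"),
    ("orange juice", "is_produced_from", "oranges"),
]


def subjects(relationship, obj):
    """Return all subjects linked to obj by the named relationship."""
    return sorted(s for s, r, o in ontology if r == relationship and o == obj)


# The thesaurus only says that these terms are "related" to citrus.
print(thesaurus_entry["RT"])
# The ontology can answer a more specific question: diseases of citrus.
print(subjects("is_disease_of", "citrus"))
```

In the thesaurus record a disease and a processed product are both simply "related terms"; the explicit relationship names are what allow a program to tell them apart.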

Some limitations and drawbacks with the current vocabulary systems and tools

12. There are various vocabulary systems/tools that are widely used in the agriculture sector. However, these tools suffer from some drawbacks that should be solved. The following summarises some of the limitations.

What benefits do we expect from the next generation knowledge organisation systems?

13. Future information management technologies and methodologies should provide better ways of managing information resources. The following potential benefits are expected from such systems.

The way forward

14. It cannot be overemphasised that only agreed semantic standards guarantee optimum information sharing, knowledge discovery and re-use. Developing such standards requires the agreement and participation of agricultural information producers. The standards should be developed in co-operative, distributed and co-ordinated environments to promote active participation and ensure the adoption of widely accepted standard vocabularies. This calls for close collaboration among the various interested groups. Cognisant of this fact, FAO has been working to suggest a way forward to realise this vision. The Agricultural Ontology Service (AOS) initiative reflects FAO’s efforts towards this goal. The AOS is a co-ordinated effort, and FAO’s approach, to develop a multilingual and multidisciplinary domain vocabulary system for promoting food security and sustainable development.

Progress so far with the AOS

15. After closely following international trends and developments in information management, the AGRIS/CARIS and Documentation Unit of FAO prepared the AOS concept paper and initiated discussions. A series of AOS workshops2 was conducted in 2001 and 2002 to gauge interest and raise awareness of the AOS. The first Agricultural Ontology Workshop was held at FAO from November 14 to 15, 2001. The goal of the workshop was to discuss the AOS concept paper and to explore the possibility of launching a project in collaboration with ontology experts and agricultural information producers.

16. This workshop recommended proceeding with the AOS initiative and proposed the formation of the AOS Launch Group, with the mandate to follow up activities and prepare a work plan and project proposal document for future action. The Launch Group comprises content providers (FAO, CABI Publishing), information management solution providers in agriculture (AgroTechnological Research Institute (ATO), The Netherlands, and Institute of Food and Agricultural Sciences, University of Florida), and ontology experts from various universities and research organisations (Research Centre for Information Technologies at the University of Karlsruhe, Germany, and Consiglio Nazionale delle Ricerche (CNR), Italy).

17. The second AOS workshop was held at CABI, Oxford, England from January 24 to 25, 2002 as a follow-up to the first AOS workshop. The theme of the workshop was 'Shaping the AOS to meet user needs'. The 'AOS Users Focus Group' workshop provided the opportunity to describe the AOS and its possible applications to potential users and to gather feedback from information professionals in agriculture and related subject areas. The workshop recommended the development of a number of prototypes as proof of concept. Following this recommendation, various prototypes are being developed under the co-ordination and supervision of the AGRIS/CARIS and Documentation Unit. The prototypes are the Fishery Ontology Service (FOS), the Food Safety Ontology, the Crop-Pest Ontology, and the Anti-Microbial Ontology.

18. The third AOS workshop was held at the University of Florida in Gainesville from May 9 to 10, 2002. Participants included representatives from North and South America, agricultural information technology specialists, librarians and cataloguers, computer scientists, and people involved with organising and collecting information in any domain of agriculture and natural resources. The workshop recommended setting up an AOS consortium to strengthen the AOS project.

The AOS Prototype

19. The aim of the prototypes3 is to demonstrate the usefulness of ontologies in improving the management of agricultural information resources. A brief account of the prototypes follows:

The Fishery Ontology

20. The Fishery Ontology prototype focuses on integrating existing fishery and aquatic resources terminology in the Fishery domain. The primary goal of the prototype is to map and harmonise fishery terminology from various resources to make information integration possible from different fishery information systems and portals.

21. The systems to be integrated are: the "reference tables" underlying the FIGIS portal, the ASFA online thesaurus, the fishery part of the AGROVOC thesaurus, and the oneFish community directory. The Fishery Ontology Service is envisaged as a key feature of the Enhanced Online Multilingual Fishery Thesaurus, which addresses the problem of accessing and/or integrating fishery information.
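The idea of harmonising terminology across systems can be sketched as a mapping from each system's local terms to shared concepts. In the minimal Python example below, the four source names come from the text, but the local terms, the concept identifiers and the function `equivalents` are invented for illustration and do not reflect the actual content of those systems or the design of the prototype.

```python
# Hypothetical mappings: (source system, local term) -> shared concept id.
# Source names are taken from the text; terms and ids are invented.
MAPPINGS = {
    ("FIGIS", "Thunnus albacares"): "concept:yellowfin_tuna",
    ("ASFA", "yellowfin tuna"): "concept:yellowfin_tuna",
    ("AGROVOC", "tuna"): "concept:tuna",
    ("oneFish", "tunas"): "concept:tuna",
}


def equivalents(source, term):
    """Return the (source, term) pairs in other entries that share a concept."""
    concept = MAPPINGS.get((source, term))
    if concept is None:
        return []
    return sorted(
        key for key, mapped in MAPPINGS.items()
        if mapped == concept and key != (source, term)
    )


# A term used in one system is resolved to its counterpart in another.
print(equivalents("ASFA", "yellowfin tuna"))
print(equivalents("AGROVOC", "tuna"))
```

A query expressed in one system's terminology could then, under these assumptions, be translated into the terminology of the others before it is sent to them.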

The Food Safety Ontology

22. The primary goal of the food safety ontology is to provide multilingual domain vocabularies to facilitate search and information retrieval from food safety related portals. The ontology will be used to annotate documents with semantic information and to structure Web portals. The core food safety ontology is created by extracting food safety vocabularies from food safety related documents. Food safety experts were also involved in suggesting concepts and relationships that should be included in the ontology. A user interface that allows searching and browsing the ontology is implemented.

The Crop-Pest Ontology

23. The prototype is expected to improve searches in citrus-related information systems and databases. The ontology will cover publications and images in the domain of citrus, citrus pests and other pest domains. The goals of the prototype are to develop an experimental information extraction system for analysing the captions associated with images, to explore exchanging information and merging ontologies with other domain ontologies, and to experiment with the cross-disciplinary features of ontologies. Expected results of the prototype include an ontology for the crop-pest domain, which is based on citrus but can be generalised to other crops, a user interface for ontology-assisted search, a natural language processing system for information extraction, a demonstration of exchanging information among domain ontologies, and an evaluation of the effectiveness of ontology-assisted search in an established electronic agricultural information system.

The Anti-Microbial Ontology

24. The Anti-Microbial ontology will mainly be used in the microbiological domain. The project demonstrates ontology usage and ontology unification by implementing a combined ontology for two domains: namely, natural anti-microbial compounds and growth behaviour of micro-organisms. The motivation of this prototype is to demonstrate the use of ontologies for searching information, to show the process of cross-domain mapping of ontologies, and to prove the advantage of integrating ontologies across adjacent domains for searching information.

Towards a Consortium for semantic standards

25. Semantic standards were once an issue mostly for librarians; now they are essential for anyone handling information management on the web.

26. It is envisaged that further development of the AOS will be undertaken through a consortium. With this approach, the experience of collaborative maintenance of AGROVOC, based on the activity of the AGRIS centres, will be brought to a new level.

27. The aim of the proposed consortium is to support and promote the development and application of semantic standards (vocabularies, glossaries, definitions, thesauri, metadata schemas, and ontologies) in all subject areas that contribute to knowledge for achieving food security and sustainable development.

28. The consortium will act as a clearinghouse between data providers (international organisations, research institutes, universities, and the private sector) and service providers (international organisations, governments, and information services) in the area.

29. The activities of this consortium are envisaged to include the organisation of workshops, the maintenance of a library of semantic standards, and participation in projects for the development of tools for the semantic web.

Glossary

Metadata: The term "metadata" refers to any data used to aid identification, description and location of information resources. Metadata facilitates information retrieval. An example of metadata is a library catalogue, which provides value-added information about the content and location of books.

Ontology: Ontologies promise to provide the capacity to explicitly specify the semantics needed for information categorisation, integration and retrieval. The semantics are developed by unambiguously defining a concept, by associating descriptive metadata with the concept being described, and by specifying elaborate relationships among the concepts of a specific domain.

Semantic: The word ‘semantic’ refers to the intended meaning of data or information. In the information systems context, semantics can be viewed as a mapping between an object modelled, represented and/or stored in an information system and the real-world object(s) it represents. Semantics can be understood as the “meaning of data” and “a reflection of the real world”. It is a set of mapping(s) from our representation language to agreed concepts (objects, relationships, and behaviour) in the real world.

Taxonomy: In the information management context, taxonomy refers to categorising subject areas in broad categories, often using hierarchical relationships. Taxonomies help to organise information in a logical and systematic way, thereby providing improved information access and more efficient information retrieval.

XML: XML is a textual encoding system for creating structured documents that can be easily understood by browser software on the Internet.

Knowledge organisation systems (KOS): KOS is a generic term that encompasses conventional information management methodologies and standards such as thesauri, controlled vocabulary lists, subject categories and glossaries, as well as more elaborate vocabulary systems such as ontologies.

Web mining: Web mining refers to the discovery and analysis of useful information from the web. The key objective is to develop tools for information retrieval to help the user in finding, extracting, filtering and evaluating the desired information and resources.

References

1. The Agricultural Ontology Service website
http://www.fao.org/agris/aos

2. The Agricultural Ontology Service (AOS) Concept Note-A Tool for Facilitating Access to Knowledge: http://www.fao.org/agris/aos/Documents/AOS_Draftproposal.htm

3. Overview of Approach, Methodologies, Standards, and Tools for Ontologies: http://www.fao.org/agris/aos/Documents/BackgroundAOS.html

4. Integration of knowledge organisation systems in the Fishery domain:
http://www.fao.org/agris/aos/Documents/FisheryConceptPaper.doc

5. Ontologies Come of Age: http://www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-mit-press-(with-citation).htm

_________________________________

1 The AGROVOC website is accessible on the Internet at http://www.fao.org/agrovoc/ .
The proposed terms were documented at
http://www.fao.org/agris/input/agrovocnew/propdescriptors.asp?Language=EN

2 For a detailed account of the workshops, refer to the AOS site at
http://www.fao.org/agris/aos/Workshops/Workshops.htm

3 For more information on these prototypes, please refer to http://www.fao.org/agris/aos