Introduction
Purpose and Need
Features and Functionality
The Proposal
Participation
Knowledge Organization Systems (KOSs)
Standards

Ontological Relationships
Multilinguality
Tool Development and Testing
User Testing
Appendix A: Activity List
Appendix B: Timeline

Introduction

Purpose and Need

At FAO, we are committed to helping combat and eradicate world hunger. Information dissemination is an important and necessary tool in furthering this cause, and we need to provide consistent, usable access to information for the community of people doing this very work. The wide recognition of FAO as a neutral international centre of excellence for agriculture positions it perfectly to lead in the growth and improvement of knowledge representation systems in the agricultural domain, and to lead in developing more functionality for users looking for information.

Users are finding it harder and harder to retrieve information, particularly on the web, where growth has been exponential and unmanaged. Users are unsure where to go to retrieve the resources they need and how to retrieve them once they get there. There are numerous, independently created knowledge systems that contain large numbers of resources, each with its own method for describing, defining and relating these resources.

The information science community has long known that controlled terminology improves information retrieval by standardising terms and by providing a structure for languages. Controlled terminologies, known as controlled vocabularies, improve the finding of information in two ways. They increase recall, by helping users find resources that use different terminology for the same concept, and they increase precision, by defining the structure of the terminology so that users can understand the scope of the resources to be found.

As the amount of information has increased, we have needed more and better tools to help us manage, access and share it. Once, we had to use the same hardware in order to do this. Later, we were able to use different hardware, provided we used the same software. Now we can use different software, provided we structure the information in exactly the same way. But not all systems use the same kind of structure. We need to be able to communicate different information structures among a community of systems.

How can we develop a structure that is controlled, yet allows differences among systems?

We need to develop an Agricultural Ontology Service (AOS) that will function as a reference tool that structures and standardises agricultural terminology in multiple languages for use by any number of different systems. The AOS will serve the following purposes:

  • increasing the efficiency and consistency of describing and relating multilingual agricultural resources

  • decreasing the random nature and increasing the functionality for accessing these resources

  • enabling sharing of common descriptions, definitions and relations within the agricultural community
Stated more simply, the purpose of the AOS is to achieve:
  • better indexing of resources,

  • better retrieval of resources, and

  • increased interaction within the agricultural community.

Features and Functionality

The goals of the AOS are realised by assisting community partners in building ontologies. An ontology is a system that contains terms, the definitions of those terms, and the specification of relationships among those terms. It can be thought of as an enhanced thesaurus: it provides all the basic relationships inherent in a thesaurus, and it also defines and enables the creation of more formal, more specific and more powerful relationships. An ontology captures and structures the knowledge in a domain, and by doing so captures the meaning of concepts that are specific to that domain. This meaning is then extended to end-users through the use of tools (e.g., indexing, retrieval) that apply the ontologies.
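The definition above (terms, definitions, and specified relationships among terms) can be pictured as a small data structure. The following is only a minimal sketch, not a design for the AOS: the class names, the method names and the definitions are assumptions made for illustration, and the example relationships are borrowed loosely from the cotton examples used later in this paper. It contrasts a generic thesaurus-style "related term" link with a more specific, named relationship between the same two terms.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A single ontology entry: a label plus its definition."""
    label: str
    definition: str


@dataclass
class Ontology:
    """Terms, their definitions, and typed relationships among them."""
    terms: dict = field(default_factory=dict)
    # Each relationship is a (source label, relationship type, target label) triple.
    relationships: set = field(default_factory=set)

    def add_term(self, label, definition):
        self.terms[label] = Term(label, definition)

    def relate(self, source, rel_type, target):
        # Only allow relationships between terms the ontology already defines.
        if source not in self.terms or target not in self.terms:
            raise KeyError("both terms must be defined before they are related")
        self.relationships.add((source, rel_type, target))

    def related(self, source, rel_type):
        """Return the targets linked to `source` by `rel_type`, sorted."""
        return sorted(t for s, r, t in self.relationships
                      if s == source and r == rel_type)


# Illustrative content only; the definitions are placeholders.
onto = Ontology()
onto.add_term("cotton", "Illustrative definition: a fibre crop.")
onto.add_term("cotton boll weevil", "Illustrative definition: an insect pest of cotton.")
onto.add_term("cottonseed oil", "Illustrative definition: oil pressed from cotton seed.")

# A thesaurus can only say the two terms are related ...
onto.relate("cotton", "related term", "cotton boll weevil")
# ... whereas an ontology can also say how they are related.
onto.relate("cotton", "infecting agent", "cotton boll weevil")
onto.relate("cotton", "raw product", "cottonseed oil")

print(onto.related("cotton", "infecting agent"))  # ['cotton boll weevil']
```

Storing relationships as plain triples keeps the sketch close to the way such statements are usually exchanged between systems, at the cost of a linear scan on lookup.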

Using the services provided by the AOS, community partners can increase the functionality of their current or planned knowledge representation systems by:

  • building their own subject-specific ontologies for indexing and retrieval
  • integrating their ontologies with the AOS, thereby increasing the knowledge of the service
  • making their ontologies available to others for re-use of components in building other ontologies

The following diagram illustrates how community partners can use the services of the AOS. They can retrieve, check, map, suggest, re-use and maintain their ontologies using components of the AOS. Several of these mechanisms are shown in the diagram. The diagram also shows how tools, such as an indexing application or a searching application for an end user, take advantage of the AOS by utilising the ontologies created using AOS components.

Thus, the AOS will provide terms, definitions and relationship components that can be shared among associated partners, thereby enhancing communication and interaction within the community. Use of these components increases the functionality for indexing and retrieving resources by providing a standard source for terminology and offering richer, more powerful ontological relationships. The following diagram illustrates the current use of relationships and the potential use of ontological relationships, using WAICENT as an example system.

In the current system, the WAICENT Info Finder retrieves all resources indexed with the topical category "forestry production." In the proposed system, WAICENT will be used to retrieve resources indexed using AOS ontological relationships that result in topical categories that are more finely described, e.g., "types of forest products," "pesticides used in forests." As a consequence, the user of the proposed system is able to retrieve a more granular and more relevant set of resources. In addition, along with the result set, the user can be shown other possibly relevant categories, again developed using ontological relationships, that they might want to use to retrieve other relevant resources.

The AOS should be built using AGROVOC as its platform, since this thesaurus was developed at FAO and has the appropriate scope and basic relationships to serve as the underlying structure for the AOS.

Because the AOS is designed to serve as a focal point for the vocabulary of the agricultural domain, and to codify and standardise the knowledge within this domain, we will need to build associations with community partners for its development. For instance, in the fisheries area, the AOS could partner with oneFish, FIGIS, FIDI, ASFA and SIFAR, among others.

There is also no reason to work in a vacuum. Other organizations have developed knowledge services, for instance Cycorp [2] and UMLS [3], and it is important to learn from the development of these other services, in terms of both process and issues.

Therefore, to achieve the stated goals of the AOS, we need a service that includes the following:

  • a standardised multilingual agricultural controlled vocabulary with rich relationships
  • partners with interest and capabilities for development of ontologies
  • use of state-of-the-art interoperability standards for effective communication of components
  • development of tools for development, storage, management and retrieval

This proposal discusses the steps to be taken in achieving these goals.

The Proposal

The AOS should provide an avenue for effective resource description and retrieval, and efficient knowledge system development and integration. The proposed project to develop the AOS would:

  • Strategise about participation of agricultural community partners.
  • Utilise all possible knowledge organization systems within the agricultural domain.
  • Utilise current and developing state-of-the-art interoperability standards.
  • Develop formal ontological relationships among topics.
  • Build in functionality for describing and finding multilingual resources.
  • Develop and pilot-test tools for development, storage, management and retrieval.
  • Test functionality of the service with end-users.

Each of these steps is discussed below.

Participation

An inventory of possible participants will need to be made before any project planning can start. It is important that we develop a plan that ensures wide involvement of the agricultural community. For instance, the CABI Information for Development program [4] and the National Agricultural Library (NAL) AgNIC program [5] would both benefit from partnership with FAO WAICENT on the AOS. Their missions are similar to ours: AgNIC focuses on providing agricultural information in electronic format over the World Wide Web, and the CABI Information for Development program helps design, build and sustain information and knowledge management systems and helps to provide access to knowledge tools in developing countries.

As part of this inventory, we will need to determine the methods of partner participation. Specifically:

  • What is their level of participation? Whether they will be involved in the preliminary planning or participate through to the technical implementation should be determined from the start. Naturally, involvement through to the end will imply a greater share of the results.
  • Do they have systems they can share with us? Our project may allow them to test the application of systems they have developed (e.g., for collection, indexing, retrieval, management and dissemination), which could in parallel inform the development of the AOS. (These systems are discussed in more depth in project steps 2 and 7.)
  • What resources (staffing, funding) can they offer? This, of course, will vary depending on the level of participation of the community partner. The support of these partners will help determine the scope and shape of the project.
After taking the inventory, we will need to form a working group of participants. Invitations to a working group meeting at FAO should be sent in early fall 2001. (The phases of the project are described in Appendix A.)

Knowledge Organization Systems (KOSs)

Since the service will be the reference for terminology and relationships, we should inform the development of this reference with as many knowledge organization systems (KOSs) in the agricultural domain as possible. A KOS is any system that attempts to classify the information within its boundaries, whether that is a web site, a domain, a network, or any other type of environment.

Examples of types of KOSs are:

  • classifications: lists of terms, often using hierarchical relationships
  • controlled vocabularies: controlled lists of preferred and variant terms based on concepts
  • thesauri: controlled vocabularies containing hierarchical relationships
  • authority files: controlled lists of preferred and variant names
  • glossaries: lists of terms with definitions
  • gazetteers: dictionaries of place names
  • subject headings lists: broad categorisations of subject areas

To start, we will need to take a complete inventory of potential KOSs. In our inventory, we should determine the degree of overlap of the terminology and relationships in the KOS with AGROVOC, the platform of the AOS. This will give us an idea of:

  • the amount of effort required to analyse and use the systems
  • the benefits that can be obtained by both the KOS and the AOS, e.g., sharing of terms and relationships
  • the process for mapping terms, definitions and relationships between the AOS and the KOS
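The degree of overlap mentioned above can be estimated mechanically once term lists are available. The sketch below is one simple way to do so under stated assumptions: it compares terms by exact match after light normalisation, which is far cruder than real concept mapping, and both term lists are invented samples rather than extracts from AGROVOC or any partner KOS. The function names are hypothetical.

```python
def normalise(term):
    """Lower-case and collapse whitespace so trivially different spellings match."""
    return " ".join(term.lower().split())


def overlap_report(kos_terms, platform_terms):
    """Compare a KOS term list against the platform vocabulary.

    Returns the shared terms, the terms found only in the KOS, and the share
    of KOS terms already present in the platform (0.0 for an empty KOS).
    """
    kos = {normalise(t) for t in kos_terms}
    platform = {normalise(t) for t in platform_terms}
    shared = kos & platform
    coverage = len(shared) / len(kos) if kos else 0.0
    return {
        "shared": sorted(shared),
        "kos_only": sorted(kos - platform),
        "coverage": coverage,
    }


# Invented sample lists, for illustration only.
platform_sample = ["Cereals", "Maize", "Rye", "Plant products"]
kos_sample = ["maize", "rye", "Sorghum", "plant  products"]

report = overlap_report(kos_sample, platform_sample)
print(report["shared"])    # ['maize', 'plant products', 'rye']
print(report["kos_only"])  # ['sorghum']
print(report["coverage"])  # 0.75
```

A high coverage figure suggests that mapping the KOS will take little effort, while the "kos_only" list shows what the KOS could contribute to the AOS.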

This will result in the development of ontologies. Some KOSs, such as those developed by CABI (the CAB thesaurus) and NAL (the AgNIC thesaurus), have a high degree of subject overlap with AGROVOC, yet will want to remain separate entities. These systems may add the richer functionality of the AOS components to their KOS, thereby creating a new, separate ontology. Other KOSs may choose to be fully integrated into the AOS and use the AOS as their primary point of departure for agricultural terminology and relationships. The following graphic illustrates this development.


A number of KOSs are already available to us, although this is by no means a complete list:

Examples of KOSs              Type of KOS      Subject Area
AGROVOC                       thesaurus        agriculture
AGRIS/CARIS Categories        classification   agriculture
WAICENT Information Finder    classification   agriculture
FAOTERM                       authority list   agriculture
FIGIS                         classification   fishery
oneFish                       classification   fishery
IUFRO SilvaVoc                thesaurus        forestry
...                           ...              ...

In summary, the use of KOSs will be two-fold:

  • To help grow the AOS. By using a variety of different types of KOSs from many sub-domains of agriculture, we can develop the AOS beyond its AGROVOC platform. This can only create more functionality for indexing and retrieving resources.
  • To assist in re-deploying KOSs as ontologies. Some KOSs may become integrated into the AOS, while others will be re-worked and enhanced by the inclusion of AOS components to become separate ontologies. Use of these subject-specific ontologies, which we call micro-ontologies, increases the opportunity for retrieving relevant resources.


Standards

There are now opportunities to use and share controlled vocabularies in web environments because of new standards that offer more power and flexibility. The advent of XML (eXtensible Markup Language) offers a common method for sharing knowledge across different tools. The RDF (Resource Description Framework) standard allows storage and sharing of metadata (the data that describes resources) across systems. The topic mapping language, XTM (XML Topic Maps), currently in development, may offer even stronger functionality for the use of such metadata.

These new standards are part of the Semantic Web, which allows machines to share information the way humans currently share information on the web. The syntax and schema behind the Semantic Web give us interoperability: the means to share resource descriptions among systems. The AOS is dependent on this ability, since it will need to share information between the AOS and the micro-ontologies.

To ensure this interoperability, components of the AOS will need to be encoded within the RDF framework. Terms and definitions and their associated relationships will be identified by Uniform Resource Identifiers (URIs) and stored in this common framework. (We are considering the use of XTM instead of RDF, should XTM prove to provide richer associations for better encoding.) The AOS will then use XML to exchange these URIs among systems. Systems will need to use XML to be capable of communicating with the AOS.

The conjunction of these standards will enable the sharing of machine-readable URIs among a variety of different systems. For the AOS, this communication will allow components of ontologies (their terms, definitions and relationships) to be shared, evaluated and maintained using the AOS.
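To make the exchange described above more concrete, the sketch below writes a few term relationships as RDF statements in XML and reads them back, as a receiving system might. It is an illustration under stated assumptions, not the AOS encoding: the `example.org` namespaces and the property names `narrowerTerm` and `broaderTerm` are hypothetical, and only the RDF namespace itself is the real one. It uses Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical namespaces, used here for illustration only.
TERM_NS = "http://example.org/aos/term/"
REL_NS = "http://example.org/aos/relationship#"

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("aos", REL_NS)

# Each statement is (subject URI, relationship name, object URI).
statements = [
    (TERM_NS + "cereals", "narrowerTerm", TERM_NS + "maize"),
    (TERM_NS + "cereals", "narrowerTerm", TERM_NS + "rye"),
    (TERM_NS + "cereals", "broaderTerm", TERM_NS + "plant_products"),
]


def to_rdf_xml(triples):
    """Serialise (subject, relationship, object) triples as an RDF/XML string."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    descriptions = {}
    for subject, relationship, obj in triples:
        # One rdf:Description element per subject URI.
        if subject not in descriptions:
            descriptions[subject] = ET.SubElement(
                root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": subject}
            )
        ET.SubElement(
            descriptions[subject],
            f"{{{REL_NS}}}{relationship}",
            {f"{{{RDF_NS}}}resource": obj},
        )
    return ET.tostring(root, encoding="unicode")


def from_rdf_xml(document):
    """Read the triples back out of the XML document."""
    root = ET.fromstring(document)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # Strip the "{namespace}" prefix to recover the relationship name.
            name = prop.tag.split("}", 1)[1]
            triples.append((subject, name, prop.get(f"{{{RDF_NS}}}resource")))
    return triples


xml_text = to_rdf_xml(statements)
received = from_rdf_xml(xml_text)
print(received == statements)  # True
```

Because every term and relationship is named by a URI, the receiving system needs no knowledge of how the sender stores its vocabulary, only of the shared encoding.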

Ontological Relationships

A thesaurus has equivalence (USE/UF), broader term (BT), narrower term (NT) and related term (RT) relationships. These relationships provide the scope and structure for the thesaurus. For instance, knowing that a broader term for "cereals" is "plant products" and that narrower terms are "maize" and "rye" defines the scope of information represented by these terms.
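The four thesaurus relationships can be sketched in a few lines of code. The example below is a minimal illustration rather than a description of how AGROVOC is stored: the class and method names are assumptions, and the variant "corn" for "maize" is added only to show the USE/UF relationship. It uses the cereals example from the paragraph above and shows how the hierarchy defines the scope of a term.

```python
from collections import defaultdict


class Thesaurus:
    """Minimal thesaurus with USE/UF, BT/NT and RT relationships."""

    def __init__(self):
        self.use = {}                      # variant term -> preferred term (USE)
        self.broader = defaultdict(set)    # term -> broader terms (BT)
        self.narrower = defaultdict(set)   # term -> narrower terms (NT)
        self.related = defaultdict(set)    # term -> related terms (RT)

    def add_variant(self, variant, preferred):
        self.use[variant] = preferred

    def add_broader(self, term, broader_term):
        # BT and NT are inverses, so record both directions at once.
        self.broader[term].add(broader_term)
        self.narrower[broader_term].add(term)

    def add_related(self, a, b):
        # RT is symmetric.
        self.related[a].add(b)
        self.related[b].add(a)

    def preferred(self, term):
        """Resolve a variant to its preferred term (unchanged if already preferred)."""
        return self.use.get(term, term)

    def used_for(self, preferred):
        """UF: the variants that point at a preferred term."""
        return sorted(v for v, p in self.use.items() if p == preferred)

    def expand(self, term):
        """Return the term plus everything below it in the hierarchy."""
        start = self.preferred(term)
        seen, stack = set(), [start]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.narrower.get(current, ()))
        return sorted(seen)


thesaurus = Thesaurus()
thesaurus.add_broader("cereals", "plant products")
thesaurus.add_broader("maize", "cereals")
thesaurus.add_broader("rye", "cereals")
thesaurus.add_variant("corn", "maize")  # assumed variant, for illustration

print(thesaurus.expand("cereals"))   # ['cereals', 'maize', 'rye']
print(thesaurus.preferred("corn"))   # maize
```

Expanding a query term to its narrower terms is one way the structure improves recall: a search on "cereals" can also retrieve resources indexed only under "maize" or "rye".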

Recently, there has been considerable discussion relating to extending this core set of relationships. In the late 1990s, the American Library Association Subcommittee on Subject Relationships/Reference Structures examined over 165 relationships within the English language alone and from these produced a checklist of twenty candidate subject relationships for information retrieval.

We can use an extended set of relationships to perform more granular and more consistent indexing, and to enable more effective searching and browsing for users. We need to formalise rules for their development and implement processes for using them in indexing and retrieval.

For example, for the topic "plant production" we can describe the associations the topic has with other topics. In the table below, for instance, "raw product" is formally associated with the topic "plant production." In practice, an indexer would see all the appropriate topics and relationship types when describing a resource. A resource about "cotton balls," for example, might receive the topic "plant production" and the relationship type "raw product." A searcher requesting the topic "plant production" could be presented with the option to limit the search to particular kinds of relationships, e.g., "Would you like to see raw products?" The prospect of retrieving more relevant resources is greatly increased.

Relationship Type       Examples
Primary activity        seasonal cropping
Type of plant           cotton
Cultivar                "U-name-it" cotton cultivar
Location                Southern States (USA)
Production system       agropastoral systems
Raw product             cotton balls; cottonseed oil
Derived product         protective clothing
By-product              mattresses; oilseed cakes
Derived activity        handicrafts; milling
Resources               cotton gin machines
Environmental impact    soil degradation; fertilizers; insecticides
Infecting agent         cotton boll weevil
Infection               early blight (alternaria)
Limiting factor         drought
Conflicting activity    re-forestation
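The indexing and searching behaviour described before the table can be sketched as follows. This is a minimal illustration under stated assumptions: the resource identifiers are invented, the function names are hypothetical, and a real system would store far more than a topic and a relationship type per resource. The topic and relationship types are taken from the table above.

```python
from collections import defaultdict

# index[topic][relationship type] -> set of resource identifiers
index = defaultdict(lambda: defaultdict(set))


def index_resource(resource_id, topic, relationship_type):
    """Record that a resource is about `topic` under a given relationship type."""
    index[topic][relationship_type].add(resource_id)


def search(topic, relationship_types=None):
    """Return resources for a topic, optionally limited to some relationship types."""
    by_type = index.get(topic, {})
    if relationship_types is None:
        wanted = by_type.keys()
    else:
        wanted = [r for r in relationship_types if r in by_type]
    results = set()
    for relationship_type in wanted:
        results |= by_type[relationship_type]
    return sorted(results)


# Hypothetical resource identifiers.
index_resource("doc-cotton-balls", "plant production", "raw product")
index_resource("doc-cottonseed-oil", "plant production", "raw product")
index_resource("doc-boll-weevil", "plant production", "infecting agent")
index_resource("doc-drought", "plant production", "limiting factor")

# "Would you like to see raw products?"
print(search("plant production", ["raw product"]))
# ['doc-cotton-balls', 'doc-cottonseed-oil']

# Without a limit, every resource on the topic is returned.
print(len(search("plant production")))  # 4
```

The list of relationship types available for a topic (`index[topic].keys()` in this sketch) is also what a search interface could offer the user as options for narrowing a result set.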

Ontological relationships also help eliminate the need to do multiple searches. For example, a researcher might be interested in finding resources about the types of infestations of tomatoes. Instead of having to do a separate search for each type of infestation (e.g., "tomatoes AND tomato mosaic tobamovirus," "tomatoes AND fungal wilt"), he can request the use of a formally defined ontological relationship, "infecting agent," with the topic "tomatoes." Each tomato infestation resource in his system has been indexed using this relationship. By using it, he saves himself the work of having to do multiple searches, and instead retrieves just what he needs through a single request.
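The saving can be shown side by side. In the sketch below, the manual route runs one Boolean search per known infestation and merges the results, while the ontological route makes a single request. Note one deliberate simplification, which is an assumption of this sketch and not the approach stated above: instead of storing the relationship on each resource at indexing time, the relationship is used to expand the query at search time. The resources and function names are invented for illustration.

```python
# Hypothetical resources, each indexed with a set of terms.
resources = {
    "r1": {"tomatoes", "tomato mosaic tobamovirus"},
    "r2": {"tomatoes", "fungal wilt"},
    "r3": {"tomatoes", "irrigation"},
    "r4": {"potatoes", "fungal wilt"},
}

# Ontological relationships: (topic, relationship type) -> linked terms.
relationships = {
    ("tomatoes", "infecting agent"): {"tomato mosaic tobamovirus", "fungal wilt"},
}


def search_and(*terms):
    """Classic Boolean AND search over the indexed terms."""
    return {rid for rid, indexed in resources.items() if set(terms) <= indexed}


def search_by_relationship(topic, relationship_type):
    """One request: the topic AND any term linked to it by the relationship."""
    linked = relationships.get((topic, relationship_type), set())
    return {
        rid for rid, indexed in resources.items()
        if topic in indexed and indexed & linked
    }


# The manual route: one search per known infestation, merged by hand.
manual = (search_and("tomatoes", "tomato mosaic tobamovirus")
          | search_and("tomatoes", "fungal wilt"))

# The ontological route: a single request.
single = search_by_relationship("tomatoes", "infecting agent")

print(sorted(single))    # ['r1', 'r2']
print(single == manual)  # True
```

The manual route also requires the researcher to know every infecting agent in advance; in the single request that knowledge comes from the ontology.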

To develop this functionality effectively, a list of potential relationships needs to be compiled in conjunction with the inventory of KOSs. These relationships should be compared among the KOSs to arrive at a common set of relationships. These common relationships will be stored in the AOS for future re-use by other ontologies. As a result of continuous sharing, development and maintenance, a refined collection of commonly used ontological relationships will be available. (The compilation of relationships will need to be compared with those being considered by NISO and ISO for inclusion in a future standard for electronic thesauri [7].)

Multilinguality

A key aspect of the AOS is that it will be multilingual. For any user in any country who needs access to resources, we should provide the ability to index and find information in any language needed. At a minimum, the AOS needs to collect and co-ordinate terminology, definitions and relationships in the five official languages of the FAO: English, French, Spanish, Arabic and Chinese. Additional languages should be added, as needed, by those partners developing their own ontologies.

There are two large issues surrounding multilinguality:

  • The collection of terms in different languages. This obviates the need for the use of a variety of KOSs that each may contain different language variations. The AOS obviously benefits from the collection of terms in different languages, since it can offer them for re-use in other ontologies.
  • The mapping of terms and concepts from different languages. Often there is no one-to-one mapping of terms between languages. The same will be true for ontological relationships. We will need to perform concept-based mapping and cross-concordance (the fusing of clusters of similar concepts from different social groups into a new concept) to effectively map partially equivalent terms.
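Concept-based mapping, as opposed to direct term-to-term translation, can be sketched as follows. Labels in each language are attached to a language-independent concept, and a term is carried into another language by way of the concept it names. This is a minimal sketch under stated assumptions: the class names, concept identifiers and sample labels are chosen for illustration, and it does not attempt the cross-concordance of partially equivalent concepts described above.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A language-independent concept with labels in several languages."""
    concept_id: str
    labels: dict = field(default_factory=dict)  # language code -> list of labels

    def add_label(self, language, label):
        self.labels.setdefault(language, []).append(label)


class MultilingualVocabulary:
    def __init__(self):
        self.concepts = {}
        self._lookup = {}  # (language, lower-cased label) -> set of concept ids

    def add(self, concept):
        # Labels are indexed when the concept is added, so attach them first.
        self.concepts[concept.concept_id] = concept
        for language, labels in concept.labels.items():
            for label in labels:
                key = (language, label.lower())
                self._lookup.setdefault(key, set()).add(concept.concept_id)

    def translate(self, label, source, target):
        """Map a label to the target language by way of shared concepts.

        Returns a sorted list, because a term may map to zero, one or
        several labels in the other language.
        """
        found = []
        for concept_id in self._lookup.get((source, label.lower()), ()):
            found.extend(self.concepts[concept_id].labels.get(target, []))
        return sorted(set(found))


maize = Concept("c_maize")  # hypothetical concept identifier
maize.add_label("en", "maize")
maize.add_label("fr", "maïs")
maize.add_label("es", "maíz")

rye = Concept("c_rye")  # deliberately left without a Spanish label
rye.add_label("en", "rye")
rye.add_label("fr", "seigle")

vocabulary = MultilingualVocabulary()
vocabulary.add(maize)
vocabulary.add(rye)

print(vocabulary.translate("Maize", "en", "fr"))  # ['maïs']
print(vocabulary.translate("rye", "en", "es"))    # []
```

An empty result, as in the second call, marks a gap in the collection of terms for that language, which is exactly what the collection effort described in the first point needs to find.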

During the collection of terms and relationships, special attention will need to be given to the multilingual aspect. For the AOS to be a functional multilingual reference, it needs to be comprehensive. We will need to keep in close communication with partners to be aware of changing needs, and write thorough policies to be followed by those managing the AOS.

Tool Development and Testing

We will need to develop a suite of ontology tools to be used for accessing the AOS and its set of ontologies. This suite should include development, storage, management and retrieval tools, which can be further defined as:

  • mapping: discovery of overlap in terminology and mapping of common terms and definitions
  • relationship building: creation of ontologies using common AOS relationships and building ontology-specific relationships
  • indexing: using ontologies to automatically and manually index resources
  • encoding: storage of terms, definitions and relationships in a standard, interoperable format
  • maintenance: quality assurance and upkeep of ontologies by managers
  • resource discovery: searching and browsing resources using an ontology

For some of these tools, off-the-shelf products may be sufficient, in which case a chosen few will need to be tested and evaluated for adequacy. In certain cases, it may be possible to tweak the product to fit our needs, depending on the availability of staff to assist in this. Otherwise, if there are no off-the-shelf products, we will need to develop our own (preferably web-based) tools using staff knowledge and skills.

As a proof of concept, we need to test AOS functionality at an early stage. The oneFish/CDS application provides a good opportunity to do this. oneFish uses CDS to perform many of the functions listed above; however, it does not use a controlled knowledge representation system as a reference for terminology and relationships. A separate concept paper is currently being written by the oneFish team to propose the development of a multilingual fisheries ontology to improve the current oneFish application. Many of the same steps will need to be taken by the oneFish team and the AOS team. Working closely with the oneFish team will make it possible to test a large part of the functionality of the AOS.

User Testing

It wouldn't make sense to develop these tools without developing interfaces that address the needs of the end-users. To do this, we need to know:

  • Will only partner end-users be using the tools, or is there a broader audience?
  • Will we be developing interfaces for all types of tools, or only a subset of them?
  • How expert are end-users with the subject and with the types of tools?
  • What level of detail do they want to see when indexing or when retrieving?
  • What kinds of relationships will they be most interested in viewing?

These are only some of the questions that need to be asked when developing interfaces. To answer them appropriately, we need to strategise and we need to do usability testing. Effective usability testing involves meeting the users and can include such methods as focus groups, one-on-one evaluation, observation and task analysis. User profiling (e.g., demographics, use of tools, level of subject knowledge) will aid in making sure the appropriate end-users are chosen.

The results of this testing will make it possible to update and enhance AOS functionality. Testing also informs the methodology for training end-users in how to operate the tools, which will need to be initiated before the end of the project. Testing should be done iteratively throughout the lifecycle of the project to ensure that the AOS continues to be suitable for the needs of end-users.

Appendix A: Activity List

The project should be completed within a two-year time span. The following is a tentative time frame for a three-phase approach. Many of these activities will take place concurrently. See Appendix B for a more detailed timeline.

Project Steps and Activities, by Phase (Plan, Gather, Generate)
Participation

  • Identify partners and other stakeholders, and establish contacts.
  • Create working group of partners and other interested parties.
  • Establish electronic communication (e.g., mailing lists) for exchanging ideas among partners.
  • Invite partners to working group meeting.

  • Research roles, availability and resources of these partners. Sensitise stakeholders on reasons for use of AOS and its components.
  • Ensure funding for development, testing and implementation.
  • Create marketing plan for use during implementation.

KOSs

  • Develop a survey within the agricultural domain to inventory which KOSs are available and which are being developed. Run survey.
  • Evaluate types of KOSs inventoried.
  • Evaluate uses of KOSs for AOS.
  • Evaluate potential KOSs for overlap in terminology and relationships.

  • Decide which KOSs should be integrated into the AOS and which should stand alone as ontologies.

Standards

  • Develop survey of suitable encoding standards. Run survey and evaluate results.
  • Research mechanisms needed for communicating between the AOS and the KOSs.

  • Determine a suitable set of standards for use in the AOS.

Vocabulary & Relationships

  Plan
  • Develop a process for mapping common terms, definitions and relationships between the KOSs and the AOS.
  • Discuss with partners the types of new relationships that need to be developed.

  Gather
  • Analyse the AGROVOC thesaurus for necessary modifications.
  • Analyse current thesaural relationships.
  • Research new relationship development.

  Generate
  • Determine what constitutes the platform vocabulary (AGROVOC plus integrated KOSs).
  • Map terms from KOSs to AOS.
  • In conjunction with partners, develop ontological relationships.
  • Determine a common set of relationships that are best served in the AOS.

Multilinguality

  Plan
  • Create policies for appropriate gathering of multilingual terms, definitions and relationships.
  • Develop a process for mapping multilingual components from KOSs to the AOS.

  Gather
  • Research and identify issues related to multilingual organization systems.
  • Identify languages needed to satisfy our multilingual requirement.
  • Determine the sources for multilinguality (e.g., AGROVOC).

  Generate
  • Identify components in the AOS that need to be translated.
  • Translate and map components as needed.

Tools

  Plan
  • Describe functional requirements for development, storage, maintenance and retrieval tools.
  • Determine methods for maintaining the AOS (e.g., updating, suggestions handling).

  Gather
  • Identify and evaluate off-the-shelf web-based tools that are suitable for development and maintenance.
  • In particular, prioritize the discovery of suitable software for maintenance.

  Generate
  • Customize tools as necessary and/or build tools in-house.
  • Test the AOS with oneFish/CDS application.

User Testing

  Plan
  • Describe functional requirements for building interfaces for end-users.
  • Decide on methodology for testing (e.g., card sorting, task analysis, prototypes).
  • Determine frequency of user testing, and when initial testing will take place.
  • Determine what aspects of the AOS (e.g., pilot micro-ontologies, for indexing, with local users) will be tested, and when.

  Gather
  • Identify types of users who will be using the AOS (e.g., their level of expertise, their language skills) and their needs.
  • Develop user profiles.
  • Develop use cases.

  Generate
  • Build user interface prototypes.
  • Run several tests, using the results of one to inform the next (e.g., contextual enquiries can assist in creating prototypes to be tested with other users).
  • Summarize and analyze results of user testing.
  • Loop feedback into development of the design of the AOS.
  • Train end-users.
  • Implement and publicize the AOS.

Appendix B: Timeline

References

1 Guidelines for the Construction, Format, and Management of Monolingual Thesauri, ANSI/NISO Z39.19-1993, p. 1.
2 http://www.cyc.com/overview.html
3 http://umlsks.nlm.nih.gov/ (registration required; contact Sheldon Kotzin (kotzin@nlm.nih.gov) for access information)
4 http://www.cabi.org/IFD/index.asp
5 http://www.central.agnic.org/agnic/about/index.html
6 http://www.ala.org/alcts/organization/ccs/sac/rpt97rev.html
7 http://www.niso.org/thes99rprt.html
