Introduction
Purpose and Need
At FAO, we are committed to helping combat and eradicate world hunger. Information dissemination is an important and necessary tool in furthering this cause, and we need to provide consistent, usable access to information for the community of people doing this very work. The wide recognition of FAO as a neutral international centre of excellence for agriculture positions it well to lead the growth and improvement of knowledge representation systems in the agricultural domain, and to lead the development of more functionality for users looking for information.
Users are finding it harder and harder to retrieve information, particularly on the web, where growth has been exponential and unmanaged. Users are unsure where to go to retrieve the resources they need and how to retrieve them once they get there. There are numerous, independently created knowledge systems that contain large numbers of resources, each with its own method for describing, defining and relating these resources.
The information science community has long known that controlled terminology improves information retrieval by standardising terms and by providing a structure for languages. Controlled vocabularies improve retrieval in two ways: by increasing recall (helping users find resources that use different terminology for the same concept) and by increasing precision (defining the structure of the terminology so users can understand the scope of resources to be found).
As the amount of information has increased, we have needed more and better tools to help us manage, access and share it. Once, we had to use the same hardware in order to do this. Later, we were able to use different hardware, provided we used the same software. Now we can use different software, provided we structure the information in exactly the same way. But not all systems use the same kind of structure.
We need to be able to communicate different information structures among a community of systems. How can we develop a structure that is controlled, yet allows differences among systems? We need to develop an Agricultural Ontology Service (AOS) that will function as a reference tool that structures and standardises agricultural terminology in multiple languages for use by any number of different systems. The AOS will serve the following purposes:
The goals of the AOS are realised by assisting community partners in building ontologies. An ontology is a system that contains terms, the definitions of those terms, and the specification of relationships among those terms. It can be thought of as an enhanced thesaurus: it provides all the basic relationships inherent in a thesaurus, and it also defines and enables the creation of more formal, more specific and more powerful relationships. An ontology captures and structures the knowledge in a domain, and by doing so captures the meaning of concepts that are specific to that domain. This meaning is then extended to end-users through the use of tools (e.g., indexing, retrieval) that apply the ontologies.
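The description of an ontology as an enhanced thesaurus can be made concrete with a small data structure. The following is a minimal sketch in Python, not a proposed AOS design: the class name Concept, the helper ontological_relations, the sample term, its definition and the relationship names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One ontology entry: a term, its definition, and typed links to other terms."""
    term: str
    definition: str
    # Maps a relationship type (e.g. "broader") to the set of related terms.
    relations: dict[str, set[str]] = field(default_factory=dict)

    def relate(self, relation_type: str, other: str) -> None:
        """Record that this concept has the given relationship to another term."""
        self.relations.setdefault(relation_type, set()).add(other)


# The basic relationships a thesaurus already offers.
THESAURUS_RELATIONS = {"broader", "narrower", "related"}


def ontological_relations(concept: Concept) -> dict[str, set[str]]:
    """Return only the richer relationships that go beyond a thesaurus."""
    return {
        relation_type: targets
        for relation_type, targets in concept.relations.items()
        if relation_type not in THESAURUS_RELATIONS
    }


# Hypothetical sample data, not taken from AGROVOC.
cotton = Concept("cotton", "Illustrative definition: a fibre crop.")
cotton.relate("broader", "fibre crops")               # thesaurus-style link
cotton.relate("raw product of", "plant production")   # ontology-style link
```

In this sketch a plain thesaurus would stop at the three basic relationship types, while the ontology is free to add any number of more specific ones alongside them.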
Using the services provided by the AOS, community partners can increase the functionality of their current or planned knowledge representation systems by:
The following diagram illustrates how community partners can use the services of the AOS. They can retrieve, check, map, suggest, re-use and maintain their ontologies using components of the AOS. Several of these mechanisms are shown in the diagram. The diagram also shows how tools, such as an indexing application or a searching application for an end user, take advantage of the AOS by utilising the ontologies created using AOS components.
Thus, the AOS will provide terms, definitions and relationship components that can be shared among associated partners, thereby enhancing communication and interaction within the community. Use of these components increases the functionality for indexing and retrieving resources by providing a standard source for terminology and offering richer, more powerful ontological relationships. The following diagram illustrates the current use of relationships and the potential use of ontological relationships, using WAICENT as an example system.
In the current system, the WAICENT Info Finder retrieves all resources indexed with the topical category "forestry production." In the proposed system, WAICENT will be used to retrieve resources indexed using AOS ontological relationships that result in topical categories that are more finely described, e.g., "types of forest products," "pesticides used in forests." As a consequence, the user of the proposed system is able to retrieve a more granular and more relevant set of resources. In addition, along with the result set, the user can be shown other possibly relevant categories, again developed using ontological relationships, that they might want to use to retrieve other relevant resources.
The AOS should be built using AGROVOC as its platform, since this thesaurus was developed at FAO and has the appropriate scope and basic relationships to serve as the underlying structure for the AOS.
Because the AOS is designed to serve as a focal point for the vocabulary of the agricultural domain, and to codify and standardise the knowledge within this domain, we will need to build associations with community partners for its development. For instance, in the fisheries area, the AOS could partner with oneFish, FIGIS, FIDI, ASFA and SIFAR, among others.
Also, there should be no reason to work in a vacuum. Other organizations have developed knowledge services (for instance, Cycorp 2 and UMLS 3), and it is important to learn from the development of these other services, in terms of both process and issues.
Therefore, to achieve the stated goals of the AOS, we need a service that includes the following:
This proposal discusses the steps to be taken in achieving these goals.
Each of these steps is discussed below.
An inventory of possible participants will need to be made before any project planning can start. It is important that we develop a plan that ensures wide involvement of the agricultural community. For instance, both the CABI Information for Development program and the National Agricultural Library (NAL) AgNIC5 program would benefit from partnership with FAO WAICENT on the AOS. Their missions are similar to ours: AgNIC focuses on providing agricultural information in electronic format over the World Wide Web. The CABI Information for Development program helps design, build and sustain information and knowledge management systems and helps to provide access to knowledge tools in developing countries.
As part of this inventory, we will need to determine the methods of partner participation. Specifically:
Since the service will be the reference for terminology and relationships, we should inform the development of this reference with as many knowledge organization systems (KOSs) in the agricultural domain as possible. A KOS is any system that attempts to classify the information within its boundaries, whether that is a web site, a domain, a network, or any other type of environment.
Examples of types of KOSs are:
To start, we will need to take a complete inventory of potential KOSs. In our inventory, we should determine the degree of overlap of the terminology and relationships in the KOS with AGROVOC, the platform of the AOS. This will give us an idea of:
This will result in the development of ontologies. Some KOSs, such as those developed by CABI (the CAB thesaurus) and NAL (the AgNIC thesaurus), have a high degree of subject overlap with AGROVOC, yet will want to remain separate entities. These systems may add the richer functionality of the AOS components to their KOS, creating a new, separate ontology. Other KOSs may choose to be fully integrated into the AOS and use the AOS as their primary point of departure for agricultural terminology and relationships. The following graphic illustrates this development.
In summary, the use of KOSs will be two-fold:
There are now opportunities to use and share controlled vocabularies in web environments because of new standards that offer more power and flexibility. The advent of XML (eXtensible Markup Language) offers a common method for sharing knowledge across different tools. The RDF (Resource Description Framework) standard allows storage and sharing of metadata (the data that describes resources) across systems. The topic mapping language, XTM (XML Topic Maps), currently in development, may offer even stronger functionality for the use of such metadata.
These new standards are part of the Semantic Web, which allows machines to share information the way humans currently share information on the web. The syntax and schema behind the Semantic Web give us interoperability: the means to share resource descriptions among systems. The AOS is dependent on this ability, since it will need to share information between the AOS and the micro-ontologies.
To ensure this interoperability, components of the AOS will need to be encoded within the RDF framework. Terms and definitions and their associated relationships will be identified by Uniform Resource Identifiers (URIs) and stored in this common framework. (We are considering the use of XTM instead of RDF if XTM proves to provide richer associations for better encoding.) The AOS will then use XML to exchange these URIs among systems. Systems will need to use XML in order to communicate with the AOS.
The conjunction of these standards will enable the sharing of machine-readable URIs among a variety of different systems. For the AOS, this communication will allow components of ontologies-their terms, definitions and relationships-to be shared, evaluated and maintained using the AOS.
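The encoding described above can be sketched as follows. This is an illustration under stated assumptions rather than the AOS encoding: the example.org URIs, the aos namespace and the relationship names rawProductOf and broader are hypothetical, and the functions to_rdf_xml and from_rdf_xml are names invented for this sketch. Only Python's standard xml.etree.ElementTree module and the standard RDF syntax namespace are used.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
AOS_NS = "http://example.org/aos/relations#"   # hypothetical namespace
BASE = "http://example.org/aos/terms/"         # hypothetical term URIs

# Each statement is (subject URI, relationship name, object URI).
triples = [
    (BASE + "cotton", "rawProductOf", BASE + "plant-production"),
    (BASE + "cotton", "broader", BASE + "fibre-crops"),
]


def to_rdf_xml(statements):
    """Serialise the statements as an RDF/XML string, one Description per subject."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("aos", AOS_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    descriptions = {}
    for subject, predicate, obj in statements:
        node = descriptions.get(subject)
        if node is None:
            node = ET.SubElement(
                root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": subject}
            )
            descriptions[subject] = node
        ET.SubElement(node, f"{{{AOS_NS}}}{predicate}", {f"{{{RDF_NS}}}resource": obj})
    return ET.tostring(root, encoding="unicode")


def from_rdf_xml(text):
    """Read the statements back, as a partner system receiving the XML might."""
    root = ET.fromstring(text)
    statements = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # Tags look like "{namespace}name"; keep only the local name.
            _, _, name = prop.tag[1:].partition("}")
            statements.append((subject, name, prop.get(f"{{{RDF_NS}}}resource")))
    return statements


document = to_rdf_xml(triples)
```

Because both sides agree on the URIs and on the XML shape, the receiving system recovers exactly the statements that were sent, whatever software it runs internally.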
Recently, there has been considerable discussion relating to extending this core set of relationships. In the late 1990s, the American Library Association Subcommittee on Subject Relationships/Reference Structures examined over 165 relationships within the English language alone and from these produced a checklist of twenty candidate subject relationships for information retrieval.
We can use an extended set of relationships to perform more granular and more consistent indexing, and to enable more effective searching and browsing for users. We need to formalise rules for their development and implement processes for using them in indexing and retrieval.
For example, for the topic "plant production" we can describe the associations the topic has with other topics. In the table below, for instance, "raw product" is formally associated with the topic "plant production." In practice, an indexer would see all the appropriate topics and relationship types when describing a resource: a resource about "cotton balls" might receive the topic "plant production" and the relationship type "raw product." A searcher requesting the topic "plant production" could be presented with the option to limit their search to particular kinds of relationships, e.g., "Would you like to see raw products?" The prospect of retrieving more relevant resources is greatly increased.
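The indexing and searching behaviour just described can be sketched in a few lines of Python. This is a minimal illustration, not the WAICENT or AOS implementation: the function names, the in-memory index, the second sample resource and the relationship type "cultivation method" are assumptions made for the example.

```python
from collections import defaultdict

# index[topic][relationship_type] -> list of resource identifiers
index = defaultdict(lambda: defaultdict(list))


def index_resource(resource, topic, relationship_type):
    """Indexer side: attach a topic and a relationship type to a resource."""
    index[topic][relationship_type].append(resource)


def search(topic, relationship_type=None):
    """Searcher side: all resources for a topic, optionally limited by relationship."""
    by_type = index.get(topic, {})
    if relationship_type is not None:
        return list(by_type.get(relationship_type, []))
    return [resource for resources in by_type.values() for resource in resources]


def refinements(topic):
    """Relationship types the searcher could be offered to narrow the result set."""
    return sorted(index.get(topic, {}))


index_resource("report on cotton balls", "plant production", "raw product")
index_resource("guide to crop rotation", "plant production", "cultivation method")
```

A search on "plant production" alone returns both resources, refinements("plant production") supplies the options for a prompt such as "Would you like to see raw products?", and passing "raw product" as the second argument narrows the result to the cotton report.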
To effectively develop this functionality, a list of potential relationships needs to be compiled, in conjunction with the inventory of KOSs. These relationships should be compared among the KOSs, with the end result of a common set of relationships. These common relationships will be stored in the AOS for future re-use by other ontologies. As a result of continuous sharing, development and maintenance, a refined collection of commonly used ontological relationships will be available. (The compilation of relationships will need to be compared with those being considered by NISO and ISO for inclusion in a future standard for electronic thesauri.7)
A key aspect of the AOS is that it will be multilingual. For any user in any country who needs access to resources, we should provide the ability to index and find information in any language needed. The AOS needs to collect and co-ordinate terminology, definitions and relationships in at least the five official languages of the FAO: English, French, Spanish, Arabic and Chinese. Additional languages should be added by those partners developing their own ontology, as needed.
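One way to picture multilingual co-ordination is to identify each concept independently of any language and attach one label per language to it. The sketch below assumes this design; it is not the AOS data model. The concept identifiers, the LABELS table and the function names are hypothetical, and only three of the languages are shown, with the others following the same shape.

```python
# Illustrative labels keyed by a language-neutral concept identifier.
LABELS = {
    "c_rice": {"en": "rice", "fr": "riz", "es": "arroz"},
    "c_wheat": {"en": "wheat", "fr": "blé", "es": "trigo"},
}


def concept_for(label, language):
    """Find the concept that carries this label in the given language."""
    wanted = label.casefold()
    for concept_id, by_language in LABELS.items():
        if by_language.get(language, "").casefold() == wanted:
            return concept_id
    return None


def translate(label, source, target):
    """Map a label from one language to another through the shared concept."""
    concept_id = concept_for(label, source)
    if concept_id is None:
        return None
    # Returns None when the concept has no label yet in the target language.
    return LABELS[concept_id].get(target)
```

With this arrangement a resource indexed under "riz" and a search for "arroz" meet at the same concept, and a missing label shows up as an explicit gap rather than a silent mismatch.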
There are two large issues surrounding multilinguality:
During the collection of terms and relationships, special attention will need to be given to the multilingual aspect. For the AOS to be a functional multilingual reference, it needs to be comprehensive. We will need to keep in close communication with partners to be aware of changing needs, and write thorough policies to be followed by those managing the AOS.
We will need to develop a suite of ontology tools to be used for accessing the AOS and its set of ontologies. This suite should include development, storage, management and retrieval tools, which can be further defined as:
For some of these tools, off-the-shelf products may be sufficient, in which case a chosen few will need to be tested and evaluated for adequacy. In certain cases, it may be possible to tweak the product to fit our needs, depending on the availability of staff to assist in this. Otherwise, if there are no off-the-shelf products, we will need to develop our own (preferably web-based) tools using staff knowledge and skills.
As a proof of concept, we need to test AOS functionality at an early stage. The oneFish/CDS application provides a good opportunity to do this. oneFish uses CDS to provide many of the tools listed above; however, it does not use a controlled knowledge representation system as a reference for terminology and relationships. A separate concept paper is currently being written by the oneFish team to propose the development of a multilingual fisheries ontology to improve the current oneFish application. Many of the same steps will need to be taken by the oneFish team and the AOS team. Working closely with the oneFish team will make it possible to test a large part of the functionality of the AOS.
It wouldn't make sense to develop these tools without developing interfaces that address the needs of the end-users. To do this, we need to know:
These are only a portion of the questions that need to be asked when developing interfaces. To answer them appropriately, we need to strategise and we need to do usability testing. Effective usability testing involves meeting the users and can include such methods as focus groups, one-on-one evaluation, observation and task analysis. User profiling (e.g., demographics, use of tools, level of subject knowledge) will aid in making sure the appropriate end-users are chosen.
The results of this testing will make it possible to update and enhance AOS functionality. Testing also informs the methodology for training end-users in how to operate the tools, which will need to be initiated before the end of the project. Testing should be done iteratively throughout the lifecycle of the project to ensure that the AOS continues to be suitable for the needs of end-users.
The project should be completed within a two-year time span. The following is a tentative time frame for a three-phase approach. Many of these activities will take place concurrently. See Appendix B for a more detailed timeline.
References
1 Guidelines for the Construction, Format, and Management of Monolingual Thesauri,