Achievements, impact and strategy for AGROVOC’s future
The AGROVOC Team of the Food and Agriculture Organization of the UN (FAO) convened the first meeting of AGROVOC editors and stakeholders in Utrecht, The Netherlands, on 25-26 June 2018. Twenty-five participants attended, representing eighteen institutions from fourteen countries.
The objectives of this meeting were to reinforce the AGROVOC editorial community, to invite institutions to curate new languages and new topics, to outline new technical possibilities, and to set priorities for the next few years.
The meeting was organized jointly by FAO, which coordinates the editing and publishing of AGROVOC, and the Land Portal Foundation. The participants, most of whom are also active in AGRIS (International Information System for the Agricultural Science and Technology), resolved to actively support AGROVOC in the future, with the aim of consolidating the gains achieved so far.
AGROVOC multilingual thesaurus
The AGROVOC multilingual thesaurus was created by FAO in 1980.
Since 2010, AGROVOC has been expressed as a Semantic Web concept scheme using the Simple Knowledge Organization System (SKOS).
Since April 2017, AGROVOC has been released monthly (see all AGROVOC releases). A number of AGROVOC concepts have been aligned with concepts in other multilingual knowledge organization systems covering agriculture and related domains. The general public can explore AGROVOC's twenty-five top-level concepts, with all their facets, in 29 languages in a web-based browsing environment, Skosmos.
Since March 2018, AGROVOC has been available for re-use as Linked Open Data under the international CC-BY IGO 3.0 license.
The technical infrastructure for AGROVOC has been managed for the past decade by the Artificial Intelligence Research group, University of Tor Vergata in Rome, in collaboration with FAO.
AGROVOC is edited in the web-based platform VocBench, an advanced collaborative web environment for maintaining thesauri, ontologies, code lists and authority resources, providing features such as history, validation, a publication workflow, and multi-user management with role-based access control.
During the meeting the following structural and semantic aspects of AGROVOC were discussed:
In standard thesaurus practice, a concept must be labeled with just one preferred term per language. This can be a problem when different terms are preferred in different regions of a language area. For the purpose of distinguishing preferred terms, the data model used for expressing thesauri as Linked Data, the Simple Knowledge Organization System (SKOS), pragmatically defines language in terms of unique language tags, so that preferred terms can exist in parallel for "Portuguese" (pt) and "Portuguese - Brazil" (pt-BR). Wherever regional distinctions are important, VocBench can in principle be configured to tag terms with a region-specific language code.
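The rule of one preferred term per language tag can be checked mechanically. The following is a minimal sketch in Python, assuming the labels of a concept are available as simple (text, language tag, preferred) tuples. The sample concept, its labels, and the function name `duplicate_pref_labels` are illustrative assumptions, not AGROVOC data or a VocBench feature.

```python
from collections import defaultdict

# Hypothetical labels for one concept: (label text, language tag, is_preferred).
# Illustrative values only; they are not copied from AGROVOC.
LABELS = [
    ("pineapple", "en", True),
    ("ananás", "pt", True),      # preferred for the generic "pt" tag
    ("abacaxi", "pt-BR", True),  # preferred for the regional "pt-BR" tag
    ("ananas", "en", False),     # a non-preferred (alternative) label
]


def duplicate_pref_labels(labels):
    """Return language tags that carry more than one preferred label."""
    by_tag = defaultdict(list)
    for text, tag, preferred in labels:
        if preferred:
            # Language tags are compared case-insensitively, so normalize them.
            by_tag[tag.lower()].append(text)
    return {tag: texts for tag, texts in by_tag.items() if len(texts) > 1}


# "pt" and "pt-BR" are distinct tags, so both preferred terms can coexist.
print(duplicate_pref_labels(LABELS))
```

Because "pt" and "pt-BR" are treated as different tags, the two Portuguese preferred terms do not conflict; adding a second preferred label under plain "pt" would be reported.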
AGROVOC top concepts
AGROVOC currently has twenty-five top-level concepts (or "facets"), which can be used as starting points for browsing in Skosmos or, perhaps less reliably, for automated consistency checking. These are: activities, entities, events, factors, features, groups, location, measure, methods, objects, organisms, phenomena, processes, products, properties, resources, site, stages, state, strategies, subjects, substances, systems, technology, and time. Some distinctions (e.g., between objects, entities, and phenomena) may seem unintuitive, making them hard to apply consistently, especially across languages.
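One form such a consistency check could take is to follow broader relations upward from each concept and see which top concepts it reaches. The sketch below assumes a plain dictionary of broader links; the intermediate concept names are placeholders, only two of the twenty-five top concepts are listed for brevity, and treating "more than one top concept" as a problem is an editorial assumption rather than an AGROVOC rule.

```python
# Hypothetical hierarchy fragment: concept -> list of broader concepts.
# "placeholder A/B" and the last two entries are invented for illustration.
BROADER = {
    "land conflicts": ["placeholder B"],
    "placeholder B": ["placeholder A"],
    "placeholder A": ["phenomena"],
    "phenomena": [],
    "activities": [],
    "mixed concept": ["phenomena", "activities"],
    "unplaced concept": [],
}

TOP_CONCEPTS = {"phenomena", "activities"}  # subset of the full list


def reachable_tops(concept, broader, seen=None):
    """Collect every top concept reachable by following broader links upward."""
    seen = set() if seen is None else seen
    if concept in seen:  # guard against cycles and repeated visits
        return set()
    seen.add(concept)
    if concept in TOP_CONCEPTS:
        return {concept}
    tops = set()
    for parent in broader.get(concept, []):
        tops |= reachable_tops(parent, broader, seen)
    return tops


def check_hierarchy(broader):
    """Report concepts that reach no top concept, or more than one."""
    problems = {}
    for concept in broader:
        tops = reachable_tops(concept, broader)
        if len(tops) != 1:
            problems[concept] = sorted(tops)
    return problems


print(check_hierarchy(BROADER))
```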
The working group that created the Global Agricultural Concept Space (GACS) scheme (based on the aligned AGROVOC, CAB Thesaurus, and USDA NAL Thesaurus) opted for just three top concepts: objects, events and actions, and properties, a bit like nouns, verbs, and adjectives. AGROVOC could in principle be simplified along these lines, but a closer study would be needed to verify whether the potential benefits would justify the effort required.
AGROVOC term types
Circa 44% of AGROVOC terms are assigned a "term type". These types are used to tag acronyms (Acronym) and to distinguish common names from taxonomic terms for animals, bacteria, fungi, plants, and viruses (e.g., Common name for animals, Taxonomic terms for bacteria). As with top concepts, this extra semantic structure enables consistency checks. If editors are expected to provide such structural elements when creating new concepts or terms, the elements must be easy to understand and to apply with consistency. A closer analysis could verify whether this is the case, and simpler approaches might be considered.
GACS, for example, distinguishes taxonomic terms with a separate language tag and uses concept types (not term types) to distinguish Organism, Chemical, Product, Geographical, and (for "none of the above") Topic.
Specialized concept schemes within AGROVOC
The LandVoc thesaurus is a set of 270 terms about land governance, created and maintained by the Land Portal organization as a distinct concept scheme within AGROVOC. Where concepts in AGROVOC are organized in a hierarchy structured by "type of thing", with land conflicts embedded three levels under the top concept phenomena and land conflict resolution embedded six levels under activities, LandVoc pulls together concepts scattered throughout the AGROVOC hierarchy and restructures them into a hierarchy designed for people working on land tenure, land management and land governance.
LandVoc places the concept land conflict resolution, for example, under land dispute, land conflicts, and the top concept access to land & tenure security. Simply adding these as additional broader relations directly to AGROVOC would add complexity to AGROVOC without enabling the LandVoc vocabulary to be viewed with its own hierarchy. The solution, to be supported in VocBench 3, involves the use of hierarchical relation properties that are specific to LandVoc. With this multischeme hierarchy approach, LandVoc (for example) could flexibly be viewed and edited with its customized relations, or exported with a generic SKOS hierarchy of broader and narrower relations, without muddying the hierarchy of AGROVOC itself.
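The multischeme idea can be illustrated with a small Python sketch in which the same concept carries a generic broader relation and a scheme-specific one. Everything here is an assumption made for illustration: the property name `landvoc:broader`, the placeholder AGROVOC parent, the order of the LandVoc chain, and the simplification that each concept has a single parent per property. It is not how VocBench 3 implements the feature.

```python
SKOS_BROADER = "skos:broader"
LANDVOC_BROADER = "landvoc:broader"  # hypothetical scheme-specific property

# (subject, predicate, object) statements.
TRIPLES = [
    # Generic AGROVOC hierarchy; the parent is a placeholder name.
    ("land conflict resolution", SKOS_BROADER, "agrovoc parent (placeholder)"),
    # LandVoc-specific hierarchy; the order of this chain is assumed.
    ("land conflict resolution", LANDVOC_BROADER, "land conflicts"),
    ("land conflicts", LANDVOC_BROADER, "land dispute"),
    ("land dispute", LANDVOC_BROADER, "access to land & tenure security"),
]


def ancestors(concept, triples, predicate):
    """Follow one hierarchical property upward and return the ancestor chain."""
    parents = {s: o for s, p, o in triples if p == predicate}
    chain = []
    # Stop at the top of the chain, or if a cycle would repeat an ancestor.
    while concept in parents and parents[concept] not in chain:
        concept = parents[concept]
        chain.append(concept)
    return chain


def export_generic(triples, scheme_predicate):
    """Export one scheme's hierarchy as plain skos:broader statements."""
    return [(s, SKOS_BROADER, o) for s, p, o in triples if p == scheme_predicate]


print(ancestors("land conflict resolution", TRIPLES, LANDVOC_BROADER))
print(ancestors("land conflict resolution", TRIPLES, SKOS_BROADER))
print(export_generic(TRIPLES, LANDVOC_BROADER))
```

Keeping the two properties separate means each hierarchy can be read on its own, while the export step produces a standalone generic SKOS view of LandVoc without altering the AGROVOC statements.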
Custom relations (Agrontology)
The standard model for thesauri and SKOS concept schemes distinguishes just three types of relation between concepts: broader, narrower, and related. Between 2004 and 2008, the AGROVOC team pursued the goal of "ontologizing" AGROVOC by adding relations that were semantically more specific than related. These included relations between concepts, such as isUsedAs, influences, isPathogenOf, and causes, and relations between terms, such as hasSynonym, hasNearSynonym, and spellingVariant. However, of the 179 custom relations created in an Agrontology vocabulary, only 22 are used in AGROVOC more than 500 times and many are rarely used. Few AGROVOC editors today make the extra effort required to apply the relations evenly and consistently.
It is also unclear whether the custom relations actually serve the purpose for which they were intended, which was to support more complex queries and enable inferencing (i.e., calculating additional information using logic). If the custom relations are not in fact being applied evenly and consistently, they could be removed from AGROVOC and placed into an optional module (or "link set"). Alternatively, as suggested at the meeting in Utrecht, analysis could reveal a simplified core of fifteen or so relations worth prioritizing for use. (By comparison, GACS has just two: isProductOf and hasProduct.) However, some participants supported the current framework and questioned the need for change.
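The kind of analysis suggested here could start from a simple usage count per relation. The following Python sketch assumes the custom-relation statements have been exported as (subject, relation, object) tuples; the toy statements, the placeholder concept identifiers, and the threshold value are assumptions for illustration and do not reflect actual AGROVOC usage figures.

```python
from collections import Counter

# Toy statements; the concept identifiers are placeholders.
STATEMENTS = [
    ("c1", "isUsedAs", "c2"),
    ("c3", "isUsedAs", "c4"),
    ("c5", "isUsedAs", "c6"),
    ("c1", "causes", "c7"),
    ("c2", "causes", "c8"),
    ("c9", "isPathogenOf", "c10"),
]


def relation_usage(statements):
    """Count how many statements use each relation."""
    return Counter(relation for _, relation, _ in statements)


def core_relations(usage, min_uses):
    """Relations used at least min_uses times, most frequent first."""
    return [rel for rel, count in usage.most_common() if count >= min_uses]


usage = relation_usage(STATEMENTS)
print(usage)
print(core_relations(usage, min_uses=2))
```

On real data the threshold would be chosen by the editorial community (the figure of 500 uses mentioned above is one possible cut-off), and rarely used relations could then be reviewed for removal or for placement in an optional module.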
AGROVOC concept and language coverage and curation
There is a need to expand AGROVOC subject coverage. Topics suggested for deeper coverage include:
animal science and husbandry, environmental protection and attitudes towards the environment, agricultural economics, nutrition, healthy lifestyle, ecology, agricultural technology and machinery, national agricultural heritages, gardening, pedology, biological oceanography, fisheries oceanography, agro-ecological zones, limnology, forestry, agroforestry, engineering in agriculture, microplastics, plastic pollution in the sea, standards for food safety, standards for plant health and plant protection, blue growth, agroecology, biotechnology, microorganisms, biodiversity, agricultural law, and climate change.
If you or your institution would like to contribute to the growth of AGROVOC content in your language (current AGROVOC language coverage can be found here), please don't hesitate to contact the AGROVOC team at [email protected], and we will do our best to provide you with all the necessary information.
The meeting was a great opportunity for AGROVOC editors and stakeholders to commit to supporting further achievements in expanding AGROVOC coverage and in advancing the use of AGROVOC in agriculture and related knowledge management practices.
- AGROVOC: Opportunities (Imma Subirats, FAO)
Key action priorities, and the resources needed to ensure that all AGROVOC-related goals are achieved, were also discussed. Capacity development activities, such as webinars on maintaining AGROVOC in VocBench3, have already been planned.
We thank all the delegates who attended this important event, as well as those who submitted their ideas and thoughts electronically via [email protected] and the AGROVOC mailing list (which you are cordially invited to join!).
All presentations from the AGROVOC meeting are shared via the AGROVOC EDITORS DGROUP (for active editors only); others can request them from [email protected].
Ivo Pierozzi Junior
Bonares Project, Helmholtz Centre for Environmental Research, Germany
Dr. T.V. Prabhakar
Russian AGRIS Centre at the Central Scientific Agricultural Library, Russia
Thai National AGRIS Centre and Asanee Kawtrakul of Kasetsart University in Bangkok
Department of Computer Engineering, Kasetsart University in Bangkok
Institute for Research and Technology in Food and Agriculture (IRTA), Spain
Xuefu Zhang and Guojian Xian
LIRMM, Montpellier, France
Lisette Mey, Laura Meggiolaro, and Carlos Tejo
FAO consultant, Netherlands
- Andrea Turbati, FAO consultant
- Carlos Tejo, Land Portal Foundation
- Imma Subirats, FAO
- Kristin Kolshus, FAO
- Laura Meggiolaro, Land Portal Foundation
- Lisette Mey, Land Portal Foundation
- Tom Baker, FAO consultant
- About AIMS and AGROVOC: http://aims.fao.org/agrovoc
- VocBench2: http://agrovoc.uniroma2.it:8080/vocbench2Agrovoc/
- VocBench3 (test): http://agrovoc.uniroma2.it:8080/vocbench3/
- AGROVOC in Skosmos: http://agrovoc.uniroma2.it/agrovoc/en/
- AGROVOC Web Services: http://agrovoc.uniroma2.it:8080/SKOSWS/services/SKOSWS?wsdl
- New short VB2 manual 2018: AGROVOC: working in VocBench2
- New short VB3 manual 2018: AGROVOC: working in VocBench3
- AGROVOC recommended as a Common Agricultural Data Standard for Data Sharing Context to accelerate Data-Driven Agriculture Development in Cambodia and Nepal
- Discover AGROVOC and other Data Standards on the VEST / AgroPortal platform
- What are the Needs when working with Semantic Resources? AGRISEMANTICS collection of use cases in agriculture and nutrition
- AGROVOC Thesaurus : a BACKBONE to INTEGRATE & DISCOVER interoperable DATA [Several AGROVOC use-cases have been highlighted in the RDA IGAD AgriSemantics report (2018) 'A Collection of 20 use cases in agriculture and nutrition’]
- Discover AGROVOC multilingual Thesaurus of the Food and Agriculture Organization of the UN (FAO)... in Numbers!
- Reengineering Thesauri for New Applications: the AGROVOC Example
- Visit Agritrop, an open repository of CIRAD of publications indexed by AGROVOC with notable strengths in forestry and sustainable development
- Visit the CTA Shared Content Repository enhanced with AGROVOC concepts analysis and extraction configured through multiple processing chains for parsed content
- Version 4.0.2 of VocBench was released in August 2018
- A map of agri-food data standards (Report, 2016, GODAN)
- Building an eScience Thesaurus for Librarians: A Collaboration Between the National Network of Libraries of Medicine, New England Region and an Associate Fellow at the National Library of Medicine
- LODE-BD : How to select appropriate encoding strategies for producing Linked Open Data (LOD)-enabled bibliographic data
- Meaningful Bibliographic Metadata (M2B): Recommendations of a set of metadata properties and encoding vocabularies
- AGROVOC thesaurus : one of the biggest datasets in the Linguistic Linked Open Data cloud
- LODAtlas version 1.0, developed by team Ilda at Inria, is a Web tool that helps users find linked datasets of interest through faceted browsing and keyword and URI search on the datasets' metadata and their schema-level content. The tool provides a set of interactive visualization widgets that help compare datasets along different criteria (number of triples, last update, interlinking with other datasets in the LOD cloud, etc.). Users can also get an idea of the contents of a given dataset thanks to a visual summary of the statements it contains. LODAtlas also offers a REST API that gives programmatic access to most of the data that can be visualized.
- HortiVoc, an ontology for the horticulture domain in Indian languages covering 32 crops with nearly 8,000 terms
- Semantic Interoperability
- The slides from the 18th European Networked Knowledge Organization Systems (NKOS) Workshop presentations