E-Agriculture

Question 3: What are the emerging tools, standards and infrastructures?


The new paradigm for interoperability on the web, and for building the basic layer of a semantic web, is the concept of Linked Open Data (LOD) [1].

Instead of pursuing ad hoc solutions for the exchange of specific data sets, the concept of linked open data makes it possible to express structured data in such a way that it can be linked to other data sets that follow the same principle. Examples of extensive use of linked open data technologies are the NYT and the BBC news service. Some governments, too, are pressing hard to publish administrative information as LOD.

                             


[Figure: The Linking Open Data cloud diagram]


The technology of LOD is based on W3C standards such as the "Resource Description Framework" (RDF) [2], which facilitates the exchange of structured information regardless of the specific structure in which it is expressed at the source level. Any database can easily be expressed in RDF, and so can structured textual information from content management systems. Presenting data in RDF makes it understandable and processable by machines, which are then able to mash up data from different sites. There are now mainstream open source data management tools, like Drupal or Fedora Commons, which already include RDF as a way to present data.
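To make the idea of "expressing a database in RDF" concrete, here is a minimal sketch of how one row of a publications table could be written out as RDF triples in the N-Triples syntax. It uses only the Python standard library. The `http://example.org/publication/` namespace, the column names and the sample row are assumptions made up for illustration; the mapping of columns to Dublin Core properties of the same name is one simple choice among many, not a prescribed method.

```python
# Sketch: express one row of a (hypothetical) publications table as RDF
# triples serialized as N-Triples, using only the standard library.

BASE = "http://example.org/publication/"   # hypothetical namespace for our records
DC = "http://purl.org/dc/elements/1.1/"    # Dublin Core element set namespace


def escape_literal(value: str) -> str:
    """Escape characters that are special inside an N-Triples string literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def row_to_ntriples(row: dict) -> list[str]:
    """Turn one database row into a list of N-Triples lines.

    The row's "id" becomes the subject URI; every other non-empty column
    is mapped to the Dublin Core property of the same name (an assumption
    that only holds if the columns are named after Dublin Core elements).
    """
    subject = f"<{BASE}{row['id']}>"
    lines = []
    for column, value in row.items():
        if column == "id" or value is None:
            continue
        lines.append(f'{subject} <{DC}{column}> "{escape_literal(str(value))}" .')
    return lines


# Hypothetical sample row, as it might come out of a database query.
row = {"id": 42, "title": 'Maize yields in "dry" seasons',
       "creator": "A. Author", "date": "2011"}
triples = row_to_ntriples(row)
for triple in triples:
    print(triple)
```

Because every statement carries full URIs for the record and the property, another site that uses the same property URIs can merge these triples with its own without any agreement on table layouts.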

Within the area of agricultural research for development an infrastructure to facilitate the production of linked open data is needed. The four key elements to make this possible are:

   a registry of services and data sets (CIARD RING, http://www.ring.ciard.net);

   common vocabularies to facilitate automatic data linking (thesauri, authority files, value vocabularies);

   technology (content management systems, RDF wrappers for legacy systems);

   training and capacity development

 



[1] Linked Data - Connect Distributed Data across the Web, http://linkeddata.org/ (last accessed March 2011)
[2] Resource Description Framework, http://www.w3.org/RDF/ (last accessed March 2011)

I feel compelled to appeal for consideration of an "appropriate level of technology". There is a huge gap in the capacity of institutions to deal with a lot of this technology.

A case in point: Most of the research centres in our network do not have an IT department. When you get down into the state/provincial level institutions they may not have any IT staff or information specialists at all. Whatever systems they have tend to be run by enthusiasts from some other discipline rather than professionals in this area.

This group needs simple tools that will allow them to share their data in a meaningful way (perhaps with larger systems operated by better-resourced institutions that can archive it). They need things that are easy to set up, use and maintain.

If I may labour the point, tools that require specialised server environments and complex configuration to set up and a programmer or engineer to maintain are beyond their capacity. Such resources are just not available to them.

There's a place both for 'high end' or complex tools that require the resources of a large institution to run and for 'low end' tools that don't. I guess I'm saying that the low end should not be neglected, as neglecting it will help perpetuate the digital divide.

Maybe I should offer an example. We have been doing some work to add support for Dublin Core and OAI-PMH to the content management system we use, ImpressCMS.

The first module we have released is a 'podcasting' tool for publishing audio and video recordings. As you might expect, recordings can be browsed online, downloaded, streamed or accessed with podcasting clients via RSS feed with enclosures, and shared via social media.
 
Installation is a simple two-click process. The data entry form uses unqualified Dublin Core fields to capture metadata in a convenient form. The module also supports a zero-configuration implementation of the OAI-PMH.
 
The key point is that the operator doesn't *need* to know the details of Dublin Core or understand OAI-PMH in order to establish an OAI repository with this module. It 'just works' out of the box and can be used by *non-specialists*. 
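For readers curious about what such a module handles behind the scenes, the following sketch shows what a harvester does with an OAI-PMH repository: it builds a `ListRecords` request for unqualified Dublin Core (`oai_dc`) and reads the Dublin Core fields out of the XML response. The base URL `http://example.org/oai` and the sample response are made up for illustration and are not taken from the Podcast module; the verb, the `metadataPrefix` argument and the XML namespaces are those defined by OAI-PMH 2.0 and Dublin Core.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and the Dublin Core element set.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url: str) -> str:
    """Build the request a harvester would send to fetch all records."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": "oai_dc"})
    return f"{base_url}?{query}"


def parse_records(xml_text: str) -> list[dict]:
    """Extract the identifier and Dublin Core fields of each record."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.findall(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier",
                                     default="", namespaces=NS)
        fields = {}
        for element in record.findall("oai:metadata/oai_dc:dc/*", NS):
            name = element.tag.split("}", 1)[1]   # strip the "{namespace}" part
            fields.setdefault(name, []).append((element.text or "").strip())
        records.append({"identifier": identifier, "fields": fields})
    return records


# Hypothetical, shortened response as a repository might return it.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Soil management interview</dc:title>
          <dc:creator>Example Speaker</dc:creator>
          <dc:format>audio/mpeg</dc:format>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

url = list_records_url("http://example.org/oai")   # hypothetical base URL
records = parse_records(SAMPLE_RESPONSE)
print(url)
print(records)
```

The sample is parsed from a string so the sketch runs without network access; against a live repository the same parser would be fed the body returned for the request URL.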
 
A live demo of the Podcast module is available here (info about the archive including the base URL is available here). A second more general purpose "library" module is currently in beta. Both are distributed under the GPL V2.

Johannes Keizer, FAO of the United Nations, Italy

Greetings from Brussels. I am here to participate in a meeting between the EC and some enterprises about "Linked Open Government Data". I was invited to present AGROVOC, published as a Linked Open Data set, at this meeting. There is a growing awareness, also among donors, that we need to improve the information (management) architecture in the area of agricultural research and innovation. The EC has recently offered funding for two projects in this area. The AIMS team is involved in a project called agINFRA, which is currently negotiating its budget with the Commission. agINFRA is not a research project; its aim is to create useful and usable systems and tools. The agINFRA partners also include institutions from China, India and Ecuador, so not only European ones. I am informing you about agINFRA here because I want the involvement of the entire community that is discussing here. We will need feedback to do something useful. Therefore I want to share some points of the agINFRA work programme here:

- We will invest in components like AgriDrupal and AgriOceanDspace that can be used efficiently even in institutions and working groups without an IT department or substantial IT support, and which do not require constant connectivity for use within the institution. These components will stay within their open source communities; the project will only give a boost to adapt these tools for our purposes. Later in the year, when the project is officially signed and starts, we will approach all of you to ask for participation in formulating the requirements.

- We will improve the AGROVOC VocBench to make it a serious reference system for concepts and entities in agricultural research and innovation that can easily be used by everyone, online and offline. We started two weeks ago with AGROVOC as "Linked Open Data" and we already have 20,000 links from AGROVOC LOD to other systems and vocabularies. The power of this has to be made usable for the entire community.

- We will improve what our colleagues at IIT Kanpur have developed with AgroTagger. Many of you will have seen what Thomson Reuters has developed with OpenCalais. We will make AgroTagger the OpenCalais of our community. We think that marking up agricultural content on the web as much as possible with concept URIs from AGROVOC would take us a long way forward in our ability to connect data.

- Part of agINFRA is the improvement of the CIARD RING to create a global switchboard for information services.

- agINFRA will also look at cloud computing and develop prototypes showing how cloud services with high computing power (e.g. for big sets of LOD triples) can be used to process data for partners that do not have this computing power.

These are only some highlights. All upcoming tools and technologies will have to address the heterogeneity of partners, data and services. They should not aim to overcome this heterogeneity, but to make data sharing possible within it.

The goal is not to create one big infrastructure in which all have to participate. The goal is to create components, methodologies and standards which serve to create many interoperable infrastructures.

There is a national (French) project, just started, to build a Scientific Digital Library Infrastructure that will not focus only on archiving publications.

I am involved in some of the working groups and LOD is being considered as a relevant technology.

I would have liked to go beyond geographical barriers as I think that thematic networks and infrastructures are more relevant than countries. Then it is more a funding issue...

XiaoLu SU, AII/CAAS, China

Automatic annotation can help to improve the tagging or indexing of content. If texts are annotated properly with a well-organized thesaurus, a characteristic term map can be extracted from each text. With these term maps, the contents of the texts can be distinguished from one another, and links to terms can be extended to other vocabularies, even in other languages. This makes automatic keyword tagging possible and eases the job of search engines. A network analysis method can make automatic annotation more accurate, and the thesaurus can in turn benefit from the annotation results to improve its structure.

Johannes Keizer, FAO of the United Nations, Italy

I agree very much with you! Our friends and colleagues at IIT Kanpur have developed a tool called AgroTagger, which is now also available on the web. It uses AGROVOC to analyze a text for concepts that are covered in AGROVOC and then produces AGROVOC tags/URIs, which can later be referred to.
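As a rough illustration of the general idea (not of how AgroTagger itself is implemented), a thesaurus-based tagger can be sketched as a lookup of known labels in the text, returning the concept URIs that were found. The mini-thesaurus below, including its `http://example.org/concept/...` URIs, is entirely hypothetical; a real system would load labels and URIs from AGROVOC and would need proper language processing rather than plain string matching.

```python
import re

# Hypothetical mini-thesaurus: label -> concept URI. Two labels may point
# to the same concept (synonyms). These are NOT real AGROVOC URIs.
CONCEPTS = {
    "maize": "http://example.org/concept/maize",
    "corn": "http://example.org/concept/maize",
    "soil fertility": "http://example.org/concept/soil-fertility",
    "irrigation": "http://example.org/concept/irrigation",
}


def tag_text(text: str, concepts: dict = CONCEPTS) -> list[tuple[str, int]]:
    """Return (concept URI, number of mentions), most frequent first."""
    # Lower-case the text and reduce it to words separated by single spaces,
    # so that multi-word labels match across punctuation-free word runs.
    normalized = " ".join(re.findall(r"\w+", text.lower()))
    counts: dict[str, int] = {}
    for label, uri in concepts.items():
        pattern = r"\b" + re.escape(label) + r"\b"
        hits = len(re.findall(pattern, normalized))
        if hits:
            counts[uri] = counts.get(uri, 0) + hits
    # Sort by descending count, then by URI to keep the order deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


sample = "Maize and corn yields depend on soil fertility; irrigation helps maize."
tags = tag_text(sample)
for uri, count in tags:
    print(count, uri)
```

Because the output is a list of URIs rather than free-text keywords, the same concept is identified regardless of which synonym (or, with a multilingual thesaurus, which language) appeared in the text.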

XiaoLu SU, AII/CAAS, China

We have also developed an auto-tagging system, using the China agriculture thesaurus. It focuses on Chinese-language processing. At present it runs as a Windows application and cannot be accessed from the Internet. We plan to develop a web version soon and hope it can interoperate with AgroTagger. If AgroTagger cannot handle Chinese, I think our system may be of some help.

Johannes Keizer, FAO of the United Nations, Italy

Wow! This is very valuable information. As far as I know, AgroTagger at the moment processes English, Hindi and French, but having Chinese would be essential. We should set up a collaboration on this.

AgroTagger is part of the agINFRA FP7 project. CAAS AII and IIT Kanpur are both part of agINFRA :-) So this is a very good opportunity to create multilingual automatic indexing for agricultural information!

 

AgroTagger was developed in collaboration with FAO and ICRISAT. For a given document it can suggest keywords from AGROVOC or from a small subset of AGROVOC (about 3000 terms, named agrotags). You can check it out at http://agropedialabs.iitk.ac.in/Tagger/. The documents can be in formats such as PDF or DOC. This software is also mirrored at MIMOS Malaysia: http://202.73.13.50:58300/agroTagger/

 

Karin Nichterlein, Food and Agriculture Organization of the United Nations (FAO), Italy

Dear all,
let me present a concrete information and communication system that focuses on sharing information on problems, technologies and good practices relevant to small producers. This system is the trilingual platform on Technologies for Agriculture (TECA), established by FAO in English, French and Spanish, which was first introduced as a supply-driven repository for technologies validated with producers. An evaluation revealed that TECA is a good system for documenting and sharing otherwise dispersed information on technologies for smallholders, with the advantage of providing a standard framework for description. The evaluation also found that the platform lacked interactivity and needed to become demand driven.
A new TECA version in Drupal was launched that combines the repository with communication tools, allowing users to exchange technology and information needs and to share experiences. The new platform and its new functions, such as an exchange group function, were tested a) with rural users in a field pilot in Uganda and b) with a community focusing on beekeeping and marketing, in close collaboration with the International Federation of Beekeepers' Associations. Both groups used facilitators to encourage the sharing of experiences and technologies. Lessons learned are being used to further improve the platform, making practical information more easily accessible (low bandwidth use) and understandable for those working closely with smallholders: producer organizations, public and private extension agents, NGOs, training institutions, research systems or institutions, input suppliers, marketing agencies, etc.
While improving the TECA platform http://teca.fao.org/ (including upgrading it to Drupal 7), we are also revising the TECA guidelines. We will soon have a package ready, consisting of the improved TECA Drupal components and the guidelines, that will help to integrate TECA into national programmes, institutions, development projects etc. for sharing technologies and good practices with those most in need and often neglected: rural people, smallholders and those directly supporting them.
For the content management system we use for the platform, we collaborate closely with the AgriDrupal community. The platform under revision will also be adjusted to the latest AGROVOC developments.

Kind regards

Karin Nichterlein

FAO, Research and Extension Branch