E-Agriculture

Question 1: What are we sharing and what needs to be shared?

The landscape of information and data flows and repositories is multifaceted. Peer-reviewed journals and scientific conferences are still the basis of scholarly communication, but science blogs and social community platforms are becoming increasingly important. Research data are now increasingly managed using advanced technologies, and the sharing of raw data has become an important issue.

This topic thread will address and discuss details about the types of information that need to be shared in our domain, e.g.:

  Information residing in communications between individuals, such as in blogs and
community platforms supported by sources such as directories of people and
institutions;

  Formal scientific data collections as published data sets and their associated
metadata and quality indicators, peer-reviewed scholarly journals or document
repositories;

  Knowledge "derivatives" such as collections of descriptions of agricultural
technologies, learning object repositories, expertise databases, etc.;

  And surely more...

[Figure: Schema of data repositories and flows in agricultural research and extension.]

Data flows

There are several interesting examples of successful data exchange between distributed datasets, some of them in the area of agricultural research and innovation. There are also ambitious attempts that still have to live up to expectations. A common characteristic of most examples is that they are based on specific ad-hoc solutions rather than on a general principle or architecture. They therefore require coordination between "tightly coupled" components, which limits the possibilities of re-using the datasets elsewhere and of replicating the experiment.

In some areas there are global platforms for sharing and interoperability. Some of these address the need to access scholarly publications, mostly those organized by the publishers, and others address the interfacing of open archives. With regard to standards and services in support of interoperability, there are several very successful initiatives, each dealing with different data domains. Among document repositories, the most successful initiative is surely the Open Archives Initiative (OAI) Protocol for Metadata Harvesting, used by a global network of open archives. The strength of this movement is changing the face of scholarly publishing. Geospatial and remote sensing data have strong communities that have developed a number of wildly successful standards, such as those of the OGC, that have in turn spurred important open source projects such as GeoServer. Finally, in relation to statistics from surveys, censuses and time-series, there has been considerable global cooperation among international organizations, leading to initiatives such as SDMX and DDI, embraced by the World Bank, IMF, UNSD, OECD and others.

Singer System [1], GeoNetwork [2], and GeneOntology Consortium [3] are examples of successful initiatives to create mechanisms for data exchange within scientific communities. The SDMX [4] initiative aims to create a global exchange standard for statistical data.

There are more examples, but these advanced systems cannot have a strong impact on the average (smaller, less capacitated) agricultural information systems, because overall there are no easy mechanisms and tools for information systems developers to access, collect and mash up data from distributed sources. An infrastructure of standards, web services and tools needs to be created.

 


1 Singer System, http://singer.cgiar.org/ Last accessed March 2011
2 GeoNetwork, http://www.fao.org/geonetwork/srv/en/main.home Last accessed March 2011
3 GeneOntology Consortium, http://www.geneontology.org/ Last accessed March 2011
4 SDMX, http://sdmx.org/ Last accessed March 2011

Anyone mind if I start?

Hi, Simon Wilkinson from NACA, a regional inter-governmental organisation that facilitates cooperation in aquaculture R&D between member states.

NACA is a highly distributed organisation with a limited budget, so we have a great interest in sharing information and experience electronically, although most of the research centres in the network still have limited capacity to make use of IT. Aquaculture is lagging a long way behind terrestrial agriculture in this regard.

The main thing we share in electronic form is aquaculture publications. Since 2002 we have had a policy of making all publications available for free download from our website (www.enaca.org), mainly in PDF. More recently we have also started publishing audio recordings of technical presentations in MP3 for download/streaming and podcast feeds.

The NACA website is produced with a conventional content management system (CMS). Like most such tools, it was designed to build websites to view on screen. It is Google-friendly, ties into various social media and is great for presenting information to people. However, it doesn't follow any accepted metadata standard and does not offer any structured way to share data with machines.

We would like to share/federate our publication metadata with other digital libraries. So we have been doing some work to develop publication modules for the CMS that i) use standard Dublin Core metadata fields and ii) support the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). These modules are released as open source projects.
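For readers unfamiliar with the protocol, here is a minimal sketch of what a harvester does with a repository that exposes Dublin Core over OAI-PMH. It is an illustration only, not the NACA module: the base URL, the sample record and the function names are hypothetical. Only the `ListRecords` verb, the `oai_dc` metadata prefix and the XML namespaces come from the OAI-PMH and Dublin Core specifications.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the request a harvester sends to list all records as Dublin Core."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def parse_dc_records(xml_text):
    """Return one dict per record, mapping DC field names to lists of values."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        fields = {}
        for dc in record.iter(f"{{{OAI_DC_NS}}}dc"):
            for child in dc:
                if child.tag.startswith(f"{{{DC_NS}}}"):
                    name = child.tag.split("}", 1)[1]  # strip the namespace
                    fields.setdefault(name, []).append((child.text or "").strip())
        records.append(fields)
    return records


# Hypothetical response, shaped like an OAI-PMH ListRecords reply.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Hypothetical aquaculture manual</dc:title>
          <dc:creator>Example Author</dc:creator>
          <dc:format>application/pdf</dc:format>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # example.org is a placeholder, not a real repository endpoint.
    print(build_list_records_url("http://example.org/oai"))
    for rec in parse_dc_records(SAMPLE_RESPONSE):
        print(rec)
```

A real harvester would fetch the URL over HTTP and follow resumption tokens for large result sets. Both are left out here to keep the sketch self-contained.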

Regards

Simon Wilkinson
Communications Manager
Network of Aquaculture Centres in Asia-Pacific

 

Hugo Besemer, Self employed / Wageningen UR (retired), Netherlands

There is a different angle: how does all this fit into the internal processes of a research group? At what stage are there sharable products? The UK Research Information Network and the British Library have published a study on information exchange patterns within 7 research groups in the life sciences. See http://www.rin.ac.uk/our-work/using-and-accessing-information-resources/... . The first thing that struck me is the limited role of literature, bearing in mind that the scientific performance of researchers is assessed to a large extent on the basis of their publications and how they are cited. But most important is of course the diversity in ways of working, and one of the most important conclusions is, in my view:
"- Data and information sharing activities are mainly driven by needs and benefits perceived as most important by life scientists rather than top-down policies and strategies
- There are marked differences in the patterns of information use and exchange between research groups active in different areas of the life sciences, reinforcing the need to avoid standardised policy approaches"

Valeria Pesce, Global Forum on Agricultural Research (GFAR), Italy

This study on information exchange patterns gives an interesting perspective...

Of course, given its scope, this study is focussed on exchange among scientists.
I think what most of the other stakeholders in agriculture (donors, extensionists, farmers) are concerned with is the sharing of heterogeneous information (scientific, raw data, management data) among different types of actors.
So, in my opinion, even if top-down information sharing policies apparently do not influence the way scientists exchange information, they may well be needed if, in the view of donors or project managers, it is essential that scientific information is shared beyond the scientific community.

I completely agree with you!

Diane Le Hénaff

Andrianjafy Rasoanindrainy, Farming & Technology for Africa, Madagascar

Hi All,

I work at three different levels (locally as a farmer, nationally as an adviser and consultant for projects and government, and internationally as a consultant and "positive observer"). It's a difficult choice I made a few years ago, but sometimes in life you have to live that way to be more coherent with yourself (if you have the courage), and to be more accurate in your "perception" of the outside world.

I'm more a computer scientist and began to be seriously involved in ARD after the food crisis in 2008.

I agree with Valeria and will go further: we mustn't forget that this consultation comes with the interesting title "BUILDING THE CIARD FRAMEWORK FOR DATA AND INFORMATION SHARING" ...... yes and ... FOR WHAT? WHY ARE WE SHARING INFORMATION? FOR WHOM?

Many skilled technicians and recognized scientists will forget why they are doing research and why they must share.

I'm based in Madagascar and am becoming more familiar with the African continent.

The context is very different from what we see in developed countries. Very few of the researchers in the West will understand the culture and the livelihood of these researchers.

Institutional data and information are sometimes considered personal goods by researchers because they consider they are not paid for what they are producing.

To share, a researcher must be convinced he has enough to share with others. But in poor countries, researchers never reach that level of sufficiency.

Why?

1) Because they compare themselves to other researchers from the countries where they've studied (rich countries). They look at European or American researchers, who are very well paid.

2) Because many don't see the small farmers who need their discoveries and who survive with an income 10 - 20 times lower.

As long as researchers do research with "my interest-first" in mind, they won't share. And as long as these researchers are contaminated by the high standards from western systems, they won't be motivated.

The CIARD FRAMEWORK must consider these dimensions: HUMAN, SOCIO-CULTURAL, SOCIO-ECONOMICAL if we want to build an inclusive, strong and coherent framework. Not just technical.

 

Hugo Besemer, Self employed / Wageningen UR (retired), Netherlands

Another angle is: can we select what is stored in a repository and what is not?
Last year a study from the two major Dutch data repositories (DANS and 3TU Datacentrum) was published: "Selection of Research Data: guidelines for appraising and selecting research data" ( http://www.surffoundation.nl/nl/themas/openonderzoek/cris/Documents/SURF... ). These guidelines have to be rather general, as they have to cover the wide variety of data that science produces, including observational, experimental, model or simulation data, man-made versus machine-made (sensor) data, and raw or processed data. They distinguish between obligations (for re-use, verification, or general purposes such as cultural heritage or accountability). If no such obligation exists they point at uniqueness as an important reason for preservation: "In fact, these are all domains in the arts and sciences where research cannot be recreated". The opposite of uniqueness is repeatability of data generation: in experimental science there is a tendency to repeat observations after some time, probably with better techniques.
Where is agricultural data on this continuum? A lot of it is simply unique, like earth observation data. There are also an awful lot of field trials and cultivar tests. They are to some extent repeatable, but it takes time, as crops take time to grow. If a proportion of that could be made available to the world in a sensible way, we could mobilize a lot of "hidden knowledge".

Dear Simon,

You've taken the first step by exposing your output in DC via an OAI module on the CMS. But that doesn't make the data "open". It is the dissemination step ...

I think the second step should go further with:

-> data and not only publications.

-> standardization using model & format that allow information sharing and reusability

It is not easy, especially because current applications are based on relational databases and on specific models & formats. The database you presented is an example.

How?

-> about data: we should integrate publications with projects, organisation, social network, researchers, raw data ...

See the CERIF model: http://www.eurocris.org/Uploads/Web%20pages/CERIF2008/CERIF2008_1.1_XML.pdf

See the VOA3R project that aims at integrating social network with open access publications...: http://voa3r.eu

-> by producing data under LOD (Linked Open Data)

AIMS is working on an excellent handbook called LODE Recommendations: a report on how to produce Linked Open Data (LOD)-enabled bibliographical data.

It is very strategic to be able to share and reuse....

With kind regards,

Diane Le Hénaff, INRA

Johannes Keizer, FAO of the United Nations, Italy

I am a researcher on organophosphorus pesticides. I am describing my desktop.

In the morning when I come into the lab, it is still empty. In the topic box in the left-hand corner of my desktop I write "Diazinon". Immediately my desktop starts getting populated. In the centre, the manuscript I am working on at the moment appears from my cloud server. In the lower part the desktop splits into two boxes: one contains all the papers I am citing in my manuscript, the other a comprehensive bibliography on Diazinon. These streams come from open archives and commercial journal publishers. Obviously I can filter the bibliography with one click by journal, author or subtopic, and I can also influence the relevance ordering in which the entries appear.

On the right-hand side of my desktop I find my datasets from the last toxicity test and enzyme activity measurements on Diazinon. Not only my datasets are there, but also the datasets of my two colleagues in the USA and China, who are working on the same questions. With one click I can align my datasets with theirs and get them processed on a powerful computer in the cloud. So in real time I can get data for the manuscript I am writing.


 

Johannes Keizer, FAO of the United Nations, Italy

Another box on my desktop contains an RSS feed with news from all the science blogs on Diazinon and OP pesticides, with the newest informal communications from colleagues in the community. It goes without saying that all conferences are also listed in another box. Both information streams have built-in filter mechanisms.

As I am also interested in the social implications of pesticide use, I have another RSS stream with pesticide-related news from 12 selected newspapers and magazines.

Another stream that I normally keep activated covers funding possibilities in the area... Money for our research is always needed. This stream not only lists funding agencies, but also gives information about conditions and about people who could help in formulating the project.

There are many other features which I did not activate, because they are not closely related to my work:

- Legal situation in different countries on OP pesticides

- research groups and researchers

- tables with import restrictions from the Rotterdam Convention

- List of producers

- existing textbooks

and more......

 

Well, when I was a researcher on Diazinon, I did not have this environment. I could not even dream about it, because we could not imagine what would be possible 20 years later. Now, this is no longer science fiction. We can create such a working environment if we want. We have to agree on standards and methodologies. The technology is already there.

Andrianjafy Rasoanindrainy, Farming & Technology for Africa, Madagascar

Let me share what researchers in African research centers often say (to put the digital divide in perspective):

- we don't have enough computers at work (this doesn't apply to CG centers based in Africa)

- we don't have internet connection or we have very weak connection

- we have no training to use specialized, scientific software ...

Of course, this comes with all the consequences. What I see is:

- very few researchers know about Web 2.0

- very few know about the power of using networks and a central server, using groupware ... well, all the basics for developed countries.

So, Keizer, as for dreaming of exploiting the power of today's computing equipment with intelligently structured information: for Africa, that dream is still far off.

This is good to know when we ask African researchers to share information, if we really want them to share.