Hugo Besemer

Organization Self employed/ Wageningen UR (retired)
Organization type University
Organization role
Data management specialist
Country Netherlands (Kingdom of the)
Area of Expertise
data management
FAIR data

One of my first jobs involved writing an AGRIS manual with a pen, and I have been involved with agricultural information management ever since. I have tried to keep up with the technical developments while keeping in touch with the content side of things. Half of my time I am working for Wageningen University and Research Centre in the Netherlands, the other half as a free agent doing consultancies for international organizations like FAO, DFID, CGIAR, CTA and IMARK. Currently my work involves data repositories, linked data and citation analysis.

This member participated in the following Forums

Forum: "Building the CIARD Framework for Data and Information Sharing" April, 2011

Question 1: What are we sharing and what needs to be shared?

Submitted by Hugo Besemer on Thu, 04/07/2011 - 11:24

I have been scanning through the messages. It is quite a kaleidoscope and I may have overlooked something, but I miss one dimension: creating new knowledge by combining things from different sources.

Let me take the example of climatic change:

Climatic change affects the environment in which agriculture operates. But agriculture also contributes to climatic change by emitting CO2, methane etc. And it can also mitigate climatic change by absorbing gases like CO2, N2O etc. To get an overview of how all these processes interact one has to combine

- climatic data

- economic / social data at farm level

- spatial information

- crop growth data

and more. An example of an attempt to develop a model combining these things is http://www.seamless-ip.org/

Scientists from different disciplines will have to learn to talk to each other, as they are used to communicating in the very specific language of their specialization. A biomathematician said about working with social scientists: "you really have to be patient and like each other to make it work". But can we make the different types of data talk to each other? That is an issue that we can discuss under question 2 on this forum.

 

 

Submitted by Hugo Besemer on Wed, 04/06/2011 - 14:10

"Data" is maybe too general. Maybe we can distinguish

- statistics

- data from research

- catalogues of things (e.g. germplasms)

........

 

I assume that by resources you mean links to information sources?

Submitted by Hugo Besemer on Tue, 04/05/2011 - 10:02

Peter, can you tell more about your struggle to get datasets? In Wageningen we can rather easily get datasets of supplementary data with articles (publishers charge for that). We are now also working on data from groups that worked on models that combined data from different domains (e.g. farm, climate, crop models), but we do not really know what to do to get the thousands of other datasets that are on people's computers and that nobody knows about but themselves.

Submitted by Hugo Besemer on Mon, 04/04/2011 - 21:15

We cannot ignore mundane issues. The aim of this and similar initiatives is to motivate people to make public things that they have not made public until now. Making datasets public requires work. The originators have to add metadata for discovery and document the data, i.e. explain how it was collected, what the parameters are, which files there are and what their technical format is, etc. DDI was mentioned in the introduction; it is one of the initiatives to standardize that process, at least for the social sciences. So we have to think about the incentives. Scientists are much more pressed than they used to be, say, twenty years ago, struggling with time registry forms and competing for funds. They write publications anyway, and apart from the wish to share discoveries they do so for their scientific reputation, so indirectly to gain a next round of funding. There is not necessarily a convergence of interest with commercial publishers. With the exception of textbooks, authors of scientific works do not get paid better if more copies of their works are sold. So one can at least argue that open access publishing may make their work more visible, and therefore may help with their reputation. I hope that these discussions will help to justify that scientists or their institutions invest in data curation. Therefore we should think about priorities and, as our colleague from the Global Rust Foundation said, potential audiences.







 

Submitted by Hugo Besemer on Mon, 04/04/2011 - 12:33

Another angle is: can we select what is stored in a repository and what is not?
Last year a study by the two major Dutch data repositories (DANS and 3TU Datacentrum) was published: "Selection of Research Data: guidelines for appraising and selecting research data" ( http://www.surffoundation.nl/nl/themas/openonderzoek/cris/Documents/SUR… ). These guidelines have to be rather general, as they have to cover the wide variety of data that science produces, including observational, experimental, model or simulation data, man-made versus machine-made (sensor) data, raw data or processed data. They distinguish between obligations (for re-use, verification, or general purposes such as cultural heritage or accountability). If no such obligation exists they point at uniqueness as an important reason for preservation. "In fact, these are all domains in the arts and sciences where research cannot be recreated". The opposite of uniqueness is repeatability of data generation: in experimental science there is a tendency to repeat observations after some time, probably with better techniques.
Where is agricultural data on this continuum? A lot of it is simply unique, like earth observation data. There are also an awful lot of field trials and cultivar tests. They are to some extent repeatable, but it takes time as crops take time to grow. If a proportion of that could be made available to the world in a sensible way we could mobilize a lot of "hidden knowledge".

Submitted by Hugo Besemer on Mon, 04/04/2011 - 12:23

There is a different angle: how does all this fit in the internal processes of a research group? At what stage are there sharable products? The UK Research Information Network and the British Library have published a study on information exchange patterns within 7 research groups in the life sciences. See http://www.rin.ac.uk/our-work/using-and-accessing-information-resources… . The first thing that struck me is the limited role of literature, bearing in mind that the scientific performance of researchers is assessed to a large extent on the basis of their publications and how they are cited. But most important are of course the diverse ways of working, and one of the most important conclusions is, in my view:
"- Data and information sharing activities are mainly driven by needs and benefits perceived as most important by life scientists rather than top-down policies and strategies
- There are marked differences in the patterns of information use and exchange between research groups active in different areas of the life sciences, reinforcing the need to avoid standardised policy approaches"

Question 2: What are the prospects for interoperability in the future?

Submitted by Hugo Besemer on Wed, 04/06/2011 - 08:28

Linked  Open Data is probably the way to go. But there is a chicken-and-egg dilemma here: why would people make the investment and expose their data if nobody comes to use it and there is little data to combine with?

I think CIARD, or the agricultural information community in general, can play a role. I can think of at least two ways:

- Formulate and find funding for projects that use LOD as a technology and that solve real life problems. The way to engage should be exposing your data in the right format.

- As Diane pointed out (and I hinted at it), documenting the data (how it is collected and what the parameters mean) is a lot of work and scientists need to provide most of it. Translating to LOD is still more work: you do not just have to say what the rows and columns in your spreadsheet mean, you should also think of the right encodings (URIs) for the things, properties and values. But this is something where the community (through CIARD or otherwise) can help by developing guidelines. AIMS has made a start by working on guidelines for the exchange of bibliographic information (LODE), but what about other forms of information typically exchanged for agriculture, like field trials, soil surveys, farm data etc.?
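To make the second point a bit more concrete, here is a minimal sketch of what "encoding things, properties and values as URIs" could look like for a small field trial spreadsheet. Everything in it is an assumption for illustration only: the `http://example.org/` namespace, the column names (`trial_id`, `crop`, `yield_t_ha`) and the mapping tables are hypothetical and do not come from any existing guideline. It uses only the Python standard library and writes simple N-Triples-style lines.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace; a real project would use URIs it controls.
BASE = "http://example.org/"

# Hypothetical mapping from spreadsheet column headers to property URIs.
PROPERTY_URIS = {
    "crop": BASE + "property/crop",
    "yield_t_ha": BASE + "property/yieldTonnesPerHectare",
}

# Hypothetical coded values: cell contents that name a "thing" get a URI
# instead of being written as a plain text literal.
VALUE_URIS = {
    "crop": {"wheat": BASE + "crop/wheat", "maize": BASE + "crop/maize"},
}

# Tiny made-up sample standing in for a spreadsheet export.
SAMPLE = "trial_id,crop,yield_t_ha\nT1,wheat,7.2\nT2,maize,9.5\n"


def row_to_triples(row, id_column="trial_id"):
    """Turn one spreadsheet row into subject-property-object lines."""
    subject = f"<{BASE}trial/{quote(row[id_column])}>"
    triples = []
    for column, prop in PROPERTY_URIS.items():
        value = row[column]
        coded = VALUE_URIS.get(column, {}).get(value)
        if coded is not None:
            obj = f"<{coded}>"  # the value is a thing with its own URI
        else:
            # Otherwise keep the value as a quoted literal, escaping
            # backslashes and double quotes.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        triples.append(f"{subject} <{prop}> {obj} .")
    return triples


def convert(csv_text):
    """Convert CSV text with a header row into a list of triple lines."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lines.extend(row_to_triples(row))
    return lines


if __name__ == "__main__":
    print("\n".join(convert(SAMPLE)))
```

The code itself is trivial; the real work sits in the two mapping tables, and that is exactly the part where shared guidelines for field trials, soil surveys or farm data could save each group from inventing its own URIs.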

I am aware that I am inclined to talk about datasets in the first place, as that is what I am working on at the moment. But I think much of this is also applicable to other forms of information, like news, project descriptions etc.

