John Fereira
| Organization | Cornell University |
|---|---|
| Organization type | University |
| Country | United States of America |
I am a programmer/analyst/technology strategist at the Albert R. Mann Library at Cornell University (since 1995). I have been working with technology since 1975 (building/testing Pong games for Atari). My primary focus in my current position relates to system architectures and "middleware" software which facilitates data exchange between systems. I have a specific focus on Agriculture Information Systems, with a strong interest in improving the acquisition and dissemination of Agriculture Information to meet the needs of those with significant technological and other barriers, using the technologies most suitable for those environments (e.g. cell phones).
This member participated in the following Forums
Forum: "Building the CIARD Framework for Data and Information Sharing", April 2011
Question 4: What actions should now be facilitated by the CIARD Task Forces?
As some of you may know, the CIARD RING site, as well as several other AIMS-related sites, is based on Drupal, specifically version 6 of the Drupal CMS.
Although there has been much more active RDF development in the Drupal 7 release, there is a significant amount of development for Drupal 6. I have a copy of the CIARD RING site on a development machine that I have been using to RDFize the site, and I have made a considerable amount of progress.
On Monday Valeria and I will be having a Skype call to discuss how to implement the additions that I have made on my dev site in the production CIARD RING site. Once that happens, much of the content in the RING site will be exposed as RDF and the site will support a SPARQL endpoint for queries on the data contained in the RING.
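As a rough sketch of what a client request to such an endpoint might look like once it is live: the endpoint address below is a placeholder (the production URL is not given in this post), and the query is a generic one that needs no knowledge of the vocabulary the RING will use. Under the SPARQL protocol the query text travels in a `query` parameter, and the `Accept` header asks for JSON results. Nothing is sent over the network here; the code only builds the request.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder address (assumption): the real RING endpoint URL is not stated here.
ENDPOINT = "http://example.org/sparql"

# Generic query: list ten arbitrary triples, whatever vocabulary is in use.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_sparql_url(endpoint: str, query: str) -> str:
    """Return a GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = build_sparql_url(ENDPOINT, QUERY)

# Ask for the standard SPARQL JSON results format; the request is built, not sent.
request = Request(url, headers={"Accept": "application/sparql-results+json"})
print(request.full_url)
```

Passing `request` to `urllib.request.urlopen` would perform the actual query against a real endpoint.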
Moving forward, I have done a bit of investigation into Drupal 7, and specifically into any roadblocks that still exist that would prevent a migration from D6 to D7. There are still a few obstacles, but once those get resolved and D7 becomes more mature, using Drupal 7 as a content delivery mechanism means that all content in the site can be shared as RDF, because RDF is built into the core of the system.
Wow. There have been lots of really good discussions on all of the questions so far and it's good to see that we're still going strong when addressing this final question.
I'd like to start by considering who *is* CIARD anyway? To me, CIARD is a community of practitioners. The criteria for becoming part of the community, or more officially a CIARD Partner, essentially boil down to an agreement to follow a set of fairly simple principles. Organizations can also be included in the CIARD RING and describe the services they provide in terms of subject areas, geographic location, the technologies they employ, etc. There are CIARD Task Forces, but even though I've been an active participant in the CIARD Content Management Task Force (CMTF), I'm not sure I could name all of the members of the CMTF as it exists today. Clearly there are some communication issues even in our own house.
So what can CIARD do? One area is to facilitate communication between those in the CIARD community. As a technical contributor (with admin rights) to the RING itself I have seen a few limitations in how the RING can be used, as Valeria put it, for "Sharing information on what we are sharing and how". Those of you who have registered your organizations with the RING and have added a service have probably noticed that "we" ask a lot of questions. However, even when an organization fills out all the "required" fields, there is still a lot of information about the information being shared that is not being captured.
For example, if one looks at the "How To" section on the RING there is an "instructions" link which provides a view of all of the instructions that organizations have provided about how to use their services. In most cases, a boilerplate answer was provided regarding who could contribute to and/or consume information from their service. That's probably not as useful as the base URI for the OAI-PMH provider an organization might support.
On the front page of the site you'll find a map of services that have been registered in the RING. However, when I looked at the complete list of services a while back, a significant number of them did not provide location information and thus do not appear on the map.
In both of these examples, an organization initiates communication with the RING, the site captures some information, then the communication stops. I don't have a formal solution, but it would help to have some mechanism where a working group could be established to follow up on submissions to the RING and maintain communication beyond a "Thank you for submitting your information" response. For example, if a service indicates that it provides an RSS feed or that it is an OAI Provider, a follow-up email asking for the appropriate URLs, and capturing that information such that it's easily accessible by other RING partners, would help improve the "How To" section.
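To make the follow-up idea a little more concrete, here is a minimal sketch of how missing URLs could be detected and a follow-up message drafted. Everything in it is an assumption for illustration: the `Service` record is a simplified stand-in and not the RING's actual data model, the "RSS feed" label and field names are made up, and the message wording is only a placeholder.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Service:
    """Simplified, hypothetical stand-in for a service entry in the RING."""
    name: str
    contact_email: str
    service_types: List[str]              # e.g. ["OAI Provider", "RSS feed"]
    oai_base_url: Optional[str] = None
    rss_url: Optional[str] = None


def missing_urls(service: Service) -> List[str]:
    """List the access URLs a service has implied but not supplied."""
    missing = []
    if "OAI Provider" in service.service_types and not service.oai_base_url:
        missing.append("OAI-PMH base URL")
    if "RSS feed" in service.service_types and not service.rss_url:
        missing.append("RSS feed URL")
    return missing


def follow_up_message(service: Service) -> Optional[str]:
    """Draft a follow-up email body, or return None if nothing is missing."""
    missing = missing_urls(service)
    if not missing:
        return None
    return (
        f"Thank you for registering {service.name} in the CIARD RING.\n"
        f"Could you also send us the following: {', '.join(missing)}?"
    )


# Example entry: claims both service types but only supplies the feed URL.
example = Service(
    name="Example Repository",
    contact_email="admin@example.org",
    service_types=["OAI Provider", "RSS feed"],
    rss_url="http://example.org/feed.xml",
)
print(follow_up_message(example))
```

A working group could run a check like this over new submissions and review the drafted messages before anything is sent.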
The other area where I have contributed is the creation of a couple of tutorials on the RING. I created one which had some boilerplate code for OAI-PMH harvesting and another which explains how to configure the Drupal Feeds module to consume an AgriFeeds RSS feed. However, those are the only tutorials available via the RING. It seems to me that if we could encourage CIARD RING organizations to contribute tutorials on other topics, the RING could be a wealth of information on how to interoperate between similar systems. In general, if we can find a way to encourage more content created by CIARD partners, the system will become much more robust.
A third area that could be explored would be how to capture and share information related to people, and all the related information about a person such as subject expertise, technical expertise, etc. When registering with the RING, the user profile information is pretty limited. VIVO has been mentioned a few times in our discussions and perhaps could be used to manage the user profiles of those that are part of the CIARD community. So, for example, one could ask "who else is using ImpressCMS to disseminate information on Aquaculture" and the RING could provide an answer.
Anyway, those are just a few general ideas to toss out, but they all boil down to how CIARD can improve communication to facilitate information sharing.
Question 2: What are the prospects for interoperability in the future?
When Krishnan brought up the topic of interoperability among people I thought that might be a good opportunity to introduce (for those that are not familiar with it) a project developed out of Cornell called VIVO (http://www.vivoweb.org). I'm hoping that my boss (the original developer and current development manager) will chime in to provide greater detail, but VIVO is an open source semantic web application originally developed and implemented at Cornell. When installed and populated with researcher interests, activities, and accomplishments, it enables the discovery of research and scholarship across disciplines at that institution and beyond.
There is currently a large NIH-funded VIVO project underway that involves seven institutions and aims to create a national network of scientists that will facilitate the discovery of researchers and collaborators across the country. Essentially, it is being implemented to facilitate interoperability among people.
Although VIVO has not yet found its way into the Agricultural Information Systems domain in any sort of production environment, there has been a great amount of interest.
For example, during a recent visit to several institutions in Costa Rica we talked about developing a community of experts system that might involve institutions associated with the SIDALC project in Latin America. That includes 158 institutions in 22 different countries.
There is also a project at the United States Department of Agriculture (USDA) that has committed to using VIVO to create a one-stop shop for federal agriculture expertise and research results. Here's the official announcement: http://www.ars.usda.gov/is/pr/2010/101005.htm
Personally, I've done a bit of work integrating VIVO with Drupal-based systems and creating a "Semantic Services" project that is being used in a few Cornell departments to provide faculty information to students.
When talking about interoperability among people, I think taking a serious look at VIVO is warranted.
I am somewhat familiar with the eScienceNews system and, although I haven't looked at the underlying technologies the site uses for its implementation, I have a pretty good guess as to what it's doing. I suspect that it's using a system called OpenCalais (http://www.opencalais.com/), a web service that does a semantic analysis of "documents" using natural language processing, machine learning and other methods to provide entities/tags that are delivered back to the client. Those tags can then be used to enhance the discovery of the documents by providing information on what each document is about.
When we're talking about where we can go in the future in sharing information, tools such as OpenCalais that let the machine do some of the work to improve interoperability and the discovery of information will become quite valuable. Another project that I am familiar with is an AgroTagger system, which essentially uses a similar text analysis approach and then applies AgroVoc terms for tagging the document.
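To illustrate only the general idea of tagging a document against a controlled vocabulary, here is a deliberately naive sketch. It is not how OpenCalais or AgroTagger work internally (those rely on much more sophisticated text analysis), and the term list is a tiny made-up sample rather than real thesaurus data.

```python
import re
from typing import List

# Tiny illustrative term list (assumption); a real tagger would load a full thesaurus.
VOCABULARY = ["aquaculture", "wheat", "rust diseases", "soil fertility"]


def tag_document(text: str, vocabulary: List[str]) -> List[str]:
    """Return the vocabulary terms that occur in the text as whole words or phrases."""
    lowered = text.lower()
    found = []
    for term in vocabulary:
        # \b keeps "wheat" from matching inside "buckwheat", for example.
        pattern = r"\b" + re.escape(term.lower()) + r"\b"
        if re.search(pattern, lowered):
            found.append(term)
    return found


sample = "New findings on wheat yields and soil fertility in smallholder farms."
print(tag_document(sample, VOCABULARY))
```

Plain string matching like this misses synonyms, plurals and other languages, which is exactly the gap that tools built on natural language processing are meant to close.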
Question 1: What are we sharing and what needs to be shared?
I have been reading the responses so far to Q1 and many of them have talked about the audience for shared information. We've seen discussions about African researchers, the needs of smallholder farmers, those with a specific interest that is addressed by projects like the Borlaug Wheat Rust project, etc.
However, I'd like to suggest that in the context of "what kind of information is shared" the audience is not the end user of the information. Granted, the information being shared is not really useful unless it gives end users a tangible benefit, but when discussing what kind of information can/should be shared, aren't we really talking about the sharing of information between systems? To me, satisfying the desires of users like Johanness, with the description of a customizable "smart" desktop, or of researchers in the field, involves an interaction between the end users of information and some sort of delivery platform.
Technologies such as RSS, OAI-PMH, LOD, etc. are just tools for encapsulating information, perhaps structuring that information and including linkages to related information but they don't try to dictate how that information is formatted or how it should be delivered to end users.
To me, when we talk about the "audience" for information that is going to be shared, the audience is other systems. The interoperability between systems providing (sharing) information and other systems consuming that information is the crux of what we should be discussing here. That means identifying and agreeing upon useful standards like OAI-PMH and LOD as RDF, and ensuring that the systems that are sharing information support those technologies sufficiently. Once a system is able to consume shared information, it's really up to the system (and those that develop it) as to how it's delivered.
I do think that "discovery" of the type of information that is being shared is an important piece here. For example, a site which is delivering content related to Aquaculture is going to want to be able to discover and consume information from other systems providing information about Aquaculture. This is where systems like the CIARD RING come in.
In any case, as we move forward through the questions in the consultation I think we'll find more clarity on the difference between information sharing and information delivery.
If "data" is too general (I agree that it is), so is "resources". Is information about a person, their professional affiliation, subject areas of expertise, links to publications, or other pieces of information related to people "data" or "resources"?
Valeria wrote "we need to share information on what we are sharing :-)"
Absolutely.
The CIARD RING can certainly be a good mechanism for discovering what information can be shared, and I've actually been looking at it pretty closely recently, so I have a good sense of how well the RING (as it exists right now) is accomplishing that task.
I have been playing around with a Drupal OAI-PMH harvester module and have been able to harvest content from numerous services that have indicated that they are functioning as an OAI Provider. Currently, there are 47 services in the RING which have indicated that "OAI Provider" is the service type. I couldn't find any of them that had specified the base URI (something to which I could append ?verb=Identify) to obtain more information about what they are sharing. That's simply because when an organization provides information about the services it offers, we are not asking for that base URI. I suspect that we'd see a similar pattern for services that expose information using an RSS feed.
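To show why the base URI matters, here is a small sketch of building an Identify request and reading a few fields from the response. The base URL and the sample response are made up for illustration; the element names (`repositoryName`, `protocolVersion`, `adminEmail`) and the namespace come from the OAI-PMH 2.0 specification. The sample is parsed from a string so that nothing depends on a live repository.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Made-up Identify response (assumption), trimmed to a few standard elements.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2011-04-01T00:00:00Z</responseDate>
  <request verb="Identify">http://example.org/oai</request>
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <baseURL>http://example.org/oai</baseURL>
    <protocolVersion>2.0</protocolVersion>
    <adminEmail>admin@example.org</adminEmail>
  </Identify>
</OAI-PMH>"""


def identify_url(base_url: str) -> str:
    """Append the Identify verb to an OAI-PMH base URL."""
    return base_url + "?" + urlencode({"verb": "Identify"})


def parse_identify(xml_text: str) -> dict:
    """Pull a few descriptive fields out of an Identify response."""
    root = ET.fromstring(xml_text)
    identify = root.find("oai:Identify", OAI_NS)
    fields = ("repositoryName", "protocolVersion", "adminEmail")
    return {name: identify.findtext("oai:" + name, namespaces=OAI_NS) for name in fields}


print(identify_url("http://example.org/oai"))
print(parse_identify(SAMPLE_RESPONSE))
```

With a real base URI in hand, the same parsing could be applied to the text returned by fetching the Identify URL.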
I've got a development site that I've used to test harvesting OAI-PMH information, and on it I created an "OAI Provider" content type. It has a field for entering the base URI, then I used a couple of computed_fields to construct URLs and obtain the metadata formats and sets that the OAI provider has exposed. It's pretty simple, but without the base URI readily available I had to search a couple of OAI registries to find out what that URI was.
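The computed fields on the dev site are Drupal-specific, but the underlying idea can be sketched in a few lines of Python. This is not the code used on that site: the function names and the sample response are assumptions for illustration, while the verbs (`ListMetadataFormats`, `ListSets`) and the `metadataPrefix` element are defined by OAI-PMH 2.0.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Made-up ListMetadataFormats response (assumption) for a single format.
SAMPLE_FORMATS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat>
      <metadataPrefix>oai_dc</metadataPrefix>
      <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""


def derived_urls(base_url: str) -> Dict[str, str]:
    """Build the request URLs that can be computed from the base URI alone."""
    return {verb: base_url + "?verb=" + verb
            for verb in ("ListMetadataFormats", "ListSets")}


def metadata_prefixes(xml_text: str) -> List[str]:
    """Return every metadataPrefix advertised in a ListMetadataFormats response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall(".//oai:metadataPrefix", OAI_NS)]


print(derived_urls("http://example.org/oai"))
print(metadata_prefixes(SAMPLE_FORMATS))
```

The same pattern would apply to a ListSets response by searching for `setSpec` elements instead.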
Sometimes asking a simple question for a simple answer can be all it takes to determine what kind of information is being shared.
Greetings everyone,
I am John Fereira from Albert R. Mann Library at Cornell University. I've been working with CIARD, am a member of the CIARD Content Management Task Force, and have recently done quite a bit of work on the CIARD RING site. Rather than engage too much in the discussion right now I'm going to wait until some of the later topics. I had a bit of an incident yesterday (a house fire) that has me out of work for at least a day or so. I've got some minor burns but am otherwise fine, as are the rest of my family. I do have some things that I want to contribute in regard to OAI-PMH, the CIARD RING, and the other information sharing technologies we're working with, but I'll keep reading for now and add my input over the next few days.