
4. Writing the XML [11]

4.1 Header of XML files

The AGRIS DTD provides a set of elements, refinements and schemes for describing and enforcing the structure that makes up the XML format of a bibliographic record. When creating or exporting AGRIS AP compliant XML documents, it is essential to begin each document with the header shown below.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ags:resources SYSTEM "http://purl.org/agmes/agrisap/dtd/">
<ags:resources xmlns:ags="http://purl.org/agmes/1.1/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:agls="http://www.naa.gov.au/recordkeeping/gov_online/agls/1.2"
xmlns:dcterms="http://purl.org/dc/terms/">

4.1.1 Declaring XML

All XML documents must declare that they are XML documents by writing the following XML declaration:

<?xml version="1.0" encoding="UTF-8"?>

This line tells any software that receives the XML data file that the file contains XML and that it should be processed according to version 1.0 of the XML specification. The encoding issue is dealt with later in this document. Because the declaration is not an XML tag containing data, it does not require a closing tag. It must be at the very beginning of the document.

4.1.2 Declaring the document type

When marking up documents using a DTD, it is a standard practice to include a DOCTYPE declaration so that the processing tools 'know' which DTD the document being processed conforms to. When an XML document is validated against the DTD by a validating XML parser, the XML document will be checked to ensure that all required elements are present and that no undeclared elements have been added. The hierarchical structure of elements defined in the DTD must be maintained. The values of all attributes will be checked to ensure that they fall within defined guidelines. In short, every detail of the XML document from top to bottom will be defined and validated by the DTD. This facilitates the process of ensuring uniformity among groups of XML documents, such as those harvested by the AGRIS repository from distributed centres from around the world.

<!DOCTYPE ags:resources SYSTEM "http://purl.org/agmes/agrisap/dtd/">

The above DOCTYPE declaration for an AGRIS resource document, marked up using the AGRIS DTD, indicates that the document type is ags:resources and that it conforms to the DTD. Requiring that an XML document be validated against the AGRIS DTD ensures the integrity of the data structure. XML documents may be parsed and validated before they are ever loaded by an application.

This declaration points to a PURL (Persistent Uniform Resource Locator), which facilitates the validation, provided that the computer is connected to the Internet. If not, the DTD included in the appendix should be used to validate the XML document.

4.1.3 Declaring the namespaces

The namespace declarations should follow immediately after the DOCTYPE declaration, as attributes of the root element. Because a document may draw on several namespaces, and because prefixes chosen independently can collide, each prefix is mapped to a URI. A declaration applies to the element on which it appears and to that element's contents, not necessarily to the whole document.

<ags:resources
xmlns:ags="http://purl.org/agmes/1.1/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:agls="http://www.naa.gov.au/recordkeeping/gov_online/agls/1.2">

In the above example, there are four namespace declarations: ags, dc, dcterms and agls. In general, a namespace uniquely identifies a set of names or tags so that there is no ambiguity when tags having different origins but the same names are mixed together. Thus, dcterms:citation is different from ags:citation.
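
The effect of namespaces on element names can be illustrated with a short Python sketch. It is not part of the AGRIS tooling: it assumes Python's standard xml.etree.ElementTree module, and the SAMPLE fragment is a hypothetical document invented for the illustration. The namespace URIs are the ones declared in the header above.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the header in section 4.1.
NS = {
    "ags": "http://purl.org/agmes/1.1/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "agls": "http://www.naa.gov.au/recordkeeping/gov_online/agls/1.2",
}

# Hypothetical fragment: two elements share the local name "citation"
# but are bound to different namespaces.
SAMPLE = """<ags:resources
  xmlns:ags="http://purl.org/agmes/1.1/"
  xmlns:dcterms="http://purl.org/dc/terms/">
  <ags:citation>from the AGRIS namespace</ags:citation>
  <dcterms:citation>from the DC terms namespace</dcterms:citation>
</ags:resources>"""

root = ET.fromstring(SAMPLE)
ags_hits = root.findall("ags:citation", NS)
dcterms_hits = root.findall("dcterms:citation", NS)

# ElementTree expands each prefix to its URI, so the two tags differ.
print(ags_hits[0].tag)      # {http://purl.org/agmes/1.1/}citation
print(dcterms_hits[0].tag)  # {http://purl.org/dc/terms/}citation
```

Because the parser compares the expanded names, a search for ags:citation never returns the dcterms:citation element, whatever prefixes the document happens to use.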

4.2 Mandatory elements and schemes

The structures used to describe the “AGRIS” class of documents are Text only (dc:type, dc:source, etc.), Element only (ags:citation, agls:availability, etc.) and Mixed content (dc:title or dc:relation). All attributes are character data strings (CDATA), with two exceptions: ags:ARN, which is a unique identifier (ID) for the root element ags:resource, and the reserved attribute xml:lang, which, where applied, should be constrained to a three-letter ISO639-2 language code [12].

Within the DTD, cardinality of the elements is indicated with the following cardinality operators.

Indicator        Meaning                Occurrence
(no indicator)   Required               One and only one
+                Required, repeatable   One or more
?                Optional               None or one
*                Optional, repeatable   None, one, or more
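
The cardinality operators above can be read as minimum and maximum occurrence counts. The following Python sketch expresses that reading. The names CARDINALITY and occurrences_allowed are hypothetical, chosen only for this illustration, and the empty string stands for "no indicator".

```python
# Allowed (minimum, maximum) occurrences for each DTD cardinality indicator.
# None as a maximum means "no upper limit".
CARDINALITY = {
    "": (1, 1),      # no indicator: one and only one
    "+": (1, None),  # required, repeatable: one or more
    "?": (0, 1),     # optional: none or one
    "*": (0, None),  # optional, repeatable: none, one, or more
}


def occurrences_allowed(indicator: str, count: int) -> bool:
    """Return True if `count` occurrences satisfy the given indicator."""
    low, high = CARDINALITY[indicator]
    return count >= low and (high is None or count <= high)


# dc:title+ needs at least one title; dc:source? allows at most one source.
print(occurrences_allowed("+", 0))  # False
print(occurrences_allowed("?", 1))  # True
```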

4.3 XML body of the document

This section explains how to encode each element, refinement and scheme to create well-formed XML elements. Each table gives the content model of the element, a template showing how the content should be tagged, the attributes, and whether each attribute is required.

4.3.1 Attribute ags:ARN

This attribute replaces the previous AGRIS field for Temporary Record Number (TRN). It has an ID validity constraint that provides uniqueness to an AGRIS resource. It is therefore essential that a unique numbering system be used to differentiate between two records. ARN is mandatory for all records submitted to AGRIS. The format used for this required attribute consists of 12 characters, divided into four groups. A typical ARN will contain:

  1. the two-letter ISO country code of the country where the AGRIS Resource centre is located, or the code of the multinational or international institution submitting input. Country codes are listed in ISO3166-1. The AGRIS codes currently used by the centres are provided in Appendix C of the AGRIS AP User Guide and on the AGRIS web site.
  2. the year in which the input record is created. This must be given in four digits and is not the year of publication of the resource.
  3. the sub-centre code assigned by the Resource Centre, one character only, to be used in countries with more than one resource centre. It may be a letter or a digit. In countries where there are more than nine sub-centres the sub-centre code may be a letter. For countries with one resource centre a zero (0) should be entered in this field.
  4. a number made of five digits. This can be assigned on a yearly basis or it can be just a local classification number, such as an internal library number.

For example, the ARN XF2004000244 used in section 4.3.2 consists of the code XF, the year 2004, the sub-centre code 0 and the number 00244.
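
The four groups described above can be checked mechanically. The sketch below is only an illustration in Python. The names ARN_PATTERN and parse_arn are hypothetical, and the pattern assumes upper-case letters for the centre and sub-centre codes, which this guide does not state explicitly. It checks the shape of an ARN only, not whether the centre code actually exists.

```python
import re

# One named group per part of the ARN (2 + 4 + 1 + 5 = 12 characters).
ARN_PATTERN = re.compile(
    r"(?P<centre>[A-Z]{2})"     # country or institution code (upper case assumed)
    r"(?P<year>\d{4})"          # year the input record was created
    r"(?P<subcentre>[A-Z0-9])"  # sub-centre code, 0 when there is one centre
    r"(?P<number>\d{5})"        # five-digit record number
)


def parse_arn(arn: str):
    """Split an ARN into its four groups, or return None if malformed."""
    match = ARN_PATTERN.fullmatch(arn)
    return match.groupdict() if match else None


print(parse_arn("XF2004000244"))
# {'centre': 'XF', 'year': '2004', 'subcentre': '0', 'number': '00244'}
print(parse_arn("XF04000244"))  # None (two-digit year, too short)
```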


4.3.2 Root element ags:resource

This is the root element and it contains all the other core elements and qualifiers. Five of the core elements are mandatory, namely title, date, subject, language and availability information. It is the most important element, as it contains the rest of the document and becomes synonymous with the document type.

XML content model

(dc:title+, dc:creator*, dc:publisher*, dc:date+, dc:subject+, dc:description*, dc:identifier*, dc:type*, dc:format*, dc:language+, dc:relation*, agls:availability+, dc:source?, dc:coverage*, dc:rights*, ags:citation*)

XML tag

<ags:resource ags:ARN="XF2004000244"> </ags:resource>

XML attributes/schemes

ags:ARN (See 4.3.1).

required
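
A record can be checked for the five mandatory elements before it is submitted. The following Python sketch, based on the standard xml.etree.ElementTree module, is one possible way to do so. The function name missing_mandatory and the SAMPLE record are hypothetical. The check looks only at direct children of ags:resource and does not replace validation against the DTD.

```python
import xml.etree.ElementTree as ET

# Namespace URIs from the header in section 4.1.
NS = {
    "ags": "http://purl.org/agmes/1.1/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "agls": "http://www.naa.gov.au/recordkeeping/gov_online/agls/1.2",
}

# The elements marked with "+" in the content model of ags:resource.
MANDATORY = ["dc:title", "dc:date", "dc:subject", "dc:language", "agls:availability"]


def missing_mandatory(resource):
    """List the mandatory child elements absent from an ags:resource."""
    return [name for name in MANDATORY if resource.find(name, NS) is None]


# Hypothetical, deliberately incomplete record.
SAMPLE = """<ags:resource
  xmlns:ags="http://purl.org/agmes/1.1/"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  ags:ARN="XF2004000244">
  <dc:title xml:lang="eng">title of resource</dc:title>
  <dc:language>eng</dc:language>
</ags:resource>"""

print(missing_mandatory(ET.fromstring(SAMPLE)))
# ['dc:date', 'dc:subject', 'agls:availability']
```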

4.3.3 Element dc:title

Enter in this element the title of the document. Enter also, if available, the translated title of the resource (dcterms:alternative).

XML content model

(#PCDATA | dcterms:alternative)*

XML tag

<dc:title xml:lang="eng">title of resource
<dcterms:alternative xml:lang="eng">other title, normally translated </dcterms:alternative>
</dc:title>

XML attributes/schemes

xml:lang

required

4.3.4 Element dc:creator

This element describes all entities (Agents) that handle the resource, i.e. creating or contributing. It may include a person (ags:creatorPersonal); an organization, a service or an agency (ags:creatorCorporate); or a conference (ags:creatorConference).

XML content model

(ags:creatorPersonal | ags:creatorCorporate | ags:creatorConference)*

XML tag

<dc:creator>
<ags:creatorPersonal>personal creator</ags:creatorPersonal>
<ags:creatorCorporate>corporate creator</ags:creatorCorporate>
<ags:creatorConference>conference creator</ags:creatorConference>
</dc:creator>

XML attributes/schemes

-

 

4.3.5 Element dc:publisher

Enter in the two refinement elements the information about the publisher. These elements provide the name of the individual, group, or organization which controls or publishes the item (ags:publisherName) and its location (ags:publisherPlace).

XML content model

(ags:publisherName | ags:publisherPlace)*

XML tag

<dc:publisher>
<ags:publisherName>name of publisher</ags:publisherName>
<ags:publisherPlace>location of publisher</ags:publisherPlace>
</dc:publisher>

XML attributes/schemes

-

 

4.3.6 Element dc:date

Enter in this element the date when the resource was made available. dc:date must be used together with its qualifier (dcterms:dateIssued).

XML content model

dc:date (dcterms:dateIssued)

XML tag

<dc:date>
<dcterms:dateIssued>date of publication</dcterms:dateIssued>
</dc:date>

XML attributes/schemes

scheme (dcterms:W3CDTF)
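
When the dcterms:W3CDTF scheme is declared, the date value is expected to follow the W3C date and time profile. The Python sketch below checks only the date-only forms of that profile (YYYY, YYYY-MM and YYYY-MM-DD). The forms that include a time of day are deliberately left out, and the function name is_w3cdtf_date is hypothetical.

```python
import re
from datetime import date

# Year, optionally followed by month, optionally followed by day.
W3CDTF_DATE = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


def is_w3cdtf_date(value: str) -> bool:
    """Return True for a valid YYYY, YYYY-MM or YYYY-MM-DD value."""
    match = W3CDTF_DATE.fullmatch(value)
    if not match:
        return False
    year, month, day = match.groups()
    try:
        # Missing parts default to 1 so that the calendar check still runs.
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        return False
    return True


print(is_w3cdtf_date("2004-03-15"))  # True
print(is_w3cdtf_date("15/03/2004"))  # False
```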

 

4.3.7 Element dc:subject

Enter in this element the subject information about the resource. It can be free-text (dc:subject), come from a classification scheme (ags:subjectClassification) or from a controlled vocabulary (ags:subjectThesaurus).

XML content model

(#PCDATA | ags:subjectClassification | ags:subjectThesaurus)*

XML tag

<dc:subject>
<ags:subjectClassification scheme="ags:ASC">ASC scheme</ags:subjectClassification>
<ags:subjectClassification scheme="ags:CABC">CABC scheme</ags:subjectClassification>
<ags:subjectClassification scheme="dcterms:DDC">DDC scheme</ags:subjectClassification>
<ags:subjectClassification scheme="dcterms:LCC">LCC scheme</ags:subjectClassification>
<ags:subjectClassification scheme="dcterms:UDC">UDC scheme</ags:subjectClassification>
</dc:subject>
<dc:subject>
<ags:subjectThesaurus scheme="ags:AGROVOC" xml:lang="eng">AGROVOC term</ags:subjectThesaurus>
<ags:subjectThesaurus scheme="ags:ASFAT" xml:lang="eng">ASFAT term</ags:subjectThesaurus>
<ags:subjectThesaurus scheme="ags:CABT" xml:lang="eng">CABT term</ags:subjectThesaurus>
<ags:subjectThesaurus scheme="ags:NALT" xml:lang="eng">NALT term</ags:subjectThesaurus>
<ags:subjectThesaurus scheme="dcterms:LCSH" xml:lang="eng">LCSH term</ags:subjectThesaurus>
<ags:subjectThesaurus scheme="dcterms:MeSH" xml:lang="eng">MeSH term</ags:subjectThesaurus>
</dc:subject>

XML attributes/schemes

ags:subjectClassification
scheme (ags:ASC | ags:CABC | dcterms:DDC | dcterms:LCC | dcterms:UDC): required
ags:subjectThesaurus
scheme (ags:CABT | ags:AGROVOC | ags:NALT | ags:ASFAT | dcterms:LCSH | dcterms:MeSH): required
xml:lang: required

4.3.8 Element dc:description

This element indicates different descriptive aspects of the resource. These may include a brief statement, annotation, comment, or elucidation concerning any aspect of the resource (ags:descriptionNotes); formally designated version of the data set or information resource being described (ags:descriptionEdition); or an abstract as a summary of a document designed to give the user a clearer idea about the document’s contents (dcterms:abstract).

XML content model

(ags:descriptionNotes | ags:descriptionEdition | dcterms:abstract)*

XML tag

<dc:description>
<ags:descriptionEdition>description of edition</ags:descriptionEdition>
<ags:descriptionNotes>notes</ags:descriptionNotes>
<dcterms:abstract xml:lang="eng">abstract</dcterms:abstract>
</dc:description>

XML attributes/schemes

dcterms:abstract
xml:lang: optional

4.3.9 Element dc:identifier

Identifiers help locate and/or identify a resource. Many numbers can be assigned to a document; this element is reserved for standard numbers taken from the item. Some of the numbers may be input in authorized form. For web resources, the URI (an electronic address starting with, for example, http:// or ftp://) is also placed in this element. Numbers assigned by cataloguing institutions for internal purposes are not entered here, but placed into the agls:availability field.

XML content model

dc:identifier (#PCDATA)

XML tag

<dc:identifier scheme="ags:DOI">DOI id</dc:identifier>
<dc:identifier scheme="ags:IPC">International Patent Classification no.</dc:identifier>
<dc:identifier scheme="ags:ISBN">Book ISBN</dc:identifier>
<dc:identifier scheme="ags:JN">Job Number</dc:identifier>
<dc:identifier scheme="ags:PN">Patent Number</dc:identifier>
<dc:identifier scheme="ags:RN">Report Number</dc:identifier>
<dc:identifier scheme="dcterms:URI">URI of resource</dc:identifier>

XML attributes/schemes

scheme (ags:IPC | ags:RN | ags:PN | ags:ISBN | ags:JN | dcterms:URI | ags:DOI)

optional

4.3.10 Element dc:type

Although it is not mandatory, the value of this element should be provided when possible. It explains the nature or genre of the content of the resource and also helps to describe the general categories, functions, genres, or aggregation levels for the content of the resource.

If possible, select the dc:type values from the DCMI Type list [13]. If using a local controlled vocabulary of types, make sure that the values are whole words describing the genre of the resource, not codes.

XML content model

dc:type (#PCDATA)

XML tag

<dc:type>DC Types controlled vocabularies</dc:type>

XML attributes/schemes

scheme (dcterms:DCMIType)

optional

4.3.11 Element dc:format

The extent element (dcterms:extent) is used to indicate the size or duration of the resource. The medium element (dcterms:medium) is used to indicate the material or physical carrier of the resource.

XML content model

dc:format (dcterms:extent | dcterms:medium)*

XML tag

<dc:format>
<dcterms:extent>collation, size, duration of the resource</dcterms:extent>
<dcterms:medium> the material or physical carrier of the resource. </dcterms:medium>
</dc:format>

XML attributes/schemes

-

 

4.3.12 Element dc:language

For this element, it is recommended to enter the three-letter code from ISO639-2 [14]. If your local system does not allow you to provide the three-letter code, enter the two-letter code, indicating the scheme as ISO639-1 [15]. If a language does not have a code in the selected scheme, enter the full form of the language name without indicating the scheme.

XML content model

dc:language (#PCDATA)

XML tag

<dc:language scheme="ISO639-1">language of resource</dc:language>

XML attributes/schemes

scheme (ISO639-1 | ISO639-2)

optional
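
The rule for choosing the scheme of dc:language can be sketched in Python as follows. The function name language_element is hypothetical, and the sketch makes a simplifying assumption: it decides on the length of the value alone and does not look the code up in the ISO639 lists, so a misspelled code would not be detected.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)  # serialise with the dc: prefix


def language_element(value: str) -> ET.Element:
    """Build a dc:language element, setting scheme from the code length."""
    element = ET.Element(f"{{{DC}}}language")
    element.text = value
    if len(value) == 3 and value.isalpha():
        element.set("scheme", "ISO639-2")  # three-letter code (recommended)
    elif len(value) == 2 and value.isalpha():
        element.set("scheme", "ISO639-1")  # two-letter code
    # Anything else is treated as a full language name: no scheme attribute.
    return element


for value in ("eng", "en", "full language name"):
    print(value, "->", language_element(value).get("scheme"))
```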

4.3.13 Element dc:relation

This element is used to link one resource to another. It allows the establishment of various relationships between resources and for users to locate related resources. When using relation element, it is important to establish the type of relationship by choosing a value from one side of any of the following pairs of relation refinement types.

XML content model

(#PCDATA | dcterms:isPartOf | dcterms:hasPart | dcterms:isVersionOf | dcterms:hasVersion | dcterms:isFormatOf | dcterms:hasFormat | dcterms:references | dcterms:isReferencedBy | dcterms:isRequiredBy | dcterms:requires | dcterms:isReplacedBy | dcterms:replaces | ags:relationHasTranslation | ags:relationIsTranslationOf)*

XML tag

(only dcterms:URI scheme included for each qualifier)

<!-- physical or logical part of the referenced resource -->
<dc:relation>
<dcterms:isPartOf scheme="dcterms:URI">related URI</dcterms:isPartOf>
</dc:relation>
<!-- includes the referenced resource either physically or logically -->
<dc:relation>
<dcterms:hasPart scheme="dcterms:URI">related URI</dcterms:hasPart>
</dc:relation>
<!-- a version, edition, or adaptation of the referenced resource. Changes in version imply substantive changes in content rather than differences in format -->
<dc:relation>
<dcterms:isVersionOf scheme="dcterms:URI">related URI</dcterms:isVersionOf>
</dc:relation>
<!-- has a version, edition, or adaptation, namely, the referenced resource -->
<dc:relation>
<dcterms:hasVersion scheme="dcterms:URI">related URI</dcterms:hasVersion>
</dc:relation>
<!-- same intellectual content of the referenced resource, but presented in another format-->
<dc:relation>
<dcterms:isFormatOf scheme="dcterms:URI">related URI</dcterms:isFormatOf>
</dc:relation>
<!-- pre-existed the referenced resource, which is essentially the same intellectual content presented in another format -->
<dc:relation>
<dcterms:hasFormat scheme="dcterms:URI">related URI</dcterms:hasFormat>
</dc:relation>
<!-- references, cites, or otherwise points to the referenced resource -->
<dc:relation>
<dcterms:references scheme="dcterms:URI">related URI</dcterms:references>
</dc:relation>
<!-- is referenced, cited, or otherwise pointed to by the referenced resource -->
<dc:relation>
<dcterms:isReferencedBy scheme="dcterms:URI">related URI</dcterms:isReferencedBy>
</dc:relation>
<!-- is required by the referenced resource, either physically or logically -->
<dc:relation>
<dcterms:isRequiredBy scheme="dcterms:URI">related URI</dcterms:isRequiredBy>
</dc:relation>
<!-- requires the referenced resource, either physically or logically -->
<dc:relation>
<dcterms:requires scheme="dcterms:URI">related URI</dcterms:requires>
</dc:relation>
<!-- is supplanted, displaced, or superseded by the referenced resource -->
<dc:relation>
<dcterms:isReplacedBy scheme="dcterms:URI">related URI</dcterms:isReplacedBy>
</dc:relation>
<!-- supplants, displaces, or supersedes the referenced resource -->
<dc:relation>
<dcterms:replaces scheme="dcterms:URI">related URI</dcterms:replaces>
</dc:relation>
<!-- has a translation, namely, the referenced resource -->
<dc:relation>
<ags:relationHasTranslation scheme="dcterms:URI">related URI</ags:relationHasTranslation>
</dc:relation>
<!-- a translation of the referenced resource -->
<dc:relation>
<ags:relationIsTranslationOf scheme="dcterms:URI">related URI</ags:relationIsTranslationOf>
</dc:relation>

XML attributes/schemes

scheme (ags:IPC | ags:PN | ags:ISBN | ags:JN | dcterms:URI | ags:ARN)

required
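
The refinements of dc:relation come in pairs, each member being the reverse of the other. The Python sketch below records the pairs in the order in which they appear in the content model and looks up the reverse of a given refinement. The names INVERSE_PAIRS, INVERSE and inverse_relation are hypothetical. A lookup of this kind could be useful when a link recorded in one record should also be recorded, in the opposite direction, in the related record.

```python
# Pairs of relation refinements, as listed in the content model above.
INVERSE_PAIRS = [
    ("dcterms:isPartOf", "dcterms:hasPart"),
    ("dcterms:isVersionOf", "dcterms:hasVersion"),
    ("dcterms:isFormatOf", "dcterms:hasFormat"),
    ("dcterms:references", "dcterms:isReferencedBy"),
    ("dcterms:isRequiredBy", "dcterms:requires"),
    ("dcterms:isReplacedBy", "dcterms:replaces"),
    ("ags:relationHasTranslation", "ags:relationIsTranslationOf"),
]

# Build a lookup table that works in both directions.
INVERSE = {}
for left, right in INVERSE_PAIRS:
    INVERSE[left] = right
    INVERSE[right] = left


def inverse_relation(name: str) -> str:
    """Return the refinement that expresses the reverse relationship."""
    return INVERSE[name]


print(inverse_relation("dcterms:isPartOf"))  # dcterms:hasPart
print(inverse_relation("dcterms:replaces"))  # dcterms:isReplacedBy
```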

4.3.14 Element agls:availability

Availability provides users with a number or code that is uniquely associated with an item, and serves to identify that item within an organization. This number is normally assigned by the organization that holds the item. Since this is local information, availability must include the name or code identifying the institution or repository (ags:availabilityLocation) in which the item is housed and the local number (ags:availabilityNumber) with which the resource is locally accessed.

XML content model

agls:availability (ags:availabilityLocation, ags:availabilityNumber)*

XML tag

<agls:availability>
<ags:availabilityLocation>availability location</ags:availabilityLocation>
<ags:availabilityNumber>availability number</ags:availabilityNumber>
</agls:availability>

XML attributes/schemes

-

 

4.3.15 Element dc:source

This element (dc:source) provides the reference to a resource of which the current resource is a part. When cataloguing the analytic, this element is used to provide information for identification of the Monograph. Source information that can go into this element includes title of the whole, creators of the whole, etc.

XML content model

dc:source (#PCDATA)

XML tag

<dc:source>additional information of resource</dc:source>

XML attributes/schemes

-

 

4.3.16 Element dc:coverage

This element (dc:coverage) provides information about the geographical (dcterms:spatial) and temporal (dcterms:temporal) coverage of the resource.

XML content model

dc:coverage (#PCDATA, dcterms:spatial, dcterms:temporal)

XML tag

<dc:coverage>additional information of resource
<dcterms:spatial scheme="dcterms:ISO3166">coverage (ISO3166)</dcterms:spatial>
<dcterms:temporal scheme="dcterms:Period">coverage (Period)</dcterms:temporal>
</dc:coverage>

XML attributes/schemes

dcterms:spatial
scheme (dcterms:POINT | dcterms:ISO3166 | dcterms:TGN | dcterms:Box): optional
dcterms:temporal
scheme (dcterms:Period | dcterms:W3CDTF): optional

4.3.17 Element dc:rights

This element is used to provide a simple human-readable statement of who holds rights over a resource.

XML content model

dc:rights (#PCDATA, ags:rightsStatement, ags:rightsTermsOfUse)

XML tag

<dc:rights>
<ags:rightsStatement>statements of rights</ags:rightsStatement>
<ags:rightsTermsOfUse>terms of use</ags:rightsTermsOfUse>
</dc:rights>

XML attributes/schemes

 

4.3.18 Element ags:citation

This is a mandatory entry when the resource is part of a serial. A serial is defined as a publication, usually having a numerical or chronological label, and intended to be continued indefinitely. It may be made available on any medium and is issued in successive parts.

XML content model

ags:citation (ags:citationTitle | ags:citationIdentifier | ags:citationNumber | ags:citationChronology)*

XML tag

<ags:citation>
<ags:citationTitle>Title of the serial</ags:citationTitle>
<ags:citationIdentifier scheme="ags:ISSN"> Identifier of the Serial </ags:citationIdentifier>
<ags:citationNumber>Number of the issue</ags:citationNumber>
<ags:citationChronology>Chronological designation of the issue</ags:citationChronology>
</ags:citation>

XML attributes/schemes

ags:citationTitle
xml:lang: optional
ags:citationIdentifier
scheme (ags:ISSN | ags:CODEN): required




[11] An instance of an AGRIS AP XML document is given in Appendix B.
[12] Codes for the Representation of Names of Languages: http://www.loc.gov/standards/iso639-2/langcodes.html
[13] DCMI Type Vocabulary: http://dublincore.org/documents/dcmi-type-vocabulary
[14] ISO639-2, the three-letter language codes: http://www.loc.gov/standards/iso639-2/langcodes.html
[15] ISO639-1 and ISO639-2 codes: http://www.loc.gov/standards/iso639-2/codechanges.html


