Almost everything discussed in the previous chapters was necessary for building up a functioning GIS, but all the data gathering and digital recording could have been done without a GIS software programme. We have now reached a stage, however, where all the collected data should be ready for use in a GIS. As was mentioned in Chapter 5, given the huge choice of GIS software available, the particular choice made will depend upon the range of functionality required. It is the intention of this chapter to describe the functions (or operations) that a typical GIS may perform. Since the most expensive GIS programmes are capable of performing over 2 000 separate functions, and even an inexpensive GIS will perform hundreds, we can only give an indication of the most important ones here. These are set out under broad headings in Table 6.1. It should be stressed that these headings are not “hard and fast”, and indeed there will be a considerable degree of overlap, or blurring, between areas of GIS functionality. Thus different authors or different GIS programmes may classify functions, manipulations, analyses and ways of data management in very different ways, and terms such as “conversions” or “transformations” may be interchangeable and may feature under a number of the headings given in Table 6.1. Further details on GIS operations and functionality can be obtained from Burrough (1986), Aronoff (1989), Star and Estes (1990), Maguire et al (1991), Martin (1991), Bernhardsen (1992) and Environmental Systems Research Institute (1993), whilst details on the more theoretical aspects of spatial and data modelling and manipulations can be found in Preparata and Shamos (1986), Foley et al (1990), Laurini and Thompson (1992) and Bonham-Carter (1994).
Table 6.1 A Classification of GIS Functions

Data Pre-processing and Manipulation

(i) Data validation and editing, e.g. checking and correction.
(ii) Structure conversion, e.g. conversion from vector to raster.
(iii) Geometric conversion, e.g. map registration, scale changes, projection changes, map transformations, rotation.
(iv) Generalisation and classification, e.g. reclassifying data, aggregation or disaggregation, co-ordinate thinning.
(v) Integration, e.g. overlaying, combining map layers or edge matching.
(vi) Map enhancement, e.g. image enhancement, add title, scale, key, map symbolism, draping overlays.
(vii) Interpolation, e.g. kriging, spline functions, Thiessen polygons, plus centroid determination and extrapolation.
(viii) Buffer generation, e.g. calculating and defining corridors.
(ix) Data searching and retrieval, e.g. on points, lines or areas, on user defined themes or by using Boolean logic. Also browsing, querying and windowing.

Data Analysis

(i) Spatial analysis, e.g. connectivity, proximity, contiguity, intervisibility, digital terrain modelling.
(ii) Statistical analysis, e.g. histograms, correlation, measures of dispersion, frequency analysis.
(iii) Measurement, e.g. line length, area and volume calculations, distance and directions.

Data Display

(i) Graphical display, e.g. maps and graphs with symbols, labels or annotations.
(ii) Textual display, e.g. reports, tables.

Database Management

(i) Support and monitoring of multi-user access to the database.
(ii) Coping with systems failure.
(iii) Communication linkages with other systems.
(iv) Editing and up-dating of databases.
(v) Organising the database for efficient storage and retrieval.
(vi) Maintenance of database security and integrity.
(vii) Provision of a “data independent” view of the database.
Under this heading we will consider a large number of functions which a GIS may be required to perform in order to get the digital mapped data into the desired format, so as to obtain requisite map output or to confidently allow for any subsequent data analysis. Essentially, this means that the original digital data may need to be changed in some way, i.e. either by correcting it, updating it, refining it or by altering it in some desired way. Some of the pre-processing functions have already been described in section 4.5.3, and it is possible that many of the functions can be performed using other types of software, e.g. image processing packages. The capacity of a GIS to perform pre-processing means that the user has a huge opportunity to “interactively experiment” with the available data, thereby allowing for the appropriate data to be derived according to the task in hand. The efficiency with which individual GIS's perform manipulations will depend upon the particular algorithms which they use and the way in which the data is structured.
In essence this function represents the checking and revising of any data which has previously been captured, with the obvious aim being to minimise errors. In the case of digitised data, it is often possible and desirable to perform editing immediately following data capture, i.e. as a final stage in the digitising process, but it is important to note that many GIS software programmes allow for the detection and correction of digitising errors as a pre-processing function. Typical digitising errors are shown in Figure 6.1.
Figure 6.1 Typical Errors Which Might be Made Whilst Digitising (after Laurini and Thompson, 1992)
GIS software also contains programmes for verifying the correctness of all geometric, topological and attribute data, e.g. making certain that all graphical data is suitably defined, that attribute data does not exceed expected ranges and that impossible combinations of attributes do not occur. Data may be copied, deleted, moved, joined, altered, etc. Any of these data editing functions should be capable of being performed on both the graphical and the textual data. If data is not carefully verified, and errors remain, then manipulations of the data at a later processing stage will cause error propagation and multiplication, thereby invalidating, or at least making less useful, any final GIS output. Further details on the importance of data validation and error correction can be found in Burrough (1986), Goodchild and Gopal (1989), Dunn et al (1990), Chrisman (1991), Thapa and Bossler (1992) and Rybaczuk (1993).
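The range checking of attribute values mentioned above can be sketched in a few lines. The field names and permitted ranges below are purely hypothetical, invented to illustrate how records might be tested against expected bounds:

```python
# Hypothetical attribute fields and their permitted value ranges.
EXPECTED_RANGES = {"depth_m": (0, 11000), "temp_c": (-2.0, 35.0)}

def validate(record):
    """Return the names of any fields whose values are missing or
    fall outside their expected range."""
    errors = []
    for field, (lo, hi) in EXPECTED_RANGES.items():
        value = record.get(field)
        if value is None or not lo <= value <= hi:
            errors.append(field)
    return errors
```

Records failing such a check would be flagged for editing before any later manipulation can propagate the error.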
In section 4.3 we showed why it was preferable to structure digital data in such a way that it required less storage space. For many manipulations it may also be preferable to convert data from a raster to vector structure, or vice versa. This is necessary since there are still no truly integrated GIS's which are able to handle both raster and vector data with equal ease. Figure 6.2 provides a useful conceptual summary of the necessary steps in both of these conversions. It is important to note that in the vector to raster conversion (rasterising) there will be an inevitable loss of accuracy, a factor which would be exacerbated both with increasing sinuosity of the lines and with increasing raster cell size. In the raster to vector conversion (vectorising) the GIS software programme performs a vectorising process which “threads” a line through groups of pixels using a special “thinning” algorithm. There will be a consequent need for topological information to be constructed and for individual features to be identified. These latter requirements can call for considerable operator intervention, but there are GIS functions which automatically compute new nodes and links and compile topology tables. Star and Estes (1990) explain in some detail the advantages of certain raster to vector data structure conversions, i.e. according to the type, and the proposed use for, the data being handled.
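The vector to raster step can be crudely sketched as follows, assuming a simple dense-sampling approach rather than a production scan-conversion algorithm. It also shows why accuracy falls as cell size grows, since every sampled position collapses to a whole cell:

```python
import math

def rasterise_segment(x0, y0, x1, y1, cell_size):
    """Return the set of (column, row) cells touched by a line
    segment, found by sampling points densely along its length."""
    steps = max(1, int(math.hypot(x1 - x0, y1 - y0) / cell_size) * 10)
    cells = set()
    for i in range(steps + 1):
        t = i / steps
        x = x0 + t * (x1 - x0)
        y = y0 + t * (y1 - y0)
        cells.add((int(x // cell_size), int(y // cell_size)))
    return cells
```

With a large cell size the same segment maps into far fewer cells, which is precisely the loss of accuracy noted above.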
When performing manipulations on mapped digital data, it is important that, if the data is to be merged in any way, it should all conform to the same geometric reference system. Latitude and longitude co-ordinates are frequently used in small scale mapping, although the most widely used co-ordinate system in GIS is the Universal Transverse Mercator (UTM) system (Figure 6.3). Nearly all GIS software allows for the conversion of the map referencing system to a wide range of possible map projections, or from one co-ordinate system to another. Such processes are sometimes called transformations or rectification. Transformations are based on the mathematical relationships that exist between the various map projections, i.e. relative to angles, areas, direction and distances. A more basic type of geometric conversion is called registration. This is simply changing one mapped view to line up with another, i.e. irrespective of any referencing system.
Figure 6.4 illustrates some further geometric conversions. Scale changes are easily accommodated via a simple multiplier function and maps can easily be rotated to particular orientations. A more complex function which most GIS's can achieve is the correction for distortions (rectification). These distortions may occur in the original source data for a number of reasons, e.g.
(a) Aerial photographs or RS satellite images have varying scales due to platform tilt and the curvature of the Earth.
(b) The angle of view or relief differences also cause variations in scale.
(c) Photographs or maps variably shrink with age.
(d) Maps on paper can easily suffer from stretching.
(e) The optic systems being used may introduce distortions.
Figure 6.2 Summary of Vector to Raster and Raster to Vector Structure Conversion (from Robinson Barker, 1988)
Distortions are manipulated by “rubber sheeting” methods, i.e. by treating the distorted image as an elastic sheet which can be stretched or compressed until it exactly fits a selected base map on which there are positively identifiable ground control points.
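The simplest case of such a fit can be sketched as a least-squares affine transformation from control points read off the distorted map to their known ground positions. True rubber sheeting uses locally varying transformations rather than one global fit, so this is only an illustration, and the control point values are invented:

```python
import numpy as np

# Invented control points: positions read off the distorted map (src)
# matched to their known ground positions (dst).
src = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
dst = np.array([[1.0, 2.0], [11.0, 2.0], [1.0, 12.0], [11.0, 12.0]])

# Solve [x, y, 1] @ coeffs = [x', y'] in the least-squares sense.
A = np.hstack([src, np.ones((len(src), 1))])
coeffs, *_ = np.linalg.lstsq(A, dst, rcond=None)

def rectify(x, y):
    """Apply the fitted affine transformation to one point."""
    xr, yr = np.array([x, y, 1.0]) @ coeffs
    return float(xr), float(yr)
```

Any other point on the distorted map can then be passed through rectify() to obtain its corrected position.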
Under this general heading a large number of manipulations can be performed, all of which are designed to change the data in some way such that it can be more easily used for a particular purpose, e.g.
(a) Adding to data, or deleting undesirable data.
(b) Aggregating or disaggregating numerical or attribute data.
(c) Classifying or reclassifying data into user defined, or GIS suggested, classifications. This usually involves deciding upon attribute value classes or changes to existing classes.
(d) Using data reduction algorithms to generalise or smooth linear data, e.g. to thin out co-ordinates in digitised lines (in order to greatly reduce the amount of data storage). Figure 6.5 illustrates how generalisation can be applied to a well known map outline.
Figure 6.3 The Universal Transverse Mercator (UTM) Zones
(e) Lines can be deleted or “dissolved” in order to simplify mapped surfaces.
(f) New attributes can be assigned to spatial points, lines or polygons.
(g) Annotations can be added to maps using labels, text, legends or cartographic symbols.
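The co-ordinate thinning mentioned in (d) is often done with a recursive split-and-test procedure of the Douglas-Peucker type; a minimal sketch:

```python
import math

def thin_line(points, tolerance):
    """Douglas-Peucker style thinning: keep only points that deviate
    from the chord between the end points by more than the tolerance."""
    if len(points) < 3:
        return points[:]
    (x0, y0), (x1, y1) = points[0], points[-1]
    chord = math.hypot(x1 - x0, y1 - y0)

    def dist(p):
        # Perpendicular distance of point p from the chord.
        if chord == 0:
            return math.hypot(p[0] - x0, p[1] - y0)
        return abs((x1 - x0) * (y0 - p[1]) - (x0 - p[0]) * (y1 - y0)) / chord

    idx, dmax = max(((i, dist(p)) for i, p in enumerate(points[1:-1], 1)),
                    key=lambda t: t[1])
    if dmax <= tolerance:
        return [points[0], points[-1]]
    # Keep the most deviant point and recurse on each half.
    left = thin_line(points[:idx + 1], tolerance)
    right = thin_line(points[idx:], tolerance)
    return left[:-1] + right
```

The tolerance controls the degree of generalisation: a larger value discards more intermediate co-ordinates.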
These manipulations involve the creation of new or revised mapped surfaces by (in one way or another) joining two or more previously defined maps. Perhaps the most frequently performed manipulation under this heading is the merging by overlaying of two or more mapped layers, i.e. any number of raster or vector map layers can be progressively added or subtracted to produce a desired map. Existing data on a single theme can be merged, e.g. a water quality map could be progressively built up by merging maps of perhaps water temperature, pH, dissolved oxygen, salinity, etc. By merging these layers a new map would be created which consisted of numerous polygons, each of which may have a different combination of water quality factors. The overlaying of maps can give rise to major problems. For instance, it is likely that many new polygons will be created, some of whose boundaries should correspond between the two maps being overlayed. Invariably the original data sources will be different, or the digitising will be inaccurate, and the new map will exhibit numerous so-called “slivers”, i.e. polygons arising from a poor match between boundaries or other lines (Figure 6.6). Here the overlaying of map a) on to map b) created map c) with its numerous slivers. There are now “intelligent” algorithms in some GIS's which can automatically counteract these slivers, i.e. as in Figure 6.6 (d). Other algorithms will also be needed to establish new topology and new attribute tables. It can be seen that overlay procedures are quite complex and they may take up a lot of computing time.
Figure 6.4 Some Geometric Conversions Which a GIS May Perform (after Dangermond, 1983)
Frequently it will be necessary to form a new map by accurately joining two mapped sheets (or many contiguous map sheets) along their edges, i.e. by edge matching, so that all linear and area features exactly coincide. A seamless dataset is then produced. When merging or integrating maps it will also be necessary to take into account possible variations in the data structure and format between any two maps, plus the ways that have been used to assign labels or to identify objects. So inevitably, integration must be performed with a great deal of consideration as to the integrity and make-up of the data being used.
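In the raster structure the overlay itself reduces to a cell-by-cell combination of co-registered grids. A toy sketch, with invented 0/1 suitability layers standing in for two single-theme maps:

```python
# Invented layers: 1 marks cells meeting each layer's criterion.
temperature_ok = [[1, 1, 0],
                  [1, 0, 0],
                  [1, 1, 1]]
depth_ok       = [[0, 1, 1],
                  [1, 1, 0],
                  [0, 1, 1]]

def overlay_and(a, b):
    """Logical AND overlay: a cell is 1 only where both layers are 1."""
    return [[x & y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

suitable = overlay_and(temperature_ok, depth_ok)
```

Other overlay rules (OR, subtraction, weighted addition) follow the same cell-by-cell pattern.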
These GIS functions simply consist of a series of operations which allow for the cartographic refinement of the finished map, i.e. at the manipulation stage factors concerned with map presentation can be improved. This may include adding a suitable border to the map, varying the width of mapped lines, altering chosen colours, varying the fonts or font size, altering the layout of the map or the position of textual features such as the key or title. If a 3-D image has been created this might be the stage at which a land use categorisation could be draped over the image.
Figure 6.5 Illustration to Show How a Map Outline Can be Generalised
This is the procedure for estimating the values for any continuous (rather than discrete) “properties” at unsampled sites along a line or within an area. This must be based on existing point observation data within the area (or along the line), which themselves should have been derived using valid measuring and sampling techniques. Figure 6.7 shows how isolines (lines joining places of equal value) have been interpolated from known values at several points. The problem in interpolation is choosing the model which is best able to produce correct interpolations, i.e. a model which suits the data array and the way in which actual variability occurs. Many simple GIS interpolation procedures rely on the use of various weighting functions, i.e. this allows for the logical fact that near points used in an interpolation should count for more than distant points. There are more complex interpolation models which cover 2-D arrays of data points, e.g. Thiessen polygons, the use of kriging or Fourier series, or models which can be applied to interpolating linear pathways, e.g. the fitting of spline functions. A special case of interpolation is centroid determination. In this function the GIS is able to calculate the co-ordinate location of the centre of a polygon. Extrapolation is simply the using of interpolation techniques to extend calculated trends beyond the area of specific study or interest, or beyond the range of the data held. Burrough (1986) and Martin (1991) provide comprehensive coverage of interpolation and Laurini and Thompson (1992) provide a detailed description of the various models for interpolation.
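The simple weighting idea described above can be sketched with inverse-distance weighting, one of the most basic weighting functions (kriging and spline fitting are considerably more involved):

```python
import math

def idw(x, y, samples, power=2):
    """Inverse-distance-weighted interpolation: nearer observations
    count for more than distant ones. `samples` is [(x, y, value)]."""
    num = den = 0.0
    for sx, sy, v in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0:
            return v            # exactly at a sample point
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den
```

Raising the power parameter makes the estimate more local, i.e. distant points count for even less.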
Figure 6.6 Illustration Showing How Overlaying Techniques Can Lead to Poor Integration (from Bernhardsen, 1992)
Since a fundamental concern of GIS is with spatial distances, it is frequently useful to determine what are known as buffer zones (zones of equal distance) around a point or an area, or along a line. Figure 6.8 shows how simple buffer zones are created around a point, line and polygon in the raster structure. Buffers may be generated by the GIS at any preset distance and they might represent features such as a maximum market range around a town, a legal exclusion zone, a zone of noise disturbance or a zone in which some economic rights exist. Buffer zones are clearly very useful in the compilation of many geographic models, though in reality their use may be largely restricted to isotropic surfaces.
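A raster buffer of the kind shown in Figure 6.8 can be sketched by marking every cell whose centre lies within the preset distance of the feature; here for a single point:

```python
import math

def point_buffer(rows, cols, px, py, radius):
    """Mark with 1 every raster cell whose centre lies within
    `radius` cells of the point (px, py)."""
    grid = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if math.hypot(c - px, r - py) <= radius:
                grid[r][c] = 1
    return grid
```

Buffers along lines or around polygons follow the same idea, testing each cell against the nearest part of the feature.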
Figure 6.7 A Simple Example to Show Interpolated Isolines
For discussion purposes the two processes of searching (querying) and retrieving can be viewed together, i.e. as a single process. In order that a GIS can perform any analytical procedure, it is essential that the software is able to selectively search and retrieve requisite data by as many criteria as possible. These criteria will include mapped data (lines, points and polygons) as well as attribute, numeric or textual data. The searching for data is performed using a dedicated query language, typically the Structured Query Language (SQL), and any search may be confined to a certain mapped area or to a specific theme. It is further possible to selectively retrieve data in various ways. For instance data can be classified by any theme, region or class. It is common to retrieve data using the rules of Boolean logic. Here the simple operators of “AND”, “OR”, “XOR” or “NOT” are stated to show which sort of conditions need to be met before the data is retrieved. Commands for search and retrieval using Boolean logic can be relatively complex. So, for instance, the GIS could be instructed to “find all the marine areas having a mean water temperature of >20°C, in combination with a depth of <50 metres, which are situated in the waters of both country “x” and “y” and in which quotas do not yet operate”. Complex requests like this can involve any parameters for which the data is held.
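Such a request translates quite directly into an SQL query. The sketch below builds a tiny in-memory table, with invented names and values, and then retrieves the areas meeting a simplified version of the combined conditions above:

```python
import sqlite3

# Hypothetical table of marine areas; names and values are invented
# purely to illustrate Boolean retrieval.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE areas
               (name TEXT, mean_temp REAL, depth_m REAL, quota INTEGER)""")
con.executemany("INSERT INTO areas VALUES (?, ?, ?, ?)",
                [("A", 22.0, 40, 0), ("B", 18.0, 30, 0), ("C", 25.0, 80, 1)])

# "mean temperature > 20 AND depth < 50 AND no quota in force"
rows = con.execute("""SELECT name FROM areas
                      WHERE mean_temp > 20 AND depth_m < 50
                        AND NOT quota""").fetchall()
```

Each WHERE clause condition corresponds to one of the Boolean criteria in the quoted request; spatial conditions (the waters of countries “x” and “y”) would be handled by the GIS's own spatial operators.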
Figure 6.8 Buffers Around a Point, Line and Polygon
It is the incorporation of analytical functionality which arguably distinguishes a true GIS from other forms of mapping packages. Recently there has been criticism that many GIS packages have lacked a sufficient range of analytical functions, but nowadays this would seem to be unfair in the sense that most packages have at least a limited range of such functions. Also, it is now usually a simple matter to link a GIS package to a specialist analytical software programme in which the analysis is carried out before reverting back to the GIS software for mapping. Additionally, it is not usually worth the time and financial effort involved for a software house to integrate lots of analytical functions, i.e. since most of these are only required for research purposes.
The analytical functions which most GIS software provides operate on either the spatial or the attribute data (or a combination of these). Most of the following analyses can be performed on either vector or raster structured data, though inevitably one or other of these is more efficient depending on the actual analysis being performed. Some authors have reviewed analytical capacities under headings which correspond to the types of data, i.e. point analysis, polygon analysis and linear analysis or vector analysis and raster analysis, though we have chosen to discuss the techniques under the headings of spatial, statistical and measurement analyses. Figure 6.9 shows examples of raster analytical techniques. Laurini and Thompson (1992) or Bonham-Carter (1994) provide a detailed theoretical background on how the various analyses are performed.
Figure 6.9 Some Raster Based Analytical Techniques (from Dangermond, 1983)
In Table 6.1 we gave examples of some of the important types of spatial analysis which may be performed. Here we can briefly expand upon these.
(a) Connectivity (or network) analysis is useful for determining how well connected any particular site is via any method of communications. Thus a connectivity index can be worked out which shows, for example, for all towns (nodes) in any selected area, the relative number of road or other communications or pipeline connections (links) which exist between each town and all the other towns in that area. Connectivity can also be conceived in terms of distance, cost or time, and it is useful in optimising route allocations. Figure 6.10 illustrates how connectivity is recorded for a simple set of links and nodes.
(b) Proximity and contiguity analyses are, respectively, methods of determining measures of distance between locations, and of showing a location's degree of adjacency to neighbouring locations. Figure 6.11 shows how contiguity is recorded for polygons in a small area. The creation of a buffer zone is an example of a proximity operation.
Figure 6.10 Method of Compiling a Connectivity Matrix (after Laurini and Thompson, 1992)
Figure 6.11 Method of Compiling a Contiguity Matrix (after Laurini and Thompson, 1992)
(c) Intervisibility defines, from map evidence, whether or not it is possible to have a direct line of sight between any two points on the map. Thus a calculation is made, bearing in mind the existence of high ground as shown by contours, as to whether or not hills or other high ground would obscure the line of vision.
(d) Digital terrain modelling is the process whereby it is possible, using digitised height data, to build a 3-D model of any desired area. These models may also be called 2.5-D since they only show surface heights and not true volumetric data. From the marine point of view it would be equally possible to use bathymetric data to construct visual models showing the physical appearance of selected areas of the sea floor.
(e) Location optimisation is now being widely used as a GIS based method which allows for the selection of optimum locations for the siting of any activity. This analysis is usually used by larger commercial companies when seeking, for instance, sites for new retailing outlets or for centralised distribution points. In these cases various spatially variable economic and social indicators, such as the social class structure of an area and the population density, would need to be held in a digital geo-referenced form. Similar analyses are also used by the forestry and agricultural sectors in seeking to optimise their operations, though here physical rather than economic criteria might be more important.
(f) Trend surface analysis is a method for establishing whether a generalised spatial surface exists, i.e. one which may be obscured by a mass of detail in the real world. For instance, in any one country there may be an overall “wealth” surface which trends perhaps from east to west but which could well be obscured by numerous pockets of prosperity or poverty. From the marine viewpoint, it is quite likely that trend surfaces would exist with regard to the distribution of particular species, i.e. such that they would gradually decline outwards from a biologically optimum area but in an irregular, and thus perhaps obscured, way. The fitting of a trend surface therefore becomes a useful way of identifying spatial anomalies, i.e. points or areas which are above or below the general trend.
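The compilation of a connectivity matrix, as described in (a) above and illustrated in Figure 6.10, can be sketched directly. The node labels and links below are invented:

```python
# Invented nodes (towns) and two-way links (roads) between them.
nodes = ["a", "b", "c", "d"]
links = [("a", "b"), ("b", "c"), ("b", "d"), ("c", "d")]

index = {n: i for i, n in enumerate(nodes)}
matrix = [[0] * len(nodes) for _ in nodes]
for u, v in links:
    matrix[index[u]][index[v]] = 1
    matrix[index[v]][index[u]] = 1   # links run in both directions

# A simple connectivity index: the number of direct links per node.
degree = {n: sum(matrix[index[n]]) for n in nodes}
```

The same matrix, with link weights replacing the 1s, supports connectivity conceived in terms of distance, cost or time.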
As with the spatial analyses, there is a huge range of statistical functions which any individual GIS might be able to perform. Since many of these functions are not particular to GIS, i.e. they are commonly performed by statistical packages, or spreadsheet or database packages, then we need not elaborate on them here, except to mention that it is possible to link many existing statistical packages with GIS software as a means of executing statistical analyses. These include simple descriptive statistics showing measures of centrality, frequency analyses or measures of dispersion, plus more complex correlations and multi-variate analyses. Many GIS's are now capable of performing some complex spatial statistical analyses such as spatial autocorrelation and nearest neighbour analysis. Spatial autocorrelation is used to provide a measure of contiguity between areas, e.g. Figure 6.12 shows the theoretical range of possible autocorrelations. If a marine species were shown to be distributed in any of these ways, then we might need to seek explanations. Nearest neighbour analysis provides a relative measure of the dispersion of points in a given area, i.e. they may tend towards clustering, randomness or a uniform spread. There is now some recognition of the importance of geostatistical capabilities as a functional tool within GIS's (Thomas, 1991).
Figure 6.12 The Possible Range of Spatial Autocorrelation
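Nearest neighbour analysis reduces to comparing the observed mean nearest-neighbour distance with that expected under complete randomness. A sketch of the Clark-Evans style index, ignoring edge corrections:

```python
import math

def nearest_neighbour_index(points, area):
    """Nearest-neighbour index R: values near 0 suggest clustering,
    near 1 randomness, and towards 2 a uniform spread (edge effects
    are ignored in this sketch)."""
    n = len(points)
    d_obs = sum(min(math.hypot(x - qx, y - qy)
                    for qx, qy in points if (qx, qy) != (x, y))
                for x, y in points) / n
    d_exp = 0.5 / math.sqrt(n / area)   # expected under randomness
    return d_obs / d_exp
```

For a perfectly regular square grid of points the index comes out at 2.0, towards the uniform end of the range.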
Under this heading, a GIS will be capable of performing a large number of operations on one or more data layers. Measurement will vary from simple counts (enumeration), to measuring linear or curvilinear distances, to calculating areas, perimeters or volumes or to recording directional or angular measurements. The derived measurements can then form the basis of further work using the GIS software, e.g. tabular or graphical displays. Clearly certain data structures will lend themselves more easily to different types of measurement. Thus it will be a simple task to compute area if a raster format is used, whilst distance measurements can be more accurate using vector data, if only because the central point of a pixel (in the raster structure), from where distances are measured, may not be the true starting point.
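With vector data these measurements come straight from the co-ordinates; a sketch of area via the shoelace formula and line length as summed segment distances:

```python
import math

def polygon_area(vertices):
    """Planar polygon area via the shoelace formula."""
    s = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0

def line_length(points):
    """Total length of a polyline as the sum of its segments."""
    return sum(math.hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

In the raster structure, by contrast, area is simply the cell count multiplied by the cell area.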
A major anticipated use for any GIS will be to display the data, i.e. the display capacity will represent the output from the system as presented initially on the VDU. A fundamental strength of the GIS concept is that output can be displayed at any stage in the processing of the data. So the GIS provides the facility for maps to be incrementally built up, with desired modifications being possible at any stage. Modifications might be in terms of changes to the data inputs to the map, or in terms of the visual representation of the map. So the GIS user can control, review or experiment at any stage in order to achieve a meaningful final output. All good GIS software will have a range of graphic display features to control factors such as label size, fonts, colour or shading ranges, line widths, symbolism, map feature positions, etc. And the format of the display is not confined to maps - it may be in graphical, tabular or textual forms. The early 1990's have witnessed a huge emphasis in the GIS field on the perceptual science of “visualization” - how we look at maps, what information is being conveyed, how people may each view the same mapped scene differently, different ways of communicating information, etc. For those interested further in visualization we would recommend Earnshaw and Wiseman (1992), Bonham-Carter (1994) or Hearnshaw and Unwin (1994).
The data display itself can be temporary or permanent. Temporary display is that which is captured on the VDU. It is the functional user interface in that the VDU shows the results of any commands which have been given via the keyboard, and interactive experimentation can then take place at no cost, at great speed and in an almost infinite variety of ways. Only when the user is satisfied with any temporary screen view need permanent output be obtained. Permanent output is usually by means of hardcopy display, as obtained by the use of any of the variety of devices described in Section 5.2.4, though it may also be permanently saved to an internal hard disk, or to some form of transferable disk or tape, or it may be sent to an alternative location via networking facilities. Hardcopy display is usually output to paper or film and may be in black and white or multi-coloured. The display will vary in quality as a function of the GIS capability, the detail of the data, the scale of mapping, the quality of paper, the use of vector or raster structuring or, most importantly, with the quality of the output hardware being used and the printing resolution (in dpi) to which it is set. The best quality digital output is now superior to that which can be achieved by manual methods.
In the near future it is likely that GIS output, and indeed its full range of functionality, will be capable of display and/or use aboard suitably equipped fisheries vessels. Many vessels now have quite sophisticated navigation systems which utilize electronic charts in conjunction with radar and plotting facilities. These can display a variety of static information covering bathymetry, navigational features, land masses, restrictive areas, etc, plus the tracks of moving vessels. It will be a simple progression to extend this functionality so that other desirable layers will be capable of being integrated and displayed in an interactive mode, i.e. such that the vessel has on-board ability to perform a range of required GIS functions.
Space prohibits a detailed look at databases or database management systems (DBMS), but further details relating DBMS to GIS can be found in Martin (1986), Austin (1989), Maguire (1989), Dale (1990), Date (1990), Batty (1992), Howells (1993) and Laurini (1994), and many other texts are specifically written on this topic. An introduction to GIS databases was given in sections 4.2 and 4.3, and we remind readers here that a database may be defined as a large collection of related data which has been structured in an ordered way but which is independent of any particular application. For GIS purposes this collection of data may be stored externally in digital form in a purpose-created computer software database package, e.g. Oracle or dBASE, or internally in a database which forms part of the GIS software. Attribute data is frequently stored externally whilst geographic data is more commonly stored within the GIS software.
Distributed databases are also commonly used in GIS, and are becoming more so. In this case the data may be held in disparate sources within or outside an organisation. Obviously, if a GIS has access to such databases then a vast range of extra data becomes available to the system. In the fisheries context, it is easy to envisage that a GIS could offer greater functionality if it could have direct access to oceanographic, meteorological and perhaps environmental databases, all of which are likely to be situated separately from the fisheries GIS. Laurini (1994) provides a useful overview of distributed databases, plus the many problems to be overcome before their use can become more universal. Anon (1993b) provides detailed information on how a major database, the Regional Maritime Database (BDRM), is being built to cover the marine areas along the West African coastline. This database will be capable of being accessed by any or all of the 10 participating countries, and all information is being specifically geo-referenced so that GIS functionality can be ensured.
A DBMS is a computer programme for creating, maintaining and accessing digital databases. There are a large number of commercial packages available for doing this. The DBMS provides the essential link between the GIS software, external data sources or graphics enhancing packages and any operations which the user might wish to perform. DBMS can work with different data types such as characters, numerals or dates; they have languages for describing or manipulating the data or for querying the database for particular pieces of information; they provide programming tools and they have particular file structures. Table 6.2 lists the principal features which a good DBMS should be capable of performing.
Table 6.2 The Main Features Required of a Database Management System

* To create databases which are in a carefully structured and consistently logical format.
* To create new databases.
* To extract data from the database in a variety of ways.
* To execute any commands consistently and reliably.
* To display data as required.
* To edit data in any requisite way.
* To sort data.
* To allow for the transfer of data between various software packages.
* To protect data against loss, unauthorised entry, copying and destruction.
* To protect against any inconsistencies which may result from multiple simultaneous use of the database.
* To be independent of particular hardware needs.
The reasons for having a DBMS are fairly basic. Thus firstly, it is obvious that the data needs to be maintained in the sense of being kept up-to-date, correct, properly ordered, properly structured, etc. It also has to be managed so that it is properly understood, and that it is available how, when, where and to whom it is required. Storage structures must be constantly monitored so as to minimise storage space and to maximise searching efficiency. And since most databases are constantly growing, new fields might need adding or indeed old fields might need deleting. A database manager is usually assigned who can not only manage all of the above, but who can also regulate access to the database, cope with systems failures when they occur and link databases with external databases if required. He or she may also be required to cope with the legal aspects of data management, i.e. making certain that inaccurate decisions are unlikely to be made as a consequence of using the data, and making sure that copyright infringements are not made.
For a DBMS to be most effective it is important that the data is stored in an orderly way. There are basically four types of DBMS structure: hierarchical, networked, relational and object oriented.
(a) Hierarchical. In the hierarchical model each record can have a number of links to lower “levels”, but only one link to a higher level. The highest link is the “root”, records at lower levels are called “children” and those at the level above are called “parents”. Figure 6.13 shows a typical hierarchical data model for a hypothetical mapped area. Hierarchical data structures are easy to understand and to update or expand, and they are useful where up-and-down searching is required, but they perform poorly where horizontal searching is carried out, i.e. where it might be necessary to locate all records at a given level, since there are no connections between records at the same level.
Figure 6.13 A Hierarchical Database Structure Based on a Simple Map (after Bernhardsen, 1992)
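The one-parent, many-children arrangement just described can be sketched in a few lines of Python. This is purely a conceptual illustration, not the internal structure of any real DBMS, and the map features used (“Land”, “Sea”, “Road”, etc.) are hypothetical, loosely echoing Figure 6.13.

```python
# A minimal sketch of a hierarchical record: one link upwards (parent),
# any number of links downwards (children). All names are hypothetical.
class Record:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent            # single link to the level above
        self.children = []              # links to the level below
        if parent is not None:
            parent.children.append(self)

# Build a tiny hierarchy for a hypothetical mapped area.
root = Record("Map")
land = Record("Land", root)
sea = Record("Sea", root)
Record("Road", land); Record("Forest", land)
Record("Shallow", sea); Record("Deep", sea)

# Up-and-down searching is cheap: simply follow the single parent link.
assert sea.parent is root

# Horizontal searching is not: finding all records at one level means
# visiting the whole tree, since siblings in different branches are
# not linked to each other.
def records_at_depth(node, depth):
    if depth == 0:
        return [node.name]
    return [n for c in node.children
              for n in records_at_depth(c, depth - 1)]
```

Calling `records_at_depth(root, 2)` has to descend through every branch just to list the four lowest-level records, which is exactly the weakness noted above.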
(b) Networked. This is similar to the hierarchical data model, but here it is possible to have more than one parent, and thus many-to-many relationships can exist. Figure 6.14 shows that a networked database structure can be analogous to a communications network in which there may be many linkages between any combination of centres. This type of data structure makes good use of the available data, with rapid connections being possible, but it is difficult to create and maintain. Both hierarchical and networked structures are now seldom used in GISs.
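The defining feature of the networked model, a record with more than one parent, can be sketched as follows. Again this is a conceptual illustration with hypothetical names: a boundary line stored once but shared by two adjacent polygons.

```python
from collections import defaultdict

# Sketch of a networked structure: each record lists *all* of its
# parents, so many-to-many links are possible. Names are hypothetical.
parents = {
    "SharedBoundary": ["PolygonA", "PolygonB"],  # two parents
    "PolygonA": ["Map"],
    "PolygonB": ["Map"],
}

# Derive the downward (children) links from the upward (parent) links.
children = defaultdict(list)
for child, parent_list in parents.items():
    for parent in parent_list:
        children[parent].append(child)

# The shared boundary is stored only once, yet is reachable from both
# polygons - the rapid many-to-many connection described above.
assert "SharedBoundary" in children["PolygonA"]
assert "SharedBoundary" in children["PolygonB"]
```

Keeping both the upward and the derived downward links consistent as records are added or removed hints at why such structures are "difficult to create and maintain".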
(c) Relational. Here data is organised in a series of tables, each of which contains one type of record. The rows of a table correspond to records and the columns to the fields of those records. The tables in the database are linked by a common field, otherwise called a unique identifier (or key attribute). Data is extracted from the database by defining the relationship appropriate to the query being asked; this may well involve the use of relational algebra operations to construct new tables where required. Figure 6.15 shows an example of the use of a relational database as it might apply in a fisheries situation.
Relational databases are very flexible and are easy to establish, use and maintain. Almost any relationship can be worked out by the use of Boolean logic and mathematical operations, and additional data can easily be added to the database. The main drawback lies in the fact that searches can be very time consuming, given the calculations involved and the large numbers of tables which might need to be consulted. The Structured Query Language (SQL) has been developed as the standard language for use with relational DBMS, and this kind of database is the one most frequently used by GISs. Floen et al (1993) describe in detail how a relational database has been set up at the Institute of Marine Research in Bergen, Norway, to access, query and manipulate all of their data holdings.
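A small, self-contained sketch in the spirit of Figure 6.15 can make the table-and-key idea concrete. It uses Python's built-in `sqlite3` module purely as a convenient relational engine; the table layout, vessel names and figures are all hypothetical.

```python
import sqlite3

# Two tables linked by a common field (the key attribute "port_id").
# All names and values are hypothetical illustrations.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE ports (port_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE vessels (vessel_id INTEGER PRIMARY KEY, name TEXT,
                          length_m REAL, port_id INTEGER REFERENCES ports);
    INSERT INTO ports VALUES (1, 'Bergen'), (2, 'Aberdeen');
    INSERT INTO vessels VALUES (10, 'Nordlys', 24.0, 1),
                               (11, 'Havglimt', 18.5, 1),
                               (12, 'Granite Isle', 30.0, 2);
""")

# An SQL query defines the relationship appropriate to the question
# being asked - here: all vessels over 20 m, with their home ports.
# The JOIN on the shared key effectively constructs a new table.
rows = con.execute("""
    SELECT v.name, p.name
    FROM vessels AS v
    JOIN ports AS p ON v.port_id = p.port_id
    WHERE v.length_m > 20
    ORDER BY v.name
""").fetchall()
```

The query returns each qualifying vessel paired with its home port, even though vessels and ports are stored in separate tables; only the shared `port_id` field ties them together.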
Figure 6.14 A Networked Database Structure Based on a Simple Map (after Bernhardsen, 1992)
(d) Object Oriented. This is a new type of DBMS which is slowly being developed, on a world-wide basis, mainly under the guidance of the Object Management Group (Leach, 1993). This group intends to make object oriented DBMS the norm for the future by ensuring that they are extremely efficient and that there is a world-wide common framework for their development, so that they can work across all environments, on all hardware platforms and under any operating system. These databases take a sophisticated view of geographical entities, and as such object oriented DBMS are conceptually difficult to explain and to understand. Basically, all entities are described in terms of three fundamental concepts: object, class and inheritance. For instance, geographic objects might be “road”, “port”, “sea”, etc. Each object can then in turn have a class such as “Secondary (road)”, “Fishing (port)” or “Shallow (sea)”, and classes may also have sub-classes. Each object can also be defined as having certain properties, e.g. a fishing vessel may have an owner, a value, a size, etc., and each object may have certain functions (operations or methods) which can be performed on it, e.g. select, measure, locate, modify or draw. Inheritance means that when a class is added to an object, all aspects of that class can potentially be included in the new description, e.g. all aspects pertaining to “Fishing” can potentially be linked to the class “Fishing port”. For those seeking a more detailed explanation of object-oriented DBMS, Laurini and Thompson (1992) or Cooper (1993) give reasonably easy definitions.
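The object/class/inheritance idea translates naturally into an object-oriented programming language, which is where these DBMS take their model from. The sketch below is a conceptual illustration only, not the interface of any real object-oriented DBMS, and all class and attribute names are hypothetical.

```python
# A sketch of object, class and inheritance as described above.
# All names are hypothetical illustrations.
class GeographicObject:
    def __init__(self, name, x, y):
        self.name, self.x, self.y = name, x, y   # properties
    def locate(self):                            # an operation (method)
        return (self.x, self.y)

class Port(GeographicObject):                    # the class "port"
    pass

class FishingPort(Port):                         # sub-class "Fishing (port)"
    def __init__(self, name, x, y, fleet_size):
        super().__init__(name, x, y)
        self.fleet_size = fleet_size             # property specific to fishing ports

bergen = FishingPort("Bergen", 5.32, 60.39, 120)

# Inheritance: everything pertaining to a geographic object (here the
# "locate" operation) is automatically available to a fishing port.
assert bergen.locate() == (5.32, 60.39)
assert bergen.fleet_size == 120
```

The point of the sketch is the last two lines: a `FishingPort` object needs no code of its own for `locate`, because it inherits that operation from the classes above it.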
Figure 6.15 The Use of Relational Database for Integrating Fishing Vessels and Home Ports
Object oriented databases provide a vehicle for representing data in a form similar to that of the real world. The fact that data is realistically modelled makes user interaction easy, and the DBMS is also easy to update and therefore to maintain. There are currently many problems which must be overcome before these databases become widely adopted (Batty, 1992), the main one being the lack of a standard object-oriented query language (Kufoniyi, 1995). However, whilst in the past relational databases have been most favoured by GIS users, the move is now towards the benefits of object orientation, and this is likely to continue.