Previous Page Table of Contents Next Page


6. DATABASES AND DATA MANAGEMENT

This section introduces the concept of a relational database, illustrated by a hypothetical fisheries example.

A database is used to store the “raw data”. That means that pre-processing of data should not occur. For example, the interview data to be entered in the database are exactly the same values as those written in the interview forms by the enumerator. All processing of data should be executed by the database system. If processed data are entered, the tracing of errors - validation of data - becomes difficult if not impossible. In addition, re-analysis of pre-processed data may be impossible.

It is important to develop the database simultaneously with the development of the data collection programme, as the database acts as a check on the consistency of the data collection programme. The database will show if the programme actually produces the expected output. The database will also help in the search for the best stratification.

A “database” consists of “Tables”, “Forms” and “Reports”. These three main components are linked by aid of “Relations” and “Queries”. Below follows a very brief introduction to these basic concepts. For a more comprehensive introduction, the reader is referred to textbooks on databases (e.g. Arte 1988 and Date 1995).

6.1 TABLES, RELATIONS AND QUERIES

The “tables” are the containers of the data, the “forms” are the tools used to interact with the database and the “Reports” are the output produced by the database. A table has “rows” (records) and “columns” (fields). As an example, a part of the list of province names is shown, which has 2 columns.

NameLetter
Quang NinhQN
Hai PhongHP
Thai BinhTB
Nam HaNH
.........................
.........................
.........................
Kien GiangKG

The number of “Records” equals the number of coastal provinces of Viet Nam. A row is called a “record”, and each of them contains 2 “fields”.

You may also consider the fields as the columns. The fields of the records in “Province Name Table” are:

“Name” = The name of the province

“Letter” = The code letters for the province.

Tables A and B below contain some basic fisheries landing statistics, which might originate from the trip-interview forms.

The fields in the records are given “names” and the fields of these two tables are:

TABLE A: TRIP DETAILS AND TOTAL LANDINGS:
NAME OF FIELD
Field type
Vessel Registration CodeKey, text
Date of samplingKey, date
Name of Landing PlaceKey, text
Type of Landing PlaceText
Name of GearText
Total landings in KgNumber

TABLE B: LANDINGS BY COMMERCIAL SPECIES GROUP
NAME OF FIELD
Field type
Vessel Registration CodeKey, text
Date of samplingKey, date
Name of Landing PlaceKey, text
Commercial Species GroupKey, text
Weight of commercial species groupNumber

TABLE A: FISHING TRIP DETAILS AND TOTAL LANDINGS

Vessel Registration Code
(Key)
Date of sampling
(Key)
Name of landing place
(Key)
Type of landing place
(text)
Name of gear
(text)
Total landings in Kg
(number)
QN471111-3-1997Hong GaiJettyBottom Trawl132
NA999015-3-1997Hong GaiJettyBottom Trawl423
NA999020-3-1997Cua LoBeachBottom Trawl197
KH234517-3-1997Xom BongQuayTuna Long Line530
KH190017-3-1997Xom BongQuayTuna Long Line401
MH123424-3-1997Cua LoBeachGillnet514

TABLE B: LANDINGS BY COMMERCIAL SPECIES GROUP

Vessel Registration Code
(Key)
Date of sampling
(Key)
Name of Landing Place
(Key)
Commercial Species Group
(Key)
Weight of Landings
(number)
QN471111-3-1997Hong GaiSmall fish75
QN471111-3-1997Hong GaiLarge fish32
QN471111-3-1997Hong GaiSmall shrimps10
QN471111-3-1997Hong GaiLarge shrimps15
NA999015-3-1997Hong GaiSmall fish223
NA999015-3-1997Hong GaiLarge fish200
NA999020-3-1997Cua LoSmall fish97
NA999020-3-1997Cua LoLarge fish75
NA999020-3-1997Cua LoSmall shrimps13
NA999020-3-1997Cua LoLarge shrimps12
KH234517-3-1997Xom BongSmall fish230
KH234517-3-1997Xom BongLarge fish300

The fields are of different data types. We shall come back later to this topic, but here we shall only note that some fields are “key-fields” and other are “non-key-fields”. Bold Italics indicate the key fields. The combination values of a set of keys (i.e. three key fields in Table A and four key fields in Table B) is unique. That means that no two records in the database can have the same combination of (Vessel Registration Code, Date of Sampling, Name of Landing Place).

This example also illustrates the concept of “Relations” and “Queries”. Here the relation between the two tables is that for each combination of (Vessel Registration Code, Date of sampling, Name of Landing Place) in Table A, there are zero, 1 or more records in Table B with the same first three keys, namely one for each observation of the fourth key, Commercial Species Group.

A “query” is (for example) to join the two tables together into one big table containing all the data from a trip in one row (see below). The query in this case matches the records, which have the first same value of the three first keys. Furthermore, it groups the records so that rows with identical first three keys are together. This is just one example of what a “query” can do. When designing a data collection programme, it is important that the database designers are involved, to secure compatibility of database and interview forms.

Vessel Registration CodeDate of samplingName of landing placeType of landing placeName of GearTotal landings in Kg
Commercial Species GroupWeight of Landings
(Key)(Key)(Key)(Text)(Text)(Number)(Key)(Number
QN471111-3-1997Hong GaiJettyBottom Trawl132Small fish75
QN471111-3-1997Hong GaiJettyBottom Trawl132Large fish32
QN471111-3-1997Hong GaiJettyBottom Trawl132Small shrimps10
QN471111-3-1997Hong GaiJettyBottom Trawl132Large shrimps15
NA999015-3-1997Hong GaiJettyBottom Trawl423Small fish223
NA999015-3-1997Hong GaiJettyBottom Trawl423Large fish200
NA999020-3-1997Cua LoBeachBottom Trawl197Small fish97
NA999020-3-1997Cua LoBeachBottom Trawl197Large fish75
NA999020-3-1997Cua LoBeachBottom Trawl197Small shrimps13
NA999020-3-1997Cua LoBeachBottom Trawl197Large shrimps12
KH234517-3-1997Xom BongQuayTuna Long Line530Small fish230
KH234517-3-1997Xom BongQuayTuna Long Line530Large fish300
KH190017-3-1997Xom BongQuayTuna Long Line401No dataNo data
MH123424-3-1997Cua LoBeachGillnet514No dataNo data

6.2 FORMS OF DATABASE

The “forms” are the tools for all communication with the fisheries database. The “soft-forms” are the screen pictures through which you communicate with the database.

It is important that the forms are designed so that the data entry is “friendly” to the encoder. The paper interview forms should not be too different from the “soft form”. The encoders with their accumulated experience should contribute to the design of the database forms. Forms could also be developed for “palm-top” computers for direct field entry.

The forms are composed of “controls” and “text-boxes”. The “controls” are the tools used for communication with the computer and the “text-boxes” are used for giving information to the user. Controls may be the “scroll-bars” and the “toolbars”. When you click on a control, it is in “focus”. Only when a control is in focus you can use it.

This section, as an example, shows a small collection of forms from the “VIETFISHBase”. The commercial software behind VIETFISHBase is MS ACCESS, the database of Microsoft Office. Note that this particular choice of commercial software does not imply any recommendation on its suitability for fisheries databases from the side of the author or FAO. Many other databases could be used1.

The first form you see when starting the system is the “logon form” (Fig. 6.2.1), where you tell who you are, and then the system checks your status as user (i.e. the privileges you have as a user). For example, only a restricted number of operators will possess the right to change the program or to change the “look-up-tables”. The password is a secret code, only known by the user and the administrators. After the log-in form, you enter the main menu of the database (which is not shown). There are several items in the main menu of VIETFISHBase, which will not be discussed here. In this example, we shall use only one item from the main menu, namely the interviews main form, which is used for entry of data from hard forms or editing existing data. As an example of a well-designed data entry form, the “interview form” of VIETFISHBase is shown (Fig. 6.2.2).

Figure 6.2.1

Figure 6.2.1 Log-on form

The controls for data entry are either “boxes” where you can enter data from the keyboard, or they are “list-boxes” where you can select the data from a “look-up-table”.

The “interview form” is shown in Figure 6.2.2 as it appears on the computer screen. The interview form and its sub-forms are designed to handle all the data from a “fishing trip”. Only part of the elements of the interview form will be explained, as the purpose here is not to give a complete introduction to VIETFISHBase, but to use it as example of form design.

1 MS Access is only suitable for relatively small databases. You should review the software available with expert help and choose one suitable for the data collection programme envisaged.

Figure 6.2.2

Figure 6.2.2 The interview form of VIETFISHBase.

In the design of this interview form, it has been attempted to include as much as possible in one screen, without making the form too complex. Furthermore, it has been attempted to make the data entry as safe as possible, by using look-up tables to fill in the fields, whenever possible.

The interview form is the first form in a hierarchy of forms, and to enter the complete set of interview data, the forms in the following levels of the hierarchy must be activated.

The interview form is composed of one main form, “Sample information” and five sub-forms:

  1. Vessel information (which fishing vessel and which fishing gear);

  2. Spatial information (where did fishing take place, and where did landing take place);

  3. Effort information (what was the fishing effort, e.g. days away from port);

  4. Number of discard operations (what was discarded);

  5. Landings (what was the total weight of the landings).

The five sub-forms cannot be filled in before the sample information form is filled in. It is possible to omit the sub-forms, although at least one of the sub-forms, the Vessel information sub-form, must be filled in to make the record meaningful.

It may not be straightforward to find a specific interview in a set of interviews. (There were 34088 interviews in the database when this material was selected for illustration.)

The database has a special facility to search for a specific interview, or to “filter” the interviews, leaving only a sub-set of interviews, which satisfy certain conditions, for example, only interviews from Hai Phong province. You activate the search facility by the “binocular-control”, depicted on the left.
The “Copy” record is useful, because many of the data a particular encoder enters are the same for subsets of records. For example, the name of the landing place remains the same, for all interviews from that landing place. The “copy record” allows the encoder to change the data that differs between records, rather than type the record in again each time.
The delete control deletes the interview record.

Sample information form: This form does not contain any direct information on the fishery or the landings, only information on when, where and by whom the data sample was collected.

The encoder will not enter the name of the province from the keyboard, but selects it from a look-up table. The names of the enumerator and the encoder are also selected from look-up tables. Thus, only encoders and enumerators in the look-up table can be entered in these fields. This check of encoders is necessary for the security of the database.

Vessel information sub-form: This form is used to identify the vessel(s), which performed the fishing trip, and the gear(s) it used. The first information to enter is the registration number of the fishing vessel, therefore this form may also be used for control and updating of the central vessel register. It also alklow for entry of data on the gear.

Spatial information sub-form: This contains the name of the landing place and the location of the fishing ground, either by name or in the form of a “statistical rectangle” (rectangles of 30×30 = 900 square nautical miles) or subdivision of a statistical rectangle (9 divisions of each 100 square nautical miles). All statistical rectangles have a code. Also the depths (max and min) at the fishing grounds can be entered.

The landing place is selected from a look-up table, which also contains information on the position of the landing place (the division of a statistical rectangle) so that this combined with the fishing ground can be used to compute the distance between the two. The fishing grounds are also selected from a look-up table. All fishing ground names in this look-up table are defined by another table, which contains information on which statistical rectangles (or divisions) correspond to a fishing ground.

Effort information sub-form: The effort measurement may be of several types, but the most common one is the number of days away from port. In addition, the level of fishing activity (the effort per month), here measured as the number of days actively fishing during the last month, is entered in this sub-form. The interviews should preferably contain both the fishing days and the fishing activity during the last month.

Number of discard operations sub-form: “Discards” are catches not landed but returned to the sea because of limited hold capacity or low value of the catch. The discard information is given by the type of discard (all fish discarded, only low value discarded, etc.) and the number of discard operations “Dsc.Oper”, (say trawl hauls) and the estimated % of total weight “Dsc Weight”, which is discarded.

The interview form is the first in a hierarchy of three forms to record the landings, at various levels of details. Clicking the mouse on the button “Commercial group Details” brings you the pop-up sub-sub-form for entry of information about the commercial groups.


Previous Page Top of Page Next Page