|
Ray Wurlod |
Master Class |
|
ABN 57 092 448 518 |
|
|
Education and Consulting Services |
UniVerse® |
UniVerse File Dictionaries
Relationship
Between Dictionary and Data Portions
Associated
Multi-Valued Fields
Storage Locations in Dictionary
Creating
Entries in Prime Style Dictionaries with REVISE
Creating Entries
in Pick Style Dictionaries with REVISE.
Creating A or
S type Entries in Prime Style Dictionaries with REVISE
Creating Dictionary
Entries with ED
Creating
Dictionary Entries with SQL
Having completed this master class, the student will be able:
Prerequisites
It is assumed that the reader is familiar with the RetrieVe query language, the data entry/update tool REVISE and the UniVerse line editor ED.
Notices
UniVerse® is a registered trademark of International Business Machines Corporation.
Windows® is either a registered trademark or a trademark of Microsoft Corporation in the United States and/or other countries.
UNIX® is a
registered trademark in the United States and other countries licensed
exclusively through X/Open Company Limited.
Other
company, product, and service names used in this publication may be trademarks
or service marks of these companies or others.
When a UniVerse file is created using command defaults, three things come into existence. These are the data portion of the file, the dictionary portion of the file, and the VOC entry that describes the physical location of the other two, as well as specifying the name that users use to access that file.
The data portion of the file, for most file types, contains nothing but data. The exception is the B-tree (Type 25) files, in which some navigational information is stored. Distributed (Type 27) files are not created but, rather, defined as being made up from one or more extant part files.
A UniVerse file dictionary can contain a number of different types of entry, but these can be grouped into four classes:
It is possible, by using keywords to override the command defaults, to create only the data portion or only the dictionary portion of a UniVerse file. It is also possible, by copying the VOC entry, to create a synonym for a UniVerse file name. These techniques are beyond the scope of this paper.
If a UniVerse table is created using a CREATE TABLE statement, the underlying file dictionary is automatically populated with entries that describe the actual data. If a UniVerse file is created using a CREATE.FILE statement, only one entry is added to the file dictionary. The name of this entry is @ID, which is the default name for the file's key (primary key) field.
The purpose of this paper is to describe in detail the various types of entry that can be found in a UniVerse file dictionary, and to show different ways in which these can be created and maintained.
In hashed files and B-tree files, non-key data are stored in a structure called a dynamic array. This is an array containing an arbitrary number of elements each of arbitrary length. There is one dynamic array per record. Physically, a dynamic array is simply a character string with special, reserved, delimiter characters used to separate the elements. The key value is not stored in the dynamic array, because the key value is needed to find the dynamic array. (The key is stored with, but not in, the dynamic array.)
In Type 1 and Type 19 files, which are operating system directories, data records are operating system files which are also dynamic arrays, except that the field delimiters are "end of line" characters. The record key is the operating system file name.
Thus, a data record that contains four non-key data fields (or columns) will be a dynamic array containing three delimiters. This could be represented as follows.
8808¶3456¶112¶7
In this example, the ¶ character has been used to represent the standard delimiter. The actual character is a non-printing character called a "field mark" or "attribute mark" (the two terms are synonyms in this context). It can be referred to using the system variable @FM (or @AM).
Note that you have no way, by simple inspection, of knowing what the data in any particular field means; it is the function of the file dictionary to store the metadata that allow for this.
UniVerse supports the concept of some fields being "multi-valued"; that is, a single field contains a list of zero or more values. This could be represented as follows:
|
8808 |
3456 |
112 |
7 |
|
418 |
4 |
||
|
704 |
1 |
Here, field number 3 and field number 4 each contains a list of three values. If there exists a one-to-one relationship between the values in each of these lists, the multi-valued fields are said to be "associated". Physically these must be stored in a single character string, so that the physical storage model uses another delimiter character, called a "value mark".
8808¶3456¶112§418§704¶7§4§1
In this example, the § character has been used to represent the "value mark" delimiter character. It can be referred to using the system variable @VM.
Further nesting is possible. An individual value in a multi-valued list can have sub-values (delimiter character "sub-value mark", @SM), then text items ("text mark", @TM), and so on. Theoretically UniVerse supports an eight dimensional data model but, practically, humans have problems conceptualising beyond three.
The file dictionary entry provides two things for a particular field; how it is stored and its default output characteristics.
To define how a field (or attribute) is stored is simply a matter of identifying its field (attribute) number. For "D" type, "A" type and "S" type dictionary entries, this is an integer value in field number 2 of the dictionary entry. For virtual fields (columns), this is an expression (in field number 2 of an "I" type dictionary entry) or correlative (in field number 8 of an "A" type or "S" type dictionary entry).
The remaining visible fields in the dictionary entry specify defaults for how the field is to be displayed, should it be selected in a query and these defaults not overridden by field qualifier keywords. Aspects of this display information include:
Field qualifier keywords (EVAL, CONV, COL.HDG, FMT, ASSOC, ASSOC.WITH, AS, SINGLE.VALUED, MULTI.VALUED or their synonyms or SQL variants) can be used in queries to override any of these except SQL data type.
Alternate dictionary entries may refer to the same item of data, but with different specification as to how it is to be displayed.
UniVerse supports two "styles" of file dictionary, which may be termed "Prime" style and "Pick" style. The following table summarises the differences between the two.
Table 1: Prime Style versus Pick Style Dictionaries
|
Characteristic |
Prime Style |
Pick Style |
|
Definition of actual data |
D type |
A type or S type |
|
Definition of virtual data |
I type |
A type or S type with correlative |
|
Associated multi-valued fields |
Association phrase |
Controlling and dependent attributes |
|
Dictionary file name |
D_filename |
P_filename |
|
"Dictionary of dictionaries" |
DICT.DICT |
DICT.PICK |
In UniVerse file dictionaries you can, in general, mix and match Prime style and Pick style items, though doing so can lead to messy dictionary listings. The exception to this statement is that virtual field definitions of one style cannot use, as components, virtual field definitions of the other style.
Both styles of dictionary support phrase type and storage type entries.
By default, the Prime style dictionary is used in RetrieVe queries. In many sites using Pick, Reality or IN2 flavors, "A" and "S" types are created in Prime style dictionaries to take advantage of this. To require RetrieVe to use a Pick style dictionary, a USING clause is required. For example:
LIST filename USING PDICT filename
A Pick style dictionary can most easily be created for an existing file using the PDICT keyword. For example:
* Ideal, PI/open or Information flavor
CREATE.FILE PDICT filename 3 1 2
* Pick, Reality or IN2 flavor
CREATE-FILE PDICT filename 1,2,3
The resultant VOC entry contains the pathname of the data portion in field number 2, the pathname of the Prime style dictionary portion in field number 3, and the pathname of the Pick style dictionary portion in field number 5. (Field number 4 can contain "M" for multi-level data files.) For example:
id: filename
0001: F Description of file
0002: filename
0003: D_filename
0004:
0005: P_filename
A default listing of the contents of a file dictionary may be produced using the LIST verb followed by the filename preceded by the DICT keyword.
LIST DICT filename
A different format of listing can be had by preceding the filename with the PDICT keyword. This format is more suited to Pick-style dictionary definitions. It requires that a Pick-style dictionary has been created for the file, and recorded in field number 5 of the file's VOC entry.
LIST PDICT filename
In both cases a default listing is produced. The fields to be included in the report are specified in the global "dictionary of dictionaries" which, for the DICT keyword, is a file called DICT.DICT and, for the PDICT keyword, is a file called DICT.PICK. Both DICT.DICT and DICT.PICK are visible from all accounts; that is, there is a VOC entry for each in every account.
It is the phrase called "@" in DICT.DICT or DICT.PICK that specifies the default output listing when a file's dictionary is requested to be listed to the screen, and the phrase called "@LPTR" that specifies the default output listing when the file's dictionary is requested to be listed to a printer using the LPTR keyword.
You can quite acceptably generate non-default dictionary listings for particular purposes, simply by constructing reports that name the fields to be displayed, etc. For example:
LIST DICT filename WITH TYPE = "I" EXP FMT "48T" DBL.SPC
UniVerse/SQL can also be used to obtain a listing of a file dictionary, again by preceding the file name with the DICT keyword. (You can use PDICT, but this does not list the Pick-style dictionary.) However, unlike the RetrieVe listing, you must explicitly specify how the report is to be sorted. For example:
SELECT * FROM DICT filename ORDER BY TYPE, KEY.CODE ;
As noted earlier, entries may exist in the file dictionary portion that describe actual data fields in the data record, virtual fields that are generated when a query is executed, and phrases that contain oft-used subsets of queries (as a labour-saving device). There may be some special phrases that are used by UniVerse itself. There may also be locations in which information pertinent to the file as a whole are stored. The format and layout of each of these types of dictionary item are discussed in this section.
In order to be accessible to queries a field in a data record must be defined with an actual data type entry ("D" type, "A" type or "S" type) in the file dictionary. A file dictionary can have more than one definition describing the same field; since it may be desired to present the same data in different ways (perhaps a wider column if the output is to go to a wider device, such as a printer).
Virtual fields are entries in the file dictionary that specify how values can be generated for display in columns of reports. They are not necessarily synonyms for actual data fields (though they may be), nor do they necessarily use actual data field values in generating their own values (though they usually do).
Phrases are fragments of sentences without verbs. Since UniVerse commands can also be called sentences, a phrase can be any fragment of a UniVerse command. Phrases stored in file dictionaries typically contain frequently used combinations of field names, selection criteria, sorting criteria, report options and output options, so that the user can type in just the phrase name. The RetrieVe query engine substitutes the phrase definition for the phrase name when parsing the query. (Phrases are not able to be used in UniVerse/SQL queries.)
The UniVerse file dictionary model includes another type of dictionary entry, which is used to contain any information that may pertain to the file as a whole. Except for the requirement that the first character in the record is an "X" (from which such storage locations are termed "X-type" entries), there is no requirement about how X-type entries should be laid out.
Actual data can be described by "D" type entries (the default for Prime style dictionaries), or "A" or "S" type entries (the default for Pick style dictionaries).
"D" type dictionary entries are up to eight fields long, and are laid out as follows.
Field Name |
Field Number |
Contents |
|
CODE |
1 |
"D" followed by optional text description of item |
|
LOC |
2 |
Field (attribute) number in which item is stored |
|
CONV |
3 |
Conversion code |
|
NAME |
4 |
Column heading |
|
FMT |
5 |
Format specification |
|
SM |
6 |
"S" for single-valued, "M" for multi-valued |
|
ASSOC |
7 |
Association phrase name |
|
DATATYPE |
8 |
SQL data type |
The "D" in field number 1 and the location in field number 2 are mandatory. A dictionary entry describing the entire key has 0 (zero) for the location.
If conversion code is omitted, no conversion is applied to the data in this field on output.
If column heading is omitted, then the field name (the key to the dictionary file) is used as the column heading.
If the format specification is omitted, "10L" (10 characters wide, left-justified) is used.
If neither "S" nor "M" appears in field number 6, "S" is assumed.
Association phrase name is only relevant if there is an "M" in field number 6, and must be the name of a phrase type in this dictionary or in the VOC file.
SQL data type is specified (in field number 8) without parentheses, for example "VARCHAR,32" rather than "VARCHAR(32)".
"A" type and "S" type dictionary entries are ten fields long, and are laid out as follows.
Field Name |
Field Number |
Contents |
|
D/CODE |
1 |
"A" or "S" followed by optional text description of item |
|
A/AMC |
2 |
Field (attribute) number in which item is stored |
|
S/NAME |
3 |
Column heading |
|
|
4 |
Controlling/dependent attribute definition |
|
|
5 |
"S" for single-valued, "M" for multi-valued |
|
|
6 |
SQL data type |
|
V/CONV |
7 |
Conversion code |
|
V/CORR |
8 |
Correlative (not used for actual data definitions) |
|
V/TYP |
9 |
Justification ("L" = left-justified, "R" = right-justified) |
|
V/MAX |
10 |
Column width for display |
The "A" or "S" in field number 1 and the attribute number (A/AMC = "attribute mark count") in field number 2 are mandatory.
If column heading is omitted, then the field name (the key to the dictionary file) is used as the column heading.
If there is nothing in field number 4, this item is assumed not to be associated with any other multi-valued field. Otherwise this field contains the definition of a set of associated multi-valued fields:
If neither "S" nor "M" appears in field number 5, "M" is assumed.
SQL data type is specified (in field number 6) without parentheses, for example "VARCHAR,32" rather than VARCHAR(32).
If conversion code is omitted, no conversion is applied to the data in this field on output.
The use of field numbers 5 and 6 in the dictionary record is a UniVerse extension to documented Pick standards.
Two ways exist to generate columns in output in UniVerse. "I" type entries contain a BASIC expression that is used to generate values. This expression is compiled, and the mini-program stored within the dictionary record. "A" and "S" type entries may contain a correlative, which is used to generate values. Correlatives are interpreted rather than compiled. The two types are incompatible: that is, an "I" type expression can not include reference to an "A" type or "S" type entry that contains a correlative, and a correlative can not refer to an "I" type.
"I" type dictionary entries are up to eight fields long, and are laid out as follows.
Field Name |
Field Number |
Contents |
|
CODE |
1 |
"I" followed by optional text description of item |
|
EXP |
2 |
Value-generating expression |
|
CONV |
3 |
Conversion code |
|
NAME |
4 |
Column heading |
|
FMT |
5 |
Format specification |
|
SM |
6 |
"S" for single-valued, "M" for multi-valued |
|
ASSOC |
7 |
Association phrase name |
|
DATATYPE |
8 |
SQL data type |
The "I" in field number 1 and the expression in field number 2 are mandatory.
Apart from these you can see that the layout is the same as for "D" type entries.
When the "I" type item is compiled, additional fields are used:
The information in these fields (20 and beyond) is binary, not text, so should not be viewed with a text editor. You can use the UniVerse line editor in "up arrow mode" to view these fields safely.
Before "I" type dictionary items can be used, the expressions in them must be compiled. This is accomplished with the COMPILE.DICT command (it has a short form, CD). If the "I" type is to be used in a query and it is detected that it has not been compiled, UniVerse will attempt to compile it automatically. However you should not rely on this behaviour, because if the first use of the "I" type is in a BASIC program, it will not automatically be compiled.
"A" type and "S" type dictionary entries containing correlatives are ten fields long, and are laid out as follows.
Field Name |
Field Number |
Contents |
|
D/CODE |
1 |
"A" or "S" followed by optional text description of item |
|
A/AMC |
2 |
Field (attribute) number in which item is stored |
|
S/NAME |
3 |
Column heading |
|
|
4 |
Controlling/dependent attribute definition |
|
|
5 |
"S" for single-valued, "M" for multi-valued |
|
|
6 |
SQL data type |
|
V/CONV |
7 |
Conversion code |
|
V/CORR |
8 |
Correlative |
|
V/TYP |
9 |
Justification ("L" = left-justified, "R" = right-justified) |
|
V/MAX |
10 |
Column width for display |
Every field in the dictionary item is the same as for "A" type or "S" type items without correlatives, except that there is a correlative in field number 8.
There are two kinds of correlative, "A" correlatives and "F" correlatives.
They are two quite distinct "languages" from each other.
Arithmetic in correlatives is strictly integer arithmetic. That is, for example, dividing the value 7 by the value 4 will yield 1 with a remainder of 3, rather than 1.75.
A phrase in a UniVerse file dictionary has the following layout:
Field Name |
Field Number |
Contents |
|
CODE |
1 |
"PH" followed by optional text description of item |
|
MACRO |
2 |
Phrase contents |
Phrases are principally labour saving devices. They are used to store commonly used sentence fragments in the file dictionary, such as:
A user who needs to replicate the contents of a phrase need only use the name of the phrase in a query; the query parser will replace the phrase name with the phrase contents, thereby reducing the amount of typing the user has to do and thus the likelihood of error.
Phrase contents can be split over multiple lines (fields) for ease of maintenance, by ending each incomplete line with a continuation character (underscore). For example:
id: ITEM.REPORT
0001: PH
0002: ID.SUP _
0003: ORDER.NO CUST.NO _
0004: PROD.NO QTY SELL ITEM.TOTAL _
0005: HEADING "Customer orders by individual item'GTG'Page 'SGLL'"
There are several special phrase names that are of particular meaning to UniVerse, which are listed in the next section. However, they have the same layout as already described.
There are several special phrases that are used by UniVerse to keep track of how particular things are to be done. The names of each of these special phrases begins with an "@" character.
Table 2: Special Phrases in UniVerse Dictionaries
|
Phrase Name |
Description |
|
@ |
Defines the fields that will be used in the output of a query if no fields are specified for display. |
|
@LPTR |
Defines the fields that will be used in the output of a query if no fields are specified for display, and the query is directed to a printer using the LPTR keyword. If there is no @LPTR phrase, the @ phrase is used. |
|
@KEY |
Specifies the names of fields making up a multi-column primary key. |
|
@SELECT |