The TurboIMAGE/XL FAQ
Hosted by Beechglen Development Inc
www.beechglen.com
Last Updated: Wednesday, November 25, 1998

This is the TurboIMAGE/XL FAQ. It is meant to be a guide for beginning programmers and data base administrators.

NOTE: THIS IS A WORK IN PROGRESS. Any input, updates, or corrections are most welcome; please send them to: imagefaq@beechglen.com

Another terrific on-line resource for the HP3000 is the HP3000 FAQ. It can be located at http://www.3k.com/index_publications.html Any questions or comments should be sent to rcb@3k.com.

An indispensable "On-line Encyclopedia for HP Users" can be found at http://www.robelle.com/library/smugbook/

The TurboIMAGE/XL Database Management System Reference Manual can be found at http://docs.hp.com:80/dynaweb/smpe/b1019/b840/@Generic__BookView

TABLE OF CONTENTS

1.0 How do I get started? (Introduction)
2.0 What are some of the fundamentals of data management?
    2.1 Placement of data on computers
    2.2 Data management laws
    2.3 History of data management
    2.4 How files are accessed
    2.5 Data base technologies
3.0 What is the Image Data Base Nomenclature?
    3.1 Image terms and vocabulary
    3.2 Types and relation of files
    3.3 Record formats
    3.4 Major subsystems
4.0 How do I define and process a Schema?
    4.1 The data base program
    4.2 Capacities
    4.3 DBschema
    4.4 DButil
5.0 What are the Data Base Access Intrinsics?
    5.1 Parameters and data types
    5.2 DBopen & DBclose
    5.3 DBlock & DBunlock
    5.4 DBerror & DBexplain
    5.5 DBput & DBdelete
    5.6 DBget & DBfind
    5.7 DBupdate & DBcontrol
    5.8 (DBbegin, DBxbegin) & (DBend, DBxend)
    5.9 DBxundo
    5.10 DBinfo
6.0 What are the Utilities and Tools?
    6.1 DBSTORE & DBRESTORE
    6.2 DBLOAD & DBUNLOAD
    6.3 Capacity, structure change, and diagnostic utilities
    6.4 Logging and recovery
    6.5 DBDRIVER
    6.6 MPEX
    6.7 DBLOADNG/Howmessy
7.0 How do I begin the design process?
    7.1 Defining the data elements
    7.2 Building hierarchical structures
    7.3 Folding hierarchies
    7.4 Prototyping
    7.5 Walk Through
8.0 How do I improve on an existing design?
    8.1 Common design situations
    8.2 Moving data about: the shell game
    8.3 Critical design assumptions

-------------------------------------------------------------------

1.0 How do I get started? (Introduction)

Most people initially learn Image by a slow process of osmosis from their current application environment. Thus the success of the initial learning process usually depends on the skill and conscientiousness of the people who created and maintain that environment.

The development, maintenance, and access of Image data bases can be viewed from many perspectives: the programming aspects, the system management aspects, the operating system aspects, and the user access aspects. This could also be stated as the push-pull relationship between those who create data bases and those who must live with them and keep them working on a day-to-day basis. There are obvious trade-offs and gray areas between these various types of activities. On one hand, the push is to get the application developed and turned over into production as soon as possible; on the other hand, the priority is to get as many users on the system as possible while providing reasonable response times. It is therefore extremely important that each side understands the limitations that Image places on the other. This is the only way to achieve a successful, cost effective balance between development and production.
One way I've heard this simply stated is: "Image is a do-it-yourself brain surgery kit; you can make yourself as smart or as dumb as you want. It all depends on what you know." -- Fred White, ADAGER.

Specific to Image, there are many features that can be abused by developers and that can prove to be a nightmare for those who have to perform maintenance. On the other hand, there are many maintenance utilities and subsystems that can be very useful to developers and thus speed up implementation times.

Actually, to most people, be they applications or system management oriented, Image is a myth, and data bases past and present are legends. The first myth is reflected in the three names that Image goes by: Image, TurboIMAGE, and TurboIMAGE/XL. Image was the original name. TurboIMAGE was a major rewrite of Image that was prompted by the larger number of users supported on the HP3000 Series 6x and 70 machines. And TurboIMAGE/XL is a third generation rewrite of Image prompted by the migration to the MPE/XL versions of the HP3000. From a development and system management perspective, these different versions of Image are very compatible. So, throughout this FAQ only the differences will be pointed out, and the versions will be collectively referred to simply as Image.

Another aspect of the Image myth is that it is all things to all people. Like any data base management application, Image has its strong points and its weak points. The latter sections of this FAQ describe some of the ways that application and data base designers can get into trouble attempting to overcome Image's weaker points.

A final comment on the Image myth is that many system managers and programmers are very superstitious about their data bases. They feel that Image is built of many black boxes, mirrors, magic dust, and secret incantations, the details of which are known only to wizards and high priests (HP). This FAQ methodically breaks down these black boxes explaining the components, sweeps out the magic dust, and teaches you the appropriate times and places to invoke the secret incantations. And this will, of course, make you an Image wizard in the process.

-------------------------------------------------------------------

2.0 What are some of the fundamentals of data management?

2.1 Placement of data on computers

The key reason that data is placed on computers is the ability of the computer to retrieve, sort and summarize it. One of the most important aspects of any scheme for sorting, retrieving and reporting is that data should only be stored on a computer if it has some need to be recalled at a later date, and all data stored should be time stamped so that at some future date its usefulness can be determined. Data that is stored but is not likely to be retrieved, and is not removed as it becomes obsolete over time, will only add to the overhead and expense of the more heavily accessed data.

2.1.1 removed

2.1.2 The value of information as a corporate asset

The value of information as an asset to the company is dependent on three factors: capacity, reliability and security. Capacity is the capability of the computer to meet the information demand. Information is only valuable if it can provide cost savings or business improvements on a timely basis. Reliability is the quality measurement for a computer's ability to meet information demands on a consistent basis. And security is the attribute which measures the ability to grant access to information on a need-to-know basis.
2.2 Data Management Laws

There are certain rules of thumb, or laws, that can be described in general terms and apply to many situations, including data management. The following are my top four "laws" of data management. I hope you find them amusing and/or insightful.

2.2.1 To err is human, to really foul things up requires a data base

All methods for data storage and retrieval are prone to error situations. We simply haven't got the technology to the point where it is infallible, and certainly evolution hasn't done much to alleviate the problems of human error. The point is that all data base systems must be designed with technical and human failures in mind. To this end the KIS (keep it simple) principle is the only weapon, because the failure rate of any system increases directly with its level of complexity.

2.2.2 Separating keys and pointers from data for faster retrieval

The cornerstone of information storage and our information society is the index file. We use them every day, in hundreds of different ways, woven into everyday life. My favorite example is the card catalog at the library. This index allows us to search on three possible values -- author, title, and subject -- to quickly determine if a book or books exist and their relative location in the library. The alternative method of scanning each book on each shelf would be many times more time consuming than using the card catalog. The card catalog is not free and has its drawbacks. Cards must be added and removed as books are added and removed from the library, and the books must be arranged in a specific order on the shelves. If you ever want to get on the bad side of a librarian, just put a book back on the wrong shelf. In this case the work expended in maintaining the card catalog and the location of the books is justified by the need to speedily find and retrieve books of interest. Computerized data management schemes use the same principle of separating the index information from the data to allow faster access to the specific records of interest. While computerization allows multiple indices into the data, it should be noted that the marginal value of additional indices drops significantly after two or three, while the overhead to maintain each index is usually linear.

2.2.3 It's when you do the work that counts

The decision of which items to index on and how to maintain the data is highly application dependent. If an application's primary function is to produce batch reports, then building in indices that are maintained as each record is added or deleted is probably a waste. On the other hand, if an application is required to retrieve transaction information based on a variety of possible key values, then it probably makes sense to maintain indices for each of the possible key values.

2.2.4 There is no such thing as a free lunch

Throughout the following discussions of general data management and specific HP3000 and Image issues we will find many trade-offs of functionality for a given expense. Generally, the more functionality built into an application, the greater the expense in terms of computer resources required. The only magic trick is to match the functionality of the application with the capabilities of the system and data base software.

2.3 History of Data Management

To completely understand the state of the art in data management, we must first discuss the evolutionary history from the past to the present. This will, hopefully, clear up some of the seemingly confusing terms and vocabulary.
2.3.1 Punched card files

The first computers used punched cards to store and retrieve information. There were two ways to retrieve information, depending upon whether the file of cards was sorted or not. If the card file wasn't sorted, then to find a specific card the entire file would have to be read to find each card of interest. If, however, the card file was sorted on the item of interest, then we could scan the file until we passed the point where the cards of interest would have been if present. The sorting of the cards really paid off in the ability to summarize and report for different occurrences of the key value.

2.3.2 Tape files

The next innovation in data management was the use of magnetic tape to store greater volumes of information and allow much faster scanning of individual records. The tapes still did not allow indexing, as you still had to read from the beginning of the file going forward until you selected the records of interest.

2.3.3 Disc files

After magnetic tape, the next innovation was to spread the magnetic material on discs and allow the read/write head to move over the data. This finally allowed the ability to read records in any sequence. A good analogy would be a tape player versus a compact disc. On your tape player you must listen to each song in sequence on the tape (unless you manually rewind), but on your compact disc you can choose which songs, and in which sequence, you would like them to play. This capability is called random access. The next song or record played does not have to be physically adjacent to the previous one.

2.3.4 Random access

Random access capability finally allowed computers to take advantage of indices. As in the library example, key files were built that had pointers to the location on the disc where the data record of interest could be found. These pointers are then used to directly read the desired record. The only aspect which is random is the fact that the record could be anywhere on the disc. The time to read a given record is made up of three components: the time to locate the read/write head over the data, the time to wait for the disc to spin so the data record of interest is under the head, and the time to transfer the data from the magnetic media to the computer. The major portion of the total time is the time to locate the head over the data.

2.3.5 RAM and ROM discs

As computers have become increasingly more powerful, the demand for disc I/O speed has also grown. This is because all computers, from minis to mainframes, wait for disc I/Os at the same speed. One way to increase disc I/O throughput is to have more disc drives. Another way is to store the most active data in the RAM (random access memory) of the computer and only write it out to disc when the record is changed. The write to disc is required because the in-memory version of the record would be lost in the event of a power failure, while the disc version of the record will be maintained until it is overwritten. ROM (read only memory) discs are becoming available that store huge quantities of information on a read only basis. These discs have many indices built in for fast retrieval of data.

2.4 How files are Accessed

Now that we've discussed a little of the history, we can get down to the specific methods used by Image to access data. It is very important to understand these access methods as they form the very foundation for any further discussion.

2.4.1 Serial access

Serial access is the reading of a file from front to back, or back to front. The key connotation is that the file is not sorted in any fashion, so that all records must be read to determine if the specific ones of interest are found.

2.4.2 Sequential access

Sequential access is the reading of a file from front to back or from back to front, but the file is specifically sorted so that the search can be discontinued once the point at which the records should have been found has been passed.

2.4.3 Directed access

Directed access is the ability to read a specific record number in the file, without having to read any other preceding records. In and of itself, this isn't very useful unless the key value is also the relative record number, or directed access is used in conjunction with calculated or chained access. The case of the key value being the relative record number can be exemplified by having the customer number be the record number in the file; thus to retrieve a customer record we could directly read the record at the customer number location in the file with one read.

2.4.4 Chained access

Chained access is simply the storing of relative record numbers with the data as pointers to other logically linked records. For example, in our customer record which we accessed by customer number, we could store the relative record number of the first order record in the orders file, and in each order record store the relative record number of the next order record for that customer, thus forming a linked list or chain of entries for each customer.

2.4.5 Calculated access

The simple case for the index file is that the key is also the relative record number. This, while simple to implement, leaves something to be desired when selecting possible key elements. What if we wanted to key on something other than a number, e.g. a customer name? What if the number we wanted to key on covered too large a range for the disc space available, e.g. a social security number? What is required is a calculation that generates a relative record number from these keys such that the result is between 1 and the capacity of the file. This process is called hashing, and its purpose is to allow direct reading of the index file record which is of interest.

          +-----------------------------------------------------+
 Data     | A00 | Z33 | B01 | U23 | M91 | P00 | F55 | T07 | A00 |  EOF
 Records  +-----------------------------------------------------+  Mark
 Relative    0     1     2     3     4     5     6    .....   n
 Record                                          ^
 Number                                          |
                          Direct Access: Read Record #6

 Serial   >--^-----^-----^-----^-----^-----^-----^-----^-----^--
 Access      Find Record With ID # = A00: Must Read Each Record

2.5 Data Base Technologies

Different data base technologies are no more than the packaging of a standardized method of accessing data. The package will only be successful if it meets the tests for capacity, reliability and security. There are many different data base management packages on the market today. They can be classified into three general categories, called models: hierarchical, relational and network. These models refer to the fashion in which indexing is used to link different logically related files via pointers and chains.

2.5.1 Hierarchical models

In the hierarchical models of data bases, the files are linked from top to bottom, left to right, to form a single master logical record. This forms a tree structure, and there is a single master file, called the root, to which all of the remaining files are related either directly or indirectly.
In this model all the indices or relationships of the files in the tree structure are fixed and cannot be changed without unloading all of the data, modifying the data base definition and reloading the data. Hierarchical models lend themselves to very large transaction processing and batch reporting applications.

2.5.2 Relational models

In relational data base models the indexing or relation of one file to another is more flexible. Indexes can be built and deleted on an as-required basis. This is done at the user level by linking two or more files by a common field. This can be a very powerful tool for application environments where the amount of data is small and the information requirements are very fluid.

2.5.3 Network models

The network model is similar to the hierarchical model except there can be more than one root file. Image is a two level network model. The first level of files are called master data sets, and the second level of files are called detail data sets. A data set is the Image term for a file. A master data set serves as an index into up to sixteen detail data sets, and a detail data set can have indices from one to sixteen master sets.

-------------------------------------------------------------------

3.0 What is the Image Data Base Nomenclature?

3.1 Image terms and vocabulary

In order to appear to have some unique corner on data base technologies, the marketers of various data base software will invent a new set of acronyms and vocabulary to make the old and commonplace seem new and worth paying for. So, each data base package seems to have its own jargon, and it may help to equate these Image terms with some other ones you may be more familiar with.

Capacity - The number of records or entries a data set may contain.

Chain - A set of logically related records in a detail data set which are linked together by pointers to the previous and next entry.

Chain Head - The pointers in the master data set that point to the first and last entries in the related detail data set that contain the logically related entries.

Data Entry - A record in the data set made up of data items or fields.

Data Item - The named field of a given data type.

Data Base - The set of related files or data sets.

Hashing - The calculation that accepts a key value as input and generates a relative record number between 1 and the capacity of the index file.

Path - The physical relationship between the master data set and the linked records of a detail data set.

Relative Record Number - The number which represents the distance from the first record in the file to the current record in count sequence.

Search Item - The data item whose content is used to logically form a path from the master to the linked list of detail records.

3.2 Types and relation of files

3.2.1 Manual masters

Image master data sets are index files which can optionally hold data items. The master file can have only one key item, and each key value must be unique. This is because key access to this file is via calculated access based on a hashing algorithm. There are two kinds of master data sets: manual and automatic (kind of like cars). Manual masters can have additional data items besides the search item. The name "manual" comes from the fact that these data entries must be written and deleted by an application program. Manual masters work well if there are a few small fields and there will never be a need to access the records on one of the non-search item fields. Non-search item access would require a serial search of the entire master data set.

3.2.2 Automatic masters

The automatic master cannot have any non-search item fields, as its data entries are added and deleted as required by Image to serve as index records for detail data sets. Automatic masters are easily added to and deleted from the network design using any one of the Image utility packages. Both automatic and manual masters can contain chain heads, pointers to logically related detail data set records. This is the real power of Image, as it provides the ability to make subsets of entries in the associated detail data sets. The chain head holds the relative record numbers of the first and last entries of the linked list of detail records, along with a count of the entries on the chain.

3.2.3 Detail data sets

Detail data sets are used to hold records which are accessed via the linked lists forming chains, or in front to back or back to front serial fashion.

3.2.4 Master to detail relationships

The value of the path from a master to a detail data set depends entirely on the average number of detail records tied to a master entry. This ratio can range from 0 to the number of entries in the detail data set. If the average is greater than ten percent of the number of detail entries, then the key value chosen is a waste because it doesn't make a small enough subset to warrant chasing the chain. It would be easier to serially search the detail. A frequent example is plant code in a company with two or three plants. In an inventory detail with 999,991 entries, the average plant chain length would be 500,000 to 333,333 entries, not a very useful subset. On the other hand, if the average is less than one, it is probably a design with a small detail tied to a large master. The large master is probably linked to a separate large detail. This is a waste because the master in question must maintain pointers to the small detail that it can never use.

3.3 Record Formats

The actual file record for an Image data set is made up of a bit map, pointer information, and at least one data item. The records are blocked together based on a parameter that specifies the maximum block size, and Image takes care of fitting as many records as it can in that block and wasting any leftover space.
Logical Record Format of an Image Master Data Set

 |Chain Head for Path 1|   |Chain Head for Path n|   | Data Part of Entry  |
 +---------------------+   +---------------------+   +---------------------+
 |CHAIN | FIRST | LAST |   |CHAIN | FIRST | LAST |   |KEY  | DATA ITEMS FOR|
 |COUNT | ENTRY | ENTRY|   |COUNT | ENTRY | ENTRY|   |ITEM | MANUAL MASTER |
 +---------------------+   +---------------------+   +---------------------+
 |  3   |   3   |   1  |   |  50  |  22   |  21  |   |TEST | TEST DATA     |
 +---------------------+   +---------------------+   +---------------------+
           |       |
           |       |      +-------------------------------------------+
           |       |      |              Detail Data Set              |
           |       |      +-------------------------------------------+
           |       |      | Forward | Backward | Key  | Data Items    |
           |       |      | Pointer | Pointer  | Item |               |
           |       |      +-------------------------------------------+
           |       +--> 1 |    0    |    2     | TEST |               |
           |              +-------------------------------------------+
           |            2 |    1    |    3     | TEST |               |
           |              +-------------------------------------------+
           +----------> 3 |    2    |    0     | TEST |               |
                          +-------------------------------------------+

Physical Block Format of an Image Data Set

 +----------------------+   +---------------------+   +---------------------+
 |MPE/V File Label Block|   | 1st Physical Block  |   | Nth Physical Block  |
 +----------------------+   +---------------------+   +---------------------+
 |MPE   | USER  | EMPTY |   |BIT | MEDIA  | MEDIA |   |BIT | MEDIA  | MEDIA |
 |FILE  | LABEL | TO    |   |MAP | RECORD | RECORD|   |MAP | RECORD | RECORD|
 |LABEL |       | END   |   |    |   1    |   N   |   |    |        |       |
 +----------------------+   +---------------------+   +---------------------+

 +------+   +-----------------------------+   +---------------------+
 |MPE/XL|   |     1st Physical Block      |   | Nth Physical Block  |
 +------+   +-----------------------------+   +---------------------+
 |FILE  |   |USER  | BIT | MEDIA  | MEDIA |   |BIT | MEDIA  | MEDIA |
 |LABEL |   |LABEL | MAP | RECORD | RECORD|   |MAP | RECORD | RECORD|
 |TABLE |   |      |     |   1    |   N   |   |    |        |       |
 +------+   +-----------------------------+   +---------------------+

3.3.1 Bit maps

At the front of each block, Image builds a bit map to represent the availability of the records in the block. This saves the time of accessing each individual entry of the block to see if it is occupied. These maps play an important role when a master or detail is serially searched, in that, especially in the case of masters, we may read many empty blocks before one is found that contains a record.

Bit Map Format of an Image Data Set

 +----------------------------------------------------------------+
 | Bit Map  | ENTRY 1| ENTRY 2| ENTRY 3| ENTRY 4| ENTRY 5| ENTRY 6|
 +----------------------------------------------------------------+
 | 11001000 | MEDIA  | MEDIA  | EMPTY  | EMPTY  | MEDIA  | EMPTY  |
 | 00000000 | RECORD | RECORD | RECORD | RECORD | RECORD | RECORD |
 +----------------------------------------------------------------+

3.3.2 Pointer information

Remember that pointers are simply stored relative record numbers that form the physical structure of the data base. The pointers in master records form a chain head consisting of a first entry pointer, a last entry pointer and a total entry counter. Each chain head occupies six words (12 bytes) per path, for a total of up to 16 paths. In each master record there is also a chain for entries whose search items hash to the same relative record number. When this "collision" happens, the entries are chained together on a secondary chain. In the detail, each path has a set of forward and backward pointers which allow the ability to read, add to, and delete entries in the chain. The chain pointers in the detail take up four words (8 bytes) for each path, for a total of up to 16 paths.

3.3.3 Data content

The data content of each entry is made up of the data items specified for the data set. The maximum data entry size is 2048 words (4096 bytes).

3.3.4 Blocking factors

The blocking of records is done by specifying a maximum block size in the data base definition, and Image works backwards to fit the optimum number of records per block so as not to waste disc space. The actual record is made up of the bit map plus the pointers plus the actual data items, and is called a media record. The default block max is 512 words (1024 bytes or four sectors) and should be changed to 2560 words (5120 bytes or 20 sectors). This will allow Image to pick the optimum value.

3.4 Major Subsystems

Image is a complete data base package. It comes with the utilities for building and installing the data base, subprograms called intrinsics for accessing the data base programmatically, utilities and tools for maintenance and recovery, and a report writer called QUERY/3000.

3.4.1 Implementation overview

In the beginning, the programmer designs reports and interactive screens that will make up the particular application or system desired. The data items from the report and screen layouts are defined in terms of type and relationship to each other. The data base is designed and coded into the form of a data base definition called a schema. This schema is processed by a compiler into a set of files which form the data base. The application programmer can now build programs that access the data base via subprograms called intrinsics. During the debugging process, QUERY can be used to load test data and to interactively check the results of programs under test. The various data base utilities and tools are used to configure and maintain the data base while in production.

3.4.2 Schema definition and processing

The schema is developed either by manually programming it using EDIT/3000, QUAD or some other text editor, or it is generated automatically from a data dictionary. Either way, the data base program is compiled and the resulting object module is then used to build the final data base.

3.4.3 Data base access intrinsics

The data base access intrinsics are the only supported way to access the data base. All utilities and applications must use this set of subprograms or resort to privileged mode. Accessing an Image data base in privileged mode is not supported by Hewlett-Packard and is best left up to the experts. These intrinsics provide the common access point for controlling the security and reliability of the data contained within.

3.4.4 Utilities and tools

There are many data base utility tools to facilitate maintenance of Image data bases. Later chapters will go into the HP supplied utilities and the more popular contributed library and purchasable utilities.

3.4.5 QUERY/3000

Query is a non-programming tool for entering test data, modifying limited amounts of data online, and producing simple reports. Query is also useful for displaying the current number of entries and the structure and content of data bases.

-------------------------------------------------------------------

4.0 How do I define and process a Schema?

4.1 The data base program

The data base is specified using a simple program called a schema. This program is a simple free-format MPE file which is input to the compiler DBschema.PUB.SYS. The schema and the output from the compiler provide important documentation regarding the installation of the data base.
While the data base programming language is simple and free format, the tricky part is the use of periods, commas, and semi-colons, which are used as syntax dividers.

4.2 Begin

A throwback to the SPL programming language, the BEGIN statement is used to start the schema and name the data base.

Example:

BEGIN DATA BASE TEST;

The data base name can be up to six alphanumeric characters, and the first character must be alphabetic. The individual data sets will be identified by appending a two digit numeric counter, i.e. 01, 02, 03..., so it usually is not a good idea to name your data base DB01, because the name of the first set will be DB0101, the second set DB0102, and so on; this can be very confusing. The data base name is really DBNAME.group.account, so before the data base is built, consider what group and account it should reside in. This can save a lot of trouble and effort down the road.

4.3 Passwords

Passwords are used to screen read and write access at the data set and data item levels. It is usually sufficient to code passwords at the set level only. Passwords at the item level add processing overhead, and this type of security is better dealt with in the application program. An Image password can be up to eight ASCII characters, but the first character must be alphabetic. The use of lower case or non-printable characters can raise security but makes it very difficult to input passwords interactively. Hard coding passwords into programs makes them very difficult to change; reading the current passwords in from a control parameter file allows them to be altered on a periodic basis without recompiling programs.

Example:

PASSWORDS:
    10 READER;
    20 WRITER;
     ^    ^
     |    |
     |    +----- Password to be supplied at data base open
     +---------- Arbitrary class number from 1 to 63

4.3.1 User class table

Each password maps to an index number from 1 to 63; this allows for up to 63 different passwords, and the numbering convention used is up to the data base designer. The number is then used as shorthand notation to specify which passwords allow access to data items or data sets. A class number of 0 is assigned to an open request that specifies a password not on the password list. This class can be assigned to data items and sets to allow access even with a bad or missing password. A class number of 64 is assigned to an open request that specifies a ";" for the password while the user is signed on as the creator of the data base. The creator is simply the user who ran DBschema and DButil to build the data base. This class turns off all security and allows full access to all items and sets.

4.4 Data item definition

The items section of the schema program defines each field that will exist in the data base. These fields are used later in the definition of the individual files or data sets that make up the data base. An example of the items section of the schema:

ITEMS:
    CUSTOMER-NAME,   X40  (10/20);   << CHANGE TO 40 ON 1/2/88 >>
    ADDRESS-LINE-1,  X40  (10/20);
    ADDRESS-LINE-2,  X40  (10/20);
    ADDRESS-LINE-3,  X40  (10/20);
    CITY,            X40  (10/20);
    STATE,           X2   (10/20);
    ZIP,             X10  (10/20);
       ^              ^      ^
       |              |      +- Read/Write Class List
       |              +-------- Type and Sub-Item Length
       +----------------------- Data Item Name

4.4.1 Item name constructs

The item name can be up to 16 alphanumeric characters, but the first character must be alphabetic. The item names are used in a request list at data access time and must match exactly as defined in the schema. Thus, as a practical concern, items should be named in upper case and be long enough to describe the field, but extra prefixes and suffixes should be avoided. The maximum number of data items per data base is 1,023.

4.4.2 Item definition parameters

The actual storage of the item is defined using three parameters: the sub-item count, the type designator and the sub-item length. The sub-item count is an array specification; it defaults to 1. Arrays are a simple method of storing related items, i.e. monthly expenses etc.; however, most report writers do not handle sub-item counts other than the default of 1. Some of the older data bases were forced to make extensive use of sub-item arrays due to an old limitation of 255 data items per data base. The type designator is used to tell Image how to store the field within the record. Note that each field must be defined to be aligned on a 16 bit boundary.

16 bit item types:
    I - Signed binary integer in 2's complement
    J - Same as I but allows Query to edit input for COBOL COMP format
    K - Absolute binary value
    R - An HP formatted floating point number

8 bit item types (must be used in multiples of two):
    U - ASCII data but no lower case alphabetic characters allowed
    X - Any ASCII data
    Z - Zoned decimal format

4 bit item types (must be specified in multiples of four):
    P - Packed decimal number

Item   Length
Type   (Bytes)   COBOL                   FORTRAN          SPL
I        2       S9     - S9(4)  COMP    INTEGER          INTEGER
I2       4       S9(5)  - S9(9)  COMP    INTEGER*4        DOUBLE INTEGER
I4       8       S9(10) - S9(18) COMP
J        2       S9     - S9(4)  COMP
J2       4       S9(5)  - S9(9)  COMP
J4       8       S9(10) - S9(18) COMP
K1       2                               LOGICAL          LOGICAL
R2       4                               REAL             REAL
R4       8                               DOUBLE           LONG
Un       n       Display PIC A           CHARACTER        BYTE
Xn       n       Display PIC X           CHARACTER        BYTE
Zn       n       Display PIC 9           CHARACTER
Pn      n/2      COMP-3                  NUMERIC

4.4.3 Read/write class list

As mentioned above, the user class numbers are used to screen access at the data item and data set levels. This is accomplished by specification of a read/write class list at the data item and set level. The security logic works in the following fashion:

At data base open, the class number is mapped from the password. If the creator password is used at DBopen, then no read/write screening is processed. If the password used at DBopen is not in the password list, then a class of 0 is assigned. Otherwise, the class number is simply mapped from the password provided.

At data set access, the set read/write class list is checked in the following fashion. If the class is on the write side of the list specification, the user has full access to all items in the set. If the class is on the read side of the list specification, this simply means that we will be allowed to process the item level security screen to see if we have read or write access to each individual item. This adds additional overhead for each data set access. If the class is not on the set read/write class list, then no access is allowed to that set.

At data item access, the read/write class list is processed to allow read or write access to each individual data item, and is only processed if the class number was on the read side of the class list at the set level. If the class is on the read class list for the item, then read access is granted. If the class is on the write class list, then read and write access is granted. And if the class is not on the class list, then no access to the item is allowed.

4.4.4 Comments

Documentation in the form of comments can appear anywhere in the data base schema, simply by using << to start the comment and >> to end it. These comments can span multiple lines, and any text in between is ignored by the compiler. Comments at the item level can be useful in documenting the use of the data item.
Comments can also be used to document changes made to the data base over time.

4.4.5 Sorting of items table

In data bases with a large number of data items it can be very helpful if the item names appear in sorted sequence. This can save a great deal of time when referring to the schema for the item type and size in order to build data definitions for programs. The items can be sorted by using the editor to enter the data items, then using SORT to arrange them in alphabetic sequence, and then adding the rest of the schema definition.

4.5 Sets

A data set is made up of up to 255 individual data items. There are two kinds of data sets: master data sets and detail data sets. The data base can have from one to 199 data sets. The data set name can be from one to 16 alphanumeric characters, but the first character must be alphabetic.

4.5.1 Master Data Set Definition

Master data sets are accessed by hashing a key value to generate a relative record number. An Image master can have only one key item. There are two kinds of Image master data sets: manual and automatic. The manual master is used if data must be stored with the key record, and the data entry must be manually written to the data base. The automatic master cannot contain any data items other than the key data item, and its entries are automatically added or deleted as required. This will become clearer when detail data sets are discussed.

An example of a manual master data set schema definition:

SETS:  << data set name, type, class list, device class >>
NAME:     CUSTOMER-MASTER, MANUAL (10/20), DISC1;
ENTRY:    CUSTOMER-NAME(1),    <<-- key item and path count >>
          ADDRESS-LINE-1,
          ADDRESS-LINE-2,
          ADDRESS-LINE-3,
          CITY,
          STATE;               <<-- note the ; to end the list of items >>
CAPACITY: 100001,101,101;      << maximum capacity [,initial capacity [,increment]] >>

An example of an automatic master data set schema definition:

NAME:     ORDER-NO-MASTER, AUTOMATIC (10/20), DISC1;
ENTRY:    ORDER-NO(1);         <<-- key item and path count >>
CAPACITY: 101;

The read/write class list at the data set level is specified similarly to the item level. As previously discussed, write access at the set level gives full access to all data items in the data set. Read access at the set level means that you must examine the item level access security to see if your class has read, write, or no access to each individual data item.

The device class parameter on the data set allows the creator to specify which disc device class the data set is to reside on. There are some important performance notes to be made at this point. The first is that if each disc drive only has a device class of DISC on it, then the only class you'll be able to specify is DISC. Therefore, it is a good idea to configure each disc with a unique device class, i.e. DISC1 for LDEV 1, DISC2 for LDEV 2, etc. In order to spread out the disc I/Os and reduce access contention, it is a good practice to put masters and details on different disc drives. Due to their access method, masters tend to be very random and should be placed on non-cached discs. The access to details can range from being very sequential to very random depending on the volatility of the data entries and the timeliness of proper maintenance procedures.

The item names that make up the data set can be specified in any sequence. However, they must have been previously defined in the items section. There is no problem with having an item name that is identical to the set name.

The path counter is used to identify both the data item that will serve as the key and the number of associated detail data sets linked to the master by the value of the key. A master can have from 0 to 16 paths to various detail data sets. A value of 0 is simply used to identify a key data item for a master with no detail data sets. The capacity of a data set can range from 1 to 2,147,483,647 data entries. When the data base is built, all disc space required to hold the specified number of entries is allocated and initialized to binary zeros.

4.5.2 Detail data set definition

The detail data set is used to hold data records that occur multiple times for a given key value, i.e. all order summaries for a given customer. The detail is accessed via a pointer from the associated master to a linked list of entries in the detail with the same key value. A detail, unlike a master, can have up to 16 keys, which are in fact pointers from associated masters. The keys in automatic masters are added and deleted as records are added to or deleted from associated detail data sets. The manual master entry for a given key value must be added by the user application prior to adding any detail records with the same key item value.

An example detail data set specification would be:

NAME:     ORDER-SUMMARY, DETAIL (10/20), DISC2;
ENTRY:    ORDER-NO(ORDER-NO-MASTER),    <<-- path from the master based on the value of order-no >>
          CUSTOMER-NAME(!CUSTOMER-MASTER(TOTAL-DOLLARS)),
                   <<-- path from customer-master based on the value of customer-name;
                        additionally, the list of all orders for the same customer-name
                        will be maintained in sorted sequence by total-dollars >>
          TOTAL-DOLLARS;
CAPACITY: 300000,1000,1000;    << maximum capacity [,initial capacity [,increment]] >>

The path names are simply the names of the associated masters. A detail can have from 0 to 16 paths to the same or different master data sets. These paths serve to make subsets of records in the detail for given key values. Thus it is wise to choose paths that will yield a subset that is worth the effort of maintaining the pointers and chains. A good rule of thumb is that a path is only valuable if it can make a subset that is 15% or less of the detail records. Any greater than 15% and it becomes increasingly more advantageous to simply read all of the detail records, selecting for the ones you want. The sort item is used to specify maintenance of the list of data entries with the same key value in some sequence other than that in which they are added over time. THIS CAN BE EXTREMELY DANGEROUS if the records are not added in sorted sequence and the number of entries with the same key value is greater than 40 or 50.

4.6 End.

The END statement is a throwback to the SPL programming language; its only purpose is to tell the schema compiler that this is the end of the program. However, without the END statement the data base will not compile. Another note: as will be discussed in a moment, the schema compiler requires file equations and run parameters to function properly. From time to time you will forget these complexities and run the DBschema program without them. One solution is to break and abort the program; another is to type END. and this will terminate the program.
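To tie sections 4.2 through 4.6 together, here is a minimal but complete schema assembled from the fragments used above. Treat it as a sketch: the types chosen for ORDER-NO and TOTAL-DOLLARS, the capacities, and the device classes are illustrative assumptions, not requirements.

BEGIN DATA BASE TEST;

PASSWORDS:
    10 READER;
    20 WRITER;

ITEMS:
    CUSTOMER-NAME,  X40 (10/20);
    CITY,           X40 (10/20);
    STATE,          X2  (10/20);
    ORDER-NO,       X10 (10/20);    << illustrative type >>
    TOTAL-DOLLARS,  J2  (10/20);    << illustrative type >>

SETS:
NAME:     CUSTOMER-MASTER, MANUAL (10/20), DISC1;
ENTRY:    CUSTOMER-NAME(1),
          CITY,
          STATE;
CAPACITY: 101;

NAME:     ORDER-NO-MASTER, AUTOMATIC (10/20), DISC1;
ENTRY:    ORDER-NO(1);
CAPACITY: 101;

NAME:     ORDER-SUMMARY, DETAIL (10/20), DISC2;
ENTRY:    ORDER-NO(ORDER-NO-MASTER),
          CUSTOMER-NAME(CUSTOMER-MASTER),
          TOTAL-DOLLARS;
CAPACITY: 1001;

END.

Each master carries exactly one path (the "(1)" after its key item), matching the single detail set that references it. Compiling a schema like this with DBschema, as described in the next sections, should produce a small three-set data base suitable for experimenting with the utilities that follow.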
4.7 Capacities & maximums

4.7.1 HP3000 system limitations

A somewhat official operating system limits white paper can be found at: http://jazz.external.hp.com/papers/limits/os_limits.html

    8190 maximum number of concurrent processes
    2004 databases can be opened concurrently on a system
    1140 processes can open one database in modes 1-4 (write) access when
         IMAGE logging is enabled. User logging is still all compatibility mode.
     127 DBopens per process without calling DBclose
      63 DBopens per process to open the same database

4.7.3 Data base level limitations:

    1023 data items can be defined in a database
     199 data sets can be defined in a database
      63 passwords can be defined in a database (65 if 0 & ; count)

4.7.4 Data set limitations:

              255 data items can be defined in a data set
               16 paths can be defined in a master or detail data set
    2,147,483,647 entries maximum in a data set
    2,147,483,647 entries maximum in a chain
       16,777,215 blocks per data set (recently this was changed from 8,388,607)

4.7.5 Data item limitations:

     255 subitems can be defined in a data item
     510 bytes for maximum subitem length
    4096 bytes for maximum item length

4.7.6 Length limitations:

    5120 bytes for maximum block length
     252 bytes for maximum b-tree index key size

4.7.7 Transaction limitations:

      15 databases can have simultaneous transactions (either DBBEGIN/DBEND
         or DBXBEGIN/DBXEND transaction pairs)
     4MB of open transactions for a process
    64MB of open transactions for a volume set

4.8 DBschema

The schema compiler is a program which reads the text file and compiles it into an object module called a root file. The compiler prints a heading, a listing of the schema, and a summary table.

4.8.1 Parameters and file equations

The schema compiler is invoked by using the command:

RUN DBschema.PUB.SYS;PARM=n

The parameter value n is used to tell the compiler program what to do as far as input and output files are concerned.

    n = 1   the schema text file has been file equated to DBSTEXT
    n = 2   the list file has been file equated to DBSLIST
    n = 3   both files have been file equated as in 1 & 2

If no parameter is supplied, then the DBschema program will expect you to enter the schema at your terminal (a pretty good feat), and you will have to break and abort or enter END. as described above.

4.8.2 Compile options

The schema compiler allows for setting of compile options via a record in the text file that starts with $CONTROL. The form of the $CONTROL directive is:

$CONTROL [ LIST | NOLIST ]
         [ ,ERRORS=nnn ]
         [ ,LINES=nnnnn ]
         [ ,ROOT | ,NOROOT ]
         [ ,BLOCKMAX=nnnn ]
         [ ,TABLE | ,NOTABLE ]
         [ ,JUMBO | ,NOJUMBO ]

Where:

LIST causes the source text to be printed to the list file; this can be a problem if your schema is long and you ran DBschema;PARM=1. The default is to list all of the text file as the schema is compiled.

NOLIST specifies that only text file records with errors are listed. This can be handy for processing long text files and for keeping the content of a data base secret.

ERRORS=n sets the maximum number of errors to n. The default value is 100. When the schema processor reaches this number of errors, it gives up and terminates. This is very useful for saving paper and your pride.

LINES=n sets the number of lines per page for the list file. The default is 60 lines per page, but this number can range up to 32767.

ROOT tells the schema compiler to create the root file if no errors are detected. This is the default mode.

NOROOT tells the compiler to compile the schema but not to create the root file even if no errors are detected. This option is very useful for testing out modifications to the schema before purging the existing data base, and it saves the trouble of having to go into DButil to purge the root.

BLOCKMAX sets the size of the largest physical block of records for each data set. The default is 512 words; the maximum is 2560 words (20 sectors). DBschema determines the number of media records that fit in a block. Note that DBschema chooses a block size (less than or equal to the maximum block size) that makes the best use of disc space, which often is substantially less than the BLOCKMAX. If the record size is greater than 512 words, BLOCKMAX must be set greater than or equal to the record size. In general, a setting of 2560 will allow DBschema to choose the best blocking factor.

JUMBO allows a data set whose capacity generates a file greater than 4GB to automatically become a jumbo data set. Be aware that many utilities and diagnostic tools are not yet upgraded to handle jumbo data sets. IT IS ALSO IMPORTANT TO NOTE THAT THE JUMBO OPTION WORKS BY SPLITTING THE DATA SET INTO MULTIPLE FILES THAT RESIDE IN THE POSIX HFS. THESE FILES MAY NOT GET BACKED UP AND RESTORED WITH EXISTING JOB JCL SPECIFICATIONS!

NOJUMBO, the default, disallows data sets greater than 4GB.

There are two other compiler directives: the $TITLE and $PAGE commands are used to format the schema listing for documentation purposes. The form of the $PAGE command is:

$PAGE [["some character string"], ]

The $PAGE command causes the compiler to page eject and print the content of the character string, if any, at the top of the next page. The character string will override the $TITLE command. The $TITLE command is used to specify a string of characters to be printed as a heading for each page printed. It does not cause a page to be ejected as is the case with the $PAGE command. The form of the $TITLE command is:

$TITLE [["some character string"],]

The $TITLE command stays in effect until a subsequent $TITLE or $PAGE command is processed. If no character string is included, then no heading is printed.

4.9 Interpreting the compiler listing

Errors are listed as the schema file is being scanned. Most errors will be due to a lack of attention paid to punctuation in the form of ;'s, ,'s, and .'s. If there are no errors, the message UNREFERENCED ITEMS. may appear. While items defined in the items section which are not specified in any of the data sets are not considered errors, they should be removed to decrease the overhead of processing the item names list.

The summary table at the end of the schema listing is very important to understand and retain, as it represents the physical realities of the data base, and some of the information cannot be found elsewhere.

DATA SET NAME - The name given to the set in the schema SETS: section. The MPE file name will be the data base name with a counter that starts at 01 and counts up as each data set appears sequentially in the schema.

TYPE - A one letter denotation: "A" for automatic masters, "M" for manual masters, and "D" for detail data sets.

FLD CNT - A count of the number of data items in each data set.

PT CT - A path counter. For master data sets this is the number of paths that appears in ()'s after the search item. For details this is a count of the number of associated masters, specified by putting the master set name in ()'s after the linking data item.

ENTR LGTH - The length in words of the data portion of the data entry. This is the sum of all the individual data items specified for the data set.
MED REC - The total length of the Image media record. This includes data, pointers and other internal information. The length of the media record is an important design consideration and will be covered in detail later.

CAPACITY - For masters this is the exact number of entries specified in the schema in the SETS: section. For details, this number may be rounded up to fill out the last full block.

BLK FAC - The best fit of the number of media records that could fit under the BLOCKMAX specified, or the hard coded number specified in the capacity.

BLK LEN - The actual length of the block. Any difference between this value and the next multiple of 512 is wasted disc space. The total number of wasted sectors can be calculated by multiplying the number of blocks by this difference. The total number of blocks is simply the capacity divided by the blocking factor, rounded up by one.

DISC SPACE - The total number of sectors, including the root file, that this data base will require when the data base is created. Note that for very large data bases it is usually a good practice to check if enough disc space is available before proceeding.

ROOT LENGTH - The length in words of the compiled root file. The size of this varies with the number of passwords, data items, sets and security specifications.

BUFFER LENGTH - The length in words of the largest block. It is unused by TurboIMAGE/XL but provided for compatibility with TurboIMAGE on previous versions.

TRAILER LENGTH - The length in words of an area that is used by Image to move data base access parameters to and from the calling program's stack.

Schema processing errors are documented in Appendix A of the Image manual. Errors tend to build on each other as the compiler trips over itself, so don't let your pride get hurt when many errors are reported. Two helpful hints: one, specify data items in uppercase characters and keep a handy sorted list of them as you build the data sets. Second, pay attention to the punctuation. After all, the schema is a program and therefore highly cantankerous when it comes to syntax.

4.10 DButil

After the schema has been successfully compiled, the DBschema program will leave behind a root file. While interesting, this file is of little useful value until the data sets are created. The data base is built using the utility program DButil.PUB.SYS. As we will see in the following discussion, DButil is a general utility that allows the data base administrator to configure and maintain the data base. There are many commands available in DButil, and they can be categorized into four groups: the create and purge commands, the show command, the enable command and the set command.

4.10.1 Create & Purge commands

The CREATE command initializes each data set and saves them as privileged MPE files in the same group as the root file. To create the data base, you must be logged on with the same user, account and group that were used when the DBschema program was executed. The format of the CREATE command is:

CRE[ATE] data base name [/maintenance word]

The data base name is the same as the root file name, which is the same as the base name specified in the BEGIN statement of the schema. The maintenance word is a password that can be used to allow users other than the creator of the data base to use a limited set of DButil commands. Such a user must be logged on into the group in which the data base resides.
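For example, assuming the TEST root file from the schema above is in the logon group, creating the data base and assigning a maintenance word might look roughly like this (a sketch; the ">>" represents the DButil prompt, SECRET is a purely hypothetical maintenance word, and the exact prompts can vary by version):

:RUN DBUTIL.PUB.SYS
>>CREATE TEST/SECRET
>>EXIT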
For large data bases, it is wise to check that enough disc space is available in sufficient chunks before executing the CREATE command. If the create operation fails due to lack of disc space, two problems must be dealt with. First, the data sets already built must be purged using the PURGE command, and second, free space must be found or added in sufficient quantities to reattempt the build operation.

The PURGE command removes the data base files from the system directory, and once it is entered, the data in the data base is forever lost. So extreme caution is recommended before using this command; always make sure that you are logged on where you think you should be, because many a production data base has been purged because someone thought they were logged on in the test account! The format of the PURGE command is:

PUR[GE] data base name[/maintenance word]

4.10.2 Show command

Once the data base is built, the default configuration can be displayed with the SHOW command. The SHOW command can provide a list of users accessing the data base, any users waiting for or holding locks on the data base, logging status, configuration flags, and the buffer allocations. The format of the SHOW command is:

SH[OW] database name[/maint word]   ALL        [OFFLINE]
                                    BUFFSPECS
                                    CAPACITY
                                    CIUPDATE
                                    DEVICE
                                    FLAGS
                                    INDEX
                                    LANGUAGE
                                    LOCKS
                                    LOGID
                                    LOGINFO
                                    MAINT
                                    PASSWORDS
                                    SUBSYSTEMS
                                    USERS

MAINT - Displays the maintenance word, if any. This is very useful if you have forgotten what you originally set it to.

ALL - Displays the combination of MAINT, BUFFSPECS, LOCKS, USERS, LOGID, AND FLAGS. This is the parameter that is used most frequently, and it supplies a neat little report on the configuration of the data base.

BUFFSPECS - Displays the current buffer specification. The buffer specifications will be discussed in detail in the SET command.

CAPACITY - Displays a capacity report for the data base. For each data set it shows the current number of entries, the percent of maximum, and for a detail data set the dynamic expansion parameters.

CIUPDATE - Displays the status of critical item update. DISALLOWED means that DBupdate cannot modify a search or sort item. ALLOWED means that, after a call to DBcontrol, a program can use DBupdate to modify search or sort items. And ON means that DBupdate can modify search and sort items.

DEVICE - Displays the device assignment for each data set.

FLAGS - Displays the various states of the data base configurable parameters. Each parameter is discussed in detail in the SET and ENABLE commands.

LOCKS - Displays the users who are currently holding or requesting locks for the data base, data sets or data items. This can be very insightful when deadlock situations occur. This happy event will be discussed in detail when we get to the DBlock intrinsic.

LOGID - Displays the current data base log identifier, if any.

USERS - Displays the users and application programs, if any, which are currently accessing the data base.

PASSWORDS - Displays the user class table and current password settings. This is handy if you forget what the passwords to the data base were and don't happen to have the schema.

SUBSYSTEMS - Displays the flag that allows or disallows the data base for access by Query and other applications that check this flag.

OFFLINE - Sends the information to the line printer with the formal file designator of DBUTLIST.

4.10.3 Enable and Disable of configuration flags

The enable and disable commands turn on and off internal flags for access, dumping, logging, ILR and recovery.
These flags are very important to understand from the standpoint that they control the way that Image operates on the data base as a whole. The form for the enable and disable commands is: EN[ABLE] database name[/maint word] FOR parm[[,parm]...] Valid parameters are: ACCESS DUMPING ILR LOGGING RECOVERY ROLLBACK AUTODEFER MUSTRECOVER PREFETCH INDEXING HWMPUT SHADOWING DIS[ABLE] database name[/maint word] FOR parm[[,parm]...] ACCESS - Allows or disallows user, or application access to a data base. This is very useful for securing the data base for periods of time i.e. weekends and holidays. It protects the data from being read, written or deleted. It does not protect from an accidental restore of a different version of the data base. DUMPING - This flag, when enabled, tells Image, upon abort, to copy the users stack, image buffers and other binary information to an "I" file created in the group where the session or job is logged on. It is very rare to see an abort while in an Image subroutine. However, an Image abort usually means that the user or application program was attempting to access the database and ran into some type of corrupted data. In this case the "I" file would be of assistance in debugging the type and extent of the damage. This flag defaults to the disabled state and remains that way until a problem situation is encountered. HWMPUT - See http://www.adager.com/TechnicalPapersHTML/HwmPut.html ILR - Flag for Intrinsic Level Recovery. This facility of Image allows the backing out of the last record written or deleted to the data base if the system failed prior to the completion of the operation. In other words, the intermediate state of the data base when a record is being added or deleted is inconsistent, and if the operating system or hardware should fail, halt or abort at this critical stage, we could be left with a damaged data base, unless ILR is enabled. Fortunately, HP3000's tend to be very reliable systems, and ILR is overkill for most data bases. However, ILR is recommended for data bases that contain data which the financial wellbeing of your company rests and cannot be recovered manually, i.e. online order entry, billing etc. The drawback to ILR is that it adds additional processing and disc I/o's to each record added or deleted, to the tune of 10% to 15%. If you are experiencing a rash of system problems, it is wise to enable ILR until things settle down. LOGGING - Setting this flag is part of a series of steps to turn the logging of transactions to disc or tape media on and off. This procedure will be discussed in detail later. PREFETCH - IMAGE is currently 'single threaded' when it comes to calling intrinsics that change pointer information. This means that only a single process is executing a call to DBput, DBdelete, or DBupdate (of a critical item) at a time. This methodology helps to insure against situations that create conflicts with pointers that would result in broken chains. In order to accomplish this, IMAGE uses a flag known as the PUT/DELETE semaphore to freeze access. This freeze is obtained at the beginning of the call and released at the end, and is held while entries and pointers are updated. The Prefetch flag tells IMAGE when to obtain the semaphore freeze. With Prefetch disabled, (the default), IMAGE obtains the semaphore before reading, updating and writing every data block which is to be modified by the call. After this is successfully completed the semaphore is released. 
With Prefetch enabled, IMAGE reads all the data blocks that are required for the call before the semaphore is obtained. It is the holding of the semaphore for a shorter period of time which could allow more concurrent calls but it is not guaranteed. On systems with memory contention there is the possibility that the necessary blocks may be flushed out of memory by another process before IMAGE is able to update them. This could cause the blocks to be thrashed in and out and is why the default is not to Prefetch. Thus to gain throughput from Prefetching the system must have memory and CPU available to handle the increased processing, and the application mix would have to consist of multiple processes consisting of mostly calls to DBput, DBdelete or DBupdate. RECOVERY- Enables or disables the running of the program that is used to apply transactions from the log file. Setting this flag is part of the recovery procedure, which will be discussed later. This flag defaults to the disabled state and protects from corruption for the data base by an inadvertent attempt to reapply a transaction log file. 4.10.4 Set parameters The Set command is used to configure the run time variables of the data base. These differ from the parameters of the enable/disable command, in that they are more complex than a simple on/off status. However, as in the enable/disable commands these variables can significantly affect the behavior of the data base. The format of the Set command is: SET database name[/maint word] BUFFSPECS= num bufs(from/to) LOGID=logid-value MAINT=[maintenance word] SUBSYSTEMS = NONE | READ | RW PASSWORD j=[password] LANGUAGE=langid CIUPDATE = [DISALLOWED ] [ALLOWED ] [ON ] BTREEMODE1 = [, [WILDCARD=]c] MAINT - This parameter is used exclusively by the creator of the data base to set a new maintenance word for the data base. The default is no maintenance word, so if it is desirable to allow other users who can log on into the data base's group, access to the DButil commands then this is the command to set a maintenance word. Note, other utilities, such as ADAGER, also make use of the maintenance word in this fashion. If no maintenance word is provided, then the existing maintenance word if one exists is removed. BUFFSPECS - is backward compatibility only. LOGID - Is the MPE log identifier used when the GETLOG command was issued. Use of this parameter is part of the setting up of the logging procedure and is discussed in that full context later. PASSWORD - Allows the changing of the data base passwords. The user class number must be used in the read/write security specification at the item or set level for the new password to have any effect. Usually, data base passwords are hard coded into applications and therefore changing them in this fashion can have drastic results on the ability of the user or application to open the data base. This, however is a good way to test the error logic build into the application, to see what happens when the data base cannot be successfully opened. SUBSYSTEMS - A parameter that can be used to allow or disallow read and/or write access on an application by application basis. This parameter can be checked programmatically via the DBinfo intrinsic and is built into QUERY. So in short, this is a way to control QUERY access for each data base. 4.10.5 ERASE The erase command is executed to reinitialize the entire data base to its created state. This is the time saving equivalent of purging and rebuilding the entire data base. 
A prime design consideration for splitting one data base into multiple data bases is that a data base can be erased many times faster than the individual records can be deleted. However, you can only take advantage of this if the records to be deleted reside entirely in their own Image data base. The format of the erase command is:
ERA[SE] data base name[/maintenance word]
NOTE: DBUTIL DOES NOT ASK "ARE YOU SURE?" SO BE AWARE THAT ALL DATA WILL BE WIPED OUT WHEN YOU ENTER THIS COMMAND. IT IS A VERY WISE PRACTICE TO CHECK YOUR LOG ON WITH A SHOWME BEFORE ERASING THE WRONG DATA BASE.
4.10.6 EXIT
The exit command gets you back to MPE. Almost all utilities allow you to enter E to exit, however to be consistent with the three letter command notation standard, you must enter EXI to exit.
4.10.7 DBschema walk through
:DBschema
PAGE 1 HEWLETT-PACKARD 30391C.07.04 TurboIMAGE/3000: DBSCHEMA FRI, NOV 6, 1998, 9:11 AM
(C) HEWLETT-PACKARD CO. 1987
>BEGIN DATA BASE TEST;
>PASSWORDS:
>10 READER;
>20 WRITER;
>
>ITEMS:
>ADDRESS-LINE-1, X40 (10/);
>ADDRESS-LINE-2, X40 (10/);
>ADDRESS-LINE-3, X40 (10/);
>CITY, X40 (10/);
>CUSTOMER-NAME, X40 (10/);
>ORDER-NO, X02 (10/);
>STATE, X02 (10/);
>TOTAL-DOLLARS, X10 (10/);
>ZIP, X10 (10/);
>
>SETS:
>NAME: CUSTOMER-MASTER, M (10/20);
>
>ENTRY: CUSTOMER-NAME(1),
> ADDRESS-LINE-1,
> ADDRESS-LINE-2,
> ADDRESS-LINE-3,
> CITY,
> STATE,
> ZIP;
>CAPACITY: 5;
>
>NAME: ORDER-NO-MASTER, A (10/20);
>
>ENTRY: ORDER-NO(1);
>CAPACITY: 5;
>
>NAME: ORDER-SUMMARY, D (10/20);
>
>ENTRY: ORDER-NO(ORDER-NO-MASTER),
> CUSTOMER-NAME(CUSTOMER-MASTER),
> TOTAL-DOLLARS;
>CAPACITY: 300000,1000,1000;
>END.
DATA SET        TYPE FLD PT ENTR MED  MAXIMUM  BLK BLK  DISC
NAME                 CNT CT LGTH REC  CAPACITY FAC LGTH SPACE
CUSTOMER-MASTER  M    7   1  106 117         5   3  352    16
ORDER-NO-MASTER  A    1   1    1  12         5   5   61    16
ORDER-SUMMARY    D    3   2   26  34    300000  15  511   272
INITIAL CAPACITY = 1005 INCREMENT ENTRIES = 1005
TOTAL DISC SECTORS INCLUDING ROOT: 320
NUMBER OF ERROR MESSAGES: 0
ITEM NAME COUNT: 9 DATA SET COUNT: 3
ROOT LENGTH: 652 BUFFER LENGTH: 511 TRAILER LENGTH: 256
ROOT FILE TEST CREATED.
:dbutil
HP30391C.07.04 TurboIMAGE/XL: DBUTIL (C) COPYRIGHT HEWLETT-PACKARD COMPANY 1987
>>create test
Database TEST has been CREATED.
>>show test all
For database TEST
Maintenance word is not present.
Access is enabled.
Autodefer is disabled.
Dumping is disabled.
Rollback recovery is disabled.
Recovery is disabled.
ILR is disabled.
Mustrecover is disabled.
Logging is disabled.
Prefetch is disabled.
Indexing is disabled.
HWMPUT is disabled.
Restart is disabled.
Database has never been stored.
Database has been modified since last store date.
Shadowing is disabled.
Subsystem access is READ/WRITE.
CIUPDATE is disallowed.
Dynamic capacity expansion is used for 1 data sets.
Database has no B-trees.
Logid is not present.
XM log set : this database is not attached to an XM log set.
The language is 0 - NATIVE-3000.
Buffer specifications: 8(1/2),9(3/4),10(5/6),11(7/8),12(9/10),13(11/12),14(13/14), 15(15/16),16(17/18),17(19/120)
No other users are accessing the database.
-------------------------------------------------------------------
5.0 What are the Data Base Access Intrinsics?
The data base is programmatically accessed via calls to intrinsics that contain parameters that are passed back and forth. The parameters along with the intrinsic called act as commands to Image to perform the specified task. It will be much easier if you think of the individual intrinsics as commands that come in pairs.
Just like the seven dwarfs, happy-grumpy,sleepy-Sneezy, dopey-doc, and bashful. The Image intrinsic commands can also be paired, DBopen-DBclose, DBlock-DBunlock, DBerror-DBexplain, DBput-DBdelete, DBget-DBfind, and DBupdate. Another good discussion/overview of the intrinsics can be found at: http://www.robelle.com/smugbook/imageapi.html 5.1 Parameters and data types Before getting into the discussion of the actual commands, we can cover some of the general housekeeping issues. How is the intrinsic referenced in the different programming languages. What are some of the common parameters passed back and forth. And, most importantly, how can the program tell if its call was successful or not The syntax for the intrinsic procedure call is different for various languages. The general form is presented here. See the program examples included for more detail. COBOL CALL "DBxxxx" USING parm1, parm2, parm3 ... FORTRAN CALL DBxxxx (parm1, parm2, parm3 ... ) SPL DBxxxx (parm1, parm2, parm3 ... ) BASIC 100 CALL XDBxxxx (parm1, parm2, parm3 ... ) PASCAL DBxxxx (parm1, parm2, parm3 ... ) The xxxx indicates the various commands available. The parameters vary slightly with each command and are discussed in the next section. 5.1.1 Data base parameters The three most common parameters are the base name, the mode, and the status, which are required for most intrinsic calls. The base name is the name given in the BEGIN statement of the data base schema. The mode parameter has different meanings for different Image calls, and for some calls the value is always the same. This is done for consistency and compatibility with older versions of Image. +----------------------------------------------------------------------+ | INTRINSIC| Parameters passed to and received from Image | | COMMAND | UPPER case are input parameters, lower case are returned | +----------------------------------------------------------------------+ | DBopen | BASE | PASSWORD | MODES | status | | | | | ------ | NAME | | (1-8) | array | | | | |----------------------------------------------------------------------| | DBclose | base | DATA SET | MODES | status | | | | | ------- | name | NAME OR # | (1-3) | array | | | | |----------------------------------------------------------------------| | DBlock | base | QUALIFIER | MODES | status | | | | | ------ | name | | (1-6) | array | | | | |----------------------------------------------------------------------| | DBunlock | base | DUMMY | MODE | status | | | | | -------- | name | PARAMETER | = 1 | array | | | | |----------------------------------------------------------------------| | DBerror | | | | STATUS | | buffer | | | ------- | | | | ARRAY | | | | |----------------------------------------------------------------------| | DBexplain| | | | STATUS | | | | | ---------| | | | ARRAY | | | | |----------------------------------------------------------------------| | DBput | base | DATA SET | MODE | status | LIST | BUFFER | | | ----- | name | NAME OR # | = 1 | array | | | | |----------------------------------------------------------------------| | DBdelete | base | DATA SET | MODE | status | | | | | -------- | name | NAME OR # | = 1 | array | | | | |----------------------------------------------------------------------| | DBget | base | DATA SET | MODE | status | LIST | buffer | ARG | | ----- | name | NAME OR # | (1-8) | array | | | VALUE | |----------------------------------------------------------------------| | DBfind | base | DATA SET | MODE | status | ITEM | | ARG | | ------ | name | NAME OR # | = 
1 | array | | | VALUE | |----------------------------------------------------------------------| | DBupdate | base | DATA SET | MODE | status | LIST | BUFFER | | | -------- | name | NAME OR # | = 1 | array | | | | |----------------------------------------------------------------------| | DBbegin | base | LOGFILE | MODE | status | TEXT | | | | ------- | name | TEXT | = 1 | array | LEN | | | |----------------------------------------------------------------------| | DBend | base | LOGFILE | MODE | status | TEXT | | | | ----- | name | TEXT | = 1 | array | LEN | | | |----------------------------------------------------------------------| | DBinfo | base | QUALIFIER | MODE | status |BUFFER| | | | -------- | name | | | array | | | | |----------------------------------------------------------------------| | DBxbegin | base | QUALIFIER | MODE | status |BUFFER| | | | -------- | name | | | array | | | | |----------------------------------------------------------------------| | DBxend | base | QUALIFIER | MODE | status |BUFFER| | | | -------- | name | | | array | | | | +----------------------------------------------------------------------|
5.1.2 The status parameters
The status parameter is the only parameter required for each and every call to Image. It is used to communicate the degree of success or failure and provide other useful data back to the calling program. The status parameter is a ten word array with the following format:
Word 1, Zero if the call was good or a number if something went wrong; negative numbers indicate bad parameters or conditions, positive error numbers indicate that the procedure encountered some exception condition. See appendix A of the Image manual for more details. The negative error numbers generally indicate some type of problem with the parameters being passed to the Image procedure. Problems with parameter number one range from -11 to -19, problems with parameter number two range from -21 to -29, etc.
For DBput, DBdelete, DBget, DBfind, and DBupdate:
Word 2, The current entry length in words.
Word 3/4, The current record number.
Word 5/6, The chain length, returned by DBfind.
Word 7/8, Backward pointer to the previous entry in the chain.
Word 9/10, Forward pointer to the next entry in the chain.
For all the other intrinsics see individual descriptions.
5.2 DBopen & DBclose
DBopen and DBclose are the first and last intrinsic commands passed to Image. They are used to begin and end the application's access to the data base. Just like any other file, you must have at least read access to the data base you are attempting to open.
5.2.1 Description
Upon calling DBopen Image: Opens the root file, but does not open any of the sets until referenced by the data set parameter in any of the other intrinsics. Uses the password provided to determine the security restrictions that will apply for subsequent calls under this DBopen. Uses the mode provided to determine allowable concurrent accessors.
Upon calling DBclose Image: Terminates or rewinds access to a single data set or to the entire data base, depending on the mode provided (see the DBclose parameters below).
5.2.2 Parameters
The form of the DBopen intrinsic call is:
DBopen (BASE,PASSWORD,MODE,status)
BASE - Is a character string that contains two blanks, then the data base name, followed by a semicolon or a blank. If the data base is successfully opened, Image puts the data base ID word in the space you allocated in the first two bytes of the BASE array. This base ID identifies this DBopen call to Image for subsequent accesses.
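For example, a minimal COBOL sketch of the typical declaration (the TEST data base and READER password come from the walk through schema; every other name here is illustrative only). The two leading blanks in the base string leave room for the base ID that Image returns:

     IDENTIFICATION DIVISION.
     PROGRAM-ID. OPENDEMO.
     DATA DIVISION.
     WORKING-STORAGE SECTION.
    *Two leading blanks: DBopen overwrites them with the base ID.
     01  DB-BASE          PIC X(12)      VALUE "  TEST;".
     01  DB-PASSWORD      PIC X(10)      VALUE "READER".
     01  OPEN-MODE        PIC S9(4) COMP VALUE 5.
     01  CLOSE-MODE       PIC S9(4) COMP VALUE 1.
     01  DUMMY-SET        PIC X(2)       VALUE ";".
     01  DB-STATUS-AREA.
         05  COND-WORD    PIC S9(4) COMP.
         05  REST-OF-STAT PIC S9(4) COMP OCCURS 9 TIMES.
     PROCEDURE DIVISION.
     MAIN-PARA.
         CALL "DBOPEN" USING DB-BASE, DB-PASSWORD, OPEN-MODE,
                             DB-STATUS-AREA.
         IF COND-WORD NOT = 0
             CALL "DBEXPLAIN" USING DB-STATUS-AREA
             STOP RUN
         END-IF.
    *    DB-BASE (now holding the base ID) is passed, unmodified,
    *    to every subsequent intrinsic call under this DBopen.
         CALL "DBCLOSE" USING DB-BASE, DUMMY-SET, CLOSE-MODE,
                              DB-STATUS-AREA.
         STOP RUN.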
Therefore, if you overlay the base name variable in your program you will not be able to execute any further data base intrinsics under the current DBopen. To hard code the location of the data base or access a data base in another account, simply append the .GROUP and/or .ACCOUNT to the string of characters you pass to Image as the BASE parameter.
PASSWORD - Is a string of characters corresponding to the passwords built into the schema followed by an optional user identifier. The password is used to establish the user's class number which is then mapped to the access capabilities at the set and item levels. The user identifier is used by the recovery program to assist in identifying users who log on under the same group and account. The construct is the left justified password followed by the optional user identifier followed by a space. If a semicolon is used the class number will be 64 if the user is logged on exactly as the user who used DBschema and DButil to create the data base. If the password provided matches one of the passwords in the password table, then the class number is mapped from the password table. If the password does not match any of the defined passwords, then a class number of zero is assigned and Image will check to see if this class number has been specified on any of the sets or items within those sets.
MODE - Tells Image what type of access is requested, the valid modes are:
1 - Modify access with enforced locking, the other users may open only with modes 1 and 5. This is the most useful mode in that it allows all of the intrinsic calls. However, your password must allow read or write access to the specific sets or items requested.
2 - Update access, same as mode 1 but disallows calls to DBput and DBdelete. The other users must open in either mode 2 or mode 6. I have never seen this mode used, probably included for completeness.
3 - Modify exclusive, same as mode 1 but since no one else is allowed to access the data base, no locking is required and no other concurrent modes of open are allowed. This mode has performance considerations for batch jobs that add or delete large numbers of entries. Mode 3 turns off the code required to share the data base and allows more efficient use of the buffers.
4 - Modify, allow concurrent read access. This is the same as mode 1 but no locking is required because the only other accessors allowed will be mode 6.
5 - Read, allow concurrent modify access. This is a read only mode that will allow any number of other mode 1 and 5 accessors. This mode is the second most popular mode, it is used by report programs that only need to read and process the entries.
6 - Read, allow concurrent modify access. This is a read only mode that allows concurrent modes of 6, 2, 4, or 8.
7 - Read, exclusive. This mode would only be used to improve the performance of a large batch job that required the data to remain the same for the length of the procedure.
8 - Read, allow concurrent read access in mode 6.
STATUS - Is a ten word array that returns information about the results of the intrinsic call. If the DBopen worked, the status array will contain:
Word Contents
1 The condition word of 0.
2 The user class number assigned from 0 to 64.
3 The current size of the global buffer area in words.
4 The current size of the user buffer area in words.
5/10 Not documented unless an error occurs Typical non-zero condition words: -11 Bad base parameter -21 Bad password parameter -31 Bad mode parameter -92 Data base not created The form of the DBclose intrinsic call is: CALL DBclose(BASE,DATA SET,MODE,status) BASE - Is the unmodified parameter used in the DBopen call. DATA SET - Is the name of the individual data set to be closed. This parameter's usage is dependent on the mode provided. MODE - Directs Image to close a set or the entire data base. A mode of 1 terminates access to the entire data base. A mode of 2 closes the data set referenced in the DATA SET parameter. And a mode of 3 reinitializes the data set referenced in the DATA SET parameter but the set will remain opened. Modes 2 and 3 are used to reposition the current record pointer so that a DBget of the next record will return the first record in the data set. STATUS - The condition word will contain a zero if the close was performed as requested. Common non-zero status words are similar to the DBopen. 5.2.3 Walk through NOTE. THAT DBDRIVER IS DOCUMENTED IN SECTION 6.5. In these walk through pay special attention to the CPU and elapsed times, these show the relative performance profiles for each type of Image data base call. :DBDRIVER DBDRIVER C.05.00 FRI, NOV 6, 1998, 9:23 AM (C) HEWLETT-PACKARD 1978 Command: !B=TEST Command: !Q=; Command: !M=1 Command: O Elapsed time (MS) = 287 CPU = 117 Baseid=%000001 Class=64 Current DBG size=32767 DBU size=10240 Command: ?S 0000 0040 7fff 2800 0000 0191 4163 88d4 0001 0000 Command: C Elapsed time (MS) = 21 CPU = 21 Command: ?S 0000 0040 7fff 2800 0000 1193 4163 88d4 0001 0000 5.3 DBlock & DBunlock Image leaves the locking of individual data entries or groups for entries totally up to the application designer and programmer. Locking allows shared access to the data base and strategies of locking are highly application dependent. One general rule does apply however, the larger the number of shared assessors, the smaller the group of entries locked. We have seen that the mode of DBopen can serve as a form of data base lock, however these exclusive access modes would not be appropriate for an application with more than one concurrent user. So, the DBlock and DBunlock commands allow programmatic locking of the entire data base, a specific data set, or a specific set of entries within a data set with a particular data item value. IT IS EXTREMELY IMPORTANT TO UNDERSTAND THAT THESE "LOCKS" ARE LOGICALLY HELD AND THAT THERE IS ABSOLUTELY NO CHECKING DONE TO SEE IF THE LOCKED ENTRIES ARE INDEED THE ONES BEING UPDATED. IT IS ALSO EXTREMELY IMPORTANT TO UNDERSTAND THAT ALL ACCESSORS MUST LOCK AT A GIVEN LEVEL IN ORDER TO RECEIVE THE BENEFITS OF GREATER CONCURRENT ACCESS. In other words, one application locking at the data base level will negate the effects of all of the other applications locking at the data set or data item level. 5.3.1 Description Upon calling DBlock Image: Will check the current lock list to see if a conflict exists. If no conflict exists, Image will add the lock request to the lock list. If a conflict exists, then one of two things will occur depending on the mode, which tells Image to wait or return. If the mode is set for unconditional locking, the program suspends and waits until all of the conflicting entries in the lock list have been unlocked. If a conditional lock mode was requested and a conflict exists, the status is returned to the calling program which can then choose to repeat the call or go on to other transactions. 
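To make the conditional case concrete, here is a minimal COBOL sketch that tries a conditional data set lock (mode 4) a few times before giving up. The data base, password and set names are taken from the walk through schema; the variable names and retry count are illustrative only, and a real application would do something useful between the lock and unlock calls:

     IDENTIFICATION DIVISION.
     PROGRAM-ID. LOCKDEMO.
     DATA DIVISION.
     WORKING-STORAGE SECTION.
     01  DB-BASE          PIC X(12)      VALUE "  TEST;".
     01  DB-PASSWORD      PIC X(10)      VALUE "WRITER".
     01  OPEN-MODE        PIC S9(4) COMP VALUE 1.
     01  LOCK-MODE        PIC S9(4) COMP VALUE 4.
     01  UNLOCK-MODE      PIC S9(4) COMP VALUE 1.
     01  CLOSE-MODE       PIC S9(4) COMP VALUE 1.
     01  SET-NAME         PIC X(16)      VALUE "ORDER-SUMMARY".
     01  DUMMY-PARM       PIC X(2)       VALUE ";".
     01  TRY-COUNT        PIC S9(4) COMP VALUE 0.
     01  DB-STATUS-AREA.
         05  COND-WORD    PIC S9(4) COMP.
         05  LOCK-FLAG    PIC S9(4) COMP.
         05  REST-OF-STAT PIC S9(4) COMP OCCURS 8 TIMES.
     PROCEDURE DIVISION.
     MAIN-PARA.
         CALL "DBOPEN" USING DB-BASE, DB-PASSWORD, OPEN-MODE,
                             DB-STATUS-AREA.
         IF COND-WORD NOT = 0
             CALL "DBEXPLAIN" USING DB-STATUS-AREA
             STOP RUN
         END-IF.
    *    Conditional data set lock (mode 4): word two of the status
    *    area shows whether the lock was actually granted.
         MOVE 0 TO LOCK-FLAG.
         PERFORM UNTIL LOCK-FLAG = 1 OR TRY-COUNT = 5
             ADD 1 TO TRY-COUNT
             CALL "DBLOCK" USING DB-BASE, SET-NAME, LOCK-MODE,
                                 DB-STATUS-AREA
         END-PERFORM.
         IF LOCK-FLAG = 1
    *        DBGET / DBUPDATE / DBPUT calls against ORDER-SUMMARY
    *        would go here, inside the lock.
             CALL "DBUNLOCK" USING DB-BASE, DUMMY-PARM, UNLOCK-MODE,
                                   DB-STATUS-AREA
         ELSE
             DISPLAY "ORDER-SUMMARY IS BUSY, TRY AGAIN LATER"
         END-IF.
         CALL "DBCLOSE" USING DB-BASE, DUMMY-PARM, CLOSE-MODE,
                              DB-STATUS-AREA.
         STOP RUN.

Changing LOCK-MODE to 3 would make the same request unconditional, in which case the program would simply wait in line until the conflicting entries in the lock list are released.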
NOTE THAT THE ONLY OTHER WAY TO CLEAR A PROGRAM SUSPENDED UNCONDITIONALLY ON A DBlock IS TO RESTART THE SYSTEM, BUT AN UNCONDITIONAL LOCK IS THE ONLY WAY TO GET IN LINE FOR A LOCK REQUEST.
Locks should never be held in a situation where the user may have to enter some data; instead, your application should lock around the logical transaction. More on this topic as we cover additional intrinsics.
Upon calling DBunlock, Image will: Verify that the mode is set to 1, remove the calling program's lock from the lock list, and reactivate any programs which had been suspended because of this entry in the lock list.
5.3.2 Parameters
The form of the DBlock intrinsic call is:
Call DBlock(BASE,QUALIFIER,MODE,status)
QUALIFIER - Is ignored in modes 1 and 2, is the data set name for modes 3 and 4, and is an array of lock descriptors for modes 5 and 6. The format of the lock descriptor is a table within a table:
General format:
Word Contents:
1 The number of lock descriptors
2 The first lock descriptor
n The second lock descriptor
m The third lock descriptor
Lock descriptor format:
Word Contents
1 The length of the descriptor
2 An array containing the data set, data item and a relational operator and value of the item or items to be locked.
MODE - Tells Image what level and whether to lock conditionally or unconditionally. The odd modes are unconditional and the even modes are conditional. Modes 1 and 2 lock at the data base level. Modes 3 and 4 lock at the data set level and modes 5 and 6 lock on the content of a data item or items.
STATUS - The condition word for DBlock is always zero. Word two of the status area must be checked to see if the lock was actually granted as requested. For modes 1 through 4 this value will be one if successful and zero if not. For modes 5 and 6 this value will depend on the number of lock descriptors coded in the request.
Typical non-zero condition words:
20 Data base lock conflict (modes 2,4,6)
22 Data set lock conflict (modes 2)
23 Data entry lock conflict (mode 4)
24 Data item lock conflict (mode 6)
25 Entry or entries already locked (mode 6)
The form of the DBunlock intrinsic call is:
Call DBunlock(BASE,DATA SET,MODE,status)
DATA SET - Is not used but is probably included for some planned functionality that was never added, or to be compatible with the number and types of parameters for the DBlock call.
MODE - Must be set to 1.
STATUS - The condition word for DBunlock is always zero. Word two of the status area must be checked to see if the unlock was actually granted as requested. For modes 1 through 4 this value will be one if successful and zero if not. For modes 5 and 6 this value will depend on the number of lock descriptors coded in the original lock request. Data set locks released are counted as one descriptor.
Typical non-zero condition words: -11 Bad BASE parameter -31 Bad MODE parameter 5.3.3 Walk through !M=1 Command: LOCK Elapsed time (MS) = 0 CPU = 0 Command: ?S 0000 0001 0000 2800 0000 1199 4163 88d4 0001 0000 Command: RELEASE Elapsed time (MS) = 0 CPU = 0 Command: ?S 0000 0001 0000 2800 0000 119a 4163 88d4 0001 0000 Mixed Mode Unconditional Locking Example Time| DBopen I | DBopen II | DBopen III | DBopen IV ----|---------------|---------------|---------------|-------------- 1 |DBlock MODE= 5 | | | |ORDER-SUMMARY | | | |ORDER-NUMBER= 1| | | |<> | | | ----|---------------|---------------|---------------|-------------- 2 | |DBlock MODE=3 | | | |ORDER-SUMMARY | | | |<> | | ----|---------------|---------------|---------------|-------------- 3 | | |DBlock MODE= 5 | | | |ORDER-SUMMARY | | | |ORDER-NUMBER= 7| | |<> |<> | ----|---------------|---------------|---------------|-------------- 4 |DBunlock | | | | |<> |<> | ----|---------------|---------------|---------------|-------------- 5 |DBlock MODE= 5 | | | |ORDER-SUMMARY | | | |ORDER-NUMBER= 2| | | |<> | |<> | ----|---------------|---------------|---------------|-------------- 6 | | | |DBlock MODE= 1 |<> | |<> |<> ----|---------------|---------------|---------------|-------------- 7 | |DBunlock | | |<> | |<> |<> ----|---------------|---------------|---------------|-------------- 8 |DBunlock | | | | | | |<> ----|---------------|---------------|---------------|-------------- 9 | | |DBunlock | | | | |<> 5.4 DBerror & DBexplain These intrinsics are used to format the status array into a buffer or the STDLIST of the application program. These error messages are meant for debugging purposes and should not be presented to the end user in any form. The better application programs will print an error listing to the system line printer so that as much information about the error condition can be accurately captured as possible. 5.4.1 Description Upon calling DBerror Image will: Use the status array to determine the error from a prior call and look up that error in a message catalog. Then load the text into the buffer provided and set the length parameter to the length of the text. This buffer then can be used to print a complete error incident report which should include the application name, the users current sign on, logical device and time of day. Upon calling DBexplain Image will: Use the status array to format an almost incomprehensible error message on the applications STDLIST device. This call should be used for debugging purposes only because a user would never be able to correctly copy down or relay the formatted message to anyone who could act on it. 5.4.2 Parameters The form of the DBerror Intrinsic is: DBerror(STATUS,buffer,length) STATUS - Is the status array from a previous Image call. BUFFER - Is the text error message returned by DBerror. LENGTH - Is the number of characters in the buffer to a maximum of 72. There is no status returned from the DBerror call so we can really never know if there were any errors in the DBerror intrinsic call. The form of the DBexplain call is: DBexplain(STATUS) STATUS - Is the status array from a previous Image call. There is no status returned from the DBexplain call, so again we really never know if this intrinsic worked. 
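As an illustration of the DBerror calling sequence just described, here is a minimal COBOL sketch (the TEST data base and READER password come from the walk through schema; all other names are illustrative only) that reports a failed DBopen through DBerror, so the text could just as easily be written to a printer file along with the user, logical device and time of day information suggested above:

     IDENTIFICATION DIVISION.
     PROGRAM-ID. ERRDEMO.
     DATA DIVISION.
     WORKING-STORAGE SECTION.
     01  DB-BASE          PIC X(12)      VALUE "  TEST;".
     01  DB-PASSWORD      PIC X(10)      VALUE "READER".
     01  OPEN-MODE        PIC S9(4) COMP VALUE 5.
     01  CLOSE-MODE       PIC S9(4) COMP VALUE 1.
     01  DUMMY-PARM       PIC X(2)       VALUE ";".
     01  ERR-BUFFER       PIC X(72)      VALUE SPACES.
     01  ERR-LENGTH       PIC S9(4) COMP VALUE 0.
     01  DB-STATUS-AREA.
         05  COND-WORD    PIC S9(4) COMP.
         05  REST-OF-STAT PIC S9(4) COMP OCCURS 9 TIMES.
     PROCEDURE DIVISION.
     MAIN-PARA.
         CALL "DBOPEN" USING DB-BASE, DB-PASSWORD, OPEN-MODE,
                             DB-STATUS-AREA.
         IF COND-WORD NOT = 0
    *        Format the status array into printable text (72 chars max).
             CALL "DBERROR" USING DB-STATUS-AREA, ERR-BUFFER,
                                  ERR-LENGTH
             DISPLAY "DBOPEN FAILED: " ERR-BUFFER
             STOP RUN
         END-IF.
         CALL "DBCLOSE" USING DB-BASE, DUMMY-PARM, CLOSE-MODE,
                              DB-STATUS-AREA.
         STOP RUN.

The 72 character buffer matches the maximum message length noted above.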
5.4.3 Walk through Command: !b=BADDB Command: !Q=; Command: !M=1 Command: OPEN Elapsed time (MS) = 58 CPU = 43 NO SUCH DATABASE Command: ?S ffff 0000 0034 0d50 0000 0191 4163 88d4 0001 0000 Command: XPLAIN TURBOIMAGE ERROR AT ???????; RETURN STATUS = -1 DBOPEN, MODE 1, ON BADDB NO SUCH DATABASE HEX DUMP OF STATUS ARRAY FOLLOWS: ffff 0000 0034 0d50 0000 0191 4163 88d4 0001 0000 5.5 DBput & DBdelete Well, now that we can open and close the data base and explain the possible errors, we can get down to the business of getting data into and deleting data out of a data base. The DBput and DBdelete intrinsic calls can only be made to data bases opened in modes 1,3 or 4, otherwise known as modify access. The user class granted from the password must be on the write side of the read/write class list for the data set whose entries are being added or deleted. Data entries are individually added by calls to the DBput intrinsic. Entries are manually added to manual masters and details by calling DBput. Image automatically adds entries to automatic masters on an as required basis to serve as index records as detail entries are added to associated masters. For details indexed by manual masters, your application must add the manual entry prior to attempting to add the detail entry. This feature can be used to make sure that, for example, the vendor number exists prior to the adding of P.O.'s to a purchase order detail. When adding to a manual master, the search item must be unique, if an entry is already present with the same search item, an error is returned. To delete an entry, you must first read the entry with the DBget intrinsic. If the entry is a detail, and it is the only entry on a chain from an automatic master, then the master entry will also be deleted. If the entry to be deleted is in a manual master, then all associated detail chains must be empty for the delete to be successful. If the data base was opened in mode 1, then the DBdelete sequence must be accompanied by a preceding call to DBlock and followed by a call to DBunlock, to insure that another user does not delete the entry between when the record was read and when it was deleted. 5.5.1 Description Upon calling DBput, Image will: Check to see if the data set is an automatic master, a manual master or a detail data set. If it is an automatic master, Image will return an error message. If the DBput is to a manual master, Image will: Check to see if the search item is in the list specified, and see if the search item is unique by hashing the key. Check to see if a lock covers the data base, data set or some value of the search item if the open mode was 1. If the primary location is occupied, Image must determine if the entry at that location is a primary entry or a secondary entry on a secondary chain to another primary. If the entry at the hashed location was a primary, then the new entry is added to its secondary chain at the nearest open record. If the entry hashed to a location that was occupied by a secondary, then the secondary is moved to another location and the new entry is made a primary at its hashed location. If the DBput is to a detail, Image will: Check the free space chain of deleted entries. If any entries have been deleted, then the most recently deleted location is used. If no records have ever been deleted or all delete entries have been reused, then new entries are added at the EOF. 
Add the detail to all chains from associated masters by finding each associated master chain head and modifying the backward pointer and chain count of each master. Then Image finds the old last entry in the chain and sets its forward pointer to point to the new record in the chain. And finally, Image sets the forward and backward pointers of the new entry. If the detail data set specified that its chains were to be kept in sorted sequence, then instead of adding at the end of the detail chain, Image will read up the chain to find the logical point to add the record in the chain. This can produce applications which are spectacularly slow, and should only be specified if the average chain length is small or the records are added in the same sequence as the sort item.
Upon a call to DBdelete, Image will: Check to see if there is a current data entry to be deleted, and whether it is a manual master or detail entry. If no current record is present Image will return an error.
If the current entry to be deleted is a manual master entry, Image will: First, check to see if all of the chain heads are empty. If not, Image will return an error for the DBdelete call. Then Image will identify the type of master entry to be deleted. The entry can be either a primary with no secondaries, a primary with secondaries or a secondary entry. To delete a primary entry with no secondaries, Image sets the entry to zeros, flips the bit in the bit map to 0, and updates the free space count in the label of the data set. To delete a primary entry that has secondary entries, Image finds the next entry in the secondary chain and migrates it to the primary location, overlaying the primary to be deleted, zeros out the secondary's old location, flips the bit in the bit map and updates the free space count on the label of the data set. To delete a secondary entry, Image hashes the key value of the secondary to find the primary entry, locates the preceding and following entries in the secondary chain, if any, and updates their forward and backward pointers, updates the secondary chain counter in the primary entry, flips the bit in the bit map and updates the free space count on the label of the data set.
5.5.2 Parameters
The form of the DBput Intrinsic is:
DBput(BASE,DATA SET,MODE,status,LIST,BUFFER)
BASE - Is the unmodified parameter used in the DBopen call.
DATA SET - Is the name of the individual data set to which the entry is to be added.
MODE - Must be set to 1.
STATUS - The condition word will contain a zero if the entry was added as requested. When successful, the status will also hold other interesting information:
Word Contents
1 The condition word
2 Word length of the buffer array
3/4 Two word relative record number of the new entry
5/6, 7/8 Count of the number of entries in the chain; if this was a master, it is the number of secondaries, if this was a detail, it is the number of entries in the detail chain.
9/10 If this was a detail, this is a pointer to the next entry on the chain; if a master, it is zeros.
LIST - This is the list of data items to be added in the entry. This list can be the data item names, numbers, or a special character. The data item numbers are simply the numbers given in the sequence defined in the item table of the schema. Two special characters may be used: the * in the list tells Image to use the identical list as was specified in the previous call, an @ tells Image to use all of the data items for the data set.
Both of these special lists save processing time because if data item names are specified, Image must look them up in the item table and cut and paste the contents of the buffer to match the actual entry data record. All of the search or sort items must be in the list provided or else Image will not be able to complete the DBput request. BUFFER - Is an array containing the values of the data items to be added. The individual fields of the buffer must correspond to the sequence and type of the items in the LIST. Typical non-zero condition words: -12 No lock covering entry to be added. -14 Cannot do a DBput in current DBopen mode. -21 Bad data set in DATA SET parameter. -23 Do not have write access to data set. -24 Cannot write to an Automatic master. -52 Bad LIST or item in list. -53 Missing search or sort item. 16 Data set is full (delete entries or resize) 43 Attempt to add duplicate master entry. 1xx Missing manual master entry on path number xx The form of the DBdelete Intrinsic is: DBdelete(BASE,DATA SET,MODE,status) BASE - Is the unmodified parameter used in the DBopen call. DATA SET - Is the name of the individual data set to which the entry is to be removed. MODE - Must be set to 1. STATUS - The condition word will contain a zero if the entry was removed as requested. When successful, the status will remain as it was after the preceding DBget: Word Contents 1 The condition word. 2 Zero. 3/4 Two word relative record number of the deleted entry. 5/6 Count of the number of entries in the chain, if this was 7/8 a master it is the new number of secondaries, if this was a detail it is not modified. 9/10 Not modified from the preceding call. Typical non-zero condition words: -12 No lock covering entry to be deleted. -14 Cannot do a DBdelete in current DBopen mode. -21 Bad data set in DATA SET parameter. -23 Do not have write access to data set. 17 No entry has been previously found. 44 Attempt to delete a master with detail chains. 5.5.3 Walk through Command: !Q=CUSTOMER-MASTER Command: !M=1 Command: !L=CUSTOMER-NAME Command: !E=TEST CUSTOMER1 Command: PUT Elapsed time (MS) = 41 CPU = 19 Command: ?S 0000 0014 0000 0005 0000 0001 0000 0000 0000 0000 Command: !E=TEST-CUSTOMER2 Command: PUT Elapsed time (MS) = 3 CPU = 2 Command: ?S 0000 0014 0000 0004 0000 0002 0000 0000 0000 0000 Command: DELETE Elapsed time (MS) = 2 CPU = 2 Command: ?S 0000 0000 0000 0004 0000 0002 0000 0000 0000 0000 5.5.4 Image master secondary chain processing NOTE: THIS SECTION NEEDS UPDATING FOR MASTER DYNAMIC EXPANSION! Secondary master entries happen when unlike search item values hash to the same relative record address. The frequency of this event is dependent upon the capacity of the master and the content of the search item. When this happens Image finds the nearest open entry using the bit map, and places this entry at that relative record number and identifies it as a secondary by chaining it to the entry that got put in the primary location first. A secondary chain consists of the first entry to hash to a given relative record address and all of its secondaries linked together with forward and backward pointers. It is important to understand the performance and practical implications of secondary chains because they can cause severe performance degradation and erroneous processing. The secondary chain head is made up of three fields, a count of the number of entries in the chain, a forward pointer to the next entry on the chain, and a backward pointer to the previous entry on the chain. 
If the count is 1 then the entry is a primary with no secondaries. If the count is 0 then the entry is a secondary to some other primary entry. And if the count is greater than 1 then the entry is a primary and the count is the number of secondary entries on the chain. An example of the ORDER-NO-MASTER after four entries have been added:
ORDER-NO-MASTER
Relative| Secondary Chain Head | Detail Chain Head |
 Record |Count|Backward|Forward|Count|Backward|Forward|ORDER-NO
--------|-----|--------|-------|-----|--------|-------|----------
   180  |     |        |       |     |        |       |
--------|-----|--------|-------|-----|--------|-------|----------
   181  |  2  |   187  |  187  |  1  |   133  |  133  |    1
--------|-----|--------|-------|-----|--------|-------|----------
   182  |  1  |    0   |   0   |  1  |    2   |   2   |    9
--------|-----|--------|-------|-----|--------|-------|----------
   183  |     |        |       |     |        |       |
--------|-----|--------|-------|-----|--------|-------|----------
   184  |     |        |       |     |        |       |
--------|-----|--------|-------|-----|--------|-------|----------
   185  |  1  |    0   |   0   |  1  |   511  |  511  |    5
--------|-----|--------|-------|-----|--------|-------|----------
   186  |     |        |       |     |        |       |
--------|-----|--------|-------|-----|--------|-------|----------
   187  |  0  |    0   |   0   |  1  |   358  |  358  |    7
--------|-----|--------|-------|-----|--------|-------|----------
5.5.5 Migrating secondaries and DBput
Suppose that an order number of 10 is to be added, and it hashes to relative record number 182. Relative record number 182 has a primary entry so Image will attempt to locate the new secondary entry in the same block as its primary and link them together via forward and backward pointers.
ORDER-NO-MASTER
Relative| Secondary Chain Head | Detail Chain Head |
 Record |Count|Backward|Forward|Count|Backward|Forward|ORDER-NO
--------|-----|--------|-------|-----|--------|-------|----------
   180  |  0  |    0   |   0   |  1  |    5   |   5   |   10
--------|-----|--------|-------|-----|--------|-------|----------
   181  |  2  |   187  |  187  |  1  |   133  |  133  |    1
--------|-----|--------|-------|-----|--------|-------|----------
   182  |  2  |   180  |  180  |  1  |    2   |   2   |    9
--------|-----|--------|-------|-----|--------|-------|----------
   183  |     |        |       |     |        |       |
Now suppose that order number 22 is to be added, and it also hashes to relative record number 182. This new entry is also linked to the secondary chain in the following fashion.
ORDER-NO-MASTER
Relative| Secondary Chain Head | Detail Chain Head |
 Record |Count|Backward|Forward|Count|Backward|Forward|ORDER-NO
--------|-----|--------|-------|-----|--------|-------|----------
   180  |  0  |    0   |  183  |  1  |    5   |   5   |   10
--------|-----|--------|-------|-----|--------|-------|----------
   181  |  2  |   187  |  187  |  1  |   133  |  133  |    1
--------|-----|--------|-------|-----|--------|-------|----------
   182  |  3  |   183  |  180  |  1  |    2   |   2   |    9
--------|-----|--------|-------|-----|--------|-------|----------
   183  |  0  |   180  |   0   |  1  |   101  |  101  |   22
Migrating secondaries happen when a new entry's search item value hashes to a relative record occupied by a secondary entry, or when a primary entry with secondary entries is deleted. In the first example order number 2 is added and hashes to relative record number 183, which is already occupied by a secondary entry. In this case the secondary entry is moved to relative record number 184 so that the new entry can occupy its rightful primary location.
ORDER-NO-MASTER
Relative| Secondary Chain Head | Detail Chain Head |
 Record |Count|Backward|Forward|Count|Backward|Forward|ORDER-NO
--------|-----|--------|-------|-----|--------|-------|----------
   180  |  0  |    0   |  184  |  1  |    5   |   5   |   10
--------|-----|--------|-------|-----|--------|-------|----------
   181  |  2  |   187  |  187  |  1  |   133  |  133  |    1
--------|-----|--------|-------|-----|--------|-------|----------
   182  |  3  |   184  |  180  |  1  |    2   |   2   |    9
--------|-----|--------|-------|-----|--------|-------|----------
   183  |  1  |    0   |   0   |  1  |    1   |   1   |    2
--------|-----|--------|-------|-----|--------|-------|----------
   184  |  0  |   180  |   0   |  1  |   101  |  101  |   22
--------|-----|--------|-------|-----|--------|-------|----------
In this second example order number 9 is to be deleted, and this will cause the next entry in the secondary chain to move into the primary location.
ORDER-NO-MASTER
Relative| Secondary Chain Head | Detail Chain Head |
 Record |Count|Backward|Forward|Count|Backward|Forward|ORDER-NO
--------|-----|--------|-------|-----|--------|-------|----------
   180  |     |        |       |     |        |       |
--------|-----|--------|-------|-----|--------|-------|----------
   181  |  2  |   187  |  187  |  1  |   133  |  133  |    1
--------|-----|--------|-------|-----|--------|-------|----------
   182  |  2  |   184  |  184  |  1  |    5   |   5   |   10
--------|-----|--------|-------|-----|--------|-------|----------
   183  |  1  |    0   |   0   |  1  |    1   |   1   |    2
--------|-----|--------|-------|-----|--------|-------|----------
   184  |  0  |    0   |   0   |  1  |   101  |  101  |   22
--------|-----|--------|-------|-----|--------|-------|----------
5.6 DBget & DBfind
This is where all of the discussion about access methods pays off. How information is retrieved from the data base is critical to the success of the application. Hopefully, you have a standard application that reads the data 70% of the time and writes it 30% of the time. In other words, data should only be written out to disc if you are going to read it back more than once. The DBfind and DBget Intrinsics work together to allow rapid retrieval of data entries. As we'll see, Image uses directed, calculated, serial and chained access via these two intrinsics for all entry retrieval. DBfind is actually short for DBfindchainhead; remembering the long name will remind you that the DBfind intrinsic does not actually find any data. This intrinsic is solely used to set up the current path for subsequent chained access into a specific detail data set.
5.6.1 Description
Upon calling DBget, Image will: Check to see if the data set is opened and if not, open it. Then verify that the user class number has read access to all of the items in the list. The next step depends on the mode of the DBget:
MODE = 1, Reread the current record: (masters & details) First, Image will determine if there is a current record from a previous DBget, if not, then an error will be returned. Then Image will check the bit map of the current record's block to see if a record at that location still exists. If so, then the record is read and the requested items are returned in the buffer.
MODE = 2, Read the next valid record: (masters & details) Image will scan the bit maps in the current and subsequent blocks until an entry is found. Then the record is read and the requested items are returned in the buffer.
MODE = 3, Read the previous valid record: (masters & details) The same as mode 2, except that Image scans the bit maps backward from the current record.
MODE = 4, Read relative record number N: (masters & details) Image will read in the block containing the requested record, check the bit map to see if a record at that location exists and if so, read the record and return the requested items in the buffer.
MODE = 5, Read the next record in the chain: (details only) Image will get the block of the next record on the chain which is the current path, check the bit map to see if a record exists at the forward pointer location and if so, read the record and return the requested items in the buffer. MODE = 6, Read the previous record in the chain: (details only) Image will get the block of the prior record on the chain which is the current path, check the bit map to see if a record exists at the backward pointer location and if so read the record and return the requested items in the buffer. MODE = 7, Read by calculating a relative record number (masters) Image will hash the argument provided generating a relative record number. The block containing that record is read in and the bit map is checked to see if a record exists at that location. If no record exists, then Image will return an error. If a record exists, then Image must read the record and compare the argument to the search item value of that record. If they match, then the requested items are loaded into the buffer. If they don't match, then Image must read each record on the secondary chain, matching on the argument and search item values. If no secondary entry matches, then Image returns an error. If a secondary does match, then Image loads the items requested into the buffer. MODE = 8, Read the primary entry (masters) This is identical to mode 7 except that the primary record will be returned even if a secondary argument is provided. This call was probably provided because this logic is required for a DBdelete of a secondary master entry. This saves chasing the secondary chain back to the primary entry to update the secondary chain count. Upon calling DBfind, Image will: First determine the appropriate master by matching the data item name specified to a unique master. The argument provided is hashed in a form of DBget mode=7 to locate the master entry which contains the chain head required. If no master entry exists, then an error is returned. If a master entry does exist, then the number of entries on the chain, and the forward and backward pointers are returned. 5.6.2 Parameters The form of the DBget Intrinsic is: DBget(BASE,DATA SET,MODE,status,LIST,buffer) BASE - Is the unmodified parameter used in the DBopen call. DATA SET - Is the name of the individual data set from which the entry is to be retrieved. MODE - Determines the access method to be used: 1 - Reread the current record, very useful for making sure that someone has not modified or deleted it before we attempt to do the same to someone else. 2 - Serially read the next entry in the data set. For masters this could cause a great number of IO's as the next valid entry can only be found by looking in the bit maps of the next blocks until one is found that has an entry. 3 - Reverse read the previous entry in the data set. For masters this could have the same adverse effects as the forward serial read. 4 - Use the argument parameter to read the entry at a specific relative record location. This is useful if you have stored the relative record number of the entries in another file. Query uses this trick when sorting a large file. Instead of building a sort file as large as the entire detail, it builds a sort record consisting of the sort items and the relative record number of the entry, then after the sort, goes back to get the rest of the data using directed DBgets. 5 - Chained read of the next entry in the detail chain. 
Must be preceded by a DBfind at some point to establish the current path.
6 - Reverse chain read of the previous entry in the detail chain. Must be preceded by a DBfind call to establish the current path.
7 - Calculated read to locate a master entry that matches the value in the argument parameter.
8 - Read of the entry that occupies the primary address of the value in the argument parameter. This was provided because the DBdelete intrinsic needed to be able to quickly get to the primary entry when deleting a master entry that happened to be a secondary.
STATUS - The condition word will contain a zero if the entry was retrieved as requested. When successful, the status will also hold other interesting information:
Word Contents
1 The condition word
2 Word length of the buffer array
3/4 Two word relative record number of the entry retrieved.
5/6 If this was a master, it is the number of secondaries. If this was a detail, it is set to zero.
7/8 If this was a detail, this is a pointer to the prior entry on the current chain; if a master, it is zeros.
9/10 If this was a detail, this is a pointer to the next entry on the current chain; if a master, it is zeros.
LIST - This is the list of data items to be retrieved into the buffer. This list can be the data item names, numbers, or a special character. The data item numbers are simply the numbers given in the sequence defined in the item table of the schema. Two special characters may be used: the * in the list tells Image to use the identical list as was specified in the previous call, an @ tells Image to use all of the data items for the data set. Both of these special lists save processing time because if data item names are specified, Image must look them up in the item table and cut and paste the contents of the buffer to match the actual entry data record.
BUFFER - Is an array to receive the values of the data items. The individual fields of the buffer will correspond to the sequence and type of the items in the LIST.
Typical non-zero condition words:
-11 Bad BASE parameter.
-21 Bad DATA SET parameter.
-31 Bad MODE parameter.
-52 Bad LIST or item in list.
10 Beginning of data set from a call in mode 3.
11 End of data set from a call in mode 2.
12 Tried to read before the first record from a call in mode 4.
13 Tried to read past the end of file from a call in mode 4.
14 Reached the top of the chain from a call in mode 6.
15 Reached the end of the chain from a call in mode 5.
17 No entry at/for the argument value specified.
18 Broken chain. Image detected that the entry reached via a call in mode 5 or 6 does not belong on the current chain. This can happen quite easily if your application doesn't issue a lock covering the current chain, and another user deletes the entry your application "thinks" is the next entry. It can also happen for real and can be very difficult to fix. This is why we will discuss data logging and recovery in upcoming chapters.
The form of the DBfind Intrinsic is:
DBfind(BASE,DATA SET,MODE,status,ITEM,ARGUMENT)
BASE - Is the unmodified parameter used in the DBopen call.
DATA SET - Is the name of the individual data set from which the entry is to be retrieved.
MODE - Must be 1 for a DBfind call.
STATUS - The condition word will contain a zero if the chain head was retrieved as requested.
When successful, the status will also hold other interesting information: Word Contents 1 The condition word 2/3 Zeros 5/6 The number of entries in the chain. This can be very useful in that the application can very quickly determine how many records will have to be processed to satisfy a given request. 7/8 Pointer to the relative record number in the detail data set of the last entry in the chain. 9/10 Pointer to the relative record number in the detail data set of the first entry in the chain. ITEM - This is the search item whose chain head we are looking for. The combination of this ITEM and the detail set name in DATA SET are all Image needs to determine the appropriate master to access and retrieve the chain head. ARGUMENT - Is the value of the data item named in the ITEM parameter. Typical non-zero condition words: -11 Bad BASE parameter. -21 Bad DATA SET parameter. -31 Bad MODE parameter. -52 Bad LIST or item in list. 17 No entry at/for the argument value specified. 5.6.3 Walk through 5.7 DBupdate & DBcontrol It used to be that DBupdate could only change the value of non-search and non-sort items of the current record. If a search or sort item must had to be changed then a call to DBdelete and DBput was required. This was one of the major design flaws of Image. To change a search item of one path into a detail, you were forced to delete the entire entry and re-add the entry to all paths. The difference between systems resources for the two types of operations is substantial. Now database access modes 1, 3, or 4, allow DBupdate to modify the values of detail data set search and sort items if permitted by the critical item update (CIUPDATE) option setting in combination with a call to DBcontrol. Master data set key item values cannot be modified with DBupdate. 5.7.1 Description Upon calling DBupdate, Image will: Check to see if the user class has write access to the data items in the list provided. Then Image will check to see if there is a current record. If not, Image will return an error. Image will then compare the value each item in the buffer to the value of each item in the record. If an item is different, then Image checks to make sure that if it is a search or sort item that CIUPDATE is allowed. 5.7.2 Parameters The form of the DBupdate Intrinsic is: DBupdate(BASE,DATA SET,MODE,status,LIST,BUFFER) BASE - Is the unmodified parameter used in the DBopen call. DATA SET - Is the name of the individual data set in which the entry is to be updated. MODE - Must be set to 1. STATUS - The condition word will contain a zero if the entry was updated as requested. Word Contents 1 The condition word 2 Word length of the buffer array 3/4 Same two word relative record number of the entry retrieved with the preceding DBget. 5/6 If the preceding DBget was a master, it is the number of secondaries. If this was a detail, it is set to zero. 7/8 If the preceding DBget was a detail, this is a pointer to the prior entry on the current chain, if a master, it is zeros. 9/10 If the preceding DBget was a detail, this is a pointer to the next entry on the current chain. LIST - This is the list of data items to be updated in the entry, this list can be the data item names, numbers or a special character. The data item numbers are simply the numbers given in the sequence defined in the item table of the schema. Two special characters may be used, the * in the list tells image to use the identical list as was specified in the previous call, an @ tells Image to use all of the data items for the data set. 
Both of these special lists save processing time, because if data item names are specified, Image must look them up in the item table and cut and paste the contents of the buffer to match the actual entry data record.
BUFFER - Is an array containing the values of the data items to be updated. The individual fields of the buffer must correspond to the sequence and type of the items in the LIST.
Typical non-zero condition words:
-12 No lock covering the entry to be updated in DBopen mode 1.
-14 Cannot do a DBupdate in current DBopen mode.
-21 Bad data set in DATA SET parameter.
-52 Bad LIST or item in list.
17 No current entry, try a DBget first.
41 Attempt to update a search or sort item.
42 Attempt to update a read only item, check your user class number against the data set and item read/write lists.
5.7.3 Walk through
DBDRIVER C.05.00 TUE, NOV 10, 1998, 3:27 PM (C) HEWLETT-PACKARD 1978
Command: !B=TEST
Command: !M=3
Command: !Q=;
Command: O
Elapsed time (MS) = 285 CPU = 120
Baseid=%000001 Class=64 Current DBG size=32767 DBU size=10240
Command: !Q=ORDER-SUMMARY
Command: G
Elapsed time (MS) = 17 CPU = 16
030061 052105 051524 026503 052523 052117 046505 051040
 0 1 T E S T - C U S T O M E R
020040 020040 020040 020040 020040 020040 020040 020040
020040 020040 020040 020040 020040 031062 030060 020040
 2 2 0 0
020040 020040
Command: !E=02
Command: !M=1
Command: !L=ORDER-NO
Command: UPDATE
Elapsed time (MS) = 1 CPU = 1
DBUPDATE attempted to modify value of critical item--key, search or sort
Command: !M=5
Command: KONTROL
Elapsed time (MS) = 1 CPU = 0
Command: ?S
0000 0000 0000 0001 0000 319b 4163 88d4 0005 0000
Command: !M=1
Command: UPDATE
Elapsed time (MS) = 14 CPU = 8
Command: EXIT
5.8 DBbegin & DBend
These intrinsics are used to bracket a logical transaction in the Image transaction log file. A logical transaction may consist of many intrinsic calls to perform a particular task. Transaction logging recovery uses the DBbegin and DBend to determine if the entire logical transaction was completed. If DBRECOVER does not find a corresponding DBend to a DBbegin, it will not apply the partially completed logical transaction. Both intrinsics allow posting of a text message to the log file. This can be useful if you intend to build custom programs to process the log file to print reports.
5.8.1 Description
Upon calling DBbegin, Image will: First check to see if DBbegin has been called twice without a call to DBend; you can only have one logical transaction going at a time. Then write a transaction begin record out to the log file. This record may contain text passed via the DBbegin call.
5.8.2 Parameters
The form of the DBbegin Intrinsic is:
DBbegin(BASE,LOGTEXT,MODE,status,TEXTLEN)
BASE - Is the unmodified parameter used in the DBopen call.
LOGTEXT - Is an array containing up to 512 characters or 256 words of binary data. This data is only used if you have a custom program for processing the Image transaction log file.
MODE - Must be set to 1.
STATUS - The condition word will contain a zero if the record was written as requested.
Word Contents
1 The condition word
2 Whatever it was prior to this call
TEXTLEN - The number of words in the LOGTEXT.
Typical non-zero condition words:
-151 TEXTLEN too large
-152 Called DBbegin twice in a row without a DBend
71 Logging not enabled
The form of the DBend Intrinsic is:
DBend(BASE,LOGTEXT,MODE,status,TEXTLEN)
BASE - Is the unmodified parameter used in the DBopen call.
LOGTEXT - Is an array containing up to 512 characters or 256 words of binary data.
This data is only used if you have a custom program for processing the Image transaction log file.
MODE - Tells Image how to end the logical transaction:
1 - Write the end of logical transaction record to the logfile.
2 - Write the end of logical transaction record to the logfile and flush the transaction logging buffers in memory to disc or tape. This ensures that the logical transaction just completed is captured in the log file.
STATUS - The condition word will contain a zero if the record was written as requested.
Word Contents
1 The condition word
2 Whatever it was prior to this call
TEXTLEN - The number of words in the LOGTEXT.
Typical non-zero condition words:
-151 TEXTLEN too large
-153 Called DBend without a prior call to DBbegin.
71 Logging not enabled
5.9 DBxbegin, DBxend, & DBxundo
These intrinsics are called to dynamically commit or roll back a sequence of DBput, DBdelete, and DBupdate calls that completed successfully inside a dynamic transaction. The start of the sequence is designated by a previous call to DBxbegin. DBxundo cannot be called to roll back a transaction if AUTODEFER is enabled, because transaction management and AUTODEFER are mutually exclusive.
5.9.1 Description
Upon calling DBxundo, Image will undo all previous calls to DBput, DBdelete, and DBupdate since DBxbegin was called. A DBxundo is implicitly performed if the program ends without a call to DBxend! This feature is very handy for protecting logical transactions in the event of a program abort.
5.9.2 Parameters
The form of the DBxbegin Intrinsic is:
DBxbegin(BASE,LOGTEXT,MODE,status,TEXTLEN)
BASE - Is the unmodified parameter used in the DBopen call.
LOGTEXT - Is an array containing up to 512 characters or 256 words of binary data. This data is only used if you have a custom program for processing the Image transaction log file.
MODE - Must be set to 1.
STATUS - The condition word will contain a zero if the record was written as requested.
Word Contents
1 The condition word
2 Whatever it was prior to this call
TEXTLEN - The number of words in the LOGTEXT.
Typical non-zero condition words:
-151 TEXTLEN too large
-152 Called DBxbegin twice in a row without a DBxend
The form of the DBxend Intrinsic is:
DBxend(BASE,LOGTEXT,MODE,status,TEXTLEN)
BASE - Is the unmodified parameter used in the DBopen call.
LOGTEXT - Is an array containing up to 512 characters or 256 words of binary data. This data is only used if you have a custom program for processing the Image transaction log file.
MODE - Tells Image how to end the logical transaction:
1 - End of dynamic transaction.
2 - End of dynamic transaction and, if logging is enabled, write the end of logical transaction record to the logfile and flush the transaction logging buffers in memory to disc or tape. This ensures that the logical transaction just completed is captured in the log file.
STATUS - The condition word will contain a zero if the record was written as requested.
Word Contents
1 The condition word
2 Whatever it was prior to this call
TEXTLEN - The number of words in the LOGTEXT.
Typical non-zero condition words:
-151 TEXTLEN too large
-153 Called DBxend without a prior call to DBxbegin.
71 Logging not enabled
The form of the DBxundo Intrinsic is:
DBxundo(BASE,LOGTEXT,MODE,status,TEXTLEN)
BASE - Is the unmodified parameter used in the DBopen call.
LOGTEXT - Is an array containing up to 512 characters or 256 words of binary data. This data is only used if you have a custom program for processing the Image transaction log file.
MODE - Must be an integer equal to 1.
STATUS - The condition word will contain a zero if the record was written as requested.
Word Contents
1 The condition word
2/10 Whatever it was prior to this call
TEXTLEN - The number of words in the LOGTEXT.
Typical non-zero condition words:
-215 XM error encountered when rolling out dynamic transactions.
-11 Bad database reference.
-151 Text length greater than 512 bytes.
-218 Output deferred (AUTODEFER) is incompatible with dynamic transactions.
5.9.3 Walk through
Command: !B=TEST
Command: !M=3
Command: !Q=;
Command: O
Elapsed time (MS) = 369 CPU = 120
Baseid=%000001 Class=64 Current DBG size=32767 DBU size=10240
Command: !M=1
Command: S
Elapsed time (MS) = 0 CPU = 0
Command: ?S
0000 0040 7fff 2800 0000 31a4 4163 88d4 0001 0000
Command: !Q=CUSTOMER-MASTER
Command: !E=TEST CUSTOMER
Command: !L=@
Command: P
Elapsed time (MS) = 19 CPU = 17
Command: ?S
0000 006a 0000 0003 0000 0001 0000 0000 0000 0000
Command: W
Elapsed time (MS) = 104 CPU = 32
Command: ?S
0000 006a 0000 0003 0000 31a6 4163 88d4 0001 0000
Command: G
Elapsed time (MS) = 0 CPU = 0
NO CURRENT RECORD OR THE CURRENT RECORD IS EMPTY (CONTAINS NO ENTRY)
5.10 DBinfo
This intrinsic is used to programmatically access the root file and return information about the data base. The type of information that can be returned is mostly limited to the item, set and path definitions. DBinfo can be used to construct applications that are smart enough to pre-map data item names to numbers, or check to see if there is enough room in the data set for the number of entries about to be added.
5.10.1 Description
Upon calling DBinfo, Image will: Perform the appropriate lookup requested in the MODE parameter and return the data into the buffer.
5.10.2 Parameters
The form of the DBinfo Intrinsic is:
DBinfo(BASE,QUALIFIER,MODE,status,buffer)
BASE - Is the unmodified parameter used in the DBopen call.
QUALIFIER - Is the data item or data set name or number, depending on the MODE selected.
MODE - Tells DBinfo what type of information to return. Modes 101 to 104 return data item information. Modes 201 to 204 return data set information. Modes 301 and 302 return path identifying information. Modes 401 and 402 return logging information, and mode 501 returns the subsystem access flag from the root file.
101 - Looks up the data item number and type of access for the data item provided in the QUALIFIER parameter. The data item number is returned in the buffer. If the data item number is negative then the item can be updated or the entry containing the item can be added using DBput or removed using DBdelete. This mode is useful for finding the data item numbers for specific data items. This can speed up access to the data items in a data set if a call is repeated many times.
102 - Looks up the definition of a data item provided in the QUALIFIER. The data item name, data type, sub-item length, and sub-item count are returned in the buffer.
103 - Looks up all of the data items in the data base. The qualifier is not used. The format of the data items returned is the same as mode 102.
104 - Looks up the definition of all the data items for the data set name or number provided in the QUALIFIER. The definition of each data item, as in mode 102, is returned in the buffer.
201 - Looks up the data set number and type of access for the data set provided in the QUALIFIER parameter. The data set number is returned in the buffer. If the data set number is negative, then an entry in the data set can be added using DBput or removed using DBdelete. This mode is useful for finding the data set numbers for specific data sets.
This can speed up access to the data set if a call is repeated many times.
202 - Looks up the definition of a data set provided in the QUALIFIER. The data set name, set type, entry length, blocking factor, number of entries, and the capacity of the set are returned in the buffer. This mode is useful for writing a custom program to check the capacity and percent full factor of each data set.
203 - Looks up all of the data sets in the data base and describes the type of access available. The QUALIFIER is ignored, and a word array is returned that contains a count of the number of data sets in word one and the data set numbers available in the rest of the array. The data set number is negative if data entries can be added using DBput and removed using DBdelete.
204 - Looks up all of the data sets in which the data item passed in the QUALIFIER occurs. The format of the buffer is the same as in mode 203.
301 - Looks up the paths into a detail or out of a master data set identified in the QUALIFIER. The path information is returned as a word array in the buffer. The first word is the path count, then for each path the data set number, search item number, and sort item number are returned. The path information is arranged in the same sequence as was specified in the schema.
302 - Looks up the search item for a master or the primary path search item for a detail. The search item number is returned in the buffer.
401 - Looks up current transaction logging information. The QUALIFIER is ignored. The first four words of the buffer contain the current log identifier name. The fifth through seventh words contain flags for data base logging, user logging, and a transaction flag. Words eight and nine contain the current user transaction number.
402 - Looks up Intrinsic Level Recovery information. The QUALIFIER is ignored. Word one of the buffer is a flag indicating whether ILR is enabled. Words two through four are the date and time that ILR was enabled, and words five through fourteen indicate the last intrinsic and data set that was processed via ILR.
501 - Looks up the current subsystem access flag. The qualifier is ignored. The buffer contains one word, which is zero if no access should be allowed, one if only read access is allowed, and three if both read and write access are allowed.
STATUS - The condition word will be zero if the information request was successfully processed.
Word Contents
1 The condition word
2 The length of the buffer returned
BUFFER - Is returned to the calling program; the contents vary depending on the MODE selected.
Typical non-zero condition words:
-31 Bad mode parameter
50 The buffer array was too small to hold the requested data. Remember that certain modes can cause a very large buffer to be returned.
5.10.3 Walk through
Command: !B=TEST
Command: !Q=;
Command: !M=1
Command: O
Elapsed time (MS) = 463 CPU = 111
Baseid=%000001 Class=64 Current DBG size=32767 DBU size=10240
Command: !M=203
Command: I
Elapsed time (MS) = 10 CPU = 10
3 -1 -2 -3
Command: !Q=1
Command: !M=202
Command: I
Elapsed time (MS) = 23 CPU = 6
CUSTOMER-MASTER M 106 3 0 0 5 5
-------------------------------------------------------------------
6.0 What are the Utilities and Tools?
6.1 DB2disk & Disk2DB
These are very useful contributed system library (CSL) utilities that can unload all or part of a data base to a flat file and put the flat file back into all or part of the data base. If you don't have a copy of the CSL, you can contact WWW.INTEREX.ORG.
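Incidentally, DBinfo mode 202 (section 5.10) makes it easy to grow a small capacity-report tool of your own when the contributed utilities are not at hand. The following is only a minimal sketch, not a finished program: it assumes MPE/iX C with the intrinsic declarations made available via #pragma intrinsic, a data base that is already open, and it leaves decoding of the mode 202 buffer beyond the set name to the manual; the function name report_sets is illustrative.

/* Hedged sketch of a home-grown capacity checker built on DBINFO.
   Assumes the base array has already been through a successful
   DBOPEN; check buffer word offsets against the TurboIMAGE manual. */
#include <stdio.h>
#pragma intrinsic DBINFO
#pragma intrinsic DBEXPLAIN

static short status[10];                  /* ten word status array    */

void report_sets(char *base)
{
    short sets[128];                      /* word 1 = set count, then
                                             the data set numbers     */
    short info[256];                      /* mode 202 buffer          */
    short mode202 = 202, mode203 = 203;
    short i, nsets, setnum;

    DBINFO(base, " ;", mode203, status, sets);   /* qualifier ignored */
    if (status[0] != 0) { DBEXPLAIN(status); return; }

    nsets = sets[0];
    for (i = 1; i <= nsets; i++) {
        /* mode 203 returns a negative set number when DBPUT/DBDELETE
           access is allowed, so take the absolute value              */
        setnum = (short)(sets[i] < 0 ? -sets[i] : sets[i]);

        DBINFO(base, &setnum, mode202, status, info);
        if (status[0] != 0) { DBEXPLAIN(status); continue; }

        /* the buffer begins with the 16 character set name; the words
           after it hold set type, entry length, blocking factor,
           number of entries and capacity, in the order listed above  */
        printf("set %3d  %.16s\n", setnum, (char *)info);
    }
}

A program along these lines, run from a nightly job, gives early warning before a data set fills up, which is exactly the situation the utilities in the next sections are meant to repair.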
6.2 DBUNLOAD & DBLOAD
DBUNLOAD is a utility that copies the data entries out of the individual data sets to tape. This utility is useful for small data bases to change the data base design, capacities, or to pack the detail chains. For larger Image data bases these tasks will be many times easier and faster using one of the many data base restructuring tools available today. The following discussion about DBUNLOAD and DBLOAD serves as a generic example of the unload and load process, even though in their case the operation is to tape. The DBUNLOAD program first checks to see if it can gain exclusive access to the data base. If not, it prints the error message DATA BASE IN USE and terminates. The DBUNLOAD program has two entry points, CHAINED and SERIAL. The default mode of operation is CHAINED. In serial mode the entries are read using DBget mode 2; in chained mode the detail data sets are unloaded via DBget mode 5. The path used in the chained unload is determined in the schema by specifying the "!" as a primary path indicator. If no primary path was specified, then the first path will be used. If a broken chain is encountered, the DBUNLOAD will attempt to process the chain backwards (mode 6) and see if all of the entries can be read. Data on missing entries is printed before the data set summary. The data sets are processed in the same sequence as they were defined in the schema. The DBget intrinsic is called and the data set number is appended to the front of the record as it is written out to tape.
To DBUNLOAD an Image data base:
[FILE DBUNLOAD[=dbname] [;DEV=tapeldev]
RUN DBUNLOAD.PUB.SYS [,CHAINED] [,SERIAL]
WHICH DATA BASE base name[/maintenance word]
DATA SET set number: n ENTRIES
END OF VOLUME v, e ERRORS RECOVERED
DATA BASE UNLOADED
END OF PROGRAM
Where:
dbname - Is a file ID that will appear on the console to help identify the reply being requested, and the tape(s) being used.
tapeldev - Is the logical device number or class of the tape drive to be used in the unload operation.
base name - Is the name of the data base to be unloaded. You can't specify a group or an account, the ones you are currently logged on are assumed.
maintenance word - good luck trying this without one if you aren't logged on as the creator of the data base.
The DBLOAD utility program reads the data entries unloaded to tape and DBputs them back to the data base. The data base name must be the same as the data base unloaded and must exist in the same group and account as you are currently logged on. The changes that you can make to the actual structure of the data base with the DBUNLOAD, DButil, and DBLOAD utilities are very limited. You can change the capacities of individual data sets, add data items only to the end of the data entry, and add data sets at the end of the schema. For these reasons many other utilities have been developed to perform the more often required types of data base changes on just the data sets that require modification. For some applications whose design requires heavy access to detail chains that average a number greater than the number of entries in the detail block, the effects of a chained DBUNLOAD and DBLOAD can be dramatic. The benefits of packing the detail chains must be balanced by the amount of time and effort expended in keeping them packed.
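A rough way to weigh that balance is to compare the number of blocks a typical chain touches today with the number it would need if perfectly packed. The following is a minimal, self-contained sketch; the chain length, blocking factor and elongation values are made-up examples (the blocking factor comes from DBSCHEMA output and the elongation from a DBLOADNG/HowMessy report, section 6.7):

/* Back-of-the-envelope estimate of what packing a detail chain buys.
   All input numbers are illustrative examples.                      */
#include <stdio.h>

int main(void)
{
    long   avg_chain   = 40;    /* average entries per chain          */
    long   blockfactor = 3;     /* entries per block (from DBSCHEMA)  */
    double elongation  = 4.0;   /* actual blocks read per chain
                                   divided by the perfect case        */

    /* perfect case: one read per block the chain would occupy        */
    long   perfect = (avg_chain + blockfactor - 1) / blockfactor;
    double actual  = perfect * elongation;

    printf("packed chain : about %ld block reads\n", perfect);
    printf("today's chain: about %.0f block reads\n", actual);
    printf("savings      : about %.0f reads per chain processed\n",
           actual - perfect);
    return 0;
}

If the savings per chain, multiplied by the number of chains read per day, is small, the reload is probably not worth the downtime described next.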
For small data bases, the maintenance sequence for packing the detail entries would be to store the data base off to tape with DBSTORE or the MPE STORE command, DBUNLOAD with the CHAINED entry point, ERAse the data base with DButil, and to DBLOAD. THIS CAN BE A VERY LONG PROCEDURE! PLEASE TAKE THE TIME TO ESTIMATE THE NUMBER OF HOURS, DAYS OR WEEKS THIS MAY TAKE BEFORE JUMPING IN WITH BOTH FEET. An example of a typical reload happened on a Thanksgiving day weekend. A DBUNLOAD/DBRELOAD was started on the previous Wednesday afternoon. The DBUNLOAD took a little over three hours, and produced almost two reels of tape at 1600BPI. By Monday morning the DBRELOAD was two inches into the front of the first reel, and moving mighty slow. Needless to say, there were some very anxious users and systems people. Their first question was, "How long will it take to complete the DBRELOAD" and their second question was, "What do we do now" Well, after looking at the schema for the data base, and doing some simple arithmetic based on the number of IO's required and the present IO rate of their system, The conclusion was that the DBRELOAD would be completed about mid-Friday morning! This forced the decision to abort the reload and restore the data base. It turned out that they simply wanted to expand the capacity of one of the smaller detail data sets. They were using the DBUNLOAD and DBLOAD utilities because they did not have ADAGER or one of the other many data base capacity changing utilities. The end solution was to restore the data base to one of the company's systems that had one of these tools and perform the capacity change. The entire process took 45 minutes. To DBLOAD an Image data base: [FILE DBLOAD[=dbname] [;DEV=tapeldev] RUN DBLOAD.PUB.SYS,defer WHICH DATA BASE base name[/maintenance word] DATA SET set number: n ENTRIES END OF VOLUME v, e READ ERRORS RECOVERED DATA BASE LOADED END OF PROGRAM Where: dbname - Is a file ID that will appear on the console to help identify the reply being requested, and the tape(s) being read. tapeldev - Is the logical device number or class of the tape drive to be used in the store operation. defer - Tells Image to enable autodefer for the run of the DBLOAD program. This can speed up the DBLOAD since this tells Image to write the buffer out when full, not entry by entry. If the system fails you would have to start the reload over again anyway so getting it over with as soon as possible is desirable. base name - Is the name of the data base to be stored. You can't specify a group or an account, the ones you are currently logged on are assumed. maintenance word - good luck trying this without one if you aren't logged on as the creator of the data base. 6.3 Capacity, structure change, and diagnostic utilities 6.3.1 ADAGER (The Adapter/Manager for Image/3000 Data bases) http://www.adager.com/AdagerGuideTOC.html 6.3.2 Flexibase for Image database monitoring, maintenance and management http://www.triolet.com/Flexibase.html 6.3.3 DBGENERAL (Bradmark) Demo request, no online documentation http://www.bradmark.com/dgimage2.htm 6.4 Logging and Recovery Transaction logging and recovery represents a recognition of the fact that computer systems are subject to software and hardware failures. In order to protect the reliability of the data stored in the data base in between DBSTORE or MPE STORE operations, the data base software must provide a secure method for capturing and reapplying transactions to the data base. 
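Applications participate in this protection by bracketing each logical transaction with DBbegin and DBend, as described in section 5.8. The following is a minimal, hedged sketch rather than a drop-in routine: it assumes MPE/iX C with the intrinsics declared via #pragma intrinsic, a base array that has already been through a successful DBOPEN with a write-capable password, and an illustrative ORDERS data set and add_order procedure.

/* Hedged sketch: bracket one logical transaction with DBBEGIN/DBEND. */
#pragma intrinsic DBBEGIN
#pragma intrinsic DBEND
#pragma intrinsic DBPUT
#pragma intrinsic DBEXPLAIN

static short status[10];                /* ten word status array       */

void add_order(char *base, char *order_entry)
{
    char  *text    = "ADD ORDER ";      /* optional text for the log   */
    short  textlen = 5;                 /* length of the text in words */
    short  mode1   = 1, mode2 = 2;

    DBBEGIN(base, text, mode1, status, textlen);
    if (status[0] != 0) { DBEXPLAIN(status); return; }

    /* the DBPUTs, DBDELETEs and DBUPDATEs that make up the logical
       transaction go here; "@;" means all items (see the LIST
       description in section 5.6), and in DBopen mode 1 a covering
       DBLOCK would also be required                                   */
    DBPUT(base, "ORDERS;", mode1, status, "@;", order_entry);
    if (status[0] != 0) { DBEXPLAIN(status); return; }

    /* mode 2 also flushes the log buffers, so the completed
       transaction is safely captured in the log file                  */
    DBEND(base, text, mode2, status, textlen);
    if (status[0] != 0) DBEXPLAIN(status);
}

With the transactions bracketed this way, DBRECOVER can tell complete logical transactions from partial ones when the log is reapplied.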
The most frequent cause requiring transaction recovery is the inadvertent restoring of data bases over top of the current production version of the data base. Second is a hardware failure of a disc that contained part of the data base. And third is the accidental use of QUERY to delete records online.
6.4.1 Procedures for logging and recovery
There are three separate procedures involved with Image logging: installation, startup and shutdown. The initialization procedure is a one time event and seems more complicated than it really is. The startup and shutdown procedures are usually incorporated into the full backup jobs and the system start up and shutdown processes. The TurboIMAGE manual describes the necessary steps in great detail in the logging and recovery section.
6.4.2 Overhead of logging
Construct your own test with DBDRIVER; your results should be similar.
KEY VALUE   W/O LOGGING   WITH LOGGING
!E=01       CPU = 23      CPU = 24
!E=02       CPU = 3       CPU = 4
!E=03       CPU = 3       CPU = 4
!E=04       CPU = 2       CPU = 4
!E=05       CPU = 3       CPU = 4
!E=06       CPU = 2       CPU = 3
!E=07       CPU = 3       CPU = 4
!E=08       CPU = 5       CPU = 4
!E=09       CPU = 2       CPU = 4
!E=10       CPU = 3       CPU = 4
TOTAL(2/10)  =26           =35
LOGGING OVERHEAD = 35/26 = 1.35, or about 35%
6.5 DBDRIVER
DBDRIVER - Is an HP supplied utility program that allows calling the Image intrinsics without writing a program. DBDRIVER is useful for demonstrating and testing how the Image intrinsics work, without the structure of a programming language. DBDRIVER is executed simply by entering:
:RUN DBDRIVER.PUB.SYS[,entry point]
This allows all Image intrinsics to be executed by entering a one letter command. The appropriate parameters must be set prior to executing the command. The command abbreviations are as follows:
B[egin] Calls DBbegin
C[lose] Calls DBclose
D[elete] Calls DBdelete
E[rror] Calls DBerror
F[ind] Calls DBfind
G[et] Calls DBget
I[nfo] Calls DBinfo
K[ontrol] Calls DBcontrol (C=Close)
L[ock] Calls DBlock
M[emo] Calls DBmemo
N[d] Calls DBend (E=Error)
O[pen] Calls DBopen
P[ut] Calls DBput
Q[uit] Calls DBxend
R[elease] Calls DBunlock (U=Update)
S[tart] Calls DBxbegin
T[pi] Calls DBtpiinfo (TPI products installed)
U[pdate] Calls DBupdate
W[aive] Calls DBxundo
X[plain] Calls DBexplain (E=Error)
Y Calls DBcall (description of the previous call)
Z Calls DBmaint (PM only)
DBDRIVER also allows parameters to be set before each call and displayed after the call is completed. To set parameter values you use an ! and a letter to signify which parameter you are setting. To display a parameter value you use a ? and a letter to signify which parameter to display.
Base - !B=Data Base Name
Password - !Q=Password
Mode - !M=Mode
Status - !S=Status Array
List - !L=List of Item Names
Buffer - !E=Buffer Contents
Argument - !A=Argument value
DSET - !Q=Data set Name
Item - !I=Item Name
DBDRIVER has special commands and alternate entry points that are useful for documentation and debugging purposes.
Other DBDRIVER Commands:
/V - Displays version, update, and fix
/D - Debug
/S - System Debug (Privileged Mode)
/E - Exit
/M - Monitor - Sends all terminal activity to LP
/W - Turns off Monitor (Dumps to LP)
//text - Comments
/C - Displays Code Segment # Tables of Image Procedures.
/R n1,n2,n3 - Turn on Repeat - Execute Command n1 * n2 Times - Displays Timings after each n2 Times - Add n3 to Each Word in Buffer Each Time - Turn OFF Repeat
DBDRIVER Entry points:
VUF - This entry point displays the version/update/fix levels of the IMAGE program files in PUB.SYS and the version/update/fix of the IMAGE intrinsics to which DBDRIVER was linked by the loader.
PURGE - This entry point prompts for the name of the I-file to be purged. After the purge, the program terminates. I-files are created by IMAGE when dumping is enabled.
OOPS - This entry point prompts for purging the DBS and DBG.
CLONE - Prints the specified I-file to the printer and terminates. If the user doesn't have privileged mode (PM) then the program displays "No dice!" and terminates.
PRIV - If no PM then responds "No dice!" and terminates. Otherwise, DBDRIVER runs in privileged mode. This allows calls to uncallable intrinsics, and access to privileged functions of callable intrinsics.
6.6 MPEX LISTF DB
If you have MPEX from VESOFT, it has a very useful capability: the LISTF ,DB command, which will list the capacities, blocking factors, device locations and data set types for all or selected data bases on your system.
:RUN MPEX.PUB.VESOFT
%LISTF fileset,DB [;listfile]
FILENAME  SET NAME  TY  ENT-  CAPA-  FULL%  BLK  SEC-   %BLOCK  SECTR
                    PE  RIES  CITY          FCT  TORS   WASTED  WASTE
The LISTF ,DB can show and select on the following criteria:
- data set's file name and Image set name
- data set type
- number of entries in the data set
- number of sectors the data set takes up
- the wasted space due to blocking of entries
6.7 DBLOADNG/HOWMESSY (Robelle)
Robelle's How Messy tutorial as a self-extracting archive can be found at http://www.robelle.com/library/smugbook/howmessy.html
DBLOADNG is a contributed library program that can read every entry and every chain in the data base. This function in and of itself is very valuable in that it verifies the structural integrity of the data base. In addition, the program returns statistics about the capacity of the data sets and the locality of the entries on secondary or detail chains.
Running DBLOADNG
:RUN DBLOADNG.PUB.TELESUP
Data Base Loading Analysis -- DBLOADNG V2.7f
Data Base name: TEST
Password (;) :
Mode (5) : 5
Data Set (@) :
Data Base name:
END OF PROGRAM
The formal file designator for the output file is LOADREPT, which is equated to the device class LP. The report can be sent to another printer via a file equation, e.g. FILE LOADREPT;DEV=LP2. The program can take a long time and consume a high percentage of the CPU, so it is highly suggested that it be run from a batch job at off hours. The report is usually only a few pages, so if run from a batch job set up a file equation to defer the printing so that the report doesn't get lost. The report contains the following headings and statistics. Don't let the volume of numbers intimidate you. With a few rules of thumb you'll be able to rip through this report with the best of them.
Data Base - Is the name of the data base requested
User Access Number - Is actually the user class number returned from the DBopen. It comes from looking up the password provided in the passwords table of the root file.
Global DBCA Size - Is the number of words that the global control block currently occupies. The amount of space required can be analyzed from the table report generated by DBschema.
Local DBCA Size - Is the size in words of the local buffers for the current DBopen.
Data Set # - Is the sequence number of this data set as specified in the schema. In large data bases with non-descript data set names, it may be easier to refer to the data sets by number. Data Set Name - Is the name given to this data set in the schema. Type - M for manual master, A for automatic master, and D for detail. Capacity - Maximum number of entries that the data set can contain. Entries - The current number of entries actually in the data set as a result of a call to DBput. % Load Factor - This is calculated as the number of entries divided by the capacity times 100. The load factor is a very important statistic for Image data sets. If the data set becomes full then you have two unpleasant options, delete some entries or increase the size of the data set. % Secondaries - Is calculated for master entries as the number of entries that have a zero in their chain count divided by the current number of entries times 100. The occurrence of secondaries is governed by two things, the load factor and the content of the key being hashed. As the load factor increases, so does the incidence of secondary entries, and the more similar the key values being hashed, the more secondary entries there are. Max Blocks - This is the largest occurrence of contiguous blocks without a free entry. On a DBput, if the entry being added is a secondary entry or is a primary that has its entry already occupied by a secondary, then Image will have to find a free entry to place the new secondary or migrate the existing secondary to. This means that adjacent blocks will have to be read in and their bit maps searched until a free entry is located. This can be quite time consuming. In general, the same factors governing the occurrence of secondaries will drive the size of this value. However, for large masters the percents can hide the effects of a local cluster of entries. Blockfactor - This is the number of entries per block that Image assigned based on the media record size and the BLOCKMAX specified in the schema. A low block factor results from the data content of the entry and the number of paths specified. A small number of entries per block aggravates the effects of secondary chains and can explain large Max Block values. "*", "S", "!" - The * means that this is a stand alone master or detail with no paths. The S shows that this path is sorted, and the ! indicates that this is the primary path as defined in the schema. Search Item - Is the data item name of the key item as specified in the schema. Masters may only have one search item, and details can have up to 16 search items. Max Chain - This is the length of the longest secondary chain for masters and the longest detail chain for the path into a detail. Avg Chain - This is the average length of the secondary chain for a master or the average chain length for a path into a detail. As the average chain length approaches ten percent of the number of entries, the benefits of maintaining the chain are outweighed by the overhead to maintain it. It becomes faster and simpler to read the detail serially selecting the entries of interest. Std Dev - Is the standard deviation of the average chain length. This value indicates the normalcy of the spread of the length of the chains. In other words a comparison of the average and maximum chain lengths. A small standard deviation indicates a uniformity of chain lengths. 
A large standard deviation shows that the average chain length is based on a wide spread of chain lengths and that further investigation into the data base design is required.
Expected Blocks - Is a calculation of the average number of blocks that each chain should require based on the current blocking factor. This value is on average the perfect case number of IO's required to read an entire chain.
Average Blocks - Is the actual average number of IO's required to read an entire chain. Note that this number is only valid if the standard deviation of the average chain length is small.
Elongation - This is the ratio of actual average blocking to perfect blocking, or more simply stated, a multiplier of the resources (CPU and disc IO's) required beyond the perfect case. If this ratio is one or close to one, then the benefits of performing a chained unload and load or DETPACK of these paths will be negligible. However, as this ratio grows the benefits of repacking can be very great, if this path is used frequently.
% Inefficient Pointers - This is calculated as the total number of pointers to other blocks divided by the total number of pointers times 100. This percentage also is a good indicator of the expected benefits of reorganizing.
Number of Primaries - Actual number of master entries that have a secondary chain counter greater than zero.
Number of Secondaries - Actual number of master entries that have a zero in the chain counter.
Number of Chains - For masters, this is the number of entries that have a secondary chain counter greater than one. For details, this is the number of nonzero master chain heads.
External Pointers - This is a count of the number of entries that have forward pointers to other blocks. If the block is adjacent or close by, it may not require an IO to get the next block, depending on the use of MPE/V disc caching or MPE/XL mapped files.
Contiguous Pointers - Is a count of the number of pointers that point to the next adjacent block. If the average chain is longer than the blocking factor, then this can't be avoided. Generally the larger the blocking factor and the shorter the average chain, the smaller the occurrence of contiguous pointers. This is where buffering and disc caching really pay off.
Total Blocks - Calculated as the number of entries divided by the blocking factor rounded up.
Full Blocks - The number of blocks that have no free entries.
Empty Holes - This is a count of secondary pointers that skip over free entries in masters. These holes are created because Image does not migrate secondaries when other primary entries are created. This is because of the overhead involved and the chance that the primary may be re-added, as with the case of a DBdelete and a DBput to change the value of a search item for some other path in a detail. This count is a good indicator of volatility in a master. When this number becomes large, a reload or change of capacity of the master will repack the secondary chains.
Messages:
Capacity is not prime - The master data set indicated has a capacity which is not a prime number. Generally, capacities that are even numbers or that have many repeating digits yield a greater number of secondary entries. The message also provides the next higher and lower prime numbers.
Load factor greater than 80% - When a data set becomes full you have two options, delete some records or increase the capacity.
Both options can take a great deal of time, so it's a good idea to know well in advance and make appropriate plans to increase capacity well in advance. When master data sets get past the 50% full mark, secondary entries become an issue. When the data set gets greater than 80% full nearly every entry will be a secondary. This causes extra processing overhead that can be simply avoided by maintaining capacities at a reasonable level. Secondary Percentage greater than 30% - This means that greater than three out of ten entries is a secondary, and that all operations to this data set incur overhead processing these secondary chains. The solution is to increase the capacity and investigate the key values being used. Character keys should vary the most on the right, and numeric keys should vary the most on the left. Average chain length greater than 1.5 - When the average secondary chain length grows past 1.5 entries, each operation to the data set is requiring on average 1.5 times as much resources as normal. This is another method for assessing the existence of poor capacity or hashing performance. The solution is the same as when the secondary percentage is greater than 30%. Number of entries read not equal to the Image count - This is a common message for Image data bases that have been in existence for a long time and have experienced a number of system failures. The number of entries that Image thinks are in use differs from the actual number of records found by actually reading the entire data set. This message could also indicate that the data base was modified during the DBLOADNG run. But it also could indicate more serious problems. For detail data sets, it is very important for the records used counter to be in synchronization with the number of actual entries, because if the counter drops to zero, Image will start adding records at the first entry, as if the data set had been erased. If however the counter is off, then this will not happen and each new entry will be added on the delete space chain in last deleted next used sequence. The performance implications are dramatic. ------------------------------------------------------------------- 7.0 How do I begin the design process? Probably the most mystical part of any data base management system is. how does the design specification actually come into being. This is more of an art than a science, but like any artistic endeavor, three things are certain. Practice makes permanent, perfection is fleeting, and you'll always wish you would have done something else when you have run out of time. The initial design process can be broken up into four smaller projects which should get simpler as you proceed, depending upon how well the previous step was accomplished. The first step is to define the data elements that will become the palette of colors from which the data base will be painted. The second step is to hierarchically relate each data item to each other (sounds more complicated than it really is). The third step is to fold the hierarchies previously created into a two-level network data base. And the last design step is to prototype and test the initial design for structural conflicts. Another good basic discussion on data base design can be found at: http://www.aics-research.com/pdf/chap2.pdf 7.1 Defining the data elements Data elements are defined by collecting all of the application documentation and coordinating content of each screen and report. 
It is very time consuming to redefine the size of a field once you get to the prototyping phase, so take your time and get the type and size of each data entry down pat. The naming convention should be kept simple. Picture an end user typing the data item name into some report generating program. Don't make the mistake of mixing upper and lower case, it looks neat but it will make programming and the use of ad hoc report writers much more difficult. Keep in mind that storing numbers as type X and type U means that the compiled application programs will spend many CPU cycles converting these display fields to binary values in order to do any arithmetic functions. 7.2 Building hierarchical structures The next step, which is almost never done, is simply to make a two dimensional matrix by listing each data element horizontally and vertically. In each cross-match of data items, list the ratio of each data item to the other. The relationship will be zero for none, or one for 1:1, two for 2:1 and so forth. This, in effect, builds a table that represents a hierarchy that will be invaluable in the next step. 7.3 Folding hierarchies into two level networks The final step in defining the schema is to collapse the hierarchy of data items into Image's two possible levels of structure, namely master and detail data sets. This is done by originally defining each group of related data items in as many detail data sets as are required. The process can be more easily pictured as raking the data items into piles of data items that are then defined as detail data sets. The next step is to place the capacity values on each data set. The capacity of each data set will determine the need for key values. Remember the example of the library with only one or two hundred books. The card catalog is really a waste because all of the titles can be searched in short order. So the first rule is. if the detail is less than two hundred entries forget about any key paths and either scan the table for each access or build an array to hold them in your application program. The next step is to identify the need for online access to the detail data set. If a data item in the data set makes a natural search item, then define an automatic master, and place a capacity on this master. If the ratio of master entries is 1:1 then you've got a good index, as the ratio of detail entries increases i.e. 1:100 then the value of the index starts to drop. In the process of selecting index data items remember to select no more than three data items to serve as paths. The marginal value of each additional path drops off dramatically after the third path while the overhead of maintaining the path is a linear function. If no data item in the proposed detail data set is a likely index value, you have two options. One is to concatenate two or more data items together hoping that the ratio of the new concatenated key to detail entries will be small enough to be useful. The other choice is to invent a new unique index data item. Both choices have their good and bad points, the concatenated key is discussed shortly. Finally, if any of the details has only one index, this may be a candidate for combining the two data sets into a single manual master. This will reduce the amount of IO to process data entries, but there are two major drawbacks to manual masters with data. 
One is that the data content reduces the number of key records per block, and the second is that the only way to retrieve data on any of the non-key data items is to serially search the entire master, reading empty blocks and all.
7.4 Prototyping to resolve structural conflicts
There is no substitute for filling the data base up with test data and running prototype applications and reports against the data base. If you wait until the data base goes into production to perform this type of testing live, it will be very difficult to make any changes in the design. Without testing under a production-sized load, you will not find the types of problems that will lead to performance degradation later on. This is especially true of the higher level programming languages and report writers that, unless specifically directed, will perform all data retrieval operations via a serial read of the data sets required. This method works very well on small test data bases, but when the data base starts to fill up, the run time of the reports and response times at the terminals also go up.
7.5 Walk through
Suppose that we needed to design an Image data base for order entry. The first step would be to collect existing documentation, invoice forms, order books, price sheets, etc. The second step would be to design the basic screens and reports required to eliminate the paper system. These interactive screens and report layouts allow definition of the data item names, types and sizes. The net result should be a table of data items and a list of access requirements. The list of access requirements is derived from the functionality of the interactive screens. For example, the order entry screen requires the ability to look up any order based on the order number, customer number, sales account, etc. These requirements should be listed in a separate table along with acceptable transaction response times.
Order Data Base Data Items Table
Item Names         Type Len
BUYER-NAME,         X   30
BUYER-PHONE,        Z   10
CUSTOMER-ADDR-1,    X   30
CUSTOMER-ADDR-2,    X   30
CUSTOMER-CITY,      X   20
CUSTOMER-NAME,      X   30
CUSTOMER-NUMBER,    X    6
CUSTOMER-PO,        X   16
CUSTOMER-STATE,     X    2
CUSTOMER-ZIP,       X   10
ITEM-NUMBER,        X   16
LEAD-TIME,          I    1
NET-WEIGHT,         I    2
ORDER-NUMBER,       X    6
ORDER-DATE,         Z    6
ORDER-STATUS,       X    2
ORDER-TYPE,         X    2
PART-NUMBER,        X   16
PART-ADDED,         Z    6
PART-DESCRIPTION,   X   30
PART-ORDER-QTY,     I    2
PART-PRICE,         I    2
PART-SHIP-QTY,      I    2
PROMISE-DATE,       Z    6
SALES-ACCT,         X   12
SCRAP-PCT,          I    1
SETUP-QTY,          I    2
SETUP-UNITS,        I    2
SHIP-ADDR-1,        X   30
SHIP-ADDR-2,        X   30
SHIP-CITY,          X   20
SHIP-DATE,          Z    6
SHIP-NAME,          X   30
SHIP-STATE,         X    2
SHIP-ZIP,           X   10
STD-UNITS,          I    2
UNIT-MEASURE,       X    2
WANT-DATE,          Z    6
Order Data Base Access Requirements Table
1. Retrieval of customer information by CUSTOMER-NUMBER and SALES-ACCT.
2. Retrieval of order information by ORDER-NUMBER, PART-NUMBER, SALES-ACCT, and ORDER-STATUS.
3. Retrieval of part information by PART-NUMBER.
The next step is to cross-relate the access requirements with the data items table. This allows the bundling of data items into data set candidates. The question to ask at each cross point is, "Are these data items related, and if so, is it a one to one relationship, or a one to many relationship?" For example, for each CUSTOMER-NUMBER, how many BUYER-NAME's will there be? The answer is 1. For each SALES-ACCT, how many CUSTOMER-NUMBER's? The answer is many (shown as N).
Order Data Base Data Items Relational Table Item Names Type Len CUSTOMER ORDER PART SALES ORDER NUMBER NUMBER NUMBER ACCT STATUS -------- ------ ------ ----- ------ BUYER-NAME, X 30 1 BUYER-PHONE, Z 10 1 CUSTOMER-ADDR-1, X 30 1 CUSTOMER-ADDR-2, X 30 1 CUSTOMER-CITY, X 20 1 CUSTOMER-NAME, X 30 1 CUSTOMER-NUMBER, X 6 1 N CUSTOMER-PO, X 16 1 N CUSTOMER-STATE, X 2 1 CUSTOMER-ZIP, X 10 1 ITEM-NUMBER, X 16 1 1 N LEAD-TIME, I 1 1 NET-WEIGHT, I 2 1 1 N ORDER-NUMBER, X 6 1 N N ORDER-DATE, Z 6 1 N ORDER-STATUS, X 2 1 1 ORDER-TYPE, X 2 1 N PART-NUMBER, X 16 1 1 N PART-ADDED, Z 6 1 PART-DESCRIPTION, X 30 1 1 N PART-ORDER-QTY, I 2 1 N PART-PRICE, I 2 1 1 N PART-SHIP-QTY, I 2 1 N PROMISE-DATE, Z 6 1 N SALES-ACCT, X 12 1 1 1 N SCRAP-PCT, I 1 1 SETUP-QTY, I 2 1 SETUP-UNITS, I 2 1 SHIP-ADDR-1, X 30 1 1 N SHIP-ADDR-2, X 30 1 1 N SHIP-CITY, X 20 1 1 N SHIP-DATE, Z 6 1 1 N SHIP-NAME, X 30 1 1 N SHIP-STATE, X 2 1 1 N SHIP-ZIP, X 10 1 1 N STD-UNITS, I 2 1 UNIT-MEASURE, X 2 1 1 N WANT-DATE, Z 6 1 1 N After the data items have been related the next step is to break out the one-to-one relationships into natural data sets. Notice that at this stage in the process all data sets should not be prejudged as master or detail data sets. The one-to-many relationships mean that by definition they will have to be studied further to decide if they will make useful key data elements. Order Data Base Natural Data Sets Table Sets: Name: CUSTOMERS; Entry: CUSTOMER-NUMBER, CUSTOMER-NAME, CUSTOMER-ADDR-1, CUSTOMER-ADDR-2, CUSTOMER-CITY, CUSTOMER-STATE, CUSTOMER-ZIP, SHIP-NAME, SHIP-ADDR-1, SHIP-ADDR-2, SHIP-CITY, SHIP-STATE, SHIP-ZIP, BUYER-NAME, BUYER-PHONE; Name: ORDERS; Entry: ORDER-NUMBER, ORDER-TYPE, ORDER-STATUS, ORDER-DATE, CUSTOMER-NUMBER, SHIP-NAME, SHIP-ADDR-1, SHIP-ADDR-2, SHIP-CITY, SHIP-STATE, SHIP-ZIP, SHIP-DATE, PROMISE-DATE, WANT-DATE, CUSTOMER-PO, PART-NUMBER, PART-DESCRIPTION, NET-WEIGHT, UNIT-MEASURE, PART-ORDER-QTY, PART-SHIP-QTY, PART-PRICE, ITEM-NUMBER, SALES-ACCT, LEAD-TIME; Name: PARTS; Entry: PART-NUMBER, PART-DESCRIPTION, SALES-ACCT, PART-ADDED, SCRAP-PCT, NET-WEIGHT, UNIT-MEASURE, ITEM-NUMBER, PART-PRICE, LEAD-TIME, SETUP-QTY, SETUP-UNITS, STD-UNITS; These data sets were made up by simply reading down the columns that contained all 1's. These should default to detail data sets with an automatic master linked via the cross-referenced data item. This gives a first stage design that looks like this: +-----------+ +-----------+ +-----------+ | CUSTOMER | | ORDER | | PART | | AUTOMATIC | | AUTOMATIC | | AUTOMATIC | | MASTER | | MASTER | | MASTER | +-----------+ +-----------+ +-----------+ | | | | | | | | | \|/ \|/ \|/ V V V +-----------+ +-----------+ +-----------+ | CUSTOMER | | ORDER | | PART | | DETAIL | | DETAIL | | DETAIL | +-----------+ +-----------+ +-----------+ What about those one-to-many relationships. To analyze the value of these data elements as possible key values we need to build a final table that takes into account the number of expected entries in the detail data sets and the number of possible key values. The number of possible key value occurrences is divided into the number of expected entries in the associated detail data sets. This number is the expected average chain length, and thus can be used to determine the usefulness of the potential key value. 
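The arithmetic is simple enough to put in a throwaway program. The sketch below uses the illustrative volumes from this walk through (700 customers, 5,000 orders, 50 sales accounts, 2 order statuses) and flags a candidate as questionable when its expected chain approaches ten percent of the detail, the rule of thumb from section 6.7; the table that follows lays the same figures out by hand.

/* Expected average chain length for each candidate key
   = detail entries / number of distinct key values.
   All volumes below are the illustrative figures from the walk through. */
#include <stdio.h>

struct candidate { const char *detail; long entries;
                   const char *key;    long key_values; };

int main(void)
{
    struct candidate c[] = {
        { "CUSTOMER-DETAIL",  700, "CUSTOMER-NUMBER",  700 },
        { "CUSTOMER-DETAIL",  700, "SALES-ACCT",        50 },
        { "ORDER-DETAIL",    5000, "ORDER-NUMBER",    5000 },
        { "ORDER-DETAIL",    5000, "SALES-ACCT",        50 },
        { "ORDER-DETAIL",    5000, "ORDER-STATUS",       2 },
    };
    int i, n = sizeof c / sizeof c[0];

    for (i = 0; i < n; i++) {
        double chain  = (double)c[i].entries / c[i].key_values;
        int    useful = chain < 0.10 * c[i].entries;   /* 10% rule of thumb */
        printf("%-16s via %-15s avg chain %8.1f  %s\n",
               c[i].detail, c[i].key, chain,
               useful ? "worth a path" : "read the set serially instead");
    }
    return 0;
}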
Order Data Base One-to-Many Occurrences Table Data Set Names #ENTRIES CUSTOMER ORDER PART SALES ORDER NUMBER NUMBER NUMBER ACCT STATUS -------- ------ ------ ----- ------ # Of Occurrences 700 5,000 1,200 50 2 -------- ------ ------ ----- ------ CUSTOMER-DETAIL 700 1 14 ORDER-DETAIL 5,000 1 100 2,500 PART-DETAIL 1,200 1 Notice that the SALES-ACCT item is a good key candidate, while the ORDER-STATUS item with only two possible occurrences does not create a small enough subset of the ORDER-DETAIL to make it faster to read the chain than the entire data set. +-----------+ +-----------+ +-----------+ +-----------+ | CUSTOMER | | SALES-ACCT| | ORDER | | PART | | AUTOMATIC | | AUTOMATIC | | AUTOMATIC | | AUTOMATIC | | MASTER | | MASTER | | MASTER | | MASTER | +-----------+ +-----------+ +-----------+ +-----------+ | / \ | | | / \ | | | / \ | | \|/ / \ \|/ \|/ V V V V V +-----------+ +-----------+ +-----------+ | CUSTOMER | | ORDER | | PART | | DETAIL | | DETAIL | | DETAIL | +-----------+ +-----------+ +-----------+ This design allows for all of our access requirements except the capability to access the order detail data set based on ORDER-STATUS. After further investigation it was discovered that this capability is required to access the order entries that had an ORDER-STATUS of O for opened so that once per day they can be printed on a pick ticket report. This requirement is easily accommodated by adding a separate detail to hold the order numbers as orders are added. Also at this time we can examine the design for single pathed detail data sets that are candidates for being turned into a manual master. This should only be done if you can answer yes to the following questions: Would you bet $10,000 that no one would ever want to search this detail based on one of the other data items Will the data set contain more than 100 entries In regard to the first question, it is very simple using ADAGER or some other data base change utility to add an automatic master to an existing detail. But, it can be very difficult to modify applications that formerly accessed a master to perform a DBfind to access the new detail. The second question really applies to having a key into any data set, if the number of records is small (fewer than 200(c) why have a key in the first place. With these final changes in place, our data base design now looks like this: +-----------+ +-----------+ +-----------+ +-----------+ | CUSTOMER | | SALES-ACCT| | ORDER | | PART | | AUTOMATIC | | AUTOMATIC | | AUTOMATIC | | MANUAL | | MASTER | | MASTER | | MASTER | | MASTER | +-----------+ +-----------+ +-----------+ +-----------+ | / \ | | / \ | | / \ | \|/ / \ \|/ V V V V +-----------+ +-----------+ +-----------+ | CUSTOMER | | ORDER | |OPEN-ORDER#| | DETAIL | | DETAIL | | DETAIL | +-----------+ +-----------+ +-----------+ The formal ORDERS schema: PAGE 1 HEWLETT-PACKARD 32215C.00.42 TurboIMAGE/3000. DBschema (C) HEWLETT-PACKARD CO. 
1978 Begin Data Base ORDERS; Passwords: 1 ANY ; Items: BUYER-NAME, X 30 ; BUYER-PHONE, Z 10 ; CUSTOMER-ADDR-1, X 30 ; CUSTOMER-ADDR-2, X 30 ; CUSTOMER-CITY, X 20 ; CUSTOMER-NAME, X 30 ; CUSTOMER-PO, X 16 ; CUSTOMER-STATE, X 2 ; CUSTOMER-ZIP, X 10 ; CUSTOMER-NUMBER, X 6 ; ITEM-NUMBER, X 16 ; LEAD-TIME, I 1 ; NET-WEIGHT, I 2 ; ORDER-NUMBER, X 6 ; ORDER-DATE, Z 6 ; ORDER-STATUS, X 2 ; ORDER-TYPE, X 2 ; PART-NUMBER, X 16 ; PART-ADDED, Z 6 ; PART-DESCRIPTION, X 30 ; PART-ORDER-QTY, I 2 ; PART-PRICE, I 2 ; PART-SHIP-QTY, I 2 ; PROMISE-DATE, Z 6 ; SALES-ACCT, X 12 ; SCRAP-PCT, I 1 ; SETUP-QTY, I 2 ; SETUP-UNITS, I 2 ; SHIP-ADDR-1, X 30 ; SHIP-ADDR-2, X 30 ; SHIP-CITY, X 20 ; SHIP-DATE, Z 6 ; SHIP-NAME, X 30 ; SHIP-STATE, X 2 ; SHIP-ZIP, X 10 ; STD-UNITS, I 2 ; UNIT-MEASURE, X 2 ; WANT-DATE, Z 6 ; Sets: Name: CUSTOMER-MST,Automatic(/0,1); Entry: CUSTOMER-NUMBER(1); Capacity: 1009; Name: SALES-ACCT-MST,Automatic(/0,1); Entry: SALES-ACCT(2); Capacity: 101; Name: CUSTOMER,Detail(/0,1); Entry: CUSTOMER-NUMBER(CUSTOMER-MST), CUSTOMER-NAME, CUSTOMER-ADDR-1, CUSTOMER-ADDR-2, CUSTOMER-CITY, CUSTOMER-STATE, CUSTOMER-ZIP, SHIP-NAME, SHIP-ADDR-1, SHIP-ADDR-2, SHIP-CITY, SHIP-STATE, SHIP-ZIP, SALES-ACCT(!SALES-ACCT-MST), BUYER-NAME, BUYER-PHONE; Capacity: 1000; Name: ORDER-MST,Automatic(/0,1); Entry: ORDER-NUMBER(1); Capacity: 10007; Name: ORDERS,Detail(/0,1); Entry: ORDER-NUMBER(ORDER-MST), ORDER-TYPE, ORDER-STATUS, ORDER-DATE, CUSTOMER-NUMBER, SHIP-NAME, SHIP-ADDR-1, SHIP-ADDR-2, SHIP-CITY, SHIP-STATE, SHIP-ZIP, SHIP-DATE, PROMISE-DATE, WANT-DATE, CUSTOMER-PO, PART-NUMBER, PART-DESCRIPTION, NET-WEIGHT, UNIT-MEASURE, PART-ORDER-QTY, PART-SHIP-QTY, PART-PRICE, ITEM-NUMBER, SALES-ACCT(!SALES-ACCT-MST), LEAD-TIME; Capacity: 10000; Name: PARTS,Manual(/0,1); Entry: PART-NUMBER(0), PART-DESCRIPTION, SALES-ACCT, PART-ADDED, SCRAP-PCT, NET-WEIGHT, UNIT-MEASURE, ITEM-NUMBER, PART-PRICE, LEAD-TIME, SETUP-QTY, SETUP-UNITS, STD-UNITS; Capacity: 5003; Name: OPEN-ORDERS,Detail(/0,1); Entry: ORDER-NUMBER; Capacity: 1000; End. DATA SET TYPE FLD PT ENTR MED CAPACITY BLK BLK DISC NAME CNT CT LGTH REC FAC LGTH SPACE CUSTOMER-MST A 1 1 3 14 1009 36 507 120 SALES-ACCT-MST A 1 2 6 23 101 22 508 24 CUSTOMER D 16 2 151 159 1002 3 478 1340 ORDER-MST A 1 1 3 14 10007 36 507 1116 ORDERS D 25 2 136 144 10002 3 433 13340 PARTS M 13 0 53 58 5003 8 465 2508 OPEN-ORDERS D 1 0 3 3 1002 167 512 28 TOTAL DISC SECTORS INCLUDING ROOT: 18495 NUMBER OF ERROR MESSAGES: 0 ITEM NAME COUNT: 38 DATA SET COUNT: 7 ROOT LENGTH: 1308 BUFFER LENGTH: 512 TRAILER LENGTH: 256 ------------------------------------------------------------------- 8.0 How do I improve on an existing design? 8.1 Common design situations The majority of system managers and programmers inherit someone else's data base design. Does this mean that you are stuck and there is not much that can be done. The answer is certainly no! On most HP3000 systems, the Image data bases are at the heart of all the applications and are the place where the real work gets done. If you've read up to this point, you should be aware of the many possible configuration issues that are important whether you originally designed the data base or not. The following discussion is an attempt to identify some of the common design problems built into Image data bases and methods by which you can minimize their impact. 8.2 The concatenated key Very often in the design of a data base you'll have a group of related data items that form a nice detail data set, but no one item is a natural key. 
One choice would have been to assign an artificial key, as is the case with customer numbers, vendor numbers etc. These numbers serve as unique identifiers and make good key values because there is usually only one customer record per customer number or vendor record per vendor number. However, it is not always possible to build such indices, and the only other choice is to pick two or more data items in the data set that when put together make a small enough chain in the detail as to deserve an index. An example would be concatenating the customer name and zip codes together creating a new data item called NAME-ZIP. This is purely a bet that we don't get too many customer names that are the same in any given zip code. As a side note. it is a good idea when concatenating data items together, to place the data items that vary the most to the right, and the more static data items to the left. This is because the hashing algorithm that generates the relative record number from the data item content starts at the right and uses these bits to "randomize> the rest of the bits from right to left. The net result will be a more even spread of entries in the master data set. 8.2.1 Using DBfind on multiple paths to find the shortest chain One method that sounds complicated, but in reality is very simple to implement, is to make use of the feature of DBfind that brings back the number of entries on the chain. When a screen is designed, it is very helpful to highlight the fields that can be used to search the data, and if the request includes multiple search items into the detail data set, it would be advantageous to perform multiple DBfinds and then do the chained reads on the shortest chain. Or if the chains are greater than a predetermined length, request the user to enter any additional data. 8.2.2 Implications for data dictionaries In order to use a data dictionary with concatenated keys, the data dictionary must be able to break out the data items into sub-fields. This is very common today but when data dictionaries first became available it was a big problem. 8.2.3 Implications for report generators The real problems with concatenated keys come out when reports are generated. Unless the report writer is capable of breaking out the sub-fields, you may have great difficulty entering the selection criteria as a mess of concatenated characters. Query does not have the ability to prompt for parts of a data item, and many a person has experienced great frustration attempting to perform a Query find on a concatenated key. 8.3 The need for generic keys Many applications have the need to be able to retrieve entries on a generic or close match basis. There is no method built into Image to provide this functionality, although some have tried. There are add-on software packages that build data structures within Image data sets that can provide some of these capabilities. The following discussion covers some ideas that may work for you. 8.3.1 The two bytes, four bytes, 6 bytes approach One method is to create a new data item that is a shortened version of the complete data item. An example would be the first two characters of the customer name. This key would allow chained retrieval of all customers whose names started with the same two characters. This could be a very long chain, so we could add another new data item that is the first four characters of the customer name, and so forth. 
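To make the idea concrete, here is a minimal, self-contained sketch that derives the two, four and six character keys from a customer name; the sample name and the item names in the comments are illustrative, and each derived item would become the search item of its own automatic master.

/* Derive progressively longer prefix keys from a customer name for
   generic retrieval.  Keys are blank padded the way an X-type Image
   item would be stored.                                              */
#include <stdio.h>
#include <string.h>

static void prefix_key(const char *name, int len, char *key)
{
    int i;
    for (i = 0; i < len; i++)                 /* copy, then blank pad  */
        key[i] = (i < (int)strlen(name)) ? name[i] : ' ';
    key[len] = '\0';
}

int main(void)
{
    char k2[3], k4[5], k6[7];
    const char *name = "HORNSBY MANUFACTURING";

    prefix_key(name, 2, k2);    /* a NAME-KEY-2 item, type X2          */
    prefix_key(name, 4, k4);    /* a NAME-KEY-4 item, type X4          */
    prefix_key(name, 6, k6);    /* a NAME-KEY-6 item, type X6          */

    printf("two byte key  : '%s'\n", k2);
    printf("four byte key : '%s'\n", k4);
    printf("six byte key  : '%s'\n", k6);
    return 0;
}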
8.2.2 Implications for data dictionaries

In order to use a data dictionary with concatenated keys, the data dictionary must be able to break out the data items into sub-fields. This is very common today, but when data dictionaries first became available it was a big problem.

8.2.3 Implications for report generators

The real problems with concatenated keys come out when reports are generated. Unless the report writer is capable of breaking out the sub-fields, you may have great difficulty entering the selection criteria as a mess of concatenated characters. Query does not have the ability to prompt for parts of a data item, and many a person has experienced great frustration attempting to perform a Query find on a concatenated key.

8.3 The need for generic keys

Many applications need to be able to retrieve entries on a generic or close-match basis. There is no method built into Image to provide this functionality, although some have tried. There are add-on software packages that build data structures within Image data sets that can provide some of these capabilities. The following discussion covers some ideas that may work for you.

8.3.1 The two bytes, four bytes, six bytes approach

One method is to create a new data item that is a shortened version of the complete data item. An example would be the first two characters of the customer name. This key would allow chained retrieval of all customers whose names start with the same two characters. This could be a very long chain, so we could add another new data item that is the first four characters of the customer name, and so forth. Remember, the amount of overhead to add and delete a detail entry is directly proportional to the number of associated masters. Thus this method, while providing a limited solution, is very expensive in terms of the CPU and disc resources required to make it work.

8.3.2 Using KSAM versus BTREES

NOTE: NEED TO ADD BTREE DISCUSSION IN THIS SECTION

Hewlett-Packard does provide a file system and access method to facilitate generic key retrieval, in the form of KSAM, an acronym for keyed sequential access method. KSAM files are not widely used for primary data storage because, while providing generic access functionality, they have three design flaws that make them unsuitable for production data storage. The first is that locking can only occur at the file level, which causes very long response times when more than ten users are running an application against a common KSAM file. Second, KSAM files do not share buffers for all users on the system, and therefore the more users who have the file opened, the more memory is required for redundant buffers. And lastly, KSAM files can be corrupted easily when a system failure, abort, or halt is encountered, and it can take a long time to recover these files afterwards.

However, one solution to the need for generic access is to unload the master and load the data into a KSAM file on a nightly basis. Then, if a generic access is required, the KSAM file is searched to bring back the exact key value of interest, which is then used to perform a DBfind to bring back the detail data entries.

8.3.3 Third Party Indexing (TPI)

NOTE: NEED TO ADD TPI DISCUSSION

8.3.4 Sound-indexing, the removal of vowels in character keys

Many generic accesses are an attempt to find close matches on name data items: CUSTOMER-NAME, VENDOR-NAME, EMPLOYEE-NAME, etc. One method that works well is to create a squashed name by removing all of the vowels and repeating consonants and truncating after four characters have been found; thus HORNSBY becomes HRNS. This allows for many different spellings that sound alike, and will allow a chained read of names that are close-sounding approximations.
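A sketch of such a squash routine is shown below. The handling of non-letters and of Y is an assumption on my part; the text above only specifies dropping vowels and repeated consonants and truncating to four characters.

   VOWELS = set("AEIOU")

   def squash_name(name, width=4):
       """Drop vowels and adjacent repeated letters, keep the first few consonants."""
       squashed = []
       for ch in name.upper():
           if not ch.isalpha() or ch in VOWELS:
               continue
           if squashed and squashed[-1] == ch:   # collapse repeating consonants
               continue
           squashed.append(ch)
           if len(squashed) == width:
               break
       return "".join(squashed)

   print(squash_name("HORNSBY"))    # HRNS
   print(squash_name("HORNSBEE"))   # HRNS -- a close-sounding spelling lands on the same chain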
8.4 Open vs. closed order data sets

A very common design problem is moving data from one data set to another, usually as the status or timing of the data changes: for example, moving orders from open orders to closed orders, or inventory transactions to inventory history. The jobs or processes that move these entries around tend to be very CPU and disc IO intensive, not to mention tricky to program and test. The following describes these common problems and provides some alternatives.

8.4.1 Moving data from one path to another

In many applications it is necessary to move transactions from one state to another. To illustrate, let's use an example of a simple order entry data base with an open order detail and a closed order detail, with two associated masters both based on order number. Now, every day at shipping time the open orders are picked and moved to the closed orders. This requires a DBfind/DBget/DBdelete on the open orders and a DBput on the closed orders. This can be a very time consuming process that takes longer as the number of orders increases. A better method would have been to keep an open orders detail that contains only the order numbers of unpicked orders, and an orders detail that contains all orders with a new field called ORDER-STATUS. Thus the processing at shipping time would be a DBget/DBdelete on open orders and a DBget/DBupdate on the orders data set. Note that the DBdelete is not as intensive because there is no path into open orders; it is a simple stand-alone detail.

8.4.2 Sequential processing and DBdelete (moving secondaries)

One of the things to be very careful about with Image data bases is serially processing an Image master while performing DBdeletes. You must manually check the status after the DBget, or re-read the current record, to verify that a secondary that was in a higher relative record number hasn't moved into the record you've just deleted. If you don't check this condition manually, it can lead to a situation in which your program appears to have processed all of the entries but really has not. This problem gets worse as the number of entries grows and the master becomes saturated.

8.4.3 Using a DBupdate versus a DBput and DBdelete

One way to improve response time is to include a deleted flag in the data entry and logically delete records by updating this flag instead of actually performing the DBdelete. This is very handy for data entries that tend to be extremely volatile. It is also handy when someone inadvertently deletes an entry or group of entries. The entries can be physically deleted when the space is required and the processing time is available. Another benefit of this method is that it can be used to always be adding to the end of a detail set, which can provide substantial performance gains if the detail entries for a given chain are added at the same time, i.e. order line items in an order detail. The drawback is that all applications using the data set have to recognize and ignore the logically deleted entries.

8.5 The need for archiving of data from the data base

In the great hurry to get data into the data base, the more important questions of how long it will stay and how to get rid of it are often overlooked. This oversight will lead to a startling discovery about Image, which is that IT TAKES JUST AS LONG TO DELETE AN ENTRY AS TO PUT IN THE ORIGINAL ENTRY. So, if you have a staff of order entry clerks entering orders two shifts per day, five days a week, for two years, how long will it take to archive out the shipped orders? For many applications the problem is not knowing which entries to keep and which to delete, and for others the problem is when to actually delete the entries.

8.5.1 Government and corporate regulations

One of the main reasons that data is allowed to accumulate in a data base is a lack of corporate policy on record keeping; on the other hand, some applications are forced by various government regulations to keep records online. The decisions on policy for record keeping should be made, if at all possible, before the design of the data base or the purchase of a package. How the various applications deal with a month's worth of entries will be dramatically different from processing two or three years of entries.

8.5.2 Using expiration dates on data entries

The problem of knowing which entries to keep and which to delete can be very difficult if two types of data items are not included in the data entry. One data item is an expiration date, which is simply a timer that tells when it is okay to delete the entry. The other is a last-accessed date, which is updated when the data entry is retrieved and/or updated. Both of these data items can work together to provide a method for deleting entries.
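As an illustration of how the two items can work together, the purge rule can be as simple as the sketch below. The one-year idle period is an arbitrary example, not anything Image enforces.

   from datetime import date, timedelta

   def eligible_for_archive(expiration_date, last_accessed,
                            today=None, idle_period=timedelta(days=365)):
       """True once the entry has expired and has sat untouched for the idle period."""
       today = today or date.today()
       return expiration_date <= today and (today - last_accessed) >= idle_period

   print(eligible_for_archive(date(1998, 6, 30), date(1997, 10, 1),
                              today=date(1998, 11, 25)))    # True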
8.5.3 The problems of month and year-end archiving

By including expiration and last-accessed data items, it is possible to perform archival runs on daily and weekly cycles instead of combining this type of processing with other month and year-end activities. These periods are usually heavy enough without adding the additional overhead of archival.

8.6 Validation of logical relationships

Logical relationships are the interdependencies of data that the application builds into the data base design: for example, a single customer entry per customer number, or entries in an inventory data base that are referenced by the production planning data base. These relationships should be well documented, and auditing applications put in place to verify or balance one data base against another. The point is that it is much better to find and correct problems by cycle-checking the data bases than in a big month or year-end run.

8.6.1 Locking strategies

Locking across multiple data bases requires multiple RIN (MR) capability, which can be very dangerous if implemented improperly. Poor locking strategies are one of the primary ways in which data bases, or data sets within a data base, can get out of balance. A key strategy is to perform locks in the same sequence in all applications, and to lock and reread entries prior to an update transaction. If the same locking sequence is used throughout all applications, one of the major causes of deadlocks will be avoided. Besides requiring a restart of the system, deadlocks also contribute to partially applied transactions that can logically corrupt data sets or data bases. Other logical corruption occurs by coding applications so that they blindly DBupdate or DBdelete/DBput the current record. To ensure logical integrity, the application should reread the current record to see if it still exists, and if so, whether it is the same as it was when it was initially read. The situation where multiple users go after the same record simultaneously is application dependent, but it happens more frequently than you would think. The result is that an update that was thought to be processed gets "unprocessed", or you might even get a false indication of a broken chain because the next or previous record was deleted before you got to it.

8.6.2 Matching multiple data bases

Unfortunately, the only way to match and verify multiple data sets or data bases is to build an application program that audits the content of the entries. It is sometimes useful to create control data sets that contain record counters and "hash" totals. A "hash" total is simply the total accumulation of one or more of the data items in a data set. In its simplest form, it could be the total of all of the dollar-sales data items for all data entries. These record counters and totals provide a quick method of balancing even very large files.
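A sketch of the balancing pass is below. The DOLLAR-SALES item name and the dictionary-per-entry representation are illustrative assumptions; in a real audit program the entries would come from a serial read of the data set and the stored counters from the control data set.

   def control_totals(entries, amount_item="DOLLAR-SALES"):
       """Record count plus a simple hash total over one data item."""
       count, hash_total = 0, 0
       for entry in entries:
           count += 1
           hash_total += entry[amount_item]
       return count, hash_total

   stored = (3, 1250)     # counters previously written to the control data set
   derived = control_totals([{"DOLLAR-SALES": 500},
                             {"DOLLAR-SALES": 250},
                             {"DOLLAR-SALES": 500}])
   print("in balance" if derived == stored else "OUT OF BALANCE")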
8.6.3 Matching data bases across multiple systems

One of the data management problems unique to distributed computing is the management of multiple data bases on different computer systems. There are some third-party products that attempt to apply transactions from one data base to the other and vice versa, but the amount of time spent keeping the data bases in synchronization is always underestimated, and it always turns out that specialized applications are required to assist in auditing and adjusting the content of the data bases.

8.6.4 Audit trails for data base changes

The most valuable aid in the battle to keep data bases balanced is the transaction audit trail. This report is a history of the detail and total net changes to the data base data sets. Once the data base gets out of balance, the process of putting it back is predicated in large part on someone's ability to figure out what happened. Without an audit trail, it is almost impossible to recreate the transactions exactly. These reports are some of the easiest to produce, as the application program updates, adds, or deletes entries in the data base; but all too often the only information these programs generate is the end-of-program message.

8.7 Validation of physical relationships

Physical relationships are the internal pointers that Image maintains for you. The path from a master to a detail, and the chains maintained in the detail, are made up of pointers and counters that physically make up the data base structure. These pointers are protected as much as possible by Image and MPE, but computers, as reliable as they are, are still prone to failures. And, as has been witnessed repeatedly, human error can be even more devastating.

8.7.1 Finding broken chains

There are two ways to find a broken chain in an Image data base. One way is just to wait, and sooner or later, at two or three in the morning during a crucial month-end or year-end job, you'll find one. The other way is to go looking for them, and the only way to validate an Image data base is to read down each chain. The contributed library utility DBCHECK, and ADAGER's diagnostic functions, should be run on a periodic basis.

8.7.2 Setting up a logging strategy

When the data base becomes physically corrupted via hardware, software, or human error, you are faced with the decision of whether to attempt to patch up the data base, if possible, or to restore the data base from a backup copy. If the physical media has been destroyed by a head crash or partially overwritten, the chances of patching up the data bases are slim. If, however, the extent of the damage is a broken chain or two, these can sometimes be mended by doing a chained unload and load of the affected data sets. The only other option depends on whether you were logging the transactions to a log file on a separate disc or tape drive. In this case you can restore from the backup and reapply the logged transactions with little or no loss of production data. The length of time between backups of the data base should be determined by the number of transactions in the interval. Generally, the greater the volume of transactions, the shorter the interval between backups.

8.7.3 Using data base audit utilities

The best way to use these audit utilities is to set up production job streams for each data base:

   !JOB DBCHEKUP,CREATOR.DATABASE
   :RUN DButil.PUB.SYS
   SHOW DBname ALL
   EXIT
   :RUN QUERY.PUB.SYS
   B=dbname
   ;
   5
   FO SETS
   EXIT
   :RUN DBLOADNG.PUB.TELESUP
   dbname
   ;
   !EOJ

If you are unsure about which data bases are located in which accounts on your system, a neat trick is to perform a LISTF @01.@.@,2. This will list all files that end in 01 in each group and account. An Image data base must have at least one data set, and its first data set will have a file name of the data base name with 01 appended. The files listed with a PRIV file code are Image data sets.
8.8 Changing a key value

A major design problem is updating a search item value. The only way that Image allows this to happen is via a DBdelete and DBput combination. This problem is usually worse for details, as they can have up to sixteen search items.

8.8.1 Calculating the number of IO's needed

The main issue with deleting and re-adding an entry to change a data item is the CPU and disc IO required to perform the transaction. A system with this kind of application will usually be able to serve about half as many users as the same system performing simple DBputs. The number of IO's required for each DBdelete or DBput against a detail is on average 2 + 2*(number of paths); thus a detail with three paths requires eight IO's for each call, or sixteen IO's to complete the delete/put transaction. But we must also multiply by the number of users, out of the total signed on, who will perform this transaction in a given time period. Thus, when deciding on key values, we must take into consideration the likelihood that the data item may be changed. Usually these types of volatile values can be stored in separate data sets (remember the open versus closed orders example), eliminating the need to make the data item a search item, and with it the need to perform a DBdelete/DBput instead of a DBupdate.
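The arithmetic above is easy to put into a small model, which is also handy for the path-removal discussion in section 8.11. Treat the numbers as averages, not guarantees; sorted paths and secondary chains push them up.

   def ios_per_write(paths):
       """Average disc IO's for one DBPUT or DBDELETE against a detail."""
       return 2 + 2 * paths

   def ios_to_change_key(paths):
       """Changing a search item value costs a DBDELETE plus a DBPUT."""
       return 2 * ios_per_write(paths)

   print(ios_per_write(3))        # 8  -- one DBPUT (or DBDELETE) on a three-path detail
   print(ios_to_change_key(3))    # 16 -- the delete/put pair described above

Against drives that handle roughly twenty to twenty-five IO's per second (see 8.17.1), even a handful of users changing key values every second will keep several spindles busy.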
8.8.2 Assessing effects of volatility

The performance impact of changing a key value for each transaction should be thoroughly tested and understood before going into production. These are the kind of gotchas that can create the need for a total rewrite before the application can go into production.

8.8.3 Implications for locking strategies

When deleting an entry from one path and adding it to another path, the only choice is to lock at the data set level, and as this is a relatively long transaction, the wait time for other transactions can become excessive. This can be verified by using DButil and its SHOW command to see how often, and for how long, users are waiting in the locking queue.

8.9 Data content of masters

Manual masters can carry data items in addition to the key, and this can have dramatic design implications. Problems can occur with serial retrieval and with a decreased number of records per block.

8.9.1 Data in a master must be retrieved serially

The fact that an entry can be retrieved only by one unique key value means that all other data items in the data entry can only be retrieved via a serial search. For example, consider an employee manual master that uses the social security number as the unique key. What do you do when someone doesn't know their social security number? Serial search until you match on last name. This could take a while if you have a large number of employees.

8.9.2 Wide masters allow fewer keys per block

The other issue is that the number of key entries per block decreases as the record gets wider. This can aggravate the processing of secondaries by increasing the likelihood that a secondary chain spans multiple blocks.

8.9.3 Creating a new detail data set

The solution is to drop the master down to a detail data set and add the paths required for online interactive access. This cannot be done with a simple DBUNLOAD/DBLOAD; you must have a data dictionary unload/load utility, a data base restructuring utility, or break down and write a custom program.

8.10 Removed

8.11 Eliminating keys that are not used

Probably the most powerful thing you can do to an Image data base to improve performance is to remove paths to details that aren't being used. The secret is picking out these paths and finding the computer time to actually make the changes. Unused paths can often be identified by their average and maximum chain lengths: a path with very long chains cannot be used interactively, and probably takes longer to process than reading the detail serially. The main benefits of removing a path are reducing the amount of CPU time and disc IO's for each transaction, and reducing the amount of disc space required for the master and the detail.

8.11.1 Wasted CPU time and disc IO's

The number of disc IO's to DBput or DBdelete is 2 + 2*(number of paths). By deleting a path we reduce the multiplier by one. So, going from four paths to three paths means going from ten IO's to eight IO's, for a savings of 20%. In general, for DBputs and DBdeletes the amount of CPU required is directly related to the number of disc IO's, so we would expect a corresponding savings in CPU time. The number of IO's actually required can be greatly increased by the presence of secondary chains in the case of associated masters, and by the use of sort and extended sort items in the detail.

8.11.2 Wasted disc space

Each entry in the master must have six words for each associated detail, and each entry in the detail must have four words for each set of forward and backward chain pointers. This doesn't sound like much, but it adds up fast.

8.12 Logging and recovery

The simple reason for logging is that it is the only way to recover transactions that have been applied to the data base since the last backup. For some applications, losing data is not a big problem, but for most it's a major catastrophe. Image logging is not very difficult to install, and transactions are not that difficult to recover. The hard part is making sure that logging is always enabled. System restarts and various restructuring utilities turn off logging, and usually manual intervention is required to re-enable the data base for logging.

8.12.1 Predicting recovery time

The amount of time between backups is determined by the number of transactions in the log file. Since each transaction must be DBput, DBdeleted, or DBupdated back into the data base, the amount of time to recover can be as long as the amount of time between backups. It can be very difficult to estimate the actual number of transactions and their types. This is why DBRECOV can be run to produce a transaction statistics report that identifies the transaction types and counts. With this data and the formula for the number of IO's per transaction, you can estimate the number of hours, days, or weeks it would take to recover the transactions if you had to.

8.12.2 Intrinsic level recovery (ILR)

8.13 Error handling

When designing an Image application, consideration should be given to the processing of errors and to the facility for debugging and tracing the conditions which led to the error. When consulting on data base and application design, my favorite trick is to use DButil to change the data base passwords to some bogus values to see how the application will handle the error on the DBopen call. Very often I'll find that the program doesn't even check the status after the DBopen and aborts on the next intrinsic call. Other applications will display the standard Image error message by calling DBexplain or DBerror. But the best applications will print an error log report stating the date, time, program, user logon, and the Image error, along with other useful information generic to the particular application. This type of error handling is basic to being able to successfully support the application once it goes into production. You simply can't count on the user being able to document and relate enough information to quickly diagnose and solve the problem.
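A minimal sketch of the status check and error log described above follows. The dbopen wrapper shown in the comment is hypothetical, and the log layout is just one reasonable choice; the only Image fact relied on is that the first word of the status area is zero when the call succeeds.

   import getpass, sys, time

   def log_image_error(program, call_name, status):
       """Append date, time, program, user, failing call and status words to a log."""
       stamp = time.strftime("%Y-%m-%d %H:%M:%S")
       with open("imagerr.log", "a") as log:
           log.write(f"{stamp} {program} {getpass.getuser()} "
                     f"{call_name} status={list(status)}\n")

   def check_status(program, call_name, status):
       if status[0] != 0:                      # word 1 of the status area: 0 = ok
           log_image_error(program, call_name, status)
           sys.exit(f"{call_name} failed -- see imagerr.log")

   # status = dbopen(base, password, mode=1)   # hypothetical DBOPEN wrapper
   # check_status("ORDENTRY", "DBOPEN", status)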
8.13.1 Logging and tracking of Image errors

By printing out error occurrence reports you can track errors over time to pick out trends or possible application flaws. This is also useful to spot users who may be in need of training and assistance. There are usually two types of problems that arise in Image applications: reproducible hard errors, and erratic intermittent errors. I like to classify these as type A, "twisty little passages all alike", and type B, "twisty little passages all different". The trick to solving the reproducible problems is to be able to set up the exact conditions under which the problem or error occurred. The trick to solving the intermittent problems or errors is to gather as much hard documentation as possible and to try to find common patterns.

8.13.2 User interface to Image errors

Image errors are simply not intended for end user consumption. When designing your application system, build your own message catalog of errors and warning messages. Your HP3000 comes with the facility to build, access, and maintain custom message catalogs. The HP3000 also comes with an extensive help facility that includes the capability for building custom help catalogs, but that is another topic altogether.

8.14 Critical design assumptions

In any design, you are forced to make certain assumptions based on the knowledge of the applications at that time. This next section is dedicated to a list of assumptions that frequently cause major design problems in the long run. The fun and challenging part of debugging Image data base performance problems is figuring out what the data base designer must have been thinking (or not thinking of) when the data base was under construction.

8.14.1 The ratio of reads to writes

On any HP3000 the operating system and the Image data base software have been optimized for a mix of 75% read-orientated applications and 25% write-orientated applications. Systems whose design deviates from this average will not get the same number of interactive users, with the same response time profiles, as the three-to-one read-based systems.

8.14.2 The 75% reads and 25% writes rule of thumb

The assumptions made about the performance of Image data base applications, in regard to the average number of users that a specific model of HP3000 can support, are based largely on applications that DBget three entries for every one DBput. Application designs that do not match this standard profile, e.g. order entry, will not get the same number of users with the same response times as the standard applications. This is because it takes a lot more work to perform DBputs in terms of CPU and disc IO's, and DBputs to a shared data base require at least set level locking. So what has to be done to the design of applications that perform three DBputs for each DBget? The answer is to keep the structure of the data base as simple as possible, minimizing the number of paths into details, and thus the amount of work for the DBput to perform. Another solution is to edit and write the transactions to a stand-alone detail, and then have a batch background process running at a lower priority that reads the stand-alone transaction detail and posts it to the appropriate data set. This disassociates the data entry user from the DBput with attached masters, thus allowing maximum throughput from the data entry application.
But whatever the workaround, keep an eye out for these DBput-orientated applications; it is much easier to design transaction files into the application in advance than to attempt to retrofit them once the application is in production.

8.14.3 Unload and reload times

When designing a data base, one of the considerations as to when to split up data bases and data sets is the amount of time it will take to unload and reload the data base and each data set. When this estimate exceeds two days, make sure that you have funds in your project budget for one of the data base tools that can unload and load the data base or individual data sets. Even if you don't plan to pack detail data sets on a regular basis, when you come across a broken chain you'll be faced with a decision: restore back to an old version of the data base prior to the corruption (if you have one), or unload and reload the data base or the affected data sets.

8.14.4 Using sorted chains

The assumptions made about the overhead of designing in sorted chains have historically been one of the biggest Image data base design problems. This is for two reasons: the first is that the data base designer is usually also the application programmer, looking for ways to cut corners in writing application code, and the second is that the performance problems do not show up until you perform volume testing or go into production.

8.14.4.1 Location of the sort item in the record

When placing a sort data item in a data entry, care should be taken to place the item at the end of the data entry. This is because, when determining the proper location in the chain for the data entry being added, Image will first compare the sort data item values; if they are equal, Image will then use any data items specified after the sort data item as sub-sort items. Needless to say, the overhead for sub-sort items can be very costly, especially if they were specified unknowingly.

8.14.4.2 Adding entries to the chain in other than sort sequence

When data entries are added to a detail that has a sorted chain, Image will start at the end of the chain, examining each entry to find the correct location in the chain for the new entry. If the detail entries are added in ascending sorted sequence, so that no entries already on the chain have a sort item value greater than the entry being added, the overhead of maintaining the sorted chain is minimal. If, however, the entries are added in random sequence, the amount of work in terms of CPU and disc IO's increases with each entry added to the chain. This can lead to batch jobs that run for days, and interactive response times measured in hours!

8.14.4.3 Making use of chronological sequencing

If no sort item is specified, then Image will add entries to the chain in chronological order. So if you want the entries on a chain kept in a given order, simply add them to the chain in that sequence. Beware that some data base restructuring tools do not preserve the chronological sequencing of detail chains when they unload and reload.

8.14.5 Using sort intrinsics

Most sorted chains can be replaced by simply sorting the entries at the time they need to be presented in sorted order. More often than not this is in a batch job anyway. MPE comes with SORT and MERGE utility programs, which also allow programmatic calls to sort and merge data entries. These are much more flexible and powerful than Image sorted chains.
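For example, rather than defining ORDER-DATE as a sort item on the ORDERS chain, the chain can be read in whatever order Image returns it and sorted in the program. A small sketch with made-up entries follows; the dates are shown as YYMMDD so that a plain string sort works.

   chain = [
       {"ORDER-NUMBER": "A10043", "ORDER-DATE": "980702"},
       {"ORDER-NUMBER": "A10012", "ORDER-DATE": "980115"},
       {"ORDER-NUMBER": "A10027", "ORDER-DATE": "980330"},
   ]
   for entry in sorted(chain, key=lambda e: e["ORDER-DATE"]):
       print(entry["ORDER-NUMBER"], entry["ORDER-DATE"])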
8.14.6 Master hashing performance profiles

All key types do not hash the same. The design decision on the data type and content of a master search item is usually based on the assumption that hashing is uniformly efficient no matter what the key type or content. The truth is that hashing performance varies greatly for different data types, and even for the same data type with different data contents.

8.14.6.1 Short character keys

Short character keys are probably the most widely used key data type. Short character keys are those that range from X10/U10 down to X2/U2. This type of key exhibits especially poor hashing performance when the values of the key occur in sequence. For example, date keys are typically X6, and the value of the date is MMDDYY. Storing the content in this manner is going to cause problems for two reasons. One is that to sort on date you would have to specify two sort fields, YY and MMDD. The second is that when hashing, Image starts with the rightmost bytes and uses these to flip and twiddle the bits going to the left. In this case the YY part of the key will be used every time, and we'll wind up with a large percentage of entries on secondary chains. So if your data base has these short character keys, you can do two things to improve the hashing performance: one is to make the rightmost bytes as variable as possible, and the other is to size the short character masters at 45% to 50% full.

8.14.6.2 Sequential numeric keys

Image uses a different method to place numeric key values of data types I, J, K, and R. The rightmost 31 bits of the key item value are loaded into a double integer and reduced by 1; this is divided by the capacity of the master data set, the remainder of that division is taken, and 1 is added to arrive at the primary relative record number. So, if the capacity were 100 and the key value were 50, the relative record number would be calculated as follows:

   1. Subtract 1 from the rightmost 31 bits:   50 - 1 = 49
   2. Divide by the capacity of the master:    49 / 100 = 0, remainder 49
   3. Add 1 to the remainder:                  49 + 1 = 50

In other words, for numeric key types, if the rightmost 31 bits do not exceed the capacity of the master, the primary relative record address is simply the value of those 31 bits. Now let's do an example with a key value of 231 and a capacity of 100:

   1. Subtract 1 from the rightmost 31 bits:   231 - 1 = 230
   2. Divide by the capacity of the master:    230 / 100 = 2, remainder 30
   3. Add 1 to the remainder:                  30 + 1 = 31

When using numeric data types, you are in direct control of record placement via the content of the key value. This can be a method of guaranteeing that no secondary entries ever exist, if and only if the key values lie between 1 and the capacity of the data set. However, if the key values are greater than the capacity of the data set, then you could find yourself in a situation where every entry being added is a secondary entry.
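The calculation reduces to a remainder operation, which makes it easy to model and to see where collisions will fall. The sketch below follows the description above; it is a model of the arithmetic, not Image source code.

   def primary_address(key_value, capacity):
       """Primary relative record number for a numeric (I/J/K/R) key."""
       value = key_value & 0x7FFFFFFF        # rightmost 31 bits
       return (value - 1) % capacity + 1

   print(primary_address(50, 100))     # 50 -- values up to the capacity map directly
   print(primary_address(231, 100))    # 31 -- larger values wrap, colliding with 131, 331, ...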
8.14.6.3 Modeling hashing performance

In either case, whether you go with character or integer keys, it is always a good idea to load your proposed data base up with good test data to see what the occurrence of secondaries will be. Then you can alter the type of the key, the content of the key value, or the capacity to see if you can improve on the hashing profiles. Secondaries are only a consideration if the chains point to other blocks. If a chain is wholly contained in the current block, the effort to process it is minimal. However, if the chain crosses many blocks, then multiple IO and deblocking operations are required. This leads back to the discussion of the data content of masters. Remember that the more data items contained in the master record, the fewer data entries per block; thus, masters that carry data items generally have more secondary chain pointers to other blocks.

8.15 Capacity planning

NOTE: THIS SECTION NEEDS TO BE REDONE IN LIGHT OF MDX & DDX

8.15.1 The MDX/DDX trap

The use of MDX, DDX, and utilities for altering data set capacities makes it very easy to forget that some day a data set will get corrupted, and if logging isn't being used, the only choice will be an unload and a reload of at least the data sets involved. THERE IS A BIG DIFFERENCE BETWEEN CHANGING THE CAPACITY OF A DETAIL DATA SET AND UNLOADING AND RELOADING IT. So whenever capacities are changed, the amount of time to DBunload and DBload should be calculated. If the time required is not acceptable, then your only alternative is transaction logging.

8.15.2 Hashed and serial read performance issues

Increasing the capacity of a master data set will generally improve the overall hashing performance, unless you pick a bad value for the new capacity. There is no method for accurately predicting the hashing performance of a large master data set, but even numbers and multiples of ten are surefire losers. When increasing the capacity of data sets, be aware that applications that search serially will take longer. This will affect master serial searches immediately, and detail serial searches as new entries are added.
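Notice that the master capacities in the ORDERS schema earlier in this FAQ (101, 1009, 5003, 10007) are all primes. Picking an odd number, often a prime, near the target is a common rule of thumb, since it automatically avoids even numbers and multiples of ten, though it is no guarantee of a good spread. A sketch of that convention:

   def is_prime(n):
       if n < 2:
           return False
       factor = 2
       while factor * factor <= n:
           if n % factor == 0:
               return False
           factor += 1
       return True

   def pick_capacity(target):
       """First prime at or above the target capacity."""
       n = target
       while not is_prime(n):
           n += 1
       return n

   print(pick_capacity(1000), pick_capacity(5000), pick_capacity(10000))
   # 1009 5003 10007 -- the same values used in the ORDERS schema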
8.16.2 The marginal benefits of repacking

Reloading or packing detail data sets is a way of reducing the number of IO's required to retrieve a detail chain. The performance improvement can be measured by comparing the elongation factor before and after the reload or pack; the difference represents the average number of IO's saved per chained access.

8.16 Removed

8.17 Placement of data sets on disc drives

With MPE/XL mapped file access, the emphasis on file placement has waned. However, for IO-intensive systems, or systems with a large number of disc drives, disc management is still a big problem. The following is a brief discussion of how to balance IO across disc drives, which data sets to separate, and the effects of using a separate volume for Image logging.

8.17.1 Balancing disc IO

HP disc drives can perform a maximum of twenty to twenty-five IO's per second, and as we've seen, the DBput and DBdelete intrinsics can generate a large number of IO's. The best method for balancing IO is to audit the IO requirements for the entire system, building a list of the busiest files. This data can be gathered by using the contributed library program FILERPT. The list can then be used to separate the busiest files onto different disc drives. On MPE/XL systems, the primary concerns are the larger, randomly accessed files (Image masters), as the smaller files will be wholly mapped into memory.

8.17.2 Separating masters and details

In general, related master and detail data sets should be placed on separate disc drives. When they share a drive they create head contention and thus slow down the DBput, DBfind/DBget, or DBdelete. This can be a very complicated task for a system with many data bases and many data sets; again, we are only concerned with the most active data sets.

8.17.3 Separate volumes for disc logging

Using up an entire disc drive for logging Image transactions to disc is usually a big waste and very much overkill. Simply taking care to locate the log file on a separate disc from the data base data sets, and documenting the label location of the log file, is going to be sufficient for most sites. The disc drive, when it is not tied up as a dedicated log volume, can then contribute significantly to the overall throughput of the system.