2.1. - The appearance of DATABASE CONCEPTS
2.2. - Defining DATA BASES
2.3. - The advantages of using DATA BASES
2.4. - Examples of DATA BASES
The appearance of DATABASES was a natural consequence of the development of data processing technology, which can be viewed in two ways:
- In a quantitative way - as a growth in the volume of data that could be processed on a computer; millions of data records, and many more, could be stored, updated and processed in computer memory
- In a qualitative way - as an increase in the complexity of the data structures that could be processed at one time
Three steps can be emphasized in data processing development:
- First step - the separation between DATA NAMES and VALUES
- Second step - the increase of data structures complexity
- Third step - the separation between DATA and PROCEDURES
The separation between DATA NAMES and VALUES gives rise to two levels of DATA DESCRIPTION:
- A concrete level of data description - description with VALUES - called PHYSICAL LEVEL of DESCRIPTION
- An abstract level of data description - description with NAMES - called LOGICAL LEVEL of DESCRIPTION
The presence of the abstract level of data description will facilitate:
- The GROUPING of data description
- The defining of DATA TYPES
- The association of defined RESTRICTIONS
- The association of appropriate OPERATIONS
For these reasons, this first step is very important for the definition of DATABASES.
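The separation between names and values can be illustrated with a minimal Python sketch. The `FieldDescription` class, the `PERSON` description and the `is_valid` function are hypothetical names introduced only for this illustration: the logical level groups names with their types and restrictions, while a record of concrete values is checked against that description.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class FieldDescription:
    """Logical description of one datum: a NAME with its TYPE and RESTRICTION."""
    name: str
    type_: type
    restriction: Callable[[Any], bool]

    def accepts(self, value: Any) -> bool:
        # A value is acceptable if it has the declared type and obeys the restriction.
        return isinstance(value, self.type_) and self.restriction(value)


# Logical level: the descriptions are GROUPED under one (hypothetical) PERSON record.
PERSON = (
    FieldDescription("NAME", str, lambda v: len(v) > 0),
    FieldDescription("BIRTH_YEAR", int, lambda v: 1900 <= v <= 2100),
)


def is_valid(record: dict) -> bool:
    """An OPERATION associated with the description: check concrete VALUES against it."""
    names = {field.name for field in PERSON}
    return set(record) == names and all(
        field.accepts(record[field.name]) for field in PERSON
    )


print(is_valid({"NAME": "Ana", "BIRTH_YEAR": 1970}))
print(is_valid({"NAME": "Ana", "BIRTH_YEAR": 1700}))
```

The description is written once and can be reused for any number of value sets, which is the point of keeping the two levels apart.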
Regarding the increase in data structure complexity, three stages of DATA PROCESSING may be defined if DATA PROCESSING is viewed as a combination of two elements - the DATA and the CALCULUS (the computation):
- First stage described by - simple data and complex calculus
The simple data were small amounts of unstructured data used for scientific calculations; the most representative processing language was FORTRAN.
- Second stage described by - complex data and simple calculus
The complex data were large amounts of structured data, grouped into records and used for simple business calculations (adding and subtracting data); the most representative processing language was COBOL. At this stage the importance of data retrieval was recognized.
- Third stage described by - complex data and complex calculus
At this stage, large libraries of complex functions were added to the complex structured data. These functions were specialized in processing several types of data and data structures, from numeric and character data to arrays, records and lists.
When the volume of data was moderate, the data structures could be stored in internal memory, and communication with external memory could be reduced to an initial READ and a final WRITE; the representative processing language was PASCAL.
When the volume of data was very large, the data structures had to be stored only in external memory. For this reason the transfer of data structures between internal and external memory had to be made by data retrieval; the representative processing language was the data manipulation language of the DATA BASE MANAGEMENT SYSTEMS (DBMS).
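The two ways of working with external memory described above can be contrasted in a short Python sketch. It is only an illustration under stated assumptions: the file names, the `rec` table and the sample records are hypothetical, a JSON file stands in for the sequential file, and SQLite stands in for a DBMS.

```python
import json
import os
import sqlite3
import tempfile

# Hypothetical sample data: five small records.
records = [{"id": i, "amount": i * 10} for i in range(1, 6)]
workdir = tempfile.mkdtemp()

# Moderate volume: one initial READ, processing in internal memory, one final WRITE.
path = os.path.join(workdir, "data.json")
with open(path, "w") as f:
    json.dump(records, f)
with open(path) as f:
    in_memory = json.load(f)          # initial READ of the whole structure
for rec in in_memory:
    rec["amount"] += 1                # all processing happens in internal memory
with open(path, "w") as f:
    json.dump(in_memory, f)           # final WRITE of the whole structure

# Very large volume: the data stay in external memory and only the
# records that are needed are retrieved or updated.
db = sqlite3.connect(os.path.join(workdir, "data.db"))
db.execute("CREATE TABLE rec (id INTEGER PRIMARY KEY, amount INTEGER)")
db.executemany("INSERT INTO rec VALUES (:id, :amount)", records)
db.commit()
db.execute("UPDATE rec SET amount = amount + 1 WHERE id = ?", (3,))
db.commit()
row = db.execute("SELECT amount FROM rec WHERE id = ?", (3,)).fetchone()
print(in_memory[2]["amount"], row[0])
```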
The third step in data processing development - the separation between DATA and PROCEDURES - is the most important in defining the DATABASE approach.
The growing complexity of data caused more and more updates to be made to the initial data definition, at both the physical and the logical level. Every such modification of the data structures affects the associated procedures. Introducing several levels of abstraction in the definition of data structures solved this difficulty.
J. D. Ullman describes in [2] three LEVELS of ABSTRACTION in a DATABASE SYSTEM. Figure 1 presents a schema of this division of the database into levels of abstraction.
Fig. 1. The levels of abstraction in a DATABASE
At the lowest level of abstraction with which we deal, there is a physical database that resides permanently on secondary storage devices.
This PHYSICAL DATABASE consists of:
- The set of DATA VALUES that are stored on the secondary storage devices according to the physical representation of values on the device (encoding methods, compression techniques, and so on)
- The set of DEVICE CONTROL INFORMATION that is stored together with the user's DATA
At the middle level of abstraction we find the CONCEPTUAL LEVEL, represented by the LOGICAL DATABASE, which groups the description of data by means of NAMES.
The LOGICAL DATABASE consists of:
- The DATA DESCRIPTION MODEL
* HIERARCHICAL data model
* NETWORK data model
* RELATIONAL data model
- The set of DATA DESCRIPTION CHARACTERISTICS:
* NAMES
* TYPES
* VALUE LIMITS
- The DATA DEFINITION LANGUAGE (DDL)
The LOGICAL DATABASE is also called the DATA DESCRIPTION SCHEMA.
The EXTERNAL LEVEL represents the highest level of abstraction.
This level is built by means of:
- A set of VIEWS that contain parts of the data described in the LOGICAL DATABASE and stored in the PHYSICAL DATABASE
- A set of VIRTUAL DATA that extend the information which can be retrieved from the DATABASE
The data actually stored in the DATABASE (in the LOGICAL and PHYSICAL database) are called REAL DATA.
The data that can be obtained from the REAL DATA by means of a function such as:
VD = f (RD)
are called VIRTUAL DATA.
Example:
In a database for PERSONS:
- The birth date is stored as a REAL DATUM with the name BD
- The age, AG, is treated as a VIRTUAL DATUM; it is not stored in the DATABASE as a value, it appears only at the external level, is defined in a view, and is calculated by means of the following function:
AG = CD - BD
where:
CD is the current date held in the system
BD is the birth date stored in the database.
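The PERSONS example can be sketched with Python's built-in `sqlite3` module. The table and view names (`persons`, `system_clock`, `persons_view`) and the sample rows are assumptions made for this illustration; the current date CD is kept in a one-row table so that the result is reproducible, and AG is computed in completed years.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- REAL DATA: the birth date BD is stored as a value.
    CREATE TABLE persons (name TEXT, bd TEXT);
    INSERT INTO persons VALUES ('Ana', '1970-05-20'), ('Ion', '1980-01-01');

    -- CD, the current date held in the system (fixed here for reproducibility).
    CREATE TABLE system_clock (cd TEXT);
    INSERT INTO system_clock VALUES ('2000-01-01');

    -- EXTERNAL LEVEL: AG is VIRTUAL DATA, defined only in the view as AG = CD - BD.
    CREATE VIEW persons_view AS
    SELECT p.name,
           CAST(strftime('%Y', s.cd) AS INTEGER)
         - CAST(strftime('%Y', p.bd) AS INTEGER)
         - (strftime('%m-%d', s.cd) < strftime('%m-%d', p.bd)) AS ag
    FROM persons AS p, system_clock AS s;
""")

ages = conn.execute("SELECT name, ag FROM persons_view ORDER BY name").fetchall()
print(ages)
```

No age is stored anywhere; changing the row in `system_clock` changes every AG value returned by the view.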
At the EXTERNAL LEVEL, within a view, the data model of the DATA STRUCTURES may also be changed.
Example:
At the CONCEPTUAL LEVEL we can deal with a RELATIONAL DATA MODEL and at the EXTERNAL LEVEL we can see a NETWORK one.
Treating DATABASE SYSTEMS as a set of levels of abstraction defines two interfaces:
- PHYSICAL INTERFACE - which ensures the PHYSICAL IMMUNITY of DATA, defined as the stability of the CONCEPTUAL LEVEL when the INTERNAL (physical) LEVEL is changed
- LOGICAL INTERFACE - which ensures the LOGICAL IMMUNITY of DATA, defined as the stability of the EXTERNAL LEVEL when the CONCEPTUAL LEVEL is changed.
Together, the two INTERFACES provide DATA INDEPENDENCE.
Finally, since the VIEWS of DATA are incorporated at the EXTERNAL LEVEL in the application procedures, the abstraction level approach ensures the IMMUNITY of the PROCEDURES when the LOGICAL or the PHYSICAL LEVEL of DATA is changed. In this way we achieve the most characteristic feature of DATABASE SYSTEMS, namely the independence of the DATA from the PROCEDURES.
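The immunity of the procedures can be demonstrated with a small sketch, again using `sqlite3` as a stand-in DBMS. All table, view and function names here (`employee`, `employee2`, `address`, `staff`, `report`) are hypothetical. The procedure `report` reads only the view; the conceptual level is then reorganized and the view redefined, and the procedure returns the same result without being modified.

```python
import sqlite3


def report(conn):
    """Application procedure: it knows only the external view 'staff'."""
    return conn.execute("SELECT name, city FROM staff ORDER BY name").fetchall()


conn = sqlite3.connect(":memory:")

# Conceptual level, version 1: a single table.
conn.executescript("""
    CREATE TABLE employee (name TEXT, city TEXT);
    INSERT INTO employee VALUES ('Ana', 'Cluj'), ('Ion', 'Iasi');
    CREATE VIEW staff AS SELECT name, city FROM employee;
""")
before = report(conn)

# Conceptual level, version 2: the city moves to a separate table.
# Only the view definition is adapted; report() is left untouched.
conn.executescript("""
    DROP VIEW staff;
    CREATE TABLE address (emp_name TEXT, city TEXT);
    INSERT INTO address SELECT name, city FROM employee;
    CREATE TABLE employee2 (name TEXT);
    INSERT INTO employee2 SELECT name FROM employee;
    DROP TABLE employee;
    CREATE VIEW staff AS
        SELECT e.name, a.city
        FROM employee2 AS e JOIN address AS a ON a.emp_name = e.name;
""")
after = report(conn)
print(before == after)
```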
If you need to decide quickly whether a software package is a DATABASE MANAGEMENT SYSTEM (DBMS), it is enough to verify that the product ensures the independence of the DATA from the PROCEDURES.
All DATABASE MANAGEMENT SYSTEMS (DBMS) perform three main functions:
- DATA DEFINITION function
- DATA MANIPULATION function
- USER INTERFACE function.
There are many other functions that can be carried out by the DBMS, including the following:
- DATA SECURITY function
- DATA INTEGRITY function
- DATA ACCESS SHARING function
- DATA RECOVERY function
- DATA ACCESS CONTROL function
There are many definitions of DATABASES in the literature. We present a concise definition based on the topics discussed above:
A DATABASE is a stored data collection having the following characteristics:
- It ensures DATA INDEPENDENCE, demonstrated by:
* The presence of the DATA SCHEME (possibly DATA SUBSCHEME) and the appropriate DATA DEFINITION LANGUAGE (DDL)
- It ensures access (possibly shared access) to large volumes of data, demonstrated by:
* The presence of the PHYSICAL DATA ACCESS LEVEL and the appropriate DATA MANIPULATION LANGUAGE (DML), and also the SECURITY and RECOVERY functions.
The advantages of using a DATABASE are numerous. We present the most important ones:
- Ease of control over large data collections
* Complex data structures can be defined with the DDL
* A great number of retrieval and update operations are offered by the DML
- A shorter design period for large system projects
* An open data structure is ensured by data independence
* Great flexibility is offered by the programming interface
- Adaptability to frequent modifications of the initial specifications
* The immunity of the procedures is ensured by data independence
- Ease of designing integrated projects
* A unique description of the data structure is kept in the database dictionary
- Better control of data consistency
* The concepts of database integrity (entity and referential integrity) provide data consistency
- Answering unexpected questions
* The two query languages, Structured Query Language (SQL) and Query by Example (QBE), provide this flexibility
- Flexibility as the design approach evolves
* The use of data structure views permits changes in the design concept
- Shorter data retrieval time
* Can be obtained by using the interactive SQL language
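Of the advantages listed above, the control of data consistency through entity and referential integrity is easy to show in a few lines. The sketch below uses `sqlite3`; the `department` and `employee` tables and the `try_insert` helper are hypothetical names chosen for the illustration. Note that SQLite checks foreign keys only after `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable referential integrity checks
conn.executescript("""
    CREATE TABLE department (code TEXT PRIMARY KEY);
    CREATE TABLE employee (
        id   INTEGER PRIMARY KEY,                         -- entity integrity
        dept TEXT NOT NULL REFERENCES department(code)    -- referential integrity
    );
    INSERT INTO department VALUES ('D1');
""")


def try_insert(sql, params):
    """Return True if the DBMS accepts the row, False if it rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("INSERT INTO employee VALUES (?, ?)", (1, "D1"))
duplicate_key = try_insert("INSERT INTO employee VALUES (?, ?)", (1, "D1"))
missing_parent = try_insert("INSERT INTO employee VALUES (?, ?)", (2, "D9"))
print(accepted, duplicate_key, missing_parent)
```

The second insert is rejected because the key already exists, and the third because it refers to a department that is not in the database; in both cases the check is done by the DBMS, not by the application procedure.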
Symbolic example:
As an example of a DATABASE we give the oldest application in this domain, namely the processing of the LIST of PARTS. This application comes from mechanical engineering and consists in determining the requirements of PARTS involved in producing an AGGREGATE.
The general schema of the Data Structure is:
where q is the needed quantity of COMPONENT to produce 1 piece of COMPOUND
An example of DATA STRUCTURES is as follows:
where : A, B, .... are AGGREGATES or PARTS
q is the NEEDED QUANTITY
Two collections of data are necessary to describe this structure:
- a PRINCIPAL collection with the elements: A, B, ....
- a LINKAGE collection with the links: (A, B, q1), (B, D, q2)
The standard procedures for Data Structure processing are:
- DETAILED EXPLOSION - the list of the TREE data structures
- CUMULATE EXPLOSION - the cumulated list of all COMPONENTS in a given COMPOUND
- IMPLOSION - the list of all COMPOUNDS for a given COMPONENT
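The three standard procedures can be sketched in Python over the two collections. The part names beyond A, B and D, the numeric quantities, and the function names are assumptions made for this illustration; the sketch also assumes that the structure contains no cycles and that implosion should return compounds at every level, not only the direct ones.

```python
from collections import defaultdict

# PRINCIPAL collection (hypothetical parts) and LINKAGE collection
# of (COMPOUND, COMPONENT, q) links with hypothetical quantities.
PARTS = ["A", "B", "C", "D"]
LINKS = [("A", "B", 2), ("A", "C", 1), ("B", "D", 3), ("C", "D", 4)]


def components_of(compound):
    """Direct components of a compound, with the quantity needed per piece."""
    return [(c, q) for (p, c, q) in LINKS if p == compound]


def detailed_explosion(compound, level=1):
    """Tree listing: (level, component, quantity per piece of its parent)."""
    rows = []
    for component, q in components_of(compound):
        rows.append((level, component, q))
        rows.extend(detailed_explosion(component, level + 1))
    return rows


def cumulated_explosion(compound, multiplier=1, totals=None):
    """Total quantity of every component needed for one piece of the compound."""
    totals = defaultdict(int) if totals is None else totals
    for component, q in components_of(compound):
        totals[component] += multiplier * q
        cumulated_explosion(component, multiplier * q, totals)
    return dict(totals)


def implosion(component):
    """All compounds, at any level, in which the component is used."""
    found = set()
    for p, c, _ in LINKS:
        if c == component:
            found.add(p)
            found.update(implosion(p))
    return sorted(found)


print(detailed_explosion("A"))
print(cumulated_explosion("A"))
print(implosion("D"))
```

With these quantities, one piece of A needs 2 B and 1 C directly, and D is reached along two paths (2 x 3 through B and 1 x 4 through C), so the cumulated requirement for D is 10.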
Concrete example:
Let us consider the structure of a computer network, described as follows:
- Computer Network
  - SERVER (1)
    - Internal Memory
      - Main Memory 128 MB
      - Cache Memory 1024
    - Ports
    - External Memory
      - Hard Disk 2GB
        - Drive 2GB
        - Disk Controller SCSI
      - CD drive
      - Floppy disk
        - 3.5 inches
        - 5.25 inches
    - Display
      - Monitor
      - Video Adapter
    - Keyboard
    - Case
    - Mouse
  - STATION (8)
    - Internal Memory
      - Main Memory 32 MB
      - Cache Memory 256 KB
    - Ports
    - External Memory
      - Hard Disk 1GB
        - Drive 1GB
        - Disk Controller IDE
      - CD drive
      - Floppy disk
        - 3.5 inches
        - 5.25 inches
    - Display
      - Monitor
      - Video Adapter
    - Keyboard
    - Case
    - Mouse
Input Data:
- Technological information about the product structure
- Commercial information about the required quantities of products
Output Data (Reports):
- the list of required articles that must be manufactured
- the list of required materials
- the list of requirement updates when the commercial requirements change