
Deductive Databases-Theory Meets Practice
Carlo Zaniolo
MCC
3500 West Balcones Center Drive
Austin, Texas 78759
USA
Abstract

Deductive Databases are coming of age with the emergence of efficient and easy-to-use
systems that support queries, reasoning, and application development on databases through
declarative logic-based languages. Building on solid theoretical foundations, the field has
benefited in recent years from dramatic advances in the enabling technology. This progress is
demonstrated by the completion of prototype systems offering such levels of generality,
performance and robustness that they support complex application development well. Valuable
know-how has emerged from the experience of building and using these systems: we have learned
about algorithms and architectures for building powerful deductive database systems, and we are
beginning to understand the programming environments and paradigms they are conducive to.
Moreover, several application areas have been identified where these systems are particularly
effective, including areas well beyond the domain of traditional database applications. Finally,
the design and deployment of deductive databases have provided new stimulus and focus for further
research into several fundamental issues. As a result, the theory of the field has made
significant progress on topics such as semantic extensions to Horn logic and algorithms for the
compilation and optimization of declarative programs. Thus, a beneficial interaction between
theory and practice remains one of the strengths of Deductive Databases as the field enters
the '90s and the age of technological maturity.

Background

Interest in the area of Deductive Databases that support logic-based languages began in the
'70s, with most of the early work focusing on establishing the theoretical foundations for the
field. An excellent review of this work, and of the beneficial impact that it had on various
disciplines of computing and on the database area in particular, is given in [GMN]. Throughout
the '70s and the first part of the '80s, concrete system implementations of these ideas were
limited to a few ground-breaking experiments [Kell]. This situation contrasted quite dramatically
with the significant system-oriented developments that were taking place at the same time in two
fields very close to deductive databases. The first field was relational databases, where systems
featuring logic-based query languages of good performance, but limited expressive power, were
becoming very successful in the commercial world. The second field was Logic Programming, where
successive generations of Prolog systems were demonstrating performance and effectiveness in a
number of symbolic applications, ranging from compiler writing to expert systems.

A renewed interest in deductive database systems came about as a result of the flare-up of
attention and publicity generated by the idea of Fifth Generation Computing. It was realized that
the rule-based reasoning of logic, combined with the capability of database systems to manage and
efficiently store and retrieve large amounts of information, could provide the basis on which to
build the next generation of knowledge base systems. As a result, several projects were started
that focused on extending Prolog systems with persistent secondary-storage management facilities
[Rash] or on coupling Prolog with relational databases [JaCV, KuYo, Li, CeGW]. Several commercial
systems are now available that support the coupling of SQL databases with Prolog or expert system
shells. In particular, the system described in [Boc, Levi] provides close integration between
Prolog and database facilities, and smart algorithms for supporting recursive queries against
the database.

Yet several other researchers were critical of the idea of using Prolog as a front-end to
relational databases. In particular, it was noted that the sequential left-to-right execution
model of Prolog was a throwback to the navigational query languages used before relational
systems. In relational systems, the user is primarily responsible for formulating correct
queries, and the system takes care of finding an efficient sequencing of joins (query conjuncts),
thus optimizing navigation through the database; a special module called the query optimizer sees
to that [Seta]. In Prolog, instead, the programmer must carefully select the order of rules and
of goals in the rules, since the correctness, efficiency and termination of the program depend on
it. A second problem follows from the fact that efficient Prolog implementations are based on an
abstract machine (the WAM) and on features (pointers) that rely on the assumption that data
resides in main memory rather than in secondary store [War]. Thus a number of research projects
opted for an approach that builds more on extensions of relational database technology than on
adaptations of Prolog technology. While several of these projects limited their interests to
extending query languages with specific constructs such as rules and recursion, projects such as
NAIL! [Meta] and LDL [Cetal, NaTs] feature declarative languages of expressive power comparable
to Prolog. This paper recounts and summarizes the author's experience in designing, developing
and deploying the LDL system.
Overview
The motivation for designing and building the LDL system was twofold:

To provide support for advanced database applications, with a focus on expert systems and knowledge-based applications.
To provide better support for traditional database applications by integrating application development and database queries into one language, thus solving the impedance mismatch problem.
A serious problem with current database applications is due to the limited power of languages
such as SQL, whereby the programmer has to write most of the application in a procedural language
with embedded calls to the query language. Since the computing paradigm of a procedural language,
such as COBOL, is so different from the set-oriented declarative computation model of a relational
language, an impedance mismatch occurs that hinders application development and can also cause
slower execution [CoMa]. Realization of this problem has motivated a whole line of database
research into new languages, commonly called database languages [BaBu]. The typical approach
taken by previous researchers in database languages consisted in building into procedural
languages constructs for accessing and manipulating databases [Sch77, Rosh]. Persistent
languages, where the database is merely seen as an extension of the programming language,
represent an extreme of this emphasis on programming languages. In a sharp departure from these
approaches, LDL focuses on the query language and extends it into a language powerful enough to
support the development of applications of arbitrary complexity. Rather than extending current
database query languages such as SQL, however, LDL builds on the formal framework of Horn clause
logic, a choice that had less to do with the well-known shortcomings of SQL than with the
influence of Prolog (a language based on Horn clause logic). In fact, we were impressed with the
fact that this rule-based language was effective for writing symbolic and expert applications,
as well as being a powerful and flexible database query language [Zan1].

A closer examination of why Horn clauses represent such a desirable rule-based query language
reveals the following reasons:

Horn clauses are akin to domain relational calculus [Ull], which offers two important advantages with respect to the tuple calculus on which languages such as SQL are based, even though the two calculi are known to be equivalent in terms of expressive power. One advantage is that domain calculus supports the expression of joins without explicit equality statements; the other is that it lends itself to the visualization of queries, both benefits vividly demonstrated by QBE [Ull].
Horn clauses support recursion and complex terms (through function symbols), thus eliminating two important limitations of relational query languages and systems.
Horn clauses have a declarative semantics based on the equivalent notions of minimal model and least fixpoint [Llo, vEKo].
Horn clauses can also be used effectively as a navigational query language.
As the last two points suggest, Horn clauses can be used effectively as either a declarative
query language or a navigational one [Zan1]. In the declarative interpretation of Horn clauses,
the order of goals in a rule is unimportant (much in the same way in which the order of conjuncts
in a relational query is immaterial). The navigational interpretation of Horn clauses follows
from the operational semantics of Prolog. Under this interpretation, goals are executed in a
left-to-right order, and the programmer is basically entrusted with the task of using this
information to write terminating and efficient programs. For instance, when the goals denote
database relations, the order defines a navigation through the database records; the programmer
must carefully select the best navigation, e.g., one that takes advantage of access structures
and limits the size of intermediate results.
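To make the declarative reading concrete, the following is a minimal Python sketch (the relation
names and sample facts are hypothetical, and this is not LDL syntax or the LDL evaluator) that
computes a recursive Horn-clause query bottom-up as a least fixpoint; because the rules are
applied to saturation, the result does not depend on the order of goals or rules.

```python
# Toy bottom-up (least-fixpoint) evaluation of the recursive Horn-clause program:
#   anc(X, Y) <- parent(X, Y).
#   anc(X, Y) <- parent(X, Z), anc(Z, Y).
# Predicate names and facts are made up for illustration only.

parent = {("ann", "bob"), ("bob", "carl"), ("carl", "dawn")}

def ancestors(parent_facts):
    """Apply the rules repeatedly until no new facts are derived (least fixpoint)."""
    anc = set(parent_facts)                      # anc(X, Y) <- parent(X, Y)
    while True:
        derived = {(x, y)
                   for (x, z) in parent_facts    # parent(X, Z)
                   for (z2, y) in anc            # anc(Z, Y)
                   if z == z2}
        if derived <= anc:                       # nothing new: fixpoint reached
            return anc
        anc |= derived

print(sorted(ancestors(parent)))
# [('ann', 'bob'), ('ann', 'carl'), ('ann', 'dawn'), ('bob', 'carl'), ...]
```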

A most critical decision in designing LDL was to follow the path of relational systems and
build on the declarative semantics, rather than on the operational interpretation of Horn
clauses. This approach was considered to be superior in terms of data independence and ease of
use. Indeed, this approach enables the user to concentrate on the meaning of programs, while the
system is entrusted with ordering goals and rules for efficient and safe execution. A further
step toward declarative semantics was taken by freeing the user from the concern of whether
forward chaining or backward chaining should be used in executing a set of rules. Current expert
system shells frequently support only one of these two strategies; when they provide for both,
they leave to the programmer the selection of the proper strategy for the problem at hand and
its encoding as part of the program. In LDL, the actual implementation is largely based on a
forward chaining strategy, which is more suitable for database applications [CetaO]. But the
compiler also has the capability of using rule rewrite methods, such as the magic set method or
the counting method [BMSU, SaZ1, SaZ2], to mimic backward chaining through a bottom-up
computation. Thus the LDL user is also automatically provided by the system with the
functionality and performance benefits of backward chaining. This substantial progress toward
declarative programming represents one of the most significant contributions to the technology
of rule-based systems brought about by the research on deductive database systems in recent
years.
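As a rough, hypothetical illustration of that idea (not the actual LDL compiler's rewriting), the
Python sketch below answers a query such as anc(ann, Y) with pure forward chaining, but first
derives a "magic" set of constants relevant to the query binding and then restricts the fixpoint
to them, thereby obtaining the goal-directed behavior normally associated with backward chaining.

```python
# Sketch of the intuition behind magic-set rewriting for the query anc(ann, Y):
# forward chaining is restricted to tuples relevant to the query binding.
# Data and names are made up for illustration only.

parent = {("ann", "bob"), ("bob", "carl"), ("eve", "fred")}

def magic_anc(parent_facts, start):
    # Step 1: "magic" set - the constants reachable from the query binding.
    magic = {start}
    changed = True
    while changed:
        changed = False
        for (x, y) in parent_facts:
            if x in magic and y not in magic:
                magic.add(y)
                changed = True
    # Step 2: ordinary bottom-up fixpoint, but only over magic bindings.
    anc = {(x, y) for (x, y) in parent_facts if x in magic}
    while True:
        new = {(x, y)
               for (x, z) in parent_facts if x in magic
               for (z2, y) in anc if z == z2} - anc
        if not new:
            return {(x, y) for (x, y) in anc if x == start}
        anc |= new

print(sorted(magic_anc(parent, "ann")))     # [('ann', 'bob'), ('ann', 'carl')]
# Tuples about "eve"/"fred" are never touched, unlike a full bottom-up evaluation.
```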

Another major area of progress for deductive databases is that of semantics. Indeed, many other
constructs beyond Horn clauses are needed in a language such as LDL to support application
development. In particular, LDL includes constructs supporting the following notions:

Negation [ApBW, Naq, Prz1]
Sets, including grouping and nested relations [BNST, ShTZ]
Updates [NaKr, KNZ]
Don't-care non-determinism
Most of these constructs (excluding set terms) are also in Prolog; they were added because they
were needed for writing actual applications. But, in Prolog, their semantics is largely based on
Prolog's operational model. Therefore, a major challenge of the LDL research was to define a
formal declarative semantics for these constructs, in a way that naturally extends the
declarative semantics of Horn clauses. The problem of extending the power of declarative logic
is in fact the second main area of recent advances promoted by research in deductive databases.
Of particular interest is the fact that many open problems in knowledge representation and
non-monotonic reasoning have been given a clearer definition, and in some cases brought close to
a solution, by these new advances [MaSu, Prz2].
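As one example of what a declarative treatment of negation can look like, the hypothetical Python
sketch below evaluates a stratified program bottom-up: the negated predicate is fully computed in
a lower stratum before any rule that negates it is applied, so the program's meaning does not
depend on evaluation order. Stratified negation is one of the semantics studied in the literature
cited above; the predicate names and data here are made up.

```python
# Illustrative sketch (not LDL syntax) of a stratified program with negation:
#   reaches(X, Y)     <- edge(X, Y).
#   reaches(X, Y)     <- edge(X, Z), reaches(Z, Y).
#   unreachable(X, Y) <- node(X), node(Y), not reaches(X, Y).
# Negation is applied only after the lower stratum (reaches) is fully computed.

edge = {("a", "b"), ("b", "c")}
node = {"a", "b", "c", "d"}

# Stratum 1: least fixpoint of the positive rules.
reaches = set(edge)
while True:
    new = {(x, y) for (x, z) in edge for (z2, y) in reaches if z == z2} - reaches
    if not new:
        break
    reaches |= new

# Stratum 2: rules with negated goals, evaluated against the completed stratum 1.
unreachable = {(x, y) for x in node for y in node if (x, y) not in reaches}

print(("a", "c") in reaches, ("d", "a") in unreachable)   # True True
```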

The combined challenge of designing a powerful and expressive language, with a declarative
semantics, and efficient techniques for its compilation and optimization describes the whole
first phase of LDL research. This began in mid-1984 and culminated in the implementation of the
first prototype at the end of 1987. This prototype compiled LDL into FAD, a relational-algebra
based language for a parallel database machine [Bor]. Rule rewriting methods, such as magic sets
and counting, were used to map recursive programs into equivalent ones that can be supported
efficiently and safely by fixpoint iterations [BMSU, SaZ1, SaZ2]. A description of this system
is given in [Cetal].

The implementation
of the first LDL prototype confirmed the viability of the new technology,
but did little to transfer this technology from the laboratory to actual users, since FAD is only
available on an expensive parallel machine. Seeking a better vehicle for technology transfer, a new
prototype system was designed with the following characteristics:

Portability
Efficiency
Open System Architecture.