
The University Database Integration:

An AutoMed Example
AutoMed Technical Report 12: Draft Version A

Peter McBrien
Dept. of Computing, Imperial College, pjm@doc.ic.ac.uk

Abstract
This report describes a contrived set of database schemas, and how they are integrated
using the AutoMed approach. It serves to illustrate several common techniques used in an
AutoMed integration of databases.

1 Introduction
AutoMed [2] supports many methodologies for performing data integration and hence forming a
network of pathways joining schemas together. Here we describe a simple methodology based
on forming union-compatible schemas, the general structure of which is illustrated in Figure 1.
Each of the n local schemas LSi is first transformed into a “union” schema USi. These n union
schemas are syntactically identical, and this is asserted by creating a sequence of id transformation
steps between each pair of union schemas USi and USi+1, of the form id(USi : c, USi+1 : c) for
each schema construct c. These id transformations between pairs of union schemas are generated
automatically by the AutoMed software. An arbitrary one of the union schemas can then be
selected for further transformation into the global schema GS. This two-stage process reflects first
schema conformance, followed by schema integration and restructuring.

                        GS                       global schema
                         ↕
US1 ↔ US2 ↔ US3 ↔ ... ↔ USi ↔ ... ↔ USn         union-compatible schemas (linked by id)
 ↕     ↕     ↕           ↕           ↕
LS1   LS2   LS3   ...   LSi   ...   LSn         local schemas

Figure 1: A general AutoMed Integration
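
As a rough illustration of how the id steps between neighbouring union schemas could be enumerated, the following Python sketch pairs each USi with USi+1 and emits one id step per construct. The function name id_steps and the string representation of constructs (with << and >> standing in for the double angle brackets) are assumptions made for this illustration only; they are not the AutoMed API.

```python
def id_steps(schema_names, constructs):
    """Return id steps linking each consecutive pair of union schemas.

    schema_names: ordered names of the union schemas, e.g. ["US1", "US2"].
    constructs:   constructs shared by every union schema (illustrative strings).
    """
    steps = []
    # zip pairs US1 with US2, US2 with US3, and so on.
    for left, right in zip(schema_names, schema_names[1:]):
        for c in constructs:
            steps.append(f"id({left}:{c}, {right}:{c})")
    return steps


union_schemas = ["US1", "US2", "US3"]
shared_constructs = ["<<dept>>", "<<dept,dname>>"]
for step in id_steps(union_schemas, shared_constructs):
    print(step)
```

For n union schemas and k constructs this produces (n - 1) * k steps, since only adjacent pairs are linked.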


LS1  dept(dname)
     staff(id,name,sex,dname)

LS2  staff(id,name,dname)
     male(id)
     female(id)

LS3  dept(deptname)
     degree(dcode,title,dname)
     person(id,dname)
     male(id)
     female(id)

LS4  dept(dname)
     student(id,sex,dname)
     degree(dcode,dname)

LS5  university(uname)
     college(cname,uname)
     dept(dname,street,cname)
     staff(id,name,sex,dname)

US   college(cname)
     dept(dname,street,cname)
     degree(dcode,title,dname)
     staff(id,name,sex,dname)
     student(id,sex,dname)

GS   college(cname)
     dept(dname,street,cname)
     degree(dcode,title,dname)
     person(id,name#,sex,dname)

Figure 2: Example schemas

2 The University Global Database Example


Figure 2 gives some specific schemas that illustrate the integration approach of Figure 1. Primary
key attributes are underlined, foreign key attributes are in italics and nullable attributes are
suffixed by #.
The schema LS1 is the simplest to integrate into US, since the constructs it contains have
exactly the same extent as the corresponding constructs in US. Hence the pathway LS1 → US in
Pathway 1 involves only extend transformations to add each of the tables student, college and degree,
plus the two attributes of dept which are present in US but not LS1, and add transformations to
assert the missing primary and foreign key constraints:

Pathway 1 LS1 → US

1 extendTable(⟨⟨student⟩⟩)
2 extendField(⟨⟨student,id⟩⟩)
3 extendField(⟨⟨student,sex⟩⟩)
4 extendField(⟨⟨student,dname⟩⟩)
5 addPK(⟨⟨student,⟨⟨student,id⟩⟩⟩⟩)
6 addFK(⟨⟨student,⟨⟨student,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
7 extendTable(⟨⟨college⟩⟩)
8 extendField(⟨⟨college,cname⟩⟩)
9 addPK(⟨⟨college,⟨⟨college,cname⟩⟩⟩⟩)
10 extendTable(⟨⟨degree⟩⟩)
11 extendField(⟨⟨degree,dcode⟩⟩)
12 extendField(⟨⟨degree,title⟩⟩)
13 extendField(⟨⟨degree,dname⟩⟩)
14 addPK(⟨⟨degree,⟨⟨degree,dcode⟩⟩⟩⟩)
15 addFK(⟨⟨degree,⟨⟨degree,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
16 extendField(⟨⟨dept,street⟩⟩)
17 extendField(⟨⟨dept,cname⟩⟩)
18 addFK(⟨⟨dept,⟨⟨dept,cname⟩⟩,college,⟨⟨college,cname⟩⟩⟩⟩)
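
The extend steps of Pathway 1 follow mechanically from comparing LS1 with US. The Python sketch below shows one way such a comparison could be carried out; it is an illustration under stated assumptions, not part of AutoMed. Schemas are encoded as dictionaries from table name to field list (taken from Figure 2), the helper name extend_steps is hypothetical, << and >> stand in for the double angle brackets, and key constraints are ignored. The approach is only appropriate when every construct already present in the local schema has the same extent as in the union schema, as is the case for LS1.

```python
# Schemas from Figure 2, encoded as {table: [fields]} (an assumed encoding).
LS1 = {"dept": ["dname"], "staff": ["id", "name", "sex", "dname"]}
US = {
    "college": ["cname"],
    "dept": ["dname", "street", "cname"],
    "degree": ["dcode", "title", "dname"],
    "staff": ["id", "name", "sex", "dname"],
    "student": ["id", "sex", "dname"],
}


def extend_steps(local, union):
    """List extend steps for tables and fields in `union` missing from `local`."""
    steps = []
    for table, fields in union.items():
        if table not in local:
            steps.append(f"extendTable(<<{table}>>)")
        for field in fields:
            # A field is missing if its table is absent or lacks that field.
            if field not in local.get(table, []):
                steps.append(f"extendField(<<{table},{field}>>)")
    return steps


for step in extend_steps(LS1, US):
    print(step)
```

The result contains three extendTable and nine extendField steps, the same extend steps as in Pathway 1 although in a different order; the addPK and addFK steps would still have to be supplied separately.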

In Pathway 2, transformations 19 – 32 use extend transformations to state that the tables
student, college and degree in US cannot be derived from LS2, and add their key constraints.
Then 33 – 36 use the dname attribute of staff to derive the dept table in US, and use extend
transformations for the two attributes street and cname that cannot be derived from LS2. Finally,
in 41 – 49 the male and female relations of LS2 are restructured into the single sex attribute of staff.

Pathway 2 LS2 → US

19 extendTable(⟨⟨student⟩⟩)
20 extendField(⟨⟨student,id⟩⟩)
21 extendField(⟨⟨student,sex⟩⟩)
22 extendField(⟨⟨student,dname⟩⟩)
23 addPK(⟨⟨student,⟨⟨student,id⟩⟩⟩⟩)
24 extendTable(⟨⟨college⟩⟩)
25 extendField(⟨⟨college,cname⟩⟩)
26 addPK(⟨⟨college,⟨⟨college,cname⟩⟩⟩⟩)
27 extendTable(⟨⟨degree⟩⟩)
28 extendField(⟨⟨degree,dcode⟩⟩)
29 extendField(⟨⟨degree,title⟩⟩)
30 extendField(⟨⟨degree,dname⟩⟩)
31 addPK(⟨⟨degree,⟨⟨degree,dcode⟩⟩⟩⟩)
32 addFK(⟨⟨degree,⟨⟨degree,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
33 addTable(⟨⟨dept⟩⟩, [x | (y, x) ← ⟨⟨staff,dname⟩⟩])
34 addField(⟨⟨dept,dname⟩⟩, [(x, x) | x ← ⟨⟨dept⟩⟩])
35 extendField(⟨⟨dept,street⟩⟩)
36 extendField(⟨⟨dept,cname⟩⟩)
37 addFK(⟨⟨degree,⟨⟨degree,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
38 addFK(⟨⟨staff,⟨⟨staff,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
39 addFK(⟨⟨student,⟨⟨student,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
40 addFK(⟨⟨dept,⟨⟨dept,cname⟩⟩,college,⟨⟨college,cname⟩⟩⟩⟩)
41 addField(⟨⟨staff,sex⟩⟩, [(x,'M') | x ← ⟨⟨male⟩⟩] ++ [(x,'F') | x ← ⟨⟨female⟩⟩])
42 deleteFK(⟨⟨male,⟨⟨male,id⟩⟩⟩⟩, ⟨⟨person,⟨⟨person,id⟩⟩⟩⟩)
43 deletePK(⟨⟨male,⟨⟨male,id⟩⟩⟩⟩)
44 deleteField(⟨⟨male,id⟩⟩, [(x, x) | x ← ⟨⟨male⟩⟩])
45 deleteTable(⟨⟨male⟩⟩, [x | (x,'M') ← ⟨⟨staff,sex⟩⟩])
46 deleteFK(⟨⟨female,⟨⟨female,id⟩⟩⟩⟩, ⟨⟨person,⟨⟨person,id⟩⟩⟩⟩)
47 deletePK(⟨⟨female,⟨⟨female,id⟩⟩⟩⟩)
48 deleteField(⟨⟨female,id⟩⟩, [(x, x) | x ← ⟨⟨female⟩⟩])
49 deleteTable(⟨⟨female⟩⟩, [x | (x,'F') ← ⟨⟨staff,sex⟩⟩])

The example uses the Intermediate Query Language (IQL), which is the default query
language supported by the AutoMed implementation. In IQL, ++ is the bag union operator and
the construct [e | Q1; ...; Qn] is a comprehension [1]. The expressions Q1 to Qn are termed
qualifiers, each qualifier being either a filter or a generator. A filter is a boolean-valued
expression. A generator has syntax p ← c where p is a pattern and c is a bag-valued expression.
In IQL, the patterns p are restricted to be single variables or tuples of variables.
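
To make the comprehension notation concrete, the following Python sketch mimics three of the queries of Pathway 2 on a small made-up instance. It is an analogy only and does not use IQL or any AutoMed interface: Python lists stand in for bags, list concatenation for ++, a for clause for a generator and an if clause for a filter. The sample data and variable names are assumptions, and the constant 'M' that appears in the pattern of step 45 is rendered here as a filter.

```python
from collections import Counter

# Made-up extents for an LS2-style database (illustrative data only).
male = ["s1", "s3"]
female = ["s2"]
staff_dname = [("s1", "Computing"), ("s2", "Maths"), ("s3", "Computing")]

# Step 41: [(x,'M') | x <- <<male>>] ++ [(x,'F') | x <- <<female>>]
staff_sex = [(x, "M") for x in male] + [(x, "F") for x in female]

# Step 45 (reverse direction): [x | (x,'M') <- <<staff,sex>>]
recovered_male = [x for (x, sex) in staff_sex if sex == "M"]

# Step 33: [x | (y, x) <- <<staff,dname>>]
# Read literally as a bag, this keeps one entry per staff member.
dept = [x for (y, x) in staff_dname]

# Bags are compared ignoring order, so Counter is used for the check.
print(Counter(recovered_male) == Counter(male))
print(Counter(dept))
```

The first check shows that the query of step 45 recovers exactly the male relation that step 41 folded into the sex attribute, which is what makes the add and delete steps reversible on this sample.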
The pathway LS3 → US contains steps 50 – 58 to add the missing student and college
tables, which mirror steps 19 – 26 apart from the foreign key added in step 55. It then renames
deptname, adds the missing attributes of dept, renames person to staff, and adds the missing name
attribute. Finally, in steps 64 – 72 it does the same restructuring as steps 41 – 49 of LS2 → US,
converting the male and female relations into the single sex attribute of staff.

Pathway 3 LS3 → US

50 extendTable(⟨⟨student⟩⟩)
51 extendField(⟨⟨student,id⟩⟩)
52 extendField(⟨⟨student,sex⟩⟩)
53 extendField(⟨⟨student,dname⟩⟩)
54 addPK(⟨⟨student,⟨⟨student,id⟩⟩⟩⟩)
55 addFK(⟨⟨student,⟨⟨student,dname⟩⟩,dept,⟨⟨dept,deptname⟩⟩⟩⟩)
56 extendTable(⟨⟨college⟩⟩)
57 extendField(⟨⟨college,cname⟩⟩)
58 addPK(⟨⟨college,⟨⟨college,cname⟩⟩⟩⟩)
59 renameField(⟨⟨dept,deptname⟩⟩, ⟨⟨dept,dname⟩⟩)
60 extendField(⟨⟨dept,street⟩⟩)
61 extendField(⟨⟨dept,cname⟩⟩)
62 renameTable(⟨⟨person⟩⟩, ⟨⟨staff⟩⟩)
63 extendField(⟨⟨staff,name⟩⟩)
64 addField(⟨⟨staff,sex⟩⟩, [(x,'M') | x ← ⟨⟨male⟩⟩] ++ [(x,'F') | x ← ⟨⟨female⟩⟩])
65 deleteFK(⟨⟨male,⟨⟨male,id⟩⟩⟩⟩, ⟨⟨person,⟨⟨person,id⟩⟩⟩⟩)
66 deletePK(⟨⟨male,⟨⟨male,id⟩⟩⟩⟩)
67 deleteField(⟨⟨male,id⟩⟩, [(x, x) | x ← ⟨⟨male⟩⟩])
68 deleteTable(⟨⟨male⟩⟩, [x | (x,'M') ← ⟨⟨staff,sex⟩⟩])
69 deleteFK(⟨⟨female,⟨⟨female,id⟩⟩⟩⟩, ⟨⟨person,⟨⟨person,id⟩⟩⟩⟩)
70 deletePK(⟨⟨female,⟨⟨female,id⟩⟩⟩⟩)
71 deleteField(⟨⟨female,id⟩⟩, [(x, x) | x ← ⟨⟨female⟩⟩])
72 deleteTable(⟨⟨female⟩⟩, [x | (x,'F') ← ⟨⟨staff,sex⟩⟩])

The pathway LS4 → US contains a sequence of extend steps for its missing information.

Pathway 4 LS4 → US

73 extendTable(⟨⟨staff⟩⟩)
74 extendField(⟨⟨staff,id⟩⟩)
75 extendField(⟨⟨staff,name⟩⟩)
76 extendField(⟨⟨staff,sex⟩⟩)
77 extendField(⟨⟨staff,dname⟩⟩)
78 addPK(⟨⟨staff,⟨⟨staff,id⟩⟩⟩⟩)
79 addFK(⟨⟨staff,⟨⟨staff,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
80 extendTable(⟨⟨college⟩⟩)
81 extendField(⟨⟨college,cname⟩⟩)
82 addPK(⟨⟨college,⟨⟨college,cname⟩⟩⟩⟩)
83 extendField(⟨⟨degree,title⟩⟩)
84 extendField(⟨⟨dept,street⟩⟩)
85 extendField(⟨⟨dept,cname⟩⟩)

The pathway LS5 → US contains a sequence of extend steps for its missing information and
also three contract steps to remove the university relation and its attributes.

Pathway 5 LS5 → US

86 renameTable(⟨⟨person⟩⟩, ⟨⟨staff⟩⟩)
87 extendTable(⟨⟨student⟩⟩)
88 extendField(⟨⟨student,id⟩⟩)
89 extendField(⟨⟨student,sex⟩⟩)
90 extendField(⟨⟨student,dname⟩⟩)
91 addPK(⟨⟨student,⟨⟨student,id⟩⟩⟩⟩)
92 addFK(⟨⟨student,⟨⟨student,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
93 extendTable(⟨⟨degree⟩⟩)
94 extendField(⟨⟨degree,dcode⟩⟩)
95 extendField(⟨⟨degree,title⟩⟩)
96 extendField(⟨⟨degree,dname⟩⟩)
97 addPK(⟨⟨degree,⟨⟨degree,dcode⟩⟩⟩⟩)
98 addFK(⟨⟨degree,⟨⟨degree,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
99 deleteFK(⟨⟨college,⟨⟨college,uname⟩⟩,university,⟨⟨university,uname⟩⟩⟩⟩)
100 contractField(⟨⟨college,uname⟩⟩)
101 deletePK(⟨⟨university,⟨⟨university,uname⟩⟩⟩⟩)
102 contractField(⟨⟨university,uname⟩⟩)
103 contractTable(⟨⟨university⟩⟩)

Finally, we list below the pathway from the union schema US to the global schema GS:

Pathway 6 US → GS

104 addTable(⟨⟨person⟩⟩, ⟨⟨staff⟩⟩ ++ ⟨⟨student⟩⟩)
105 addField(⟨⟨person,id⟩⟩, ⟨⟨staff,id⟩⟩ ++ ⟨⟨student,id⟩⟩)
106 addField(⟨⟨person,name⟩⟩, ⟨⟨staff,name⟩⟩)
107 addField(⟨⟨person,sex⟩⟩, ⟨⟨staff,sex⟩⟩ ++ ⟨⟨student,sex⟩⟩)
108 addField(⟨⟨person,dname⟩⟩, ⟨⟨staff,dname⟩⟩ ++ ⟨⟨student,dname⟩⟩)
109 addPK(⟨⟨person,⟨⟨person,id⟩⟩⟩⟩)
110 addFK(⟨⟨person,⟨⟨person,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
111 deleteFK(⟨⟨student,⟨⟨student,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
112 deletePK(⟨⟨student,⟨⟨student,id⟩⟩⟩⟩)
113 deleteField(⟨⟨student,id⟩⟩, [(x, y) | x ← ⟨⟨student⟩⟩; (x, y) ← ⟨⟨person,id⟩⟩])
114 deleteField(⟨⟨student,sex⟩⟩, [(x, y) | x ← ⟨⟨student⟩⟩; (x, y) ← ⟨⟨person,sex⟩⟩])
115 deleteField(⟨⟨student,dname⟩⟩, [(x, y) | x ← ⟨⟨student⟩⟩; (x, y) ← ⟨⟨person,dname⟩⟩])
116 deleteTable(⟨⟨student⟩⟩, [x | x ← ⟨⟨person⟩⟩; not (member ⟨⟨staff⟩⟩ x)])
117 deleteFK(⟨⟨staff,⟨⟨staff,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
118 deletePK(⟨⟨staff,⟨⟨staff,id⟩⟩⟩⟩)
119 deleteField(⟨⟨staff,id⟩⟩, [(x, y) | x ← ⟨⟨staff⟩⟩; (x, y) ← ⟨⟨person,id⟩⟩])
120 deleteField(⟨⟨staff,name⟩⟩, [(x, y) | x ← ⟨⟨staff⟩⟩; (x, y) ← ⟨⟨person,name⟩⟩])
121 deleteField(⟨⟨staff,sex⟩⟩, [(x, y) | x ← ⟨⟨staff⟩⟩; (x, y) ← ⟨⟨person,sex⟩⟩])
122 deleteField(⟨⟨staff,dname⟩⟩, [(x, y) | x ← ⟨⟨staff⟩⟩; (x, y) ← ⟨⟨person,dname⟩⟩])
123 deleteTable(⟨⟨staff⟩⟩, [x | x ← ⟨⟨person⟩⟩; (x, y) ← ⟨⟨person,name⟩⟩])
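
The queries attached to steps 116 and 123 say how student and staff can be recovered from person once the two tables have been merged. The Python sketch below replays steps 104, 106, 116 and 123 on made-up data; it is an illustration under assumptions rather than AutoMed code. Lists stand in for bags, the repeated variable x in step 123 is read as an equality join, and the sample identifiers and names are invented. The sketch further assumes that staff and student identifiers are disjoint and that every staff member has exactly one name.

```python
# Made-up US extents (illustrative data only).
staff = ["s1", "s2"]
student = ["u1", "u2"]
staff_name = [("s1", "Ann"), ("s2", "Bob")]

# Step 104: <<person>> is the bag union of <<staff>> and <<student>>.
person = staff + student

# Step 106: only staff contribute names to <<person,name>>.
person_name = list(staff_name)

# Step 123: [x | x <- <<person>>; (x, y) <- <<person,name>>]
# Staff are recovered as the persons that have a name.
recovered_staff = [x for x in person for (k, y) in person_name if k == x]

# Step 116: [x | x <- <<person>>; not (member <<staff>> x)]
# Students are recovered as the persons that are not staff.
recovered_student = [x for x in person if x not in recovered_staff]

print(recovered_staff)
print(recovered_student)
```

On this reading, a person without a name is a student, which is consistent with name being marked nullable (name#) in GS.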

3 Conclusions
See http://www.doc.ic.ac.uk/automed/ for more details of AutoMed.

References
[1] P. Buneman et al. Comprehension syntax. ACM SIGMOD Record, 23(1):87–96, 1994.
[2] P.J. McBrien and A. Poulovassilis. Data integration by bi-directional schema transformation
rules. In Proc. ICDE’03 (to appear), 2003.
