An AutoMed Example
AutoMed Technical Report 12: Draft Version A
Peter McBrien
Dept. of Computing, Imperial College, pjm@doc.ic.ac.uk
Abstract
This report describes a contrived set of database schemas and how they are integrated
using the AutoMed approach. It serves to illustrate several common techniques used in an
AutoMed integration of databases.
1 Introduction
AutoMed [2] supports many methodologies for performing data integration and hence forming a
network of pathways joining schemas together. Here we describe a simple methodology based
on forming union-compatible schemas, the general structure of which is illustrated in Figure 1.
Each of the n local schemas LSi is first transformed into a “union” schema USi. These n union
schemas are syntactically identical, and this is asserted by creating a sequence of id transformation
steps between each adjacent pair of union schemas USi and USi+1, of the form id(USi : c, USi+1 : c) for
each schema construct c. These id transformations between pairs of union schemas are generated
automatically by the AutoMed software. An arbitrary one of the union schemas can then be
selected for further transformation into the global schema GS. This two-stage process reflects first
schema conformance, followed by schema integration and restructuring.
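The generation of id steps described above can be sketched in a few lines of Python. This is only an illustration of the pattern id(USi : c, USi+1 : c), not the AutoMed software itself: the function name id_steps, the textual step format, and the ASCII scheme strings such as "<<dept>>" are all assumptions made for this sketch.

```python
def id_steps(constructs, n):
    """Return textual id steps linking each adjacent pair of n union schemas.

    `constructs` lists the schema constructs shared by every union schema;
    one id step is produced per construct and per adjacent pair (USi, USi+1).
    """
    steps = []
    for i in range(1, n):
        for c in constructs:
            steps.append(f"id(US{i}:{c}, US{i + 1}:{c})")
    return steps


# Hypothetical usage: two constructs of the union schema, three union schemas.
constructs = ["<<college>>", "<<college,cname>>"]
for step in id_steps(constructs, 3):
    print(step)
```

With n union schemas and k constructs the sketch yields (n - 1) * k steps, which is why it is convenient to have them generated automatically rather than written by hand.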
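[Figure 1: each local schema LS1 ... LSn is connected to its union-compatible schema US1 ... USn; adjacent union schemas US1, US2, US3, ..., USi, ..., USn are linked by id transformations; one union schema is connected to the global schema GS.]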
LS1: dept(dname)
     staff(id,name,sex,dname)

LS2: staff(id,name,dname)
     male(id)
     female(id)

LS3: dept(deptname)
     degree(dcode,title,dname)
     person(id,dname)
     male(id)
     female(id)

LS4: dept(dname)
     student(id,sex,dname)
     degree(dcode,dname)

LS5: university(uname)
     college(cname,uname)
     dept(dname,street,cname)
     staff(id,name,sex,dname)

US:  college(cname)
     dept(dname,street,cname)
     degree(dcode,title,dname)
     staff(id,name,sex,dname)
     student(id,sex,dname)

GS:  college(cname)
     dept(dname,street,cname)
     degree(dcode,title,dname)
     person(id,name#,sex,dname)
Pathway 1 LS1 → US
1 extendTable(⟨⟨student⟩⟩)
2 extendField(⟨⟨student,id⟩⟩)
3 extendField(⟨⟨student,sex⟩⟩)
4 extendField(⟨⟨student,dname⟩⟩)
5 addPK(⟨⟨student,⟨⟨student,id⟩⟩⟩⟩)
6 addFK(⟨⟨student,⟨⟨student,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
7 extendTable(⟨⟨college⟩⟩)
8 extendField(⟨⟨college,cname⟩⟩)
9 addPK(⟨⟨college,⟨⟨college,cname⟩⟩⟩⟩)
10 extendTable(⟨⟨degree⟩⟩)
11 extendField(⟨⟨degree,dcode⟩⟩)
12 extendField(⟨⟨degree,title⟩⟩)
13 extendField(⟨⟨degree,dname⟩⟩)
14 addPK(⟨⟨degree,⟨⟨degree,dcode⟩⟩⟩⟩)
15 addFK(⟨⟨degree,⟨⟨degree,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
16 extendField(⟨⟨dept,street⟩⟩)
17 extendField(⟨⟨dept,cname⟩⟩)
18 addFK(⟨⟨dept,⟨⟨dept,cname⟩⟩,college,⟨⟨college,cname⟩⟩⟩⟩)
Pathway 2 LS2 → US
19 extendTable(⟨⟨student⟩⟩)
20 extendField(⟨⟨student,id⟩⟩)
21 extendField(⟨⟨student,sex⟩⟩)
22 extendField(⟨⟨student,dname⟩⟩)
23 addPK(⟨⟨student,⟨⟨student,id⟩⟩⟩⟩)
24 extendTable(⟨⟨college⟩⟩)
25 extendField(⟨⟨college,cname⟩⟩)
26 addPK(⟨⟨college,⟨⟨college,cname⟩⟩⟩⟩)
27 extendTable(⟨⟨degree⟩⟩)
28 extendField(⟨⟨degree,dcode⟩⟩)
29 extendField(⟨⟨degree,title⟩⟩)
30 extendField(⟨⟨degree,dname⟩⟩)
31 addPK(⟨⟨degree,⟨⟨degree,dcode⟩⟩⟩⟩)
32 addFK(⟨⟨degree,⟨⟨degree,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
33 addTable(⟨⟨dept⟩⟩, [x | (y, x) ← ⟨⟨staff,dname⟩⟩])
34 addField(⟨⟨dept,dname⟩⟩, [(x, x) | x ← ⟨⟨dept⟩⟩])
35 extendField(⟨⟨dept,street⟩⟩)
36 extendField(⟨⟨dept,cname⟩⟩)
37 addFK(⟨⟨degree,⟨⟨degree,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
38 addFK(⟨⟨staff,⟨⟨staff,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
39 addFK(⟨⟨student,⟨⟨student,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
40 addFK(⟨⟨dept,⟨⟨dept,cname⟩⟩,college,⟨⟨college,cname⟩⟩⟩⟩)
41 addField(⟨⟨staff,sex⟩⟩, [(x, 'M') | x ← ⟨⟨male⟩⟩] ++ [(x, 'F') | x ← ⟨⟨female⟩⟩])
42 deleteFK(⟨⟨male,⟨⟨male,id⟩⟩⟩⟩, ⟨⟨person,⟨⟨person,id⟩⟩⟩⟩)
43 deletePK(⟨⟨male,⟨⟨male,id⟩⟩⟩⟩)
44 deleteField(⟨⟨male,id⟩⟩, [(x, x) | x ← ⟨⟨male⟩⟩])
45 deleteTable(⟨⟨male⟩⟩, [x | (x, 'M') ← ⟨⟨staff,sex⟩⟩])
46 deleteFK(⟨⟨female,⟨⟨female,id⟩⟩⟩⟩, ⟨⟨person,⟨⟨person,id⟩⟩⟩⟩)
47 deletePK(⟨⟨female,⟨⟨female,id⟩⟩⟩⟩)
48 deleteField(⟨⟨female,id⟩⟩, [(x, x) | x ← ⟨⟨female⟩⟩])
49 deleteTable(⟨⟨female⟩⟩, [x | (x, 'F') ← ⟨⟨staff,sex⟩⟩])
The example uses the Intermediate Query Language (IQL), which is the default query
language supported by the AutoMed implementation. In IQL, ++ is the bag union operator and
the construct [e | Q1; ...; Qn] is a comprehension [1]. The expressions Q1 to Qn are termed
qualifiers, each qualifier being either a filter or a generator. A filter is a boolean-valued
expression. A generator has the syntax p ← c, where p is a pattern and c is a bag-valued expression.
In IQL, the patterns p are restricted to be single variables or tuples of variables.
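For readers more familiar with general-purpose languages, the queries of steps 33, 41 and 45 can be approximated with Python list comprehensions. This is a rough sketch rather than IQL itself: bags are modelled as Python lists (which keep duplicates), ++ is modelled by list concatenation, the sample extents are invented for illustration, and the constant 'M' that appears inside the pattern of step 45 is expressed here as an explicit filter.

```python
# Hypothetical extents for LS2 (sample data invented for this sketch).
staff_dname = [(1, "Computing"), (2, "Maths"), (3, "Computing")]  # <<staff,dname>>
male = [1, 3]    # <<male>>
female = [2]     # <<female>>

# Step 33: [x | (y, x) <- <<staff,dname>>]
# One generator with a tuple pattern; duplicates are kept, as in a bag.
dept = [x for (y, x) in staff_dname]

# Step 41: [(x,'M') | x <- <<male>>] ++ [(x,'F') | x <- <<female>>]
# Bag union is modelled by list concatenation.
staff_sex = [(x, "M") for x in male] + [(x, "F") for x in female]

# Step 45: [x | (x,'M') <- <<staff,sex>>]
# The constant in the pattern is written as a filter after the generator.
male_again = [x for (x, s) in staff_sex if s == "M"]

print(dept)
print(staff_sex)
print(male_again)
```

In this sketch male_again reproduces the original male extent, which illustrates how the query attached to a delete step describes the removed construct in terms of what remains in the schema.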
The pathway LS3 → US contains steps 50 – 58 to add the missing student and college
tables, which, apart from the foreign key added in step 55, are textually the same as steps 19 – 26.
It then renames deptname, adds the missing attributes of dept, renames person to staff, and adds
the missing name attribute. Finally, in steps 64 – 72 it does the same restructuring as steps
41 – 49 of LS2 → US, converting the male and female relations into the single sex attribute of staff.
Pathway 3 LS3 → US
50 extendTable(⟨⟨student⟩⟩)
51 extendField(⟨⟨student,id⟩⟩)
52 extendField(⟨⟨student,sex⟩⟩)
53 extendField(⟨⟨student,dname⟩⟩)
54 addPK(⟨⟨student,⟨⟨student,id⟩⟩⟩⟩)
55 addFK(⟨⟨student,⟨⟨student,dname⟩⟩,dept,⟨⟨dept,deptname⟩⟩⟩⟩)
56 extendTable(⟨⟨college⟩⟩)
57 extendField(⟨⟨college,cname⟩⟩)
58 addPK(⟨⟨college,⟨⟨college,cname⟩⟩⟩⟩)
59 renameField(⟨⟨dept,deptname⟩⟩, ⟨⟨dept,dname⟩⟩)
60 extendField(⟨⟨dept,street⟩⟩)
61 extendField(⟨⟨dept,cname⟩⟩)
62 renameTable(⟨⟨person⟩⟩, ⟨⟨staff⟩⟩)
63 extendField(⟨⟨staff,name⟩⟩)
64 addField(⟨⟨staff,sex⟩⟩, [(x, 'M') | x ← ⟨⟨male⟩⟩] ++ [(x, 'F') | x ← ⟨⟨female⟩⟩])
65 deleteFK(⟨⟨male,⟨⟨male,id⟩⟩⟩⟩, ⟨⟨person,⟨⟨person,id⟩⟩⟩⟩)
66 deletePK(⟨⟨male,⟨⟨male,id⟩⟩⟩⟩)
67 deleteField(⟨⟨male,id⟩⟩, [(x, x) | x ← ⟨⟨male⟩⟩])
68 deleteTable(⟨⟨male⟩⟩, [x | (x, 'M') ← ⟨⟨staff,sex⟩⟩])
69 deleteFK(⟨⟨female,⟨⟨female,id⟩⟩⟩⟩, ⟨⟨person,⟨⟨person,id⟩⟩⟩⟩)
70 deletePK(⟨⟨female,⟨⟨female,id⟩⟩⟩⟩)
71 deleteField(⟨⟨female,id⟩⟩, [(x, x) | x ← ⟨⟨female⟩⟩])
72 deleteTable(⟨⟨female⟩⟩, [x | (x, 'F') ← ⟨⟨staff,sex⟩⟩])
The pathway LS4 → US contains a sequence of extend steps for its missing information.
Pathway 4 LS4 → US
73 extendTable(⟨⟨staff⟩⟩)
74 extendField(⟨⟨staff,id⟩⟩)
75 extendField(⟨⟨staff,name⟩⟩)
76 extendField(⟨⟨staff,sex⟩⟩)
77 extendField(⟨⟨staff,dname⟩⟩)
78 addPK(⟨⟨staff,⟨⟨staff,id⟩⟩⟩⟩)
79 addFK(⟨⟨staff,⟨⟨staff,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
80 extendTable(⟨⟨college⟩⟩)
81 extendField(⟨⟨college,cname⟩⟩)
82 addPK(⟨⟨college,⟨⟨college,cname⟩⟩⟩⟩)
83 extendField(⟨⟨degree,title⟩⟩)
84 extendField(⟨⟨dept,street⟩⟩)
85 extendField(⟨⟨dept,cname⟩⟩)
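The "missing information" that the extend steps supply can be computed mechanically by comparing a local schema with the union schema. The following Python sketch does this for LS4. It is an illustration only: the dictionary representation of a schema and the function name missing_constructs are assumptions of this sketch, the table and field names are those of LS4 and US in this report, and primary and foreign keys are ignored.

```python
# Tables and their fields; keys are deliberately left out of this sketch.
US = {
    "college": ["cname"],
    "dept": ["dname", "street", "cname"],
    "degree": ["dcode", "title", "dname"],
    "staff": ["id", "name", "sex", "dname"],
    "student": ["id", "sex", "dname"],
}
LS4 = {
    "dept": ["dname"],
    "student": ["id", "sex", "dname"],
    "degree": ["dcode", "dname"],
}


def missing_constructs(local, union):
    """List the constructs of `union` that are absent from `local`.

    Each entry is (table, field); field is None for a missing table.
    """
    missing = []
    for table, fields in union.items():
        if table not in local:
            missing.append((table, None))
        for field in fields:
            if field not in local.get(table, []):
                missing.append((table, field))
    return missing


for table, field in missing_constructs(LS4, US):
    if field is None:
        print(f"extendTable(<<{table}>>)")
    else:
        print(f"extendField(<<{table},{field}>>)")
```

For LS4 the sketch reports ten missing constructs, which correspond to the ten extend steps 73 – 77, 80, 81 and 83 – 85 of Pathway 4, although it prints them in the order of the US dictionary rather than in the order of the pathway.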
The pathway LS5 → US contains a sequence of extend steps for its missing information and
also three contract steps to remove the university relation and its attributes.
Pathway 5 LS5 → US
86 renameTable(⟨⟨person⟩⟩, ⟨⟨staff⟩⟩)
87 extendTable(⟨⟨student⟩⟩)
88 extendField(⟨⟨student,id⟩⟩)
89 extendField(⟨⟨student,sex⟩⟩)
90 extendField(⟨⟨student,dname⟩⟩)
91 addPK(⟨⟨student,⟨⟨student,id⟩⟩⟩⟩)
92 addFK(⟨⟨student,⟨⟨student,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
93 extendTable(⟨⟨degree⟩⟩)
94 extendField(⟨⟨degree,dcode⟩⟩)
95 extendField(⟨⟨degree,title⟩⟩)
96 extendField(⟨⟨degree,dname⟩⟩)
97 addPK(⟨⟨degree,⟨⟨degree,dcode⟩⟩⟩⟩)
98 addFK(⟨⟨degree,⟨⟨degree,dname⟩⟩,dept,⟨⟨dept,dname⟩⟩⟩⟩)
99 deleteFK(⟨⟨college,⟨⟨college,uname⟩⟩,university,⟨⟨university,uname⟩⟩⟩⟩)
100 contractField(⟨⟨college,uname⟩⟩)
101 deletePK(⟨⟨university,⟨⟨university,uname⟩⟩⟩⟩)
102 contractField(⟨⟨university,uname⟩⟩)
103 contractTable(⟨⟨university⟩⟩)
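The title of [2] refers to bi-directional transformation rules. Assuming the usual pairing of primitives (add with delete, extend with contract, and rename with a rename whose arguments are swapped), a pathway can be reversed step by step, as the Python sketch below illustrates on part of Pathway 5. The tuple representation of a step and the function names are assumptions of this sketch, not the AutoMed API, and queries attached to add and delete steps are simply carried along as arguments.

```python
# Prefix of each primitive and the prefix of its assumed reverse.
INVERSE_PREFIX = {
    "add": "delete",
    "delete": "add",
    "extend": "contract",
    "contract": "extend",
}


def reverse_step(step):
    """Return the reverse of a single (operation, arguments) step."""
    op, args = step
    if op.startswith("rename"):
        # A rename is undone by a rename with its arguments swapped.
        return (op, tuple(reversed(args)))
    for prefix, inverse in INVERSE_PREFIX.items():
        if op.startswith(prefix):
            return (inverse + op[len(prefix):], args)
    raise ValueError(f"unknown step: {op}")


def reverse_pathway(steps):
    """Reverse every step and apply them in the opposite order."""
    return [reverse_step(step) for step in reversed(steps)]


# Steps 86 and 100-103 of Pathway 5, with schemes written as ASCII strings.
pathway = [
    ("renameTable", ("<<person>>", "<<staff>>")),
    ("contractField", ("<<college,uname>>",)),
    ("deletePK", ("<<university,<<university,uname>>>>",)),
    ("contractField", ("<<university,uname>>",)),
    ("contractTable", ("<<university>>",)),
]

for op, args in reverse_pathway(pathway):
    print(f"{op}({', '.join(args)})")
```

Reversing the result a second time returns the original list of steps, which is a quick sanity check on the pairing used in the sketch.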
Finally, we list below the pathway from the union schema US to the global schema GS:
Pathway 6 US → GS
3 Conclusions
See http://www.doc.ic.ac.uk/automed/ for more details of AutoMed.
References
[1] P. Buneman et al. Comprehension syntax. ACM SIGMOD Record, 23(1):87–96, 1994.
[2] P.J. McBrien and A. Poulovassilis. Data integration by bi-directional schema transformation
rules. In Proc. ICDE’03 (to appear), 2003.