
General SQL JOIN Support In DAL

Version 1.5
Revision History
Problem
Required Functionality
DAL Core Enhancements
  1. Applications/DAOs with General JOIN
  2. com.ebay.integ.dal.JoinedDo.java
    2.1 JoinedDo Constructors & APIs
  3. com.ebay.integ.dal.map.JoinedMap
    3.1 Constructor & APIs
  4. QueryEngine APIs
  5. Table Touples/DDR
  6. ContainedFieldMapping.java & AttributeToken.java
Some Requirements for DEDE
Design Review Minutes (11/13/2007)
Testing

Revision History

Revision  Author  Date        Comment
1.0       George  11/6/2007   A draft version
1.1       GJ      11/8/2007   Added SELF-JOIN handling
1.2       GJ      11/13/2007  Added Design Review Minutes
1.3       GJ      12/06/2007  Updated after coding done
1.4       GJ      12/13/2007  Added how to pass joined hints to query/QE API
1.5       GJ      12/20/2007  Minor updates

Problem
Although the existing SQL JOIN support in DAL has been sufficient for eBay Marketplaces
so far, it does not seem adequate or flexible enough for eBox users, who may need
general SQL JOIN capabilities.
DAL currently employs a few mechanisms when multiple tables are involved in a
query. However, each of them has certain limitations:

Independent sub-objects are other Java entities that have a loose association with
the 'parent' DO/DAO. Elements from independent sub-objects are lazily loaded
(at the time the getXXX() method is called, rather than at the time the parent
DAO finder method is called).
Limitation: the JOIN is performed on the application side (in the DAO), not on the
database server side, i.e. two or more trips to the DB server are required.

Contained sub-objects are Java entities that are tightly bound to the parent
DO/DAO. SELECTs will contain a JOIN to retrieve from the contained object's
table (iff the read-set includes one or more columns from that contained sub-object).
Limitations:
1> All objects that participate in the JOIN must have read-sets that
are in sync. (That is, READSET_COMPACT must have the same integer
identity in each table involved; otherwise queries that use this read-set will
return unexpected results.)
2> The container and sub-objects have the same life cycle.

A single DO may have fields mapping to columns in multiple tables.
[This works well for me, as I used it in my first DAL code.]
Limitation: the Java entity/model differs from the database table model, which
creates more coupling among objects at the conceptual level.
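
To illustrate the first limitation (a JOIN performed on the application side), here is a
minimal sketch that only counts simulated round trips. It does not use DAL; FakeDb and all
method names are hypothetical stand-ins, and each call to a FakeDb query method is assumed
to cost one trip to the database server.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Counts simulated database round trips; every class here is hypothetical. */
final class RoundTripSketch {

    /** Stand-in for a database: each query method call costs one round trip. */
    static final class FakeDb {
        int roundTrips = 0;
        final List<int[]> employees = new ArrayList<>();   // rows of {emp_id, dept_no}
        final Map<Integer, String> locationByDeptNo = new HashMap<>();

        List<int[]> findEmployees() {
            roundTrips++;
            return employees;
        }

        String findDeptLocation(int deptNo) {
            roundTrips++;
            return locationByDeptNo.get(deptNo);
        }

        /** Server-side join: all combined rows come back in a single trip. */
        List<String> findEmployeesWithLocation() {
            roundTrips++;
            List<String> rows = new ArrayList<>();
            for (int[] e : employees) {
                rows.add(e[0] + "@" + locationByDeptNo.get(e[1]));
            }
            return rows;
        }
    }

    static FakeDb sampleDb() {
        FakeDb db = new FakeDb();
        db.employees.add(new int[] {1000, 2});
        db.employees.add(new int[] {1001, 3});
        db.locationByDeptNo.put(2, "Houston");
        db.locationByDeptNo.put(3, "LA");
        return db;
    }

    /** Application-side join: one trip for the parents, then one per lazy getXXX(). */
    static int tripsWithLazyLoading(FakeDb db) {
        int before = db.roundTrips;
        for (int[] e : db.findEmployees()) {
            db.findDeptLocation(e[1]);
        }
        return db.roundTrips - before;
    }

    /** Database-side join: a single query returns everything. */
    static int tripsWithServerSideJoin(FakeDb db) {
        int before = db.roundTrips;
        db.findEmployeesWithLocation();
        return db.roundTrips - before;
    }

    public static void main(String[] args) {
        System.out.println("lazy loading:     " + tripsWithLazyLoading(sampleDb()));
        System.out.println("server-side join: " + tripsWithServerSideJoin(sampleDb()));
    }
}
```

With N parent rows, the lazy variant costs 1 + N trips in this model, while the join costs one.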

Required Functionality
This document focuses on removing the limitations mentioned above.
The general JOIN functionality must meet the following functional requirements:
Allow a join of two or more tables located on the same data source.
Do not require synchronization of read sets among the joined objects.
Allow participating objects to be independent of each other (i.e.
do not require creating a new DO class for the JOIN).

DAL Core Enhancements


In the following sections, we assume that we have two tables, employees
and departments, with the following contents:

employees
emp_id  emp_name  emp_age  dept_no
1       Mark      28       1
99      Alex      35       2
1000    John      41       2
1001    Martin    32       3

departments
dept_no  name       mgr_id  location
1        Lab        1       Dallas
2        marketing  1       Houston
3        Support    1       LA

We also have the DAL classes <EmpDoImpl.java, EmpMap.java> and
<DeptDoImpl.java, DeptMap.java> for these two tables.
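
To make the sample data concrete, the sketch below holds the same rows in memory and joins
them on dept_no, keeping only employees with emp_id >= 1000 (the filter used by the example
query in section 1). It is a plain nested-loop join for illustration only; the class and
method names are hypothetical and DAL is not involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** In-memory stand-in for the sample employees/departments rows (illustration only). */
final class SampleJoinSketch {

    static final class Emp {
        final int empId; final String name; final int age; final int deptNo;
        Emp(int empId, String name, int age, int deptNo) {
            this.empId = empId; this.name = name; this.age = age; this.deptNo = deptNo;
        }
    }

    static final class Dept {
        final int deptNo; final String name; final int mgrId; final String location;
        Dept(int deptNo, String name, int mgrId, String location) {
            this.deptNo = deptNo; this.name = name; this.mgrId = mgrId; this.location = location;
        }
    }

    static final List<Emp> EMPLOYEES = Arrays.asList(
        new Emp(1, "Mark", 28, 1),
        new Emp(99, "Alex", 35, 2),
        new Emp(1000, "John", 41, 2),
        new Emp(1001, "Martin", 32, 3));

    static final List<Dept> DEPARTMENTS = Arrays.asList(
        new Dept(1, "Lab", 1, "Dallas"),
        new Dept(2, "marketing", 1, "Houston"),
        new Dept(3, "Support", 1, "LA"));

    /** Nested-loop inner join on dept_no, keeping employees with emp_id >= minEmpId. */
    static List<String> joinNameLocation(int minEmpId) {
        List<String> rows = new ArrayList<>();
        for (Emp e : EMPLOYEES) {
            if (e.empId < minEmpId) {
                continue;
            }
            for (Dept d : DEPARTMENTS) {
                if (e.deptNo == d.deptNo) {
                    rows.add("<" + e.name + ", " + d.location + ">");
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // prints [<John, Houston>, <Martin, LA>]
        System.out.println(joinNameLocation(1000));
    }
}
```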

1. Applications/DAOs with General JOIN

The following pseudo-code segment (in either a DAO or application code directly)
demonstrates how an application/DAO performs a JOIN after the enhancement:

/* 1> create two contributing proto DOs for the JOIN and the proto JoinedDo */
EmpDoImpl protoEmpDO = new EmpDoImpl(EmpDAO.getInstance(), EmpMap.getInstance());
DeptDoImpl protoDeptDO = new DeptDoImpl(DeptDAO.getInstance(), DeptMap.getInstance());
JoinedDo protoJoinedDO = new JoinedDo(protoEmpDO, protoDeptDO);

/* 2> define the join condition and the join map */
JoinedMap joinMap = new JoinedMap(protoJoinedDO);
TableJoin[] tableJoins = {
    new TableJoin(joinMap.getTableDefs(), "e.DEPT_ID=d.DEPT_ID(+)", null) };
joinMap.setTableJoins(tableJoins);

/* 3> create queries */
Query[] queries = {
    new SelectQuery(FIND_EMP_NAME_DEPT_LOCATION, m_ourDDRHints,
        new SelectStatement[] {
            new SelectStatement(
                BaseMap2.SET_MATCHANY,
                "SELECT /*<CALCOMMENT/>*/ <SELECTFIELDS/> " +
                "FROM <TABLES/> " +
                "WHERE (<JOIN/>) ") }),
    new SelectQuery(JOIN_WITH_BINDING, m_ourDDRHints,
        new SelectStatement[] {
            new SelectStatement(
                BaseMap2.SET_MATCHANY,
                "SELECT /*<CALCOMMENT/>*/ <SELECTFIELDS/> " +
                "FROM <TABLES/> " +
                "WHERE e.id > :emp1.m_id and d.dept_id = :dept2.m_deptid and (<JOIN/>) ") })
};
joinMap.setQueries(queries);

/* 4> create field mappings: each attr name must be unique. If it is used for binding,
      the name must match the one on the binding line: for ":emp1.m_id" in the
      where clause, we have to use "emp1" here */
FieldMapping[] fieldMappings = {
    new ContainedFieldMapping("emp1", EmpMap.getInstance(), EmpDoImpl.class, protoJoinedDO),
    new ContainedFieldMapping("dept2", DeptMap.getInstance(), DeptDoImpl.class, protoJoinedDO) };
joinMap.setFieldMappings(fieldMappings);

/* 5> set the read set id for each involved DO/table
 * <set1, set2> can be any of the available combinations.
 * It is not necessary that set1 == set2.
 * The set id determines which fields are used in the query/SQL.
 * DeptMap.READSET_GET_LOCATION_ONLY: only location is requested.
 */
HashMap<Class, Integer> set1Map = new HashMap<Class, Integer>();
set1Map.put(DeptDoImpl.class, DeptMap.READSET_GET_LOCATION_ONLY);
set1Map.put(EmpDoImpl.class, EmpMap.READSET_GET_NAME_FIELDS);
joinMap.addReadSetId(FIND_EMP_NAME_DEPT_LOCATION_SET_ID, set1Map);

/* 6> create a container to hold the result set */
List<JoinedDo> results = new ArrayList<JoinedDo>();

/* 7> delegate the request to QE. QE will figure out that the query is:
      SELECT emp.emp_name, dept.location
      FROM employees emp, depts dept
      WHERE emp.emp_id >= 1000 and emp.dept_no=dept.dept_no
*/
qe.readJoinMultiple(results, joinMap, protoJoinedDO,
    FIND_EMP_NAME_DEPT_LOCATION, FIND_EMP_NAME_DEPT_LOCATION_SET_ID);

/* 8> get the result DOs.
      The JDBC result set corresponding to the SQL statement above
      should contain 2 tuples:
          <John, Houston>, <Martin, LA>
      The corresponding output from the Java code here:
          emp=(John), dept=(Houston);
      and
          emp=(Martin), dept=(LA);
*/
for (JoinedDo jd : results) {
    EmpDoImpl emp = (EmpDoImpl) jd.getDO(EmpDoImpl.class);
    DeptDoImpl dept = (DeptDoImpl) jd.getDO(DeptDoImpl.class);
    emp.display();  // print out the loaded fields (id, name, age, dept_no)
    dept.display(); // print out the location
}

The tables above have a 1-to-many relationship. However, the tables/DOs involved
in a JOIN can be totally independent. In other words,
QueryEngine does not use or care about the relationship among the DOs.
See the unit tests in KernelDALTests/src/com/ebay/integ/dal/generaljointests/joinbinding/
for details.

2. com.ebay.integ.dal.JoinedDo.java
For tables A and B (or even tables C and D) and their corresponding DOs
DO0, DO1, ... (subclasses of BaseDo2), we introduce a new class called JoinedDo
(extending BaseDo2) to loosely hold DO0, DO1, ..., as shown in the example
above.
A protoDO of type JoinedDo passed to the QE APIs must be a validated one (see the
constructors below). A validated JoinedDo object has the following properties:
a> The participating DOs (DO0, DO1, ...) must NOT be instances of JoinedDo.
b> Each participating DO must be of a different class.
c> Each participating DO must have a non-null map object.
The general join reuses the existing concept of a contained DO. In addition to
eliminating the set id sync requirement, the general join does NOT require you to
define a class to contain a few sub-DOs; instead, you just instantiate an
object of the class JoinedDo as discussed below.
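
The three validation rules above can be sketched as follows. This is not the DAL
implementation: SimpleDo, EmpDo, DeptDo and JoinedHolder are hypothetical stand-ins for
BaseDo2, the participating DOs and JoinedDo, and the choice of IllegalArgumentException
is an assumption.

```java
import java.util.HashSet;
import java.util.Set;

/** Sketch of the three validation rules; all types are hypothetical stand-ins. */
final class JoinedDoValidationSketch {

    /** Stand-in for BaseDo2: a data object that may carry a map object. */
    static class SimpleDo {
        final Object map;
        SimpleDo(Object map) { this.map = map; }
    }

    static final class EmpDo extends SimpleDo { EmpDo(Object map) { super(map); } }
    static final class DeptDo extends SimpleDo { DeptDo(Object map) { super(map); } }

    /** Stand-in for JoinedDo itself, needed to check rule (a). */
    static final class JoinedHolder extends SimpleDo { JoinedHolder() { super(new Object()); } }

    /** Throws IllegalArgumentException on the first violated rule. */
    static void validate(SimpleDo... participants) {
        if (participants.length < 2) {
            throw new IllegalArgumentException("a JOIN needs two or more DOs");
        }
        Set<Class<?>> seen = new HashSet<>();
        for (SimpleDo d : participants) {
            if (d == null) {
                throw new IllegalArgumentException("null participant");
            }
            if (d instanceof JoinedHolder) {
                throw new IllegalArgumentException("(a) participant is itself a joined DO");
            }
            if (!seen.add(d.getClass())) {
                throw new IllegalArgumentException("(b) duplicate class " + d.getClass());
            }
            if (d.map == null) {
                throw new IllegalArgumentException("(c) participant has no map");
            }
        }
    }

    /** Convenience wrapper that reports the outcome as a boolean. */
    static boolean isValid(SimpleDo... participants) {
        try {
            validate(participants);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Object map = new Object();
        System.out.println(isValid(new EmpDo(map), new DeptDo(map)));
        System.out.println(isValid(new EmpDo(map), new EmpDo(map)));
    }
}
```

Note that rule (b) is what rules out a self-join such as "A join A" in this model.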

2.1 JoinedDo Constructors & APIs

Here are some of JoinedDo's constructors:
/* JOIN involves 2 or more DOs, validated internally */
JoinedDo(BaseDo2 DO0, BaseDo2 DO1, BaseDo2... args);
/* JOIN with a predefined DO list, validated internally */
JoinedDo(List<BaseDo2> DOList);
/* Used by QE internally only */
JoinedDo(List<BaseDo2> DOList, Boolean validating);

When table A JOINs B, the result set contains zero or more JoinedDo objects.
To get an individual DO from a result JoinedDo object jd, call:
DO = (DOClass) jd.getDO(Class DOClass);
DO = (DOClass) jd.getDO(int DOIndex);
For example:
EmpDoImpl emp = (EmpDoImpl) jd.getDO(EmpDoImpl.class);
or, by index:
EmpDoImpl emp = (EmpDoImpl) jd.getDO(0);
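
As a rough illustration of the two lookup styles, here is a minimal holder that supports
both lookup by class and lookup by index. JoinedHolderSketch is a hypothetical class, not
JoinedDo; it assumes that the index follows constructor order and that each participant
has a distinct class. The generic signature is a convenience of this sketch and differs
from the cast-based calls shown above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical holder that loosely contains the participating objects. */
final class JoinedHolderSketch {
    private final List<Object> parts;

    JoinedHolderSketch(Object... parts) {
        this.parts = new ArrayList<>(Arrays.asList(parts));
    }

    /** Lookup by position, in constructor order (an assumption of this sketch). */
    Object getDO(int index) {
        return parts.get(index);
    }

    /** Lookup by exact class; returns null if no participant has that class. */
    <T> T getDO(Class<T> type) {
        for (Object p : parts) {
            if (p.getClass() == type) {
                return type.cast(p);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // String and Integer stand in for two different DO classes.
        JoinedHolderSketch jd = new JoinedHolderSketch("emp", 7);
        String byClass = jd.getDO(String.class);
        Object byIndex = jd.getDO(1);
        System.out.println(byClass + " " + byIndex);
    }
}
```

Because every participant has a different class, the lookup by class is unambiguous.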

3. com.ebay.integ.dal.map.JoinedMap
QueryEngine needs a map object for a protoDO to figure out the <SELECTFIELDS/>,
<TABLES/> and SQL statement (parsed tokens), hints/TouplProvider/table touples,
and even CAL logging.
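
To give a feel for how placeholders such as <SELECTFIELDS/>, <TABLES/> and <JOIN/> turn
into SQL, here is a minimal sketch based on plain string substitution. It is an assumption
about the general idea only: the real QueryEngine works with parsed tokens and derives the
values from the map, and the class name, method name and substituted values below are
hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Naive placeholder expansion; the real QE uses parsed tokens instead. */
final class SqlTemplateSketch {

    /** Replaces each "<NAME/>" placeholder with its value; unknown ones are left as-is. */
    static String expand(String template, Map<String, String> values) {
        String sql = template;
        for (Map.Entry<String, String> e : values.entrySet()) {
            sql = sql.replace("<" + e.getKey() + "/>", e.getValue());
        }
        return sql;
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("SELECTFIELDS", "e.emp_name, d.location"); // hypothetical read-set columns
        values.put("TABLES", "employees e, departments d");   // hypothetical table aliases
        values.put("JOIN", "e.dept_no=d.dept_no");            // hypothetical join condition
        System.out.println(
            expand("SELECT <SELECTFIELDS/> FROM <TABLES/> WHERE (<JOIN/>)", values));
    }
}
```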

3.1 Constructor & APIs

The main advantage of having the DAO/application create a map object for a JOIN is
that a map can define multiple queries/SQL statements and table join objects,
and the map can be initialized once and reused by many different queries thereafter.
The class JoinedMap is introduced to ease the DAO/application's burden. A DAO just
needs to create a JoinedMap object for each proto JoinedDo object.
Here are some major APIs:

public JoinedMap(JoinedDo joinedDo)

/* create a set id which corresponds to a list of set ids,
 * so QE does not require the set ids to be in sync across all the DOs
 */
public boolean addReadSetId(int id, List<Integer> ids)
public boolean addReadSetId(int id, HashMap<Class, Integer> classToSetIdMap)
See the section "Some Requirements for DEDE" for how the map should look.
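
The idea behind addReadSetId, one composite set id that stands for a separate read set id
per participating DO class, can be sketched as a small registry. ReadSetRegistrySketch and
readSetIdFor are hypothetical names, and the meaning of the boolean return value (false
when the id is already registered) is an assumption, since the document does not define it.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry mapping a join-level set id to per-class read set ids. */
final class ReadSetRegistrySketch {
    private final Map<Integer, Map<Class<?>, Integer>> byJoinSetId = new HashMap<>();

    /** Registers a composite set id; returns false if the id is already taken (assumed). */
    boolean addReadSetId(int id, Map<Class<?>, Integer> classToSetId) {
        if (byJoinSetId.containsKey(id)) {
            return false;
        }
        byJoinSetId.put(id, new HashMap<>(classToSetId));
        return true;
    }

    /** Resolves the read set id of one participating class, or null if unknown. */
    Integer readSetIdFor(int joinSetId, Class<?> doClass) {
        Map<Class<?>, Integer> perClass = byJoinSetId.get(joinSetId);
        return perClass == null ? null : perClass.get(doClass);
    }

    public static void main(String[] args) {
        // String and Integer stand in for two DO classes with unrelated set ids.
        Map<Class<?>, Integer> setIds = new HashMap<>();
        setIds.put(String.class, 3);
        setIds.put(Integer.class, 7);
        ReadSetRegistrySketch registry = new ReadSetRegistrySketch();
        registry.addReadSetId(100, setIds);
        System.out.println(registry.readSetIdFor(100, String.class));
        System.out.println(registry.readSetIdFor(100, Integer.class));
    }
}
```

Since each class resolves its own id, the per-DO read set ids never need to be equal.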

4. QueryEngine APIs
Here are a few possible new (overloaded) QE APIs; the list is not exhaustive.
/* It is not clear whether any use case needs to read the first record
   from a JOIN, but the API is listed here.
*/
public JoinedDo readJoinSingle(
    JoinedMap map,
    JoinedDo protoJoinedDO,
    String queryName, // search for the query in the map
    int readSetId,
    MappingIncludesAttribute[][] overrideQueryDefaultDDRHints);
/* likely used by most applications */
public void readJoinMultiple(
    List<JoinedDo> results,
    JoinedMap map,
    JoinedDo protoJoinedDO,
    String queryName,
    int readSetId) throws FinderException

5. Table Touples/DDR
For a protoJoinedDO, QueryEngine figures out the table touples for each participating
DO separately, using each DO's own hints & touple provider. For each protoDO, QE
throws a RuntimeException if it detects that the data source differs among the table
touples at each position/try (see DDRToupleSetManager.verifyDataSources()), to
guarantee that all physical tables involved in the JOIN are on the same db host for each try.
Application developers should make sure that the touple providers & hints
of the participating DOs generate the same number of table touples for each DO
at run time AND the same data source for each touple across the DOs in the same
try. For example, if tables Emp and Dept each have two table touples at run time:
Emp: <emp1, datasource1> and <emp2, datasource2>
Dept: <dept, datasource1> and <dept, datasource2>
then QE will be fine.

But if they have the following touples:
Emp: <emp1, datasource1> and <emp2, datasource2>
Dept: <dept, datasource3> and <dept, datasource2>
then QE will throw an exception because datasource3 != datasource1 for the first try.
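
The check described above can be sketched as a comparison of data source names position
by position. This is not the code of DDRToupleSetManager.verifyDataSources(); the class
and method names are hypothetical, and each DO is reduced to a list of data source names,
one per try.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical per-try data source check across participating DOs. */
final class DataSourceCheckSketch {

    /**
     * dataSourcesPerDo.get(i).get(t) is the data source name of DO i at try t.
     * Throws IllegalStateException (a RuntimeException) on any mismatch.
     */
    static void checkSameDataSourcePerTry(List<List<String>> dataSourcesPerDo) {
        if (dataSourcesPerDo.isEmpty()) {
            return;
        }
        List<String> first = dataSourcesPerDo.get(0);
        for (List<String> other : dataSourcesPerDo) {
            if (other.size() != first.size()) {
                throw new IllegalStateException("DOs have different numbers of table touples");
            }
            for (int t = 0; t < first.size(); t++) {
                if (!first.get(t).equals(other.get(t))) {
                    throw new IllegalStateException(
                        other.get(t) + " != " + first.get(t) + " at try " + t);
                }
            }
        }
    }

    /** Convenience wrapper that reports the outcome as a boolean. */
    static boolean isConsistent(List<List<String>> dataSourcesPerDo) {
        try {
            checkSameDataSourcePerTry(dataSourcesPerDo);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> emp = Arrays.asList("datasource1", "datasource2");
        List<String> deptOk = Arrays.asList("datasource1", "datasource2");
        List<String> deptBad = Arrays.asList("datasource3", "datasource2");
        System.out.println(isConsistent(Arrays.asList(emp, deptOk)));
        System.out.println(isConsistent(Arrays.asList(emp, deptBad)));
    }
}
```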

6. ContainedFieldMapping.java & AttributeToken.java

ContainedFieldMapping is enhanced with a new constructor as follows:
/**
 * This constructor helps create the attrIndex for each DO participating in the JOIN.
 * Developers do not have to worry about the correlation. The attrIndex is used
 * for binding. See AttributeToken.java.
 *
 * @param attrName the name for the participating DO. The value is not significant if
 *                 binding is not needed, but it must be unique among the attributes.
 * @param mapClass the participating DO's map
 * @param doClass the participating DO class
 * @param jd the JoinedDo object which contains the participating DO specified by doClass
 * @throws IllegalArgumentException
 */
public ContainedFieldMapping(String attrName, BaseMap2 mapClass, Class doClass, JoinedDo jd)
    throws IllegalArgumentException

The class AttributeToken.java is also enhanced to support
loading a JoinedDo by loading each participating DO into a
list.
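
To show why the attribute name matters for binding, here is a minimal sketch that extracts
bind variables such as ":emp1.m_id" from a where clause and resolves the attribute name to
a position among the declared field mappings. It is not how AttributeToken is implemented;
the class and method names are hypothetical, and treating the attrIndex as the position of
the name in the mapping list is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for ":attrName.fieldName" bind variables. */
final class BindTokenSketch {
    private static final Pattern BIND = Pattern.compile(":(\\w+)\\.(\\w+)");

    /** Returns {attrName, fieldName} pairs in order of appearance. */
    static List<String[]> parse(String whereClause) {
        List<String[]> out = new ArrayList<>();
        Matcher m = BIND.matcher(whereClause);
        while (m.find()) {
            out.add(new String[] { m.group(1), m.group(2) });
        }
        return out;
    }

    /** Position of attrName among the declared mapping names, or -1 if it is not declared. */
    static int attrIndex(List<String> mappingNames, String attrName) {
        return mappingNames.indexOf(attrName);
    }

    public static void main(String[] args) {
        List<String> mappingNames = Arrays.asList("emp1", "dept2");
        String where = "WHERE e.id > :emp1.m_id and d.dept_id = :dept2.m_deptid and (<JOIN/>)";
        for (String[] bind : parse(where)) {
            System.out.println(bind[0] + "." + bind[1] + " -> DO #" + attrIndex(mappingNames, bind[0]));
        }
    }
}
```

A bind variable whose attribute name is not among the mapping names resolves to -1 here,
which is why the names on the binding line and in the field mappings must match.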

Some Requirements for DEDE

As mentioned in the section on QE's new APIs, the application/DAO can create a map object
(of class JoinedMap) corresponding to a JoinedDo type in a DEDE-generated DAO
constructor.
This generated map should have:
1> Set id definitions. [Each participating DO should have a readSetId.]
Predefine read sets by assigning a set id to a
tuple of set ids. For example:
GET_NAME_DEPT_SET (emp.NAME_SET, dept.LOCATION_SET)

2> Field mappings.
Each participating DO should have a contained field mapping. The attribute
names, such as emp1 and dept2, should be unique.
As in the pseudo-code above, DEDE may help to generate the following lines
in the DAO:
FieldMapping[] fieldMappings = {
    new ContainedFieldMapping("emp1", EmpMap.getInstance(), EmpDoImpl.class, jd),
    new ContainedFieldMapping("dept2", DeptMap.getInstance(), DeptDoImpl.class, jd)
};
joinMap.setFieldMappings(fieldMappings);

3> Queries and DDR hints.
DEDE should be able to create the joined hints for a JOIN. The hints are passed
to each participating DO's touple provider. These hints can be specified at the query
level or in QE's API calls.
[See the examples in the generaljointests directory for details.]
DEDE can assign TableJoin objects to the generated JoinedMap object.

Design Review Minutes (11/13/2007)

1. Always require a map for the JOIN (drop support for QE creating a map on the fly).
2. DEDE: The map provided by the DAO should map JOIN_SET_IDs to tuples, for example:
READSET_GET_EMP_NAME_DEPT_LOCATION = (Emp.READSET_GET_NAME,
DeptMap.READSET_GET_LOCATION)
instead of allowing protoDOs to hold the set ids.
3. Provide JoinedDo.getDO(int index) [besides JoinedDo.getDO(DOClass)].
4. Do not support "A join A join B" for the time being.
5. It would be nice to allow a participating DO to have multiple tables (supporting existing DOs
that may have 2 or more tabledefs), though from now on we should encourage using one DO to
represent one table because we have the general join support.

Testing
Unit test code for this feature is in
KernelDALTests/src/com/ebay/integ/dal/generaljointests
This directory contains a few subdirectories, each focusing on
a certain category of testing. See README.txt there for details.
