juliandyke.com
Agenda
Introduction
Tests
  Indexes
  Number of columns processed
  SELECT FOR UPDATE
  Number of rows processed
  COMMIT
  Batch size
  Global temporary tables
  External tables
Conclusion

Redo Records
[Figure: layout of Redo Record 1 and Redo Record 2, each consisting of a header and a body; spare space is labelled as wastage]

Change Vectors
[Figure: a redo record made up of a header followed by Change Vector 1, Change Vector 2 and Change Vector 3; each change vector has its own header and body]

Example
Examples in this presentation are taken from a Formula 1 database
  Contains full details of all races from 1961 to 2004
  Updated annually in November (end of season)
  Currently 20 cars per race
  19 races per season
  Approximately 360 new rows per season
juliandyke.net

Schema
[Figure: schema diagram of the tables CIRCUIT, COUNTRY, SEASON, RACE, DRIVER, TEAM, ENGINE, GRANDPRIX, CAR and CLASSIFICATION]

Cars
Each season has up to 18 races (19 in 2005)
Each race has up to 39 entrants (13 races in 1989)
Each car has
  driver, team and engine
  laps completed (may be zero)
  optional notes
Results are classified as follows
  C    Classified
  DNF  Did not finish
  DNS
  DNQ
  DIS

Points
But ... not always straightforward
  Half points awarded for incomplete races
  Split races (two half-point races aggregated)
  Driver and/or team disqualifications, e.g. Tyrrell in 1984
  Up to 1980 only best scores counted for each half of season,
    e.g. best 5 results from first 7 races and best 5 results from last 7 races
  1982-1990 only best 11 results counted for drivers
  1961-1978 only first car to finish counted for each team

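The "best results from each half of the season" rule is easier to follow with numbers. The sketch below is illustrative only: the function names and the 14-race points list are hypothetical, and it assumes a season split as 7 + 7 races with the best 5 results counted from each half, as in the example on the slide.

```python
def best_n_total(scores, n):
    """Sum the n highest scores (all of them if fewer than n exist)."""
    return sum(sorted(scores, reverse=True)[:n])


def split_season_total(scores, first_half_races=7, best_per_half=5):
    """Apply a 'best results from each half of the season' rule."""
    first_half = scores[:first_half_races]
    second_half = scores[first_half_races:]
    return (best_n_total(first_half, best_per_half)
            + best_n_total(second_half, best_per_half))


# Hypothetical points scored by one driver over a 14-race season.
season = [9, 6, 0, 4, 9, 3, 2, 9, 0, 0, 6, 4, 1, 9]

raw_total = sum(season)                     # every result counted
counted_total = split_season_total(season)  # best 5 of first 7 + best 5 of last 7
print(raw_total, counted_total)
```

The two totals differ whenever a driver scores in more than five races of either half, which is why the points stored per row cannot always be derived from finishing position alone.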
car.csv
Comma separated file
16181 rows
Fields are: season_key, race_key, position, driver_key, team_key, engine_key, laps_completed, classification_key, notes (optional)

season_key  race_key  position  driver_key  team_key  engine_key  laps_completed
2004        17        17        ZBAU        MIN       FOR         41
2004        17        18        DCOU        MCL       MER         38
2004        17        19        RBAR        FER       FER         38
2004        17        20        MWEB        JAG       FOR         20
2004        18        1         JMON        WIL       BMW         71
2004        18        2         KRAI        MCL       MER         71
2004        18        3         RBAR        FER       FER         71
2004        18        4         FALO        REN       REN         71
2004        18        5         RSCH        WIL       BMW         71
2004        18        6         TSAT        BAR       HON         71
2004        18        7         MSCH        FER       FER         71
2004        18        8         FMAS        SAU       FER         71
2004        18        9         GFIS        SAU       FER         71
2004        18        10        JVIL        REN       REN         70
2004        18        11        DCOU        MCL       MER         70
2004        18        12        JTRU        TOY       TOY         70
2004        18        13        RZON        TOY       TOY         70
2004        18        14        CKLI        JAG       FOR         69
2004        18        15        TGLO        JOR       FOR         69
2004        18        16        ZBAU        MIN       FOR         67
2004        18        17        GBRU        MIN       FOR         67
2004        18        18        MWEB        JAG       FOR         23
2004        18        19        NHEI        JOR       FOR         15
2004        18        20        JBUT        BAR       HON         3

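Because the notes field is optional, a loader has to cope with lines that carry eight fields as well as nine. The following is a minimal sketch of one way to parse a line of car.csv; the function name `parse_car_line` is hypothetical, and the classification code and note text in the sample lines are assumed values, since the extract above does not show those columns.

```python
import csv
from io import StringIO

FIELDS = [
    "season_key", "race_key", "position", "driver_key", "team_key",
    "engine_key", "laps_completed", "classification_key", "notes",
]
NUMERIC_FIELDS = ("season_key", "race_key", "position", "laps_completed")


def parse_car_line(line):
    """Turn one comma separated line into a dict keyed by field name."""
    values = next(csv.reader(StringIO(line)))
    # Pad with None when the optional trailing notes field is absent.
    values += [None] * (len(FIELDS) - len(values))
    row = dict(zip(FIELDS, values))
    for name in NUMERIC_FIELDS:
        row[name] = int(row[name])
    # Treat an empty notes field the same as a missing one.
    row["notes"] = row["notes"] or None
    return row


# Sample lines: the trailing "C", "DNF" and "Engine" values are assumptions.
without_notes = parse_car_line("2004,18,1,JMON,WIL,BMW,71,C")
with_notes = parse_car_line("2004,18,20,JBUT,BAR,HON,3,DNF,Engine")
print(without_notes["notes"], with_notes["notes"])
```

Mapping a missing note to `None` lets the value be bound as NULL when the row is inserted.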
points.csv
Comma separated file
16181 rows
Fields are: season_key, race_key, position, driver_points, team_points

season_key  race_key  position  driver_points  team_points
2004        17        17        0              0
2004        17        18        0              0
2004        17        19        0              0
2004        17        20        0              0
2004        18        1         10             10
2004        18        2         8              8
2004        18        3         6              6
2004        18        4         5              5
2004        18        5         4              4
2004        18        6         3              3
2004        18        7         2              2
2004        18        8         1              1
2004        18        9         0              0
2004        18        10        0              0
2004        18        11        0              0
2004        18        12        0              0
2004        18        13        0              0
2004        18        14        0              0
2004        18        15        0              0
2004        18        16        0              0
2004        18        17        0              0
2004        18        18        0              0
2004        18        19        0              0
2004        18        20        0              0

CAR table
CREATE TABLE car
(
  season_key          NUMBER         NOT NULL,
  race_key            NUMBER         NOT NULL,
  position            NUMBER         NOT NULL,
  driver_key          VARCHAR2(4)    NOT NULL,
  team_key            VARCHAR2(3)    NOT NULL,
  engine_key          VARCHAR2(3)    NOT NULL,
  laps_completed      NUMBER         NOT NULL,
  classification_key  VARCHAR2(4)    NOT NULL,
  notes               VARCHAR2(100),
  driver_points       NUMBER         DEFAULT 0,
  team_points         NUMBER         DEFAULT 0
);
ALTER TABLE car ADD CONSTRAINT car_pk PRIMARY KEY (season_key,race_key,position);
CREATE INDEX car_driver ON car (season_key,driver_key,driver_points);

Baseline - Insert
For each line in car.csv
{
  read :season_key, :race_key, :position, :driver_key, :team_key,
       :engine_key, :laps_completed, :classification_key, :notes;
  INSERT INTO car
    (season_key, race_key, position, driver_key, team_key,
     engine_key, laps_completed, classification_key, notes)
  VALUES
    (:season_key, :race_key, :position, :driver_key, :team_key,
     :engine_key, :laps_completed, :classification_key, :notes);
  COMMIT;
}

Baseline - Insert
Redo Generation for each Insert Statement
Statement  Change  Opcode
Header             5.2
INSERT     Undo    5.1 (11.1)
INSERT     Redo    11.2
INSERT     Undo    5.1 (10.22)
INSERT     Redo    10.2
INSERT     Undo    5.1 (10.22)
INSERT     Redo    10.2
COMMIT     Commit  5.4

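The change vectors listed for a single-row insert follow a regular pattern: one undo and one redo vector for the table row, one undo and one redo vector for each index, plus a transaction header and a commit vector when the row is committed on its own. The sketch below simply counts vectors under that pattern. The function name is hypothetical, and the model deliberately ignores the variable overheads (undo segment management, recursive statements, block cleanouts), so it is a rough guide rather than a prediction of redo size.

```python
def change_vectors_per_insert(num_indexes, own_transaction=True):
    """Count the change vectors for one single-row INSERT.

    own_transaction=True models a COMMIT after every row, which adds a
    transaction header vector and a commit vector to each insert.
    """
    vectors = 2                 # undo + redo for the table row
    vectors += 2 * num_indexes  # undo + redo for each index entry
    if own_transaction:
        vectors += 2            # transaction header + commit
    return vectors


baseline = change_vectors_per_insert(num_indexes=2)   # CAR_PK and CAR_DRIVER
one_index = change_vectors_per_insert(num_indexes=1)  # CAR_PK only
print(baseline, one_index)
```

Under this model each index on the table adds two vectors to every insert, and committing every row adds another two.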
Baseline - Update
For each line in points.csv
{
  read :season_key, :race_key, :position, :driver_points, :team_points;
  SELECT driver_key, team_key, engine_key,
         laps_completed, classification_key, notes
  INTO :driver_key, :team_key, :engine_key,
       :laps_completed, :classification_key, :notes
  FROM car
  WHERE season_key = :season_key
  AND race_key = :race_key
  AND position = :position
  FOR UPDATE;
  UPDATE car
  SET driver_key = :driver_key, team_key = :team_key,
      engine_key = :engine_key, laps_completed = :laps_completed,
      classification_key = :classification_key, notes = :notes,
      driver_points = :driver_points, team_points = :team_points
  WHERE season_key = :season_key
  AND race_key = :race_key
  AND position = :position;
  COMMIT;
}

Baseline - Update
Redo Generation for each Update Statement
Statement          Change  Opcode       Description
Header                     5.2          Start Transaction
SELECT FOR UPDATE  Undo    5.1 (11.1)
SELECT FOR UPDATE  Redo    11.4         Lock row in CAR table
UPDATE             Undo    5.1 (11.1)
UPDATE             Redo    11.5         Update row in CAR table
UPDATE             Undo    5.1 (10.22)
UPDATE             Redo    10.4
UPDATE             Undo    5.1 (10.22)
UPDATE             Redo    10.2
COMMIT             Commit  5.4          End Transaction

Baseline - Results
Note
  Amount of redo generated by both INSERT and UPDATE can be variable due to
    Undo segment management
    Recursive DDL statements e.g. extent allocation
    Block cleanouts

Test 1
Check for unused indexes
  CAR_PK
    indexes columns SEASON_KEY, RACE_KEY, POSITION
    supports primary key, therefore mandatory
  CAR_DRIVER
    indexes columns SEASON_KEY, DRIVER_KEY, DRIVER_POINTS
    no longer required by current version of application
DROP INDEX car_driver;

Test 1 - Insert
Redo Generation for each Insert Statement
Statement  Change  Opcode       Description
Header             5.2          Start Transaction
INSERT     Undo    5.1 (11.1)   Undo insert row in CAR table
INSERT     Redo    11.2         Insert row in CAR table
INSERT     Undo    5.1 (10.22)  Undo insert row into CAR_PK index
INSERT     Redo    10.2         Insert row into CAR_PK index
INSERT     Undo    5.1 (10.22)  Undo insert row into CAR_DRIVER index
INSERT     Redo    10.2         Insert row into CAR_DRIVER index
COMMIT     Commit  5.4          End Transaction

Test 1 - Update
Redo Generation for each Update Statement
Statement          Change  Opcode       Description
Header                     5.2          Start Transaction
SELECT FOR UPDATE  Undo    5.1 (11.1)   Undo lock row in CAR table
SELECT FOR UPDATE  Redo    11.4         Lock row in CAR table
UPDATE             Undo    5.1 (11.1)
UPDATE             Redo    11.5
UPDATE             Undo    5.1 (10.22)
UPDATE             Redo    10.4
UPDATE             Undo    5.1 (10.22)
UPDATE             Redo    10.2
COMMIT             Commit  5.4

Test 1 - Results
          INSERT (car.csv)  UPDATE (points.csv)  Total
Baseline  20448852          14409676             34858528
Test 1    14687756          12467400             27155156
Conclusion
Eliminating redundant index reduced
  insert redo generation by 5761096 bytes
  update redo generation by 1942276 bytes

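The savings quoted in the conclusion are the differences between the baseline and Test 1 figures. A short sketch of that arithmetic follows, with the percentage added for scale; the helper name `redo_saving` is hypothetical and the byte counts are the ones reported above.

```python
def redo_saving(before, after):
    """Return (bytes saved, percentage saved) between two redo measurements."""
    saved = before - after
    return saved, 100.0 * saved / before


baseline = {"insert": 20448852, "update": 14409676}
test1 = {"insert": 14687756, "update": 12467400}

insert_saved, insert_pct = redo_saving(baseline["insert"], test1["insert"])
update_saved, update_pct = redo_saving(baseline["update"], test1["update"])

print(f"insert: {insert_saved} bytes ({insert_pct:.1f}%)")
print(f"update: {update_saved} bytes ({update_pct:.1f}%)")
```

The same helper can be applied to any pair of columns in the later results tables.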
Test 2
In UPDATE statements
  For tables, undo and redo are generated for all columns in the SET clause
  For indexes, undo and redo are only generated for index keys that have changed

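Since every column named in the SET clause contributes undo and redo whether or not its value changes, one option is to build the statement from only the columns that differ. The sketch below shows that idea in outline. Both function names are hypothetical, the row values are illustrative, and a real application would often know the changed columns in advance rather than comparing rows.

```python
def changed_columns(current, incoming):
    """Return only the columns whose incoming value differs from the current one."""
    return {col: val for col, val in incoming.items() if current.get(col) != val}


def build_update(table, changes, key_columns):
    """Build an UPDATE with bind placeholders for the changed columns only."""
    if not changes:
        return None  # nothing changed, so no statement is needed
    set_clause = ", ".join(f"{col} = :{col}" for col in changes)
    where_clause = " AND ".join(f"{col} = :{col}" for col in key_columns)
    return f"UPDATE {table} SET {set_clause} WHERE {where_clause}"


current = {
    "driver_key": "JMON", "team_key": "WIL", "engine_key": "BMW",
    "laps_completed": 71, "driver_points": 0, "team_points": 0,
}
incoming = dict(current, driver_points=10, team_points=10)

changes = changed_columns(current, incoming)
statement = build_update("car", changes, ["season_key", "race_key", "position"])
print(statement)
```

Returning `None` when nothing differs also avoids issuing an update that changes no values.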
Test 2 - Update
Redo Generation for each Update Statement
Statement          Change  Opcode      Description
Header                     5.2         Start Transaction
SELECT FOR UPDATE  Undo    5.1 (11.1)  Undo lock row in CAR table
SELECT FOR UPDATE  Redo    11.4
UPDATE             Undo    5.1 (11.1)
UPDATE             Redo    11.5
COMMIT             Commit  5.4

Test 2
Only update columns which can have new values
  DRIVER_POINTS
  TEAM_POINTS
For each line in points.csv
{
  read :season_key, :race_key, :position, :driver_points, :team_points;
  SELECT ... FOR UPDATE;
  UPDATE car
  SET driver_points = :driver_points, team_points = :team_points
  WHERE season_key = :season_key
  AND race_key = :race_key
  AND position = :position;
  COMMIT;
}

Test 2 - Results
        INSERT (car.csv)  UPDATE (points.csv)  Total
Test 1  14687756          12467400             27155156
Test 2  14560052          11584760             26144812
Conclusion
Eliminating unnecessary columns from update statements reduced update redo generation by 882640 bytes
Would be significantly more if unchanged columns included long fields, e.g. CHAR or VARCHAR2

Test 3
Eliminate the redundant SELECT FOR UPDATE statement
For each line in points.csv
{
  read :season_key, :race_key, :position, :driver_points, :team_points;
  UPDATE car
  SET driver_points = :driver_points, team_points = :team_points
  WHERE season_key = :season_key
  AND race_key = :race_key
  AND position = :position;
  COMMIT;
}

Test 3 - Update
Redo Generation for each Update Statement
Statement          Change  Description
Header                     Start Transaction
SELECT FOR UPDATE  Undo    Undo lock row in CAR table
SELECT FOR UPDATE  Redo    Lock row in CAR table
UPDATE             Undo    Undo update row in CAR table
UPDATE             Redo    Update row in CAR table
COMMIT             Commit  End Transaction

Test 3 - Results
          INSERT (car.csv)  UPDATE (points.csv)  Total
Baseline  20448852          14409676             34858528
Test 1    14687756          12467400             27155156
Test 2    14560052          11584760             26144812
Test 3    14554428          8475484              23029912
Conclusion
Eliminating SELECT FOR UPDATE statement reduced update redo generation by 3109276 bytes

Test 4
Rows are inserted with default values of 0 for driver_points and team_points
Points only scored by
  first eight cars - 2003 onwards
  first six cars - pre 2003

                Driver Points  Driver No Points
Team Points     3514           30
Team No Points  324            12313

Only update rows with non-zero values for driver_points and/or team_points

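The four counts in the table show how many updates this test can skip. The sketch below, offered only as a worked calculation, keys each count by whether the driver and the team scored; the dictionary layout and variable names are hypothetical.

```python
# (driver scored, team scored) -> number of rows, taken from the table above
row_counts = {
    (True, True): 3514,
    (False, True): 30,
    (True, False): 324,
    (False, False): 12313,
}

total_rows = sum(row_counts.values())
# A row needs an UPDATE only if the driver or the team scored points.
rows_to_update = sum(n for (driver, team), n in row_counts.items() if driver or team)
rows_skipped = total_rows - rows_to_update

print(total_rows, rows_to_update, rows_skipped)
print(f"{100.0 * rows_to_update / total_rows:.1f}% of rows need an update")
```

The total matches the 16181 rows in points.csv, and fewer than a quarter of them carry a non-zero value.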
Test 4 - Update
Redo Generation for each Update Statement
Statement                                                    Undo (old values)     Redo (new values)
UPDATE car SET driver_points = 1, team_points = 1 WHERE ...  col9 = 0, col10 = 0   col9 = 1, col10 = 1
UPDATE car SET driver_points = 0, team_points = 0 WHERE ...  col9 = 0, col10 = 0   col9 = 0, col10 = 0
UPDATE car SET driver_points = 0, team_points = 0 WHERE ...  col9 = 0, col10 = 0   col9 = 0, col10 = 0
UPDATE car SET driver_points = 9, team_points = 9 WHERE ...  col9 = 0, col10 = 0   col9 = 9, col10 = 9

Test 4
Only update rows with non-zero values for driver_points and/or team_points
For each line in points.csv
{
  read :season_key, :race_key, :position, :driver_points, :team_points;
  IF driver_points != 0 OR team_points != 0 THEN
  {
    UPDATE car
    SET driver_points = :driver_points, team_points = :team_points
    WHERE season_key = :season_key
    AND race_key = :race_key
    AND position = :position;
    COMMIT;
  }
}

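The IF test in the loop can be tried against the sample rows from points.csv. This is a minimal sketch: the tuple layout and the function name `rows_needing_update` are assumptions, and the 20 rows reproduce race 18 of 2004 as listed in the points.csv extract.

```python
# (season_key, race_key, position, driver_points, team_points) for race 18 of 2004
scores = [10, 8, 6, 5, 4, 3, 2, 1] + [0] * 12
points_rows = [(2004, 18, position, pts, pts)
               for position, pts in enumerate(scores, start=1)]


def rows_needing_update(rows):
    """Keep only rows where driver_points or team_points is non-zero."""
    return [row for row in rows if row[3] != 0 or row[4] != 0]


to_update = rows_needing_update(points_rows)
print(len(points_rows), len(to_update))
```

For this race only the eight points-scoring positions would produce an UPDATE; the other twelve rows already hold the default of 0.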
Test 4 - Results
          INSERT (car.csv)  UPDATE (points.csv)  Total
Baseline  20448852          14409676             34858528
Test 1    14687756          12467400             27155156
Test 2    14560052          11584760             26144812
Test 3    14554428          8475484              23029912
Test 4    14683408          2070316              16753724
Conclusion
Eliminating unnecessary update statements reduced update redo generation by 6405168 bytes

Test 5
Commit once after the loop instead of after every row
For each line in car.csv
{
  read :season_key, :race_key, :position, :driver_key, :team_key,
       :engine_key, :laps_completed, :classification_key, :notes;
  INSERT INTO car
    (season_key, race_key, position, driver_key, team_key,
     engine_key, laps_completed, classification_key, notes)
  VALUES
    (:season_key, :race_key, :position, :driver_key, :team_key,
     :engine_key, :laps_completed, :classification_key, :notes);
}
COMMIT;

Test 5
For each line in points.csv
{
  read :season_key, :race_key, :position, :driver_points, :team_points;
  IF driver_points != 0 OR team_points != 0 THEN
  {
    UPDATE car
    SET driver_points = :driver_points, team_points = :team_points
    WHERE season_key = :season_key
    AND race_key = :race_key
    AND position = :position;
  }
}
COMMIT;

Test 5 - Insert
Redo Generation for each Insert Statement
Statement  Change  Opcode       Description
Header             5.2          Start Transaction
INSERT     Undo    5.1 (11.1)   Undo insert row in CAR table
INSERT     Redo    11.2         Insert row in CAR table
INSERT     Undo    5.1 (10.22)  Undo insert row into CAR_PK index
INSERT     Redo    10.2
COMMIT     Commit  5.4
The figure repeats this sequence for a second INSERT statement.

Test 5 - Results
        INSERT (car.csv)  UPDATE (points.csv)  Total
Test 1  14687756          12467400             27155156
Test 2  14560052          11584760             26144812
Test 3  14554428          8475484              23029912
Test 4  14683408          2070316              16753724
Test 5  9516512           1028084              10544596
Conclusion
Eliminating COMMIT statements reduced
  insert redo generation by 5166896 bytes
  update redo generation by 1042232 bytes

Test 6
Default batch size is 1
Test INSERT and UPDATE with different batch sizes

Batch Size  INSERT Redo  UPDATE Redo  Total Redo
1           9517096      1028084      10545180
2           5654136      1028084      6682220
4           3927092      1028440      4955532
8           3011944      1028084      4040028
16          2588540      1028636      3617176
32          2375884      1028172      3404056
64          2254936      1028040      3282976
128         2195876      1028084      3223960
256         2179404      1028440      3207844
512         2163816      1028084      3191900
1024        2163084      1028084      3191168
2048        2160012      1028084      3188096

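Batching means sending several rows to the database in one array insert rather than one row per call. The slides do not say which client interface was used, so the following is only a generic sketch of the grouping step; the generator name `batches` is hypothetical. In a Python client each batch could then be handed to a DB-API `cursor.executemany` call.

```python
from itertools import islice


def batches(rows, batch_size):
    """Yield consecutive lists of at most batch_size rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Roughly one season of rows (about 360), grouped with a batch size of 128.
season_rows = range(360)
batch_sizes = [len(batch) for batch in batches(season_rows, 128)]
print(batch_sizes)
```

The table above suggests most of the benefit arrives early: total redo falls steeply up to a batch size of about 32 and changes little beyond 128.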
Test 6 - Results
[Chart: redo generated (bytes, 0 to 10000000) plotted against batch size (1 to 2048)]

Test 6 - Results
          INSERT (car.csv)  UPDATE (points.csv)  Total
Baseline  20448852          14409676             34858528
Test 1    14687756          12467400             27155156
Test 2    14560052          11584760             26144812
Test 3    14554428          8475484              23029912
Test 4    14683408          2070316              16753724
Test 5    9516512           1028084              10544596
Test 6    2195876           1028084              3223960
Conclusion
Batch size of 128
  reduced insert redo generation by 7320636 bytes
  update redo generation unaffected

Test 7
For each line in car.csv
{
  read :season_key, :race_key, :position, :driver_key, :team_key,
       :engine_key, :laps_completed, :classification_key, :notes;
  INSERT INTO temporary_car
    (season_key, race_key, position, driver_key, team_key,
     engine_key, laps_completed, classification_key, notes)
  VALUES
    (:season_key, :race_key, :position, :driver_key, :team_key,
     :engine_key, :laps_completed, :classification_key, :notes);
  COMMIT;
}

Test 7
For each line in points.csv
{
  read :season_key, :race_key, :position, :driver_points, :team_points;
  IF driver_points != 0 OR team_points != 0 THEN
  {
    UPDATE temporary_car
    SET driver_points = :driver_points, team_points = :team_points
    WHERE season_key = :season_key
    AND race_key = :race_key
    AND position = :position;
  }
}
COMMIT;

Test 7
INSERT INTO car
(
  season_key, race_key, position, driver_key, team_key, engine_key,
  laps_completed, classification_key, notes, driver_points, team_points
)
SELECT
  season_key, race_key, position, driver_key, team_key, engine_key,
  laps_completed, classification_key, notes, driver_points, team_points
FROM temporary_car;

Test 7 - Results
          Total
Baseline  34858528
Test 1    27155156
Test 2    26144812
Test 3    23029912
Test 4    16753724
Test 5    10544596
Test 6    3223960
Test 7    2883748
Conclusion
Global Temporary Table reduced total redo generation by 340212 bytes

Test 8
CREATE TABLE external_points
(
  season_key     VARCHAR2(4),
  race_key       VARCHAR2(2),
  position       NUMBER,
  driver_points  NUMBER,
  team_points    NUMBER
)
ORGANIZATION EXTERNAL
(
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY external_dir
  ACCESS PARAMETERS
  (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
  )
  LOCATION ('points.csv')
);

Test 8
CREATE TABLE external_car
(
  season_key          VARCHAR2(4),
  race_key            VARCHAR2(2),
  position            NUMBER,
  driver_key          VARCHAR2(4),
  team_key            VARCHAR2(3),
  engine_key          VARCHAR2(3),
  laps_completed      NUMBER,
  classification_key  VARCHAR2(4),
  notes               VARCHAR2(100)
)
ORGANIZATION EXTERNAL
(
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY external_dir
  ACCESS PARAMETERS
  (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
    MISSING FIELD VALUES ARE NULL
  )
  LOCATION ('car.csv')
);

Test 8
Insert directly into permanent table joining contents of both external tables
INSERT INTO car
(
  season_key, race_key, position, driver_key, team_key, engine_key,
  laps_completed, classification_key, notes, driver_points, team_points
)
SELECT
  c.season_key, c.race_key, c.position, c.driver_key, c.team_key, c.engine_key,
  c.laps_completed, c.classification_key, c.notes, p.driver_points, p.team_points
FROM external_car c, external_points p
WHERE c.season_key = p.season_key
AND c.race_key = p.race_key
AND c.position = p.position;

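The join between the two external tables matches rows on season_key, race_key and position. The sketch below reproduces that matching logic in Python purely as an illustration; in the test itself the database performs the join in a single INSERT ... SELECT. The function name and the two sample rows are hypothetical.

```python
def join_car_points(car_rows, points_rows):
    """Inner join of car and points rows on (season_key, race_key, position)."""
    points_by_key = {
        (p["season_key"], p["race_key"], p["position"]): p for p in points_rows
    }
    joined = []
    for car in car_rows:
        match = points_by_key.get(
            (car["season_key"], car["race_key"], car["position"]))
        if match is not None:  # rows without a match are dropped, as in the SQL join
            joined.append({**car,
                           "driver_points": match["driver_points"],
                           "team_points": match["team_points"]})
    return joined


car_rows = [
    {"season_key": "2004", "race_key": "18", "position": 1, "driver_key": "JMON"},
    {"season_key": "2004", "race_key": "18", "position": 2, "driver_key": "KRAI"},
]
points_rows = [
    {"season_key": "2004", "race_key": "18", "position": 1,
     "driver_points": 10, "team_points": 10},
]

joined = join_car_points(car_rows, points_rows)
print(joined)
```

Because this is an inner join, a car row with no matching line in points.csv is not inserted, so both files need to describe the same set of rows.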
Test 8 - Results
Conclusion
We have seen that the following techniques can be used to reduce the amount of redo generated:
  Eliminating redundant indexes
  Reducing the number of columns updated
  Eliminating redundant SELECT FOR UPDATE statements
  Reducing the number of rows processed
  Eliminating COMMIT statements
  Increasing the batch size
  Using Global Temporary Tables
  Using External Tables

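To put the techniques side by side, the total redo reported after each test can be expressed as a share of the baseline. This sketch uses the totals from the Test 7 results; the function name is hypothetical, and Test 8 is omitted because no figures for it appear in the results above.

```python
# Total redo (bytes) after each test, from the Test 7 results.
totals = [
    ("Baseline", 34858528),
    ("Test 1", 27155156),
    ("Test 2", 26144812),
    ("Test 3", 23029912),
    ("Test 4", 16753724),
    ("Test 5", 10544596),
    ("Test 6", 3223960),
    ("Test 7", 2883748),
]


def percent_of_baseline(results):
    """Express each total as a percentage of the first (baseline) total."""
    base = results[0][1]
    return [(name, 100.0 * total / base) for name, total in results]


shares = percent_of_baseline(totals)
for name, share in shares:
    print(f"{name:<9}{share:6.1f}%")
```

By Test 7 the load generates well under a tenth of the redo measured for the baseline.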