Sei sulla pagina 1di 7

A Diagram of the MySQL information schema

This page contains a clickable diagram of the MySQL data dictionary implementation, the information schema database. I drew up the diagram for reference purposes after being "5.0 Beta challenged" by the MySQL Newsletter (June 2005). I've hooked the diagram to a MAP, so you can click on a table to link through to the relevant MySQL reference page. I hope you'll find it worthwhile:

About the Diagram


The diagram was initially created in Microsoft Visio by reverse-engineering the information_schema database via the MyODBC driver. Afterwards, relationships were added by hand. You can check out the history of changes here. If you notice any mistakes, or omissions, or if you have a question, just send me an email at my hotmail adres (R_P_Bouman at hotmail dot com). For more info on those cool MySQL features, such as views, stored procedures/functions and triggers, check out http://www.mysqldevelopment.com/ You should realise that this is not a fysical data model of the implementation. Rather, it is a conceptual model of the structure of the information schema database As such, it can be used as a guide when building queries against the information schema.

Please, Redistribute!
You can download the original Visio 2002 file here. You'll probably need the Entity Relationship stencil shipped with your Visio product. The diagram is also available as a gif image. You are free to copy or redistribute the diagram in any form. This includes saving and redistributing the diagram in another format (pdf, jpeg, whatever). However, I would appreciate it if you would not modify the contents of the model without notifying me. Also, I would appreciate it if you do not remove the footer identifying me as the author of the diagram.

Notation Conventions
The diagram is based on the Visio stencil for Entity Relationship modelling. Entities (Tables) are drawn as rectangles, relationships are drawn as lines connecting entities. Cardinalities are drawn using the usual "crowfeet" markers. Some general notes on the notation provided by Visio:

Columns shown in Bold Type Face are not nullable; that is, a value is mandatory for these columns The Primary Key Columns appear in a separate box right beneath the table name Non-identifying relationships are drawn using a dashed connection line

(Partially) Identifying relationships are drawn using a solid connection line in the left margin right before the column names, one or more commaseparated markers may appear indicating that the column references a parent column as part of a relationship definition

Colors I used background colors to mark functionally related tables. Depending upon your criteria, you could recognize other functionally related units. I chose these colors:

dark blue for the schema (database) related tables, SCHEMATA light blue for the character set and collation related tables red for the relational core tables kaki for the STATISTICS table, which holds information concerning indexes yellow for the privilege related tables green for the constraint related tables magenta for the ROUTINES and TRIGGERS tables

Rolenames I added rolenames for the relationships (29-06-2005). You can still download the previous version of the Visio 2002 file without rolenames. The gif image of this previous version is also available. Except for the rolenames, the previous version has is type set in a 10pt font, whereas the new version is set in an 8pt font. Just Take you pick. Right after the rolename, you can see an integer between left and right parenthesis. The integer between the parenthesis corresponds to the foreign key markers (FK1, FK2, etcetera) appearing right before some column names. To see how this works, look at the relationship between KEY_COLUMN_USAGE and TABLE_CONSTRAINTS. The rolename is "defined by", and it is followed by a "(1)". The "1" between the parenthesis corresponds to the KEY_COLUMN_USAGE columns CONSTRAINT_SCHEMA and CONSTRAINT_NAME. Right before these columns you can see the "FK1" markers, where the "1" in "FK1" indicates correspondence to the "1" between the relationship parenthesis. The diagram does not show the details of the entire relationship mapping. For instance, looking at the diagram does not reveal that the CONSTRAINT_SCHEMA column in KEY_COLUMN_USAGE refers to the CONSTRAINT_SCHEMA column in TABLE_CONSTRAINTS. You can see that it must map to either the CONSTRAINT_SCHEMA or the

CONSTRAINT_NAME column, because the pair of them form an identifier for TABLE_CONSTRAINTS. However, by looking at the column names it's not too hard to guess that KEY_COLUMN_USAGE.CONSTRAINT_SCHEMA will map to TABLE_CONSTRAINTS.CONSTRAINT_SCHEMA , and that the KEY_COLUMN_USAGE.CONSTRAINT_NAME will map to TABLE_CONSTRAINTS.CONSTRAINT_NAME. To avoid cluttering of the diagram, only one rolename is displayed. I chose to display the rolename that applies to the child end (usually the "many" side) of the relationship. Visio refers to this as the "Inverse Text" of the "Verb phrase". This type of rolename expresses how the child entity relates to the parent entity. In most cases, only one such rolename is enough to think of a meaningful phrase for the other rolename. My choice for naming the child end instead of the parent end is arbitrary. I am willing to change this if you provide suggestions to do so. Using the entity names and the rolename, the semantics of a relationship can be expressed in a simple english sentence: 1. Take the singular form of the name of a child entity 2. Append the rolename as it appears in the diagram 3. Append the singular form of the name of the corresponding parent entity By this convention, the relationship between SCHEMATA and CHARACTER_SETS would read: 1. "SCHEMA" 2. "has default" 3. "CHARACTER_SET" For a more detailed understanding of the relationship, expand this pattern by specifying the cardinalities. Cardinalities specify in what quantities instances at either side of a relationship relate to each other. Sticking to our previous example, the sentence would read: A Schema has exactly one default Characterset. Another example, for the relationship between COLUMNS and CHARACTER_SETS: A Column optionally draws (characters) from at most one Characterset. I tried to choose the rolenames as descriptive as possible, but any suggestions are very welcome.

Modelling choice 1: The CONSTRAINT_SCHEMA and INDEX_SCHEMA columns vs the TRIGGER_SCHEMA column
In MySQL, indexes as well as constraints reside in same schema as the table on which they are defined. The same holds for triggers: a trigger always resides in the same schema as that of it's associated table. There's a difference too: indexes and constraints are uniquely identified by their respective name (INDEX_NAME and CONSTRAINT_NAME) within their associated table. Triggers are uniquely identified by their name within their schema (which is always the same schema as that of their associated table). This affects the diagram as far as the columns CONSTRAINT_SCHEMA (tables: TABLE_CONSTRAINTS and KEY_COLUMN_USAGE) and INDEX_SCHEMA (table: STATISTICS) are concerned. Because we know that the value for these columns is identical to the value in the TABLE_SCHEMA column, we can get away with not including these columns in the identifiers of their tables. Instead, we can use the TABLE_SCHEMA column. Also, there's no need to include separate relationships between SCHEMATA and TABLE_CONSTRAINTS and SCHEMATA and STATISTICS. The former is already determined because relationships exist between SCHEMATA and TABLES and TABLES and TABLE_CONSTRAINTS. The latter is already determined because relationships exist between SCHEMATA and TABLES, TABLES and COLUMNS, and finally, COLUMNS and STATISTICS. So, triggers are in a separate namespace: triggers can be uniquely identified within a schema by using their name (TRIGGER_NAME). Although a trigger is always associated to a table much in the same way as a table constraint or a index is, the triggers table does have it's 'own' identifier, which is made up of TRIGGER_SCHEMA and TRIGGER_NAME, although the value of the TRIGGER_SCHEMA and EVENT_OBJECT_SCHEMA columns are in fact always equal.

Modelling choice 2: The CATALOG columns


I chose to exclude the %CATALOG columns from the primary keys and relationship definitions. MySQL does not support catalogs, which is fine according to ISO 9075. However, as far as I can see, the MySQL implementation is not entirely consistent with ISO 9075. It's not a big thing though, considering the fact that catalogs are not supported. You can safely skip the next few paragraphs if you're not interested in features that are not, and most probably will not be supported (at least in the near future).

The CATALOG column is usually present as expected, that is, in those places where an object needs to be identified within a catalog (that is, if MySQL would support catalogs!). When a CATALOG column is present it always evaluates to the "NULL Value" because MySQL does not support catalogs, which is ISO 9075 compliant. There are some cases where one could expect a CATALOG column, but were it is ommitted without raising any questions: In COLLATIONS (COLLATION_CATALOG), CHARACTER_SETS (CHARACTER_SET_CATALOG), and in those places where a COLLATION or CHARACTER_SET could be referenced. Note that a fully compliant ISO 9075 implementation would have "NULL" CATALOG columns in these cases though (as well as a %_SCHEMA column!). There are some other cases where the MySQL implementation omits CATALOG columns, and where it does raise questions. Consider the KEY_COLUMN_USAGE table. We can see a CONSTRAINT_CATALOG and TABLE_CATALOG column. The former would serve to reference (and thus identify) a table constraint (along with CONSTRAINT_SCHEMA and CONSTRAINT_NAME). The latter would serve to reference (and thus identify) a table column (along with TABLE_SCHEMA and TABLE_NAME and COLUMN_NAME). So far, so good, but here we go: The KEY_COLUMN_USAGE table seems to be able to reference another table column via the REFERENCED_TABLE_SCHEMA, REFERENCED_TABLE_NAME and REFERENCED_COLUMN_NAME columns. This is used to define the mapping between a foreign key column and the corresponding primary key or unique constraint column. I would expect to see a REFERENCED_TABLE_CATALOG column for the sake of consistency. A total lack of CATALOG columns would do equally well in this respect. But, okay. I'm done now. It's not really important, considering that the REFERENCED% columns are not even standard in KEY_COLUMN_USAGE. I feel that the current lack of support for the CATALOG concept justifies omitting these columns from the primary key definitions and relationships in the diagram. In fact, I feel that the omission improves the readability of the diagram.

Potrebbero piacerti anche