o An SQL aggregate function performs a calculation over multiple rows of a single column of a
table and returns a single value.
o Aggregate functions are also used to summarize data.
1. COUNT FUNCTION
o The COUNT function counts the number of rows in a database table. It works on both numeric and
non-numeric data types.
o COUNT(*) returns the number of all rows in the specified table, including duplicate rows and rows that
contain NULL. COUNT(expression) counts only the rows where the expression is not NULL.
Syntax
COUNT(*)
or
COUNT( [ALL|DISTINCT] expression )
Sample table:
PRODUCT_MAST
Example: COUNT( )
SELECT COUNT(*)
FROM PRODUCT_MAST;
Output:
10
SELECT COUNT(*)
FROM PRODUCT_MAST
WHERE RATE>=20;
Output:
Example: COUNT() with GROUP BY
Output:
Com1 5
Com2 3
Com3 2
Example: COUNT() with HAVING
Output:
Com1 5
Com2 3
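The COUNT variants above can be tried end to end with a small script. This is a minimal sketch using Python's built-in sqlite3 module; the ten rows, the PRODUCT column name and the HAVING threshold are assumptions made for illustration, since the original PRODUCT_MAST contents are not reproduced here.

```python
import sqlite3

# Hypothetical sample rows: (PRODUCT, COMPANY, QTY, RATE, COST).
# These values are assumptions, not the original PRODUCT_MAST data.
ROWS = [
    ("Item1", "Com1", 2, 10, 20),
    ("Item2", "Com2", 3, 25, 75),
    ("Item3", "Com1", 2, 30, 60),
    ("Item4", "Com3", 5, 10, 50),
    ("Item5", "Com2", 2, 20, 40),
    ("Item6", "Com1", 3, 25, 75),
    ("Item7", "Com1", 5, 30, 150),
    ("Item8", "Com1", 3, 10, 30),
    ("Item9", "Com2", 2, 25, 50),
    ("Item10", "Com3", 4, 30, 120),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT_MAST "
    "(PRODUCT TEXT, COMPANY TEXT, QTY INTEGER, RATE INTEGER, COST INTEGER)"
)
conn.executemany("INSERT INTO PRODUCT_MAST VALUES (?, ?, ?, ?, ?)", ROWS)


def scalar(sql):
    """Run a query that yields one row with one column and return that value."""
    return conn.execute(sql).fetchone()[0]


# Count all rows, then only the rows that pass a WHERE filter.
total_rows = scalar("SELECT COUNT(*) FROM PRODUCT_MAST")
rate_20_plus = scalar("SELECT COUNT(*) FROM PRODUCT_MAST WHERE RATE >= 20")

# DISTINCT counts each company once, however many products it has.
distinct_companies = scalar("SELECT COUNT(DISTINCT COMPANY) FROM PRODUCT_MAST")

# GROUP BY produces one count per company; HAVING filters the groups.
per_company = conn.execute(
    "SELECT COMPANY, COUNT(*) FROM PRODUCT_MAST "
    "GROUP BY COMPANY ORDER BY COMPANY"
).fetchall()
large_groups = conn.execute(
    "SELECT COMPANY, COUNT(*) FROM PRODUCT_MAST "
    "GROUP BY COMPANY HAVING COUNT(*) > 2 ORDER BY COMPANY"
).fetchall()

print(total_rows, rate_20_plus, distinct_companies)
print(per_company)
print(large_groups)
```

Note that WHERE filters rows before they are grouped, while HAVING filters the groups after the counts have been computed.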
2. SUM Function
The SUM function calculates the sum of all selected values in a column. It works on numeric fields only.
Syntax
SUM( )
or
SUM( [ALL|DISTINCT] expression )
Example: SUM( )
SELECT SUM(COST)
FROM PRODUCT_MAST;
Output:
670
SELECT SUM(COST)
FROM PRODUCT_MAST
WHERE QTY>3;
Output:
320
SELECT COMPANY, SUM(COST)
FROM PRODUCT_MAST
WHERE QTY>3
GROUP BY COMPANY;
Output:
Com1 150
Com3 170
Example: SUM() with HAVING
Output:
Com1 335
Com3 170
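A SUM with GROUP BY and HAVING can be sketched as follows, again with Python's sqlite3 module. The rows and the threshold of 170 are assumptions chosen for illustration, not the original sample table or query.

```python
import sqlite3

# Hypothetical sample rows: (COMPANY, QTY, COST). Values are assumptions.
ROWS = [
    ("Com1", 2, 20), ("Com2", 3, 75), ("Com1", 2, 60), ("Com3", 5, 50),
    ("Com2", 2, 40), ("Com1", 3, 75), ("Com1", 5, 150), ("Com1", 3, 30),
    ("Com2", 2, 50), ("Com3", 4, 120),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT_MAST (COMPANY TEXT, QTY INTEGER, COST INTEGER)")
conn.executemany("INSERT INTO PRODUCT_MAST VALUES (?, ?, ?)", ROWS)

# Sum over every row, then over the rows that pass the WHERE filter.
total_cost = conn.execute("SELECT SUM(COST) FROM PRODUCT_MAST").fetchone()[0]
cost_qty_gt_3 = conn.execute(
    "SELECT SUM(COST) FROM PRODUCT_MAST WHERE QTY > 3"
).fetchone()[0]

# One total per company; HAVING keeps only groups whose total is at least 170.
high_totals = conn.execute(
    "SELECT COMPANY, SUM(COST) FROM PRODUCT_MAST "
    "GROUP BY COMPANY HAVING SUM(COST) >= 170 ORDER BY COMPANY"
).fetchall()

print(total_cost, cost_qty_gt_3, high_totals)
```

HAVING is needed here because the condition refers to the aggregate itself; a WHERE clause cannot test SUM(COST).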
3. AVG function
The AVG function calculates the average value of a numeric column. It returns the average of all non-NULL
values.
Syntax
AVG( )
or
AVG( [ALL|DISTINCT] expression )
Example:
SELECT AVG(COST)
FROM PRODUCT_MAST;
Output:
67.00
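The statement that AVG only considers non-NULL values can be checked with a tiny example. This sketch uses a hypothetical PRICES table with one NULL entry; the table name and values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRICES (COST INTEGER)")  # hypothetical table
# Three rows, one of which has no cost recorded (NULL).
conn.executemany("INSERT INTO PRICES VALUES (?)", [(10,), (None,), (20,)])

avg_cost, all_rows, non_null_rows = conn.execute(
    "SELECT AVG(COST), COUNT(*), COUNT(COST) FROM PRICES"
).fetchone()

# AVG divides by the number of non-NULL values (2), not by the row count (3).
print(avg_cost, all_rows, non_null_rows)
```

If a NULL should count as zero instead, it has to be converted explicitly, for example with COALESCE(COST, 0).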
4. MAX Function
MAX function is used to find the maximum value of a certain column. This function determines the largest value of
all selected values of a column.
Syntax
MAX( )
or
MAX( [ALL|DISTINCT] expression )
Example:
SELECT MAX(RATE)
FROM PRODUCT_MAST;
Output:
30
5. MIN Function
MIN function is used to find the minimum value of a certain column. This function determines the smallest value of
all selected values of a column.
Syntax
MIN( )
or
MIN( [ALL|DISTINCT] expression )
Example:
SELECT MIN(RATE)
FROM PRODUCT_MAST;
Output:
10
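MAX and MIN can be combined in one query, and with GROUP BY they return one pair of values per group. A minimal sketch with Python's sqlite3 module, using hypothetical (COMPANY, RATE) rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT_MAST (COMPANY TEXT, RATE INTEGER)")
# Hypothetical rows; the values are assumptions for illustration.
conn.executemany(
    "INSERT INTO PRODUCT_MAST VALUES (?, ?)",
    [("Com1", 10), ("Com1", 30), ("Com2", 25), ("Com2", 20), ("Com3", 10)],
)

# Largest and smallest rate over the whole table.
max_rate, min_rate = conn.execute(
    "SELECT MAX(RATE), MIN(RATE) FROM PRODUCT_MAST"
).fetchone()

# Smallest and largest rate within each company.
rate_range = conn.execute(
    "SELECT COMPANY, MIN(RATE), MAX(RATE) FROM PRODUCT_MAST "
    "GROUP BY COMPANY ORDER BY COMPANY"
).fetchall()

print(max_rate, min_rate, rate_range)
```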
SQL CREATE VIEW Statement
A view contains rows and columns, just like a real table. The fields in a view are fields from one or more real tables
in the database.
You can add SQL functions, WHERE, and JOIN statements to a view and present the data as if the data were
coming from one single table.
Note: A view always shows up-to-date data! The database engine recreates the data, using the view's SQL
statement, every time a user queries a view.
The following SQL creates a view that shows all customers from Brazil:
Example
CREATE VIEW [Brazil Customers] AS
SELECT CustomerName, ContactName
FROM Customers
WHERE Country = 'Brazil';
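That a view always reflects the current table contents can be observed directly. The sketch below uses Python's sqlite3 module, which accepts the bracketed view name; the customer rows are invented for illustration and are not the sample database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerName TEXT, ContactName TEXT, Country TEXT)"
)
# Hypothetical rows, not the original sample data.
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [("Cust A", "Ana", "Brazil"), ("Cust B", "Bob", "Germany")],
)

conn.execute(
    "CREATE VIEW [Brazil Customers] AS "
    "SELECT CustomerName, ContactName FROM Customers WHERE Country = 'Brazil'"
)

before = conn.execute("SELECT COUNT(*) FROM [Brazil Customers]").fetchone()[0]

# Insert into the base table; the view is not touched.
conn.execute("INSERT INTO Customers VALUES ('Cust C', 'Caio', 'Brazil')")

# The view's SELECT is re-evaluated on each query, so the new row appears.
after = conn.execute("SELECT COUNT(*) FROM [Brazil Customers]").fetchone()[0]

print(before, after)
```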
SQL INDEXES
Indexes are special lookup tables that the database search engine can use to speed up data retrieval. Simply put, an
index is a pointer to data in a table. An index in a database is very similar to an index in the back of a book.
For example, if you want to find all pages in a book that discuss a certain topic, you first refer to the index,
which lists all topics alphabetically, and are then referred to one or more specific page numbers.
An index speeds up SELECT queries and WHERE clauses, but it slows down data modification with
the UPDATE and INSERT statements, because the index has to be updated as well. Indexes can be created or
dropped with no effect on the data.
Creating an index involves the CREATE INDEX statement, which allows you to name the index, to specify the
table and which column or columns to index, and to indicate whether the index is in an ascending or descending
order.
Indexes can also be unique, like the UNIQUE constraint, in that the index prevents duplicate entries in the column
or combination of columns on which there is an index.
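A minimal sketch of creating, using and dropping indexes, run through Python's sqlite3 module. The table, column and index names are hypothetical, and the exact CREATE INDEX options vary between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER, CustomerName TEXT, Email TEXT)"
)

# A plain index in ascending order, and a unique index on another column.
conn.execute("CREATE INDEX idx_customers_name ON Customers (CustomerName ASC)")
conn.execute("CREATE UNIQUE INDEX idx_customers_email ON Customers (Email)")

conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [(1, "Ana", "ana@example.com"), (2, "Bob", "bob@example.com")],
)

# The unique index rejects a second row with an existing e-mail address.
try:
    conn.execute("INSERT INTO Customers VALUES (3, 'Carla', 'ana@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# SQLite lists its indexes in the sqlite_master catalog table.
index_names = sorted(
    name
    for (name,) in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)

# Dropping an index removes only the lookup structure, never the table rows.
conn.execute("DROP INDEX idx_customers_name")
rows_after_drop = conn.execute("SELECT COUNT(*) FROM Customers").fetchone()[0]

print(duplicate_rejected, index_names, rows_after_drop)
```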
SQL CREATE TABLE Statement
Syntax
CREATE TABLE table_name (
column1 datatype,
column2 datatype,
column3 datatype,
....
);
The column parameters specify the names of the columns of the table.
The datatype parameter specifies the type of data the column can hold (e.g. varchar, integer, date, etc.).
SQL DROP TABLE Statement
Syntax
DROP TABLE table_name;
Note: Be careful before dropping a table. Dropping a table deletes all the data stored in it, together with
the table definition!
The TRUNCATE TABLE statement is used to delete the data inside a table, but not the table itself.
Syntax
TRUNCATE TABLE table_name;
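The difference between emptying a table and dropping it can be sketched as follows. SQLite, used here through Python's sqlite3 module, has no TRUNCATE TABLE statement, so an unqualified DELETE FROM stands in for it; the Shippers columns are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Shippers (ShipperID INTEGER, ShipperName TEXT)")
conn.executemany(
    "INSERT INTO Shippers VALUES (?, ?)", [(1, "Speedy"), (2, "United")]
)

# Stand-in for TRUNCATE TABLE Shippers: removes every row, keeps the table.
conn.execute("DELETE FROM Shippers")
rows_after_truncate = conn.execute("SELECT COUNT(*) FROM Shippers").fetchall()[0][0]

# DROP TABLE removes the table definition itself.
conn.execute("DROP TABLE Shippers")
try:
    conn.execute("SELECT COUNT(*) FROM Shippers")
    table_gone = False
except sqlite3.OperationalError:
    # SQLite reports "no such table" once the table has been dropped.
    table_gone = True

print(rows_after_truncate, table_gone)
```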
First Normal Form (1NF)
EMPLOYEE table:
The decomposition of the EMPLOYEE table into 1NF has been shown below:
EMP_ID EMP_NAME EMP_PHONE EMP_STATE
14 John 7272826385 UP
14 John 9064738238 UP
20 Harry 8574783832 Bihar
12 Sam 7390372389 Punjab
12 Sam 8589830302 Punjab
Second Normal Form (2NF)
Example: Let's assume a school stores data about its teachers and the subjects they teach. In a school, a teacher
can teach more than one subject.
TEACHER table
In the given table, non-prime attribute TEACHER_AGE is dependent on TEACHER_ID which is a proper subset of
a candidate key. That's why it violates the rule for 2NF.
To convert the given table into 2NF, we decompose it into two tables:
TEACHER_DETAIL table:
TEACHER_ID TEACHER_AGE
25 30
47 35
83 38
TEACHER_SUBJECT table:
TEACHER_ID SUBJECT
25 Chemistry
25 Biology
47 English
83 Math
83 Computer
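The decomposition loses no information: joining the two tables on TEACHER_ID gives back one row per teacher and subject, while each age is stored only once. A minimal sketch with Python's sqlite3 module, using the rows listed above (the column types are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TEACHER_DETAIL (TEACHER_ID INTEGER PRIMARY KEY, TEACHER_AGE INTEGER)"
)
conn.execute("CREATE TABLE TEACHER_SUBJECT (TEACHER_ID INTEGER, SUBJECT TEXT)")

conn.executemany(
    "INSERT INTO TEACHER_DETAIL VALUES (?, ?)", [(25, 30), (47, 35), (83, 38)]
)
conn.executemany(
    "INSERT INTO TEACHER_SUBJECT VALUES (?, ?)",
    [(25, "Chemistry"), (25, "Biology"), (47, "English"), (83, "Math"), (83, "Computer")],
)

# Each age is stored once per teacher, not once per subject.
stored_ages = conn.execute("SELECT COUNT(*) FROM TEACHER_DETAIL").fetchone()[0]

# The join reconstructs the combined (teacher, subject, age) rows.
combined = conn.execute(
    "SELECT s.TEACHER_ID, s.SUBJECT, d.TEACHER_AGE "
    "FROM TEACHER_SUBJECT AS s "
    "JOIN TEACHER_DETAIL AS d ON d.TEACHER_ID = s.TEACHER_ID "
    "ORDER BY s.TEACHER_ID, s.SUBJECT"
).fetchall()

print(stored_ages)
print(combined)
```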
Third Normal Form (3NF)
o A relation is in 3NF if it is in 2NF and does not contain any transitive dependency of a non-prime attribute
on a candidate key.
o 3NF is used to reduce data duplication. It is also used to achieve data integrity.
o If there is no transitive dependency for non-prime attributes, then the relation is in third normal form.
A relation is in third normal form if it satisfies at least one of the following conditions for every non-trivial
functional dependency X → Y.
1. X is a super key.
2. Y is a prime attribute, i.e., each element of Y is part of some candidate key.
Example:
EMPLOYEE_DETAIL table:
Non-prime attributes: In the given table, all attributes except EMP_ID are non-prime.
Here, EMP_STATE and EMP_CITY depend on EMP_ZIP, and EMP_ZIP depends on EMP_ID. The
non-prime attributes (EMP_STATE, EMP_CITY) are therefore transitively dependent on the super key
(EMP_ID), which violates the rule of third normal form.
That's why we need to move EMP_CITY and EMP_STATE to a new EMPLOYEE_ZIP table,
with EMP_ZIP as its primary key.
EMPLOYEE table:
EMP_ID EMP_NAME EMP_ZIP
222 Harry 201010
333 Stephan 02228
444 Lan 60007
555 Katharine 06389
666 John 462007
EMPLOYEE_ZIP table:
Boyce-Codd Normal Form (BCNF)
Example: Let's assume there is a company where employees work in more than one department.
EMPLOYEE table:
1. EMP_ID → EMP_COUNTRY
2. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone is a key.
To convert the given table into BCNF, we decompose it into three tables:
EMP_COUNTRY table:
EMP_ID EMP_COUNTRY
264 India
264 India
EMP_DEPT table:
EMP_DEPT_MAPPING table:
EMP_ID EMP_DEPT
D394 283
D394 300
D283 232
D283 549
Functional dependencies:
1. EMP_ID → EMP_COUNTRY
2. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Candidate keys:
Now, this is in BCNF because the left-hand side of both functional dependencies is a key.
Fourth Normal Form (4NF)
o A relation is in 4NF if it is in Boyce-Codd normal form and has no multi-valued dependency.
o For a dependency A →→ B, if a single value of A is associated with multiple values of B, independently of
the other attributes, then the dependency is a multi-valued dependency.
Example
STUDENT
The given STUDENT table is in 3NF, but COURSE and HOBBY are two independent entities. Hence, there is no
relationship between COURSE and HOBBY.
In the STUDENT relation, the student with STU_ID 21 takes two courses, Computer and Math, and has two
hobbies, Dancing and Singing. So there is a multi-valued dependency on STU_ID, which leads to unnecessary
repetition of data.
So to bring the above table into 4NF, we can decompose it into two tables:
STUDENT_COURSE
STU_ID COURSE
21 Computer
21 Math
34 Chemistry
74 Biology
59 Physics
STUDENT_HOBBY
STU_ID HOBBY
21 Dancing
21 Singing
34 Dancing
74 Cricket
59 Hockey
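The repetition that 4NF removes becomes visible when the two tables are joined back together: student 21 has two courses and two hobbies, so the combined table needs 2 × 2 = 4 rows for that student alone. A minimal sketch with Python's sqlite3 module, using the rows listed above (the column types are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT_COURSE (STU_ID INTEGER, COURSE TEXT)")
conn.execute("CREATE TABLE STUDENT_HOBBY (STU_ID INTEGER, HOBBY TEXT)")

conn.executemany(
    "INSERT INTO STUDENT_COURSE VALUES (?, ?)",
    [(21, "Computer"), (21, "Math"), (34, "Chemistry"), (74, "Biology"), (59, "Physics")],
)
conn.executemany(
    "INSERT INTO STUDENT_HOBBY VALUES (?, ?)",
    [(21, "Dancing"), (21, "Singing"), (34, "Dancing"), (74, "Cricket"), (59, "Hockey")],
)

# Joining on STU_ID pairs every course of a student with every hobby.
combined = conn.execute(
    "SELECT c.STU_ID, c.COURSE, h.HOBBY "
    "FROM STUDENT_COURSE AS c "
    "JOIN STUDENT_HOBBY AS h ON h.STU_ID = c.STU_ID "
    "ORDER BY c.STU_ID, c.COURSE, h.HOBBY"
).fetchall()

rows_for_21 = [row for row in combined if row[0] == 21]

print(len(combined))
print(rows_for_21)
```

The two decomposed tables hold ten rows in total, yet no course or hobby fact is stored more than once.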
FAQ: What is the distinction between BCNF and 3NF? Is there a reason to prefer one over the other?
ANSWER: 3NF is the third normal form used in relational database normalization. According to Codd's
definition, a table is in 3NF if and only if it is in the second normal form (2NF) and every attribute that does
not belong to a candidate key depends directly (non-transitively) on every candidate key of that table.
BCNF (also known as 3.5NF) is another normal form used in relational database normalization. It was introduced to
capture some of the anomalies that are not addressed by 3NF. A table is in BCNF if and only if, for each
non-trivial dependency of the form A → B, A is a super-key.
BCNF acts differently from 3NF only when the relation has multiple overlapping candidate keys.
The reason is that the functional dependency X -> Y is of course true if Y is a subset of X. So any table that has
only one candidate key and is in 3NF is already in BCNF, because there is no column (either key or non-key) that
is functionally dependent on anything besides that key.
Example: Assume your pizza has exactly three topping types, and you must choose:
Because each pizza must have exactly one of each topping type, we know that (Pizza, Topping Type) is a candidate
key. We also know intuitively that a given topping cannot belong to different types simultaneously. So
(Pizza, Topping) must be unique and therefore is also a candidate key. So we have two overlapping candidate keys.
We need to prevent these sorts of mistakes, to make mozzarella always be cheese. We should use a separate table for
this, so we write down that fact in only one place.
Pizza    Topping
-------- ----------
1        mozzarella
1        pepperoni
1        olives
2        mozzarella
2        sausage
2        peppers

Topping     Topping Type
----------- -------------
mozzarella  cheese
pepperoni   meat
olives      vegetable
sausage     meat
peppers     vegetable
I showed an anomaly where we marked mozzarella as the wrong topping type. We know this is wrong, but the rule
that makes it wrong is the dependency Topping -> Topping Type, which is not a valid dependency for BCNF for this
table. It's a dependency on something other than a whole candidate key.
So to solve this, we take Topping Type out of the Pizzas table and make it a non-key attribute in a Toppings table.
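Once Topping is the key of its own table, the database itself can refuse a second, contradictory topping type. A minimal sketch with Python's sqlite3 module; the table names PizzaToppings and Toppings and the column name ToppingType are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Topping is the primary key, so each topping has exactly one type.
conn.execute(
    "CREATE TABLE Toppings (Topping TEXT PRIMARY KEY, ToppingType TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE PizzaToppings ("
    " Pizza INTEGER,"
    " Topping TEXT REFERENCES Toppings (Topping),"
    " PRIMARY KEY (Pizza, Topping))"
)

conn.executemany(
    "INSERT INTO Toppings VALUES (?, ?)",
    [("mozzarella", "cheese"), ("pepperoni", "meat"), ("olives", "vegetable"),
     ("sausage", "meat"), ("peppers", "vegetable")],
)
conn.executemany(
    "INSERT INTO PizzaToppings VALUES (?, ?)",
    [(1, "mozzarella"), (1, "pepperoni"), (1, "olives"),
     (2, "mozzarella"), (2, "sausage"), (2, "peppers")],
)

# Recording mozzarella a second time with a different type violates the key.
try:
    conn.execute("INSERT INTO Toppings VALUES ('mozzarella', 'meat')")
    conflict_rejected = False
except sqlite3.IntegrityError:
    conflict_rejected = True

# The topping type of each pizza topping is recovered with a join.
pizza_1 = conn.execute(
    "SELECT p.Topping, t.ToppingType "
    "FROM PizzaToppings AS p JOIN Toppings AS t ON t.Topping = p.Topping "
    "WHERE p.Pizza = 1 ORDER BY p.Topping"
).fetchall()

print(conflict_rejected)
print(pizza_1)
```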
Relational Calculus
o Relational calculus is a non-procedural query language. In a non-procedural query language, the user is
not concerned with the details of how to obtain the end results.
o The relational calculus tells what to do but never explains how to do it.
o The tuple relational calculus (TRC) is used to select tuples from a relation. In TRC, the filtering variable
ranges over the tuples of a relation.
o The result of a query can contain one or more tuples.
Notation:
Where
For example:
OUTPUT: This query selects the tuples from the AUTHOR relation. It returns a tuple with 'name' from Author who
has written an article on 'database'.
TRC (tuple relational calculus) can be quantified. In TRC, we can use Existential (∃) and Universal Quantifiers (∀).
For example:
Output: This query will yield the same result as the previous one.
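In the usual textbook notation, a TRC query is written as the set of tuples that satisfy a predicate. The following is a sketch of how the two queries described above might be written, assuming a relation Author with attributes name and article; these names are inferred from the description and are not confirmed by the source. The formulas assume the amsmath package for \text.

```latex
% General form: the set of tuples T for which the predicate P(T) holds
\[ \{\, T \mid P(T) \,\} \]

% Names of authors who have written an article on 'database'
\[ \{\, T.\mathit{name} \mid \mathit{Author}(T) \land T.\mathit{article} = \text{'database'} \,\} \]

% The same result written with an explicit existential quantifier
\[ \{\, R \mid \exists\, T \in \mathit{Author}\;
   \bigl( T.\mathit{article} = \text{'database'} \land R.\mathit{name} = T.\mathit{name} \bigr) \,\} \]
```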
o The second form of relational calculus is known as domain relational calculus (DRC). In domain relational
calculus, the filtering variables range over the domains of attributes.
o Domain relational calculus uses the same operators as tuple calculus. It uses the logical connectives ∧ (and), ∨
(or) and ¬ (not).
o It uses Existential (∃) and Universal Quantifiers (∀) to bind the variables.
Notation:
Where
For example:
Output: This query will yield the article, page, and subject from the relation javatpoint, where the subject is
'database'.
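In the usual textbook notation, a DRC query lists domain variables, one per attribute, followed by a predicate over them. The following is a sketch of the general form and of the query described above, assuming the relation javatpoint has exactly the attributes article, page and subject. The formulas assume the amsmath package for \text.

```latex
% General form: domain variables a_1 ... a_n constrained by a predicate P
\[ \{\, \langle a_1, a_2, \ldots, a_n \rangle \mid P(a_1, a_2, \ldots, a_n) \,\} \]

% Article, page and subject of every javatpoint tuple whose subject is 'database'
\[ \{\, \langle \mathit{article}, \mathit{page}, \mathit{subject} \rangle \mid
   \langle \mathit{article}, \mathit{page}, \mathit{subject} \rangle \in \mathit{javatpoint}
   \land \mathit{subject} = \text{'database'} \,\} \]
```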
Answer: (A)
Explanation:
Option A:
Option B:
As TRC is a mathematical expression, it is expected to return only a distinct result set.
Option D:
Answer: (B)
Answer: (A)
Explanation: The table is not in 2nd Normal Form as the non-prime attributes are dependent on subsets of
candidate keys.
The candidate keys are AD, BD, ED and FD. In all of the following FDs, the non-prime attributes are dependent on
a partial candidate key.
A -> BC
B -> CFH
F -> EG