
Introduction to the SQL Server Transaction Log

Overview
In this tutorial, we'll go over some of the tasks you can do to manage the transaction log. The
transaction log is very important to SQL Server, and entire books have been written about
managing it, so consider this an introductory tutorial to the transaction log.
Some questions that I hope to answer in this tutorial include:

What is the transaction log?
Why does SQL Server require a transaction log?
What is a transaction?
What does BEGIN TRAN, ROLLBACK TRAN, and COMMIT TRAN mean?
What is a nested transaction?
How does Mirroring, Replication, and Log Shipping use the transaction log?
Why does the transaction log grow?
What is a Virtual Log File (VLF)?
How to monitor transaction log growth.
How to shrink the transaction log.
How large should the transaction log be?
Do multiple transaction logs help performance?
What are the recovery models and how does the log use them?
How to recover data using a transaction log backup.

What is the transaction log?


Where do I start? The transaction log is an integral part of SQL Server. Every database has a
transaction log that is stored within the log file, separate from the data file. The transaction
log records all database modifications. When a user issues an INSERT, for example, it is
logged in the transaction log. This enables the database to roll back or restore the transaction if
a failure were to occur, and prevents data corruption. For example, let's say Sue is using an
application and inserts 2,000 rows of data. While SQL Server is processing this data, let's say
someone pulls the plug on the server. (Again, this is just an example; I hope nobody would
actually do this.) Because the INSERT statement was being written to the transaction log and
SQL Server knows a failure occurred, it will roll back the statement. If this weren't in place,
could you imagine having to sift through the data to see how many rows were inserted and then
change the code to insert the remaining rows? Or even better, what if the application inserted
random columns in no order and you had to determine what data was inserted and what data
was left out? This could take forever!
Log entries are sequential in nature. The transaction log is split up into small chunks called
virtual log files, which we will discuss in a later section. When a virtual log file is full,
logging automatically moves to the next virtual log file. As long as the log records at the
beginning of the transaction log have been truncated by the time logging reaches the end of the
log, it will circle back around to the start and overwrite what was there before:

Why does SQL Server require a transaction log?


Does SQL Server need a transaction log? It doesn't just need one; SQL Server HAS to have a
transaction log in order to work. Without a log file, your database will not work at all.
In the next section we'll talk more about what a transaction is, but this section will cover the
transaction log itself.
The transaction log supports the following:
ROLLBACK TRANSACTION - If a user or application issues the ROLLBACK statement, or if the DB
engine detects a failure, the log records are used to roll back the transaction. We will discuss
BEGIN, COMMIT, and ROLLBACK TRANSACTION later in this tutorial.
Recover Incomplete Transactions - If you have ever started SQL Server after a failure you may
have noticed databases in the (In Recovery) mode:

This is an indication that SQL Server is rolling back transactions that did not complete before
SQL Server was restarted, or is rolling forward all modifications that were recorded in the
log but not yet written to the data file. You may also see this if you have started a restore WITH
RECOVERY.

Rolling a restored DB, file, filegroup, or page forward to the point of failure - If SQL Server
were to fail and you need to restore the database back to the point at which the failure
occurred, you can, as long as you are using the FULL recovery model. Start with a full backup,
then apply the latest differential, and then the subsequent transaction log backups up to the
point of failure. We will go into more detail later in this tutorial. You can find more about this here.
High availability solutions - Transactional replication, mirroring, and log shipping all use the
transaction log. We will discuss how the log is used later in this tutorial.

What is a transaction?
A transaction can be defined in many different ways, and I've always had this question come up
in interviews. Basically, a transaction is a unit of work performed against a database. This
work can be performed manually, such as an UPDATE statement you issue in SQL Server
Management Studio, or by an application that INSERTs data into the database. These are all
transactions.
SQL Server supports the following transaction modes:
Autocommit transactions - Each individual statement is a transaction.
Explicit transactions - Each transaction is explicitly started with the BEGIN TRANSACTION
statement and explicitly ended with a COMMIT or ROLLBACK statement.
Implicit transactions - A new transaction is implicitly started when the prior transaction
completes, but each transaction is explicitly completed with a COMMIT or ROLLBACK
statement.
Batch-scoped transactions - Applicable only to multiple active result sets (MARS): a Transact-SQL
explicit or implicit transaction that starts under a MARS session becomes a batch-scoped
transaction. A batch-scoped transaction that is not committed or rolled back when a batch
completes is automatically rolled back by SQL Server.
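To make the implicit mode concrete, here is a small sketch. Note that dbo.DemoTable is a hypothetical table used only for illustration; any small table will do:

```sql
-- dbo.DemoTable is a hypothetical table used only for illustration
SET IMPLICIT_TRANSACTIONS ON;

INSERT INTO dbo.DemoTable (val) VALUES ('test'); -- this statement implicitly opens a transaction
SELECT @@TRANCOUNT;                              -- returns 1: the transaction is still open

ROLLBACK;                                        -- nothing was committed, so the INSERT is undone
SET IMPLICIT_TRANSACTIONS OFF;
```

In autocommit mode (the default), the same INSERT would have committed on its own; in implicit mode, forgetting the ROLLBACK or COMMIT leaves the transaction open and holding locks.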
You may have heard of the ACID properties. These apply to transactions as well:
Atomicity - ensures that all operations within the work unit are completed successfully,
otherwise the transaction is aborted at the point of failure and previous operations are rolled
back to their former state.
Consistency - ensures that the database properly changes states upon a successfully committed
transaction.
Isolation - enables transactions to operate independently of and transparently to each other.

Durability - ensures that the result or effect of a committed transaction persists in case of a
system failure.
Last, but not least, SQL Server supports transaction control. Below is a short description of
each, but we'll go over them in more detail in the next section. Note that transaction controls
are only used with DML commands.
BEGIN TRANSACTION - the starting point of a transaction
ROLLBACK TRANSACTION - roll back a transaction either because of a mistake or a failure
COMMIT TRANSACTION - save changes to the database

What does BEGIN TRAN, ROLLBACK TRAN, and COMMIT TRAN mean?
By default, when you run a SQL statement, for example SELECT * FROM
HumanResources.Employee, SQL Server will run the statement and immediately return the
results:

If you were to add BEGIN TRANSACTION (or BEGIN TRAN) before the statement, the
transaction becomes explicit and holds a lock on the table until the transaction is either
committed or rolled back.
BEGIN TRANSACTION marks the starting point of an explicit, local transaction. - MS
For example, when I issue a DELETE or UPDATE statement I always explicitly use BEGIN TRAN to
make sure my statement is correct and that the correct number of rows is affected.

Let's say I want to UPDATE the Employee table and set JobTitle equal to 'DBA' where LoginID is
like '%barbara%'. I accidentally write my statement incorrectly and issue the statement below,
which would actually set every JobTitle to 'DBA':
UPDATE HumanResources.Employee
SET JobTitle = 'DBA'
WHERE LoginID IN (
SELECT LoginID FROM HumanResources.Employee)

Oops! I didn't mean to do that! I accidentally made every record have a JobTitle of 'DBA'. If I
had placed a BEGIN TRAN before my statement, I would have noticed that 290 rows would be
affected and realized something was wrong with my statement:

Since I specified a BEGIN TRAN, the transaction is now waiting on a ROLLBACK or COMMIT.
While the transaction is waiting it has created a lock on the table and any other processes that
are trying to access HumanResources.Employee are now being blocked. Be careful using BEGIN
TRAN and make sure you immediately issue a ROLLBACK or COMMIT:

As you can see, SPID 52 is getting blocked by 54.
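If you suspect an open transaction is blocking other sessions like this, a query along these lines against the sys.dm_exec_requests DMV will show who is blocking whom:

```sql
-- Sessions currently waiting on locks held by another session
SELECT session_id          AS blocked_spid,
       blocking_session_id AS blocking_spid,
       wait_type,
       wait_time           AS wait_time_ms
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;
```

In the scenario above, this would return a row with blocked_spid = 52 and blocking_spid = 54.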


Since I noticed something was terribly wrong with my UPDATE statement, I can issue a
ROLLBACK TRAN statement to roll back the transaction, meaning that none of the data actually
changed:
ROLLBACK TRANSACTION rolls back an explicit or implicit transaction to the beginning of the
transaction, or to a savepoint inside the transaction. It also frees resources held by the
transaction. - MS

If I had written my statement correctly the first time and seen the right number of rows
affected, then I could issue a COMMIT TRAN; the statement would execute and my
changes would be committed to the database:
COMMIT TRANSACTION marks the end of a successful implicit or explicit transaction. If
@@TRANCOUNT is 1, COMMIT TRANSACTION makes all data modifications performed since the
start of the transaction a permanent part of the database, frees the resources held by the
transaction, and decrements @@TRANCOUNT to 0. If @@TRANCOUNT is greater than 1,
COMMIT TRANSACTION decrements @@TRANCOUNT only by 1 and the transaction stays
active. - MS
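A quick sketch of how @@TRANCOUNT moves as the quote above describes:

```sql
SELECT @@TRANCOUNT;  -- 0: no transaction open
BEGIN TRAN;
SELECT @@TRANCOUNT;  -- 1
BEGIN TRAN;
SELECT @@TRANCOUNT;  -- 2
COMMIT;              -- @@TRANCOUNT > 1, so this only decrements the counter
SELECT @@TRANCOUNT;  -- 1: the outer transaction is still active
COMMIT;              -- @@TRANCOUNT = 1, so this one actually commits the work
SELECT @@TRANCOUNT;  -- 0
```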

What is a nested transaction?


A nested transaction is a transaction that is created inside another transaction. Huh? It sounds
confusing, but it's not. A nested transaction's purpose is to support transactions in stored
procedures that can be called from a process already in a transaction or from a process that has
no active transaction.
See below for an example of a nested transaction:
BEGIN TRAN Tran1
GO
BEGIN TRAN NestedTran
GO
INSERT INTO Table1 DEFAULT VALUES
GO 10
COMMIT TRAN NestedTran
SELECT * FROM Table1
I get back 10 rows. So far, so good. Now I roll back the first transaction:
ROLLBACK TRAN Tran1
The 10 rows are gone. Why? The nested transaction is bound to the action of the
outermost transaction. Since I rolled back the outer transaction, the entire transaction is rolled
back no matter what I did in between. If I had committed Tran1, then my nested
transaction would have also been committed.

How does Mirroring, Replication, and Log Shipping use the transaction log?
I'm not going to go over what each high availability option is, but instead I'll talk about how the
transaction log is involved with each option.
Database Mirroring - When a DML statement is executed against the primary database,
mirroring needs to create this exact same statement on the mirrored database as quickly as
possible. This is done by sending a continuous stream of active transaction log records to the
mirror server, which applies these logs to the mirrored database, in sequence, as quickly as
possible. Starting in SQL Server 2008, the transaction log record gets compressed before
sending it over to the mirrored server to help reduce latency across the network.
Another consideration when choosing between synchronous and asynchronous mirroring is that
when mirroring is asynchronous, transactions commit without waiting for the mirror server
to write the log to disk, which maximizes performance. In synchronous mirroring, the
transaction is committed on both partners, but that adds network latency.
One thing to take into account with database mirroring is that when you pause mirroring, the
principal database still accumulates records in the transaction log, and that log file cannot be
truncated. Therefore, if database mirroring stays in the paused state for a long period of
time, it can cause the log to fill up.
Transactional Replication - Transactional replication includes 3 main parts: SQL Server
Snapshot Agent, Log Reader Agent, and Distribution Agent.
During transactional replication, the Log Reader Agent monitors the transaction log of the
replicated database and copies the transactions marked for replication into the distribution
database (a database created when replication is configured). The Distribution Agent then
comes along and copies the transactions from the distribution database to the Subscriber
(the replicated database).
The Log Reader Agent runs at the Distributor and when executed it first reads the publication
transaction log looking for any DML statements. Next, the agent copies those transactions
marked for replication to the distribution database. The Distribution Agent then moves the
transactions to the Subscriber as described above. Only committed transactions are sent to the
distribution database.
Unlike mirroring, which works at the physical level, replication works at the logical level.
Log Shipping - There's really not too much involved in log shipping. You can think of it like you
think of Backup/Restore. Log shipping basically consists of backing up a transaction log,
shipping (moving) it to another server, and restoring the transaction log there. The log can also
be shipped to multiple secondary servers if needed.

Why does the transaction log grow?


As we've already discussed, the transaction log records all database modifications. If you have
a busy system, this alone can cause the transaction log to grow. If Autogrow is not enabled,
once the log file hits the specified maximum size, SQL Server will throw an error on every
transaction that hits the database until the problem is fixed. I always recommend turning on
Autogrow, but keep an eye on your transaction log file. Also, specifying a small autogrowth
increment on your log file can reduce performance. The file growth increment on a log file
should be large enough to avoid frequent expansion. The default is 10 percent, and this may be
sufficient for your environment. I would start at 10 percent, but keep an eye on the log to see
how often it grows and adjust accordingly. Using a fixed increment instead of a percentage may
work better for your environment. There really isn't one setting for all databases, so you need
to monitor log usage to figure out the best setting for each database.
To enable Autogrow in SSMS, right click the database and select properties. Under the
Database properties, choose the Files tab and adjust Autogrowth / Maxsize:
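The same setting can be changed with T-SQL. A sketch, assuming the AdventureWorks2012 log file has the logical name AdventureWorks2012_log (check sys.master_files for your actual file name):

```sql
-- Grow the log in fixed 512 MB increments instead of a percentage
ALTER DATABASE AdventureWorks2012
MODIFY FILE (NAME = 'AdventureWorks2012_log', FILEGROWTH = 512MB);
```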

Other factors that will cause the log file to grow may include the following:

Uncommitted transactions
Index Operations - CREATE INDEX, rebuild indexes, etc.
Un-replicated transactions
Long running transactions
Incorrect recovery model settings (we'll discuss this later)
Large transactions
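To find out which of these is currently preventing log truncation, sys.databases exposes a log_reuse_wait_desc column, and DBCC OPENTRAN reports the oldest active transaction:

```sql
-- What is each database's log waiting on before it can be reused?
-- (e.g. LOG_BACKUP, ACTIVE_TRANSACTION, REPLICATION, NOTHING)
SELECT name, log_reuse_wait_desc
FROM sys.databases;

-- Oldest active transaction in the current database, if any
DBCC OPENTRAN;
```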

What is a Virtual Log File (VLF)?


Each physical transaction log file is divided internally into numerous virtual log files, or VLFs.
Virtual log files have no fixed size, nor can you specify how many VLFs a physical log file
contains. The Database Engine manages this for us, and for performance reasons it tries to
maintain a small number of virtual files.
System performance suffers when the log file is defined with small size and growth_increment
values: if the log file grows to a large size through many small increments, it ends up with a
large number of virtual log files. This is why it's a good idea to set Autogrow to a larger
increment. If the log is set to grow 1 MB at a time, it may grow continuously, resulting in more
and more virtual log files. An increased number of VLFs can slow down database startup and
log backup/restore operations.
There's no right or wrong number of VLFs per database, but remember: the more you have,
the worse your performance may be. You can use DBCC LOGINFO to check the number of
VLFs in your database.
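For example, DBCC LOGINFO returns one row per VLF, so the row count is the VLF count. On SQL Server 2016 SP2 and later, the same information is exposed as a queryable function, sys.dm_db_log_info:

```sql
-- One row per VLF in the current database; the row count is the VLF count
DBCC LOGINFO;

-- SQL Server 2016 SP2+: count VLFs with a simple query instead
SELECT COUNT(*) AS vlf_count
FROM sys.dm_db_log_info(DB_ID());
```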
Andy Novick wrote an excellent tip on performance issues with a large number of VLFs here.

How to monitor transaction log growth?


Monitoring the log file is very important and SQL Server has made it fairly easy for us to do this.
One way to find information about the log is in the catalog view sys.database_files. This view
returns information about data and log files that include type of file, name, location, state, size,
growth, etc. The following query will filter down to only the log file and displays some very
useful information:
SELECT name AS [File Name],
       physical_name AS [Physical Name],
       size/128.0 AS [Total Size in MB],
       size/128.0 - CAST(FILEPROPERTY(name, 'SpaceUsed') AS int)/128.0 AS [Available Space In MB],
       [growth], [file_id]
FROM sys.database_files
WHERE type_desc = 'LOG'

You can also use DBCC SQLPERF (logspace) which has been around for a while. This command
displays useful details such as DB name, Log Size (MB) and Log Space Used (%):
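The command takes no arguments beyond the option name and returns one row per database on the instance:

```sql
DBCC SQLPERF (LOGSPACE);
```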

The result set would look like the following. As you can see, there is a lot of great information
returned when using LABELONLY (that is, RESTORE LABELONLY, which reads the header of a
backup device or file):

ColumnName            Value
MediaName             NULL
MediaSetId            8825ADE0-2C83-45BD-994C-7469A5DFF124
FamilyCount           1
FamilySequenceNumber  1
MediaFamilyId         8A6648F8-0000-0000-0000-000000000000
MediaSequenceNumber   1
MediaLabelPresent     0
MediaDescription      NULL
SoftwareName          Microsoft SQL Server
SoftwareVendorId      4608
MediaDate             02:37.0
MirrorCount           1

How to shrink the transaction log


One thing that I see a lot of administrators ask about is transaction log size and how to truncate
the log. Log records that are not managed correctly will eventually fill up the disk, preventing
any more modifications to the database. Transaction log growth can occur for a few different
reasons: long running transactions, incorrect recovery model configuration, and a lack of log
backups can all grow the log.
Log truncation frees up space in the log file so the transaction log can reuse it. Unless there is
some kind of unexpected delay, log truncation will occur automatically after a checkpoint (if the
database is in the SIMPLE recovery model) or after a log backup (if the database is in the FULL
or BULK_LOGGED recovery model). MSSQLTips.com offers plenty of tips regarding transaction
log truncation, but I'll show you two ways to shrink the log.

Shrink the log in SQL Server Management Studio


To shrink the log in SSMS, right click the database, choose Tasks, Shrink, Files:

On the Shrink File window, change the File Type to Log. You can also choose to either release
unused space, reorganize pages before releasing unused space, or empty file by migrating the
data to other files in the same filegroup:

Shrink the log using TSQL


If the database is in the SIMPLE recovery model you can use the following statement to shrink
the log file:
DBCC SHRINKFILE (AdventureWorks2012_log, 1)
Replace AdventureWorks2012_log with the logical name of the log file you need to shrink and
change 1 to the number of MB you want the log file shrunk to.
If the database is in the FULL recovery model, you could set it to SIMPLE, run DBCC SHRINKFILE,
and set it back to FULL if you don't care about losing the data in the log (note that this breaks
the log backup chain, so take a new full or log backup afterwards).

ALTER DATABASE AdventureWorks2012


SET RECOVERY SIMPLE
GO
DBCC SHRINKFILE (AdventureWorks2012_log, 1)
GO
ALTER DATABASE AdventureWorks2012
SET RECOVERY FULL

Note: You can find the logical name of the log file by using the following query:
SELECT name FROM sys.master_files WHERE type_desc = 'LOG'
Another option to shrink the log using the FULL recovery model is to backup the log for your
database using the BACKUP LOG statement and then issue the SHRINKFILE command to shrink
the transaction log:
BACKUP LOG AdventureWorks2012 TO BackupDevice
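Putting the two together, a sketch of that sequence (BackupDevice is the placeholder backup device used throughout this tutorial; substitute a disk path such as TO DISK = 'X:\backups\aw.trn' if you have no device defined):

```sql
-- Back up the log so the inactive portion can be truncated...
BACKUP LOG AdventureWorks2012 TO BackupDevice;
GO
-- ...then shrink the physical log file down toward 1 MB
DBCC SHRINKFILE (AdventureWorks2012_log, 1);
GO
```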

How large should the transaction log be?


This is a question I've seen come up more and more throughout my years as a DBA. How large
should the transaction log be? Well... it depends.
For normal day-to-day operations, I would recommend starting at 25% of the data file size. So if
your data file is 20GB, make the log 5GB. Keep an eye on the log to see if it continuously grows
beyond 5GB. As we discussed in the VLF section, every time your log grows you add more
VLFs, which can cause performance problems. 25% is the recommended starting point for
normal operations; however, if you're doing index rebuilds every night or if you're using high
availability options such as replication or mirroring, you'll need enough log space to hold the
extra transactions that may occur.
25% is just a recommended value to start with. Realistically, you should use a test server to
examine how much log space is used at average and peak times, and adjust your production
server leaving a little extra space to spare.

Do multiple transaction logs help performance?


Multiple transaction log files absolutely do NOT help performance. Multiple data files,
however, can help, so I think people assume that since multiple data files can be beneficial,
multiple log files are too. Not true.
The transaction log is sequential, so SQL Server doesn't perform parallel I/Os against the
transaction log if there are multiple files. If your system does have multiple log files, the first
file is used in its entirety, then the second file is used, and so on.
The only time a second transaction log file might be needed is if the first transaction log is full
and the transaction log won't clear; it will then flip over to the second log file, allowing
database modifications until the first log file can clear.

What are the recovery models and how does the log use them?
I mentioned recovery models a few times in the sections above, so I'll give a brief overview of
each one.
Full - In the Full recovery model, log files should be backed up on a regular basis to prevent the
log from filling up the drive. Also, in the Full recovery model, you can restore a database to the
point of failure, depending on your log backup schedule. This is the default for the model
database in SQL Server. If you leave this recovery model set to Full, make sure you schedule log
backups.
Simple - In the Simple recovery model, transaction log backups cannot be used. This means
that SQL Server will automatically reclaim log space at certain intervals, which can be good,
but it also means that if a failure were to occur you can only restore back to the last full backup,
since all transactions in between are lost. This is generally used in development environments.
Bulk Logged - In the Bulk Logged recovery model, certain large-scale or bulk copy operations
are minimally logged. I have never left a database in the Bulk Logged recovery model for a long
period of time. I usually keep a database in the Full recovery model, and if I'm doing a bulk
insert and don't need to fully log the transactions, I'll switch to Bulk Logged and then back to
Full once it is complete. Bulk Logged does support log backups, although point-in-time recovery
is not possible from a log backup that contains bulk-logged operations.
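You can check which recovery model each database is currently using with a simple query against sys.databases:

```sql
SELECT name, recovery_model_desc
FROM sys.databases;
```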
Set Recovery Model in SQL Server Management Studio
To change the recovery model in SSMS, right click the database and choose Properties. On the
Database Properties window, select Options, Recovery Model:

Set Recovery Model using TSQL


Use [master]
GO
ALTER DATABASE [AdventureWorks2012] SET RECOVERY SIMPLE WITH NO_WAIT
GO
Replace AdventureWorks2012 with your database name and use SIMPLE, FULL, or
BULK_LOGGED for the recovery model.

How to recover data using a transaction log backup?


When using the Full or Bulk Logged recovery models, you can restore to a point in time,
depending on how often you create transaction log backups.
Let's say you create a full backup every night at 12:00am and transaction log backups every
hour. Let's now say that the server crashes at 3:30am, you get an alert, and the only way to
bring this server back is to restore the data. If you restore the 12:00am full backup, then the
transaction log backups in order from 1:00am, 2:00am, and 3:00am, you will have data current
as of 3:00am, losing only 30 minutes of data. Of course, you can set transaction log backups
to occur every 5 minutes, which limits your data loss to 5 minutes at most. The optimal interval
depends on factors such as the cost of data loss, the size of the database, the number of
transactions, and the workload of the server.
Also, when you are restoring transaction logs, you have to restore WITH NORECOVERY until the
last log. When restoring the last log, restore WITH RECOVERY.
It's not possible to restore a transaction log backup if:

A transaction log backup in the sequence is damaged or missing. Transaction log backups
have to be restored in sequential order, so you can only restore up to the point of the
damaged or missing log backup.
You restored the database WITH RECOVERY before you finished restoring all the
transaction logs needed. If this happens, you need to start over from the beginning with
the full backup.

The following example starts by restoring a full backup followed by two transaction log
backups. Note the use of NORECOVERY throughout, except on the last restore.
RESTORE DATABASE AdventureWorks2012
FROM BackupDevice
WITH NORECOVERY
GO
RESTORE LOG AdventureWorks2012
FROM BackupDevice
WITH FILE = 1, NORECOVERY
GO

RESTORE LOG AdventureWorks2012
FROM BackupDevice
WITH FILE = 2, RECOVERY
GO
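To stop at an exact moment rather than at the end of a log backup, the STOPAT option can be added to the final restore. A sketch using the 3:30am crash scenario above (the timestamp is hypothetical; substitute the moment just before your failure):

```sql
-- Restore the last log only up to a specific point in time
RESTORE LOG AdventureWorks2012
FROM BackupDevice
WITH FILE = 2, RECOVERY,
     STOPAT = '2024-01-15 03:25:00'  -- hypothetical timestamp just before the failure
GO
```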
