SQL Server Logging Essentials
Ever wonder what the log file is really for, and how it works? This conceptual video highlights how the SQL Server Storage Engine takes advantage of write-ahead logging both to ensure data consistency and to optimize system performance. You'll also learn why the log file is so important to data recovery operations, along with best practices for managing log file sizes and keeping your databases operational.
This Video Covers
Log Files, Recovery, Backups, the SQL Server Storage Engine, and Transactions.
Details
Video Length: 14:15
Skill Level: 000 - Overview
Series: SQL Server Backups
Author/Presenter: Michael K. Campbell
Applies to SQL Server: 2000, 2005, and 2008
Tags: Administration, Concepts, Log Files, Transactions, and Storage Engine
Related Resources
Inside Microsoft SQL Server 2005: The Storage Engine
Chapter 5 has great, in-depth information on logging and recovery, covering log-file growth, the checkpoint process, and virtual log files. See pages 92-93 for information about database and log files.
Inside Microsoft SQL Server 2000
Chapter 5 (starting around page 175) provides some great information on database file sizes and growth, as well as on shrinking files. Pages 182+ cover logging mechanics and truncation, as well as the role of virtual log files.
Professional SQL Server 2000 Programming
Pages 30-34 provide a high-level overview of databases and database files, while pages 461-466 go into greater depth on how logging works.
SQL Server Books Online
For additional information about logging, check out relevant topics from Books Online:
SQL Server 2000
Transaction Recovery
Checkpoints and the Active Portion of the Log
Transactions Architecture
SQL Server 2005/2008
Introduction to Transaction Logs
Understanding and Managing Transaction Logs
Microsoft TechNet
Paul S. Randal has a great article outlining logging and recovery basics: Understanding Logging and Recovery in SQL Server.
Transcript
Hello and welcome to another SQL Server Video. My name is Michael Campbell and in this video we’re going to look at a high-level overview of SQL Server’s Storage engine in order to gain an understanding of how SQL Server manages your data – both in memory and on disk.
Doing this will help expose the need for, and benefits of, SQL Server's logging functionality. Once we examine how that contributes to backup options and capabilities, we'll conclude by looking at how SQL Server interacts with and manages log files – something you need a decent understanding of if you want to keep your databases and applications running smoothly, and if you want to maximize your options when it comes to backups and disaster recovery.
If you’ve ever worked in a networked environment where multiple users are trying to work from the same Excel spreadsheet, or another file, you know that things can quickly get ugly – as only one process, or user, can have the file open for ‘write’ operations at a given time.
SQL Server is under this same restriction – but allows multiple users, processes, and applications to all simultaneously read and modify data in what amounts to ‘real time’ by acting as a proxy, or ‘service’ that sits on top of your data files and manages access to them.
In other words, because SQL Server is a service that controls access to underlying data files, it's able to act as a 'middle man' – which not only allows it to effectively manage read requests via caching, but also lets it approve, organize, and orchestrate – or sequence – write operations as they come in.
And it’s this ‘middle man’ functionality that allows SQL Server to manage potentially thousands of simultaneous connections against underlying data files.
To manage all of these concurrent requests, SQL Server employs a very sophisticated caching engine, known as the buffer manager, which
is tasked with pulling data out of files and into SQL Server’s set of working memory.
And to optimize the process of pulling data out of files and into memory – as well as the process of writing data back to disk after it has been modified in memory,
SQL Server organizes data into 8KB ‘slots’ or units of work
known as pages that are ideally suited to being swapped back and forth between machine memory and disk storage. Pages, in turn, are allocated by SQL Server into physically contiguous groups of 8 – known as extents.
As such, this means that ‘under the covers’ in your database, the data in your tables and indexes is allocated into extents, and physically stored on pages.
And, throughout your database a number of specialized pages provide maps, or pointers, to where all other pages and extents are stored.
With these allocation maps, along with some complex hashing and logic within the buffer manager, SQL Server is able to create a highly organized storage environment where it can very quickly and easily retrieve your data –
as well as know where to write that data back to disk if and when it gets modified.
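As a quick illustration of this page-based allocation, the documented sp_spaceused procedure reports space figures in KB, always in multiples of those 8KB pages. A minimal sketch, assuming a hypothetical table named dbo.Orders:

-- Report the space allocated to a (hypothetical) table; 'reserved' and 'data' come back in KB.
EXEC sp_spaceused 'dbo.Orders';

-- Called with no arguments, reports size and unallocated space for the current database.
EXEC sp_spaceused;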
Therefore, as users and applications request data, the pages containing that data are read into the cache, or buffer manager, and used to satisfy client requests, or queries.
The buffer manager will then keep those pages in memory to satisfy future requests, as pulling results back from memory is exponentially faster than pulling them back from disk.
And, as more requests for data are made, more and more data is moved in 8KB blocks into the buffer manager.
Of course, if the amount of requested data stored in the buffer manager exceeds the amount of physical memory available to SQL Server, then some of the pages, or buffers, in the buffer manager will have to be dropped. As such, SQL Server makes use of some powerful, and very intelligent, pruning heuristics that keep the most frequently used bits of your data in memory.
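If you're on SQL Server 2005 or 2008, here's a minimal sketch of how you might peek at what the buffer manager is holding, using the sys.dm_os_buffer_descriptors DMV (not available in SQL Server 2000):

-- Each row in this DMV represents one 8KB page currently in the buffer pool.
SELECT DB_NAME(database_id) AS [database],
    COUNT(*) AS cached_pages,
    COUNT(*) * 8 / 1024 AS cached_mb
FROM sys.dm_os_buffer_descriptors
GROUP BY database_id
ORDER BY cached_pages DESC;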
But what happens when a new record is added, or when an UPDATE statement changes a whole slew of rows or columns?
In these cases, since any page that needs to be changed has already been pulled into memory, it would be ideal if SQL Server could just make the change in memory and write the change out to disk when it gets a chance.
And, in fact, that’s effectively what it does – by changing pages in memory and marking them as ‘dirty’ or modified.
Then, when the buffer manager needs to swap out pages, it can bundle-up changes to individual pages and write them as efficiently as possible.
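Again on 2005/2008 only, that same DMV exposes an is_modified flag, so a rough sketch for counting dirty pages per database might look like this:

-- Count pages that have been modified in memory but not yet written to disk.
SELECT DB_NAME(database_id) AS [database],
    COUNT(*) AS dirty_pages
FROM sys.dm_os_buffer_descriptors
WHERE is_modified = 1
GROUP BY database_id;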
In other words, let’s say that within the space of… 15 seconds there are 27 changes to a single table made by 5 different users. And let’s say that those changes are spread out over 10 pages.
If SQL Server just made those changes sequentially, it would have to make 27 writes to disk – one right after the other.
But, since those changes are first handled in memory, SQL Server can just wait a bit and then write out those dirty pages – resulting in 10 writes rather than 27.
But what happens if SQL Server is in the middle of ‘waiting’ to write data and crashes? What happens to those changes?
That, and how does SQL Server know when it's a good time to write changes?
To protect against system crashes, as well as to account for potential errors during update operations, and the need to explicitly execute a ROLLBACK operation, SQL Server makes use of write-ahead logging functionality that ensures that operations are completed in order, and that they can be rolled back when and if needed. Write-ahead logging also allows SQL Server to recover in the case of a system crash, or when SQL Server is powered down before changes to dirty pages can be written to disk.
To account for these needs, SQL Server records the value of the changed data BEFORE an operation starts, and what the value of that data should be AFTER the operation completes. Each of these change records constitutes a log entry, and is written to the SQL Server Log file in sequence – where each entry is given a sequential log sequence number, or LSN, that makes it easy for SQL Server to reconstruct the order of operations if necessary.
And since this log entry is created BEFORE each change is made, SQL Server can 'flag' dirty pages in memory with the LSN of the transaction making the change – which provides a great way for SQL Server to keep track of which operations are changing what data.
Then, as transactions complete and are committed, the buffer manager is able to write out changes to disk – so long as the LSN for each dirty page it's writing points to a transaction that has successfully completed. In this way, changes to underlying database data can be handled asynchronously by the storage engine to achieve the best performance, while the order of operations is still faithfully preserved in the transaction log.
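If you're curious, you can peek at these log records and their LSNs via the undocumented (and unsupported) fn_dblog function; the 2005/2008 syntax is shown here, purely for exploration:

-- Inspect a few records from the active portion of the log, in LSN order.
-- fn_dblog is undocumented; don't build production code on it.
SELECT TOP 10 [Current LSN], Operation, [Transaction ID]
FROM fn_dblog(NULL, NULL); -- NULL, NULL = the entire active log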
Therefore, in the case of a crash, or even when the server is just shut down, log file records have already been written to disk, and can be used to reconstruct or ‘recover’ the true state that the database should be in. In this way, each time SQL Server is started up, or each time a database is restored and then ‘recovered’, SQL Server can go through and make sure that everything in the data files ‘jives’ with how the log files say it should be.
Now, to keep the 'gap' between committed transactions and when those changes are written to disk from getting too large, SQL Server implements a checkpoint process that basically amounts to SQL Server periodically asking itself: "If I crashed right now, how long would it take me to run through the recovery process – and roll operations either forward or backward as needed?"
If SQL Server thinks that it will take longer than a minute (which is the default – though you can set this value to virtually whatever you want), it will throw a checkpoint marker into the log file and then write out to disk all of the dirty log and data pages that it can.
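That default is governed by the server-wide 'recovery interval' setting, and you can also force a checkpoint by hand. A minimal sketch:

-- 'recovery interval' is an advanced option, measured in minutes (0 = the roughly one-minute default).
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'recovery interval', 0;
RECONFIGURE;

-- Force a checkpoint in the current database right now.
CHECKPOINT;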
Then, once the checkpoint operation is complete, SQL Server works through a number of 'house-keeping' operations that make sure any current or long-running transactions are all accounted for. Then, since SQL Server no longer needs those log entries to ensure the state of underlying data, it marks non-needed entries as inactive so that they can either be truncated (or deleted) or backed up by a log-file backup operation.
Which brings us to backups and disaster recovery options – because the way you instruct SQL Server to either keep or discard those records has a huge impact both on how your log files grow and on what kind of recovery options you have available when something bad happens. The data stored in these log files provides a complete, play-by-play record of changes made to your database – which opens up a number of great options when it comes to recovering from disasters.
For example, if you’re doing full database backups at the top of every hour, and your server full-on crashes at 45 minutes after the hour, you can typically use the log file to recover everything that happened in the last 45 minutes – instead of just asking end-users to retype everything they did during that same time period.
Likewise, in horrible cases where ‘stan the man’ tacos a key table with an UPDATE query that’s missing a WHERE clause, you can recover a copy of your database to the ‘point in time’ right before the execution of stan’s poorly thought-out query.
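A rough sketch of what that point-in-time restore might look like; the database name, paths, and timestamp are all hypothetical:

-- 1. Back up the tail of the log so nothing after the last log backup is lost.
BACKUP LOG SalesDB TO DISK = 'D:\Backups\SalesDB_tail.trn' WITH NO_TRUNCATE;

-- 2. Restore the last full backup, but leave the database ready to accept log restores.
RESTORE DATABASE SalesDB FROM DISK = 'D:\Backups\SalesDB_full.bak' WITH NORECOVERY;

-- 3. Replay the log, stopping just before stan's UPDATE ran.
RESTORE LOG SalesDB FROM DISK = 'D:\Backups\SalesDB_tail.trn'
    WITH STOPAT = '2009-06-15 14:44:00', RECOVERY;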
There are also a number of great 3rd party tools out there that can read the insanely complex information stored in the log file, and can therefore let you ‘unplay’ and ‘replay’ transactions of your choosing in order to recover from ‘stupid user’ errors. And, while these solutions can cost a few pesos, they can totally save your bacon and are therefore worth their weight in gold.
And, since the log file represents a faithful record of changes, it can also be used in log-shipping and replication scenarios that make it possible to synchronize copies of your data or databases in different locations and on different servers if desired.
Therefore, given how valuable these transaction records can be, you’ll want to take care in what you do with them – and you’ll also need to balance that need with storage considerations.
And, in terms of storage, SQL Server treats all of the log space allocated to a given database as a location to store virtual log files, or VLFs. In this way, transaction logs can span multiple drives if desired while simultaneously handling the need to create new virtual log files as needed.
And as this overly-simplified schematic shows, SQL Server is able to allocate new virtual log files to house more and more records as processing proceeds.
Then as each checkpoint occurs, and data files are synchronized with transaction records, the data in reconciled virtual log files is no longer needed for performance and atomicity requirements and can be ‘truncated’ to make room for new virtual log files that ‘wrap-around’ and take up freed-up space.
And, as long as the number of transactions stays at a maintainable rate, and as long as long-running transactions don’t prevent truncation, this wrap-around process will be able to continue – meaning that your database will be able to run without needing more log space.
And, if SQL Server needs more physical log space to allocate the creation of new VLFs, it will see if it's permitted to grow the log based on individual database settings that control maximum size and growth options. If it's allowed to grow, SQL Server will attempt to increase the physical file size – and as long as there's sufficient disk space available, everything will continue as needed.
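Those size and growth options live on the log file itself and might be adjusted with something like the following (the database name, logical file name, and sizes are hypothetical):

-- Pre-size the log and grow it in fixed increments rather than by percentage,
-- which helps keep the number of VLFs predictable.
ALTER DATABASE SalesDB
MODIFY FILE (NAME = SalesDB_log, SIZE = 1024MB, MAXSIZE = 8192MB, FILEGROWTH = 512MB);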
But if it’s not able to grow, then SQL Server won’t be able to allocate new VLFs, and a long-running transaction, or high volume of changes will cause the log file to run out of space – at which point SQL Server is no longer able to modify data and you’ll start getting 9002 errors stating that your log file is full. Which will, effectively, bring your database to a screeching halt.
Therefore, making sure that log files are properly sized, can grow, and are able to free up non-active entries is a key component of keeping your databases running properly.
And the way that you 'dispose' of, or free up, non-active transactions from the log file really helps determine what kind of disaster recovery options you have at your disposal.
In some cases, simply truncating non-active transactions from the log file can make a lot of sense. But when it comes to recovering from disaster scenarios, you're limited to a set of 'simple' options because you're using what is known as the 'simple' recovery model. And, when you're using this recovery model, you can't take advantage of past log file entries – so all you're able to do in disaster recovery scenarios is restore your database to the last full or differential backup made of the entire database, and any changes made since that backup are simply lost.
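Switching a (hypothetical) database to that model is a one-liner, and works on SQL Server 2000 and later:

-- Truncate non-active log entries automatically at each checkpoint.
ALTER DATABASE SalesDB SET RECOVERY SIMPLE;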
On the other hand, if you want non-simple recovery options, you can instruct SQL Server to preserve non-active entries in the log, and let a log file backup process come through and make copies of those records – after which SQL Server will truncate those entries, returning used space back to the log file for the creation of new VLFs. In this way, you can keep a running set of log file backups, say… for 2-3 days, somewhere on your server or on different machines as a way to keep your recovery options open.
And, as you would guess, this approach takes a tiny bit more effort to set up and administer, but when log file backups are done at regular intervals, this approach provides you with much greater protection in the case of disasters – while still doing a great job of keeping the transaction log free to continue the 'wrap around' process that helps maintain a more manageable log file size.
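A minimal sketch of that setup, using the full recovery model plus periodic log backups; the names, paths, and schedule are assumptions, and in practice you'd run the backups from scheduled jobs:

-- Keep non-active log entries around until a log backup copies them out.
ALTER DATABASE SalesDB SET RECOVERY FULL;

-- Nightly, for example: a full database backup.
BACKUP DATABASE SalesDB TO DISK = 'D:\Backups\SalesDB_full.bak';

-- Every 15 minutes, for example: back up the log, which also frees log space for reuse.
BACKUP LOG SalesDB TO DISK = 'D:\Backups\SalesDB_log.trn';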
So, in conclusion, we’ve taken a very high level look at how logging works – and how SQL Server’s write-ahead logging helps serve both a performance and an accuracy need. We’ve also looked at how that impacts log file growth and maintenance. And, in so doing, we’ve also taken a very brief look at different recovery models that represent different approaches, or options, in terms of managing log files and your recovery options.
Now, if you’d like to learn more about these, and other options, make sure to check out this video’s companion videos that cover backup options in more detail, as well as backup best practices.
Then, to put these concepts into practice, make sure to check out the ‘how-to’ videos in this series that will help turn you into a pro when it comes to working with and understanding SQL Server backups.
Comments
need internal storage of logfile
It is very good.
Can you please give an example of what you meant by allocation?
Hey, nice video. Nice work. Appreciable.
Very good and complete explanation of the basic concepts of the topic.