Tuesday, July 7, 2015

Understanding transaction properties with Microsoft SQL Server [Discussion on Transactions - I]

I had a chance to discuss an old topic with few of my colleagues, it was on transactions and isolation levels. Although it is old, well known to experienced engineers, it is something new and something important to know for many. Therefore, thought to make some notes on the discussion, here is the first one.

What is a transaction? Simplest way to understand it is, consider multiple operations to be done as a single operation. In database management system world, it is a unit of work that will contain one or more operations on querying, modifying data and modifying definition of the data. With SQL Server, we implement explicit transactions using BEGIN TRAN, COMMIT TRAN and ROLLBACK TRAN. In addition to that, SQL Server supports implicit transaction and auto-commit transactions.

Transactions have four properties. Here are details we discussed on them aligning with SQL Server.

Atomicity: Transaction is considered as an atomic unit of work. As explained above, it includes one or more operations. However transaction completion is not just completing one operation, it should either complete all operations involved or none of them should be completed. SQL Server automatically rolls back transactions that have errors if it runs with auto-commit mode and we need to roll back if explicit or implicit transactions are used. If the system fails before the completion of the transaction, upon restart the service, SQL Server undoes all incomplete transactions during Recovery Process.

Consistency: This refers the state of the data (or database). When a transaction needs to be started on data, data needs to be in a consistence state in order to access. In SQL Server, if data is being modified, with default behavior, the state of the data is not consistence and transaction has to wait, unless different isolation level is used. SQL Server allows to start the transaction once data is consistence and transaction goes through all integrity rules set and completes bringing data into another consistence state.

Isolation: This refers controlling data access as transaction expects. Basically, if transaction expects to get exclusive access even for data reading, this property makes sure that data is in required consistency level for accessing. Once allowed, if other concurrent transactions require same data, access is allowed based on the way data is being accessed by the first transaction. SQL Server offers two methods for handling isolation: locking and versioning. Both locking and versioning are available in on-premises SQL Server installation and default is locking. Locking makes sure that transaction cannot read data if it is inconsistence. Versioning is bit different, if data is inconsistence, it allows to read the previous consistence state of data. This is the default for Azure SQL Databases. This can be controlled by implementing different isolation levels with transactions.

Durability: Once the transaction is committed, it is considered as durable even with a disaster. SQL Server uses write-ahead mechanism for handling this, it makes sure committed transactions are written to the transaction log before writing it to data files. If something happens before writing committed data to the data file, still data can be recovered using info recorded in the log file. Generally, at the restart of SQL Server service, it checks the transaction logs for rolling forward transactions that are written to log but data files and rolling back transaction that are incomplete.

No comments: