1. Technology

What is a Transaction?

A Quick Overview

By

This article is an overview of transactions in databases. They can also be used as a general principle in non database programming. If you're new to databases, you can read more at In databases, it sometimes happens that multiple tables have to be updated at the same time. For example a payroll payment system may need to update these tables when an employee is paid:
  • Employee record Table - holds a record for each employee.
  • Accounts Table - holds payment details by department
  • Auditing Table - holds all payment details for auditing purposes
The idea is when the employee is paid that his record is updated with the paid amounts, a record is added to the accounts table and one to the auditing table. That way the company can account for the salary payment. It all adds up and is consistent. The problem arises if something goes wrong while writing to these three tables. Maybe a hardware failure or a program bug, say a duplicate data row added or a crash due to hardware. If you work with databases, you will experience a failure at some time or other. You are most likely to get problems during development if a SQL query goes wrong. Say it takes too long to run and you have to kill the process.

Tip

If you working on a database and are a bit wary about doing a change do it inside a transaction. Here is a typical update SQL query.
 update Payroll set Salary = Salary *1.1 -- 10% pay increase
 
This applies to all rows in the Payroll table. If you are the nervous type or just want to be safer, issue a
 begin tran 
then do the update and if it gets the correct results, finish off with a
 commit tran
 
Of course if you somehow forgot the where clause and managed to delete all rows then instead of the commit you should do a rollback tran which is an undo. Everything that has been done since the begin tran is undone.

Why Not Use Transactions for Everything?

Because transactions are expensive in processing time to do. The server has to set aside resources to deal with both success and failure and then deal with one or the other.

With untransacted database updates there is no "undo button" and inconsistencies can arise when one table is updated but others aren't. If the employee record has been changed but the other two haven't then there are going to be inconsistencies and problems sooner or later- all of the money won't be accounted for. So all relational databases support transactions. If all three tables are updated within a transaction then two outcomes are possible.

  1. All three updates succeed. The data is committed.
  2. Something goes wrong. Nothing is committed and the database is rolled back to its state before the update.

Using ACID

There are four principles regarding transactions, known as ACID. Thi sis an abbreviation short for
  • Atomicity
  • Consistency
  • Isolation
  • Durability
Atomicity means that either all of the tasks of a transaction are performed or none of them are. It works or it fails so either all tables are updated or none are.

Consistency means that database was in a legal state when the transaction began and when it ended. If there are constraints on columns then these cannot be violated. These tend to be entity integrity (the absence of duplicate rows), referential integrity (consistency between related tables) and domain integrity (obedience to limits of the data domain. For instance you might have a check on a date of birth age field preventing birthdays say 120 years before or after today so a living person cannot have a date of birth in 1776 or 2070.

Isolation means other operations should not be able to see the transaction half way through. It must be impossible to read the accounts table in a different operation immediately after the employee record has been updated but before the accounts table row is added. No operation outside the transaction can ever see the data in an intermediate state. This is frequently done by the server locking the table or more likely the rows that are being updated.

Durability means that once the transaction has been committed successfully it cannot be undone. This means it will survive system failure, and that the database system has checked the integrity constraints and won't need to abort the transaction. Many databases write all transactions into a log that can be played back to recreate the system state just before it failed. In that case a transaction is only considered committed after it is safely in the log.

Not just Databases

Using the idea of a transaction as a model is quite a good one - you might use it if you are implementing an undo facility. The hardest part is providing a way of backing out partial changes. For example if you are changing a struct or array member then you need to have a copy of that member before you make the change. In C++/C# you could create a transactable class and create derived classes from that. Once a transaction is started, a flag is set in the class and all assignments through accessor functions (C++) or properties (C#) will copy the previous value to a temporary location before the assignment. Eventually a rollback or commit is done and the instance is set accordingly. It might be that the assignment isn't performed until commit.

Atomicity is guaranteed by the commit mechanism. Consistency depends on how you verify the object's state but using accessor/properties with range checking etc goes a long way. Isolation depends on whether you use multi-threading etc and guard the transaction code perhaps with a critical section (on Windows).

Conclusion

That concludes this article. For lots more about SQL and databases in general, we recommend our sister site's coverage of databases and SQL at databases.about.com.

©2014 About.com. All rights reserved.