Transaction is a sequence of one or more SQL operations treated as a logical unit. Transaction could be completed either entirely
successfully or not executed at all. Transactions and locking are important aspects of database for supporting multiple users.
Into this tutorial we’ll take a look at
Transaction must have the following properties called
ACID, which means the following:
- Atomicity: means that transaction must happen as a single unit and exactly once. SQL statements contained in a transaction
make up a single unit of work. And if one part of the transaction fails, then the entire transaction fails, and the database
state is left unchanged.
E.g. when we transfer money from one account to another it’s not acceptable to debit one account and do not credit the second account because of any errors. If any error occurs, money should be returned to the first(credit) account.
- Consistency: means that at the end of transaction the database should be left in consistent state.
E.g. after transferring $10 from one account to another the funds in the first account should be decreased by $10 and funds in the second account should be increased by $10;
- Isolation: means that each transaction must be independent of all other transaction, no matter how many other transactions are there in the system. In other words, DB should guarantee that result of execution parallel transactions should be the same as if transactions were executed serially, one after another. Isolation is typically defined at database level as a property that describes how/when the changes made by one operation become visible to other. Relational Databases support the following isolation levels:
- Read uncommitted
- Read committed
- Repeatable read
- Durability: means that once a transaction has been committed, the database changes are preserved, even if the machine on
which the database software is running crashes later. This is achieved by using a transaction log file.
E.g. Once money has been successfully transferred between accounts, it must stay transferred, even if the power fails and the machine with the database has an uncontrolled power down.
How to start transaction
We took a look at such transaction properties as
ACID. Now let’s take a look at how to create, commit and rollback transactions
To start transaction,
BEGIN key word should be used.
COMMIT is used for saving the result of one’s work.
ROLLBACK is used
for aborting transaction without saving the result of one’s work. Here’s a snippet of code with statement which is wrapped into
BEGIN CREATE TABLE vehicle ( vehicle_id SERIAL NOT NULL, make VARCHAR(100) NOT NULL, model VARCHAR(100) NOT NULL, price NUMERIC(5) NOT NULL ); INSERT INTO vehicle (make, model, price) VALUES ('Toyota', 'Camry', 25560); COMMIT;
Here’s an example with rolled back transaction:
BEGIN INSERT INTO vehicle (make, model, price) VALUES ('Audi', 'A4', 42825); INSERT INTO vehicle (make, model, price) VALUES ('Acura', 'ILX', 30840); INSERT INTO vehicle (make, model, price) VALUES ('BMW', '5 Series ', 67470); ROLLBACK;
If we select the data from the table we’ll see that records for
MBW were not inserted into the
Isolation determines how transaction integrity is visible to other users and systems. As was mentioned early, there 4 isolation levels defined in SQL standard:
Weak isolation levels
- Read committed
- Read uncommitted
- Repeatable Read
Have lower overhead and consistency guarantee but higher concurrency.
Strict isolation levels
Has a lot of overhead and low concurrency.
PostgreSQL supports only
Read Committed and
Serializable isolation levels because
this is the only sensible way to map the standard isolation levels to the multiversion concurrency control architecture.
Read uncommitted Isolation Level
Read uncommitted isolation level is the lowest isolation level. It guarantees only the absence of lost updates. In this level
transactions can see changes to data made by other transactions that are not yet committed. During
SELECT query execution we can
retrieve data which actually doesn’t exists because transaction was rolled back.
Read Committed isolation level
Read Committed isolation level is used by default in many relational databases. Into this isolation level
SELECT query may
see only the data committed before the
SELECT query begun. If any uncommitted data or committed changes occur during query
SELECT statement will never see it. It happens so because
SELECT query sees a snapshot of the database as of the
instant the query begins to run. But
SELECT query may see the changes which were made within its own transaction, even though
they are not yet committed.
DELETE queries behave into the same way as
SELECT: they see only the data committed before the
DELETE queries executed. Because of the such rule it’s possible to get inconsistent data during snapshot.
Repeatable read Isolation Level
Repeatable read means that a transaction re-reads data it has previously read and finds that data has been modified by another
E.g. When a transaction (e.g. named
T1) reads any data this data will be blocked from modification by other transactions
T1 will be finished.
Serializable Isolation Level
Serializable isolation level is the strictest transaction isolation level. Into this isolation level transactions are
completely isolated from each other and each transaction executes one after another, serially, rather than concurrently (into the
way as if other transactions are not exists). In case when application uses
Serializable isolation level must be ready for
retrying transactions due to serialization failures.
Incorrectly chosen level of isolation may impact significantly on database performance or may lead to the inconsistency of data
during selection or update. To decrease the impact of isolation level on performance SQL standard defines isolation levels in
undesirable phenomena which may happen in multiuser database. Here’s a list of major problems:
- Dirty read: transaction can read data which was written by another transaction even if another transaction is not committed
Dirty readpermits other transactions to see data that has not yet been committed to the database. This means they can see changes that are later discarded, because of the
- Non-repeatable read: is very similar to the
Dirty Readbut is more restrictively defined. When rereading data, a transaction can find that the data has been modified by another transaction that has just committed. The same query executed twice can return different values for the same rows.
- Phantom read: is similar to non-repeatable read, but it is related to new data, created by another transaction. The same query executed twice can return different numbers of records.
Isolation levels vs READ phenomena
Change Isolation level
READ COMMITTED isolation level is set by default in
PostgreSQL. If need it could be changed to
Serializable isolation level:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
Additional reading resources
- Transaction Isolation (official PostgreSQL documentation)
- Isolation (database systems)
- Introduction to Databases Videos from MOOC available here