SQL Transactions and ACID
Transaction is a sequence of one or more SQL operations treated as a logical unit. Transaction could be completed either entirely
successfully or not executed at all. Transactions and locking are important aspects of database for supporting multiple users.
Into this tutorial we’ll take a look atTransaction
, ACID
, Isolation Levels
.
ACID
Transaction must have the following properties called ACID
, which means the following:
- Atomicity: means that transaction must happen as a single unit and exactly once. SQL statements contained in a transaction
make up a single unit of work. And if one part of the transaction fails, then the entire transaction fails, and the database
state is left unchanged.
E.g. when we transfer money from one account to another it’s not acceptable to debit one account and do not credit the second account because of any errors. If any error occurs, money should be returned to the first(credit) account. - Consistency: means that at the end of transaction the database should be left in consistent state.
E.g. after transferring $10 from one account to another the funds in the first account should be decreased by $10 and funds in the second account should be increased by $10; - Isolation: means that each transaction must be independent of all other transaction, no matter how many other transactions are there in the system. In other words, DB should guarantee that result of execution parallel transactions should be the same as if transactions were executed serially, one after another. Isolation is typically defined at database level as a property that describes how/when the changes made by one operation become visible to other. Relational Databases support the following isolation levels:
- Read uncommitted
- Read committed
- Repeatable read
- Serializable
- Durability: means that once a transaction has been committed, the database changes are preserved, even if the machine on
which the database software is running crashes later. This is achieved by using a transaction log file.
E.g. Once money has been successfully transferred between accounts, it must stay transferred, even if the power fails and the machine with the database has an uncontrolled power down.
How to start transaction
We took a look at such transaction properties as ACID
. Now let’s take a look at how to create, commit and rollback transactions
in PostgreSQL
.
To start transaction, BEGIN
key word should be used. COMMIT
is used for saving the result of one’s work. ROLLBACK
is used
for aborting transaction without saving the result of one’s work. Here’s a snippet of code with statement which is wrapped into
transaction:
BEGIN
CREATE TABLE vehicle
(
vehicle_id SERIAL NOT NULL,
make VARCHAR(100) NOT NULL,
model VARCHAR(100) NOT NULL,
price NUMERIC(5) NOT NULL
);
INSERT INTO vehicle (make, model, price)
VALUES ('Toyota', 'Camry', 25560);
COMMIT;
Here’s an example with rolled back transaction:
BEGIN
INSERT INTO vehicle (make, model, price)
VALUES ('Audi', 'A4', 42825);
INSERT INTO vehicle (make, model, price)
VALUES ('Acura', 'ILX', 30840);
INSERT INTO vehicle (make, model, price)
VALUES ('BMW', '5 Series ', 67470);
ROLLBACK;
If we select the data from the table we’ll see that records for Audi
, Acura
and MBW
were not inserted into the vehicle
table.
Isolation levels
Isolation determines how transaction integrity is visible to other users and systems. As was mentioned early, there 4 isolation levels defined in SQL standard:
Weak isolation levels
- Read committed
- Read uncommitted
- Repeatable Read
Have lower overhead and consistency guarantee but higher concurrency.
Strict isolation levels
- Serializable
Has a lot of overhead and low concurrency.
PostgreSQL
supports only Read Committed
and Serializable
isolation levels because
this is the only sensible way to map the standard isolation levels to the multiversion concurrency control architecture.
Read uncommitted Isolation Level
Read uncommitted isolation level is the lowest isolation level. It guarantees only the absence of lost updates. In this level
transactions can see changes to data made by other transactions that are not yet committed. During SELECT
query execution we can
retrieve data which actually doesn’t exists because transaction was rolled back.
Read Committed isolation level
Read Committed isolation level is used by default in many relational databases. Into this isolation level SELECT
query may
see only the data committed before the SELECT
query begun. If any uncommitted data or committed changes occur during query
execution, SELECT
statement will never see it. It happens so because SELECT
query sees a snapshot of the database as of the
instant the query begins to run. But SELECT
query may see the changes which were made within its own transaction, even though
they are not yet committed.
UPDATE
and DELETE
queries behave into the same way as SELECT
: they see only the data committed before the UPDATE
and DELETE
queries executed. Because of the such rule it’s possible to get inconsistent data during snapshot.
Repeatable read Isolation Level
Repeatable read means that a transaction re-reads data it has previously read and finds that data has been modified by another
transaction.
E.g. When a transaction (e.g. named T1
) reads any data this data will be blocked from modification by other transactions
until T1
will be finished.
Serializable Isolation Level
Serializable isolation level is the strictest transaction isolation level. Into this isolation level transactions are
completely isolated from each other and each transaction executes one after another, serially, rather than concurrently (into the
way as if other transactions are not exists). In case when application uses Serializable
isolation level must be ready for
retrying transactions due to serialization failures.
Incorrectly chosen level of isolation may impact significantly on database performance or may lead to the inconsistency of data
during selection or update. To decrease the impact of isolation level on performance SQL standard defines isolation levels in
terms of undesirable phenomena
which may happen in multiuser database. Here’s a list of major problems:
- Dirty read: transaction can read data which was written by another transaction even if another transaction is not committed
yet,
Dirty read
permits other transactions to see data that has not yet been committed to the database. This means they can see changes that are later discarded, because of theROLLBACK
command. - Non-repeatable read: is very similar to the
Dirty Read
but is more restrictively defined. When rereading data, a transaction can find that the data has been modified by another transaction that has just committed. The same query executed twice can return different values for the same rows. - Phantom read: is similar to non-repeatable read, but it is related to new data, created by another transaction. The same query executed twice can return different numbers of records.
Isolation levels vs READ phenomena
Change Isolation level
READ COMMITTED
isolation level is set by default in PostgreSQL
. If need it could be changed to Serializable
isolation level:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
Additional reading resources
- Transaction Isolation (official PostgreSQL documentation)
- Isolation (database systems)
- Introduction to Databases Videos from MOOC available here