You have seen that relational database requires to divide information into columns and rows – this gives us tables with records as a result. A record in such a table has a given structure. Sometimes there is a need to store related information into multiple tables – there can be many reasons for that:
Records are then connected by foreign keys which contain information about relationships between tables.
Let’s think about a simple scenario based on your database structure – you want to create a form, where the users of your application could fill (in a single step) the information about a person and information about the address of this person. This is quite helpful because users do not need to create the address record beforehand.
In your database schema, you have the table person
and the table location
. After the examination of columns in these two
tables you find out that the table person
has the column id_address
which also has a defined
foreign key pointing into the table location
.
This means that in the person
table you can define a link for each record to the location
table using the value from
the column id_location
of the table location
. This also means that multiple persons can share one address (imagine that
they live in one household as a family).
Because te person links to the address, you have to insert information about the address first. Then somehow find the ID assigned to that new row (remember that it is auto-generated), store the ID in local variable of the script. Then you can insert the the row with person information which will include the ID of the new address. The only problem is how to find the auto-generated ID.
If you work with the database alone (when you develop an application), the state of the database does not change unless
you do something. In a real world situation, there can be many other users changing the database through your application
at any moment. This basically means that you cannot INSERT
the data and then SELECT
e.g. the highest ID from the table.
The problem is that many other users’ INSERT
commands could be executed between your INSERT
and your SELECT MAX()...
command. You have to ask the database system specifically for ID of
last insertion in in your session. Session is
basically a connection between your application and the SQL server. Session is initiated during the startup of
your application.
For this purpose the database has a mechanism which will return the last ID generated by an auto-increment
sequence requested by your INSERT
in your database session. In Postgre SQL, each auto-increment sequence
has a name – you can find that name in the default value of the id column:
There is a CURRVAL(seq_name)
SQL function which will return the last value of the auto-increment sequence in Postgre SQL and
this function takes as an argument the name of the sequence. Execute both commands at once in the Adminer:
Now you can search for the record with the returned ID – you should find a row with the city
column set to Bucharest
.
Notice that these two queries have to be executed in a single execution and therefore they have to be separated by a semicolon.
In PHP with the PDO library you can use the lastInsertId($sequenceName)
method instead of executing separate SQL SELECT
command:
The lastInsertId($sequenceName)
method picks the correct SQL command depending on the SQL server that you are using.
On PostgreSQL, this will be CURRVAL(seq_name)
. For example On MySQL this will be the LAST_INSERT_ID()
function
which does not take any arguments and you do not have to pass $sequenceName
into lastInsertId()
function in that case.
In the above example, I have used two variables $stmt1
and $stmt2
for the queries. It is not necessary, because
I don’t need $stmt1
after $stmt2
, so you can use only a single variable and overwrite it. In fact, it is
better to reuse the same variable if possible.
You already know that you have to enclose database communication in try-catch
blocks. But what happens when the
first query (insert an address) is accepted and the second one (insert a person) is not? There can be more reasons than
you think for the second INSERT
command to fail:
NULL
value).Any of the above would result in a state in which you have the address in the database and not the person.
The user would try to insert the information again, but there will be an unused address record in the location
table.
Even worse, it might be impossible to insert the address at all if it would violate record uniqueness.
To prevent this and other inconsistent states you want to enclose both queries into a transaction. This will make sure that both queries are either accepted by the database system and both rows are inserted into their tables or no row is inserted at all after the transaction ends.
Either SQL query from my example may raise an exception and the program execution will jump into the catch
block. When this
happens, the database will revert to the previous state thanks to rollback
command. On the other hand, when everything goes
smooth, the changes are store permanently in the database using commit
command.
The database system executes the SQL commands in transaction right away – it does not wait until the commit
command.
The most important moment is when you begin
the transaction, at that moment the database begins to record
your changes and you can revert to that state using rollback
command. Your changes in database are available to you
during the transaction but no one else can see them.
Take the first PHP script above and complete it so that it inserts data into the location
and person
tables. Or create your own with full working forms. Make sure to use the version without transactions. Then try to break
the second INSERT
command (just make an error in the SQL command spelling, e.g. IxSERT
instead of
INSERT
). Observe the changes in the database. Then add the transaction commands (begin
, commit
, rollback
)
and again observe what changes are made to the database. Do you notice the change of behavior?
Can you explain why it changed?
If you run the the script without the transaction commands, the first INSERT
succeeds and the the location
will be inserted into the location
table. It will remain there even if the other insert fails.
When you add the transaction commands, the behavior will change. If the first INSERT fails, and
raises an exception, the script will jump to catch
statement and issue the rollback
command.
This will roll back all SQL commands from the beginning of the transaction – in this case the
insert into the location
table. Therefore the database will not contain the orphaned location record.
In this chapter I demonstrated how to insert multiple records which have some dependence among them. This requires using a last insert id value for the current database session. In the second part I described the importance of using transactions when inserting multiple rows.
PDO::lastInsertId()