Make your database migration easier with Liquibase

Applications can manage data using various technologies, and how many different ways can you think of other than using the good old databases? The important thing to understand here is application evolves - and so does its state. The evolution of the application state is not just pertained to the actual data, but also the schema as well. Imagine you shipped a new feature to production on a Friday evening, but a new table that is required for this feature to work is not available in production yet. It's a developer nightmare! Only if there was some way where the deployment of schema changes went hand-in-hand with application code changes.

Enter Liquibase!

What is Liquibase?

Liquibase is a database migration tool, written in Java, that lets you track and deploy changes made to a database schema. The changes could be concerning the schema or the actual data itself. It is a command-line application that operates in a shell. You could interact with Liquibase in 2 ways - manually or automated. The former approach involves yourself executing CLI commands whereas the latter approach involves automating Liquibase execution in your build process - using the CICD pipeline or using a Liquibase plugin for your build tool such as maven. Liquibase is supported by most of the widely used database vendors mentioned here.

What are the core objects of Liquibase?

Now that you know what Liquibase is, let us go a bit deeper to familiarize ourselves with the core objects used by the Liquibase engine to do what it is supposed to be doing - track and deploy schema changes. Liquibase lets the developers specify the database schema changes in a declarative format such as JSON, XML, YAML or even SQL. This file is called a Changelog. This changelog file can be committed to your code repository. This can turn out to be extremely powerful when you want to integrate Liquibase execution into your CICD pipelines. Changelog consists of another core Liquibase object which is called Changeset. Please see a sample changelog in SQL format below.

--changeset rvincent:4
create table authority (authority_id uuid not null, authority_name varchar(255) not null);

--changeset rvincent:5
create table user_authority (user_id uuid not null, authority_id uuid not null);

--changeset rvincent:6
ALTER TABLE login_user DROP COLUMN IF EXISTS role;

Changeset is a unit of change that is executed against your database. This could be your DDL scripts, DML scripts etc. Each changeset is identified by an ID, the author name and the file name in which the changeset is located. Even though you can map multiple schema changes under one changeset, it is best practice to specify one change per changeset, to avoid failed auto-commit that could leave your database in an undesired state. And finally, we have the third core Liquibase object called ChangeType. ChangeType object represents the type of change we want to perform against the database. For eg: there is createTable ChangeType whenever we execute a CREATE TABLE script as part of our changeset. ChangeType object enables the Liquibase to decouple from various database vendors, thereby enabling the same changelog to get applied to multiple database systems and still function as expected. You may find a sample changeset below.

--changeset rvincent:1
create table login_user (user_id uuid not null, email varchar(255), first_name varchar(255), last_name varchar(255), password varchar(255), role varchar(255), primary key (user_id))

Please note that the changeset is identified by the author name and a version, separated by a colon.

How does Liquibase track schema changes?

By now, you should be knowing, at the very least, that Liquibase tracks and deploy schema/data changes to the database. How does it do this? Long answer short - Tracking tables. You are aware by now that you should write changesets in your changelog. But you should also know that each changeset is immutable. If you initially had a changeset that created a table PERSON with columns NAME and AGE, and now you wish to add a new column ADDRESS, instead of modifying the initial changeset, you should be adding a new changeset that adds the column ADDRESS to PERSON table. Else, Liquibase will throw an exception during execution. How would Liquibase know all the schema changes that have been executed previously to throw such an exception? Well, it uses a tracking table called DATABASECHANGELOG. This is one of the 2 kinds of tables employed by Liquibase to track and deploy schema changes. Liquibase would add one record to DATABASECHANGELOG for each change mentioned in the changelog. There is no primary key for DATABASECHANGELOG but each record is uniquely identified by the changeset ID, author name and filename. When we update our changelog by adding new changesets, Liquibase would go through the DATABASECHANGELOG and would know that only the newly added changesets need to be executed. If any of the changesets are modified, Liquibase would know that from DATABASECHANGELOG and would throw an exception accordingly. Since Liquibase keeps track of all changes being made to the database, it is easy to perform a rollback to a previous version.

How does Liquibase deploy schema changes?

I mentioned above that Liquibase uses 2 kinds of tracking tables, and we already know what one table does - DATABASECHANGELOG. The other kind of tracking table used by Liquibase is called DATABASECHANGELOGLOCK. We know Liquibase updates the database state. What if there is another user who is performing a database update against the same record at the same time so that it would override any changes made to the record by Liquibase? Liquibase needs to lock the record to prevent this scenario. It would update LOCKED column in DATABASECHANGELOGLOCK to 1, indicating the record is locked for update. Once the change is made, the column is updated back to 0, effectively releasing the lock. There may come a situation where we have to manually release the lock if Liquibase fails during its execution leaving the record to be in LOCKED state. We could issue the following CLI command to release the lock manually - liquibase releaseLocks.

Please note that the 2 tracking tables will be created by Liquibase in your database during the execution of your changelog for the very first time!

Wrapping up

You can finally see the powerful of Liquibase. Release management in Software Engineering is a tedious process. Liquibase enables the developers to version control the database state and track its changes over time thereby ensuring that software is released in a safer and faster manner, along with database-related changes. The fact that it is version-controlled enables multiple developers to collaborate simultaneously and work on the changes. If you feel excited about this tool, don't wait long before you try this out. The link for downloading the tool can be found here.

Renil Vincent's blog