monomelodies / dbmover
PHP-based database versioning tool
Requires
- dariuszp/cli-progress-bar: ^1.0
- monomelodies/kingconf: ^1.0
Requires (Dev)
- monomelodies/gentry: ^0.8
README
A PHP-based database versioning tool.
Installation
Composer (recommended)
composer require monomelodies/dbmover
Manual
- Download or clone the repository;
- There is an executable
dbmover
in the root.
Vendor support
dbMover currently supports the MySQL and PostgreSQL database engines. Support for SQLite is sort-of planned for the near future. If you have access to MSSQL or Oracle (or yet another database) and would like to contribute, you're more than welcome! See the end of this readme.
Design goals
Web applications often work with SQL databases. Programmers will layout such a database in a "schema file", which is essentially just SQL statements. The problem arises when, during the course of development or an application's lifetime, changes to this schema are required. This involves manually applying the changes to all developers' test databases, perhaps a staging database and eventually the production database(s).
Doing this manually is tedious and error-prone. Remembering to write migrations for each change is also tedious, and keeping track of which migrations have already been applied (or not) is error-prone too (real life use case: importing older versions of a particular database to resolve a particular issue, and the migration "registry" itself being out of date).
dbMover automates this task for you by just looking at the central, leading, version controlled schema files and applying any changes required.
Usage
In the root of your project, place a dbmover.json
file. This will contain the
settings for dbMover like connections, databasename(s) and the location of your
schema file(s).
The format is as follows:
{ "dsn": { "user": "yourUserName", "password": "something secret", "schema": ["path/to/schema/file.sql"], "ignore": ["/regex/"] } }
The contents of "dsn"
are a bit driver-specific, but will usually be along
the lines of "engine:host=host,dbname=name,port=1234"
, with one or more being
optional (defaulting to the engine defaults). This is the exact same string that
PHP's PDO
constructor expects.
The file names of the schemas must be either relative to the directory you're running from (recommended, since typically you want to keep those in version control alongside your project's code) or absolute.
Best practice: leave the config file out of source control, e.g. by adding it to
.gitignore
. The database connection credentials will change seldom (if ever) so setting this up should mostly be a one-time manual operation. This way your development database can use a throwaway password, and the production database can use something much stronger, use a host different thanlocalhost
etc.
The optional "ignore"
entry can contain an array of regexes of objects to
ignore during the migration (e.g. when your application automatically creates
tables for cached data or something). The regular expressions are injected into
PHP's preg_match
verbatim so should also contain delimiters and can optionally
specify pattern modifiers like "/i"
. They are checked for all objects (tables,
views, procedures etc.).
After defining the config file, run the executable from that same location:
vendor/bin/dbmover
That's it - your database should now be up to date with your schema(s).
Adding tables
Just add the new table definition to the schema and re-run.
Adding columns
Forgot a column in a database? No problem, just add it in your schema and re-run dbMover.
Note that new columns will always be appended to the end of the table. Some
database drivers (like MySQL) support the BEFORE
keyword, but e.g. PostgreSQL
doesn't and dbMover is as database-agnostic as possible.
If your vendor supports it and you really need to add a column at a certain
position, use an IF
block as described below.
Altering columns
Just change the column definition in the schema file and dbMover will alter it for you. This assumes the column retains the same name and whatever data it contains is compatible with the new type (or can be discarded); for more complex alterations, see below.
Dropping columns
Just remove them from the schema and re-run.
Dropping tables
Just remove them from the schema and re-run.
Foreign key constraints and indexes
Depending on your database vendor, it may be allowed to specify these during
table creation. That would mean dbMover never sees them if the table already
exists! So don't do that. Instead, create these constraints after table
creation using ALTER TABLE
statements. The exception is a primary key on an
AUTOINCREMENT
/SERIAL
field since table creation would fail when the
constraint is missing. However, these columns rarely if ever change, and even if
they do you can manually apply the change using an IF
block with the correct
check on pre/post state.
An example in MySQL:
-- This is *wrong*: CREATE TABLE foo ( id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT, bar INTEGER, INDEX (bar), CONSTRAINT FOREIGN KEY (bar) REFERENCES buzz(id) ) ENGINE='InnoDB'; -- Write it like this instead: CREATE TABLE foo ( id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT, bar INTEGER, ) ENGINE='InnoDB'; ALTER TABLE foo ADD INDEX (bar), ADD CONSTRAINT FOREIGN KEY (bar) REFERENCES buzz(id);
In PostgreSQL:
-- This is *wrong*: CREATE TABLE foo ( id SERIAL PRIMARY KEY, bar INTEGER REFERENCES buzz(id) ); CREATE INDEX foo_bar_idx ON foo(bar); -- Write it like this instead: CREATE TABLE foo ( id SERIAL PRIMARY KEY, bar INTEGER, ); CREATE INDEX foo_bar_idx ON foo(bar); ALTER TABLE foo ADD CONSTRAINT FOREIGN KEY (bar) REFERENCES buzz(id);
The reason is that during the migration, dbMover will DROP
all existing
constraints and recreate them during the migration to ensure only the
constraints intended by your schema exist on the database. This is the most
surefire way to accomplish this, as opposed to analysing your SQL statements,
accounting for vendor-specific syntax etc.etc.etc.
For extremely large tables, recreating indexes might considerably slow down the moving process. C'est la vie.
Loose ALTER
statements
Sometimes you need to ALTER
a table after creation specifically, e.g. when it
has a foreign key referring to a table you need to create later on. For example,
a blog_posts
table might refer to a lastcomment
, while blog_post_comments
in turn refers to a blog_id
on blog_posts
. Here you would first create the
posts table, then the comments table (with its foreign key constraint), and
finally add the constraint to the posts table.
Each ALTER TABLE
statement is run in isolation by dbMover in the order
specified in the schema file, so just (re)add the foreign key where you would
logically add it if running against an empty database. The statement will either
fail silently (if the column doesn't exist or is of the wrong type pending a
migration) or will succeed eventually.
More complex schema changes
Some things are hard(er) to automatically determine, like a table or column
rename. You should wrap these changes in IF
blocks with a condition that will
pass when the migration needs to be done, and will otherwise fail.
Depending on your database vendor, it might be required to wrap these in a
"throwaway" procedure. E.g. MySQL only supports IF
inside a procedure. The
vendor-specific classes in dbMover handle this for you. Throwaway procedures are
prefixed with tmp_
.
Note that the exact syntax of conditionals (ELSE IF
, ELSIF
) is also
vendor-dependent. The exact way to determine whether a table needs renaming is
also vendor-dependent (though in the current version dbMover only supports
ANSI-compatible databases anyway, so you can use INFORMATION_SCHEMA
for this
purpose).
Inserting default data
To prevent duplicate inserts, these should be wrapped in an IF NOT EXISTS ()
condition like so:
IF NOT EXISTS (SELECT 1 FROM mytable WHERE id = 1) THEN INSERT INTO mytable (id, value1, value2, valueN) VALUES (1, 2, 3, 4); END IF;
The order of things
While your schema file should run perfectly when called against an empty database, if the database already contains objects dbMover will reorder the statements as best it can. In particular:
- All statements beginning with
IF
are hoisted. - All
ALTER TABLE
statements are also hoisted. - The above two are run in isolation.
- Indexes and foreign key constraints are dropped.
- All
CREATE TABLE
statements are hoisted and analysed.- If the table should be created, issue that statement verbatim.
- If the table needs updating, issue the required update statements.
- Drop existing views, routines and triggers.
- Run all other statements (
CREATE PROCEDURE
,CREATE VIEW
etc.). - Re-run step 3. Note that
ALTER TABLE
statements will silently fail, and presumably some or all conditions that werefalse
will now evaluate totrue
and vice versa. - Attempt to drop tables that were not or not longer present in the schema.
Transferring data from one table to another
This is sometimes necessary. In these cases, you should use IF
blocks and
query e.g. INFORMATION_SCHEMA
(depending on your vendor) to determine if the
migration has already run.
Important: the
IF
should evaluate tofalse
if the migration has already run to avoid running it twice. Take care here.
A simplified and abstract pseudo example:
IF NOT EXISTS (SELECT 1 FROM INFORMATION_SCHEMA.TABLES WHERE ...) THEN INSERT INTO target SELECT * FROM original; END IF;
Caveats
Be neat
dbMover assumes well-formed SQL, where keywords are written in ALL CAPS. It does not specifically validate your SQL, though any errors will cause the statement to fail and the script to halt (so theoretically they can't do much harm...). dbMover will tell you what error it got.
By "be neat", we mean write CREATE TABLE
instead of create Table
etc.
dbMover also doesn't recognise e.g. MySQL's escaping of reserved words using backticks. Just don't do that, it's evil.
For the
ignore
regexes, you can perfectly use "strange" object names if you need to since these are regexed verbatim.
For hoisting, it is assumed that statements-to-be-hoisted are at the beginning
of lines (i.e., e.g. /^IF /
in regular expression terms).
Databases may or may not be case-sensitive; keep in mind that dbMover is case-sensitive, so just be consistent in your spelling.
Storage engines and collations
dbMover ignores these. The assumption is that modifying these are a risky and very rare operation that you want to do manually and/or monitor more closely anyway.
Test your schema first
Always run dbMover against a test database for an updated schema. Everybody makes typos, you don't want those to mangle a production database. Preferably you'd test against a copy of the actual production database.
Bring down your application during migration
Depending on what you're requesting and how big your dataset is, migrations might take a few minutes. You don't want users editing any data while the schema isn't in a stable state yet!
How your application handles its down state is not up to dbMover. A simple way would be to wrap the dbMover call in a script of your own, e.g.:
touch down vendor/bin/dbmover rm down
...and in your application something like:
<?php if (file_exists('down')) { die("Application is down for maintainance."); } // ...other code...
This is of course an extremely simple example but should point you in the right direction.
Backup your database before migration
If you tested against an actual copy and it worked fine this shouldn't be necessary, but better safe than sorry. You might suffer a power outage during the migration!
Besides, the simple fact that the script runs correctly doesn't necessarily mean it did what you intended. Always verify your data after a migration.
Note on PostgreSQL
PostgreSQL's INFORMATION_SCHEMA
aliases contain more data than you would
define in a schema file, especially for routines (its native functions are also
exposed there). Since these native functions are normally owned by the
postgresql
user, dbMover will try to drop them and just silently fail. So
always run dbMover as an actual database user, not as a master user (this
goes for MySQL as well, although the above problem isn't applicable there).
"Master" or "root" users have privileges normal users don't, so theoretically
you could include something like DROP unrelated_database;
in the schema
file. You'd only have yourself to blame of course, but best to avoid the risk.
Sequences in PostgreSQL
PostgreSQL stores "auto_increment" unique ID's in so called "sequences" (to all intents and purposes, a "special" separate table). To allow migrations to run correctly, you should write these in the following manner:
CREATE SEQUENCE IF NOT EXISTS mytable_id_seq; CREATE TABLE mytable ( id BIGINT NOT NULL PRIMARY KEY DEFAULT NEXTVAL('auth_id_seq'::regclass), -- ...other table info... );
One can also use a shorthand SERIAL
or BIGSERIAL
type that automates this
process, but that doesn't play well with DbMover's analysis and table
modification statements.
Contributing
SQLite support is sort-of planned for the near future, but is not extremely high on my priority list (clients use it occasionally, but it's really not a database suited very well for web development).
MSSQL and Oracle are valid choices, but we don't have access to them. If you do and you fancy porting the database-specific parts of dbMover, by all means fork the repository and send us a pull request!
There's no formal style guide, but look at the existing code and please try to keep your coding style consitent with it. If you work on vendors I can't/won't support, please also make sure you add unit tests for those.
A todo for 0.4 is also to issue warnings if certain statements produced errors both before and after table migration, since that means they probably didn't do what you intended.