mrjgreen / db-sync
Efficient MySQL comparison and sync utility
Installs: 37 314
Dependents: 0
Suggesters: 0
Security: 0
Stars: 292
Watchers: 36
Forks: 90
Open Issues: 11
Requires
- mrjgreen/database: ^2.5
- psr/log: ^1.0
- ramsey/array_column: ^1.1
- symfony/console: ^3.2
- symfony/debug: ^3.2
Requires (Dev)
- phpunit/phpunit: ^5.0
- satooshi/php-coveralls: ^1.0
This package is not auto-updated.
Last update: 2024-11-22 20:03:20 UTC
README
WARNING - This package modifies database tables. Use with extreme caution and back up databases before running.
Always perform a dry run (this is the default action) first before specifying the --execute (-e)
option.
What is it?
DbSync is a tool for efficiently comparing and synchronising two or more remote MySQL database tables.
In order to do this without comparing every byte of data, the tool preforms a checksum (MD5, SHA1, CRC32) over a range of rows on both the source and destination tables, and compares only the hash. If a block is found to have an inconsistency, the tool performs a checksum on each half of the block, recursively (down to a minimum block transfer size), until it finds the inconsistency.
Notes About Deletion
DbSync will only delete rows from the destination that no longer exist on the source when the --delete
option is specified.
Use this option with extreme caution. Always perform a dry run first.
If you use DbSync to synchronise a table which has row deletions on the source without using the --delete
option,
DbSync will find inconsistencies in any block with a deleted row on every run but will not be able to remove the rows from the target.
Installation
Via composer - run the following command in your project directory:
composer require mrjgreen/db-sync
Or use the packaged archive directly
wget https://github.com/mrjgreen/db-sync/raw/v3/db-sync.phar -O db-sync.phar
chmod a+x db-sync.phar
Optionally make the command available globally
sudo mv db-sync.phar /usr/bin/db-sync
Usage:
db-sync [options] <source> <target> <table>
Sync a mysql database table from one host to another using an efficient checksum algorithm to find differences.
Arguments:
source The source host ip to use.
target The target host ip to use.
table The fully qualified database table to sync.
Options:
-b, --block-size=BLOCK-SIZE The maximum block to use for when comparing. [default: 1024]
--charset=CHARSET The charset to use for database connections. [default: "utf8"]
-c, --columns=COLUMNS Columns to sync - all columns not "ignored" will be included....
-C, --comparison=COMPARISON Columns from the list of synced columns to use to create the...
-f, --config-file=CONFIG-FILE A path to a config.ini file from which to read values. [default: "dbsync.ini"]
--delete Remove rows from the target table that do not exist in the source.
-e, --execute Perform the data write on non-matching blocks.
-h, --help Show this usage information.
-H, --hash Specify the hash algorithm used to generate the comparison hash. [default: "md5"]
-i, --ignore-columns=IGNORE-COLUMNS Columns to ignore. Will not be copied or used to create the hash....
-I, --ignore-comparison=IGNORE-COMPARISON Columns to ignore from the hash. Columns will still be copied....
-p, --password[=PASSWORD] The password for the specified user. Will be solicited on the tty if...
-u, --user=USER The name of the user to connect with. [default: "USER"]
-s, --transfer-size=TRANSFER-SIZE The maximum copy size to use for when comparing. [default: 8]
--target.user=TARGET.USER The name of the user to connect to the target host with if different...
--target.table=TARGET.TABLE The name of the table on the target host if different to the source.
--target.password=TARGET.PASSWORD The password for the target host if the target user is specified....
--where=WHERE A where clause to apply to the tables.
-v, --verbose Enable verbose output.
-q, --quiet Disable output, overrides "verbose" option.
Examples
Note - All of these commands will perform a "dry-run" only. To execute the insert/update statement against the target database, you must specify the --execute (-e) option
Example 1
Sync the table web.customers
from one host to another (non-standard port on target):
db-sync --user root --password mypass 127.0.0.1 111.222.3.44:13306 web.customers
Example 2
Sync the table web.customers
from one host to another, deleting rows from the target that no longer exist on the source (using SHA1 hash for comparison):
db-sync --user root --password mypass --hash sha1 --delete 127.0.0.1 111.222.3.44 web.customers
Example 3
Sync the table web.customers
from one host to another using different credentials:
db-sync --user root --password mypass --target.user admin --target.password password 127.0.0.1 111.222.3.44 web.customers:
Example 4
Sync only the email
and name
fields from the table web.customers
:
NB. The primary key will automatically be included in the column set
db-sync --user root --password mypass 127.0.0.1 111.222.3.44 web.customers -c email -c name
Example 5
Sync every column except for the updated_at
fields from the table web.customers
:
db-sync --user root --password mypass 127.0.0.1 111.222.3.44 web.customers -i updated_at
Example 6
Sync every column from the table web.customers
but only use the updated_at
fields when calculating the hash:
Inconsistencies in other fields will not be detected. In the event of a hash inconsistency in fields which are included, the excluded fields will still be copied to the target host.
db-sync --user root --password mypass 127.0.0.1 111.222.3.44 web.customers -C updated_at
Example 7
Sync every column from the table web.customers
and use all fields except for the notes
or info
fields when calculating the hash:
Inconsistencies in excluded fields will not be detected. In the event of a hash inconsistency in fields which are included, the excluded fields will still be copied to the target host.
This is especially useful for tables with long text fields that don't change after initial insert, or which are associated with an
on update CURRENT_TIMESTAMP
field. For large tables this can offer a big performance boost.
db-sync --user root --password mypass 127.0.0.1 111.222.3.44 web.customers -I notes -I info
Example 8
Sync the table web.customers
to a table under a different name in a different database web_backup.customers_2
:
db-sync --user root --password mypass --target.table web_backup.customers_2 127.0.0.1 111.222.3.44 web.customers
Example 9
Sync the table web.customers
only including active records and delete records in the target table, that are no longer active:
db-sync --user root --password mypass 127.0.0.1 111.222.3.44:13306 web.customers --delete --where="active = 1"
Config File
To avoid having to specify options repeatedly, and to avoid exposing your password on the tty you can specify a config file.
By default DbSync will look for a file named dbsync.ini
in the current working directory.
Example:
user=root password=mypass target.user=admin target.password=myadminpass
Use library within project (non-commandline)
You can include the library within your project and use the components directly:
use \DbSync\DbSync; use \DbSync\Transfer\Transfer; use \DbSync\Hash\ShaHash; use \DbSync\Table; use \DbSync\ColumnConfiguration; $sync = new DbSync(new Transfer(new ShaHash(), $blockSize, $transferSize)); $sync->setLogger(new YourPsrLogger()); $sync->dryRun(false); $sync->delete(true); $sourceTable = new Table($sourceConnection, $sourceDb, $sourceTable); $targetTable = new Table($targetConnection, $targetDb, $targetTable); // if you only want specific columns $columnConfig = new ColumnConfiguration($syncColumns, $ignoreColumns); // if you only want to use specific columns for the comparison $compareConfig = new ColumnConfiguration($compareColumns, $ignoreCompareColumns); // optionally apply a where clause - this can be useful when sync-ing large tables, where // you can make use of a column to rule out large portions of the data // that you know haven't changed, such as columns with "on update CURRENT_TIMESTAMP" etc.. $sourceTable->setWhereClause(new WhereClause("column_name = ?", ['value'])); $targetTable->setWhereClause(new WhereClause("column_name > ?", ['value'])); $sync->sync($sourceTable, $targetTable, $columnConfig, $compareConfig);
Roadmap
- 100% test coverage via full stack integration tests
- Allow option to delete data from target where not present on the source
- Use symfony console command for sync
- Option to re-try with back-off on lock wait timeouts
- Option to create missing tables on target
- Option to skip duplicate key errors
- Speed up initial sync of empty table - Maybe offer combination with other tool for full fast outfile based replacement
Requirements
PHP 5.4 or above PDO MySQL Extension
License
DbSync is licensed under the MIT License - see the LICENSE file for details
Acknowledgments
- Inspiration for this project came from the Percona Tools
pt-table-sync
.