raigu/ordered-lists-sync

Library for synchronizing ordered data with the minimum of insert and delete operations.

v0.3.0 2021-12-03 06:04 UTC

This package is auto-updated.

Last update: 2025-01-11 23:47:21 UTC


README

Latest Stable Version build codecov Scrutinizer Code Quality License: MIT Dependents

Library for synchronizing ordered data with the minimum of insert and delete operations.

Optimized for large datasets. Suitable for keeping in sync internal data with outside source. See demos.

Compatibility

  • PHP 7.4, 8.0, 8.1, 8.2, 8.3

Installations

$ composer require raigu/ordered-lists-sync

Usage

\Raigu\OrderedListsSynchronization\Synchronization compares source and target lists. It detects which elements have been added or removed in source compared to the target and calls corresponding callback.

The source and target must be of type Iterator.

$synchronization = new \Raigu\OrderedListsSynchronization\Synchronization();
$synchronization(
    $source = new ArrayIterator(['A', 'B', 'D']), 
    $target = new ArrayIterator(['B', 'C', 'D']),
    $add = function ($element) { echo "ADD: {$element}\n"; },
    $remove = function ($element) { echo "REMOVE: {$element}\n"; }
);

Output:

ADD: A
REMOVE: C

Use Cases

Sample use cases with demo code:

Design

The \Raigu\OrderedListsSynchronization\Synchronization has only one purpose. Therefore, it is designed so the instance will be a function, no method is exposed. It is an object because it allows extending it in future (for example adding logging).

Using Iterator type for source and target has several advantages.

First, you can create your own iterators as separate components which are easier to develop and test.

Secondly, more complex problems can be solved. Specially if there are memory constraints.
You can make source or target as generator, as an instance implementing Iterate or IteratorAggregate interface or use Sandard PHP Library (SPL) iterators.

Thirdly, it allows creating components which do not do any job in constructor. This is convenient when using dependency injection or declarative programming.

Development

Setting up

$ git clone git@github.com:raigu/ordered-lists-sync.git
$ cd ordered-lists-sync
$ composer install

Testing

$ composer test
$ composer coverage

Algorithm

Algorithm works same way like we use dictionary. If all words are ordered, then we know exactly where some word should be. If we have two dictionaries and start to compare them word by word from start, then we can detect added or removed words.

Example (V denotes current position of the iterator):

        V
Source: A, B, D
        V
Target: B, C, D

A < B => if source is before target, then this means that value is missing in target. If there would be an A in target, then it should be here before B. But it is not. Therefore, it is missing and should be added. Add A and move source cursor:

           V
Source: A, B, D
        V
Target: B, C, D

B = B => if the source and target are equal, then they are sync. Move both cursors.

              V
Source: A, B, D
           V
Target: B, C, D

D > C => if source is after the target, then this means that value has been removed from source. Therefore, remove C and move the cursor forward:

              V
Source: A, B, D
              V
Target: B, C, D

D = D => they are in sync. Move both cursors.