keboola / staging-provider
Requires
- php: >=8.2
- ext-json: *
- keboola/input-mapping: *@dev
- keboola/output-mapping: *@dev
- keboola/slicer: *@dev
- keboola/storage-api-client: ^15.1
- keboola/storage-api-php-client-branch-wrapper: ^6.0
Requires (Dev)
- keboola/coding-standard: >=14.0
- phpstan/phpstan: ^1.8
- phpstan/phpstan-phpunit: ^1.1
- phpunit/phpunit: ^9.5
- sempro/phpunit-pretty-print: ^1.4
- symfony/dotenv: ^5.2|^6.0
This package is auto-updated.
Last update: 2024-11-21 15:39:04 UTC
README
Installation
composer require keboola/staging-provider
Usage
The staging provider package helps you properly configure the input/output staging factory for various environments.
A typical use case is setting up a Reader instance to access some data:
use Keboola\InputMapping\Reader;
use Keboola\InputMapping\Staging\StrategyFactory as InputStrategyFactory;
use Keboola\StagingProvider\InputProviderInitializer;
use Keboola\StagingProvider\WorkspaceProviderFactory\ExistingDatabaseWorkspaceProviderFactory;
use Keboola\StorageApi\Client;
use Keboola\StorageApi\Workspaces;
use Keboola\StorageApiBranch\ClientWrapper;
use Psr\Log\NullLogger;

$storageApiClient = new Client(...);
$storageApiClientWrapper = new ClientWrapper($storageApiClient, ...);
$logger = new NullLogger();
$strategyFactory = new InputStrategyFactory($storageApiClientWrapper, $logger, 'json');

$tokenInfo = $storageApiClient->verifyToken();
$dataDir = '/data';

$workspaceProviderFactory = new ExistingDatabaseWorkspaceProviderFactory(
    new Workspaces($storageApiClient),
    'my-workspace', // workspace ID
    'abcd1234' // workspace password
);
$providerInitializer = new InputProviderInitializer($strategyFactory, $workspaceProviderFactory, $dataDir);
$providerInitializer->initializeProviders(
    InputStrategyFactory::WORKSPACE_SNOWFLAKE,
    $tokenInfo
);

// now the $strategyFactory is ready to be used
$reader = new Reader($strategyFactory);
We start by creating the StrategyFactory needed by the reader. The factory itself has no knowledge of which storage should be used with each staging type. That is the job of the provider initializer: it configures the StrategyFactory for a specific type of staging.
To create a provider initializer, we pass it:
- the StrategyFactory to initialize
- a workspace provider factory, used to access workspace information for workspace staging:
  - ExistingWorkspaceProviderFactory in case we want to re-use an existing workspace
  - ComponentWorkspaceProviderFactory in case we want a new workspace to be created (based on a component configuration)
- a data directory path used for local staging
Then we call the initializeProviders method to configure the StrategyFactory for a specific staging type. It is up to the caller to know which staging type to configure:
- when working with components, each component has staging type defined in its configuration
- sandbox has the type deduced from its workspace
- etc.
The example above shows how to use InputProviderInitializer to configure the input mapping StrategyFactory for a Reader. Similarly, we can use OutputProviderInitializer to configure the output mapping StrategyFactory for a Writer.
Internals
The main objective of the library is to configure the StrategyFactory so that it knows which staging provider to use with each kind of storage.
Staging
Generally, there are two kinds of staging:
- local staging, used to store data locally on the filesystem, represented by the LocalStaging class
- workspace staging, used to store data in a workspace, represented by WorkspaceStagingInterface
Provider (staging provider)
The StrategyFactory does not use a staging directly but rather through a provider (ProviderInterface), so there is a provider implementation for each kind:
- LocalStagingProvider
- WorkspaceStagingProvider
The main reason the StrategyFactory does not use the staging directly is to achieve lazy initialization of the staging: the provider instance is created during bootstrap, but the staging instance is only created when it is actually used.
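The lazy initialization described above can be sketched as follows. This is a minimal, self-contained illustration and not the library's actual code: ExampleStaging, ExampleLocalStaging and ExampleLazyProvider are hypothetical names invented for this sketch, and the real ProviderInterface may look different.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-ins for illustration only; not classes from this package.
interface ExampleStaging
{
    public function describe(): string;
}

final class ExampleLocalStaging implements ExampleStaging
{
    public function __construct(private string $path)
    {
    }

    public function describe(): string
    {
        return 'local:' . $this->path;
    }
}

final class ExampleLazyProvider
{
    private ?ExampleStaging $staging = null;
    private int $createdCount = 0;

    // $factory is a closure that builds the staging when it is first needed
    public function __construct(private \Closure $factory)
    {
    }

    public function getStaging(): ExampleStaging
    {
        // The staging is built on first access and reused afterwards.
        if ($this->staging === null) {
            $this->staging = ($this->factory)();
            $this->createdCount++;
        }

        return $this->staging;
    }

    public function getCreatedCount(): int
    {
        return $this->createdCount;
    }
}

// Bootstrap: only the provider exists at this point, no staging yet.
$provider = new ExampleLazyProvider(
    fn(): ExampleStaging => new ExampleLocalStaging('/data'),
);
```

Calling `$provider->getStaging()` for the first time invokes the closure; later calls return the same instance, so an expensive staging (such as a workspace) would never be set up if nothing asks for it.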
Workspace provider factory
Local staging is pretty simple: it contains just the path to the data directory, provided by the caller. Things get a bit more complicated with workspace staging, as the provider may represent either an already existing workspace or a configuration for creating a new workspace. To handle both cases, the caller must provide a WorkspaceProviderFactoryInterface.
Currently there are 2 implementations:
- ExistingWorkspaceProviderFactory, which creates a provider working with an existing workspace
- ComponentWorkspaceProviderFactory, which creates a provider that creates a new workspace based on a component configuration
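The difference between the two kinds of factory can be illustrated with a small self-contained sketch. Every name in it (ExampleWorkspaceProvider, ExampleExistingFactory, ExampleComponentFactory, resolveWorkspaceId) is hypothetical and chosen for this example only; the package's real WorkspaceProviderFactoryInterface and its implementations have their own signatures, which are not reproduced here. The workspace-creating closure stands in for whatever call would actually create a workspace.

```php
<?php

declare(strict_types=1);

// All names below are hypothetical and only illustrate the idea.
interface ExampleWorkspaceProvider
{
    public function getWorkspaceId(): string;
}

interface ExampleWorkspaceProviderFactory
{
    public function getProvider(): ExampleWorkspaceProvider;
}

// Provider backed by a workspace that already exists.
final class ExampleExistingWorkspaceProvider implements ExampleWorkspaceProvider
{
    public function __construct(private string $workspaceId)
    {
    }

    public function getWorkspaceId(): string
    {
        return $this->workspaceId;
    }
}

// Provider that creates a workspace the first time it is asked for one.
final class ExampleNewWorkspaceProvider implements ExampleWorkspaceProvider
{
    private ?string $workspaceId = null;

    // $createWorkspace receives the component ID and returns a new workspace ID
    public function __construct(
        private string $componentId,
        private \Closure $createWorkspace,
    ) {
    }

    public function getWorkspaceId(): string
    {
        // Create the workspace once and remember its ID.
        return $this->workspaceId ??= ($this->createWorkspace)($this->componentId);
    }
}

final class ExampleExistingFactory implements ExampleWorkspaceProviderFactory
{
    public function __construct(private string $workspaceId)
    {
    }

    public function getProvider(): ExampleWorkspaceProvider
    {
        return new ExampleExistingWorkspaceProvider($this->workspaceId);
    }
}

final class ExampleComponentFactory implements ExampleWorkspaceProviderFactory
{
    public function __construct(
        private string $componentId,
        private \Closure $createWorkspace,
    ) {
    }

    public function getProvider(): ExampleWorkspaceProvider
    {
        return new ExampleNewWorkspaceProvider($this->componentId, $this->createWorkspace);
    }
}

// The caller depends only on the factory interface, not on how the workspace is obtained.
function resolveWorkspaceId(ExampleWorkspaceProviderFactory $factory): string
{
    return $factory->getProvider()->getWorkspaceId();
}
```

Because the consumer only sees the factory interface, switching between re-using a workspace and creating a new one is a matter of passing a different factory.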
Development
First, create a .env file from .env.dist:
cp .env.dist .env
# edit .env to set variable values
To run tests, there is a separate service for each PHP version (5.6 to 7.4). For example, to run tests against PHP 5.6, run the following:
docker compose run --rm tests56
To develop locally, use the dev service. The following command installs Composer dependencies:
docker compose run --rm dev composer install
License
MIT licensed, see LICENSE file.