sammyjo20 / laravel-chunkable-jobs
Split a job into multiple separate job chunks
Requires
- php: ^8.1
Requires (Dev)
- friendsofphp/php-cs-fixer: ^3.5
- orchestra/testbench: ^8.0 || ^9.0
- pestphp/pest: ^2.34
- spatie/ray: ^1.33
README
Laravel Chunkable Jobs
This package lets you split a long-running process into multiple queued jobs, each handling its own chunk of the work. It is well suited to processing large amounts of data or to retrieving data from a paginated API. It works by processing one chunk in a job, then queueing another job for the next chunk, until the last chunk has been processed.
Example
<?php

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Illuminate\Support\Facades\Http;
use Sammyjo20\ChunkableJobs\Chunk;
use Sammyjo20\ChunkableJobs\ChunkableJob;

class GetPageOfPokemon extends ChunkableJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function defineChunk(): ?Chunk
    {
        $response = Http::asJson()->get('https://pokeapi.co/api/v2/pokemon');

        $count = $response->json('count'); // 1154

        return new Chunk(totalItems: $count, chunkSize: 1, startingPosition: 1);
    }

    protected function handleChunk(Chunk $chunk): void
    {
        $response = Http::asJson()->get(sprintf('https://pokeapi.co/api/v2/pokemon?limit=%s&offset=%s', $chunk->limit, $chunk->offset));

        $data = $response->json();

        // Store data of response
    }
}
Installation
Install the package through Composer. This package requires PHP 8.1+ and Laravel 8 or higher.
composer require sammyjo20/laravel-chunkable-jobs
Getting Started
Create a new job and remove the handle method from the job. Next, extend the ChunkableJob class. You will be required to add two methods to your class: a defineChunk method and a handleChunk method. In my example, I will be fetching every Pokemon from the Pokemon API and storing it in my application. You should have something like the following.
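<?php

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Sammyjo20\ChunkableJobs\Chunk;
use Sammyjo20\ChunkableJobs\ChunkableJob;

class GetPageOfPokemon extends ChunkableJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function defineChunk(): ?Chunk
    {
        //
    }

    protected function handleChunk(Chunk $chunk): void
    {
        //
    }
}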
Next, we’ll need to define our chunk. This tells the chunkable job how many items it has to process and the size of each chunk, so it knows how many times to run the handleChunk method. Inside the defineChunk method, return a Chunk. It accepts three arguments: totalItems, chunkSize and startingPosition. If you return null or a chunk with zero totalItems, handleChunk will not be run.
<?php

use Sammyjo20\ChunkableJobs\Chunk;

public function defineChunk(): ?Chunk
{
    $response = Http::asJson()->get('https://pokeapi.co/api/v2/pokemon');

    $count = $response->json('count'); // 1154

    return new Chunk(totalItems: $count, chunkSize: 1, startingPosition: 1);
}
Chunk Constructor
- totalItems: The number of items that you want to chunk through. For example, 100 items with a chunk size of 10 would create 10 chunks.
- chunkSize: The size of each chunk. If you are dealing with a paginated API, this is the same as that API's per-page value.
- startingPosition: The starting position of the chunk. It defaults to 1, but you can change it if you want to resume a job.
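To make the relationship between totalItems and chunkSize concrete, here is a small standalone sketch of the arithmetic. The countChunks helper is hypothetical and not part of the package; it only assumes that a partial final chunk still counts as one chunk.

```php
<?php

// Hypothetical helper, not part of the package: the number of chunks needed
// to cover $totalItems when each chunk holds at most $chunkSize items.
function countChunks(int $totalItems, int $chunkSize): int
{
    if ($totalItems <= 0 || $chunkSize <= 0) {
        return 0;
    }

    // Integer ceiling division: a partial final chunk still counts as a chunk.
    return intdiv($totalItems + $chunkSize - 1, $chunkSize);
}

echo countChunks(100, 10), PHP_EOL; // 10
echo countChunks(105, 10), PHP_EOL; // 11 (ten full chunks plus a final chunk of 5)
echo countChunks(1154, 1), PHP_EOL; // 1154, as in the Pokemon example
```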
Next, we’ll want to write the logic to process each chunk. In my example, I want to make an API call for that chunk and then store the response. The handleChunk method will be executed for every chunk and receives a Chunk object containing useful information about the current chunk.
<?php

use Sammyjo20\ChunkableJobs\Chunk;

protected function handleChunk(Chunk $chunk): void
{
    $response = Http::asJson()->get(sprintf('https://pokeapi.co/api/v2/pokemon?limit=%s&offset=%s', $chunk->limit, $chunk->offset));

    $data = $response->json();

    // Store data of response
}
Chunk Properties
- totalItems: The total items provided when the chunk was created. This property does not change.
- totalChunks: The total number of chunks generated when the chunk was created. This property does not change.
- remainingItems: The remaining items in the chunk. This property decreases as the chunked jobs are dispatched.
- remainingChunks: The remaining chunks. This property decreases as the chunked jobs are dispatched.
- originalSize: The size of the chunk. This property does not change.
- size: The size of the current chunk. This property will only change on the last chunk if there is a remainder.
- limit: The limit of the current chunk. Similar to size, it is designed to help you interact with APIs that use limit/offset pagination.
- offset: The offset of the current chunk. This property increases as the chunked jobs are dispatched.
- position: The current position of the chunk. It is designed to act as “page” if you are dealing with a paginated API. This will increase as the chunked jobs are dispatched.
- metadata: An array for any metadata you would like to attach to the chunk. Metadata is passed on to all subsequent chunks.
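As a rough illustration of how position, offset and size relate to each other, here is a standalone sketch. The describeChunk helper is hypothetical and not part of the package, and it assumes a zero-based offset (the first chunk starts at offset 0), which you should confirm against the package before relying on it.

```php
<?php

// Hypothetical helper, not part of the package: pagination values for the
// chunk at a 1-based $position.
// Assumption: the offset is zero-based, so position 1 starts at offset 0.
function describeChunk(int $totalItems, int $chunkSize, int $position): array
{
    $offset = ($position - 1) * $chunkSize;

    // The last chunk may be smaller than $chunkSize if there is a remainder.
    $size = max(0, min($chunkSize, $totalItems - $offset));

    return [
        'position' => $position,
        'offset' => $offset,
        'size' => $size,
    ];
}

print_r(describeChunk(25, 10, 1)); // position 1, offset 0, size 10
print_r(describeChunk(25, 10, 3)); // position 3, offset 20, size 5 (the remainder)
```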
Chunk Methods
- next: Allows you to get the next chunk. It is an immutable method so the original object is not affected.
- move: Allows you to move to a given chunk position. It is immutable by default but you can make it mutable.
- replace: Allows you to replace the current object with another chunk.
- isFirst: Specifies if the chunk is the first chunk.
- isNotFirst: Opposite of isFirst
- isLast: Specifies if the chunk is the last chunk.
- isNotLast: Opposite of isLast
- isEmpty: Specifies if the chunk is empty, which means the totalItems property is zero.
- isNotEmpty: Opposite of isEmpty
Dispatching
Dispatching a chunkable job works exactly the same as dispatching any other job. By default, a chunkable job processes one chunk, then dispatches the next job only after that chunk has been processed successfully.
<?php

GetPageOfPokemon::dispatch();
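The default one-at-a-time behaviour can be pictured with a toy in-memory queue. This is only a sketch of the idea, not the package's implementation: the array-based queue and the variable names are all stand-ins invented for illustration.

```php
<?php

// Toy in-memory queue standing in for Laravel's queue. None of these names
// come from the package.
$totalItems = 25;
$chunkSize = 10;
$totalChunks = (int) ceil($totalItems / $chunkSize); // 3

$queue = [['position' => 1]]; // "dispatch" the first chunk only
$processed = [];

while ($job = array_shift($queue)) {
    // The worker handles exactly one chunk per job...
    $processed[] = $job['position'];

    // ...and only after that succeeds is the job for the next chunk queued.
    if ($job['position'] < $totalChunks) {
        $queue[] = ['position' => $job['position'] + 1];
    }
}

echo implode(', ', $processed), PHP_EOL; // 1, 2, 3
```

At any moment the queue holds at most one pending chunk, which is why a failure in one chunk stops the chunks after it from being queued.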
Dispatching every chunked job at once
Sometimes you may want to throw as many resources as you can at a specific chunked job. If processing one chunk at a time is not suitable and you would rather dispatch every chunk straight away, you can use the dispatchAllChunks static method on the chunkable job. Any arguments you pass are forwarded to the job's constructor. Alternatively, you can use the BulkChunkDispatcher class.
<?php

use Sammyjo20\ChunkableJobs\BulkChunkDispatcher;

// Will dispatch all jobs at once 🚀
GetPageOfPokemon::dispatchAllChunks();

// or
BulkChunkDispatcher::dispatch(new GetPageOfPokemon);
Stopping Chunking Early
Sometimes you might want to stop the chunking process early. You can use the stopChunking method and the job won’t dispatch the next chunk.
<?php

use Sammyjo20\ChunkableJobs\Chunk;

protected function handleChunk(Chunk $chunk): void
{
    $response = Http::asJson()->get(sprintf('https://pokeapi.co/api/v2/pokemon?limit=%s&offset=%s', $chunk->limit, $chunk->offset));

    // Stop chunking early...
    if ($response->failed()) {
        $this->stopChunking();
    }
}
Customising The Starting Chunk
Sometimes you might want to resume a chunkable job from the point where it previously failed or was paused. You can set the chunk on the job instance before you dispatch the job.
<?php

use Sammyjo20\ChunkableJobs\Chunk;

$job = new GetPageOfPokemon;

$job->setChunk(new Chunk(totalItems: 100, chunkSize: 10, startingPosition: 5));

dispatch($job);
Using ChunkRange to iterate over all chunks
If you need to iterate over every chunk, you can use the ChunkRange class. This will return a generator that you can iterate over to get every chunk.
<?php

use Sammyjo20\ChunkableJobs\ChunkRange;

$chunkRange = ChunkRange::create(30, 10);

foreach ($chunkRange as $chunk) {
    // Handle $chunk
}
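If generators are unfamiliar, the following standalone sketch shows the general technique of lazily yielding one chunk at a time. The chunkPositions function is a hypothetical stand-in, not the package's ChunkRange: it yields plain arrays rather than Chunk objects and assumes a zero-based offset.

```php
<?php

// Hypothetical stand-in for ChunkRange: lazily yields one description per
// chunk, keyed by its 1-based position. The real class yields Chunk objects.
function chunkPositions(int $totalItems, int $chunkSize): Generator
{
    if ($chunkSize <= 0) {
        throw new InvalidArgumentException('Chunk size must be greater than zero.');
    }

    for ($offset = 0, $position = 1; $offset < $totalItems; $offset += $chunkSize, $position++) {
        yield $position => [
            'offset' => $offset,
            'size' => min($chunkSize, $totalItems - $offset),
        ];
    }
}

foreach (chunkPositions(30, 10) as $position => $chunk) {
    printf("chunk %d: offset %d, size %d\n", $position, $chunk['offset'], $chunk['size']);
}
// chunk 1: offset 0, size 10
// chunk 2: offset 10, size 10
// chunk 3: offset 20, size 10
```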
Unknown Size Chunking
Sometimes you might not know in advance how many items you need to chunk through, so you want to keep chunking until you detect that you have reached the end. In that case, you can use UnknownSizeChunk, which sets the size to PHP_INT_MAX (a very large number), and stop whenever you like.
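<?php

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Sammyjo20\ChunkableJobs\Chunk;
use Sammyjo20\ChunkableJobs\ChunkableJob;
use Sammyjo20\ChunkableJobs\UnknownSizeChunk;

class GetPageOfPokemon extends ChunkableJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function defineChunk(): ?Chunk
    {
        return new UnknownSizeChunk(chunkSize: 100);
    }

    protected function handleChunk(Chunk $chunk): void
    {
        // Keep processing

        // Placeholder: replace with your own stopping condition,
        // for example an empty page returned by the API.
        $stop = false;

        // When ready to stop:
        if ($stop === true) {
            $this->stopChunking();
        }
    }
}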
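One common stopping condition is an empty page. The standalone sketch below simulates that pattern without the package: fetchPage is a hypothetical data source invented for illustration, and the break marks the point where a chunkable job would call stopChunking.

```php
<?php

// Hypothetical data source: 250 records served in pages. An empty page
// means there is nothing left to fetch.
function fetchPage(int $offset, int $limit): array
{
    return array_slice(range(1, 250), $offset, $limit);
}

$chunkSize = 100;
$position = 1;
$stored = 0;

while (true) {
    $page = fetchPage(($position - 1) * $chunkSize, $chunkSize);

    if ($page === []) {
        // In a chunkable job, this is where stopChunking() would be called.
        break;
    }

    $stored += count($page);
    $position++;
}

echo $stored, PHP_EOL;   // 250
echo $position, PHP_EOL; // 4: the fourth request returned an empty page
```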
Setting the next chunk
Sometimes you might want to change the chunking entirely. To do this, call the setNextChunk method while handling a chunk, and the next chunk will be replaced by the chunk you provide.
<?php

protected function handleChunk(Chunk $chunk): void
{
    // Build a replacement chunk and move it to position 5
    $nextChunk = new Chunk(100, 10);

    $nextChunk = $nextChunk->move(5);

    $this->setNextChunk($nextChunk);
}
SetUp & TearDown
The setUp and tearDown methods are called before and after the chunking process, respectively. This is useful if you need to do some setup before the chunking starts and some cleanup after all the job chunks have finished processing.
<?php

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Illuminate\Support\Facades\Http;
use Illuminate\Support\Facades\Log;
use Sammyjo20\ChunkableJobs\Chunk;
use Sammyjo20\ChunkableJobs\ChunkableJob;

class GetPageOfPokemon extends ChunkableJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function defineChunk(): ?Chunk
    {
        $response = Http::asJson()->get('https://pokeapi.co/api/v2/pokemon');

        $count = $response->json('count'); // 1154

        return new Chunk(totalItems: $count, chunkSize: 1, startingPosition: 1);
    }

    protected function handleChunk(Chunk $chunk): void
    {
        $response = Http::asJson()->get(sprintf('https://pokeapi.co/api/v2/pokemon?limit=%s&offset=%s', $chunk->limit, $chunk->offset));

        $data = $response->json();

        // Store data of response
    }

    protected function setUp(): void
    {
        Log::info('Starting the retrieval process...');
    }

    protected function tearDown(): void
    {
        Log::info('Finished the retrieval process!');
    }
}
Warning around properties
When you extend the ChunkableJob class, be careful about the properties that you set on your job at runtime. Each time the next chunkable job is created, the current object is cloned, including its public, protected and private properties. If any properties set by the defineChunk or handleChunk methods should not be shared with the next chunk, add them to the $ignoredProperties array on the job.
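<?php

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Sammyjo20\ChunkableJobs\Chunk;
use Sammyjo20\ChunkableJobs\ChunkableJob;

class GetPageOfPokemon extends ChunkableJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    protected string $stripeSecret;

    // Properties listed here are not carried over to the next chunk's job
    protected array $ignoredProperties = ['stripeSecret'];

    // ... Other methods
}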
Since private properties cannot be ignored this way, use the modifyClone method instead. It lets you unset private properties or make final adjustments to the cloned object before it is put onto the queue.
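<?php

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Sammyjo20\ChunkableJobs\ChunkableJob;

class GetPageOfPokemon extends ChunkableJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    private ?string $privateProperty = null;

    public function __construct()
    {
        $this->privateProperty = 'Shh!';
    }

    // ... Other methods

    protected function modifyClone(ChunkableJob $job): static
    {
        // Remove the private property from the clone before it is queued
        unset($job->privateProperty);

        return $job;
    }
}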