mschop/pathogen

General-purpose path library for PHP.

0.7.1 2022-02-21 19:04 UTC

You are viewing the documentation for v1.X (unreleased). For the current stable release, see https://github.com/mschop/pathogen/tree/0.7.1

Pathogen

Future development

This package is a fork of the abandoned package eloquent/pathogen. I will continue work on this library. Please feel free to submit feature requests or bug reports.

Installation and documentation

What is Pathogen?

Pathogen is a library for path manipulation. Pathogen supports file system paths including Unix and Windows style paths, but is truly a general-purpose path implementation, capable of representing URI paths and other path-like structures while providing a comprehensive API.

Pathogen concepts

Path parts

The overall structure of a Pathogen path can be broken down into smaller parts. This diagram shows some of these named parts as they apply to a typical path:

  A   A   ___ A ___
 / \ / \ /         \
/foo/bar/baz.qux.pop
         \_________/
            name
\__________________/
        path

A = atom

The 'name' portion can be further broken down as follows:

  NWE    E
/     \ / \
baz.qux.pop
\_/ \_____/
 NP   NS

NWE = name without extension
  E = extension
 NP = name prefix
 NS = name suffix

Path atoms

In Pathogen, a path consists of a sequence of 'atoms'. Atoms are the individual sections of the path hierarchy. Given the path /path/to/foo, the sequence of atoms would be path, to, foo. The slash character is referred to as the 'separator'.

The atoms . and .. have special meaning in Pathogen. The single dot (.) is referred to as the 'self atom' and is typically used to reference the current path. The double dot (..) is referred to as the 'parent atom' and is used to reference the path above the current one. Anyone familiar with typical file system paths should be familiar with their behaviour already.

Given a path instance, the atoms of the path can be determined as follows:

$atoms = $path->atoms(); // returns an array of strings
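
To make the idea of atoms concrete, here is a minimal sketch in plain PHP that does not use Pathogen at all. The helper name splitAtoms() is hypothetical and only illustrates how a path string breaks down around the separator; the library's own parsing may treat edge cases differently.

```php
// Hypothetical helper, not part of Pathogen: splits a path string into atoms.
function splitAtoms(string $path, string $separator = '/'): array
{
    // Discard empty segments produced by leading, trailing, or doubled separators.
    return array_values(
        array_filter(explode($separator, $path), fn (string $atom): bool => $atom !== '')
    );
}

var_dump(splitAtoms('/path/to/foo') === ['path', 'to', 'foo']);     // bool(true)
var_dump(splitAtoms('./foo/../bar') === ['.', 'foo', '..', 'bar']); // bool(true)
```

Note that the self and parent atoms are kept as ordinary entries here; giving them meaning is the job of normalization.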

Path name

The 'name' section of a path is simply the last atom of a path. If a path has no atoms, its name is an empty string. Given a path instance, the name of the path can be determined like so:

$name = $path->name(); // returns a string

Path name extensions

The name of a path can be further divided using extension separators (.). For example, given the path name foo.bar.baz, Pathogen can determine the 'name without extension' (foo.bar), the 'name prefix' (foo), the 'name suffix' (bar.baz), and the 'extension' (baz).

Given a path instance, the various sections can be retrieved as follows:

$nameWithoutExtension = $path->nameWithoutExtension(); // returns a string
$namePrefix = $path->namePrefix(); // returns a string
$nameSuffix = $path->nameSuffix(); // returns a string or null
$extension = $path->extension(); // returns a string or null
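
The relationship between these four sections can be sketched with built-in string functions alone. The helper splitName() below is hypothetical and is not how Pathogen is implemented; it assumes a name with no extension separator has a null suffix and extension, as the return types above suggest, and it makes no claim about how the library handles names that start or end with a dot.

```php
// Hypothetical helper, not part of Pathogen: derives the four name sections.
function splitName(string $name): array
{
    $first = strpos($name, '.');  // position of the first extension separator
    $last = strrpos($name, '.');  // position of the last extension separator

    if ($first === false) {
        // No separator: the whole name is both prefix and name without extension.
        return [
            'nameWithoutExtension' => $name,
            'namePrefix' => $name,
            'nameSuffix' => null,
            'extension' => null,
        ];
    }

    return [
        'nameWithoutExtension' => substr($name, 0, $last),
        'namePrefix' => substr($name, 0, $first),
        'nameSuffix' => substr($name, $first + 1),
        'extension' => substr($name, $last + 1),
    ];
}

$parts = splitName('foo.bar.baz');
echo $parts['nameWithoutExtension'], PHP_EOL; // outputs 'foo.bar'
echo $parts['namePrefix'], PHP_EOL;           // outputs 'foo'
echo $parts['nameSuffix'], PHP_EOL;           // outputs 'bar.baz'
echo $parts['extension'], PHP_EOL;            // outputs 'baz'
```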

Trailing separators

Pathogen is capable of representing a path with a trailing separator (/). This is useful in the case that a trailing separator has a special meaning to some logic, such as the behaviour of the Unix cp command. The trailing separator support is purely for the use of developers utilizing Pathogen; it does not affect any logic used by Pathogen itself.

It is worth noting that all new path instances produced by Pathogen will strip any trailing slashes unless it is explicitly stated otherwise.

Absolute and relative paths

In Pathogen, absolute and relative paths are represented by two different classes. While both classes implement a common PathInterface, other methods are provided by the AbsolutePathInterface or the RelativePathInterface respectively.

This distinction provides, amongst other benefits, the ability to harness PHP's type hinting to restrict the type of path required:

use Pathogen\AbsolutePathInterface;
use Pathogen\Path;
use Pathogen\RelativePathInterface;

function anyPath(Path $path)
{
    // accepts any path
}

function absoluteOnly(AbsolutePathInterface $path)
{
    // accepts only absolute paths
}

function relativeOnly(RelativePathInterface $path)
{
    // accepts only relative paths
}

Special paths

The 'root' path is considered the top-most absolute path, and is represented as a single separator with no atoms (/).

The 'self' path is considered to point to the 'current' path, and is represented as a single self atom (.).

Creating paths

Static factory methods

Paths are normally created using the static factory methods. To use them effectively, simply choose the most appropriate class for the type of path:

use Pathogen\AbsolutePath;
use Pathogen\FileSystem\FileSystemPath;
use Pathogen\Path;
use Pathogen\RelativePath;
use Pathogen\Exception\PathTypeMismatchException;

$path = Path::fromString('/path/to/foo'); // returns AbsolutePath
$path = Path::fromString('\path\to\foo'); // returns AbsolutePath
$path = Path::fromString('bar/baz'); // returns RelativePath
$path = Path::fromString('bar\baz'); // returns RelativePath

# Specific classes
$path = RelativePath::fromString('bar/baz'); // returns RelativePath
$path = RelativePath::fromString('/bar/baz'); // !!! throws PathTypeMismatchException 
$path = AbsolutePath::fromString('/bar/baz'); // returns AbsolutePath
$path = AbsolutePath::fromString('bar/baz'); // !!! throws PathTypeMismatchException

# Drive Anchoring (Windows)
$path = Path::fromString('C:/bar/baz'); // returns AbsoluteDriveAnchoredPath
$path = Path::fromString('C:\bar\baz'); // returns AbsoluteDriveAnchoredPath
$path = Path::fromString('C:bar/baz'); // returns RelativeDriveAnchoredPath

You can also use the constructor to create new instances of Path.

use Pathogen\Path;
use Pathogen\AbsoluteDriveAnchoredPath;

// Equivalent to '/path/to/foo'
$atoms = ['path', 'to', 'foo'];
$path = new Path($atoms, hasTrailingSeparator: false);

// Equivalent to 'C:\path\to\foo'
$path = new AbsoluteDriveAnchoredPath(atoms: $atoms, hasTrailingSeparator: false, drive: 'C');

Path resolution

Resolution of a path involves taking a path which may be relative or absolute, and figuring out where that path points to, given a known 'base' path. The result of path resolution will always be an absolute path.

For example, consider a current path of /path/to/foo. A relative path of bar/baz will resolve to /path/to/foo/bar/baz against this path. Conversely, an absolute path of /path/to/qux will not change after resolution, as it is already an absolute path.
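
The rule can be sketched on plain strings, independent of Pathogen. The helper resolvePath() is hypothetical and assumes Unix-style paths with / as the separator and PHP 8.0 or later for str_starts_with(). It deliberately performs no normalization, so any self or parent atoms in the input are carried through unchanged.

```php
// Hypothetical helper, not part of Pathogen: resolves $path against $base.
function resolvePath(string $base, string $path): string
{
    // An absolute path is already resolved and is returned unchanged.
    if (str_starts_with($path, '/')) {
        return $path;
    }

    // A relative path is appended to the base path.
    return rtrim($base, '/') . '/' . $path;
}

echo resolvePath('/path/to/foo', 'bar/baz'), PHP_EOL;      // outputs '/path/to/foo/bar/baz'
echo resolvePath('/path/to/foo', '/path/to/qux'), PHP_EOL; // outputs '/path/to/qux'
```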

Resolution methods

The simplest way to achieve path resolution with Pathogen is to use the most appropriate method on a path:

use Pathogen\FileSystem\FileSystemPath;

$basePath = FileSystemPath::fromString('/path/to/foo');
$relativePath = FileSystemPath::fromString('bar/baz');
$absolutePath = FileSystemPath::fromString('/path/to/qux');

echo $basePath->resolve($relativePath); // outputs '/path/to/foo/bar/baz'
echo $basePath->resolve($absolutePath); // outputs '/path/to/qux'

echo $relativePath->resolveAgainst($basePath); // outputs '/path/to/foo/bar/baz'

Resolver objects

Path resolvers are also a standalone concept in Pathogen. A simple example of their usage follows:

use Pathogen\FileSystem\FileSystemPath;
use Pathogen\Resolver\PathResolver;

$resolver = new PathResolver;

$basePath = FileSystemPath::fromString('/path/to/foo');
$relativePath = FileSystemPath::fromString('bar/baz');
$absolutePath = FileSystemPath::fromString('/path/to/qux');

echo $resolver->resolve($basePath, $relativePath); // outputs '/path/to/foo/bar/baz'
echo $resolver->resolve($basePath, $absolutePath); // outputs '/path/to/qux'

Path normalization

Normalization of a path is the process of converting a path to its simplest or canonical form. This means resolving as many of the self and parent atoms as possible. For example, the path /path/to/foo/../bar normalizes to /path/to/bar.

Normalization works differently for absolute and relative paths. Absolute paths can always be resolved to a canonical form with no self or parent atoms. Relative paths can often be simplified, but may still contain these special atoms. For example, the relative path ../foo/../.. normalizes to ../.. (two parent atoms), which cannot be reduced any further.

Note that for absolute paths, the root path (/) is the top-most path to which parent atoms will normalize. That is, paths with more parent atoms than regular atoms, like /.., /../.., or /foo/../.. will all normalize to be the root path (/).
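
These rules can be sketched as a single pass over a list of atoms. The function normalizeAtoms() below is a hypothetical illustration, not Pathogen's normalizer: it returns an empty list for the root path, and it makes no attempt to reproduce how the library represents an empty relative result.

```php
// Hypothetical helper, not part of Pathogen: normalizes a list of atoms.
function normalizeAtoms(array $atoms, bool $isAbsolute): array
{
    $result = [];

    foreach ($atoms as $atom) {
        if ($atom === '.') {
            continue; // self atoms never change the target
        }

        if ($atom !== '..') {
            $result[] = $atom;
        } elseif ($result !== [] && end($result) !== '..') {
            array_pop($result); // a parent atom cancels the preceding regular atom
        } elseif (!$isAbsolute) {
            $result[] = '..'; // relative paths keep unresolved parent atoms
        }
        // For absolute paths, a parent atom at the root is simply dropped.
    }

    return $result;
}

echo '/' . implode('/', normalizeAtoms(['path', 'to', 'foo', '..', 'bar'], true)), PHP_EOL; // outputs '/path/to/bar'
echo implode('/', normalizeAtoms(['..', 'foo', '..', '..'], false)), PHP_EOL;                // outputs '../..'
echo '/' . implode('/', normalizeAtoms(['foo', '..', '..'], true)), PHP_EOL;                 // outputs '/'
```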

Pathogen does not normalize paths unless normalization is required for a calculation or is requested explicitly through the API. If a normalized path is required for some other reason, it is up to the developer to normalize it.

Normalize method

The simplest way to normalize a path is to use the normalize() method:

use Pathogen\FileSystem\FileSystemPath;

$path = FileSystemPath::fromString('/path/./to/foo/../bar');

echo $path->normalize(); // outputs '/path/to/bar'

Normalizer objects

Path normalizers are also a standalone concept in Pathogen. A simple example of their usage follows:

use Pathogen\FileSystem\FileSystemPath;
use Pathogen\FileSystem\Normalizer\FileSystemPathNormalizer;

$normalizer = new FileSystemPathNormalizer;

$path = FileSystemPath::fromString('/path/./to/foo/../bar');

echo $normalizer->normalize($path); // outputs '/path/to/bar'

File system paths

Pathogen provides support for dealing with file system paths in a platform agnostic way. There are two approaches supported by Pathogen, which can be applied depending on the situation.

The first approach is to inspect the path string and create an appropriate path instance based upon a 'best guess'. This is handled by the FileSystemPath class:

use Pathogen\FileSystem\FileSystemPath;

$pathFoo = FileSystemPath::fromString('/path/to/foo');   // creates a Unix-style path
$pathBar = FileSystemPath::fromString('C:/path/to/bar'); // creates a Windows path

The second approach is to create paths based upon the current platform the code is running under. That is, when running under Linux or Unix, create Unix-style paths, and when running under Windows, create Windows paths. This is handled by the PlatformFileSystemPath class:

use Pathogen\FileSystem\PlatformFileSystemPath;

// creates a path to match the current platform
$path = PlatformFileSystemPath::fromString('/path/to/foo');

Note that FileSystemPath and PlatformFileSystemPath are only utility classes with static methods. The actual path class used will depend on the input. If it is necessary to type hint for a file system path, FileSystemPathInterface or one of its more specialized child interfaces should be used instead.

Immutability of paths

Paths in Pathogen are immutable, meaning that once they are created, they cannot be modified. When performing some mutating operation on a path, such as normalization or resolution, a new path instance is produced, rather than the original instance being altered. This allows a path to be exposed as part of an interface without creating a leaky abstraction.
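
The general pattern looks roughly like the sketch below. ImmutableAtoms is a hypothetical class written for this illustration and is not part of Pathogen; it assumes PHP 8.1 or later for readonly properties.

```php
// Hypothetical class, not part of Pathogen: a minimal immutable value object.
final class ImmutableAtoms
{
    /** @param list<string> $atoms */
    public function __construct(private readonly array $atoms)
    {
    }

    /** Returns a new instance with the extra atom; $this is left untouched. */
    public function withAtom(string $atom): self
    {
        return new self([...$this->atoms, $atom]);
    }

    public function string(): string
    {
        return '/' . implode('/', $this->atoms);
    }
}

$original = new ImmutableAtoms(['path', 'to']);
$extended = $original->withAtom('foo');

echo $original->string(), PHP_EOL; // outputs '/path/to'
echo $extended->string(), PHP_EOL; // outputs '/path/to/foo'
```

Because the original instance never changes, it can be shared freely between callers without defensive copying.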

Windows path support

Pathogen provides support for Windows paths. In addition to the methods available to Unix-style paths, Windows paths contain an optional drive specifier. The drive specifier is available via the drive() method:

$drive = $path->drive(); // returns a single-character string, or null
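
For illustration only, a drive specifier can be recognized in a raw path string with a short regular expression. The helper driveOf() is hypothetical and is not Pathogen's parser; it assumes a drive specifier is a single ASCII letter followed by a colon at the start of the string.

```php
// Hypothetical helper, not part of Pathogen: extracts a drive letter, if any.
function driveOf(string $path): ?string
{
    // A drive specifier is a single letter followed by a colon at the start.
    return preg_match('/^([A-Za-z]):/', $path, $matches) === 1 ? $matches[1] : null;
}

var_dump(driveOf('C:\path\to\foo')); // string(1) "C"
var_dump(driveOf('C:bar/baz'));      // string(1) "C"
var_dump(driveOf('/path/to/foo'));   // NULL
```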

Dependency consumer traits

Pathogen provides some traits to make consuming its services extremely simple for code targeting PHP 5.4 and higher.

The concept of a dependency consumer trait is simple. If a class requires, for example, a path factory, it can simply use a PathFactoryTrait. This gives the class setPathFactory() and pathFactory() methods for managing the path factory dependency.

This example demonstrates how to use the file system path factory trait:

use Pathogen\FileSystem\Factory\Consumer\FileSystemPathFactoryTrait;

class ExampleConsumer
{
    use FileSystemPathFactoryTrait;
}

$consumer = new ExampleConsumer;
echo get_class($consumer->pathFactory()); // outputs 'Pathogen\FileSystem\Factory\FileSystemPathFactory'
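
A trait of this kind could be written along the following lines. This is a sketch of the pattern only: ExampleFactory, ExampleFactoryTrait, and ExampleService are hypothetical names, and the code is not taken from Pathogen. It assumes the getter falls back to a default instance when nothing has been injected, as the output of the example above suggests.

```php
// Hypothetical stand-ins, not Pathogen's actual classes.
final class ExampleFactory
{
}

trait ExampleFactoryTrait
{
    private ?ExampleFactory $factory = null;

    public function setFactory(ExampleFactory $factory): void
    {
        $this->factory = $factory;
    }

    public function factory(): ExampleFactory
    {
        // Fall back to a default instance when none was injected.
        return $this->factory ??= new ExampleFactory();
    }
}

final class ExampleService
{
    use ExampleFactoryTrait;
}

$service = new ExampleService();
echo get_class($service->factory()), PHP_EOL; // outputs 'ExampleFactory'
```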

Available dependency consumer traits

Usage examples

Resolving a user-provided path against the current working directory

use Pathogen\FileSystem\Factory\PlatformFileSystemPathFactory;

$factory = new PlatformFileSystemPathFactory;
$workingDirectoryPath = $factory->createWorkingDirectoryPath();

$path = $workingDirectoryPath->resolve(
    $factory->create($_SERVER['argv'][1])
);

Resolving a path against another arbitrary path

use Pathogen\Path;

$basePath = Path::fromString('/path/to/base');
$path = Path::fromString('../child');

$resolvedPath = $basePath->resolve($path);

echo $resolvedPath->string();              // outputs '/path/to/base/../child'
echo $resolvedPath->normalize()->string(); // outputs '/path/to/child'

Determining whether one path exists inside another

use Pathogen\Path;

$basePath = Path::fromString('/path/to/foo');
$pathA = Path::fromString('/path/to/foo/bar');
$pathB = Path::fromString('/path/to/somewhere/else');

var_dump($basePath->isAncestorOf($pathA)); // outputs 'bool(true)'
var_dump($basePath->isAncestorOf($pathB)); // outputs 'bool(false)'
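
Conceptually, an ancestor check is a prefix comparison on atoms. The helper isAncestor() below is a hypothetical sketch, not Pathogen's implementation; it assumes both atom lists are already normalized and that a path does not count as its own ancestor.

```php
// Hypothetical helper, not part of Pathogen: compares normalized atom lists.
function isAncestor(array $ancestorAtoms, array $atoms): bool
{
    // An ancestor must be strictly shorter and match the leading atoms.
    return count($ancestorAtoms) < count($atoms)
        && array_slice($atoms, 0, count($ancestorAtoms)) === $ancestorAtoms;
}

$base = ['path', 'to', 'foo'];

var_dump(isAncestor($base, ['path', 'to', 'foo', 'bar']));        // bool(true)
var_dump(isAncestor($base, ['path', 'to', 'somewhere', 'else'])); // bool(false)
```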

Appending an extension to a path

use Pathogen\Path;

$path = Path::fromString('/path/to/foo.bar');
$pathWithExtension = $path->joinExtensions('baz');

echo $pathWithExtension->string(); // outputs '/path/to/foo.bar.baz'

Replacing a path's extension

use Pathogen\Path;

$path = Path::fromString('/path/to/foo.bar');
$pathWithNewExtension = $path->replaceExtension('baz');

echo $pathWithNewExtension->string(); // outputs '/path/to/foo.baz'
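
The same operation on a bare name string can be sketched as follows. The helper replaceExtensionOf() is hypothetical and not part of Pathogen; it assumes that a name without any extension simply gains the new one.

```php
// Hypothetical helper, not part of Pathogen: swaps the last extension of a name.
function replaceExtensionOf(string $name, string $extension): string
{
    $last = strrpos($name, '.');

    // Keep everything before the last extension separator, if there is one.
    $base = $last === false ? $name : substr($name, 0, $last);

    return $base . '.' . $extension;
}

echo replaceExtensionOf('foo.bar', 'baz'), PHP_EOL;     // outputs 'foo.baz'
echo replaceExtensionOf('foo.bar.qux', 'baz'), PHP_EOL; // outputs 'foo.bar.baz'
```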

Replacing a section of a path

use Pathogen\Path;

$path = Path::fromString('/path/to/foo/bar');
$pathWithReplacement = $path->replace(1, array('for', 'baz'), 2);

echo $pathWithReplacement->string(); // outputs '/path/for/baz/bar'
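
The arguments in this example (an index, the replacement atoms, and a length) behave much like those of PHP's built-in array_splice(). The sketch below reproduces the same result on a plain array of atoms; it illustrates the semantics only and is not a statement about how replace() is implemented.

```php
$atoms = ['path', 'to', 'foo', 'bar'];

// Remove 2 atoms starting at index 1 and insert the replacement atoms there.
array_splice($atoms, 1, 2, ['for', 'baz']);

echo '/' . implode('/', $atoms), PHP_EOL; // outputs '/path/for/baz/bar'
```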