phanan/poddle

Parse podcast feeds with PHP following PSP-1 Podcast RSS Standard

github.com/phanan/poddle

v1.2.0 2024-10-30 04:06 UTC

Last update: 2024-12-31 00:20:16 UTC


Poddle

Effortlessly parse podcast feeds in PHP following the PSP-1 Podcast RSS Standard.

Requirements and Installation

Poddle requires PHP 8.1 or higher. You can install the library via Composer by running the following command:

composer require phanan/poddle

Usage

Parse from a URL

To parse a podcast feed from its URL, call the fromUrl method with the feed URL:

$poddle = \PhanAn\Poddle::fromUrl('https://example.com/feed.xml');

This method also accepts two additional parameters:

  • timeoutInSeconds: The number of seconds to wait while trying to connect. Defaults to 30. Note that the max_execution_time value in your PHP configuration may still limit the maximum timeout value.
  • client: A PSR-18-compliant HTTP client (one that sends PSR-7 requests) to make the request. If not provided, Poddle will use a default client. This parameter may come in handy during testing or if you need to heavily customize the request.

Parse from XML

If you already have the XML string, you can parse it using Poddle::fromXml instead:

$poddle = \PhanAn\Poddle::fromXml(file_get_contents('feed.xml'));

Upon success, both fromUrl and fromXml methods return a Poddle object, which you can use to access the feed's channel and episodes.

Channel

To access the podcast channel, call getChannel on the Poddle object:

/** @var \PhanAn\Poddle\Values\Channel $channel */
$channel = $poddle->getChannel();

All of the channel's required elements per the PSP-1 standard are available as properties on the Channel object:

$channel->title; // string
$channel->link; // ?string
$channel->description; // string
$channel->language; // string
$channel->image; // string
$channel->categories; // \PhanAn\Poddle\Values\CategoryCollection<\PhanAn\Poddle\Values\Category>
$channel->explicit; // bool

Note

Although required by the standard, link isn't supplied by all feeds, including some bigger players like Spotify's Megaphone. As such, link is nullable in Poddle.

All of the channel’s recommended elements are available via the metadata property:

$channel->metadata; // \PhanAn\Poddle\Values\ChannelMetadata
$channel->metadata->locked; // bool
$channel->metadata->guid; // ?string
$channel->metadata->author; // ?string
$channel->metadata->copyright; // ?string
$channel->metadata->txts; // \PhanAn\Poddle\Values\TxtCollection<\PhanAn\Poddle\Values\Txt>
$channel->metadata->fundings; // \PhanAn\Poddle\Values\FundingCollection<\PhanAn\Poddle\Values\Funding>
$channel->metadata->type; // ?\PhanAn\Poddle\Values\PodcastType
$channel->metadata->complete; // bool

Episodes

To access the podcast episodes, call getEpisodes on the Poddle object:

$episodes = $poddle->getEpisodes();

By default, getEpisodes will throw an error if any of the episodes is malformed. If you want a more forgiving behavior, pass true into the call to silently ignore the invalid episodes.

This method returns a lazy collection of \PhanAn\Poddle\Values\Episode objects. You can iterate over the collection to access each episode:

$episodes->each(function (\PhanAn\Poddle\Values\Episode $episode) {
    // Access episode properties
});
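
The strict and forgiving modes described above can be combined: try the strict parse first and fall back to the forgiving one only when it fails. The following is a minimal sketch, not part of the library's documentation. It assumes Poddle is installed via Composer, that feed.xml is a local copy of a feed, and that getEpisodes can be called more than once on the same Poddle object.

```php
<?php

require __DIR__ . '/vendor/autoload.php'; // assumes Poddle was installed via Composer

// Assumption: feed.xml is a local copy of a podcast feed.
$poddle = \PhanAn\Poddle::fromXml(file_get_contents('feed.xml'));

$titles = [];

try {
    // Strict mode (the default): a malformed episode raises an error.
    // The collection is lazy, so the iteration is wrapped as well.
    foreach ($poddle->getEpisodes() as $episode) {
        $titles[] = $episode->title;
    }
} catch (\Throwable $e) {
    // Forgiving mode: start over and silently skip the invalid episodes.
    $titles = [];

    foreach ($poddle->getEpisodes(true) as $episode) {
        $titles[] = $episode->title;
    }
}

echo count($titles), ' episodes parsed', PHP_EOL;
```

Catching \Throwable keeps the sketch independent of the exact exception class Poddle throws, which this README does not name.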

All of the episode's required elements per the PSP-1 standard are available as properties on the Episode object:

$episode->title; // string
$episode->enclosure; // \PhanAn\Poddle\Values\Enclosure
$episode->guid; // \PhanAn\Poddle\Values\EpisodeGuid

All of the episode's recommended elements are available via the metadata property:

$episode->metadata; // \PhanAn\Poddle\Values\EpisodeMetadata
$episode->metadata->link; // ?string
$episode->metadata->pubDate; // ?\DateTime
$episode->metadata->description; // ?string
$episode->metadata->duration; // ?int
$episode->metadata->image; // ?string
$episode->metadata->explicit; // ?bool
$episode->metadata->transcripts; // \PhanAn\Poddle\Values\TranscriptCollection<\PhanAn\Poddle\Values\Transcript>
$episode->metadata->episode; // ?int
$episode->metadata->season; // ?int
$episode->metadata->type; // ?\PhanAn\Poddle\Values\EpisodeType
$episode->metadata->block; // ?bool

Other Elements and Values

If you need to access other elements or values not covered by the PSP-1 standard, you can make use of the $xmlReader property on the Poddle object:

$xmlReader = $poddle->xmlReader;

This property is an instance of Saloon\XmlWrangler\XmlReader and allows you to navigate the XML document directly. For example, to access the feed's lastBuildDate value:

$poddle = \PhanAn\Poddle::fromUrl('https://example.com/feed.xml');
$poddle->xmlReader->value('rss.channel.lastBuildDate')?->sole(); // 'Thu, 02 May 2024 06:44:38 +0000'

For more information on how to use XmlReader, refer to the Saloon\XmlWrangler documentation.

The original feed content is available via the xml property on the Poddle object:

$xml = $poddle->xml; // string

Serialization and Deserialization

All classes under the PhanAn\Poddle\Values namespace implement the \Illuminate\Contracts\Support\Arrayable and \Illuminate\Contracts\Support\Jsonable contracts, which provide two methods:

/**
  * Get the instance as an array. All nested objects are also converted to arrays.
  */
public function toArray(): array;

/**
  * Convert the object to its JSON representation.
  */
public function toJson($options = 0): string;

Additionally, classes like Channel and Episode provide fromArray static methods to create instances from arrays. These methods allow you to easily serialize and deserialize the objects, making it straightforward to store and retrieve the data in a database or JSON file. For instance, you can create an Eloquent custom cast in Laravel this way:

use Illuminate\Contracts\Database\Eloquent\CastsAttributes;
use PhanAn\Poddle\Values\Channel;

class ChannelCast implements CastsAttributes
{
    public function get($model, string $key, $value, array $attributes): Channel
    {
        return Channel::fromArray(json_decode($value, true));
    }

    /** @param Channel $value */
    public function set($model, string $key, $value, array $attributes)
    {
        return $value->toJson();
    }
}

Then, you can use the cast in your Eloquent model:

use Illuminate\Database\Eloquent\Model;

class Podcast extends Model
{
    protected $casts = [
        'channel' => ChannelCast::class,
    ];
}
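
Outside of Laravel, the same toJson and fromArray pair can be used for a plain round trip through a JSON file. This is a minimal sketch under stated assumptions: Poddle is installed via Composer, feed.xml is a local copy of a feed, and channel.json is a hypothetical file name chosen for illustration.

```php
<?php

require __DIR__ . '/vendor/autoload.php'; // assumes Poddle was installed via Composer

use PhanAn\Poddle\Values\Channel;

// Assumption: feed.xml is a local copy of a podcast feed.
$poddle = \PhanAn\Poddle::fromXml(file_get_contents('feed.xml'));
$channel = $poddle->getChannel();

// Serialize the channel and store it (channel.json is a hypothetical path).
file_put_contents('channel.json', $channel->toJson());

// Later: read the JSON back and rebuild the Channel object from the decoded array.
$restored = Channel::fromArray(
    json_decode(file_get_contents('channel.json'), true)
);

// Compare one field to check the round trip.
var_dump($restored->title === $channel->title);
```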

Possible Questions

Why does Poddle not include element or value X from the feed?

Poddle follows the PSP-1 standard, which specifies the required and recommended elements for a podcast feed. If an element or value is not part of the standard, it is not included in Poddle. However, you can still access any element or value using the xmlReader property as described above.

How come pubDate is not a required element for episodes?

The PSP-1 standard does not require pubDate for episodes, but it is a recommended element. As a result, pubDate is available as part of the episode's metadata as a nullable \DateTime object. It’s up to you to determine whether the value is always present and design your system accordingly.

Why is the episode's GUID an object instead of a string?

Per the PSP-1 standard, an item’s <guid> element indeed contains a globally unique string value, but it can also have an isPermaLink attribute that indicates whether the GUID is a permalink. As such, the item GUID in Poddle is represented as an object with two public properties: value (string) and isPermaLink (bool). The object, however, implements the __toString method, so you can cast it to a string for convenience.
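
As an illustration of the two ways to read the GUID, the sketch below prints each episode's GUID along with whether it is a permalink. It assumes Poddle is installed via Composer and that feed.xml is a local copy of a feed. The string cast presumably yields the same text as the value property, but that is an assumption here.

```php
<?php

require __DIR__ . '/vendor/autoload.php'; // assumes Poddle was installed via Composer

// Assumption: feed.xml is a local copy of a podcast feed.
$poddle = \PhanAn\Poddle::fromXml(file_get_contents('feed.xml'));

$poddle->getEpisodes()->each(function (\PhanAn\Poddle\Values\Episode $episode) {
    $guid = $episode->guid;

    // Casting goes through __toString; $guid->value is the explicit alternative.
    $key = (string) $guid;

    // isPermaLink tells you whether the GUID doubles as a URL for the episode.
    $kind = $guid->isPermaLink ? 'permalink' : 'opaque id';

    echo "{$key} ({$kind}): {$episode->title}", PHP_EOL;
});
```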

Where is an episode’s media URL?

The media URL for an episode is available as part of the episode's enclosure property.

Why are the episodes returned as an EpisodeCollection, which extends LazyCollection? What’s a lazy collection anyway?

The LazyCollection class leverages PHP's generators to allow you to work with very large datasets while keeping memory usage low. Since a podcast feed can potentially contain a large number of episodes, returning a LazyCollection allows you to iterate over the episodes without loading them all into memory at once, speeding up the process and reducing memory consumption.
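
The generator mechanism behind this can be shown in plain PHP without any library. The sketch below is not Poddle's implementation: parseEpisodes is a hypothetical stand-in for per-episode parsing, and the counter only exists to show that items beyond the point where iteration stops are never processed.

```php
<?php

/**
 * Hypothetical stand-in for per-episode parsing.
 * $parsedCount records how many raw items were actually processed.
 */
function parseEpisodes(array $rawItems, int &$parsedCount): Generator
{
    foreach ($rawItems as $raw) {
        $parsedCount++;
        yield ['title' => trim($raw)]; // produced only when the consumer asks for it
    }
}

$parsedCount = 0;
$titles = [];

foreach (parseEpisodes([' Episode 1 ', ' Episode 2 ', ' Episode 3 '], $parsedCount) as $episode) {
    $titles[] = $episode['title'];

    if (count($titles) === 2) {
        break; // stop early; the third item is never parsed
    }
}

// $titles is ['Episode 1', 'Episode 2'] and $parsedCount is 2.
print_r($titles);
echo $parsedCount, PHP_EOL;
```

Each value is built at the moment the loop requests it, so only one parsed item needs to be held in memory at a time.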

Can you support feature X/Y/Z?

Poddle aims to be a lightweight and efficient podcast feed parser that follows the PSP-1 standard, not a full-blown RSS/Atom parser. That said, if you have a feature request or suggestion, feel free to open an issue. Better yet, you can fork the repository, implement the feature yourself, and submit a pull request.