markuspoerschke / extractum
Extract information from web pages.
Requires
- php: ^7.4 || ^8.0
- ext-dom: *
- ext-json: *
- ml/json-ld: ^1.2
- symfony/css-selector: ^5.1
- symfony/dom-crawler: ^5.1
- voku/stop-words: ^2.0
Requires (Dev)
- ergebnis/composer-normalize: ^2.11
- friendsofphp/php-cs-fixer: ^3.0
- phpmd/phpmd: ^2.9
- phpunit/phpunit: ^9.4
- symfony/finder: ^5.1
- symfony/var-dumper: ^5.2
- vimeo/psalm: ^4.1
Versions
- 1.x-dev
- 1.0.3
- 1.0.2
- 1.0.1
- 1.0.0
This package is auto-updated.
Last update: 2024-11-23 04:18:10 UTC
README
Extractum is a PHP library that extracts information from web pages.
Getting Started
Installation
composer require markuspoerschke/extractum
Usage
$uri = 'https://www.example.com/';
$html = file_get_contents($uri);
$extractor = new Extractum\Extractor();
$essence = $extractor->extract($html, $uri);
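Because file_get_contents returns false when a page cannot be loaded, it may be worth guarding the download before handing the HTML to the extractor. The following is a minimal sketch under that assumption; fetchHtml is a hypothetical helper and not part of Extractum, and the extractor calls in the trailing comment are taken from the usage snippet.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of Extractum): loads a document and
 * throws instead of returning false when loading fails.
 */
function fetchHtml(string $uri): string
{
    // Suppress the PHP warning; failure is reported via the exception below.
    $html = @file_get_contents($uri);

    if ($html === false) {
        throw new RuntimeException(sprintf('Could not load "%s".', $uri));
    }

    return $html;
}

// Intended use, with the library calls from the usage snippet:
// $uri = 'https://www.example.com/';
// $essence = (new Extractum\Extractor())->extract(fetchHtml($uri), $uri);
```

Throwing an exception keeps the failure from silently reaching extract() as a boolean, which would otherwise be harder to trace.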
Extracted Information
The extracted information is returned as an object of type Extractum\Essence.
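This page does not list the accessors of the returned object. One way to explore them, sketched here with PHP's built-in introspection functions only, is to list the public methods of whatever extract() returns. describeObject is a hypothetical helper and not part of Extractum, and no Essence method names are assumed.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of Extractum): returns the class name and
 * the sorted public method names of any object.
 *
 * @return array{class: string, methods: list<string>}
 */
function describeObject(object $object): array
{
    // Called from outside the class, so only public methods are returned.
    $methods = get_class_methods($object);
    sort($methods);

    return ['class' => get_class($object), 'methods' => $methods];
}

// Intended use with the result of extract():
// print_r(describeObject($essence));
```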
License
This package is released under the MIT license.