crwlr / robots-txt
Robots Exclusion Standard/Protocol Parser for Web Crawling/Scraping
Installs: 10 467
Dependents: 1
Suggesters: 0
Security: 0
Stars: 9
Watchers: 1
Forks: 2
Open Issues: 0
Requires
- php: ^8.0
- crwlr/url: ^1.0|^2.0
Requires (Dev)
- friendsofphp/php-cs-fixer: ^3.0
- mockery/mockery: ^1.4
- phpstan/phpstan: ^1.1
- phpunit/phpunit: ^9.0
- sempro/phpunit-pretty-print: ^1.4
README
Robots Exclusion Standard/Protocol Parser for Web Crawling/Scraping
Use this library within crawler/scraper programs to parse robots.txt files and check whether your crawler's user agent is allowed to load certain paths.
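To make the idea concrete, here is a minimal, standalone sketch of what "parsing a robots.txt file and checking a path for a user agent" can look like. It does not use this package and is not its implementation or API: the function names parseRobotsTxt and isPathAllowed are hypothetical, chosen only for illustration, and the matching is simplified (exact, case-insensitive user-agent match with fallback to the * group, and the longest matching rule wins). For the actual API of this library, see the documentation.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of crwlr/robots-txt): parses robots.txt content
 * into a map of lowercase user-agent => list of [ruleType, pathPattern] pairs.
 */
function parseRobotsTxt(string $content): array
{
    $groups = [];
    $currentAgents = [];
    $lastWasAgent = false;

    foreach (preg_split('/\R/', $content) ?: [] as $line) {
        $line = trim(explode('#', $line, 2)[0]); // drop comments

        if ($line === '' || !str_contains($line, ':')) {
            continue;
        }

        [$field, $value] = array_map('trim', explode(':', $line, 2));
        $field = strtolower($field);

        if ($field === 'user-agent') {
            // Consecutive User-agent lines share one group of rules.
            if (!$lastWasAgent) {
                $currentAgents = [];
            }
            $agent = strtolower($value);
            $currentAgents[] = $agent;
            $groups[$agent] ??= [];
            $lastWasAgent = true;
            continue;
        }

        $lastWasAgent = false;

        // Only Allow/Disallow lines with a non-empty pattern become rules.
        if (in_array($field, ['allow', 'disallow'], true) && $value !== '') {
            foreach ($currentAgents as $agent) {
                $groups[$agent][] = [$field, $value];
            }
        }
    }

    return $groups;
}

/**
 * Hypothetical helper: decides whether $path may be loaded by $userAgent.
 * Simplification: the longest matching pattern wins, Allow wins ties.
 */
function isPathAllowed(array $groups, string $path, string $userAgent): bool
{
    // Use the group for this user agent, falling back to the wildcard group.
    $rules = $groups[strtolower($userAgent)] ?? $groups['*'] ?? [];
    $allowed = true; // no matching rule means the path is allowed
    $bestLength = -1;

    foreach ($rules as [$type, $pattern]) {
        // Translate robots.txt wildcards: * matches anything, a trailing $ anchors the end.
        $anchored = str_ends_with($pattern, '$');
        $body = $anchored ? substr($pattern, 0, -1) : $pattern;
        $regex = '~^' . str_replace('\*', '.*', preg_quote($body, '~')) . ($anchored ? '$' : '') . '~';

        if (preg_match($regex, $path) !== 1) {
            continue;
        }

        $length = strlen($pattern);

        if ($length > $bestLength || ($length === $bestLength && $type === 'allow')) {
            $bestLength = $length;
            $allowed = $type === 'allow';
        }
    }

    return $allowed;
}

// Example robots.txt content, made up for this sketch.
$robotsTxt = <<<'TXT'
User-agent: *
Disallow: /private/
Allow: /private/press/

User-agent: MyBot
Disallow: /
TXT;

$groups = parseRobotsTxt($robotsTxt);

var_dump(isPathAllowed($groups, '/private/data', 'OtherBot'));      // bool(false)
var_dump(isPathAllowed($groups, '/private/press/kit', 'OtherBot')); // bool(true)
var_dump(isPathAllowed($groups, '/anything', 'MyBot'));             // bool(false)
```

In the example, OtherBot has no group of its own, so the rules for * apply, and the longer Allow pattern overrides the shorter Disallow for paths under /private/press/. MyBot has a dedicated group, so only that group's rules are considered for it.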
Documentation
You can find the documentation at crwlr.software.
Contributing
If you are considering contributing to this package, please read the contribution guide (CONTRIBUTING.md) first.