jaybizzle / doc-to-text
Extract text from a Word Doc
Requires
- php: ^7.3
- symfony/process: ^4.0|^5.0
Requires (Dev)
- phpunit/phpunit: ^8.0|^9.0
This package is auto-updated.
Last update: 2025-01-16 04:54:28 UTC
README
This package provides a class to extract text from a Word Doc.
<?php

use Jaybizzle\DocToText\Doc;

echo Doc::getText('book.doc'); // returns the text from the doc
Requirements
Behind the scenes, this package uses antiword. You can check whether the binary is installed on your system by running this command:
which antiword
If it is installed, the command prints the path to the binary.
To install the binary you can use this command on Ubuntu or Debian:
apt-get install antiword
Installation
You can install the package via Composer:
composer require jaybizzle/doc-to-text
Usage
Extracting text from a Doc is easy.
$text = (new Doc())
    ->setDoc('book.doc')
    ->text();
Or easier:
echo Doc::getText('book.doc');
By default, the package assumes that the antiword command is located at /usr/bin/antiword. If it is located elsewhere, pass its binary path to the constructor:
$text = (new Doc('/custom/path/to/antiword'))
    ->setDoc('book.doc')
    ->text();
or as the second parameter to the getText static method:
echo Doc::getText('book.doc', '/custom/path/to/antiword');
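When the binary location differs between machines, the path can be resolved before it is handed to the package. The following is a minimal sketch, not part of the package: findAntiword is a hypothetical helper, and the candidate paths are assumptions that should be adjusted to your systems.

```php
<?php

/**
 * Hypothetical helper (not provided by the package): return the first
 * candidate path that points to an executable file, or null if none does.
 */
function findAntiword(array $candidates = ['/usr/bin/antiword', '/usr/local/bin/antiword']): ?string
{
    foreach ($candidates as $path) {
        // Require a regular file that the current user may execute.
        if (is_file($path) && is_executable($path)) {
            return $path;
        }
    }

    return null;
}

$binary = findAntiword();

if ($binary === null) {
    echo "antiword not found; install it first\n";
} else {
    echo "antiword found at {$binary}\n";
}
```

A non-null result can then be passed to the constructor or as the second parameter to getText, as shown above.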
Sometimes you may want to pass options to antiword. To do so, set them with the setOptions method:
$text = (new Doc())
    ->setDoc('table.doc')
    ->setOptions(['f', 'w 80'])
    ->text();
or as the third parameter to the getText static method:
echo Doc::getText('book.doc', null, ['f', 'w 80']);
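The option strings in these examples appear to correspond to antiword's command-line flags with the leading dash omitted (antiword documents -f for formatted text output and -w for the line width in characters). How the package translates them internally is not stated here, so the sketch below is only an illustration of one plausible mapping. buildAntiwordArgs is a hypothetical function, not part of the package.

```php
<?php

/**
 * Hypothetical helper: turn option strings such as 'f' or 'w 80'
 * into a flat list of command-line arguments.
 */
function buildAntiwordArgs(array $options): array
{
    $args = [];

    foreach ($options as $option) {
        $option = trim($option);
        if ($option === '') {
            continue; // skip empty entries
        }

        // Add the leading dash if the caller omitted it.
        if ($option[0] !== '-') {
            $option = '-' . $option;
        }

        // Split the flag from its value at the first run of whitespace:
        // '-w 80' becomes ['-w', '80'], while '-f' stays ['-f'].
        foreach (preg_split('/\s+/', $option, 2) as $part) {
            $args[] = $part;
        }
    }

    return $args;
}

echo implode(' ', buildAntiwordArgs(['f', 'w 80'])), "\n"; // prints "-f -w 80"
```

Under this assumed mapping, ['f', 'w 80'] would be equivalent to running antiword with -f -w 80 in front of the file name.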
Change log
Please see CHANGELOG for more information about what has changed recently.
Testing
composer test
Security
If you discover any security related issues, please email mbeech@mark-beech.co.uk instead of using the issue tracker.
Credits
License
The MIT License (MIT). Please see License File for more information.