patrickschur / stanford-nlp-tagger
PHP wrapper for the Stanford Natural Language Processing library. Supports POSTagger and CRFClassifier.
Requires
- php: ^7
- patrickschur/language-detection: ^3
README
A PHP wrapper for the Stanford Natural Language Processing library. Supports POSTagger and CRFClassifier. It automatically loads the right packages and detects the language of the given text.
Requirements
- Java 1.8 or higher must be installed.
- Download the required packages and extract them into your project directory. (The script automatically loads the right packages, no matter where they are.)
Installation with Composer
$ composer require patrickschur/stanford-nlp-tagger
Example
- Download the required package for the POSTagger: either the English-only package or the package for Arabic, Chinese, French, Spanish, and German.
- Extract the .zip package into your project directory. (Do not rename the packages unless you want to add them manually.)
$pos = new \StanfordTagger\POSTagger();
$pos->tag('My dog also likes eating sausage.');
Results in
My_PRP$ dog_NN also_RB likes_VBZ eating_JJ sausage_NN ._.
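The tagged string above can be split into word/tag pairs for further processing. The following is a minimal sketch, assuming that `tag()` returns the result as a string in the format shown; `parseSlashTags()` is a hypothetical helper and not part of this library.

```php
<?php
// Hypothetical helper (not part of the library): splits slashTags output
// such as "My_PRP$ dog_NN" into [word, tag] pairs.
function parseSlashTags(string $tagged): array
{
    $pairs = [];
    // Tokens are separated by whitespace.
    foreach (preg_split('/\s+/', trim($tagged), -1, PREG_SPLIT_NO_EMPTY) as $token) {
        // Split on the last underscore so that words containing "_" stay intact.
        $pos = strrpos($token, '_');
        if ($pos === false) {
            continue; // Skip tokens that carry no tag separator.
        }
        $pairs[] = [substr($token, 0, $pos), substr($token, $pos + 1)];
    }
    return $pairs;
}

// Single quotes keep "$" in "PRP$" from being treated as a variable.
$pairs = parseSlashTags('My_PRP$ dog_NN also_RB likes_VBZ eating_JJ sausage_NN ._.');
print_r($pairs);
```

Splitting on the last underscore rather than the first is a design choice that keeps tokens such as `snake_case_NN` intact.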
setOutputFormat()
Three output formats are available: xml, slashTags, and tsv.
$pos = new \StanfordTagger\POSTagger();
$pos->setOutputFormat(StanfordTagger::OUTPUT_FORMAT_XML);
$pos->tag('My dog also likes eating sausage.');
Result as XML:
<?xml version="1.0" encoding="UTF-8"?>
<pos>
  <sentence id="0">
    <word wid="0" pos="PRP$">My</word>
    <word wid="1" pos="NN">dog</word>
    <word wid="2" pos="RB">also</word>
    <word wid="3" pos="VBZ">likes</word>
    <word wid="4" pos="JJ">eating</word>
    <word wid="5" pos="NN">sausage</word>
    <word wid="6" pos=".">.</word>
  </sentence>
</pos>
or use
$pos->setOutputFormat(StanfordTagger::OUTPUT_FORMAT_TSV);
for
My PRP$
dog NN
also RB
likes VBZ
eating JJ
sausage NN
. .
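The XML format is convenient when the result is processed programmatically. The following is a minimal sketch that reads the XML result shown above with PHP's SimpleXML extension and collects all words tagged `NN`. The sample string is copied from this README; in practice it would presumably be the return value of `tag()`, which is an assumption here.

```php
<?php
// Sample copied from the XML result in this README.
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<pos>
  <sentence id="0">
    <word wid="0" pos="PRP$">My</word>
    <word wid="1" pos="NN">dog</word>
    <word wid="2" pos="RB">also</word>
    <word wid="3" pos="VBZ">likes</word>
    <word wid="4" pos="JJ">eating</word>
    <word wid="5" pos="NN">sausage</word>
    <word wid="6" pos=".">.</word>
  </sentence>
</pos>
XML;

$doc = simplexml_load_string($xml);

$nouns = [];
foreach ($doc->sentence as $sentence) {
    foreach ($sentence->word as $word) {
        // Attributes and element text are SimpleXMLElement objects; cast to string.
        if ((string) $word['pos'] === 'NN') {
            $nouns[] = (string) $word;
        }
    }
}
print_r($nouns);
```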
setModel(), setJarArchive() and setClassifier()
All packages are loaded automatically, but you can also set them manually.
$pos = new \StanfordTagger\POSTagger();
$pos->setModel(__DIR__ . '/stanford-postagger-full-2018-10-16/models/english-bidirectional-distsim.tagger');
$pos->setJarArchive(__DIR__ . '/stanford-postagger-full-2018-10-16/stanford-postagger.jar');
CRFClassifier
- Download the required package for the CRFClassifier. On its own, it covers English only.
- Language models for other languages have to be downloaded separately.
- If you downloaded a language model, extract the .jar files and add them to your directory.
Example
$ner = new \StanfordTagger\CRFClassifier();
$ner->tag('Albert Einstein was a theoretical physicist born in Germany.');
Albert/PERSON Einstein/PERSON was/O theoretical/O physicist/O born/O in/O Germany/LOCATION./O
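To work with the recognized entities rather than the raw string, consecutive tokens with the same label can be grouped. The following is a minimal sketch, assuming the output has the `word/LABEL` form shown above and that `O` marks tokens outside any entity; `extractEntities()` is a hypothetical helper and not part of this library.

```php
<?php
// Hypothetical helper (not part of the library): groups consecutive tokens
// that share an entity label into one entity.
function extractEntities(string $tagged): array
{
    // Match word/LABEL pairs. Matching instead of splitting on whitespace
    // also separates "Germany/LOCATION./O" into two tokens.
    preg_match_all('~([^\s/]+)/([A-Z]+)~', $tagged, $matches, PREG_SET_ORDER);

    $entities = [];
    $previous = 'O';
    foreach ($matches as $match) {
        $word  = $match[1];
        $label = $match[2];
        if ($label === 'O') {
            $previous = 'O';
            continue; // Token is not part of an entity.
        }
        if ($label === $previous) {
            // Same label as the preceding token: extend the current entity.
            $entities[count($entities) - 1]['text'] .= ' ' . $word;
        } else {
            $entities[] = ['text' => $word, 'label' => $label];
        }
        $previous = $label;
    }
    return $entities;
}

$entities = extractEntities(
    'Albert/PERSON Einstein/PERSON was/O theoretical/O physicist/O born/O in/O Germany/LOCATION./O'
);
print_r($entities);
```

Note that this simple grouping merges two directly adjacent entities of the same type into one, because the output carries no begin/inside markers.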
Contribute
Feel free to contribute to this repository. Any help is welcome.