NodeJS CoreNLP Library

This project is under active development; please stay tuned for updates. More documentation and examples are coming.

This library connects to Stanford CoreNLP either via HTTP or by spawning processes. HTTP is the preferred method: CoreNLP initializes just once and then serves many requests, and it avoids the extra I/O of the CLI method, which needs to write temporary files to run.

Setup

1. Install the package:

npm i --save corenlp

2. Download Stanford CoreNLP

Via npm, run this command from your own project after having installed this library:

npm explore corenlp -- npm run corenlp:download

Once downloaded you can easily start the server by running

npm explore corenlp -- npm run corenlp:server

Or you can manually download the project from Stanford's CoreNLP download section at: https://stanfordnlp.github.io/CoreNLP/download.html In addition to the full package, you may want to download other language models (see that page for details).

3. Configure Stanford CoreNLP

3.1. Using StanfordCoreNLPServer

# Run the server using all jars in the current directory (e.g., the CoreNLP home directory)
java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000

By default, CoreNLP connects via StanfordCoreNLPServer on port 9000. You can also set up the connection differently:

import CoreNLP, { Properties, Pipeline, ConnectorServer } from 'corenlp';

const connector = new ConnectorServer({ dsn: 'http://localhost:9000' });
const props = new Properties({
  annotators: 'tokenize,ssplit,pos,lemma,ner,parse',
});
const pipeline = new Pipeline(props, 'English', connector);
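
For orientation, the HTTP route boils down to one POST per text against the running server. The sketch below is not this library's internals; it only builds the kind of request StanfordCoreNLPServer accepts (pipeline properties as a JSON-encoded `properties` query parameter, raw text as the body). The helper name `buildAnnotateRequest` is hypothetical.

```javascript
// Hypothetical helper (not part of corenlp): builds the URL and fetch options
// for a single annotation request against StanfordCoreNLPServer.
function buildAnnotateRequest(dsn, annotators, text) {
  const properties = {
    annotators: annotators.join(','), // CoreNLP expects a comma-separated list
    outputFormat: 'json',
  };
  const url = new URL(dsn);
  url.searchParams.set('properties', JSON.stringify(properties));
  return {
    url: url.toString(),
    options: { method: 'POST', body: text },
  };
}

const request = buildAnnotateRequest(
  'http://localhost:9000',
  ['tokenize', 'ssplit', 'pos'],
  'Hello world'
);
console.log(request.url);
// With a server running, Node 18+ could send it via:
//   fetch(request.url, request.options).then(res => res.json())
```

Because the server stays up between calls, only the first request pays the model-loading cost.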

3.2. Use CoreNLP via CLI

By default, CoreNLP expects the StanfordCoreNLP package to be placed (unzipped) inside the path ${YOUR_NPM_PROJECT_ROOT}/corenlp/. You can also set up the CLI interface differently:

import CoreNLP, { Properties, Pipeline, ConnectorCli } from 'corenlp';

const connector = new ConnectorCli({
  classPath: 'corenlp/stanford-corenlp-full-2017-06-09/*', // specify the paths relative to your npm project root
  mainClass: 'edu.stanford.nlp.pipeline.StanfordCoreNLP', // optional
  props: 'StanfordCoreNLP-spanish.properties', // optional
});
const props = new Properties({
  annotators: 'tokenize,ssplit,pos,lemma,ner,parse',
});
const pipeline = new Pipeline(props, 'English', connector);

4. Usage

// ... initialize pipeline first (see above)

const sent = new CoreNLP.simple.Sentence('Hello world');
pipeline.annotate(sent)
  .then(sent => {
    console.log(sent.words());
  })
  .catch(err => {
    console.log('err', err);
  });

Examples

NOTE 1: The examples below assume that StanfordCoreNLP is running on port 9000.
NOTE 2: The examples below assume ES6 syntax. If you use require, load the library as follows: var CoreNLP = require('corenlp').default;

English

import CoreNLP, { Properties, Pipeline } from 'corenlp';

const props = new Properties({
  annotators: 'tokenize,ssplit,pos,lemma,ner,parse',
});
const pipeline = new Pipeline(props, 'English'); // uses ConnectorServer by default

const sent = new CoreNLP.simple.Sentence('The little dog runs so fast.');
pipeline.annotate(sent)
  .then(sent => {
    console.log('parse', sent.parse());
    console.log(CoreNLP.util.Tree.fromSentence(sent).dump());
  })
  .catch(err => {
    console.log('err', err);
  });

API Reference

Functions

setProperty(name, value): Property setter
getProperty(name, default) ⇒ *: Property getter
getProperties() ⇒ Object: Returns an Object map of the given properties
toJson() ⇒ Object: Returns a JSON object of the given properties
toPropertiessFileContent() ⇒ string: Returns a properties file-like string of the given properties
get() ⇒ Promise.<Object>
get(config, [utility]) ⇒ Promise.<Object>
text() ⇒ string: Get a string representation of the raw text
setLanguageISO() ⇒ string: Sets the language ISO (given by the pipeline during the annotation process). This is solely to keep track of the language chosen for further analysis.
getLanguageISO() ⇒ string: Retrieves the language ISO
addAnnotator(annotator): Marks an annotator as a met dependency
addAnnotators(annotators): Marks multiple annotators as met dependencies
removeAnnotator(annotator): Unmarks an annotator as a met dependency
hasAnnotator(annotator) ⇒ boolean: Tells you if an annotator is a met dependency
hasAnyAnnotator(annotators) ⇒ boolean: Tells you if at least one of a list of annotators is a met dependency
toString() ⇒ string: Get a string representation
equalsTo(annotator) ⇒ boolean: Determines whether a given annotator is the same as the current one, using a shallow comparison. This lets a Document or Sentence validate that the minimum required annotators were already applied to it, while still allowing users to instantiate new annotators and configure them as needed.
options() ⇒ Object: Get an Object key-value representation of the annotator's options (excluding prefix)
option(key, [value]) ⇒ string: Get/Set an option value
dependencies() ⇒ Array.<Annotator>: Get a list of annotator dependencies
pipeline() ⇒ Array.<string>: Get a list of the annotator's dependencies, followed by this annotator, all as a list of strings. This is useful to fill in the annotators param in CoreNLP API properties.
pipelineOptions() ⇒ Array.<string>: Get an object of all the Annotator options, including those of the current annotator and of all its dependencies, prefixed by the annotator names. This is useful to fill in the options params in CoreNLP API properties.
toString() ⇒ string: Get a string representation
sentences() ⇒ Array.<Sentence>: Get a list of sentences
sentence(index) ⇒ Sentence: Get the sentence for a given index
coref() ⇒ Promise.<DeterministicCorefAnnotator>: TODO requirements: tokenize, ssplit, pos, lemma, ner, parse https://stanfordnlp.github.io/CoreNLP/dcoref.html
fromJSON(data) ⇒ Document: Update an instance of Document with data provided by a JSON
fromJSON(data) ⇒ Document: Get an instance of Document from a given JSON
groups() ⇒ Array.<ExpressionSentenceMatchGroup>: Returns the main and labeled groups as a list of ExpressionSentenceMatchGroup
group(label) ⇒ ExpressionSentenceMatchGroup: Nodes in a matched expression can be named; we call them groups here, and the labels are the names of the nodes.
labels() ⇒ Array.<string>: Labels are the aliases you can add to a group match expression. For example, in Semgrex, you can write {ner:/PERSON/=good_guy}, where "good_guy" is the label; internally it comes as $good_guy, a member of ExpressionSentenceMatchGroup.
fromJson(data) ⇒ ExpressionSentenceMatch: Update an instance of ExpressionSentenceMatch with data provided by a JSON
fromJson(data) ⇒ ExpressionSentenceMatch: Get an instance of ExpressionSentenceMatch from a given JSON
matches() ⇒ Array.<ExpressionSentenceMatch>: Retrieves all the contained ExpressionSentenceMatch instances
match(index) ⇒ ExpressionSentenceMatch: Retrieves an ExpressionSentenceMatch at the index specified
mergeTokensFromSentence() ⇒ ExpressionSentence: The Expression / ExpressionSentence objects come from outside the standard CoreNLP pipelines. This means that neither TokensRegex, Semgrex nor Tregex will tag the nodes with POS, lemma, NER or any other annotation data. This method is useful when, apart from getting the matching groups, you also want the annotated tokens for each word in the match group.
fromJson(data) ⇒ ExpressionSentenceJSON: Update an instance of ExpressionSentence with data provided by a JSON
fromJson(data) ⇒ ExpressionSentence: Get an instance of ExpressionSentence from a given JSON of sentence matches
toString() ⇒ string: Get a string representation
pattern() ⇒ string: Get the pattern
sentences() ⇒ Array.<ExpressionSentence>: Get a list of sentences
sentence(index) ⇒ ExpressionSentence: Get the sentence for a given index
mergeTokensFromDocument(document) ⇒ Expression: Hydrate the Expression instance with Token objects from an annotated Document
fromJson(data) ⇒ Expression: Update an instance of Expression with data provided by a JSON
fromJson(data) ⇒ Expression: Get an instance of Expression from a given JSON
toString() ⇒ string: Get a string representation
fromJSON(data) ⇒ Governor: Get an instance of Governor from a given JSON
toString() ⇒ string: Get a string representation
index() ⇒ number: Get the index relative to the parent document
parse() ⇒ string: Get a string representation of the parse tree structure
words() ⇒ Array.<string>: Get an array of string representations of the sentence words
word(index) ⇒ string: Get a string representation of the Nth word of the sentence
posTags() ⇒ Array.<string>: Get the string representations of the parts of speech of the sentence tokens
posTag(index) ⇒ string: Get a string representation of the part of speech of the Nth token of the sentence
lemmas() ⇒ Array.<string>: Get the string representations of the lemmas of the sentence tokens
lemma(index) ⇒ string: Get a string representation of the lemma of the Nth token of the sentence
nerTags() ⇒ Array.<string>: Get the string representations of the NER tags of the sentence tokens
nerTag(index) ⇒ string: Get a string representation of the NER tag of the Nth token of the sentence
governors() ⇒ Array.<Governor>: Get a list of the governors annotated by the dependency parser
governor() ⇒ Governor: Get the Nth governor annotated by the dependency parser annotator
tokens() ⇒ Array.<Token>: Get an array of token representations of the sentence words
token() ⇒ Token: Get the Nth token of the sentence
toJSON() ⇒ SentenceJSON: The arrow function data => Sentence.fromJSON(data).toJSON() is idempotent under shallow comparison (not by reference). This JSON respects the same structure that {@see Sentence#fromJSON} expects.
fromJSON(data, [isSentence]) ⇒ Sentence: Update an instance of Sentence with data provided by a JSON
fromJSON(data, [isSentence]) ⇒ Sentence: Get an instance of Sentence from a given JSON
toString() ⇒ string: Get a string representation
index() ⇒ number: Get the index number assigned by StanfordCoreNLP. This index is relative to the sentence the token belongs to, and is 1-based (a positive integer). This number is useful to match tokens within a sentence for depparse, coreference, etc.
word() ⇒ string: Get the original word
originalText() ⇒ string: Get the original text
characterOffsetBegin() ⇒ number: A 0-based index of the word's initial character within the sentence
characterOffsetEnd() ⇒ number: Get the characterOffsetEnd relative to the parent sentence. A 0-based index of the word's ending character within the sentence
before() ⇒ string: Get the before string relative to the container sentence
after() ⇒ string: Get the after string relative to the container sentence
lemma() ⇒ string: Get the annotated lemma
pos() ⇒ string: Get the annotated part-of-speech for the current token
posInfo() ⇒ PosInfo: Get additional metadata about the POS annotation. NOTE: Use this method only for study or analysis purposes.
ner() ⇒ string: Get the annotated named-entity for the current token
toJSON() ⇒ TokenJSON: The arrow function data => Token.fromJSON(data).toJSON() is idempotent under shallow comparison (not by reference). This JSON respects the same structure that {@see Token#fromJSON} expects.
fromJSON(data) ⇒ Token: Get an instance of Token from a given JSON
dump() ⇒ string: Get a Tree string representation for debugging purposes
visitDeepFirst(): Performs a depth-first search, calling a visitor for each node
visitDeepFirstRight(): Performs a depth-first search, calling a visitor for each node, from right to left
visitLeaves(): Performs a depth-first search, calling a visitor only over leaves
fromSentence(sentence, [doubleLink]) ⇒ Tree
fromString(str, [doubleLink]) ⇒ Tree

Typedefs

DocumentJSON: The CoreNLP API JSON structure representing a document
ExpressionSentenceMatchGroup
ExpressionSentenceMatchJSON: An ExpressionSentenceMatch of either TokensRegex, Semgrex or Tregex.
ExpressionJSON: The CoreNLP API JSON structure representing an expression. This expression structure can be found as the output of TokensRegex, Semgrex and Tregex.
GovernorJSON: The CoreNLP API JSON structure representing a governor
SentenceJSON: The CoreNLP API JSON structure representing a sentence
TokenJSON: The CoreNLP API JSON structure representing a token
PosInfo: PosInfo does not come as part of the CoreNLP. It is an indexed reference of POS tags by language provided by this library. It's only helpful for analysis and study. The data was collected from different documentation resources on the Web. The PosInfo may vary depending on the POS annotation types used, for example, CoreNLP for Spanish uses custom POS tags developed by Stanford, but this can also be changed to Universal Dependencies, which uses different tags.

External

DeterministicCorefAnnotator TODO ?? ⇐ Annotator: Class representing a DeterministicCorefAnnotator.
DependencyParseAnnotator Hydrates {@link Sentence.governors()} ⇐ Annotator: Class representing a DependencyParseAnnotator.
MorphaAnnotator Hydrates {@link Token.lemma()} ⇐ Annotator: Class representing a MorphaAnnotator.
NERClassifierCombiner Hydrates {@link Token.ner()} ⇐ Annotator: Class representing an NERClassifierCombiner.
ParserAnnotator Hydrates {@link Token.parse()} ⇐ Annotator: Class representing a ParserAnnotator.
POSTaggerAnnotator Hydrates {@link Token.pos()} ⇐ Annotator: Class representing a POSTaggerAnnotator.
RegexNERAnnotator TODO ?? ⇐ Annotator: Class representing a RegexNERAnnotator.
RelationExtractorAnnotator TODO ?? ⇐ Annotator: Class representing a RelationExtractorAnnotator.
WordsToSentenceAnnotator Combines multiple {@link Token}s into sentences ⇐ Annotator: Class representing a WordsToSentenceAnnotator.
TokenizerAnnotator Identifies {@link Token}s ⇐ Annotator: Class representing a TokenizerAnnotator.

setProperty(name, value)

Property setter

Kind: global function

Param	Type	Description
name	`string`	the property name
value	`*`	the property value

getProperty(name, default) ⇒ `*`

Property getter

Kind: global function
Returns: * - value - the property value

Param	Type	Description
name	`string`	the property name
default	`*`	the default value to return if not set

getProperties() ⇒ `Object`

Returns an Object map of the given properties

Kind: global function
Returns: Object - properties - the properties object

toJson() ⇒ `Object`

Returns a JSON object of the given properties

Kind: global function
Returns: Object - json - the properties object

toPropertiessFileContent() ⇒ `string`

Returns a properties file-like string of the given properties

Kind: global function
Returns: string - properties - the properties content
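
As an illustration of what a "properties file-like string" means, the sketch below serializes a plain object into Java `.properties` lines (`key = value`, one per line). The function name and the exact spacing are assumptions made for illustration; the library's own output may differ.

```javascript
// Hypothetical helper, not the library's implementation: turns a plain
// object into Java .properties-style text, one "key = value" pair per line.
function toPropertiesText(properties) {
  return Object.keys(properties)
    .map(key => `${key} = ${properties[key]}`)
    .join('\n');
}

const propertiesText = toPropertiesText({
  annotators: 'tokenize,ssplit,pos',
  'tokenize.language': 'es',
});
console.log(propertiesText);
```

A string like this is what the CLI route would need to write to a temporary file before spawning the Java process.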

get() ⇒ `Promise.<Object>`

Kind: global function

get(config, [utility]) ⇒ `Promise.<Object>`

Kind: global function

Param	Type	Description
config	`Object`
config.annotators	`Array.<string>`	The list of annotators that defines the pipeline
config.text	`string`	The text to run the pipeline against
config.options	`Object`	Additional options (properties) for the pipeline
config.language	`string`	Language full name in CamelCase (e.g. Spanish)
[utility]	`''` \| `'tokensregex'` \| `'semgrex'` \| `'tregex'`	Name of the utility to use. NOTE: most of the utilities receive properties; these should be passed via the options param

text() ⇒ `string`

Get a string representation of the raw text

Kind: global function
Returns: string - text

setLanguageISO() ⇒ `string`

Sets the language ISO (given by the pipeline during the annotation process). This is solely to keep track of the language chosen for further analysis.

Kind: global function
Returns: string - text

getLanguageISO() ⇒ `string`

Retrieves the language ISO

Kind: global function
Returns: string - text

addAnnotator(annotator)

Marks an annotator as a met dependency

Kind: global function

Param	Type
annotator	`Annotator` \| `function`

addAnnotators(annotators)

Marks multiple annotators as met dependencies

Kind: global function

Param	Type
annotators	`Array.<(Annotator\|function())>`

removeAnnotator(annotator)

Unmarks an annotator as a met dependency

Kind: global function

Param	Type
annotator	`Annotator` \| `function`

hasAnnotator(annotator) ⇒ `boolean`

Tells you if an annotator is a met dependency

Kind: global function
Returns: boolean - hasAnnotator

Param	Type
annotator	`Annotator` \| `function`

hasAnyAnnotator(annotators) ⇒ `boolean`

Tells you if at least one of a list of annotators is a met dependency

Kind: global function
Returns: boolean - hasAnyAnnotator

Param	Type
annotators	`Array.<(Annotator\|function())>`
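
The "met dependency" bookkeeping described by addAnnotator, addAnnotators, removeAnnotator, hasAnnotator and hasAnyAnnotator can be pictured as a set of annotator names. The class below is a simplified stand-in that uses plain strings (the documented methods take Annotator instances or classes), so every name in it is illustrative.

```javascript
// Simplified stand-in: tracks which annotators were already applied.
class AnnotatedText {
  constructor() {
    this.applied = new Set();
  }
  addAnnotator(name) { this.applied.add(name); }
  addAnnotators(names) { names.forEach(name => this.addAnnotator(name)); }
  removeAnnotator(name) { this.applied.delete(name); }
  hasAnnotator(name) { return this.applied.has(name); }
  hasAnyAnnotator(names) { return names.some(name => this.hasAnnotator(name)); }
}

const annotated = new AnnotatedText();
annotated.addAnnotators(['tokenize', 'ssplit']);
annotated.removeAnnotator('ssplit');
console.log(annotated.hasAnnotator('tokenize'));          // true
console.log(annotated.hasAnyAnnotator(['pos', 'lemma'])); // false
```

Accessors that depend on a particular annotator can consult this kind of record and throw when the requirement was not met.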

toString() ⇒ `string`

Get a string representation

Kind: global function
Returns: string - annotator

equalsTo(annotator) ⇒ `boolean`

Determines whether a given annotator is the same as the current one, using a shallow comparison. This lets a Document or Sentence validate that the minimum required annotators were already applied to it, while still allowing users to instantiate new annotators and configure them as needed.

Kind: global function

Param	Type
annotator	`Annotator`

options() ⇒ `Object`

Get an Object key-value representation of the annotator's options (excluding prefix)

Kind: global function
Returns: Object - options

option(key, [value]) ⇒ `string`

Get/Set an option value

Kind: global function
Returns: string - value

Param	Type	Default
key	`string`
[value]	`string` \| `boolean`	`null`

dependencies() ⇒ `Array.<Annotator>`

Get a list of annotator dependencies

Kind: global function
Returns: Array.<Annotator> - dependencies

pipeline() ⇒ `Array.<string>`

Get a list of the annotator's dependencies, followed by this annotator, all as a list of strings. This is useful to fill in the annotators param in CoreNLP API properties.

Kind: global function
Returns: Array.<string> - pipeline

pipelineOptions() ⇒ `Array.<string>`

Get an object of all the Annotator options, including those of the current annotator and of all its dependencies, prefixed by the annotator names. This is useful to fill in the options params in CoreNLP API properties.

Kind: global function
Returns: Array.<string> - pipelineOptions
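
To make pipeline() and pipelineOptions() more concrete, here is a small sketch that lists dependencies before the annotator itself and prefixes each option with its annotator's name. The annotator records, their dependency lists and the option values are made up for illustration; they are not the library's Annotator class, and the options are returned here as a plain object.

```javascript
// Illustrative annotator records: name, own options, direct dependencies.
const tokenize = { name: 'tokenize', options: { language: 'en' }, dependencies: [] };
const ssplit = { name: 'ssplit', options: {}, dependencies: [tokenize] };
const pos = { name: 'pos', options: { maxlen: 100 }, dependencies: [tokenize, ssplit] };

// Dependencies first (depth-first, without duplicates), then the annotator.
function pipeline(annotator, seen = []) {
  annotator.dependencies.forEach(dep => pipeline(dep, seen));
  if (!seen.includes(annotator.name)) seen.push(annotator.name);
  return seen;
}

// Options of the annotator and its dependencies, keyed as "<name>.<option>".
function pipelineOptions(annotator, out = {}) {
  annotator.dependencies.forEach(dep => pipelineOptions(dep, out));
  Object.keys(annotator.options).forEach(key => {
    out[`${annotator.name}.${key}`] = annotator.options[key];
  });
  return out;
}

console.log(pipeline(pos).join(',')); // tokenize,ssplit,pos
console.log(pipelineOptions(pos));
```

The joined list has the same comma-separated shape as the `annotators` value used in the Properties examples earlier in this README.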

toString() ⇒ `string`

Get a string representation

Kind: global function
Returns: string - document

sentences() ⇒ `Array.<Sentence>`

Get a list of sentences

Kind: global function
Returns: Array.<Sentence> - sentences - The document sentences

sentence(index) ⇒ `Sentence`

Get the sentence for a given index

Kind: global function
Returns: Sentence - sentence - The document sentences

Param	Type	Description
index	`number`	The position of the sentence to get

coref() ⇒ `Promise.<DeterministicCorefAnnotator>`

TODO requirements: tokenize, ssplit, pos, lemma, ner, parse https://stanfordnlp.github.io/CoreNLP/dcoref.html

Kind: global function
Returns: Promise.<DeterministicCorefAnnotator> - dcoref

fromJSON(data) ⇒ `Document`

Update an instance of Document with data provided by a JSON

Kind: global function
Returns: Document - document - The current document instance

Param	Type	Description
data	`DocumentJSON`	The document data, as returned by CoreNLP API service

fromJSON(data) ⇒ `Document`

Get an instance of Document from a given JSON

Kind: global function
Returns: Document - document - A new Document instance

Param	Type	Description
data	`DocumentJSON`	The document data, as returned by CoreNLP API service

groups() ⇒ `Array.<ExpressionSentenceMatchGroup>`

Returns the main and labeled groups as a list of ExpressionSentenceMatchGroup

Kind: global function
Returns: Array.<ExpressionSentenceMatchGroup> - groups

group(label) ⇒ `ExpressionSentenceMatchGroup`

Nodes in a matched expression can be named; we call them groups here, and the labels are the names of the nodes.

Kind: global function
Returns: ExpressionSentenceMatchGroup - group
See: https://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/semgraph/semgrex/SemgrexPattern.html#Naming_nodes

Param	Type	Description
label	`string`	The label name, not prefixed with $

labels() ⇒ `Array.<string>`

Labels are the aliases you can add to a group match expression. For example, in Semgrex, you can write {ner:/PERSON/=good_guy}, where "good_guy" is the label; internally it comes as $good_guy, a member of ExpressionSentenceMatchGroup.

Kind: global function
Returns: Array.<string> - labels
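
The relationship between labels and their `$`-prefixed members can be sketched with a plain object. The shape of `match` below is an assumption made for illustration (the real data comes from the CoreNLP API), and `labelsOf` and `groupOf` are hypothetical helpers, not library functions.

```javascript
// Assumed shape of one match: the main group's fields plus one "$label"
// member per named node.
const match = {
  text: 'John',
  begin: 0,
  end: 1,
  $good_guy: { text: 'John', begin: 0, end: 1 },
};

// Collect label names by stripping the "$" prefix from labeled members.
function labelsOf(matchData) {
  return Object.keys(matchData)
    .filter(key => key.startsWith('$'))
    .map(key => key.slice(1));
}

// Look a group up by its label, given without the "$" prefix.
function groupOf(matchData, label) {
  return matchData[`$${label}`];
}

console.log(labelsOf(match));                 // [ 'good_guy' ]
console.log(groupOf(match, 'good_guy').text); // John
```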

fromJson(data) ⇒ `ExpressionSentenceMatch`

Update an instance of ExpressionSentenceMatch with data provided by a JSON

Kind: global function
Returns: ExpressionSentenceMatch - expression - The current match instance

Param	Type	Description
data	`ExpressionSentenceMatchJSON`	The match data, as returned by CoreNLP API service

fromJson(data) ⇒ `ExpressionSentenceMatch`

Get an instance of ExpressionSentenceMatch from a given JSON

Kind: global function
Returns: ExpressionSentenceMatch - match - A new ExpressionSentenceMatch instance

Param	Type	Description
data	`ExpressionSentenceMatchJSON`	The match data, as returned by CoreNLP API service

matches() ⇒ `Array.<ExpressionSentenceMatch>`

Retrieves all the contained ExpressionSentenceMatch instances

Kind: global function
Returns: Array.<ExpressionSentenceMatch> - matches

match(index) ⇒ `ExpressionSentenceMatch`

Retrieves an ExpressionSentenceMatch at the index specified

Kind: global function
Returns: ExpressionSentenceMatch - match

Param	Type
index	`number`

mergeTokensFromSentence() ⇒ `ExpressionSentence`

The Expression / ExpressionSentence objects come from outside the standard CoreNLP pipelines. This means that neither TokensRegex, Semgrex nor Tregex will tag the nodes with POS, lemma, NER or any other annotation data. This method is useful when, apart from getting the matching groups, you also want the annotated tokens for each word in the match group.

Kind: global function
Returns: ExpressionSentence - instance - The current instance

fromJson(data) ⇒ `ExpressionSentenceJSON`

Update an instance of ExpressionSentence with data provided by a JSON

Kind: global function
Returns: ExpressionSentenceJSON - sentence - The current sentence instance

Param	Type	Description
data	`ExpressionSentenceJSON`	The expression data, as returned by CoreNLP API service

fromJson(data) ⇒ `ExpressionSentence`

Get an instance of ExpressionSentence from a given JSON of sentence matches

Kind: global function
Returns: ExpressionSentence - sentence - A new ExpressionSentence instance

Param	Type	Description
data	`ExpressionSentenceJSON`	The sentence data, as returned by CoreNLP API service

toString() ⇒ `string`

Get a string representation

Kind: global function
Returns: string - expression

pattern() ⇒ `string`

Get the pattern

Kind: global function
Returns: string - pattern - The expression pattern

sentences() ⇒ `Array.<ExpressionSentence>`

Get a list of sentences

Kind: global function
Returns: Array.<ExpressionSentence> - sentences - The expression sentences

sentence(index) ⇒ `ExpressionSentence`

Get the sentence for a given index

Kind: global function
Returns: ExpressionSentence - sentence - An expression sentence

Param	Type	Description
index	`number`	The position of the sentence to get

mergeTokensFromDocument(document) ⇒ `Expression`

Hydrate the Expression instance with Token objects from an annotated Document

Kind: global function
Returns: Expression - expression - The current expression instance
See: ExpressionSentence#mergeTokensFromSentence

Param	Type	Description
document	`Document`	An annotated document from where to extract the tokens

fromJson(data) ⇒ `Expression`

Update an instance of Expression with data provided by a JSON

Kind: global function
Returns: Expression - expression - The current expression instance

Param	Type	Description
data	`ExpressionJSON`	The expression data, as returned by CoreNLP API service

fromJson(data) ⇒ `Expression`

Get an instance of Expression from a given JSON

Kind: global function
Returns: Expression - expression - A new Expression instance

Param	Type	Description
data	`ExpressionJSON`	The expression data, as returned by CoreNLP API service

toString() ⇒ `string`

Get a string representation

Kind: global function
Returns: string - governor

fromJSON(data) ⇒ `Governor`

Get an instance of Governor from a given JSON

Kind: global function
Returns: Governor - governor - A new Governor instance
Todo

It is not possible to properly generate a Governor from a GovernorJSON: the Governor requires references to the Token instances in order to work

Param	Type	Description
data	`GovernorJSON`	The token data, as returned by CoreNLP API service

toString() ⇒ `string`

Get a string representation

Kind: global function
Returns: string - sentence

index() ⇒ `number`

Get the index relative to the parent document

Kind: global function
Returns: number - index

parse() ⇒ `string`

Get a string representation of the parse tree structure

Kind: global function
Returns: string - parse

words() ⇒ `Array.<string>`

Get an array of string representations of the sentence words

Kind: global function
Returns: Array.<string> - words
Throws:

Error in case the required annotator was not applied to the sentence

Requires: {@link TokenizerAnnotator}

word(index) ⇒ `string`

Get a string representation of the Nth word of the sentence

Kind: global function
Returns: string - word
Throws:

Error in case the required annotator was not applied to the sentence
Error in case the token for the given index does not exist

Requires: {@link TokenizerAnnotator}

Param	Type	Description
index	`number`	0-based index as they are arranged naturally

posTags() ⇒ `Array.<string>`

Get the string representations of the parts of speech of the sentence tokens

Kind: global function
Returns: Array.<string> - posTags

posTag(index) ⇒ `string`

Get a string representation of the part of speech of the Nth token of the sentence

Kind: global function
Returns: string - posTag
Throws:

Error in case the token for the given index does not exist

Param	Type	Description
index	`number`	0-based index as they are arranged naturally

lemmas() ⇒ `Array.<string>`

Get the string representations of the lemmas of the sentence tokens

Kind: global function
Returns: Array.<string> - lemmas

lemma(index) ⇒ `string`

Get a string representation of the lemma of the Nth token of the sentence

Kind: global function
Returns: string - lemma
Throws:

Error in case the token for the given index does not exist

Param	Type	Description
index	`number`	0-based index as they are arranged naturally

nerTags() ⇒ `Array.<string>`

Get the string representations of the NER tags of the sentence tokens

Kind: global function
Returns: Array.<string> - nerTags

nerTag(index) ⇒ `string`

Get a string representation of the NER tag of the Nth token of the sentence

Kind: global function
Returns: string - nerTag
Throws:

Error in case the token for the given index does not exist

Param	Type	Description
index	`number`	0-based index as they are arranged naturally

governors() ⇒ `Array.<Governor>`

Get a list of the governors annotated by the dependency parser

Kind: global function
Returns: Array.<Governor> - governors
Throws:

Error in case the required annotator was not applied to the sentence

Requires: {@link DependencyParseAnnotator}

governor() ⇒ `Governor`

Get the N-th annotated governor by the dependency-parser annotator

Kind: global function
Returns: Governor - governor
Throws:

Error in case the required annotator was not applied to the sentence

Requires: {@link DependencyParseAnnotator}

tokens() ⇒ `Array.<Token>`

Get an array of token representations of the sentence words

Kind: global function
Returns: Array.<Token> - tokens
Throws:

Error in case the required annotator was not applied to the sentence

Requires: {@link TokenizerAnnotator}

token() ⇒ `Token`

Get the Nth token of the sentence

Kind: global function
Returns: Token - token
Throws:

Error in case the required annotator was not applied to the sentence

Requires: {@link TokenizerAnnotator}

toJSON() ⇒ `SentenceJSON`

The arrow function data => Sentence.fromJSON(data).toJSON() is idempotent under shallow comparison (not by reference). This JSON respects the same structure that {@see Sentence#fromJSON} expects.

Kind: global function
Returns: SentenceJSON - data

fromJSON(data, [isSentence]) ⇒ `Sentence`

Update an instance of Sentence with data provided by a JSON

Kind: global function
Returns: Sentence - sentence - The current sentence instance

Param	Type	Default	Description
data	`SentenceJSON`		The document data, as returned by CoreNLP API service
[isSentence]	`boolean`	`false`	Indicate if the given data represents just the sentence or a full document with just a sentence inside

fromJSON(data, [isSentence]) ⇒ `Sentence`

Get an instance of Sentence from a given JSON

Kind: global function
Returns: Sentence - document - A new Sentence instance

Param	Type	Default	Description
data	`SentenceJSON`		The document data, as returned by CoreNLP API service
[isSentence]	`boolean`	`false`	Indicate if the given data represents just the sentence or a full document

toString() ⇒ `string`

Get a string representation

Kind: global function
Returns: string - token

index() ⇒ `number`

Get the index number assigned by StanfordCoreNLP. This index is relative to the sentence the token belongs to, and is 1-based (a positive integer). This number is useful to match tokens within a sentence for depparse, coreference, etc.

Kind: global function
Returns: number - index
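
Since the token index is 1-based while the sentence accessors such as word(index) take 0-based positions, mixing the two is an easy off-by-one. The sketch below uses hand-written token and dependency data shaped like CoreNLP's JSON output; the values and the `tokenByIndex` helper are illustrative only.

```javascript
// Illustrative tokens shaped like CoreNLP JSON output (index is 1-based).
const sampleTokens = [
  { index: 1, word: 'Hello' },
  { index: 2, word: 'world' },
];

// Map a CoreNLP token index (1-based) to an array position (0-based).
const tokenByIndex = index => sampleTokens[index - 1];

// A dependency edge refers to both of its ends by token index.
const edge = { dep: 'discourse', governor: 2, dependent: 1 };
console.log(tokenByIndex(edge.governor).word);  // world
console.log(tokenByIndex(edge.dependent).word); // Hello
```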

word() ⇒ `string`

Get the original word

Kind: global function
Returns: string - word

originalText() ⇒ `string`

Get the original text

Kind: global function
Returns: string - originalText

characterOffsetBegin() ⇒ `number`

A 0-based index of the word's initial character within the sentence

Kind: global function
Returns: number - characterOffsetBegin

characterOffsetEnd() ⇒ `number`

Get the characterOffsetEnd relative to the parent sentence. A 0-based index of the word's ending character within the sentence

Kind: global function
Returns: number - characterOffsetEnd
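
A quick way to picture the two offsets is to slice the original text with them. The token data below is hand-written, and the end offset is assumed to be exclusive (one past the last character), as in CoreNLP's JSON output.

```javascript
const sentenceText = 'Hello world';

// Illustrative token data; characterOffsetEnd is assumed to be exclusive.
const worldToken = { word: 'world', characterOffsetBegin: 6, characterOffsetEnd: 11 };

const surface = sentenceText.slice(
  worldToken.characterOffsetBegin,
  worldToken.characterOffsetEnd
);
console.log(surface === worldToken.word); // true
```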

before() ⇒ `string`

Get the before string relative to the container sentence

Kind: global function
Returns: string - before

after() ⇒ `string`

Get the after string relative to the container sentence

Kind: global function
Returns: string - after

lemma() ⇒ `string`

Get the annotated lemma

Kind: global function
Returns: string - lemma

pos() ⇒ `string`

Get the annotated part-of-speech for the current token

Kind: global function
Returns: string - pos

posInfo() ⇒ `PosInfo`

Get additional metadata about the POS annotation. NOTE: Use this method only for study or analysis purposes.

Kind: global function
Returns: PosInfo - posInfo
See: PosInfo for more details

ner() ⇒ `string`

Get the annotated named-entity for the current token

Kind: global function
Returns: string - ner

toJSON() ⇒ `TokenJSON`

The arrow function data => Token.fromJSON(data).toJSON() is idempotent under shallow comparison (not by reference). This JSON respects the same structure that {@see Token#fromJSON} expects.

Kind: global function
Returns: TokenJSON - data
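
The round-trip property described for toJSON can be illustrated with a tiny stand-in class. `MiniToken` is hypothetical and much simpler than the library's Token; it only shows what "equal by content, not by reference" means.

```javascript
// Minimal stand-in for a class with a fromJSON/toJSON pair.
class MiniToken {
  static fromJSON(data) {
    const token = new MiniToken();
    token.data = { ...data }; // keep a shallow copy, not the same reference
    return token;
  }
  toJSON() {
    return { ...this.data };
  }
}

const roundTrip = data => MiniToken.fromJSON(data).toJSON();
const input = { index: 1, word: 'Hello', pos: 'UH' };
const output = roundTrip(input);

console.log(output === input);                                 // false
console.log(JSON.stringify(output) === JSON.stringify(input)); // true
```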

fromJSON(data) ⇒ `Token`

Get an instance of Token from a given JSON

Kind: global function
Returns: Token - token - A new Token instance

Param	Type	Description
data	`TokenJSON`	The token data, as returned by CoreNLP API service

dump() ⇒ `string`

Get a Tree string representation for debugging purposes

Kind: global function
Returns: string - tree

visitDeepFirst()

Performs a depth-first search, calling a visitor for each node

Kind: global function
See: DFS

visitDeepFirstRight()

Performs a depth-first search, calling a visitor for each node, from right to left

Kind: global function
See: DFS

visitLeaves()

Performs a depth-first search, calling a visitor only over leaves

Kind: global function
See: DFS
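
The three traversals can be sketched on a hand-built tree of plain objects. This is not the library's Tree class: the node shape, the function names and the choice of pre-order (parent before children) are assumptions made for illustration.

```javascript
// A tiny constituency tree for "The dog runs" as nested plain objects.
const tree = {
  label: 'S',
  children: [
    { label: 'NP', children: [
      { label: 'DT', children: [{ label: 'The', children: [] }] },
      { label: 'NN', children: [{ label: 'dog', children: [] }] },
    ] },
    { label: 'VP', children: [
      { label: 'VBZ', children: [{ label: 'runs', children: [] }] },
    ] },
  ],
};

// Pre-order depth-first traversal; rightToLeft flips the child order.
function visitDepthFirst(node, visitor, rightToLeft = false) {
  visitor(node);
  const children = rightToLeft ? [...node.children].reverse() : node.children;
  children.forEach(child => visitDepthFirst(child, visitor, rightToLeft));
}

// Same traversal, but the visitor only sees nodes without children.
function visitLeaves(node, visitor) {
  visitDepthFirst(node, n => { if (n.children.length === 0) visitor(n); });
}

const leftToRight = [];
visitDepthFirst(tree, n => leftToRight.push(n.label));
const rightFirst = [];
visitDepthFirst(tree, n => rightFirst.push(n.label), true);
const leaves = [];
visitLeaves(tree, n => leaves.push(n.label));

console.log(leftToRight.join(' ')); // S NP DT The NN dog VP VBZ runs
console.log(rightFirst.join(' '));  // S VP VBZ runs NP NN dog DT The
console.log(leaves.join(' '));      // The dog runs
```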

fromSentence(sentence, [doubleLink]) ⇒ `Tree`

Kind: global function
Returns: Tree - tree

Param	Type	Default	Description
sentence	`Sentence`
[doubleLink]	`boolean`	`false`	whether the child nodes should have a reference to their parent or not - this allows the use of Node.parent()

fromString(str, [doubleLink]) ⇒ `Tree`

Kind: global function
Returns: Tree - tree

Param	Type	Default	Description
str	`string`
[doubleLink]	`boolean`	`false`	whether the child nodes should have a reference to their parent or not - this allows the use of Node.parent()
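
To show what the doubleLink flag buys you, the sketch below parses a Penn-style bracketed string into plain nodes and optionally gives each child a reference to its parent. `parseTree` is a hypothetical stand-in written for this example, not the library's Tree.fromString.

```javascript
// Parses a bracketed string such as "(NP (DT The) (NN dog))".
// When doubleLink is true, every child also gets a `parent` reference.
function parseTree(str, doubleLink = false) {
  const tokens = str.match(/\(|\)|[^\s()]+/g) || [];
  let pos = 0;

  function readNode(parent) {
    let node;
    if (tokens[pos] === '(') {
      pos += 1; // skip "("
      node = { label: tokens[pos++], children: [] };
      while (tokens[pos] !== ')') {
        if (pos >= tokens.length) throw new Error('Unbalanced parentheses');
        node.children.push(readNode(node));
      }
      pos += 1; // skip ")"
    } else {
      node = { label: tokens[pos++], children: [] }; // a bare word (leaf)
    }
    if (doubleLink && parent) node.parent = parent;
    return node;
  }

  return readNode(null);
}

const np = parseTree('(NP (DT The) (NN dog))', true);
const dog = np.children[1].children[0];
console.log(dog.label);               // dog
console.log(dog.parent.label);        // NN
console.log(dog.parent.parent.label); // NP
```

Without the parent links a node can only be reached from the root, which keeps the structure free of cycles (handy for JSON serialization) at the cost of upward navigation.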