compromise/data at master · spencermountain/compromise

hello!

here is the data that gets compressed and compiled into the word models that compromise uses to understand text.

there are some things to note:

* run `npm run pack` after making a change, to see changes appear.
* lexicon words are lowercased and compressed with efrt, some characters are reserved -`[0-9,;!:|¦]`
* be careful adding ambiguous words - 'ray' should not be a #Person - it's a better fit for ./switches/person-date.js
* many word-lists have conjugations automatically applied to them - #Singular words are pluralized, etc.
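
as a rough illustration of the lowercasing and reserved-character notes above, here is a small sketch for checking a word before adding it to a list. `checkWord` is a hypothetical helper, not part of compromise or efrt, and the regex assumes the reserved set is exactly the digits plus `, ; ! : | ¦`. it is unclear from the note whether the leading `-` is itself reserved, so this sketch does not flag it.

```javascript
// assumed reserved set, read from the note above: digits and , ; ! : | ¦
const RESERVED = /[0-9,;!:|¦]/

// hypothetical helper (not a compromise or efrt function):
// returns a list of reasons a word may not be safe to add to a lexicon file
function checkWord(word) {
  const problems = []
  // lexicon words are expected to be lowercased
  if (word !== word.toLowerCase()) {
    problems.push('not lowercased')
  }
  // reserved characters may clash with the compressed format
  if (RESERVED.test(word)) {
    problems.push('contains a reserved character')
  }
  return problems
}

console.log(checkWord('walk')) // []
console.log(checkWord('Toronto')) // [ 'not lowercased' ]
console.log(checkWord('mp3')) // [ 'contains a reserved character' ]
```

a check like this only catches formatting problems. it says nothing about whether a word is too ambiguous for a given tag, which still needs a human decision.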

the lexicon output data can be found in ./src/2-two/preTagger/model/lexicon/_data.js

and the word-conjugation data can be found in ./src/2-two/preTagger/model/models/_data.js

for more information, see the compromise-lexicon docs.
