Open
551 commits
ce97b2b
Improve robustness of removeUTF8BOM
hankcs Aug 25, 2018
2a071ec
:checkered_flag: World's largest Chinese corpus (100 million characters); bump patch version, release v1.6.8
hankcs Aug 25, 2018
9798524
Merge branch 'master' into portable
hankcs Aug 25, 2018
847dc88
Prepare release of portable-1.6.8
hankcs Aug 25, 2018
94e8eb8
Exclude the person name <周有>
resorcap Aug 28, 2018
402dc0c
Merge pull request #941 from resorcap/patch-1
hankcs Aug 28, 2018
b435880
Fix the pinyin of “体面” fix https://github.com/hankcs/HanLP/issues/937
hankcs Aug 28, 2018
42fb6fd
Remove the rare surname “年” fix https://github.com/hankcs/HanLP/issues/939
hankcs Aug 28, 2018
135abf8
Remove the Traditional-to-Simplified mapping “通道=信道”
hankcs Aug 30, 2018
3082b12
Automatically delete the cache file when the custom dictionary is updated
AnyListen Sep 3, 2018
1600ab2
Merge pull request #954 from AnyListen/master
hankcs Sep 3, 2018
9e2aeed
Merge remote-tracking branch 'origin/master'
hankcs Sep 8, 2018
066ae2e
Fix the longest-match problem in the double-array trie fix https://github.com/hankcs/HanLP/issues/966
hankcs Sep 15, 2018
61e59d0
Remove entry points to deprecated modules such as the second-order HMM fix https://github.com/hankcs/HanLP/issues/964
hankcs Sep 15, 2018
afa21c3
TextRankKeyword can be constructed from any segmenter
hankcs Sep 22, 2018
431ffc8
CustomDictionary.insert("新词语", "词性标签") (new word, part-of-speech tag) now allows the frequency to be omitted
hankcs Sep 28, 2018
b2f93a0
Optimize the double-array trie: automatically shrink to minimal memory after building close https://github.com/hankcs/HanLP/iss…
hankcs Sep 29, 2018
b5a02ef
Support specifying the data path via a JVM startup argument: java -DHANLP_ROOT=/opt/hanlp loads /opt/hanlp/data …
hankcs Sep 29, 2018
36a9bf1
Report a failure to load the stop-word dictionary as a RuntimeException https://github.com/hankcs/HanLP/issue…
hankcs Sep 29, 2018
c6d131e
NeuralNetworkDependencyParser constructor accepts a Segment
hankcs Oct 4, 2018
29275b5
Correct the “大家@电” bigram fix https://github.com/hankcs/HanLP/issues/999
hankcs Oct 18, 2018
f1a7b58
Revise Simplified-Traditional conversion entries fix https://github.com/hankcs/HanLP/issues/998
hankcs Oct 18, 2018
bc061f9
New-word discovery module no longer filters out English characters, so that words such as "K线图" can be extracted
hankcs Oct 18, 2018
655ae03
Add rules to the lexical analyzer: enableRuleBasedSegment https://github.com/hankcs/HanLP/issu…
hankcs Oct 28, 2018
3889c9c
Revise Traditional-Simplified conversion https://github.com/hankcs/HanLP/issues/835#issuecomment-434198320
hankcs Nov 3, 2018
e25103a
Revise Traditional-Simplified conversion https://github.com/hankcs/HanLP/issues/1011
hankcs Nov 4, 2018
52df5a9
Fine-tune the ngram and nr models
hankcs Nov 10, 2018
69329f9
Add pipeline mode to the lexical analyzer
hankcs Nov 10, 2018
b2c72f4
Sentence splitting during segmentation supports a configurable granularity fix https://github.com/hankcs/HanLP/issues/1018
hankcs Nov 10, 2018
49ffc9d
:checkered_flag: Add text clustering and pipeline segmentation; bump minor version, release v1.7.0
hankcs Nov 10, 2018
2bc2bbb
Merge branch 'master' into portable
hankcs Nov 10, 2018
6a4381c
Prepare release of portable-1.7.0
hankcs Nov 10, 2018
2f5b8bb
Revise the Modern Chinese dictionary
hankcs Nov 11, 2018
9d8b57c
Custom dictionary now handles paths containing spaces fix https://github.com/hankcs/HanLP/issues/1025
hankcs Nov 17, 2018
1094563
*Tokenizer supports sentence-splitting granularity fix https://github.com/hankcs/HanLP/issues/1019
hankcs Nov 17, 2018
583e97d
Update paper and demo links
hankcs Nov 23, 2018
de42c07
Revise Traditional-to-Simplified conversion https://github.com/hankcs/HanLP/issues/835#issuecomment-440953253
hankcs Nov 24, 2018
9acb8c4
Make cache files produced by hot updates include user-defined parts of speech fix https://github.com/hankcs/HanLP/issues/1028
hankcs Nov 24, 2018
54a230b
Speed up cache generation with BufferedOutputStream (37x faster)
hankcs Nov 28, 2018
2881df4
Add a Viterbi segmenter with a customizable user dictionary
AnyListen Dec 3, 2018
49d34d6
Add a Viterbi segmenter with a customizable user dictionary
AnyListen Dec 3, 2018
4a29ae0
Extend the Viterbi segmenter with an option to choose whether the dictionary is cached
AnyListen Dec 3, 2018
4b03c2b
Set a default custom dictionary for the Viterbi segmenter
AnyListen Dec 3, 2018
a538b39
Reuse the custom dictionary loading code
AnyListen Dec 5, 2018
82d478c
Improve the test code for the extended Viterbi segmenter
AnyListen Dec 5, 2018
43f0ea8
Fix the entrySet method of the mutable DAT fix https://github.com/hankcs/HanLP/issues/1038
hankcs Dec 8, 2018
58e5f6c
Consolidate the Viterbi custom dictionary code
AnyListen Dec 11, 2018
572b2aa
Revise Traditional-to-Simplified conversion fix https://github.com/hankcs/HanLP/issues/1046
hankcs Dec 11, 2018
a5fae4d
Merge pull request #1040 from AnyListen/master
hankcs Dec 16, 2018
cd2e7a7
Remove 未##数@请
hankcs Dec 20, 2018
3da7c41
:checkered_flag: Fast caching and dynamic dictionaries; bump patch version, release v1.7.1
hankcs Dec 23, 2018
d483b33
Merge branch 'master' into portable
hankcs Dec 23, 2018
47f2fa3
Prepare release of portable-1.7.1
hankcs Dec 23, 2018
455f593
Adjust the Traditional Chinese segmentation strategy fix https://github.com/hankcs/HanLP/issues/1059
hankcs Dec 25, 2018
d72b789
Add a toString method to Catalog
hankcs Jan 4, 2019
179a373
Fix integer overflow in the chi-square test; accuracy improves (95.47->96.08) fix https://github.com/hankcs/HanLP…
hankcs Jan 4, 2019
02de2ac
Update the text classification example
hankcs Jan 5, 2019
89ef642
Merge branch 'master' into portable
hankcs Jan 5, 2019
ce949cd
Make LexicalAnalyzer support TranslatedPersonRecognition and JapanesePersonRecogniti…
hankcs Jan 12, 2019
86bd212
Fine-tune person name recognition
hankcs Jan 16, 2019
801d797
Warn that online learning cannot learn new labels
hankcs Jan 20, 2019
0421b87
Make the tokenizers' seg2sentence static
hankcs Jan 27, 2019
a55b210
Add more ngram entries
hankcs Jan 30, 2019
7f6a2d0
Add a beam-search dependency parser based on the ArcEager transition system with an averaged perceptron as the classifier
hankcs Feb 6, 2019
c706a1a
Disable the rule system in the lexical analyzer by default
hankcs Feb 6, 2019
70bb4ae
Remove incorrect unigrams and bigrams fix https://github.com/hankcs/HanLP/issues/1054
hankcs Feb 8, 2019
5612f7f
Release KBeamArcEagerDependencyParser; deprecate MaxEntDependencyParser
hankcs Feb 8, 2019
5794097
Documentation for the perceptron syntactic parser
hankcs Feb 8, 2019
5b9813c
Adjust the training interface of the perceptron syntactic parser
hankcs Feb 8, 2019
447cce0
evaluate interface for the perceptron syntactic parser
hankcs Feb 11, 2019
27650a6
Add two methods to CoNLLSentence
hankcs Feb 12, 2019
52bc3d5
Revise pinyin fix https://github.com/hankcs/HanLP/issues/1093
hankcs Feb 18, 2019
e154ad1
Update the PKU98 corpus URL fix https://github.com/hankcs/HanLP/issues/1101
hankcs Feb 22, 2019
710d81f
Fix CustomDictionary.reload(); fix https://github.com/hankcs/HanLP/issu…
hankcs Feb 22, 2019
6de6262
:checkered_flag: New syntactic parsing module and multiple improvements; bump patch version, release v1.7.2
hankcs Feb 22, 2019
c503dc5
Merge branch 'master' into portable
hankcs Feb 22, 2019
e601bc6
:checkered_flag: New syntactic parsing module and multiple improvements; bump patch version, release v1.7.2
hankcs Feb 22, 2019
91c45b7
Update the word2vec example
hankcs Feb 27, 2019
7f49ec5
Add support for customizing NER tags
Feb 28, 2019
d276035
Add support for customizing NER tags
Feb 28, 2019
7761ef5
Add support for customizing NER tags, fix typo
Feb 28, 2019
a241213
Merge pull request #1104 from zhangruinan/master
hankcs Mar 6, 2019
2b928e6
Revise pinyin fix https://github.com/hankcs/HanLP/issues/1118
hankcs Mar 20, 2019
e148f23
Merge remote-tracking branch 'origin/master'
hankcs Mar 20, 2019
a39b14a
Avoid unnecessary initialization of ViterbiSegment.dat
hankcs Mar 28, 2019
478f895
Optimize DoubleArrayTrie fix https://github.com/hankcs/HanLP/issues/1136
hankcs Mar 29, 2019
d82cc08
Fix the lexical analyzer's handling of dynamically inserted entries fix https://github.com/hankcs/HanLP/issues/271#iss…
hankcs Apr 4, 2019
668ec6b
Fix corpus download links fix https://github.com/hankcs/HanLP/issues/1148
hankcs Apr 10, 2019
073a0ea
Lexical analyzer seg interface lets custom parts of speech override statistically predicted ones fix https://github.com/hankcs/HanLP/issues/1156
hankcs Apr 20, 2019
82a48f9
Perceptron lexical analyzer now defaults to the large model trained on six months of the 1998 People's Daily
hankcs Apr 20, 2019
b6e19fe
:checkered_flag: Routine maintenance and multiple improvements; bump patch version, release v1.7.3
hankcs Apr 20, 2019
613022b
Merge branch 'master' into portable
hankcs Apr 20, 2019
30a2015
Prepare release of portable-1.7.3
hankcs Apr 20, 2019
8d2057c
Fix GPG signing
hankcs Apr 20, 2019
c74ef77
Stop-word dictionary supports hot updates: fix https://github.com/hankcs/HanLP/issues/1158
hankcs Apr 27, 2019
a538d07
Fix CollectionUtility.sortMapByValue(java.util.Map<K,V>, boolean) fix …
hankcs Apr 27, 2019
e33b1e5
Fine-tune bigrams
hankcs May 1, 2019
69cddf7
Fix custom parts of speech fix https://github.com/hankcs/HanLP/issues/1172
hankcs May 6, 2019
8318bee
Fine-tune bigrams fix https://github.com/hankcs/HanLP/issues/1015
hankcs May 8, 2019
c6ee46f
Revise Simplified-Traditional conversion fix https://github.com/hankcs/HanLP/issues/1182
hankcs May 25, 2019
5be39b0
Fix the regular expression in URLTokenizer fix https://github.com/hankcs/HanLP/issues/1188
hankcs Jun 1, 2019
bd60162
Documentation
hankcs Jun 5, 2019
c5391d5
Add unit tests for com.hankcs.hanlp.algorithm.EditDistance
ThomasPerkins1123 Jun 3, 2019
a6b0d85
Merge pull request #1194 from Diffblue-benchmarks/add-EditDistance-tests
hankcs Jun 6, 2019
a179699
Add unit tests for com.hankcs.hanlp.utility.MathUtilityTest
ThomasPerkins1123 Jun 10, 2019
9c34ed1
Merge pull request #1199 from Diffblue-benchmarks/add-MathUtil-Tests
hankcs Jun 12, 2019
80f6215
Fix the A tag of “始##始” in role tagging fix https://github.com/hankcs/HanLP/issues/434
hankcs Jun 22, 2019
9495a4d
Revise the person name dictionary
hankcs Jun 28, 2019
f7c928c
Losslessly convert the OpenCC dictionaries, with identical results https://github.com/hankcs/OpenCC-to-HanLP fix https…
hankcs Jun 28, 2019
590af00
:checkered_flag: Simplified-Traditional conversion fully consistent with OpenCC; bump patch version, release v1.7.4
hankcs Jun 28, 2019
2131d8f
Merge branch 'master' into portable
hankcs Jun 28, 2019
d9e7a25
Prepare release of portable-1.7.4
hankcs Jun 28, 2019
eecb4aa
Fix the Analyzer's enableCustomDictionaryForcing method fix https://github.com/han…
hankcs Jul 4, 2019
d87b3a2
Make CoreStopWordDictionary.apply return its result
hankcs Jul 4, 2019
c4725b8
DocVectorModel supports a custom segmenter and turning the stop-word filter on/off fix https://github.com/hankcs/HanLP/…
hankcs Jul 27, 2019
5fd89fd
Change method name 'convert' to 'createSynonymList'
doubleblinddoubleblinddoubleblind Aug 1, 2019
8aad0a8
Merge pull request #1259 from pdhung3012/master
hankcs Aug 1, 2019
9a0d81c
Fix the repeated bisection clustering algorithm fix https://github.com/hankcs/HanLP/issues/1…
hankcs Aug 8, 2019
498b6f7
Treat newlines, spaces, etc. as CT_OTHER fix https://github.com/hankcs/HanLP/issues/1283
hankcs Sep 19, 2019
49fefec
Documentation
hankcs Sep 19, 2019
19c11b4
Remove “一推” fix https://github.com/hankcs/HanLP/issues/1288#issuecomment-5…
hankcs Oct 3, 2019
a7c05c7
Remove “要买”
hankcs Oct 9, 2019
422077b
New book 《自然语言处理入门》 (Introduction to Natural Language Processing) released together with v1.7.5 🔥: http://nlp.hankcs.com/book.php
hankcs Oct 10, 2019
523b3d3
Merge branch 'master' into portable
hankcs Oct 17, 2019
598b73c
Prepare release of portable-1.7.5
hankcs Oct 17, 2019
3c214ec
WordVectorModel supports custom Map types: https://github.com/hankcs/HanLP/issues/1304
hankcs Oct 20, 2019
233b550
Remove “邀请人”
hankcs Oct 25, 2019
af5d8a9
Using `buffer` instead of `_` in code in order to prevent compile fai…
LucienShui Oct 30, 2019
12ccffc
Merge pull request #1312 from LucienShui/dev
hankcs Oct 30, 2019
9171a1b
Merge remote-tracking branch 'origin/master'
hankcs Oct 30, 2019
511b978
NGramDictionaryMaker and related classes default to UTF-8 encoding fix https://github.com/hankcs/HanLP/is…
hankcs Nov 8, 2019
1c38a6d
Speed up segmentBackwardLongest
hankcs Nov 10, 2019
6877863
Update documentation
hankcs Nov 13, 2019
50a05e4
Clean up code fix https://github.com/hankcs/HanLP/issues/1322
hankcs Nov 13, 2019
2874b14
Add a User-Agent when automatically downloading files
hankcs Nov 19, 2019
ab0cf20
Revise the supplementary Modern Chinese lexicon fix https://github.com/hankcs/HanLP/issues/1330
hankcs Nov 23, 2019
9751c98
Automatic download supports redirects
hankcs Nov 26, 2019
5fb8a4d
Use the path in the configuration file to determine whether data has already been downloaded
hankcs Nov 26, 2019
9cca30b
Add whitespace handling to the lexical analyzer fix https://github.com/hankcs/HanLP/issues/797
hankcs Nov 26, 2019
5cd35e4
Add the DocVectorModel.nearest(java.lang.String, int) method fix https://githu…
hankcs Nov 26, 2019
d5c63f2
Fix: loading a custom stop-word file had no effect
allen615 Dec 5, 2019
0672bb1
Merge pull request #1346 from allen615/master
hankcs Dec 9, 2019
4bfbfcd
HMMLexicalAnalyzerTest automatically downloads the PKU corpus
hankcs Dec 10, 2019
651382d
Expose CoreStopWordDictionary.dictionary https://github.com/hankcs/HanLP/…
hankcs Dec 18, 2019
832aae9
tfidf: IDF data can be obtained by loading an IDF file
allen615 Dec 23, 2019
b44c3b4
tfidf: IDF data can be obtained by loading an IDF file
allen615 Dec 23, 2019
6b93df0
Merge pull request #1360 from allen615/master
hankcs Dec 24, 2019
c6b4ab8
Nature is not concurrent safe. Change TreeMap to ConcurrentHashMap
Dec 27, 2019
477bae1
Merge pull request #1365 from zhuchaokn/master
hankcs Dec 27, 2019
7dd79cc
Fix division by zero in the information entropy computation fix https://github.com/hankcs/HanLP/issues/1366
hankcs Dec 31, 2019
78769d8
:checkered_flag: Routine maintenance and multiple improvements; bump patch version, release v1.7.6
hankcs Dec 31, 2019
773b9af
Merge branch 'master' into portable
hankcs Dec 31, 2019
143e1bc
Upgrade Portable in sync to v1.7.6
hankcs Dec 31, 2019
19809f3
Fix the exception raised when the number of clusters exceeds the number of documents fix https://github.com/hankcs/HanLP/issues/1397
hankcs Jan 10, 2020
a5efa69
Expose the internal members of CWSEvaluator.Result fix https://bbs.hankcs.com/t/topic/887
hankcs Feb 1, 2020
854ed9c
Update documentation
hankcs Feb 1, 2020
3a30684
Fix AbstractClassifier.enableProbability fix https://github.com/hankcs…
hankcs Feb 14, 2020
65e0475
Format code
hankcs Feb 14, 2020
e8a920c
Improve atomic segmentation fix https://github.com/hankcs/HanLP/issues/1421
hankcs Feb 14, 2020
b62db0d
Further improve atomic segmentation fix https://github.com/hankcs/HanLP/issues/1421#issuecommen…
hankcs Feb 15, 2020
33f2973
Remove 幺=么 fix https://github.com/hankcs/HanLP/issues/1427
hankcs Feb 18, 2020
90b0c15
support getting all tags
tiandiweizun Feb 19, 2020
de656bf
Merge pull request #1428 from tiandiweizun/patch-1
hankcs Feb 19, 2020
c45c0cd
Use a constructor instead of the static create, to make subclassing easier
hankcs Mar 2, 2020
958b7c0
Make the HMM's members public
hankcs Mar 5, 2020
9577651
:checkered_flag: Routine maintenance and multiple improvements; bump patch version, release v1.7.7
hankcs Mar 5, 2020
b698f0f
Merge branch '1.x' into portable
hankcs Mar 5, 2020
1ea796c
Upgrade Portable in sync to v1.7.7
hankcs Mar 5, 2020
dd561bd
Expose CRFNERecognizer.tagSet; add a CRF example of automaton name recognition https://bbs.hankcs.com/t/crf…
hankcs Mar 9, 2020
3148af0
Typo Fix
caoyi0905 Mar 16, 2020
fd2a829
Merge pull request #1439 from caoyi0905/patch-1
hankcs Mar 16, 2020
fee3e14
CharType uses IOAdapter fix https://github.com/hankcs/HanLP/issues/1480
hankcs May 27, 2020
ef1c59b
Add the custom entry “雄安”
hankcs Jun 1, 2020
180f7c0
:checkered_flag: Routine maintenance and multiple improvements; bump patch version, release v1.7.8
hankcs Jun 15, 2020
f338b1d
Merge branch '1.x' into portable
hankcs Jun 15, 2020
d7ece24
Upgrade Portable in sync to v1.7.8
hankcs Jun 15, 2020
1afbaf1
fix errors when compound word consists of two words and appears at th…
bqwu Jun 26, 2020
83ee72b
Merge pull request #1497 from bqwu/1.x
hankcs Jun 26, 2020
cb2d20f
Clean up code
hankcs Aug 8, 2020
ef44aad
Back up parameters when constructing HiddenMarkovModel fix https://github.com/hankcs/HanLP/issues/1530
hankcs Aug 15, 2020
3b163cb
Fix Sentence.create on compound word consisting of single word
hankcs Oct 1, 2020
66e328d
Add KBeamArcEagerDependencyParser(String modelPath, String cwsModelPat…
hankcs Nov 15, 2020
ae845b6
Add the hot-update method CoreDictionary.reload() fix https://github.com/hankcs/HanLP/i…
hankcs Dec 22, 2020
6265aec
Double-array trie: prevent blank keys from making state transitions impossible fix https://bbs.hankcs.com/t/dat/3196/8
hankcs Jan 15, 2021
2577426
Fix CoreStopWordDictionary.dictionary.clear() fix https://github.com/h…
hankcs Jan 16, 2021
b9a899b
Support supplementary characters such as 𩽾𩾌 (ān kāng) fix https://github.com/hankcs/HanLP/issues/1564
hankcs Jan 31, 2021
aff3f3a
Refactor CustomDictionary to support multiple instances https://github.com/hankcs/HanLP/issues/1339
hankcs Jan 31, 2021
b3562d9
:checkered_flag: Support multiple instances and supplementary characters; bump minor version, release v1.8.0
hankcs Feb 6, 2021
49bc6c2
Merge branch '1.x' into portable
hankcs Feb 11, 2021
4d14a07
Upgrade Portable in sync to v1.8.0
hankcs Feb 11, 2021
68d063a
Fix CharTable normalizing some characters incorrectly fix https://github.com/hankcs/HanLP/issues/1615
hankcs Feb 22, 2021
88d3eb0
Please ask questions on the forum: https://bbs.hanlp.com/
hankcs Mar 10, 2021
18e5c7a
Fix convertToPinyinList fix https://github.com/hankcs/HanLP/issues/1634
hankcs Mar 19, 2021
4704cc1
:checkered_flag: Routine maintenance and fixes; bump patch version, release v1.8.1
hankcs Mar 19, 2021
80cc0fd
Merge branch '1.x' into portable
hankcs Mar 19, 2021
9e3e8c4
Upgrade Portable in sync to v1.8.1
hankcs Mar 19, 2021
6b60684
Fix CustomDictionary.reload() fix https://github.com/hankcs/HanLP/issu…
hankcs Mar 23, 2021
1696479
Correct the initial (声母) of lve4 to ve fix https://github.com/hankcs/HanLP/issues/1644
hankcs Apr 17, 2021
1632955
Revise the Simplified-Traditional mapping table
hankcs May 14, 2021
a3f9d02
Revise the bigram model
hankcs May 25, 2021
99548e7
Fix the reload method of CoreDictionary
hankcs Jun 7, 2021
3a99bc6
Adjust the formula; Viterbi segmentation accuracy rises from 94.49 to 94.69 https://bbs.hankcs.com/t/topic/136/61?u=h…
hankcs Jun 8, 2021
61631b0
Improve the HMM sampling function https://bbs.hankcs.com/t/topic/136/64?u=hankcs
hankcs Jun 10, 2021
8ee039b
Support disabling automatic refresh of the dictionary cache (CustomDictionaryAutoRefreshCache=false) fix https://githu…
hankcs Jun 18, 2021
6b89f39
:checkered_flag: Routine maintenance and accuracy improvements; bump patch version, release v1.8.2
hankcs Jun 18, 2021
24ccd6e
Merge branch '1.x' into portable
hankcs Jun 18, 2021
babc1e4
Upgrade Portable in sync to v1.8.2
hankcs Jun 18, 2021
9ae1498
Adjust `莎=sha1,suo1` fix https://github.com/hankcs/HanLP/issues/1670
hankcs Aug 11, 2021
a9997d8
The next method of LongestSearcher in DoubleArrayTrie needs hardening: when a value in the supplied treemap is null…
Aug 24, 2021
7b03824
Merge pull request #1674 from tiandiweizun/1.x
hankcs Aug 24, 2021
363d0b0
Do not allow any transition when parse with empty trie fix https://gi…
hankcs Aug 13, 2021
61cc753
Clean up code
hankcs Oct 15, 2021
6cff689
Dynamically determine the default frequency of out-of-vocabulary words from the total word frequency
hankcs Nov 5, 2021
d34dab3
Update DoubleArrayTrie.java
TITC Dec 7, 2021
2de961b
Merge pull request #1699 from TITC/patch-1
hankcs Dec 7, 2021
2f796df
Remove several “名 + noun” entries
hankcs Dec 31, 2021
8e750ee
Fix the interaction between dynamic custom dictionaries and CustomDictionaryForcing fix https://github.com/hankcs/…
hankcs Feb 21, 2022
4737766
Merge branch '1.x' into portable
hankcs Feb 21, 2022
51b97e9
Upgrade Portable in sync to v1.8.3
hankcs Feb 21, 2022
4b43124
Treat <> as delimiters fix https://bbs.hankcs.com/t/topic/4527
hankcs Feb 27, 2022
69506a7
Add a configuration method to Segment for whether to perform Normalize close https://github.com/hankcs/HanLP/…
hankcs Mar 8, 2022
867cc8d
Fix the scorer.boost bug in the score computation of the text suggestion scorer fix: https://github.com/hankcs/Han…
hankcs Apr 9, 2022
551d578
bugfix: fix the bintrie leaving the loop too early during full segmentation
carl10086 Aug 11, 2022
b216b24
Merge pull request #1775 from carl10086/bugfix/bintrie_parsetext
hankcs Aug 12, 2022
b165273
Custom dictionary supports the .tsv format fix: https://github.com/hankcs/HanLP/issues/1785
hankcs Sep 15, 2022
1323221
Fix passing of the custom dictionary path argument fix: https://github.com/hankcs/HanLP/issues/1799
hankcs Jan 13, 2023
ce07395
Add enableFastBuild
Feb 23, 2023
6b4c681
Adjust comments
Feb 23, 2023
41b2f3a
🙅 Add missing unit tests
Feb 23, 2023
94b41c5
Adjust unit tests
Feb 23, 2023
e1020b0
Fix the word2vec file stream closing issue fix: https://github.com/hankcs/HanLP/issues/1806
hankcs Feb 24, 2023
9f26460
:checkered_flag: Routine maintenance and accuracy improvements; bump patch version, release v1.8.4
hankcs Feb 25, 2023
08d091e
Merge branch '1.x' into portable
hankcs Feb 25, 2023
6316759
Upgrade Portable in sync to v1.8.4
hankcs Feb 25, 2023
d57aab2
You are welcome to cite our paper: https://aclanthology.org/2021.emnlp-main.451/
hankcs Feb 27, 2023
e19bc7a
Demonstrate how to adjust the bigram model: https://bbs.hankcs.com/t/topic/5326
hankcs Apr 4, 2023
9b2ff93
Fix ViterbiSegment not replacing the DoubleArrayTrie when loading a custom dictionary, which caused unexpected segmentation results
Aug 11, 2023
2d3b1bf
Merge pull request #1835 from wxy929629/1.x
hankcs Aug 13, 2023
9e2c58c
Merge remote-tracking branch 'origin/1.x' into 1.x
hanlpbot Aug 13, 2023
4b2686c
Fix a possible inconsistency of the mini bigram model in the first segmentation after JRE initialization fix: https://github.com/hankcs/HanLP/…
hankcs Oct 19, 2023
69e69b5
fix: correct a calculation error in comparisons made by the Chinese word segmentation evaluation tool
webSue Oct 19, 2023
4ac13f1
Merge pull request #1853 from webSue/fix/cws_evaluate
hankcs Oct 20, 2023
a089963
:checkered_flag: Routine maintenance; bump patch version, release v1.8.5
hankcs Nov 16, 2024
926e126
Merge branch '1.x' into portable
hankcs Nov 16, 2024
0df6a5d
Upgrade Portable in sync to v1.8.5
hankcs Nov 16, 2024
e68be80
Merge remote-tracking branch 'origin/1.x' into 1.x
hankcs Nov 16, 2024
03d3e63
Clean up `Predefine`
hankcs Dec 28, 2024
dadd5c7
:checkered_flag: Routine maintenance; bump patch version, release v1.8.6
hankcs Dec 28, 2024
e3cfaa0
Merge branch '1.x' into portable
hankcs Dec 28, 2024
4f7949a
Upgrade Portable in sync to v1.8.6
hankcs Dec 28, 2024
@@ -115,7 +115,7 @@ public static double similarity(String A, String B)
      * @param withUndefinedItem whether to keep words that are not in the dictionary
      * @return
      */
-    public static List<CommonSynonymDictionary.SynonymItem> convert(List<Term> sentence, boolean withUndefinedItem)
+    public static List<CommonSynonymDictionary.SynonymItem> createSynonymList(List<Term> sentence, boolean withUndefinedItem)
     {
         List<CommonSynonymDictionary.SynonymItem> synonymItemList = new ArrayList<CommonSynonymDictionary.SynonymItem>(sentence.size());
         for (Term term : sentence)
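Review note: the hunk renames a public static method, so any external caller of `convert` would stop compiling. One possible mitigation, sketched below, is to keep the old name as a deprecated forwarder. Everything in the sketch is an assumption for illustration and is not part of this diff: the class name `RenameSketch`, the use of plain `String` in place of `Term` and `CommonSynonymDictionary.SynonymItem`, and the rule that a leading `?` marks a word missing from the dictionary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in class; the real method lives in HanLP and uses
// Term and CommonSynonymDictionary.SynonymItem instead of String.
public class RenameSketch {

    /** New, more descriptive name, mirroring the rename in the diff. */
    public static List<String> createSynonymList(List<String> sentence, boolean withUndefinedItem) {
        List<String> result = new ArrayList<String>(sentence.size());
        for (String word : sentence) {
            // Assumption for this sketch only: a leading '?' means
            // "not found in the dictionary".
            boolean known = !word.startsWith("?");
            if (known || withUndefinedItem) {
                result.add(word);
            }
        }
        return result;
    }

    /** Hypothetical deprecated alias that forwards to the new name. */
    @Deprecated
    public static List<String> convert(List<String> sentence, boolean withUndefinedItem) {
        return createSynonymList(sentence, withUndefinedItem);
    }

    public static void main(String[] args) {
        List<String> sentence = Arrays.asList("a", "?b", "c");
        System.out.println(createSynonymList(sentence, false));
        System.out.println(convert(sentence, true));
    }
}
```

Whether such an alias is worth keeping depends on how widely the old name is used outside the repository, which this page does not show.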