Changing CoreNLP settings at runtime
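I'm using the Stanford CoreNLP pipeline, and I'd like to know whether there is a way to edit basic settings without restarting the whole tool (to avoid reloading the models).
Right now I have: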
Properties props = new Properties();
props.setProperty("tokenize.whitespace", "true");
props.setProperty("annotators", "tokenize, ssplit, pos, ...");
StanfordCoreNLP stanfordPipeline = new StanfordCoreNLP(props);
I'd like to change the tokenize.whitespace setting on the fly, without restarting everything. Is that possible?
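You should simply create a new StanfordCoreNLP instance with the other properties. Annotators that the two pipelines have in common, along with their models, will not be reloaded, because StanfordCoreNLP uses a static AnnotatorPool (see the source code, line 103), where an AnnotatorPool is: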
An object for keeping track of Annotators. Typical use is to allow
multiple pipelines to share any Annotators in common.
For example, if multiple pipelines exist, and they both need a
ParserAnnotator, it would be bad to load two such Annotators into
memory. Instead, an AnnotatorPool will only create one Annotator
and allow both pipelines to share it.
(taken from the javadoc)
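To make the sharing behaviour concrete, here is a minimal sketch of the pooling idea in plain Java. It is not CoreNLP's code: the class AnnotatorPoolSketch, its methods, and the rule that only properties prefixed with the annotator's name affect that annotator are all assumptions made for illustration. The "annotators" are placeholder objects, and a counter stands in for expensive model loading.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical model of annotator pooling; NOT CoreNLP's actual implementation.
final class AnnotatorPoolSketch {
    // Number of "expensive" annotator constructions performed so far.
    static int loads = 0;

    // Static, so every pipeline built in this JVM shares the same cache.
    private static final Map<String, Object> POOL = new HashMap<>();

    // Assumption: only properties prefixed with "<name>." affect that annotator.
    static String signature(String name, Properties props) {
        StringBuilder sb = new StringBuilder();
        for (String key : new TreeSet<>(props.stringPropertyNames())) {
            if (key.startsWith(name + ".")) {
                sb.append(key).append('=').append(props.getProperty(key)).append(';');
            }
        }
        return sb.toString();
    }

    // Returns the cached annotator for this name and settings, building it only once.
    static Object get(String name, Properties props) {
        String cacheKey = name + "|" + signature(name, props);
        return POOL.computeIfAbsent(cacheKey, k -> {
            loads++;
            return new Object(); // placeholder for a loaded model
        });
    }

    // Builds a "pipeline" as the list of pooled annotators named in the properties.
    static List<Object> buildPipeline(Properties props) {
        List<Object> annotators = new ArrayList<>();
        for (String raw : props.getProperty("annotators", "").split(",")) {
            String name = raw.trim();
            if (!name.isEmpty()) {
                annotators.add(get(name, props));
            }
        }
        return annotators;
    }

    public static void main(String[] args) {
        Properties first = new Properties();
        first.setProperty("annotators", "tokenize, ssplit, pos");
        first.setProperty("tokenize.whitespace", "true");
        List<Object> p1 = buildPipeline(first);
        System.out.println(loads); // 3: every annotator built once

        Properties second = new Properties();
        second.setProperty("annotators", "tokenize, ssplit, pos");
        second.setProperty("tokenize.whitespace", "false");
        List<Object> p2 = buildPipeline(second);
        System.out.println(loads); // 4: only the tokenizer was rebuilt

        System.out.println(p1.get(2) == p2.get(2)); // true: "pos" is shared
    }
}
```

In this model, building the second pipeline costs one extra load instead of three, which is the effect described above. Which annotators the real library actually reuses depends on how it decides that a setting is relevant to an annotator, so it is worth confirming against the CoreNLP version you run.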