I am trying to build a data-processing pipeline with uimaFIT, as follows:
[annotatorA] => [Consumer to dump annotatorA's annotations from CAS into DB]
[annotatorB (should take on annotatorA's annotations from DB as input)]=>[Consumer for annotatorB]
Driver code:
/* Step 0: Create a reader */
CollectionReader readerInstance = CollectionReaderFactory.createCollectionReader(
    FilePathReader.class, typeSystem,
    FilePathReader.PARAM_INPUT_FILE, "/path/to/file/to/be/processed");
/* Builder that collects the engine descriptions (not declared in the original snippet) */
AggregateBuilder builder = new AggregateBuilder();
/* Step 1: Define annotator A */
AnalysisEngineDescription annotatorAInstance =
    AnalysisEngineFactory.createPrimitiveDescription(
        annotatorADbConsumer.class, typeSystem,
        annotatorADbConsumer.PARAM_DB_URL, "localhost",
        annotatorADbConsumer.PARAM_DB_NAME, "xyz",
        annotatorADbConsumer.PARAM_DB_USER_NAME, "name",
        annotatorADbConsumer.PARAM_DB_USER_PWD, "pw");
builder.add(annotatorAInstance);
/* Step 2: Define binding for annotatorB to take
   what-annotator-a put in DB above as input */
/* Step 3: Define annotator B */
AnalysisEngineDescription annotatorBInstance =
    AnalysisEngineFactory.createPrimitiveDescription(
        GateDateTimeLengthAnnotator.class, typeSystem);
builder.add(annotatorBInstance);
/* Step 4: Run the pipeline */
SimplePipeline.runPipeline(readerInstance, builder.createAggregate());

I have the following question:
Is the approach proposed above the right direction for achieving this?
Posted 2015-03-13 07:01:47
You can define the dependency with @TypeCapability, like this:
@TypeCapability(inputs = { "com.myproject.types.MyType", ... }, outputs = { ... })
public class MyAnnotator extends JCasAnnotator_ImplBase {
    ....
}

Note that this defines a contract at the annotation level, not at the engine level (meaning that any engine could create com.myproject.types.MyType).
I don't think there is a way to enforce it.
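I did write some code that checks whether each engine is provided with the annotations it requires by an engine upstream in the pipeline, and logs an error if it is not (see Pipeline.checkAndAddCapabilities() and Pipeline.addCapabilities()). Note, however, that this only works if all engines declare their TypeCapabilities, which is often not the case when external engines or libraries are used.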
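To illustrate the kind of upstream check described above, here is a minimal sketch in plain Java. It is not the Pipeline.checkAndAddCapabilities() code mentioned in the answer. To stay self-contained it uses a hypothetical stand-in annotation, Capability, that only mirrors the inputs/outputs shape of @TypeCapability; the class names CapabilityCheck, AnnotatorA and AnnotatorB and the type name com.myproject.types.OtherType are likewise assumptions made for the example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CapabilityCheck {

    /** Hypothetical stand-in for @TypeCapability so the sketch compiles without UIMA. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Capability {
        String[] inputs() default {};
        String[] outputs() default {};
    }

    // Hypothetical engines; only their declared capabilities matter here.
    @Capability(outputs = { "com.myproject.types.MyType" })
    static class AnnotatorA { }

    @Capability(inputs = { "com.myproject.types.MyType" },
                outputs = { "com.myproject.types.OtherType" })
    static class AnnotatorB { }

    /**
     * Walks the engines in pipeline order and reports every declared input
     * that no upstream engine has declared as an output.
     */
    static List<String> check(List<Class<?>> pipeline) {
        Set<String> available = new HashSet<>();
        List<String> problems = new ArrayList<>();
        for (Class<?> engine : pipeline) {
            Capability cap = engine.getAnnotation(Capability.class);
            if (cap == null) {
                // Without a declaration nothing can be verified for this engine.
                problems.add(engine.getSimpleName() + ": no capabilities declared");
                continue;
            }
            for (String input : cap.inputs()) {
                if (!available.contains(input)) {
                    problems.add(engine.getSimpleName() + ": missing input " + input);
                }
            }
            // Outputs become available to every engine further downstream.
            available.addAll(Arrays.asList(cap.outputs()));
        }
        return problems;
    }

    public static void main(String[] args) {
        // Correct order: A produces MyType before B consumes it.
        System.out.println(check(Arrays.asList(AnnotatorA.class, AnnotatorB.class)));
        // Wrong order: B asks for MyType before anything has produced it.
        System.out.println(check(Arrays.asList(AnnotatorB.class, AnnotatorA.class)));
    }
}
```

With the engines in the right order the list of problems is empty; with the order reversed the sketch reports that AnnotatorB is missing com.myproject.types.MyType. As in the answer, an engine that declares nothing (for example one from an external library) cannot be verified, so the sketch flags it instead of silently passing it.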
https://stackoverflow.com/questions/29007478