I'm trying to run OCR on image files from Java. I decided to use Tesseract.js from https://github.com/naptha/tesseract.js and call it through the graal.js support in GraalVM, but I can't get it to work.
Here is what I tried:
    import java.io.File;
    import java.io.IOException;
    import org.graalvm.polyglot.Context;
    import org.graalvm.polyglot.Source;
    import org.graalvm.polyglot.Value;

    public static final String TESSERACT = "src/tesseract.js";

    private static void tesseract(String imageFile) throws IOException
    {
        System.out.println("=== Calling Tesseract === ");
        try (Context context = Context.create())
        {
            // Load tesseract.js into the GraalVM JavaScript context
            context.eval(Source.newBuilder("js", new File(TESSERACT)).build());
            Value Tesseract = context.getBindings("js").getMember("Tesseract");
            Value recognize = Tesseract.getMember("recognize");
            long start = System.currentTimeMillis();
            String result = recognize.execute(imageFile).asString();
            long took = System.currentTimeMillis() - start;
            System.out.println("Tesseract call took: " + took + "ms with result: " + result);
        } // context.close() is automatic
    }

It compiles, but at run time it throws the following exception:
    === Calling Tesseract ===
    Exception in thread "main" ReferenceError: window is not defined
        at <js> spawnWorker(src\tesseract.js:286:8848-8853)
        at <js> _delay(src\tesseract.js:504:16140-16184)
        at <js> recognize(src\tesseract.js:472-481:15321-15620)
        at org.graalvm.polyglot.Value.execute(Value.java:457)
        at com.mycompany.app.JsApp.tesseract(JsApp.java:98)
        at com.mycompany.app.JsApp.main(JsApp.java:70)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:564)
        at com.intellij.rt.execution.application.AppMainV2.main(AppMainV2.java:131)

Does anyone know how to fix this?

Posted on 2021-08-04 09:00:54
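The main problem is that tesseract.js expects to run in a browser. window is a global object that browsers provide; it is not defined here because you are running tesseract.js in a different JavaScript runtime (graal.js) rather than in a browser.
To solve your problem, I would use Tess4J to run Tesseract OCR instead. Tess4J is a JNA wrapper for Tesseract (just as tesseract.js is a wrapper for the browser).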
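Tess4J needs its JAR on the classpath, so as a dependency-free sketch of the same idea (hand the image to native Tesseract instead of a JavaScript runtime), the code below shells out to the tesseract command-line tool. This is a different technique from Tess4J, not its API. It assumes the tesseract executable is installed and on the PATH with the "eng" language data available; the class name TesseractCli and the helper names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class TesseractCli {

    /** Builds the argument list for: tesseract <image> stdout -l <language>. */
    public static List<String> buildCommand(String imageFile, String language) {
        // "stdout" as the output base tells the CLI to print the text instead of writing a file.
        return List.of("tesseract", imageFile, "stdout", "-l", language);
    }

    /** Runs the CLI (assumed to be on the PATH) and returns the recognized text. */
    public static String recognize(String imageFile, String language)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(imageFile, language))
                // Discard stderr so diagnostics cannot fill the pipe and block the child process.
                .redirectError(ProcessBuilder.Redirect.DISCARD)
                .start();
        String text = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("tesseract exited with code " + exitCode);
        }
        return text.trim();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("usage: java TesseractCli <image-file>");
            return;
        }
        long start = System.currentTimeMillis();
        String result = recognize(args[0], "eng");
        long took = System.currentTimeMillis() - start;
        System.out.println("Tesseract call took: " + took + "ms with result: " + result);
    }
}
```

With Tess4J the overall shape would be similar (pass a file, get a String back), but the work happens in-process through JNA rather than in a child process.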
https://stackoverflow.com/questions/68648166