I am trying to generate a scorer for the DeepSpeech-Polyglot project. I followed every step of the documentation, but when I run:
python3 /DeepSpeech/data/lm/generate_lm.py --input_txt /DeepSpeech/data_prepared/texts/${LANGUAGE}/clean_vocab.txt --output_dir /DeepSpeech/data_prepared/texts/${LANGUAGE}/ --top_k 500000 --kenlm_bins /DeepSpeech/native_client/kenlm/build/bin/ --arpa_order 5 --max_arpa_memory "85%" --arpa_prune "0|0|1" --binary_a_bits 255 --binary_q_bits 8 --binary_type trie --discount_fallback
I get the following error:
Saving top 500000 words ...
Calculating word statistics ...
Your text file has 202185630 words in total
It has 2106729 unique words
Your top-500000 words are 98.7433 percent of all words
Your most common word "die" occurred 7853080 times
The least common word in your top-k is "adamantium" with 5 times
The first word with 6 occurrences is "begibst" at place 448270
Creating ARPA file ...
=== 1/5 Counting and sorting n-grams ===
Reading /DeepSpeech/data_prepared/texts/de/lower.txt.gz
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
Traceback (most recent call last):
File "/DeepSpeech/data/lm/generate_lm.py", line 210, in <module>
main()
File "/DeepSpeech/data/lm/generate_lm.py", line 201, in main
build_lm(args, data_lower, vocab_str)
File "/DeepSpeech/data/lm/generate_lm.py", line 97, in build_lm
subprocess.check_call(subargs)
File "/usr/lib/python3.6/subprocess.py", line 311, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/DeepSpeech/native_client/kenlm/build/bin/lmplz', '--order', '5', '--temp_prefix', '/DeepSpeech/data_prepared/texts/de/', '--memory', '85%', '--text', '/DeepSpeech/data_prepared/texts/de/lower.txt.gz', '--arpa', '/DeepSpeech/data_prepared/texts/de/lm.arpa', '--prune', '0', '0', '1', '--discount_fallback']' died with <Signals.SIGSEGV: 11>.
I am following this documentation: https://gitlab.com/Jaco-Assistant/deepspeech-polyglot
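For context, the "died with <Signals.SIGSEGV: 11>" wording is simply how `subprocess.check_call` reports a child process killed by a signal: the return code is negative (-11 for SIGSEGV) and is wrapped in a `CalledProcessError`. So the crash happens inside the `lmplz` binary itself, not in the Python wrapper. A minimal sketch reproducing that reporting (the child here deliberately sends itself SIGSEGV; behavior shown is for Linux):

```python
import signal
import subprocess
import sys

# When a child is killed by a signal, subprocess stores a negative
# return code (-11 for SIGSEGV), and check_call() wraps it in
# CalledProcessError with the "died with <Signals.SIGSEGV: 11>" message.
cmd = [sys.executable, "-c",
       "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
try:
    subprocess.check_call(cmd)
    rc = 0
except subprocess.CalledProcessError as err:
    rc = err.returncode

print("child return code:", rc)  # -11 on Linux
```

This confirms that debugging should focus on the KenLM build rather than on generate_lm.py.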
I would appreciate any hints.
Posted on 2021-01-14 16:27:40
This has already been discussed on DeepSpeech's Discourse.
Basically, your KenLM was not installed correctly. If you search for this error, you will find that you need to reinstall it and check your environment.
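Before rerunning generate_lm.py after a reinstall, it can help to confirm that the KenLM binaries actually start at all. A minimal sketch (the `kenlm_ok` helper is made up for illustration, not part of DeepSpeech or KenLM):

```python
import os
import subprocess

def kenlm_ok(bin_dir):
    """Return True if lmplz exists in bin_dir and can at least start up."""
    lmplz = os.path.join(bin_dir, "lmplz")
    if not os.path.isfile(lmplz):
        return False
    try:
        # A broken build (missing shared libraries, corrupt binary) usually
        # fails already when only asked for its usage text.
        result = subprocess.run([lmplz, "--help"],
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
    except OSError:
        return False
    # A negative return code means the process was killed by a signal,
    # i.e. it crashed before doing any real work.
    return result.returncode >= 0

print(kenlm_ok("/DeepSpeech/native_client/kenlm/build/bin"))
```

If this prints False, rebuild KenLM from source per its own documentation before retrying the scorer generation.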
https://stackoverflow.com/questions/65705601