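import mammoth
f = open(r"D:\filename.docx", 'rb')  # raw string, so "\f" is not read as a form feed
document = mammoth.convert_to_html(f)
I cannot get an .html file when I run this code; please help me produce one. Also, when I convert to .html, the images inserted in the Word file do not show up in the .html file. How can I carry the images over from the .docx to the .html?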
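A note on the Windows path in the question: in a regular (non-raw) Python string literal, `\f` is the form-feed escape, so a literal written as "D:\filename.docx" does not name the intended file. The short sketch below only illustrates that difference; the variable names `plain` and `raw` are made up for the example.

```python
# In a regular string literal, "\f" is a single form-feed character (0x0c).
plain = "D:\filename.docx"

# A raw string keeps the backslash and the letter "f" as two separate characters.
raw = r"D:\filename.docx"

print(repr(plain))
print(repr(raw))
print("form feed in plain:", "\x0c" in plain)
print("raw equals escaped form:", raw == "D:\\filename.docx")
```

Either a raw string or doubled backslashes avoids the problem; forward slashes ("D:/filename.docx") are also generally accepted by `open` on Windows.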
Posted on 2017-10-30 13:52:18
Try this:
import mammoth
f = open("path_to_file.docx", 'rb')
b = open('filename.html', 'wb')
document = mammoth.convert_to_html(f)
b.write(document.value.encode('utf8'))
f.close()
b.close()
Posted on 2017-10-30 13:34:47
I suggest you try the following code:
import mammoth
with open("document.docx", "rb") as docx_file:
    result = mammoth.convert_to_html(docx_file)
    html = result.value  # the generated HTML
Posted on 2021-07-27 13:48:50
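It may be too late to answer this question, but in case anyone is still looking for an answer, the following should help, because the Word tables and images should remain intact after conversion to HTML.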
import win32com.client as win32
# Path to the source document (raw string, so "\f" is not treated as an escape)
word_file = r"D:\filename.docx"
# Open MS Word
word = win32.gencache.EnsureDispatch('Word.Application')
doc = word.Documents.Open(word_file)
# Same path with the extension changed to .html
txt_path = word_file.split('.')[0] + '.html'
# wdFormatFilteredHTML has value 10
# saves the doc as an html
doc.SaveAs(txt_path, 10)
doc.Close()
# Quit Word only if no other document is still open
# noinspection PyBroadException
try:
    word.ActiveDocument()
except Exception:
    word.Quit()
https://stackoverflow.com/questions/47008593
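Pulling the mammoth-based answers together, here is a minimal sketch that writes the converted output to an .html file. It assumes the third-party `mammoth` package is installed, and it relies on my understanding that mammoth returns only an HTML fragment and, by default, inlines images as base64 data URIs in the `src` attribute. The names `wrap_fragment` and `docx_to_html_file` are hypothetical helpers invented for this example, not part of mammoth.

```python
import html as html_lib


def wrap_fragment(fragment, title="Converted document"):
    """Wrap an HTML fragment in a minimal standalone page (hypothetical helper)."""
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        '<meta charset="utf-8">\n'
        "<title>{}</title>\n"
        "</head>\n<body>\n{}\n</body>\n</html>\n"
    ).format(html_lib.escape(title), fragment)


def docx_to_html_file(docx_path, html_path):
    """Convert a .docx with mammoth and write a standalone .html page."""
    import mammoth  # third-party package, imported lazily: pip install mammoth

    with open(docx_path, "rb") as docx_file:
        result = mammoth.convert_to_html(docx_file)

    # result.value holds the generated HTML fragment as text
    with open(html_path, "w", encoding="utf-8") as html_file:
        html_file.write(wrap_fragment(result.value))

    # result.messages holds any warnings reported during conversion
    return result.messages


# Example call (paths are placeholders):
# docx_to_html_file("document.docx", "document.html")
```

Declaring the charset in the page avoids garbled non-ASCII text when the file is opened in a browser. If the images still do not appear, the messages returned by the function are the first place I would look.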