I get this error when using minidom to parse data from a URL: xml.parsers.expat.ExpatError: syntax error: line 1, column 0. Can anyone help?
Here is my code:
from xml.dom import minidom
import urllib2
url= 'http://www.awgp.org/about_us'
openurl=urllib2.urlopen(url)
doc=minidom.parse("about_us.xml")
Error:
Traceback (most recent call last):
  File "test3.py", line 11, in <module>
    doc=minidom.parse("about_us.xml")
  File "C:\Python27\lib\xml\dom\minidom.py", line 1918, in parse
    return expatbuilder.parse(file)
  File "C:\Python27\lib\xml\dom\expatbuilder.py", line 924, in parse
    result = builder.parseFile(fp)
  File "C:\Python27\lib\xml\dom\expatbuilder.py", line 211, in parseFile
    parser.Parse("", True)
xml.parsers.expat.ExpatError: syntax error: line 1, column 0
Posted on 2019-07-03 17:36:02
parser.Parse("", True)
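xml.parsers.expat.ExpatError: syntax error: line 1, column 0
The lines above, taken from your traceback, tell me that your "about_us.xml" file is empty. You have openurl, but you haven't shown that you ever call openurl.read() to actually fetch the data. You also haven't shown where or how that data gets written to the "about_us.xml" file.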
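To illustrate the point about the empty file, here is a minimal sketch. It is written for Python 3 rather than the 2.7 used in the question, and it avoids the network: the data bytes are a made-up stand-in for whatever openurl.read() would return, and try_parse is a hypothetical helper, not part of minidom.

```python
import os
import tempfile
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def try_parse(path):
    """Return the root tag name, or None if the file is not parseable XML."""
    try:
        return minidom.parse(path).documentElement.tagName
    except ExpatError:
        return None


# Hypothetical stand-in for the bytes that openurl.read() would return.
data = b"<page><title>About us</title></page>"

path = os.path.join(tempfile.mkdtemp(), "about_us.xml")

# Case 1: the file exists but nothing was ever written to it.
open(path, "wb").close()
empty_result = try_parse(path)

# Case 2: the data is written to the file before parsing.
with open(path, "wb") as f:
    f.write(data)
filled_result = try_parse(path)

print(empty_result, filled_result)
```

The first parse fails because there is nothing in the file; the second succeeds only because the data was written out first, and only because that data happens to be well-formed XML.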
from xml.dom import minidom
import urllib2
url= 'http://www.awgp.org/about_us'
openurl=urllib2.urlopen(url)
doc=minidom.parse(openurl)
print doc
This gave me:
Traceback (most recent call last):
  File "main.py", line 5, in <module>
    doc=minidom.parse(openurl)
  File "/usr/local/lib/python2.7/xml/dom/minidom.py", line 1918, in parse
    return expatbuilder.parse(file)
  File "/usr/local/lib/python2.7/xml/dom/expatbuilder.py", line 928, in parse
    result = builder.parseFile(file)
  File "/usr/local/lib/python2.7/xml/dom/expatbuilder.py", line 207, in parseFile
    parser.Parse(buffer, 0)
xml.parsers.expat.ExpatError: not well-formed (invalid token): line 51, column 81
This shows that the page you are trying to parse as XML is not well-formed. Try Beautiful Soup instead, which, from memory, is very forgiving.
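For what it's worth, the same kind of failure can be reproduced offline. This is a sketch for Python 3, and the html string is an invented sample rather than the real page: ordinary HTML with an unquoted attribute value is enough to make the XML parser give up, and the exception carries the position it stopped at.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Invented sample: acceptable to a browser, but not well-formed XML
# (unquoted attribute value, <br> never closed).
html = "<html><body><p>First<br><a href=index.html>Home</a></body></html>"

try:
    minidom.parseString(html)
    error_position = None
except ExpatError as exc:
    # lineno and offset are the line and column reported in the message.
    error_position = (exc.lineno, exc.offset)

print(error_position)
```

The line and column in the traceback above come from these same attributes, so they point at the first spot in the page that the XML parser could not accept.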
from BeautifulSoup import BeautifulSoup
import urllib2
url= 'http://www.awgp.org/about_us'
openurl=urllib2.urlopen(url)
soup = BeautifulSoup(openurl.read())
for a in soup.findAll('a'):
    print (a.text, a.get('href'))
By the way, the import above is for version 3 of Beautiful Soup, since you are still using Python 2.7.
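If installing Beautiful Soup is not an option, roughly the same link listing can be sketched with the standard library alone. Note that this swaps in a different technique from the one above: it uses Python 3's html.parser instead of Beautiful Soup, LinkCollector is a hypothetical helper name, and sample is invented markup standing in for the downloaded page.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (text, href) pairs for every <a> element fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (text, href) pairs
        self._in_a = False   # True while inside an <a> element
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_a = True
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._in_a:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_a:
            self.links.append(("".join(self._text).strip(), self._href))
            self._in_a = False


# Invented sample with the sort of sloppiness real pages contain.
sample = ("<html><body><p>First<br><a href=index.html>Home</a>"
          "<a href='/about_us'>About</a></body></html>")

collector = LinkCollector()
collector.feed(sample)
collector.close()
links = collector.links

for text, href in links:
    print(text, href)
```

Unlike minidom, this parser does not insist on well-formed input, which is the property that matters for pages like this one.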
https://stackoverflow.com/questions/56866707