
DocumentBuilder tagName root issue

Stack Overflow user
Asked 2020-02-11 04:02:35

I am trying to use DocumentBuilder to parse a large file that contains multiple documents. When I run my program, I get this error: "The markup in the document following the root element must be well-formed."

I think this is because my file has no actual root element. It is a TextEdit document structured as follows:

<DOC>
<DOCNO> AP890106-0001 </DOCNO>
<FILEID>AP-NR-01-06-89 0033EST</FILEID>
<FIRST>r a PM-BRF--Heidnik     01-06 0136</FIRST>
<SECOND>PM-BRF--Heidnik,0139</SECOND>
<HEAD>Torture-Murderer In Fair Condition, Conscious</HEAD>
<DATELINE>PITTSBURGH (AP) </DATELINE>
<TEXT>
   Convicted torture-murderer Gary Heidnik has
regained consciousness after apparently attempting suicide in his
prison cell with a drug overdose, prison officials said.
   Heidnik's condition was upgraded to fair Thursday, but he
remained under tight security in the intensive care unit of West
Penn Hospital, said Tom Seiverling, a spokesman for the State
Correctional Institution at Pittsburgh.
   Heidnik, 45, was semi-comatose earlier this week after being
found unconscious in his cell Sunday. Prison officials believe
Heidnik stored up medications that were prescribed for him by
pretending to take them at the designated times.
   The self-proclaimed minister faces the death sentence for the
slayings of two of six women he kept chained in the basement of his
Philadelphia row house. He was convicted and sentenced last July.
</TEXT>
</DOC>
<DOC>
<DOCNO> AP890106-0002 </DOCNO>
<FILEID>AP-NR-01-06-89 0524EST</FILEID>
<FIRST>d a PM-BRF--DrivingToddler     01-06 0162</FIRST>
<SECOND>PM-BRF--Driving Toddler,0166</SECOND>
<HEAD>3-Year-Old Takes Careening First Drive; Emerges Unharmed</HEAD>
<DATELINE>CAZENOVIA, N.Y. (AP) </DATELINE>
<TEXT>
   Going out to buy a puppy, Cecilia Kaler
placed her three-year-old son in a child seat, left the car running
and got out to clear snow from the windshield. She never finished
the job.
   As soon as his mother closed the door, little Michael Kaler
locked it, put the car in drive, and rode away Wednesday. The car
went down the driveway, across a busy road, narrowly missed a tree
and fire hydrant, rolled on its side down an embankment and finally
came to rest in a creek.
   Michael was wet, cold and otherwise unharmed, said Kaler, a
resident of this community 15 miles southeast of Syracuse.
   A nearby man heard Kaler screaming and rushed over. He smashed a
window and freed little Michael.
   ``Anybody who says there's no God doesn't know what they're
talking about, because someone certainly was looking out for him,''
Kaler said Thursday.
</TEXT>
</DOC>

I want to split the file into individual documents using the <DOC></DOC> tags.

My code so far:

DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();

Document doc = dBuilder.parse(document);
doc.getElementsByTagName("doc").toString();

1 Answer

Stack Overflow user

Answered 2020-02-11 04:17:05

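The file cannot be parsed as it stands because it has no single root element. Your <DOC> </DOC> blocks must be wrapped in another container element; pick any name you like. Once the XML is well-formed, you can try parsing it.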

Example:

<mytag>
    <DOC> ........ </DOC>
    <DOC> ........ </DOC>
</mytag>
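
If editing the file by hand is impractical, the same wrapping can be done in memory at parse time. The following is only a sketch under a few assumptions: the class name `MultiDocParser`, the helper `parseWithSyntheticRoot`, and the root name `ROOT` are made up for illustration, the input is assumed to have no XML declaration or DOCTYPE at the top, and the text inside each block is assumed to be otherwise well-formed (for example, no unescaped `&`).

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class MultiDocParser {

    // Hypothetical helper: surrounds a stream of sibling <DOC> elements with a
    // synthetic <ROOT> element so that the parser sees exactly one root.
    public static Document parseWithSyntheticRoot(InputStream fragments) throws Exception {
        InputStream open = new ByteArrayInputStream("<ROOT>".getBytes(StandardCharsets.UTF_8));
        InputStream close = new ByteArrayInputStream("</ROOT>".getBytes(StandardCharsets.UTF_8));
        // SequenceInputStream reads the three streams back to back.
        InputStream wrapped = new SequenceInputStream(
                Collections.enumeration(Arrays.asList(open, fragments, close)));
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(wrapped);
    }

    public static void main(String[] args) throws Exception {
        // Shortened stand-in for the file shown in the question.
        String sample = "<DOC><DOCNO> AP890106-0001 </DOCNO><TEXT>first</TEXT></DOC>\n"
                + "<DOC><DOCNO> AP890106-0002 </DOCNO><TEXT>second</TEXT></DOC>\n";
        Document doc = parseWithSyntheticRoot(
                new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8)));

        // Tag names are case-sensitive in XML, so this must be "DOC", not "doc".
        NodeList docs = doc.getElementsByTagName("DOC");
        for (int i = 0; i < docs.getLength(); i++) {
            Element e = (Element) docs.item(i);
            String docNo = e.getElementsByTagName("DOCNO").item(0).getTextContent().trim();
            System.out.println(docNo);
        }
    }
}
```

For a real file, a `FileInputStream` could be passed in place of the in-memory sample. Note that `getElementsByTagName("doc")`, as written in the question, would return an empty list even after the root is added, because the elements in the file are named `DOC`.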
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/60157575
