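I am getting a value from a web service that contains many HTML tags (span with style, p, class attributes, and so on). I want to convert these into XML tags and parse the values. Can anybody tell me how to convert HTML tags into XML tags? Please give an example.
My web service response is: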
11-11 19:35:36.922: INFO/System.out(6956): Article detail response<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><soap:Body><getDataResponse xmlns="http://tempuri.org/QuestIPhoneWebService/QuestIPhoneWebService"><getDataResult><ROOT xmlns:sql="urn:schemas-microsoft-com:xml-sql"><ARTICLE ARTICLE_ID="23221" HIDE_HEADER="0" MIGRATED="0" CITNART_DOC_REGION_INFO="" ISCSUSER="1" ARTICLE_TYPE_ID="31" ARTICLE_TYPE="Mobile- News and Commentary - Europe" CITN_ISSUE_NUMBER="" CITN_ARTICLE_TYPE_ID="" CITN_ARTICLE_TYPE="" SHOW_AUTH="1" LOGO_TYPE="QUEST" TITLE="Elementis - europe" DATE="2010-11-04T11:58:21.387" BODY="&lt;span style=&quot;WIDOWS: 2; TEXT-TRANSFORM: none; TEXT-INDENT: 0px; BORDER-COLLAPSE: separate; FONT: medium 'Times New Roman'; WHITE-SPACE: normal; ORPHANS: 2; LETTER-SPACING: normal; COLOR: rgb(0,0,0); WORD-SPACING: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: none; -webkit-text-stroke-width: 0px&quot; class=&quot;Apple-style-span&quot;&gt;&lt;span class=&quot;Apple-style-span&quot;&gt;
11-11 19:35:36.932: INFO/System.out(6956): &lt;p style=&quot;LINE-HEIGHT: 11pt&quot; class=&quot;MsoNormal&quot;&gt;&lt;span lang=&quot;EN-US&quot;&gt;At the end of 2008, the FTSE350 chemical sector consisted of just two names &amp;#8211; Johnston Matthey and Croda. Since then we have had the admission of Victrex and, as of last week, Elementis and Yule Catto. Having met management, we believe that Elementis has all the ingredients for value creation that Croda has so successfully exhibited.&lt;/span&gt;&lt;/p&gt;
11-11 19:35:36.961: INFO/System.out(6956): &lt;p style=&quot;LINE-HEIGHT: 11pt&quot; class=&quot;MsoNormal&quot;&gt;&lt;span lang=&quot;EN-US&quot;&gt;Being promoted into the FTSE250 opens Elementis up to a whole new investment audience. It has not just got there through a cyclical bounce back either. The company has gone through a very sensible rationalisation programme, exited a low-returning business (UK Chromium), is running much more efficient levels of working capital, and crucially, is more exposed to growth markets. To give an idea of management&amp;#8217;s resolve, instead of selling the UK Chromium business they decided to effectively bulldoze the site. This will prevent a competitor from interfering in Elementis&amp;#8217; position in US Chromium.&lt;/span&gt;&lt;/p&gt;
11-11 19:35:36.971: INFO/System.out(6956): &lt;p style=&quot;LINE-HEIGHT: 11pt&quot; class=&quot;MsoNormal&quot;&gt;&lt;span lang=&quot;EN-US&quot;&gt;During the credit crunch Elementis picked up an Asian-focused speciality chemicals business called Deuchem for &amp;#163;38m (&amp;#163;45m sales). Deuchem has 12 offices in&lt;?xml:namespace prefix = st1 /&gt;&lt;st1:country-region&gt;&lt;st1:place&gt;China&lt;/st1:place&gt;&lt;/st1:country-region&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp;&lt;/span&gt;and is benefiting as the Chinese customer moves up the quality/performance scale. Previously, Chinese demand was not for sophisticated products &amp;#8211; this is changing as we type. Coatings are the main market for speciality products, with Oilfield Chemicals the next biggest category. The cost of Elementis&amp;#8217; products per end unit remains small, typically &amp;lt;5%. Yet the relationship with the customer (its largest is Akzo Nobel) is generally one that has been forged over many years (even decades) and required them to work closely together. In short, it is not particularly competitive, but does require consistent delivery and performance from Elementis. We have a very conservative top-line growth forecast of 3% for specialty chemicals, yet would not be surprised if it was nearer 5%. Margin progression here is key and we expect a mid-to-high teens margin up from 9%.&lt;/span&gt;&lt;/p&gt;
11-11 19:35:36.983: INFO/System.out(6956): &lt;p style=&quot;LINE-HEIGHT: 11pt&quot; class=&quot;MsoNormal&quot;&gt;&lt;span lang=&quot;EN-US&quot;&gt;Another growth area is shale gas. Elementis makes the lubricant for the drill bit. Typically, drilling was vertical. But, now drill bits can be turned 90 degrees accessing much more of the shale seam. This requires much more lubricant &amp;#8211; hence H1 2010 volumes were double the year before. There is only one competitor in this area. Elsewhere in the US Elementis has its US Chromium business. This is steady, has high&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp;&lt;/span&gt;&lt;st1:country-region&gt;US&lt;/st1:country-region&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp;&lt;/span&gt;market shares and has a superior transport advantage to competitors exporting to the&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp;&lt;/span&gt;&lt;st1:country-region&gt;&lt;st1:place&gt;US&lt;/st1:place&gt;&lt;/st1:country-region&gt;. This is a solid business growing at 3% with a 15% operating margin.&lt;/span&gt;&lt;/p&gt;
11-11 19:35:36.983: INFO/System.out(6956): &lt;p style=&quot;LINE-HEIGHT: 11pt&quot; class=&quot;MsoNormal&quot;&gt;&lt;span lang=&quot;EN-US&quot;&gt;Since the credit crunch the CFO has tightened up inventory management and creditor days. This has helped to transfer c.&amp;#163;25m of value to shareholders, a vital step in maximizing returns for shareholders. On a separate note management think there is a chance that an EU fine worth &amp;#163;21m that Elementis has paid could be reversed.&lt;/span&gt;&lt;/p&gt;
11-11 19:35:36.992: INFO/System.out(6956): &lt;p style=&quot;LINE-HEIGHT: 11pt&quot; class=&quot;MsoNormal&quot;&gt;&lt;span lang=&quot;EN-US&quot;&gt;We&amp;#8217;ve updated the Modeller approach we used in last month&amp;#8217;s CITN note &amp;#8220;&lt;a href=&quot;http://www.csquest.com/QUEST?uid=MAIL&amp;amp;Tp=Cn&amp;amp;PCF=CNAR&amp;amp;ID=23243&quot; target=&quot;_blank&quot;&gt;It&amp;#8217;s Elementary&lt;/a&gt;&amp;#8221;. Instead of using a&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp;&lt;/span&gt;&lt;a href=&quot;http://www.csquest.com/QUEST?clpg=ART&amp;amp;id=13586&amp;amp;clid=&amp;amp;pg=MDL&amp;amp;spl=&amp;amp;cid=0241854&quot; target=&quot;_blank&quot;&gt;central valuation (100p)&lt;/a&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp;&lt;/span&gt;&amp;#8211; half way between the&lt;a href=&quot;http://www.csquest.com/QUEST?clpg=ART&amp;amp;id=13629&amp;amp;clid=&amp;amp;pg=MDL&amp;amp;spl=&amp;amp;cid=0241854&quot; target=&quot;_blank&quot;&gt;bull (135p)&lt;/a&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp;&lt;/span&gt;and bear (67p) scenarios &amp;#8211; since seeing management, we&amp;#8217;re now happier using a valuation halfway between the bull case and the central case. Given this renewed confidence, we think this 118p adjusted valuation is very credible indeed. With 24% upside to Friday&amp;#8217;s close, Elementis is a buy.&lt;/span&gt;&lt;/p&gt;
11-11 19:35:36.992: INFO/System.out(6956): &lt;p&gt;
11-11 19:35:37.002: INFO/System.out(6956): &lt;table style=&quot;WIDTH: 345.75pt; BORDER-COLLAPSE: collapse; MARGIN-LEFT: 4pt&quot; class=&quot;MsoTableGrid&quot; border=&quot;0&quot; cellspacing=&quot;0&quot; cellpadding=&quot;0&quot; width=&quot;461&quot;&gt;
11-11 19:35:37.012: INFO/System.out(6956): &lt;tbody&gt;
11-11 19:35:37.023: INFO/System.out(6956): &lt;tr&gt;
11-11 19:35:37.023: INFO/System.out(6956): &lt;td style=&quot;PADDING-BOTTOM: 0cm; PADDING-LEFT: 5.4pt; WIDTH: 345.75pt; PADDING-RIGHT: 5.4pt; PADDING-TOP: 0cm&quot; valign=&quot;top&quot; width=&quot;461&quot;&gt;
11-11 19:35:37.033: INFO/System.out(6956): &lt;p style=&quot;LINE-HEIGHT: 11pt; MARGIN: 0.75pt 0cm 0.75pt -3.95pt&quot; class=&quot;MsoNormal&quot;&gt;&lt;b&gt;&lt;span lang=&quot;EN-US&quot;&gt;Sales Team&lt;/span&gt;&lt;/b&gt;&lt;span lang=&quot;EN-US&quot;&gt;&lt;span class=&quot;Apple-converted-space&quot;&gt;&amp;nbsp;&lt;/span&gt;&lt;a href=&quot;mailto:salesteam@collinsstewart.com&quot; target=&quot;_blank&quot;&gt;salesteam@collinsstewart.com&lt;/a&gt;, Tel: +44 (0) 20 7523 8493&lt;/span&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/p&gt;&lt;/span&gt;&lt;/span&gt;" IS_PROTECTED="0" PDF_NAME="" REFERENCE_CITN_ARTICLE_ID="23221" ISNEWARTICLE="5" HYPERLINK="/PATH/23221.pdf"><SUMMARY>Elementis Europe Summary</SUMMARY><AUTHORS/></ARTICLE><ASSOCIATED_COMPANIES ARTICLE_ID="23221"/><COMPANIES_WITH_AUTH context="COMPANIES"/></ROOT>
11-11 19:35:37.033: INFO/System.out(6956): </getDataResult></getDataResponse></soap:Body></soap:Envelope>
Posted on 2010-11-11 22:54:43
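People will tell you to use an HTML parsing class, and that is good advice. However, when the problem space is sufficiently constrained (as it is here), you can get by with regular expressions. Whether that is worth attempting depends on several things, including how comfortable you are with each of the two approaches.
However, I can't yet say what that regular expression might be, because you haven't shown exactly what output you need for some particular input. If you do, I will edit this answer to show you how.
Posted on 2011-03-18 18:29:27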
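As a rough starting point only, here is a minimal sketch that assumes the desired output is the plain text of the article. The `BODY` attribute holds XML-escaped HTML, so an ordinary XML parser already hands back the HTML string, and a regular expression can then strip the tags. The class and method names (`ArticleBodyText`, `extractBodyHtml`, `toPlainText`, `decodeEntities`) are hypothetical. The input is assumed to be the raw response string, without the logcat prefixes shown in the question.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: none of these names come from the question.
class ArticleBodyText {

    // Matches any tag-like run, including <?xml:namespace ... /> and <st1:place>.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");
    // Only the entities that appear in the sample response are handled (assumption).
    private static final Pattern ENTITY =
            Pattern.compile("&(#\\d{1,6}|nbsp|lt|gt|quot|amp);");

    /** Returns the BODY attribute of the first ARTICLE element, unescaped by the XML parser. */
    static String extractBodyHtml(String responseXml) {
        Document doc;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(new InputSource(new StringReader(responseXml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("response is not well-formed XML", e);
        }
        NodeList articles = doc.getElementsByTagName("ARTICLE");
        if (articles.getLength() == 0) {
            throw new IllegalArgumentException("no ARTICLE element found");
        }
        return ((Element) articles.item(0)).getAttribute("BODY");
    }

    /** Decodes numeric entities and a few named ones in a single pass. */
    static String decodeEntities(String text) {
        Matcher m = ENTITY.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            String replacement;
            if (name.startsWith("#")) {
                int codePoint = Integer.parseInt(name.substring(1));
                replacement = new String(Character.toChars(codePoint));
            } else if (name.equals("nbsp")) {
                replacement = " "; // mapped to a plain space so it collapses with other whitespace
            } else if (name.equals("lt")) {
                replacement = "<";
            } else if (name.equals("gt")) {
                replacement = ">";
            } else if (name.equals("quot")) {
                replacement = "\"";
            } else {
                replacement = "&";
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Strips tags first, then decodes entities, so an escaped "&lt;5%" is kept as text. */
    static String toPlainText(String html) {
        String withoutTags = TAG.matcher(html).replaceAll(" ");
        return decodeEntities(withoutTags).replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String sample = "<ROOT><ARTICLE BODY=\"&lt;p&gt;Sales Team&lt;/p&gt;\"/></ROOT>";
        System.out.println(toPlainText(extractBodyHtml(sample)));
    }
}
```

Replacing each tag with a space keeps adjacent paragraphs from running together, at the cost of an occasional stray space before punctuation. If you need the structure (links, table cells) rather than the text, a real HTML parser is the safer route.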
You can also use HtmlCleaner. It can convert any HTML code into XML.
https://stackoverflow.com/questions/4155613