I am parsing an XML file generated by an external program. I then want to add custom annotations to this file using my own namespace. My input looks like this:
<sbml xmlns="http://www.sbml.org/sbml/level2/version4" xmlns:celldesigner="http://www.sbml.org/2001/ns/celldesigner" level="2" version="4">
<model metaid="untitled" id="untitled">
<annotation>...</annotation>
<listOfUnitDefinitions>...</listOfUnitDefinitions>
<listOfCompartments>...</listOfCompartments>
<listOfSpecies>
<species metaid="s1" id="s1" name="GenA" compartment="default" initialAmount="0">
<annotation>
<celldesigner:extension>...</celldesigner:extension>
</annotation>
</species>
<species metaid="s2" id="s2" name="s2" compartment="default" initialAmount="0">
<annotation>
<celldesigner:extension>...</celldesigner:extension>
</annotation>
</species>
</listOfSpecies>
<listOfReactions>...</listOfReactions>
</model>
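</sbml>
The problem is that lxml only declares namespaces where they are used, which means the declaration is repeated many times, like this (simplified):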
<sbml xmlns="namespace" xmlns:celldesigner="morenamespace" level="2" version="4">
<listOfSpecies>
<species>
<kjw:test xmlns:kjw="http://this.is.some/custom_namespace"/>
<celldesigner:data>Some important data which must be kept</celldesigner:data>
</species>
<species>
<kjw:test xmlns:kjw="http://this.is.some/custom_namespace"/>
</species>
....
</listOfSpecies>
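</sbml>
Is it possible to force lxml to write this declaration only once, in a parent element such as sbml or listOfSpecies? Or is there a good reason not to do so? The result I want is: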
<sbml xmlns="namespace" xmlns:celldesigner="morenamespace" level="2" version="4" xmlns:kjw="http://this.is.some/custom_namespace">
<listOfSpecies>
<species>
<kjw:test/>
<celldesigner:data>Some important data which must be kept</celldesigner:data>
</species>
<species>
<kjw:test/>
</species>
....
</listOfSpecies>
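</sbml>
The important point is that the existing data read from the file must be kept, so I cannot just create a new root element (I think?).
EDIT: Code attached below.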
def annotateSbml(sbml_input):
    from lxml import etree

    checkSbml(sbml_input)  # Makes sure the input is valid sbml/xml.

    ns = "http://this.is.some/custom_namespace"
    etree.register_namespace('kjw', ns)

    sbml_doc = etree.ElementTree()
    root = sbml_doc.parse(sbml_input, etree.XMLParser(remove_blank_text=True))
    nsmap = root.nsmap
    nsmap['sbml'] = nsmap[None]  # Makes code more readable, but seems ugly. Any alternatives to this?
    nsmap['kjw'] = ns
    ns = '{' + ns + '}'
    sbmlns = '{' + nsmap['sbml'] + '}'

    for species in root.findall('sbml:model/sbml:listOfSpecies/sbml:species', nsmap):
        species.append(etree.Element(ns + 'test'))

    sbml_doc.write("test.sbml.xml", pretty_print=True, xml_declaration=True)
    return
Posted on 2012-07-06 02:08:32
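You cannot modify the namespace mapping of a node in lxml. See this open ticket, which lists this feature as a wishlist item.
It originated from this thread on the lxml mailing list, where a workaround replacing the root node is given as an alternative. Replacing the root node has some issues, though: see the ticket above.
For completeness, here is the suggested root-replacement workaround code: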
>>> DOC = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4" xmlns:celldesigner="http://www.sbml.org/2001/ns/celldesigner" level="2" version="4">
... <model metaid="untitled" id="untitled">
... <annotation>...</annotation>
... <listOfUnitDefinitions>...</listOfUnitDefinitions>
... <listOfCompartments>...</listOfCompartments>
... <listOfSpecies>
... <species metaid="s1" id="s1" name="GenA" compartment="default" initialAmount="0">
... <annotation>
... <celldesigner:extension>...</celldesigner:extension>
... </annotation>
... </species>
... <species metaid="s2" id="s2" name="s2" compartment="default" initialAmount="0">
... <annotation>
... <celldesigner:extension>...</celldesigner:extension>
... </annotation>
... </species>
... </listOfSpecies>
... <listOfReactions>...</listOfReactions>
... </model>
... </sbml>"""
>>>
>>> from lxml import etree
>>> from io import StringIO
>>> NS = "http://this.is.some/custom_namespace"
>>> tree = etree.ElementTree(element=None, file=StringIO(DOC))
>>> root = tree.getroot()
>>> nsmap = root.nsmap
>>> nsmap['kjw'] = NS
>>> new_root = etree.Element(root.tag, nsmap=nsmap)
>>> new_root[:] = root[:]
>>> new_root.append(etree.Element('{%s}%s' % (NS, 'test')))
>>> new_root.append(etree.Element('{%s}%s' % (NS, 'test')))
>>> print(etree.tostring(new_root, pretty_print=True, encoding='unicode'))
<sbml xmlns:celldesigner="http://www.sbml.org/2001/ns/celldesigner" xmlns:kjw="http://this.is.some/custom_namespace" xmlns="http://www.sbml.org/sbml/level2/version4"><model metaid="untitled" id="untitled">
<annotation>...</annotation>
<listOfUnitDefinitions>...</listOfUnitDefinitions>
<listOfCompartments>...</listOfCompartments>
<listOfSpecies>
<species metaid="s1" id="s1" name="GenA" compartment="default" initialAmount="0">
<annotation>
<celldesigner:extension>...</celldesigner:extension>
</annotation>
</species>
<species metaid="s2" id="s2" name="s2" compartment="default" initialAmount="0">
<annotation>
<celldesigner:extension>...</celldesigner:extension>
</annotation>
</species>
</listOfSpecies>
<listOfReactions>...</listOfReactions>
</model>
<kjw:test/><kjw:test/></sbml>
Posted on 2016-07-26 16:33:12
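I know this is an old question, but it is still valid, and as of lxml 3.5.0 there may be a better solution:
cleanup_namespaces() accepts a new argument top_nsmap that moves the definitions of the provided prefix-namespace mapping to the top of the tree.
So the namespace mapping can now be moved up with a simple call like this: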
nsmap = {'kjw': 'http://this.is.some/custom_namespace'}
etree.cleanup_namespaces(root, top_nsmap=nsmap)
Posted on 2013-06-18 12:36:43
This is possible if you temporarily add a namespaced attribute to the root node.
ns = '{http://this.is.some/custom_namespace}'
# add 'kjw:foobar' attribute to root node
root.set(ns+'foobar', 'foobar')
# add kjw namespace elements (or attributes) elsewhere
# ... get child element species ...
species.append(etree.Element(ns + 'test'))
# remove temporary namespaced attribute from root node
del root.attrib[ns+'foobar']
https://stackoverflow.com/questions/11346480
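The temporary-attribute trick from the last answer can be sketched end to end as follows. This is only a minimal illustration, not the answerer's full code: the shortened `DOC` input and the variable names `marker` and `temp_attr_xml` are assumptions made for the example, and `etree.register_namespace` (taken from the question's code) is called so that lxml picks the prefix `kjw` rather than a generated one such as `ns0`.

```python
from lxml import etree

NS = "http://this.is.some/custom_namespace"
SBML_NS = "http://www.sbml.org/sbml/level2/version4"

# Shortened, made-up input for illustration only.
DOC = (
    b'<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">'
    b'<model id="untitled"><listOfSpecies>'
    b'<species id="s1"/><species id="s2"/>'
    b'</listOfSpecies></model></sbml>'
)

# Ask lxml to prefer the prefix "kjw" when it has to create a declaration.
etree.register_namespace("kjw", NS)

root = etree.fromstring(DOC)
marker = "{%s}foobar" % NS  # hypothetical throwaway attribute name

# Temporarily set a namespaced attribute so the root gains an xmlns:kjw declaration.
root.set(marker, "foobar")

# Elements appended below the root can now reuse the root's declaration.
for species in root.iter("{%s}species" % SBML_NS):
    species.append(etree.Element("{%s}test" % NS))

# Remove the attribute again; the namespace declaration stays on the root.
del root.attrib[marker]

temp_attr_xml = etree.tostring(root, encoding="unicode")
print(temp_attr_xml)
```

Because the existing root element is modified in place, its own attributes and everything read from the file are left untouched, which was the constraint stated in the question.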
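For comparison, the `cleanup_namespaces()` approach from the 2016 answer might look like this when run on a small document. This is a hedged sketch that assumes lxml 3.5.0 or later; the shortened `DOC` input and the names `before` and `after` are made up for the example. Each new element is first created with its own `xmlns:kjw` declaration, and the call then moves that declaration to the root.

```python
from lxml import etree

NS = "http://this.is.some/custom_namespace"
SBML_NS = "http://www.sbml.org/sbml/level2/version4"

# Shortened, made-up input for illustration only.
DOC = (
    b'<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">'
    b'<model id="untitled"><listOfSpecies>'
    b'<species id="s1"/><species id="s2"/>'
    b'</listOfSpecies></model></sbml>'
)

root = etree.fromstring(DOC)

# Each appended element carries its own xmlns:kjw declaration at this point.
for species in root.iter("{%s}species" % SBML_NS):
    species.append(etree.Element("{%s}test" % NS, nsmap={"kjw": NS}))

before = etree.tostring(root, encoding="unicode")

# Declare kjw on the root and drop the now-redundant declarations below it.
etree.cleanup_namespaces(root, top_nsmap={"kjw": NS})

after = etree.tostring(root, encoding="unicode")
print(after)
```

Comparing `before` and `after` shows the repeated declarations collapsing into a single one on the root element, while the root's own attributes stay in place.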