Parsing XML in Python: module usage and wrapper code

Published: 2020-08-27 | Editor: 脚本学堂

This article shows how to use Python's XML modules to parse XML files, and how to wrap that usage in reusable code.

Take the following XML file, saved as test.xml (the name the first example reads):


<?xml version="1.0" encoding="utf-8" ?>
<root>
<childs>
<child name='first' >1</child>
<child value="2">2</child>
</childs>
</root>

Below are two ways to parse this kind of XML with Python's standard-library modules.

Method 1: use xml.sax to walk every node automatically:


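#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import print_function  # same script runs on Python 2 and 3

from xml.sax import parse
from xml.sax.handler import ContentHandler


class TestHandle(ContentHandler):
    """Print every SAX event and collect all character data in inlist."""

    def __init__(self, inlist):
        ContentHandler.__init__(self)
        self.inlist = inlist

    def startElement(self, name, attrs):
        # Called for each opening tag; attrs maps attribute names to values.
        print('name:', name, 'attrs:', attrs.keys())

    def endElement(self, name):
        # Called for each closing tag.
        print('endname', name)

    def characters(self, chars):
        # Called for text content, including the whitespace between tags.
        print('chars', chars)
        self.inlist.append(chars)


if __name__ == '__main__':
    lt = []
    parse('test.xml', TestHandle(lt))
    print(lt)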

Output (from a Python 2 run, hence the u'' prefixes):
name: root attrs: []
chars

name: childs attrs: []
chars

name: child attrs: [u'name']
chars 1
endname child
chars

name: child attrs: [u'value']
chars 2
endname child
chars

endname childs
chars

endname root
[u'\n', u'\n', u'1', u'\n', u'2', u'\n', u'\n']
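
The collected list holds mostly the line breaks between tags, because characters() fires for that whitespace as well as for real text. As a minimal sketch of how the handler could be wrapped to return only meaningful text, paired with the element it belongs to, consider the following. The names TextCollector and collect_texts are hypothetical, and the sample is embedded as bytes instead of being read from test.xml so the snippet is self-contained.

```python
from xml.sax import parseString
from xml.sax.handler import ContentHandler

# Same document as test.xml, embedded so the sketch runs on its own.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8" ?>
<root>
<childs>
<child name='first' >1</child>
<child value="2">2</child>
</childs>
</root>"""


class TextCollector(ContentHandler):
    """Hypothetical handler: keeps (element name, text) for non-empty text."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.texts = []
        self._buffer = []

    def startElement(self, name, attrs):
        # Start a fresh buffer for the element that just opened.
        self._buffer = []

    def characters(self, chars):
        # The parser may deliver one text run in several pieces.
        self._buffer.append(chars)

    def endElement(self, name):
        # Keep the text only if something other than whitespace was seen.
        text = ''.join(self._buffer).strip()
        if text:
            self.texts.append((name, text))
        self._buffer = []


def collect_texts(xml_bytes):
    """Parse xml_bytes and return a list of (element name, text) pairs."""
    handler = TextCollector()
    parseString(xml_bytes, handler)
    return handler.texts


print(collect_texts(SAMPLE))  # [('child', '1'), ('child', '2')] on Python 3
```

Because the buffer is reset on every opening tag, this sketch only suits documents where text sits in leaf elements, as it does in the sample.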

Method 2: use xml.dom.minidom to get the root node, then look up specific nodes as needed:


#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import print_function  # same script runs on Python 2 and 3

from xml.dom import minidom

# The XML declaration has to be the very first thing in the string.
xmlstr = '''<?xml version="1.0" encoding="UTF-8"?>
<hash>
    <request name='first'>/2/photos/square/type.xml</request>
    <error_code>21301</error_code>
    <error>auth faild!</error>
</hash>
'''


def doxml(xmlstr):
    dom = minidom.parseString(xmlstr)
    print('Dom:')
    print(dom.toxml())
    root = dom.firstChild
    print('root:')
    print(root.toxml())
    childs = root.childNodes
    for child in childs:
        print(child.toxml())
        if child.nodeType == child.TEXT_NODE:
            # Whitespace between elements appears as text nodes; skip it.
            pass
        else:
            print('child node attribute name:', child.getAttribute('name'))
            print('child node name:', child.nodeName)
            print('child node len:', len(child.childNodes))
            print('child data:', child.childNodes[0].data)
            print('=======================================')
            print('more help info to see:')
            # List the attributes and methods available on the node.
            for med in dir(child):
                print(med)


if __name__ == '__main__':
    doxml(xmlstr)

Output (truncated):
Dom:
<?xml version="1.0" ?><hash>
    <request name="first">/2/photos/square/type.xml</request>
    <error_code>21301</error_code>
    <error>auth faild!</error>
</hash>
root:
<hash>
    <request name="first">/2/photos/square/type.xml</request>
    <error_code>21301</error_code>
    <error>auth faild!</error>
</hash>

<request name="first">/2/photos/square/type.xml</request>
child node attribute name: first
child node name: request
child node len: 1
child data: /2/photos/square/type.xml
=======================================
more help info to see:
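
The listing above walks every child of the root. For looking up one node on demand, minidom also provides getElementsByTagName. The following is a minimal sketch under that assumption; the helper name element_text is hypothetical and is not part of minidom.

```python
from xml.dom import minidom

# Same document as in the listing above, as bytes so it parses unchanged.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<hash>
    <request name='first'>/2/photos/square/type.xml</request>
    <error_code>21301</error_code>
    <error>auth faild!</error>
</hash>
"""


def element_text(dom, tag):
    """Hypothetical helper: text of the first <tag> element, or None."""
    nodes = dom.getElementsByTagName(tag)
    if not nodes:
        return None
    # Join only the direct text children of the first match.
    return ''.join(n.data for n in nodes[0].childNodes
                   if n.nodeType == n.TEXT_NODE)


dom = minidom.parseString(XML)
print(element_text(dom, 'error_code'))  # 21301
print(dom.getElementsByTagName('request')[0].getAttribute('name'))  # first
print(element_text(dom, 'missing'))  # None
```

Returning None for a missing tag lets the caller tell "absent" apart from "present but empty", which yields an empty string.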
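Both approaches have their strengths: SAX streams through the document without building a tree, while minidom loads the whole document so that individual nodes can be looked up directly. Python has a great many XML-processing modules; so far I have only used these two.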

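===== Addendum =====
In real-world use I found that Python's minidom could not parse XML in other encodings: it only handled UTF-8 input, and the encoding declared in the XML header had to be utf-8 as well. Any other declared encoding raised an error.
The fix suggested online is to replace the encoding declaration in the XML header, convert the content to UTF-8, and then parse it with minidom. I tested this and it works, though it feels clumsy.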
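
That workaround can be sketched roughly as follows. This is a minimal illustration and not the author's code: the helper name parse_as_utf8 and the GBK sample document are assumptions, and the caller is assumed to already know the file's real encoding.

```python
import re
from xml.dom import minidom

# Matches the encoding attribute of an XML declaration at the very start.
_DECL_ENCODING = re.compile(r'''^(<\?xml[^>]*?encoding=)(["'])[^"']*\2''')


def parse_as_utf8(raw_bytes, source_encoding):
    """Hypothetical helper: transcode to UTF-8, fix the header, then parse."""
    # Decode with the real encoding of the file.
    text = raw_bytes.decode(source_encoding)
    # Rewrite the declared encoding so it matches the bytes handed to minidom.
    text = _DECL_ENCODING.sub(r'\g<1>\g<2>utf-8\g<2>', text, count=1)
    return minidom.parseString(text.encode('utf-8'))


# Assumed sample: a GBK-encoded document containing two Chinese characters.
sample = (u'<?xml version="1.0" encoding="gbk"?>'
          u'<root><item>\u4e2d\u6587</item></root>').encode('gbk')
dom = parse_as_utf8(sample, 'gbk')
item = dom.getElementsByTagName('item')[0]
print(item.firstChild.data == u'\u4e2d\u6587')  # True
```

If the declaration carries no encoding attribute, the substitution leaves the text untouched and the parser falls back to its UTF-8 default.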

Page link: http://www.jb200.com/python/16586.htm
