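Contents
- Working with XML in Python
- Reading an XML file
- Traversing XML elements
- Finding XML elements
- Adding XML elements
- Modifying XML elements
Working with XML in Python
Reading an XML file
A brief introduction to XML
<data> and </data> are the opening and closing tags of one element.
<property … /> is also a valid tag. Ending a tag with /> is the shorthand form for an element that has no nested content.
In name="cat", name is an attribute of the <data> tag and cat is the value of that attribute.
description here ... is the content of the <data> element, in this case a piece of text. The content can also be nested XML.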
<data name="cat" num=""> description here ... </data>
<property value="node" />
<country name="china">
  <province name="beijing">
    <school name="the sunshine school" />
  </province>
</country>
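To connect these terms to Python, here is a minimal sketch that parses the fragments above from strings with ET.fromstring and reads back the tag, attributes, and text. The value "1" for num is an assumption made only so the example has something to show.

```python
import xml.etree.ElementTree as ET

# The value of num is an assumption made for this sketch.
snippet = '<data name="cat" num="1"> description here ... </data>'
elem = ET.fromstring(snippet)

print(elem.tag)            # tag name
print(elem.attrib)         # all attributes as a dict
print(elem.get("name"))    # a single attribute value
print(elem.text.strip())   # text content without surrounding spaces

# Nested content: each child is itself an element.
country = ET.fromstring(
    '<country name="china">'
    '<province name="beijing">'
    '<school name="the sunshine school" />'
    '</province>'
    '</country>'
)
school = country.find("province/school")
# A self-closing element has no children and no text.
print(school.get("name"), len(school), school.text)
```

Attribute values are always strings, so a numeric attribute such as num needs an explicit int() conversion before arithmetic.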
Prepare a demo.xml file
<data>
  <teacher name="Albert">
    <birthday></birthday>
    <gender>male</gender>
    <subject>Math</subject>
  </teacher>
  <student name="Becky">
    <birthday></birthday>
    <gender>female</gender>
    <hobbies>
      <hobby>skating</hobby>
      <hobby>rocks</hobby>
    </hobbies>
    <exam absence="no">
      <math>90</math>
      <english>90</english>
      <music>95</music>
    </exam>
  </student>
  <student name="Cindy">
    <birthday>2001</birthday>
    <gender>female</gender>
    <hobbies>
      <hobby>reading</hobby>
      <hobby>guitar</hobby>
    </hobbies>
    <exam absence="yes">
    </exam>
  </student>
  <student name="Duke">
    <birthday></birthday>
    <gender>male</gender>
    <hobbies>
      <hobby>football</hobby>
      <hobby>surfing</hobby>
    </hobbies>
    <exam absence="no">
      <math></math>
      <english></english>
      <music></music>
    </exam>
  </student>
</data>
Read the contents of the XML file
import xml.etree.ElementTree as ET

# Read the .xml file
tree = ET.parse("demo.xml")
# getroot() returns the top-level <data> element
root = tree.getroot()
print(root)
Output
<Element 'data' at 0x102d80cf8>
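ET.parse raises an exception when the file is missing or is not well-formed XML. The following is a minimal sketch of a loader that reports both cases; load_root is a hypothetical helper name, not part of the library. It also shows ET.fromstring, which parses a string and returns the root element directly.

```python
import xml.etree.ElementTree as ET


def load_root(path):
    """Return the root element of the XML file at path, or None on failure.

    load_root is a hypothetical helper written for this sketch.
    """
    try:
        return ET.parse(path).getroot()
    except FileNotFoundError:
        print(f"{path}: file not found")
    except ET.ParseError as err:
        print(f"{path}: not well-formed XML ({err})")
    return None


# ET.fromstring parses XML held in a string instead of a file.
inline_root = ET.fromstring("<data><teacher name='Albert' /></data>")
print(inline_root.tag, inline_root[0].get("name"))
```

Returning None keeps the sketch short; in a larger program it may be preferable to let the exception propagate.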
Traversing XML elements
A for … in … loop iterates over the direct children of the current element.
for n in root:
    # items() returns all (key, value) attribute pairs of the tag
    print(n, n.tag, n.attrib, n.items())
Output
<Element 'teacher' at 0x1048b9e48> teacher {'name': 'Albert'} [('name', 'Albert')]
<Element 'student' at 0x1048bf0f0> student {'name': 'Becky'} [('name', 'Becky')]
<Element 'student' at 0x1048bf3c8> student {'name': 'Cindy'} [('name', 'Cindy')]
<Element 'student' at 0x1048bf5f8> student {'name': 'Duke'} [('name', 'Duke')]
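Besides looping, an element can be treated like a list of its direct children: len(), indexing, and slicing all work. A small sketch, using a shortened inline version of the demo data so that it runs without demo.xml:

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for demo.xml, kept inline so the sketch is self-contained.
root = ET.fromstring(
    "<data>"
    "<teacher name='Albert' />"
    "<student name='Becky' />"
    "<student name='Cindy' />"
    "<student name='Duke' />"
    "</data>"
)

print(len(root))              # number of direct children
print(root[0].tag)            # first child
print(root[-1].get("name"))   # last child

# Slicing returns a plain list of child elements.
names = [child.get("name") for child in root[1:]]
print(names)
```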
To walk the whole subtree (children, grandchildren, and so on), use iter(). The iteration starts with the current element itself.
for n in root.iter():
    print(n, n.tag)
Output
<Element 'data' at 0x1052f0cf8> data
<Element 'teacher' at 0x1052f0e48> teacher
<Element 'birthday' at 0x1052f0d30> birthday
<Element 'gender' at 0x1052f6080> gender
<Element 'subject' at 0x1052f60b8> subject
<Element 'student' at 0x1052f60f0> student
<Element 'birthday' at 0x1052f6048> birthday
<Element 'gender' at 0x1052f6128> gender
<Element 'hobbies' at 0x1052f6198> hobbies
<Element 'hobby' at 0x1052f6208> hobby
<Element 'hobby' at 0x1052f6240> hobby
<Element 'exam' at 0x1052f62b0> exam
<Element 'math' at 0x1052f6320> math
<Element 'english' at 0x1052f6390> english
<Element 'music' at 0x1052f6400> music
<Element 'student' at 0x1052f63c8> student
<Element 'birthday' at 0x1052f6438> birthday
<Element 'gender' at 0x1052f6470> gender
<Element 'hobbies' at 0x1052f64a8> hobbies
<Element 'hobby' at 0x1052f6518> hobby
<Element 'hobby' at 0x1052f6588> hobby
<Element 'exam' at 0x1052f65c0> exam
<Element 'student' at 0x1052f65f8> student
<Element 'birthday' at 0x1052f6630> birthday
<Element 'gender' at 0x1052f6668> gender
<Element 'hobbies' at 0x1052f66a0> hobbies
<Element 'hobby' at 0x1052f6710> hobby
<Element 'hobby' at 0x1052f6780> hobby
<Element 'exam' at 0x1052f67b8> exam
<Element 'math' at 0x1052f6828> math
<Element 'english' at 0x1052f6898> english
<Element 'music' at 0x1052f6908> music
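A long listing like this is easier to digest when summarized. The sketch below counts tags over iter() with collections.Counter and also shows a hand-written recursive walk (walk is a hypothetical helper) that reports nesting depth, which iter() alone does not provide. The inline XML is a shortened stand-in for demo.xml.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Shortened stand-in for demo.xml.
XML = """<data>
  <teacher name="Albert"><gender>male</gender></teacher>
  <student name="Becky">
    <hobbies><hobby>skating</hobby><hobby>rocks</hobby></hobbies>
  </student>
</data>"""
root = ET.fromstring(XML)

# iter() yields the element itself and then every descendant in document order.
tag_counts = Counter(elem.tag for elem in root.iter())
print(tag_counts)


def walk(elem, depth=0):
    """Yield (depth, tag) pairs in document order (hypothetical helper)."""
    yield depth, elem.tag
    for child in elem:
        yield from walk(child, depth + 1)


for depth, tag in walk(root):
    print("  " * depth + tag)
```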
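To iterate only over elements with a given tag, pass the tag name to iter(). The match applies at any depth of the subtree, not only to direct children.
for n in root.iter('teacher'):
    print(n, n.tag)
Output
<Element 'teacher' at 0x100f29e48> teacher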
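The difference between searching the whole subtree and searching direct children only shows up once the tag is nested. A minimal sketch on an inline fragment (a shortened stand-in for demo.xml), comparing iter('hobby') with findall('hobby') called on the root:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<data>"
    "<student name='Becky'>"
    "<hobbies><hobby>skating</hobby><hobby>rocks</hobby></hobbies>"
    "</student>"
    "</data>"
)

# iter(tag) searches the whole subtree.
deep = [h.text for h in root.iter("hobby")]

# findall(tag) with a plain tag name looks at direct children only,
# and <data> has no direct <hobby> child.
shallow = root.findall("hobby")

print(deep)
print(shallow)
```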
Finding XML elements
Use find and findall to look up XML elements. With a plain tag name, both search only the direct children of the current element.
# find the first matching element
print(root.find('student'))
# find all matching elements
print(root.findall('student'))
Output
<Element 'student' at 0x1034300f0>
[<Element 'student' at 0x1034300f0>, <Element 'student' at 0x1034303c8>, <Element 'student' at 0x1034305f8>]
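find and findall also accept a limited subset of XPath, which can replace some manual loops. The sketch below assumes a shortened inline version of the demo data and shows an attribute predicate, a descendant search with .//, and a filter on a child's attribute.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for demo.xml.
root = ET.fromstring(
    "<data>"
    "<student name='Becky'><hobbies><hobby>skating</hobby></hobbies>"
    "<exam absence='no' /></student>"
    "<student name='Cindy'><hobbies><hobby>reading</hobby></hobbies>"
    "<exam absence='yes' /></student>"
    "<student name='Duke'><hobbies><hobby>football</hobby></hobbies>"
    "<exam absence='no' /></student>"
    "</data>"
)

# [@attr='value'] selects children by attribute value.
becky = root.find("student[@name='Becky']")

# .// searches all descendants, not only direct children.
all_hobbies = [h.text for h in root.findall(".//hobby")]

# Students whose <exam> child is marked absence="no".
present = [
    s.get("name")
    for s in root.findall("student")
    if s.find("exam[@absence='no']") is not None
]

print(becky.get("name"), all_hobbies, present)
```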
Demo
for n in root:
    # Select the student named Becky
    if n.tag == 'student' and n.get('name') == 'Becky':
        exam_node = n.find('exam')
        # Print each subject under <exam> with its score
        for subject in exam_node:
            print(subject.tag + " " + subject.text)
Output
math 90
english 90
music 95
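The concatenation in this demo raises a TypeError when a subject element is empty, because its text is then None, and find returns None when the child does not exist. One possible way to guard against both, sketched on a hypothetical fragment where <english> is empty and <music> is missing:

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: <english> is empty and <music> is missing.
student = ET.fromstring(
    "<student name='Becky'>"
    "<exam absence='no'><math>90</math><english></english></exam>"
    "</student>"
)

exam = student.find("exam")
scores = {}
# Compare with None explicitly rather than relying on element truthiness.
if exam is not None:
    for subject in ("math", "english", "music"):
        # findtext returns the default when the child is missing and ''
        # when the child exists but has no text; "or" covers both cases.
        scores[subject] = exam.findtext(subject, default="") or "n/a"

print(scores)
```

The placeholder "n/a" is an arbitrary choice for this sketch; any sentinel that suits the data would work.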
Adding XML elements
Create a new element with p = ET.Element(tag_name), then attach it to its parent with append().
Demo
import os

for n in root:
    if n.tag == 'student' and n.get('name') == 'Cindy':
        exam_node = n.find('exam')
        exam_node.set("absence", "no")
        for subject in ['math', 'music']:
            # Create the element, set its text, then attach it to <exam>
            p = ET.Element(subject)
            p.text = '90'
            exam_node.append(p)

# tree.write() overwrites an existing file, so this removal is optional
if os.path.exists('new.xml'):
    os.remove('new.xml')
tree.write('new.xml', encoding='utf-8', xml_declaration=True)
Output
<student name="Cindy">
<birthday>2001</birthday>
<gender>female</gender>
<hobbies>
<hobby>reading</hobby>
<hobby>guitar</hobby>
</hobbies>
<exam absence="no">
<math>90</math><music>90</music></exam>
</student>
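In this output the new elements end up on one line because appended elements carry no surrounding whitespace. A possible variation, sketched below on an inline fragment, uses ET.SubElement to create and attach a child in one call and ET.indent (available from Python 3.9) to re-indent the tree before serializing. The hasattr guard is only there so the sketch still runs on older interpreters.

```python
import xml.etree.ElementTree as ET

# Inline stand-in for Cindy's record in demo.xml.
student = ET.fromstring(
    "<student name='Cindy'><exam absence='yes'></exam></student>"
)

exam = student.find("exam")
exam.set("absence", "no")
for subject, score in [("math", "90"), ("music", "90")]:
    # SubElement creates the child and appends it to exam in one step.
    ET.SubElement(exam, subject).text = score

# ET.indent exists from Python 3.9 onward; skip it on older versions.
if hasattr(ET, "indent"):
    ET.indent(student, space="  ")

print(ET.tostring(student, encoding="unicode"))
```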
Modifying XML elements
Demo
for n in root:
    if n.tag == 'student' and n.get('name') == 'Cindy':
        exam_node = n.find('exam')
        # set() updates an existing attribute or adds a new one
        exam_node.set("absence", "no")
        exam_node.set("date", "2022-11-11")
        for subject in ['math', 'music']:
            p = ET.Element(subject)
            p.text = '90'
            exam_node.append(p)
        hobbies_node = n.find('hobbies').findall("hobby")
        # Indices 0 and 1 are assumed here; they reproduce the output below
        hobbies_node[0].text = 'piano'
        p = ET.Element("hobby")
        p.set("old_hobby", 'yes')
        p.text = 'reading'
        n.find('hobbies').remove(hobbies_node[1])
        n.find('hobbies').append(p)
Output
<student name="Cindy">
<birthday>2001</birthday>
<gender>female</gender>
<hobbies>
<hobby>piano</hobby>
<hobby old_hobby="yes">reading</hobby></hobbies>
<exam absence="no" date="2022-11-11">
<math>90</math><music>90</music></exam>
</student>
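Modification also covers deleting elements and attributes. The sketch below, on an inline stand-in for demo.xml, removes every student whose exam is marked absent and then drops the absence attribute from the remaining exams. Looping over the list returned by findall, rather than over root itself, avoids skipping elements while removing.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for demo.xml.
root = ET.fromstring(
    "<data>"
    "<student name='Becky'><exam absence='no' /></student>"
    "<student name='Cindy'><exam absence='yes' /></student>"
    "<student name='Duke'><exam absence='no' /></student>"
    "</data>"
)

# findall returns a separate list, so removing from root inside the loop is safe.
for student in root.findall("student"):
    if student.find("exam").get("absence") == "yes":
        root.remove(student)

# attrib is a regular dict, so pop() deletes an attribute if it is present.
for exam in root.iter("exam"):
    exam.attrib.pop("absence", None)

remaining = [s.get("name") for s in root]
print(remaining)
print(ET.tostring(root, encoding="unicode"))
```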