Contents
- Working with XML in Python
- Reading an XML file
- Iterating over XML elements
- Finding XML elements
- Adding XML elements
- Modifying XML elements
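Working with XML in Python
Reading an XML file
A brief introduction to XML
<data> and </data> are the opening and closing tags of one element.
<property … /> is also a valid tag. Ending with /> is shorthand for an element that has no nested content.
In name="cat", name is an attribute of the <data> tag and cat is the value of that attribute.
"description here …" is the content of the <data> tag. Here it is plain text, but it could also be nested XML.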
<data name="cat" num=""> description here ... </data>
<property value="node" />
<country name="china">
    <province name="beijing">
        <school name="the sunshine school" />
    </province>
</country>
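To connect these terms to Python, here is a minimal sketch that parses two of the snippets above with the standard library's xml.etree.ElementTree and reads back the tag, attributes, and text. The variable names element and prop are only illustrative.

```python
import xml.etree.ElementTree as ET

# Parse one of the snippets above directly from a string.
element = ET.fromstring('<data name="cat">description here ...</data>')

print(element.tag)          # tag name: data
print(element.attrib)       # attributes as a dict: {'name': 'cat'}
print(element.get("name"))  # value of a single attribute: cat
print(element.text)         # text content: description here ...

# A self-closing tag parses to an element with no text and no children.
prop = ET.fromstring('<property value="node" />')
print(prop.text, len(prop))  # None 0
```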
Prepare a demo.xml file
<data>
    <teacher name="Albert">
        <birthday></birthday>
        <gender>male</gender>
        <subject>Math</subject>
    </teacher>
    <student name="Becky">
        <birthday></birthday>
        <gender>female</gender>
        <hobbies>
            <hobby>skating</hobby>
            <hobby>rocks</hobby>
        </hobbies>
        <exam absence="no">
            <math>90</math>
            <english>90</english>
            <music>95</music>
        </exam>
    </student>
    <student name="Cindy">
        <birthday>2001</birthday>
        <gender>female</gender>
        <hobbies>
            <hobby>reading</hobby>
            <hobby>guitar</hobby>
        </hobbies>
        <exam absence="yes">
        </exam>
    </student>
    <student name="Duke">
        <birthday></birthday>
        <gender>male</gender>
        <hobbies>
            <hobby>football</hobby>
            <hobby>surfing</hobby>
        </hobbies>
        <exam absence="no">
            <math></math>
            <english></english>
            <music></music>
        </exam>
    </student>
</data>
Read the contents of the XML file
import xml.etree.ElementTree as ET

# Read the .xml file and get its root element
tree = ET.parse("demo.xml")
root = tree.getroot()
print(root)
Output
<Element 'data' at 0x102d80cf8>
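If the XML is already in memory rather than in a file, the same tree can be built without touching the disk. The sketch below assumes a shortened, made-up version of demo.xml held in a string called xml_text. It shows that ET.parse also accepts a file-like object, and that ET.fromstring returns the root element directly.

```python
import io
import xml.etree.ElementTree as ET

# A shortened stand-in for demo.xml (illustrative only).
xml_text = "<data><teacher name='Albert' /><student name='Becky' /></data>"

# ET.parse accepts a file name or a file-like object and returns an ElementTree.
tree = ET.parse(io.StringIO(xml_text))
root = tree.getroot()

# ET.fromstring parses a string and returns the root Element directly.
same_root = ET.fromstring(xml_text)

print(root.tag, len(root))            # data 2
print(same_root.tag, len(same_root))  # data 2
```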
Iterating over XML elements
A for … in … loop iterates over all direct children of the current element.
for n in root:
    # items() returns all (key, value) attribute pairs of the tag
    print(n, n.tag, n.attrib, n.items())
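Output (shown as printed by Python 2, which displays the arguments as a tuple; Python 3 prints the same values separated by spaces)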
(<Element 'teacher' at 0x1048b9e48>, 'teacher', {'name': 'Albert'}, [('name', 'Albert')])
(<Element 'student' at 0x1048bf0f0>, 'student', {'name': 'Becky'}, [('name', 'Becky')])
(<Element 'student' at 0x1048bf3c8>, 'student', {'name': 'Cindy'}, [('name', 'Cindy')])
(<Element 'student' at 0x1048bf5f8>, 'student', {'name': 'Duke'}, [('name', 'Duke')])
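To walk the current element and all of its descendants (children, grandchildren, and so on), use iter(). Note that the element itself comes first.
for n in root.iter():
    print(n, n.tag)
Output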
(<Element 'data' at 0x1052f0cf8>, 'data')
(<Element 'teacher' at 0x1052f0e48>, 'teacher')
(<Element 'birthday' at 0x1052f0d30>, 'birthday')
(<Element 'gender' at 0x1052f6080>, 'gender')
(<Element 'subject' at 0x1052f60b8>, 'subject')
(<Element 'student' at 0x1052f60f0>, 'student')
(<Element 'birthday' at 0x1052f6048>, 'birthday')
(<Element 'gender' at 0x1052f6128>, 'gender')
(<Element 'hobbies' at 0x1052f6198>, 'hobbies')
(<Element 'hobby' at 0x1052f6208>, 'hobby')
(<Element 'hobby' at 0x1052f6240>, 'hobby')
(<Element 'exam' at 0x1052f62b0>, 'exam')
(<Element 'math' at 0x1052f6320>, 'math')
(<Element 'english' at 0x1052f6390>, 'english')
(<Element 'music' at 0x1052f6400>, 'music')
(<Element 'student' at 0x1052f63c8>, 'student')
(<Element 'birthday' at 0x1052f6438>, 'birthday')
(<Element 'gender' at 0x1052f6470>, 'gender')
(<Element 'hobbies' at 0x1052f64a8>, 'hobbies')
(<Element 'hobby' at 0x1052f6518>, 'hobby')
(<Element 'hobby' at 0x1052f6588>, 'hobby')
(<Element 'exam' at 0x1052f65c0>, 'exam')
(<Element 'student' at 0x1052f65f8>, 'student')
(<Element 'birthday' at 0x1052f6630>, 'birthday')
(<Element 'gender' at 0x1052f6668>, 'gender')
(<Element 'hobbies' at 0x1052f66a0>, 'hobbies')
(<Element 'hobby' at 0x1052f6710>, 'hobby')
(<Element 'hobby' at 0x1052f6780>, 'hobby')
(<Element 'exam' at 0x1052f67b8>, 'exam')
(<Element 'math' at 0x1052f6828>, 'math')
(<Element 'english' at 0x1052f6898>, 'english')
(<Element 'music' at 0x1052f6908>, 'music')
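To iterate only over elements with a given tag, pass the tag name to iter(). The match applies at any depth, not just to direct children.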
for n in root.iter('teacher'):
    print(n, n.tag)
Output
(<Element 'teacher' at 0x100f29e48>, 'teacher')
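The difference between looping over an element and calling iter() with a tag is easiest to see with a nested tag. The following sketch uses a small inline document modeled on demo.xml (trimmed for brevity, so it is an assumption rather than the full file).

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<data>"
    "<student name='Becky'><hobbies>"
    "<hobby>skating</hobby><hobby>rocks</hobby>"
    "</hobbies></student>"
    "<student name='Cindy'><hobbies>"
    "<hobby>reading</hobby>"
    "</hobbies></student>"
    "</data>"
)

# Looping over root visits only the direct children.
direct = [child.tag for child in root]

# iter('hobby') walks the whole subtree and matches at any depth.
nested = [hobby.text for hobby in root.iter("hobby")]

print(direct)  # ['student', 'student']
print(nested)  # ['skating', 'rocks', 'reading']
```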
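Finding XML elements
Use find and findall to look up elements: find returns the first matching direct child, and findall returns a list of all matching direct children.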
# find the first element
print(root.find('student'))
# find all elements
print(root.findall('student'))
Output
<Element 'student' at 0x1034300f0>
[<Element 'student' at 0x1034300f0>, <Element 'student' at 0x1034303c8>, <Element 'student' at 0x1034305f8>]
Demo
for n in root:
    if n.tag == 'student' and n.get('name') == 'Becky':
        exam_node = n.find('exam')
        # Print each subject and its score
        for subject in exam_node:
            print(subject.tag + " " + subject.text)
Output
math 90
english 90
music 95
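The same lookup can also be written as a single path expression, since find and findall accept a limited XPath subset that includes attribute predicates such as [@name='...']. This is a sketch on a trimmed inline copy of the demo data; the name Zoe is made up to show what happens when nothing matches.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<data>"
    "<student name='Becky'><exam absence='no'>"
    "<math>90</math><english>90</english><music>95</music>"
    "</exam></student>"
    "<student name='Cindy'><exam absence='yes'></exam></student>"
    "</data>"
)

# Select every child of Becky's <exam> in one path expression.
subjects = root.findall("./student[@name='Becky']/exam/*")
scores = {s.tag: s.text for s in subjects}
print(scores)  # {'math': '90', 'english': '90', 'music': '95'}

# find() returns None when nothing matches, so check before using the result.
missing = root.find("./student[@name='Zoe']")  # 'Zoe' is a hypothetical name
print(missing is None)  # True
```

Checking for None matters in the loop-based demo too: calling n.find('exam') on a student without an exam element would return None, and iterating over it would fail.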
Adding XML elements
p = ET.Element(tag_name)
Demo
import os

for n in root:
    if n.tag == 'student' and n.get('name') == 'Cindy':
        exam_node = n.find('exam')
        exam_node.set("absence", "no")
        # Create one child element per subject and append it to <exam>
        for subject in ['math', 'music']:
            p = ET.Element(subject)
            p.text = '90'
            exam_node.append(p)

# Remove any old output file, then write the modified tree
if os.path.exists('new.xml'):
    os.remove('new.xml')
tree.write('new.xml', encoding='utf-8', xml_declaration=True)
Output
<student name="Cindy">
    <birthday>2001</birthday>
    <gender>female</gender>
    <hobbies>
        <hobby>reading</hobby>
        <hobby>guitar</hobby>
    </hobbies>
    <exam absence="no">
    <math>90</math><music>90</music></exam>
</student>
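The appended elements end up on one line because new elements carry no surrounding whitespace. As a possible alternative, the sketch below builds the children with ET.SubElement, which creates and appends in one call, and tidies the layout with ET.indent. ET.indent only exists in Python 3.9 and later, so the call is guarded; the standalone exam element here is a stand-in for Cindy's exam node.

```python
import xml.etree.ElementTree as ET

# Stand-in for Cindy's <exam> node from demo.xml.
exam = ET.fromstring('<exam absence="yes"></exam>')
exam.set("absence", "no")

# ET.SubElement creates the child and appends it to the parent in one call.
for subject in ["math", "music"]:
    child = ET.SubElement(exam, subject)
    child.text = "90"

# ET.indent (Python 3.9+) rewrites whitespace so each child gets its own line.
if hasattr(ET, "indent"):
    ET.indent(exam, space="    ")

print(ET.tostring(exam, encoding="unicode"))
```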
Modifying XML elements
Demo
for n in root:
    if n.tag == 'student' and n.get('name') == 'Cindy':
        exam_node = n.find('exam')
        # set() updates an existing attribute or adds a new one
        exam_node.set("absence", "no")
        exam_node.set("date", "2022-11-11")
        for subject in ['math', 'music']:
            p = ET.Element(subject)
            p.text = '90'
            exam_node.append(p)
        hobbies_node = n.find('hobbies').findall("hobby")
        # Change the text of the first hobby
        hobbies_node[0].text = 'piano'
        p = ET.Element("hobby")
        p.set("old_hobby", 'yes')
        p.text = 'reading'
        # Remove the second hobby and append the new element
        n.find('hobbies').remove(hobbies_node[1])
        n.find('hobbies').append(p)
Output
<student name="Cindy">
    <birthday>2001</birthday>
    <gender>female</gender>
    <hobbies>
        <hobby>piano</hobby>
        <hobby old_hobby="yes">reading</hobby></hobbies>
    <exam absence="no" date="2022-11-11">
    <math>90</math><music>90</music></exam>
</student>
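The hobby edits can be tried in isolation. This is a minimal sketch on a standalone hobbies element without whitespace, so the serialized result fits on one line. It performs the same three steps: change text in place, remove a child, and append a new child with an attribute.

```python
import xml.etree.ElementTree as ET

hobbies = ET.fromstring(
    "<hobbies><hobby>reading</hobby><hobby>guitar</hobby></hobbies>"
)

items = hobbies.findall("hobby")
items[0].text = "piano"   # change the text of the first hobby in place
hobbies.remove(items[1])  # remove() must be called on the direct parent

# Append a new hobby that carries an attribute.
old = ET.SubElement(hobbies, "hobby", {"old_hobby": "yes"})
old.text = "reading"

result = ET.tostring(hobbies, encoding="unicode")
print(result)
# <hobbies><hobby>piano</hobby><hobby old_hobby="yes">reading</hobby></hobbies>
```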