Converting HTML markup to an RTF document
Question
I have an XML document containing embedded HTML content that I am trying to convert to an RTF output file. The XML elements are decorated with <li>, <p>, <b> and other HTML markup, which I would like to carry over into the generated RTF.
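Here is what works so far:
- Getting the XML tag content as a string (including the HTML tags for line breaks, paragraph breaks, and lists)
- Writing the XML tag content to the RTF file.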
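The first step above could look roughly like the sketch below. It assumes the HTML is embedded as well-formed child elements rather than as escaped text or CDATA (in which case `element.text` would already hold the markup). The helper name `inner_markup` and the sample element names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def inner_markup(element):
    """Return the element's text plus its serialized children as one string."""
    parts = [element.text or ""]
    for child in element:
        # ET.tostring also emits the child's tail text, so nothing is lost
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


# Hypothetical input: a <desc> element carrying HTML-style markup
doc = ET.fromstring(
    "<item><desc>Intro <b>bold</b> text"
    "<ol><li>one</li><li>two</li></ol></desc></item>"
)
html_fragment = inner_markup(doc.find("desc"))
```

The resulting string still contains the raw tags, which is exactly the state described here: the markup reaches the RTF writer untranslated.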
I am using Python scripts for the conversion, along with ElementTree (to parse the input XML) and PyRTF-NG (to convert from HTML to RTF), a library that handles tables and other special formatting. So far I have managed to get everything I need except the 'markdown' of the HTML, i.e. translating HTML formatting tags into actual RTF formatting. To clarify: if my RTF converter encounters <ol><li> tags, it should create an ordered list in the RTF instead of just writing the literal <ol><li> tags into the output.
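To make the desired behaviour concrete, here is a minimal sketch of such a translation using only the standard library's `html.parser`. It is not how PyRTF-NG works, and it takes a shortcut: list items get literal "1." or bullet prefixes followed by a tab, rather than real RTF list tables, which are considerably more involved. The class and function names are hypothetical, and only a handful of tags are covered.

```python
from html.parser import HTMLParser


class HtmlToRtf(HTMLParser):
    """Translate a small subset of HTML tags into RTF control words."""

    INLINE = {"b": "b", "strong": "b", "i": "i", "em": "i"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.lists = []  # one entry per open list: counter for <ol>, None for <ul>

    @staticmethod
    def escape(text):
        chunks = []
        for ch in text:
            if ch in "\\{}":
                chunks.append("\\" + ch)
            elif ord(ch) < 128:
                chunks.append(ch)
            else:
                # \uN takes signed 16-bit UTF-16 code units; "?" is the fallback
                units = ch.encode("utf-16-le")
                for i in range(0, len(units), 2):
                    n = int.from_bytes(units[i:i + 2], "little", signed=True)
                    chunks.append("\\u%d?" % n)
        return "".join(chunks)

    def handle_starttag(self, tag, attrs):
        if tag in self.INLINE:
            self.out.append("\\%s " % self.INLINE[tag])
        elif tag == "br":
            self.out.append("\\line ")
        elif tag == "ol":
            self.lists.append(0)
        elif tag == "ul":
            self.lists.append(None)
        elif tag == "li":
            if self.lists and self.lists[-1] is not None:
                self.lists[-1] += 1
                self.out.append("%d.\\tab " % self.lists[-1])
            else:
                self.out.append("\\bullet\\tab ")

    def handle_endtag(self, tag):
        if tag in self.INLINE:
            self.out.append("\\%s0 " % self.INLINE[tag])
        elif tag in ("p", "li"):
            self.out.append("\\par\n")
        elif tag in ("ol", "ul") and self.lists:
            self.lists.pop()

    def handle_data(self, data):
        self.out.append(self.escape(data))


def html_to_rtf(html):
    """Wrap the translated body in a minimal RTF header and closing brace."""
    parser = HtmlToRtf()
    parser.feed(html)
    parser.close()
    header = "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Arial;}}\n"
    return header + "".join(parser.out) + "}"
```

Calling `html_to_rtf("<ol><li>one</li><li>two</li></ol>")` yields numbered paragraphs instead of literal tags. Nested lists, indentation, and attributes such as `style` are ignored in this sketch.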
Does anyone know whether Python has any native calls that would let me do this, or whether there are other Python libraries that provide what I need to complete the full conversion to RTF?
Thanks!
Recommended answer
The best free converter is LibreOffice, and it can be used directly from the command line in a terminal; see
libreoffice --convert-to
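From a Python script, the command-line route might be wrapped as in the sketch below. The answer does not spell out the full command, so the target format `rtf` and the `--headless` and `--outdir` flags are assumptions about a typical invocation, and the function names are hypothetical. On some installations the binary is called `soffice` rather than `libreoffice`.

```python
import subprocess
from pathlib import Path


def build_convert_command(src, out_dir, binary="libreoffice"):
    """Build the argument list for a headless LibreOffice conversion to RTF."""
    return [binary, "--headless", "--convert-to", "rtf",
            "--outdir", str(out_dir), str(src)]


def convert_to_rtf(src, out_dir, binary="libreoffice"):
    """Run the conversion and return the path LibreOffice is expected to write."""
    # check=True raises CalledProcessError if LibreOffice exits with an error
    subprocess.run(build_convert_command(src, out_dir, binary), check=True)
    return Path(out_dir) / (Path(src).stem + ".rtf")
```

With LibreOffice installed, `convert_to_rtf("report.html", "out")` should leave the result in `out/report.rtf`. This requires writing the HTML fragment to a file first, since the converter works on files rather than strings.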
The same converter can be called indirectly from Python through the UNO bridge:
- http://api.libreoffice.org/
- http://software.opensuse.org/package/libreoffice-pyuno
- ...