Converting HTML markup to an RTF document

Question

I have an XML document containing embedded HTML content that I am trying to convert to an RTF output file. The XML elements are decorated with <li>, <p>, <b> and other HTML markup, and I would like that formatting carried over into the generated RTF.

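Here is what currently works:

  1. Get the XML tag content as a string (it contains HTML tags for line breaks, paragraph breaks and lists).
  2. Write the XML tag content to the RTF file.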

I am using Python scripts to do the conversion. I also use ElementTree (to parse the input XML) and PyRTF-NG (to convert from HTML to RTF), a library that handles tables and other special formatting. At the moment I have everything I need except the 'markdown' of the HTML, i.e. translating HTML formatting tags into actual RTF formatting. To clarify: if my RTF converter encounters an <ol><li> tag, it should create an ordered list in the RTF instead of just writing the literal <ol><li> tags into the RTF.

Does anyone know whether Python has any native calls that will let me do this, or whether there are other Python libraries that have what I need to complete the full conversion to RTF?

Thanks!

Recommended answer

The best free converter is LibreOffice, and it can be used directly from the command line in a terminal; see

libreoffice --convert-to
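As a rough sketch, the command-line converter could be driven from a Python script with `subprocess`. The function names `build_convert_command` and `convert_to_rtf` are hypothetical, and the example assumes a `libreoffice` binary on the PATH that accepts the `--headless`, `--convert-to` and `--outdir` options. Depending on the LibreOffice version, HTML input may need an explicit output filter name, which is left here as an optional parameter.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(src, out_dir, binary="libreoffice", output_filter=None):
    """Return the argument list for a headless LibreOffice conversion to RTF.

    output_filter is optional; when given it is appended as "rtf:<filter name>".
    """
    target = "rtf" if output_filter is None else f"rtf:{output_filter}"
    return [
        binary,
        "--headless",            # run without opening a window
        "--convert-to", target,  # target extension, optionally with a filter name
        "--outdir", str(out_dir),
        str(src),
    ]


def convert_to_rtf(src, out_dir, binary="libreoffice", output_filter=None):
    """Run the conversion and return the path where the RTF file is expected."""
    if shutil.which(binary) is None:
        raise FileNotFoundError(f"{binary!r} was not found on PATH")
    command = build_convert_command(src, out_dir, binary, output_filter)
    subprocess.run(command, check=True)
    # LibreOffice keeps the input file's base name and swaps the extension.
    return Path(out_dir) / (Path(src).stem + ".rtf")
```

In the asker's workflow, the HTML fragment pulled out of the XML would first be written to a temporary `.html` file, which is then passed as `src`.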

The same converter can be called indirectly from Python through the UNO bridge:

  • http://api.libreoffice.org/
  • http://software.opensuse.org/package/libreoffice-pyuno
  • ...
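A minimal sketch of the UNO route, under several assumptions: the `uno` module that ships with LibreOffice is importable, an office instance is already listening (for example started with `soffice --headless --accept="socket,host=localhost,port=2002;urp;"`), and the export filter is named "Rich Text Format". The helper names `to_file_url` and `convert_with_uno`, the port number and the filter name are illustrative choices, not something the answer specifies, and HTML input may need a different filter depending on how LibreOffice opens it.

```python
from pathlib import Path


def to_file_url(path):
    """Turn a local path into the file:// URL form that UNO expects."""
    return Path(path).resolve().as_uri()


def convert_with_uno(src, dst, host="localhost", port=2002,
                     filter_name="Rich Text Format"):
    """Load src in a running LibreOffice instance and store it as dst."""
    # Imported lazily so this module can be loaded without LibreOffice present.
    import uno
    from com.sun.star.beans import PropertyValue

    def prop(name, value):
        p = PropertyValue()
        p.Name = name
        p.Value = value
        return p

    # Connect to the listening office process through the URL resolver.
    local_ctx = uno.getComponentContext()
    resolver = local_ctx.ServiceManager.createInstanceWithContext(
        "com.sun.star.bridge.UnoUrlResolver", local_ctx)
    ctx = resolver.resolve(
        f"uno:socket,host={host},port={port};urp;StarOffice.ComponentContext")
    desktop = ctx.ServiceManager.createInstanceWithContext(
        "com.sun.star.frame.Desktop", ctx)

    # Open the document hidden, export it, and always close it afterwards.
    doc = desktop.loadComponentFromURL(
        to_file_url(src), "_blank", 0, (prop("Hidden", True),))
    try:
        doc.storeToURL(to_file_url(dst), (prop("FilterName", filter_name),))
    finally:
        doc.close(True)
```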
