Regex for multi-line HTML comments (preg_match_all)


Problem description


I have an html document with multiple commented-out PHP arrays, e.g.:

<!-- Array
(
[key] => 0
)
-->

Using PHP, I need to somehow parse the HTML for only these comments (there are other comments that will need to be ignored) and extract the contents. I've been trying to use preg_match_all but my regex skills aren't up to much. Could anyone point me in the right direction?

Any help is much appreciated!

Solution

Three facts come into play here:

  1. In ordinary HTML markup, a literal "<!--" always starts a comment; anywhere else in the text it would be escaped as "&lt;!--". (Raw-text elements such as <script> and <style> are the exception.)
  2. You don't seem to want to change the document contents, only find bits in it. Search-and-replace has a high probability of breaking the document; search alone does not.
  3. Comments cannot be nested in HTML (unlike normal HTML tags). This makes all the difference.

The above combination means that (lo and behold) regular expressions can be used to identify HTML comments.

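Try this regex: <!-- Array([\s\S]*?)-->. Match group one will contain everything after "Array" up to the closing sequence of the comment. The character class [\s\S] is used instead of a dot so that the match can span line breaks. Note that the quantifier sits inside the parentheses: written as ([\s\S])*?, the group would retain only the last character it matched.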
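
A minimal sketch of how such a pattern might be used with preg_match_all, assuming the HTML is already loaded into a string. The sample document and the variable names ($html, $pattern, $bodies) are made up for illustration. The capturing group is written as ([\s\S]*?), with the quantifier inside the parentheses, so that group one holds the whole comment body.

```php
<?php
// Hypothetical sample input: one ordinary comment and one commented-out array dump.
$html = <<<'HTML'
<p>Some text</p>
<!-- an ordinary comment that should be ignored -->
<!-- Array
(
    [key] => 0
)
-->
<p>More text</p>
HTML;

// [\s\S] matches any character including newlines; the lazy *? stops at the first "-->".
$pattern = '/<!-- Array([\s\S]*?)-->/';

preg_match_all($pattern, $html, $matches);

// $matches[1] holds the captured bodies, one entry per matching comment.
$bodies = array_map('trim', $matches[1]);

foreach ($bodies as $body) {
    echo $body, PHP_EOL;
}
```

Because the pattern requires the literal text "<!-- Array" at the start, comments that begin with anything else are skipped without extra filtering.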

You can apply further sanity checking to the found bits to make sure they are in fact what you are looking for.
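
One possible sanity check, sketched under the assumption that the comments contain flat (non-nested) print_r() dumps: confirm that the captured body is wrapped in parentheses and then read its "[key] => value" lines. The helper name parsePrintRBody is hypothetical, and nested arrays are not handled.

```php
<?php
/**
 * Hypothetical helper: turn the body of a flat print_r() dump into an array,
 * or return null when the text does not look like one.
 */
function parsePrintRBody(string $body): ?array
{
    // The body must be wrapped in parentheses, the way print_r() prints arrays.
    if (!preg_match('/^\s*\(([\s\S]*)\)\s*$/', $body, $outer)) {
        return null;
    }

    // Collect every "[key] => value" line inside the parentheses.
    preg_match_all('/^\s*\[([^\]]+)\] => (.*)$/m', $outer[1], $pairs, PREG_SET_ORDER);

    $result = [];
    foreach ($pairs as $pair) {
        // Values come back as strings; print_r() output carries no type information.
        $result[$pair[1]] = trim($pair[2]);
    }
    return $result;
}

// A body shaped like a print_r() dump is parsed into key/value pairs.
var_dump(parsePrintRBody("\n(\n    [key] => 0\n)\n"));

// Text that merely started with "Array" but is not a dump is rejected.
var_dump(parsePrintRBody(" just an ordinary comment "));
```

Returning null for anything that does not match lets the caller drop false positives, such as an unrelated comment that happens to begin with the word "Array".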
