A question about non-capturing groups in regular expressions
Posted: 2010-09-10 09:12:46 | Category: Web Development -> JavaScript
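I'm reading 《JavaScript高级程序设计》 (Professional JavaScript for Web Developers), published by 人民邮电出版社 (Posts & Telecom Press).
In section 7.3.4, on non-capturing groups, the book says that a non-capturing group does not create a backreference, and gives this example: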
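A minimal sketch of the behavior being described, not the book's listing: the input string, the patterns, and the variable names below are hypothetical, chosen only to show that a pattern whose only group is non-capturing leaves nothing in group 1.

```javascript
// Hypothetical input and patterns, for illustration only.
const sample = "#123456789";
const reNonCapturing = /#(?:\d+)/; // groups the digits, stores nothing
const reCapturing = /#(\d+)/;      // groups the digits and stores them as group 1

const m1 = reNonCapturing.exec(sample);
const m2 = reCapturing.exec(sample);

console.log(m1[0]);     // whole match: the non-capturing group still matched the digits
console.log(m1.length); // 1: only the whole match, no numbered group
console.log(m2[1]);     // "123456789": the capturing version keeps the digits

// RegExp.$1 is a legacy static property; after a successful match
// that has no capturing groups it holds the empty string.
reNonCapturing.test(sample);
const legacyFirst = RegExp.$1;
console.log(JSON.stringify(legacyFirst)); // ""
```

The match result array is the more portable way to inspect groups; RegExp.$1 appears here only because the question uses it.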
There the alert outputs an empty string. Then in chapter 8, code similar to the following appears:
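As a hedged sketch of the chapter 8 case: the user-agent string and variable names are assumptions, while the two patterns follow the ones quoted in this thread. The nested non-capturing group still takes part in the match, so the text it matches ends up inside group 1.

```javascript
// Hypothetical user-agent fragment; only the "rv:0.9.4" part matters here.
const userAgent = "Mozilla/5.0 (Windows; U; rv:0.9.4) Gecko/20011128";

// Outer group captures; the optional third number is grouped without capturing.
const reWithNonCapturing = /rv:(\d+\.\d+(?:\.\d+)?)/;
// Same pattern with an ordinary (capturing) inner group.
const reAllCapturing = /rv:(\d+\.\d+(\.\d+)?)/;

const a = reWithNonCapturing.exec(userAgent);
const b = reAllCapturing.exec(userAgent);

console.log(a[1], a.length); // group 1 is "0.9.4"; there is no group 2
console.log(b[1], b[2]);     // group 1 is "0.9.4"; group 2 is ".4"
```

Both patterns yield the same group 1, which is why the all-capturing version appears to work just as well; the difference is only whether a separate group 2 is stored.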
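Here the alert outputs 0.9.4, and the trailing .4 is what the non-capturing group matched. This is what puzzles me: why can't the earlier non-capturing group capture anything, while the nested non-capturing group can? Also, it looks as if the same result could be had without a non-capturing group at all, so why use one?

Because RegExp.$1 reads the first capturing group, and .4 is contained in that first group. The second group, the one that matches .4, is non-capturing, so reading RegExp.$2 will not give you .4. With
var reNumbers = /rv:(\d+\.\d+(\.\d+)?)/;
on the other hand, RegExp.$2 is .4.

So you mean RegExp.$1 effectively captures (\d+\.\d+(\.\d+)?), and the non-capturing (?:\.\d+) only affects RegExp.$2? (?:) just means the group does not go into the backreferences, right? I had assumed that when (\d+\.\d+(?:\.\d+)?) is matched, the backreference becomes a reference to (\d+\.\d+) only.

(?:\.\d+) is a non-capturing group. The value it matches is not kept in memory, so there should be no $2 among RegExp's groups, and RegExp.$2 comes back empty. As for backreferences: in (\d+\.\d+(?:\.\d+)?)--\1, the \1 refers back to the first capturing group and matches a string that is identical before and after the --. (\d+\.\d+(?:\.\d+)?)--\2 is an error: the second group is non-capturing, nothing is stored for it, and so the reference fails.

That way of putting it is wrong. To understand non-capturing groups you first need to understand capturing groups, and then why non-capturing groups exist.

Put simply, a capturing group saves whatever (Expression) matched into a group numbered by the order in which its "(" appears, so that it can be referenced later. It can be referenced through a backreference, through RegExp.$number, and so on; different languages support different ways of referencing it.

Any use of () is a capturing group by default, and that creates a problem: sometimes you have to use () but do not care what it matched. Take a regex for a 24-hour HH:mm:ss time:
([01][0-9]|2[0-3])(:([0-5][0-9])){2}
Usually you only care about the time as a whole, not its parts. The side effect is that content you do not care about gets stored separately in memory, which only wastes resources and lowers efficiency.

Non-capturing groups exist to cancel out this side effect. A non-capturing group takes part in matching but does not capture what it matched into a group. It therefore does not take part in the numbering at all, so it makes no sense to ask which $number it corresponds to. When you read a capturing group whose number does not exist, some languages return an empty string and others throw an exception.

In (\d+\.\d+(?:\.\d+)?), the whole thing is one capturing group, numbered 1 by the order in which its "(" appears. (?:\.\d+) is a non-capturing group, but it still takes part in matching; it just does not store its result in a group of its own.

One more point: in the vast majority of languages, the match of the whole regular expression corresponds to $0, and capturing groups are numbered from 1. Some languages also support the named capturing group syntax (?<name>Expression), so there are two kinds of syntax that create capturing groups:
(Expression)
(?<name>Expression)
Character sequences defined by the other (?...) syntaxes are not capturing groups.

Non-Capturing Parentheses: (?:…)
In Figure 2-3, we use the parentheses of the (\.[0-9]*)? part for their grouping property, so we could apply a question mark to the whole of \.[0-9]* and make it optional. Still, as a side effect, text matched within these parentheses is captured and saved to $2, which we don't use. Wouldn't it be better if there were a type of parentheses that we could use for grouping which didn't involve the overhead (and possible confusion) of capturing and saving text to a variable that we never intend to use? Perl, and recently some other regex flavors, do provide a way to do this.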
Rather than using (…), which group and capture, you can use the special notation (?:…), which group but do not capture. With this notation, the "opening parentheses" is the three-character sequence (?:, which certainly looks odd. This use of '?' has no relation to the "optional" ? metacharacter. (Peek ahead to page 90 for a note about why this odd notation was chosen.) So, the whole expression becomes:
if ($input =~ m/^([-+]?[0-9]+(?:\.[0-9]*)?)([CF])$/)
Now, even though the parentheses surrounding [CF] are ostensibly the third set, the text they match goes to $2 since, for counting purposes, the (?:…) set doesn't, well, count.

The benefits of this are twofold. One is that by avoiding the unnecessary capturing, the match process is more efficient (efficiency is something we'll look at in great detail in Chapter 6). Another is that, overall, using exactly the type of parentheses needed for each situation may be less confusing later to someone reading the code who might otherwise be left wondering about the exact nature of each set of parentheses.

On the other hand, the (?:…) notation is somewhat unsightly, and perhaps makes the expression more difficult to grasp at a glance. Are the benefits worth it? Well, personally, I tend to use exactly the kind of parentheses I need, but in this particular case, it's probably not worth the confusion. For example, efficiency isn't really an issue since the match is done just once (as opposed to being done repeatedly in a loop). Throughout this chapter, I'll tend to use (…) even when I don't need their capturing, just for their visual clarity.

Source: 代码秀 (http://www.39g.com); see that site for the original post.
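The numbering rule from the replies above can be checked with a short sketch. It uses the HH:mm:ss pattern from the thread; the sample time string and variable names are hypothetical, and the named-group line assumes a JavaScript engine with ES2018 named capture groups.

```javascript
// Hypothetical sample input for the HH:mm:ss pattern from the thread.
const time = "23:59:07";

// All-capturing version: groups are numbered by the order of their "(".
const reCapturing = /([01][0-9]|2[0-3])(:([0-5][0-9])){2}/;
// Same match, but nothing is stored besides the whole match.
const reNonCapturing = /(?:[01][0-9]|2[0-3])(?::[0-5][0-9]){2}/;
// Named capturing group for the hours; the rest stays non-capturing.
const reNamed = /(?<hours>[01][0-9]|2[0-3])(?::[0-5][0-9]){2}/;

const c = reCapturing.exec(time);
const n = reNonCapturing.exec(time);
const h = reNamed.exec(time);

console.log(c[0], c.length);       // whole match plus three numbered groups
console.log(c[1], c[2], c[3]);     // a repeated group keeps only its last iteration
console.log(n[0], n.length);       // same whole match, no numbered groups
console.log(h.groups.hours, h[1]); // a named group is still numbered as group 1
```

Index 0 of each result plays the role of $0 in the reply above, and the non-capturing groups never appear in the numbering.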
