A Bug in .NET Regular Expressions

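I have found another .NET bug! While using regular expressions recently, I noticed that with case-insensitive matching enabled, a pattern meant to match every character from 0xff to 0xffff also matches two ASCII characters: i (code 0x69) and I (code 0x49). It still does not match any other ASCII letter or digit.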

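For example, the code below tests a regular expression that matches the characters from 0xff to 0xffff. No character whose value lies between 0 and 0xfe should be matched.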

    using System.Diagnostics;
    using System.Text.RegularExpressions;

    Regex regex = new Regex(@"[\u00FF-\uFFFF]+");
    // Characters whose value is smaller than 0xff are not expected to be matched.
    for (int i = 0; i < 0xff; i++) {
        string s = new string(new char[] { (char)i });
        Debug.Assert(
            !regex.IsMatch(s),
            string.Format("The character was not expected to be matched: 0x{0:X}!", i));
    }
    // However, characters whose value is greater than 0xfe are expected to be matched.
    for (int i = 0xff; i <= 0xffff; i++) {
        string s = new string(new char[] { (char)i });
        Debug.Assert(
            regex.IsMatch(s),
            string.Format("The character was expected to be matched: 0x{0:X}!", i));
    }
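With this version the result is as expected: no assertion fails.

However, the result is different once case-insensitive matching is turned on. Change the line that creates the regex in the code above to: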

    Regex regex = new Regex(@"[\u00FF-\uFFFF]+", RegexOptions.IgnoreCase);
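Now two assertions fail when the program runs, for the character values 73 and 105, that is, uppercase I and lowercase i. This bug is very odd, because every other character behaves correctly! The same bug also shows up when the equivalent JavaScript runs in IE (version 6.0), for example with the code below once the case-insensitive pattern is used. In Firefox, however, it runs without any problem. Firefox is still the better one, haha!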

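    // Case-sensitive pattern; to reproduce the problem, use the commented-out line with the i flag instead.
    var re = /[\u00FF-\uFFFF]+/;
    // var re = /[\u00FF-\uFFFF]+/i;
    // Characters below 0xff are not expected to be matched.
    for (var i = 0; i < 0xff; i++) {
        var s = String.fromCharCode(i);
        if (re.test(s)) {
            alert('Should not be matched: ' + i + '!');
        }
    }
    // Characters from 0xff to 0xffff are expected to be matched.
    for (var i = 0xff; i <= 0xffff; i++) {
        var s = String.fromCharCode(i);
        if (!re.test(s)) {
            alert('Should be matched: ' + i + '!');
        }
    }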
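
To compare engines more easily, the two loops can be folded into one helper that collects the matching character codes instead of raising one alert per failure. This is only a sketch and not part of the original test: the name findMatches and its parameters are my own, and console.log assumes an environment that provides it (a current browser console or Node.js, not IE 6).

```javascript
// Hypothetical helper (not from the original test): returns the character
// codes in [from, to] whose single-character string is matched by `re`.
function findMatches(re, from, to) {
    var matched = [];
    for (var i = from; i <= to; i++) {
        // Reset lastIndex so a regex created with the g or y flag starts at 0 every time.
        re.lastIndex = 0;
        if (re.test(String.fromCharCode(i))) {
            matched.push(i);
        }
    }
    return matched;
}

var re = /[\u00FF-\uFFFF]+/;
var reIgnoreCase = /[\u00FF-\uFFFF]+/i;

// Codes in 0..0xfe that match. The original test expects both lists to be empty;
// an engine showing the behaviour described above would list 73 and 105 in the second one.
var hits = findMatches(re, 0, 0xfe);
var hitsIgnoreCase = findMatches(reIgnoreCase, 0, 0xfe);

console.log('case-sensitive:', hits);
console.log('ignore-case:', hitsIgnoreCase);
```

Returning the list rather than alerting makes it easy to paste the same snippet into several browsers and compare the output side by side.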

Time: 2024-11-29 13:25:27
