介绍一组中文处理工具函数

函数|中文

<?
/*    中文处理工具函数
--- 空格 ---
      string GBspace(string) --------- 每个中文字之间加空格
      string GBunspace(string) ------- 每个中文字之间的空格清除
      string clear_space(string) ------- 用来清除多余的空格

--- 转换 ---
      string GBcase(string,offset) --- 将字符串内的中英文字转换大小写
                              offset : "upper"   - 字符串全转为大写 (strtoupper)
                                       "lower"   - 字符串全转为小写 (strtolower)
                                       "ucwords" - 将字符串每个字第一个字母改大写 (ucwords)
                                       "ucfirst" - 将字符串第一个字母改大写 (ucfirst)
      string GBrev(string) ----------- 颠倒字符串

--- 文字检查 ---
      int GB_check(string) ----------- 检查字符串内是否有 GB 字,有会返回 true,
                                         否则会返回false
      int GB_all(string) ------------- 检查字符串内所有字是否有 GB 字,是会返回 true,
                                         否则会返回false
      int GB_non(string) ------------- 检查字符串内所有字并不是 GB 字,是会返回 true,
                                         否则会返回false
      int GBlen(string) -------------- 返回字符串长度(中文字只计一字母)

--- 查找、取代、提取 ---
      int/array GBpos(haystack,needle,[offset]) ---- 查找字符串 (strpos)
                              offset : 留空 - 查找第一个出现的位置
                                       int  - 由该位置搜索出现的第一个位置
                                       "r"  - 查找最后一次出现的位置 (strrpos)
                                       "a"  - 将所有查找到的字储存为数组(返回 array)

      string GB_replace(needle,str,haystack) -- 查找与取代字符串 (str_replace)
      string GB_replace_i(needle,str_f,str_b,haystack) -- 不检查大小写查找与取代字符串
                                         needle - 查找字母
                                         str - 取代字母 ( str_f - 该字母前, str_b 该字母后)
                                         haystack - 字符串

      string GBsubstr(string,start,[length]) -- 从string提取出由开始到结尾或长度
                                                  length的字符串。
                                                  中文字只计一字母,可使用正负数。
      string GBstrnear(string,length)         -- 从 string提取最接近 length的字符串。
                                                   length 中 中文字计2个字母。

--- 注意 ---
      如使用由 Form 返回的字符串前,请先替字符串经过 stripslashes() 处理,除去多余的 \ 。

      用法:在原 PHP 代码内加上:
      include ("GB.inc");
      即可使用以上工具函数。
*/

function GBlen($string) {
    $l = strlen($string);
    $ptr = 0;
    $a = 0;
    while ($a < $l) {
        $ch = substr($string,$a,1);
        $ch2 = substr($string,$a+1,1);
        if (ord($ch) >= HexDec("0x81") && ord($ch2) >= HexDec("0x40")) {
            $ptr++;
            $a += 2;
        } else {
            $ptr++;
            $a++;
        } // END IF
    } // END WHILE

    return $ptr;
}

function GBsubstr($string,$start,$length) {
    if (!is_int($length) && $length != "") {
        return "错误:length 值错误(必须为数值)。<br>";
    } elseif ($length == "0") {
        return "";
    } else {
    $l = strlen($string);
    $a = 0;
    $ptr = 0;
    $str_list = array();
    $str_list2 = array();
    while ($a < $l) {
        $ch = substr($string,$a,1);
        $ch2 = substr($string,$a+1,1);
        if (ord($ch) >= HexDec("0x81") && ord($ch2) >= HexDec("0x40")) {
            $str_list[$ptr] = $a;
            $str_list2[$ptr] = $a+1;
            $ptr++;
            $a += 2;
        } else {
            $str_list[$ptr] = $a;
            $str_list2[$ptr] = $a;
            $ptr++;
            $a++;
        } // END IF
    } // END WHILE

    if ($start > $ptr || -$start > $ptr) {
        return;
    } elseif ($length == "") {
        if ($start >= 0) { // (text,+)
            return substr($string,$str_list[$start]);
        } else { // (test,-)
            return substr($string,$str_list[$ptr + $start]);
        }
    } else {

        if ($length > 0) { // $length > 0

            if ($start >= 0) {  // (text,+,+)
                if (($start + $length) >= count($str_list2)) {
                    return substr($string,$str_list[$start]);
                } else { //(text,+,+)
                    $end = $str_list2[$start + ($length - 1)] - $str_list[$start] +1;
                    return substr($string,$str_list[$start],$end);
                }

            } else { // (text ,-,+)
                $start = $ptr + $start;
                if (($start + $length) >= count($str_list2)) {
                    return substr($string,$str_list[$start]);
                } else {
                    $end = $str_list2[$start + ($length - 1)] - $str_list[$start] +1;
                    return substr($string,$str_list[$start],$end);
                }
            }

        } else { // $length < 0
            $end = strlen($string) - $str_list[$ptr+$length];
            if ($start >= 0) {  // (text,+,-) {
                return substr($string,$str_list[$start],-$end);
            } else { //(text,-,-)
                $start = $ptr + $start;
                return substr($string,$str_list[$start],-$end);
            }

        } // END OF LENGTH > / < 0

    }
    } // END IF
}

function GB_replace($needle,$string,$haystack) {
    $l = strlen($haystack);
    $l2 = strlen($needle);
    $l3 = strlen($string);
    $news = "";
    $skip = 0;
    $a = 0;
    while ($a < $l) {
        $ch = substr($haystack,$a,1);
        $ch2 = substr($haystack,$a+1,1);
        if (ord($ch) >= HexDec("0x81") && ord($ch2) >= HexDec("0x40")) {
            if (substr($haystack,$a,$l2) == $needle) {
                $news .= $string;
                $a += $l2;
            } else {
                $news .= $ch.$ch2;
                $a += 2;
            }
        } else {
            if (substr($haystack,$a,$l2) == $needle) {
                $news .= $string;
                $a += $l2;
            } else {
                $news .= $ch;
                $a++;
            }
        } // END IF
    } // END WHILE
    return $news;
}

function GB_replace_i($needle,$str_f,$str_b,$haystack) {

    $l = strlen($haystack);
    $l2 = strlen($needle);
    $l3 = strlen($string);
    $news = "";
    $skip = 0;
    $a = 0;
    while ($a < $l) {
        $ch = substr($haystack,$a,1);
        $ch2 = substr($haystack,$a+1,1);
        if (ord($ch) >= HexDec("0x81") && ord($ch2) >= HexDec("0x40")) {
            if (GBcase(substr($haystack,$a,$l2),"lower") == GBcase($needle,"lower")) {
                $news .= $str_f . substr($haystack,$a,$l2) . $str_b;
                $a += $l2;
            } else {
                $news .= $ch.$ch2;
                $a += 2;
            }
        } else {
            if (GBcase(substr($haystack,$a,$l2),"lower") == GBcase($needle,"lower")) {
                $news .= $str_f . substr($haystack,$a,$l2) . $str_b;
                $a += $l2;
            } else {
                $news .= $ch;
                $a++;
            }
        } // END IF
    } // END WHILE
    return $news;
}

function GBpos($haystack,$needle,$offset) {
    if (!is_int($offset)) {
        $offset = strtolower($offset);
        if ($offset != "" && $offset != "r" && $offset != "a") {
            return "错误:offset 值错误。<br>";
        }
    }
    $l = strlen($haystack);
    $l2 = strlen($needle);
    $found = false;
    $w = 0; // WORD
    $a = 0; // START

    if ($offset == "" || $offset == "r") {
        $atleast = 0;
        $value = false;
    } elseif ($offset == "a") {
        $value = array();
        $atleast = 0;
    } else {
        $value = false;
        $atleast = $offset;
    }
    while ($a < $l) {
        $ch = substr($haystack,$a,1);
        $ch2 = substr($haystack,$a+1,1);
        if (ord($ch) >= HexDec("0x81") && ord($ch2) >= HexDec("0x40") && $skip == 0) {
            if (substr($haystack,$a,$l2) == $needle) {
                if ($offset == "r") {
                    $found = true;
                    $value = $w;
                } elseif ($offset == "a") {
                    $found = true;
                    $value[] = $w;
                } elseif (!$value) {
                    if ($w >= $atleast) {
                        $found = true;
                        $value = $w;
                    }
                }
            }
            $a += 2;
        } else {
            if (substr($haystack,$a,$l2) == $needle) {
                if ($offset == "r") {
                    $found = true;
                    $value = $w;
                } elseif ($offset == "a") {
                    $found = true;
                    $value[] = $w;
                } elseif (!$value) {
                    if ($w >= $atleast) {
                        $found = true;
                        $value = $w;
                    }
                }
            }
            $a++;
        }
        $w++;
    } // END OF WHILE
    if ($found) {
        return $value;
    } else {
        return $false;
    }
//    } // END OF WHILE

}

function GBrev($text) {
    $news = "";
    $l = strlen($text);
    $GB = 0;
    $a = 0;
    while ($a < $l) {
        $ch = substr($text,$a,1);
        $ch2 = substr($text,$a+1,1);
        if (ord($ch) >= HexDec("0x81") && ord($ch2) >= HexDec("0x40") && $skip == 0) {
            $a += 2;
            $news = $ch . $ch2 . $news;
        } else {
            $news = $ch . $news;
            $a++;
        }
    }
    return $news;
}

function GB_check($text) {
    $l = strlen($text);
    $a = 0;
    while ($a < $l) {
        $ch = substr($text,$a,1);
        $ch2 = substr($text,$a+1,1);
        if (ord($ch) >= HexDec("0x81") && ord($ch2) >= HexDec("0x40")) {
            return true;
        } else {
            return false;
        }
    }
}

function GB_all ($text) {
    $l = strlen($text);
    $all = 1;
    $a = 0;
    while ($a < $l) {
        $ch = substr($text,$a,1);
        $ch2 = substr($text,$a+1,1);
        if (ord($ch) >= HexDec("0x81") && ord($ch2) >= HexDec("0x40")) {
            $a += 2;
        } else {
            $a++;
            $all = 0;
        }
    }
    if ($all == 1) {
        return true;
    } else {
        return false;
    }
}

function GB_non ($text) {
    $l = strlen($text);
    $all = 1;
    $a = 0;
    while ($a < $l) {
        $ch = substr($text,$a,1);
        $ch2 = substr($text,$a+1,1);
        if (ord($ch) >= HexDec("0x81") && ord($ch2) >= HexDec("0x40")) {
            $a += 2;
            $all = 0;
        } else {
            $a++;
        }
    }
    if ($all == 1) {
        return true;
    } else {
        return false;
    }
}

function GBcase ($text,$case) {
    $case = strtolower($case);
    if ($case != "upper" && $case != "lower" && $case != "ucwords" && $case != "ucfirst") {
        return "函数用法错误。 $case";
    } else {
    $ucfirst = 0;
    $ucwords = 0;
    $news = "";
    $l = strlen($text);
    $GB = 0;
    $english = 0;

    $a = 0;
    while ($a < $l) {

    $ch = substr($text,$a,1);
    if ($GB == 0 && ord($ch) >= HexDec("0x81")) {

            $GB = 1;
            $english = 0;
            $news .= $ch;
            $ucwords = 0;
    
    } elseif ($GB == 1 && ord($ch) >= HexDec("0x40") && $english == 0) {
            $news .= "$ch";    
            $ucwords = 0;
            $GB = 0;

    } else {
        if ($case == "upper") {
            $news .= strtoupper($ch);
        } elseif ($case == "lower") {
            $news .= strtolower($ch);
        } elseif ($case == "ucwords") {
            if ($ucwords == 0) {
                $news .= strtoupper($ch);
            } else {
                $news .= strtolower($ch);
            }
            $ucwords = 1;
        } elseif ($case == "ucfirst") {
            if ($ucfirst == 0) {
                $news .= strtoupper($ch);
                $ucfirst = 1;
            } else {
                $news .= strtolower($ch);
                $ucfirst = 1;
            }
        } else {
            $news .= $ch;
        }
        if ($ch == " " || $ch == "\n") {
            $ucwords = 0;
        }
        $english = 1;
        $GB = 0;
    
    }

    $a++;
    
    } // END OF while
    return $news;
    } // end else
}

function GBspace ($text) {

    $news = "";
    $l = strlen($text);
    $GB = 0;
    $english = 0;

    $a = 0;
    while ($a < $l) {

    $ch = substr($text,$a,1);
    $ch2 = substr($text,$a+1,1);
    if (!($ch == " " && $ch2 == " ")) {
    if ($GB == 0) {
        if (ord($ch) >= HexDec("0x81")) {

            if ($english == 1) {
                if ((substr($text,$a-1,1) == " ") || (substr($text,$a-1,1) == "\n")) {
                    $news .= "$ch";
                } else {
                    $news .= " $ch";
                }
                $english = 0;
                $GB = 1;
            } else {
                $GB = 1;
                $english = 0;
                $news .= $ch;
            }
        } else {
            $english = 1;
            $GB = 0;
            $news .= $ch;
        }
        
    } else {
        if (ord($ch) >= HexDec("0x40")) {
            if ($english == 0) {
                if ((substr($text,$a+1,1) == " ")|| (substr($text,$a+1,1) == "\n")) {
                    $news .= "$ch";    
                } else {
                    $news .= "$ch ";
                }
            } else {
                $news .= " $ch";
            }
        } else {
            $english = 1;
            $news .= "$ch";
        }
        $GB = 0;
    }
    }
    $a++;
    } // END OF while

    // Chk 1 & last is space

    $l = strlen($news);
    if (substr($news,0,1) == " ") {
        $news = substr($news,1);
    }
    $l = strlen($news);
    if (substr($news,$l-1,1) == " ") {
        $news = substr($news,0,$l-1);
    }
    return $news;
}

function GBunspace($text) {
    $news = "";
    $l = strlen($text);
    $a = 0;
    $last_space = 1;
    while ($a < $l) {
    
        $ch = substr($text,$a,1);
        $ch2 = substr($text,$a+1,1);
        $ch3 = substr($text,$a+2,1);
        if (($a + 1) == $l ) {
            $last_space = 1;
        }
        if ($ch == " ") {
            if ($last_space == 0) {
            if (ord($ch2) >= HexDec("0x81") && ord($ch3) >= HexDec("0x40")) {
                if ($chi == 0) {
                    $news .= " ";
                    $last_space = 1;
                }
                $chi=1;

            } elseif ($ch2 != " ") {
                $news .= " ";
                $chi = 0;
                $last_space = 1;
            }
            }
        } else {
            if (ord($ch) >= HexDec("0x81") && ord($ch2) >= HexDec("0x40")) {
                $chi = 1;
                $a++;
                $news .= $ch . $ch2;
                $last_space = 0;

            } else {
                $chi = 0;
                $news .= $ch;
                $last_space = 0;
            }

        }
    $a++;
    }
    // Chk 1 & last is space

    $l = strlen($news);
    if (substr($news,0,1) == " ") {
        $news = substr($news,1);
    }
    $l = strlen($news);
    if (substr($news,$l-1,1) == " ") {
        $news = substr($news,0,$l-1);
    }
    return $news;

} // END OF Function

function GBstrnear($text,$length) {

$tex_len = strlen($text);
$a = 0;
$w = "";
while ($a < $tex_len) {
        $ch = substr($text,$a,1);
        $ch2 = substr($text,$a+1,1);
        if (GB_all($ch.$ch2)) {
            $w .= $ch.$ch2;
            $a=$a+2;
        } else {
            $w .= $ch;
            $a++;
        }
        if ($a == $length || $a == ($length - 1)) {
            $a = $tex_len;
        }
}
return $w;
} // END OF FUNCTION

function clear_space($text) {
    $t = "";
    for ($a=0;$a<strlen($text);$a++) {
        $ch = substr($text,$a,1);
        $ch2 = substr($text,$a+1,1);
        if ($ch == " " && $ch2 == " ") {
        } else {
            $t .= $ch;
        }
    }
    return $t;
}

?>

时间: 2024-12-07 21:43:10

介绍一组中文处理工具函数的相关文章

中文处理工具函数

<? /*    中文处理工具函数 --- 空格 ---       string GBspace(string) --------- 每个中文字之间加空格       string GBunspace(string) ------- 每个中文字之间的空格清除       string clear_space(string) ------- 用来清除多余的空格 --- 转换 ---       string GBcase(string,offset) --- 将字符串内的中英文字转换大小写   

jQuery之工具函数

  jquery为我们提供了操作数组和对象的工具函数,方便和简化了我们对它们的操作.今天我们就进入jQuery的工具函数的复习.   jQuery给我们提供了主要有5类工具函数: URL 字符串操作 数组和对象操作 测试操作 浏览器 1:URL操作: $.param(obj) 返回 :string: 说明:将jquery对象按照name/value 或者key/value序列化为URL参数,用&连接. 示例: var obj ={name:zh,age:20};  alert(jQuery.pa

从零开始学习jQuery (九) jQuery工具函数_jquery

一.摘要 本系列文章将带您进入jQuery的精彩世界, 其中有很多作者具体的使用经验和解决方案,  即使你会使用jQuery也能在阅读中发现些许秘籍. 我们经常要使用脚本处理各种业务逻辑, 最常见的就是数组和对象的操作. jQuery工具函数为我们操作对象和数组提供了便利条件. 二.前言 大部分人仅仅使用jQuery的选择器选择对象, 或者实现页面动画效果. 在处理业务逻辑时常常自己编写很多算法. 本文提醒各位jQuery也能提高我们操作对象和数组的效率. 并且可以将一些常用算法扩充到jQuer

jQuery 工具函数学习资料_jquery

URL 字符串操作 数组和对象操作 测试操作 浏览器 1:URL操作: $.param(obj) 返回 :string: 说明:将jquery对象按照name/value 或者key/value序列化为URL参数,用&连接. 示例: var obj ={name:zh,age:20}; alert(jQuery.param(obj)); //alert "name=zh&age=20";   2:字符串操作: jQuery.trim(str) 返回:string: 说明

php自定义中文字符串截取函数substr_for_gb2312及substr_for_utf8示例_php技巧

本文实例讲述了php自定义中文字符串截取函数substr_for_gb2312及substr_for_utf8用法.分享给大家供大家参考,具体如下: /* *gb2312中文字符串截取 */ function substr_for_gb2312($str,$start,$len=null) { $totlelength = strlen($str); //特例情况 if ($len == null) $len = $totlelength; if ($len ==0) return ""

js实现的光标位置工具函数示例_javascript技巧

本文实例讲述了js实现的光标位置工具函数.分享给大家供大家参考,具体如下: 这里介绍的一款textarea中光标位置工具函数的例子. html代码: <p>文本框:</p> <textarea name="" id="textarea" cols="30" rows="10"> sASASADASDasfaDFDsfsDFAfdFADf </textarea> <butto

我的 Java 工具函数汇总

源码在:http://git.oschina.net/sp42/ajaxjs/blob/master/ajaxjs-base/src/com/ajaxjs/util/ 字符串工具函数 是否空字符串 assertTrue(isEmptyString("")); assertTrue(isEmptyString(" ")); assertTrue(isEmptyString(null)); Java String 有 split 却没有 join,这里实现一个 asse

以前编写JSP网站时写的一些工具函数

js|函数 初学JSP时,写了一些工具函数因为不太会用JAVA下的正则表达式也只能这么写啦!发出来让大家批评批评提点意见!有几个函数不算是自己写的希望爱挑剌的朋友嘴下留情!我是新手我怕谁,脸皮不行的人水平也上不去呀.嘻嘻.. package mxzc.web.strctrl;public class StringCtrl{/********************************************public synchronized String HTMLcode(String

一个阿拉伯数字转中文数字的函数

函数|中文 最近因需要,写了个"阿拉伯数字转中文数字的函数".搜索了精华区只见到一个类似的.感觉到我的算法不错,所以贴出来共享一下如果要用于金额的转换,对小数部分的处理要做一下修改<?phpfunction ch_num($num,$mode=true) { $char = array("零","壹","贰","叁","肆","伍","陆",