本节主要内容:
php中文支持函数:中文截取字符串、中文字符串翻转。
例1,
复制代码 代码示例:
<?php
/**
* 将一个字串中含有全角的数字字符、字母、空格或'%+-()'字符转换为相应半角字符
*
* @access public
* @param string $str 待转换字串
*
* @return string $str 处理后字串
*/
function make_semiangle($str)
{
$arr = array('0' => '0', '1' => '1', '2' => '2', '3' => '3', '4' => '4',
'5' => '5', '6' => '6', '7' => '7', '8' => '8', '9' => '9',
'A' => 'A', 'B' => 'B', 'C' => 'C', 'D' => 'D', 'E' => 'E',
'F' => 'F', 'G' => 'G', 'H' => 'H', 'I' => 'I', 'J' => 'J',
'K' => 'K', 'L' => 'L', 'M' => 'M', 'N' => 'N', 'O' => 'O',
'P' => 'P', 'Q' => 'Q', 'R' => 'R', 'S' => 'S', 'T' => 'T',
'U' => 'U', 'V' => 'V', 'W' => 'W', 'X' => 'X', 'Y' => 'Y',
'Z' => 'Z', 'a' => 'a', 'b' => 'b', 'c' => 'c', 'd' => 'd',
'e' => 'e', 'f' => 'f', 'g' => 'g', 'h' => 'h', 'i' => 'i',
'j' => 'j', 'k' => 'k', 'l' => 'l', 'm' => 'm', 'n' => 'n',
'o' => 'o', 'p' => 'p', 'q' => 'q', 'r' => 'r', 's' => 's',
't' => 't', 'u' => 'u', 'v' => 'v', 'w' => 'w', 'x' => 'x',
'y' => 'y', 'z' => 'z',
'(' => '(', ')' => ')', '〔' => '[', '〕' => ']', '【' => '[',
'】' => ']', '〖' => '[', '〗' => ']', '“' => '[', '”' => ']',
'‘' => '[', '’' => ']', '{' => '{', '}' => '}', '《' => '<',
'》' => '>',
'%' => '%', '+' => '+', '—' => '-', '-' => '-', '~' => '-',
':' => ':', '。' => '.', '、' => ',', ',' => '.', '、' => '.',
';' => ',', '?' => '?', '!' => '!', '…' => '-', '‖' => '|',
'”' => '"', '’' => '`', '‘' => '`', '|' => '|', '〃' => '"',
' ' => ' ','$'=>'$','@'=>'@','#'=>'#','^'=>'^','&'=>'&','*'=>'*');
return strtr($str, $arr);
}
例2,php中文截取字符串
复制代码 代码示例:
<?php
/*
Utf-8、gb2312都支持的汉字截取函数
cut_str(字符串, 截取长度, 开始长度, 编码);
编码默认为 utf-8
开始长度默认为 0
* edit: www.jb200.com
*/
function cut_str($string, $sublen, $start = 0, $code = 'UTF-8') {
if ($code == 'UTF-8') {
$pa = "/[x01-x7f]|[xc2-xdf][x80-xbf]|xe0[xa0-xbf][x80-xbf]|[xe1-xef][x80-xbf][x80-xbf]|xf0[x90-xbf][x80-xbf][x80-xbf]|[xf1-xf7][x80-xbf][x80-xbf][x80-xbf]/";
preg_match_all ( $pa, $string, $t_string );
if (count ( $t_string [0] ) - $start > $sublen)
return join ( '', array_slice ( $t_string [0], $start, $sublen ) ) . "...";
return join ( '', array_slice ( $t_string [0], $start, $sublen ) );
} else {
$start = $start * 2;
$sublen = $sublen * 2;
$strlen = strlen ( $string );
$tmpstr = '';
for($i = 0; $i < $strlen; $i ++) {
if ($i >= $start && $i < ($start + $sublen)) {
if (ord ( substr ( $string, $i, 1 ) ) > 129) {
$tmpstr .= substr ( $string, $i, 2 );
} else {
$tmpstr .= substr ( $string, $i, 1 );
}
}
if (ord ( substr ( $string, $i, 1 ) ) > 129)
$i ++;
}
if (strlen ( $tmpstr ) < $strlen)
$tmpstr .= "...";
return $tmpstr;
}
}
$str = "abcd需要截取的字符串";
echo cut_str ( $str, 1, 0, 'gb2312' );
?>
例3,字符串的载取
复制代码 代码示例:
<?
/**
* 字符截取 支持UTF8/GBK
* @param $string
* @param $length
* @param $dot
*/
function str_cut($string, $length, $charset = 'utf-8', $dot = '...') {
$strlen = strlen($string);
if($strlen <= $length) return $string;
$string = str_replace(array(' ',' ', '&', '"', ''', '“', '”', '—', '<', '>', '·', '…'), array('∵',' ', '&', '"', "'", '“', '”', '—', '<', '>', '·', '…'), $string);
$strcut = '';
if(strtolower($charset) == 'utf-8') {
$length = intval($length-strlen($dot)-$length/3);
$n = $tn = $noc = 0;
while($n < strlen($string)) {
$t = ord($string[$n]);
if($t == 9 || $t == 10 || (32 <= $t && $t <= 126)) {
$tn = 1; $n++; $noc++;
} elseif(194 <= $t && $t <= 223) {
$tn = 2; $n += 2; $noc += 2;
} elseif(224 <= $t && $t <= 239) {
$tn = 3; $n += 3; $noc += 2;
} elseif(240 <= $t && $t <= 247) {
$tn = 4; $n += 4; $noc += 2;
} elseif(248 <= $t && $t <= 251) {
$tn = 5; $n += 5; $noc += 2;
} elseif($t == 252 || $t == 253) {
$tn = 6; $n += 6; $noc += 2;
} else {
$n++;
}
if($noc >= $length) {
break;
}
}
if($noc > $length) {
$n -= $tn;
}
$strcut = substr($string, 0, $n);
$strcut = str_replace(array('∵', '&', '"', "'", '“', '”', '—', '<', '>', '·', '…'), array(' ', '&', '"', ''', '“', '”', '—', '<', '>', '·', '…'), $strcut);
} else {
$dotlen = strlen($dot);
$maxi = $length - $dotlen - 1;
$current_str = '';
$search_arr = array('&',' ', '"', "'", '“', '”', '—', '<', '>', '·', '…','∵');
$replace_arr = array('&',' ', '"', ''', '“', '”', '—', '<', '>', '·', '…',' ');
$search_flip = array_flip($search_arr);
for ($i = 0; $i < $maxi; $i++) {
$current_str = ord($string[$i]) > 127 ? $string[$i].$string[++$i] : $string[$i];
if (in_array($current_str, $search_arr)) {
$key = $search_flip[$current_str];
$current_str = str_replace($search_arr[$key], $replace_arr[$key], $current_str);
}
$strcut .= $current_str;
}
}
return $strcut.$dot;
}
例4,PHP中文字符串翻转
翻转一个字符串可以使用strrev()函数即可。
但有时需要处理是字符串是含中文的,这样用strrev就会出现乱码。
这里自定义一个函数来处理含中文的字符。
复制代码 代码示例:
<?php
/**
* 中文字符串翻转
* by www.jb200.com
*/
function cstrrev($str)
{
$len = strlen($str);
for($i = 0; $i < $len; $i++)
{
$char = $str{0};
if(ord($char) > 127)
{
$i++;
if($i < $len)
{
$arr[] = substr($str, 0, 2);
$str = substr($str, 2);
}
}
else
{
$arr[] = $char;
$str = substr($str, 1);
}
}
return join(array_reverse($arr));
}
#使用方法:
$str = '中文.look!';
echo cstrrev($str);
#结果输出:!kool.文中
附5,
复制代码 代码示例:
<?php
function str_replace_cn($needle, $str, $haystack, $charset = "utf-8"){
$re['utf-8'] = "/[x01-x7f]|[xc2-xdf][x80-xbf]|[xe0-xef][x80-xbf]{2}|[xf0-xff][x80-xbf]{3}/";
$re['gb2312'] = "/[x01-x7f]|[xb0-xf7][xa0-xfe]/";
$re['gbk'] = "/[x01-x7f]|[x81-xfe][x40-xfe]/";
$re['big5'] = "/[x01-x7f]|[x81-xfe]([x40-x7e]|xa1-xfe])/";
preg_match_all($re[$charset], $haystack, $match_haystack);
preg_match_all($re[$charset], $needle, $match_needle);
for($i = 0; $i < count($match_needle); $i ++){
if(!in_array($match_needle[0][$i], $match_haystack[0]))return $haystack;//无匹配
}
$match_haystack = $match_haystack[0];
$match_needle = $match_needle[0];
for($i = 0; $i < count($match_haystack); $i ++){
if($match_haystack[$i] == "")
continue;
if($match_haystack[$i] == $match_needle[0]){
if(count($match_needle) == 1){//如果只一个字符
$match_haystack[$i] = $str;
}else{
$flag = true;
for($j = 1; $j < count($match_needle); $j ++){
if($match_haystack[$i + $j] != $match_needle[$j]){
$flag = false;
break;
}
}
if($flag){//匹配
$match_haystack[$i] = $str;
for($j = 1; $j < count($match_needle); $j ++){
$match_haystack[$i + $j] = "";
}
}
}
}
}
return implode("", $match_haystack);
}