Batch-removing the BOM header from UTF-8 files with PHP and Python

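Published: 2020-07-31 Editor: 脚本学堂
How do you remove the BOM header from UTF-8 files in bulk? Here are two scripts: a PHP function that detects and removes the BOM, and a Python version that walks a directory tree. Use whichever fits your environment.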

Examples of removing the BOM header from UTF-8 files in bulk.
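
For orientation, the UTF-8 BOM is the three-byte sequence EF BB BF (decimal 239, 187, 191) at the very start of a file, which is what both scripts below test for. A minimal Python sketch of that check follows; the helper name has_utf8_bom is an illustrative assumption and not part of the original scripts.

```python
import codecs

# codecs.BOM_UTF8 is the standard-library constant for the UTF-8 BOM.
assert codecs.BOM_UTF8 == b'\xef\xbb\xbf'
assert list(codecs.BOM_UTF8) == [239, 187, 191]


def has_utf8_bom(data):
    """Return True if the bytes object starts with the UTF-8 BOM.

    Hypothetical helper, for illustration only.
    """
    return data.startswith(codecs.BOM_UTF8)


print(has_utf8_bom(b'\xef\xbb\xbfhello'))  # True
print(has_utf8_bom(b'hello'))              # False
```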

1. PHP version (a function):
 

// Checks whether $filename starts with the UTF-8 BOM (bytes EF BB BF).
// When $auto is true, the BOM is stripped and the file is rewritten in place.
function checkBOM($filename, $auto = true) {
    $contents = file_get_contents($filename);
    if ($contents === false) {
        return "Cannot read file.";
    }
    // Compare the first three bytes against 239, 187, 191.
    if (substr($contents, 0, 3) === "\xEF\xBB\xBF") {
        if ($auto) {
            // Write back everything after the 3-byte BOM.
            file_put_contents($filename, substr($contents, 3));
            return "<font color=red>BOM found, automatically removed.</font>";
        }
        return "<font color=red>BOM found.</font>";
    }
    return "BOM Not Found.";
}

2. A Python version of the script:
 

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import codecs
import os


def delBOM():
    """Strip the UTF-8 BOM from every file under the current directory."""
    file_count = 0
    bom_files = []

    for dirpath, dirnames, filenames in os.walk('.'):
        for filename in filenames:
            file_count += 1
            path = os.path.join(dirpath, filename)
            # Binary mode, so the BOM bytes (239, 187, 191) are compared as raw bytes.
            with open(path, 'r+b') as f:
                file_contents = f.read()
                if file_contents.startswith(codecs.BOM_UTF8):
                    bom_files.append(path)
                    f.seek(0)
                    f.write(file_contents[len(codecs.BOM_UTF8):])
                    # The file is now 3 bytes shorter; drop the leftover tail.
                    f.truncate()
                    print(path, "BOM found. Deleted.")

    print(file_count, "file(s) found.", len(bom_files), "file(s) had a BOM. Deleted.")


if __name__ == "__main__":
    delBOM()
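
The script above rewrites every file under the current directory, including binary files that happen to start with the same three bytes. One possible refinement is sketched below, under stated assumptions: the function name strip_bom_in_tree, the extensions filter, and the dry_run flag are all illustrative and not part of the original script. With dry_run=True it only reports matching files, much like the PHP function when $auto is off.

```python
import codecs
import os


def strip_bom_in_tree(root, extensions=('.php', '.txt'), dry_run=True):
    """Return the paths under root that start with a UTF-8 BOM.

    Only files whose names end with one of `extensions` are inspected.
    When dry_run is False, the BOM is also removed from each file found.
    (Hypothetical helper; names and defaults are assumptions.)
    """
    bom = codecs.BOM_UTF8
    extensions = tuple(extensions)  # str.endswith needs a tuple, not a list
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(extensions):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, 'rb') as f:
                # Only the first three bytes are needed to decide.
                head = f.read(len(bom))
            if head != bom:
                continue
            found.append(path)
            if not dry_run:
                with open(path, 'rb') as f:
                    data = f.read()
                # Reopening in 'wb' truncates, so no stale tail is left behind.
                with open(path, 'wb') as f:
                    f.write(data[len(bom):])
    return found
```

A cautious workflow would call strip_bom_in_tree('.', dry_run=True) first, review the returned list, and only then run it again with dry_run=False.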