Replacing all hyperlinks in a web page with a regular expression
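The code is as follows:
<?php
$content = file_get_contents('test.html');
$url = 'http://www.111cn.net'; // the new URL to substitute
// Match href="..." or href='...'; \1 requires the closing quote to match the opening one
$preg = '/\shref=(["\'])(.*?)\1/i';
$replace = ' href="' . $url . '"';
$content = preg_replace($preg, $replace, $content); // regex replacement
create_log('newhtml', $content); // write the new file
?>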
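The pattern above rewrites every href attribute it finds, including those on tags such as <link>. If only <a> tags should be touched, one possible variant is sketched below. The function name replace_anchor_hrefs and the sample markup are assumptions made for illustration, not part of the original code, and a regular expression remains an approximation of real HTML parsing.

```php
<?php
// Hypothetical helper (not in the original): rewrite href only inside <a ...> tags.
function replace_anchor_hrefs(string $html, string $newUrl): string
{
    // Group 1: everything from "<a" up to and including "href=".
    // Group 2: the opening quote; the backreference \2 requires the same closing quote.
    // Group 3: the old URL, which is discarded.
    $pattern = '/(<a\b[^>]*?\shref\s*=\s*)(["\'])(.*?)\2/is';

    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($newUrl): string {
            // Escape the URL for use in an attribute value and keep the original quote style.
            return $m[1] . $m[2] . htmlspecialchars($newUrl, ENT_QUOTES) . $m[2];
        },
        $html
    );

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}

// Assumed sample markup: the <link> tag must stay untouched.
$sample = '<link rel="stylesheet" href="style.css">'
        . '<a class="x" href="/old">Old</a> <a href=\'/b\'>B</a>';
echo replace_anchor_hrefs($sample, 'http://www.111cn.net'), PHP_EOL;
```

Using a callback instead of a plain replacement string avoids surprises if the new URL ever contains `$` or `\`, which preg_replace would interpret as group references.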
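The file-writing function used above is as follows:
function create_log($filename, $text)
{
    // Append ".html" unless the name already ends with it
    if (strtolower(substr($filename, -5)) != '.html') {
        $filename .= '.html';
    }
    $filename = dirname(__FILE__) . '/' . $filename;
    if (!file_exists($filename)) {
        // Built-in equivalents of the shell commands touch and chmod 777
        touch($filename);
        chmod($filename, 0777);
    }
    $handle = fopen($filename, 'w+b');
    if ($handle === false) {
        return; // could not open the file for writing
    }
    $text .= "\r\n";
    fwrite($handle, $text);
    fclose($handle);
}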
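The same write step can probably be expressed more compactly with file_put_contents, which creates the file when it does not exist. The sketch below is an alternative, not the author's method: the function name write_html_file, the explicit directory parameter, and the use of the system temp directory in the usage lines are all assumptions for illustration.

```php
<?php
// Hypothetical helper (not in the original): write $text to "<dir>/<name>.html".
function write_html_file(string $dir, string $name, string $text): string
{
    // Append ".html" unless the name already ends with it (case-insensitive).
    if (strtolower(substr($name, -5)) !== '.html') {
        $name .= '.html';
    }
    $path = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $name;

    // file_put_contents creates the file if needed; LOCK_EX guards against concurrent writers.
    if (file_put_contents($path, $text . "\r\n", LOCK_EX) === false) {
        throw new RuntimeException("Cannot write to $path");
    }
    return $path;
}

// Assumed usage: write into the system temp directory and print the resulting path.
$path = write_html_file(sys_get_temp_dir(), 'newhtml', '<p>hello</p>');
echo $path, PHP_EOL;
```

Throwing an exception on failure makes a permissions problem visible to the caller instead of silently producing no file.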
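Below is a simple link-collecting routine:
The code is as follows:
$url = 'http://www.111cn.net';
$body = @file_get_contents($url);
if ($body === false) {
    exit('Failed to fetch ' . $url);
}
preg_match_all('/href=[\'"]?([^\'"]*)[\'"]?>(.*)/i', $body, $b);
$nums = array(); // URLs already printed
foreach ($b[1] as $u) {
    if (in_array($u, $nums)) {
        continue; // skip duplicates
    }
    $nums[] = $u;
    $title = strip_tags($u);
    echo $title . "<br>";
}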
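The loop above prints each unique URL. If the link text is wanted as well, which the $title variable suggests may have been the intent, a sketch along the following lines could work. The function name extract_links and the inline sample HTML are assumptions for illustration; the sample replaces the network fetch so the snippet runs on its own.

```php
<?php
// Hypothetical helper (not in the original): map each unique href to its link text.
function extract_links(string $html): array
{
    // Group 1: the quote character; group 2: the href value;
    // group 3: the raw inner HTML of the <a> element.
    $pattern = '/<a\b[^>]*?\shref\s*=\s*(["\'])(.*?)\1[^>]*>(.*?)<\/a>/is';
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    $links = [];
    foreach ($matches as $m) {
        $url = $m[2];
        if (isset($links[$url])) {
            continue; // keep the first occurrence only
        }
        // Drop nested tags such as <b> and surrounding whitespace from the link text.
        $links[$url] = trim(strip_tags($m[3]));
    }
    return $links;
}

// Assumed sample markup standing in for a fetched page.
$html = '<a href="/a">First <b>link</b></a> <a href="/b">Second</a> <a href="/a">Duplicate</a>';
foreach (extract_links($html) as $url => $title) {
    echo $url, ' => ', $title, PHP_EOL;
}
```

Keying the result array by URL makes the duplicate check a constant-time isset lookup rather than an in_array scan over everything seen so far.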
Time: 2024-10-26 07:24:44