Fetching an ASP (.aspx) page with PHP
The page I need to fetch is:
http://lib.gdin.edu.cn/search/searchresult.aspx?ANYWORDS=4444&dt=ALL&cl=ALL&dept=ALL&sf=M_PUB_YEAR&ob=DESC&page=1&dp=20&sm=table
It never succeeds.
I tried file_get_contents(); here is the source code:
<?php
$url='http://lib.gdin.edu.cn/search/searchresult.aspx?ANYWORDS=4444&dt=ALL&cl=ALL&dept=ALL&sf=M_PUB_YEAR&ob=DESC&page=1&dp=20&sm=table';
$lines_string=file_get_contents($url);
echo htmlspecialchars($lines_string);
?>
The result was: Warning: file_get_contents(http://lib.gdin.edu.cn/search/searchresult.aspx?ANYWORDS=4444&dt=ALL&cl=ALL&dept=ALL&sf=M_PUB_YEAR&ob=DESC&page=1&dp=20&sm=table) [function.file-get-contents]: failed to open stream: HTTP request failed! HTTP/1.1 500 Internal Server Error in F:\WWW\library.php on line 5
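The 500 response above still carries a body, and that body often says what the server objected to. A minimal sketch for reading it, assuming the server reacts to missing browser-like headers (not confirmed for this site): build_http_context() and fetch_with_status() are hypothetical helper names, while the 'http' context options used are standard options of PHP's HTTP stream wrapper.

```php
<?php
// Sketch only: build_http_context() and fetch_with_status() are hypothetical helpers.

// Builds a stream context that sends browser-like headers and keeps the body
// of 4xx/5xx responses instead of making file_get_contents() return false.
function build_http_context($userAgent)
{
    return stream_context_create([
        'http' => [
            'method'          => 'GET',
            'user_agent'      => $userAgent,
            'header'          => "Accept: text/html\r\n",
            'timeout'         => 10,     // seconds
            'follow_location' => 1,      // follow 3xx redirects
            'max_redirects'   => 5,
            'ignore_errors'   => true,   // return the error page body too
        ],
    ]);
}

// Returns [status line of the first response, body].
// The status line is null when no HTTP response arrived at all.
function fetch_with_status($url, $userAgent)
{
    $body = @file_get_contents($url, false, build_http_context($userAgent));
    // The HTTP wrapper fills $http_response_header in the calling scope.
    $statusLine = isset($http_response_header[0]) ? $http_response_header[0] : null;
    return [$statusLine, $body];
}
```

Calling fetch_with_status($url, 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)') and printing both values shows whether the server still answers 500 and what its error page says.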
I also tried cURL; here is the source code:
<?php
$url = "http://lib.gdin.edu.cn/search/searchresult.aspx?ANYWORDS=4444&dt=ALL&cl=ALL&dept=ALL&sf=M_PUB_YEAR&ob=DESC&page=1&dp=20&sm=table";
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$contents = curl_exec($ch);
curl_close($ch);
echo $contents;
?>
The result was:
Object moved to here.
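"Object moved to here." is the body that ASP.NET sends with a redirect response, and cURL does not follow redirects unless told to. A minimal sketch that follows them and keeps cookies between hops, assuming the redirect target expects the cookie set by the first response (an assumption, not verified against this site): build_curl_options() and fetch_following_redirects() are hypothetical helper names, and the cookie file path is whatever writable path you pass in.

```php
<?php
// Sketch only: build_curl_options() and fetch_following_redirects() are hypothetical helpers.

// Collects the cURL options in one array so they can be inspected or reused.
function build_curl_options($url, $cookieFile)
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,         // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,         // follow 3xx "Object moved" responses
        CURLOPT_MAXREDIRS      => 5,            // stop on redirect loops
        CURLOPT_COOKIEJAR      => $cookieFile,  // write received cookies here
        CURLOPT_COOKIEFILE     => $cookieFile,  // send stored cookies on later requests
        CURLOPT_USERAGENT      => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)',
        CURLOPT_CONNECTTIMEOUT => 5,
        CURLOPT_TIMEOUT        => 15,
    ];
}

// Performs the request and reports where it ended up.
function fetch_following_redirects($url, $cookieFile)
{
    $ch = curl_init();
    curl_setopt_array($ch, build_curl_options($url, $cookieFile));
    $body = curl_exec($ch);   // false on transport failure
    return [
        'status'    => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'final_url' => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),
        'error'     => curl_error($ch),
        'body'      => $body,
    ];
}
```

If 'final_url' differs from the requested URL, the server redirected the request, and 'status' is the status of the last response. The user agent is set explicitly here because the user_agent setting in php.ini applies to PHP's HTTP stream wrapper, not to cURL.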
I also tried the Snoopy class, and that did not work either.
Fetching a simple PHP page works, for example www.lingxiren.com:
<?php
$url='http://www.lingxiren.com';
$lines_string=file_get_contents($url);
echo htmlspecialchars($lines_string);
?>
I changed allow_url_fopen and also set user_agent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)", and I removed the semicolon in front of user_agent to uncomment it.
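Edits to php.ini only take effect if the running PHP actually loads that file, and usually only after the web server is restarted. A small sketch to check what the running PHP sees; effective_fetch_settings() is a hypothetical helper name, and the functions it calls are standard PHP.

```php
<?php
// Sketch only: effective_fetch_settings() is a hypothetical helper.

// Reports the settings that matter for remote fetching, as seen at runtime.
function effective_fetch_settings()
{
    return [
        'ini_file'        => php_ini_loaded_file(),       // false if no php.ini was loaded
        'allow_url_fopen' => ini_get('allow_url_fopen'),  // "1" when enabled
        'user_agent'      => ini_get('user_agent'),       // used by the HTTP stream wrapper
        'curl_loaded'     => extension_loaded('curl'),
    ];
}

var_dump(effective_fetch_settings());
```

If 'ini_file' is not the file that was edited, or 'user_agent' comes back empty, the change has not reached the running PHP.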
I hope someone can help. If you get it working, please share the code with me; I would be very grateful.