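There are thousands of websites that serve pages full of interesting, even educational, information. You can integrate that information into your own pages, or build a small widget that lets other people seamlessly integrate it into their own content delivery platforms. On his blog, Hasin Hayder shows how to build a "Today in History" widget in PHP by scraping a Scopesys web page.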
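Hasin first stresses that when you scrape data from someone else's pages (web scraping), you must read the copyright terms carefully, and you should never violate copyright law under any circumstances.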
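The widget is written in PHP. It reads the Scopesys "Today in History" page and displays the following kinds of content:
- Major events that happened on this day in history
- People born on this day in history
- People who died on this day in history
- Holidays that fall on this day
- Religious observances held on this day
- Major religious events that happened on this day in history
All right, here is the widget's source code: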
<?php
// todayinhistory.php
// Suppress notices and warnings, for example when a query parameter is missing.
error_reporting(0);
// Each section of the Scopesys page sits between a START marker and an END marker.
define("MARKER_START","<H3>On this day…</h3>");
define("MARKER_END","<BR><BR><HR><h3>Holidays</h3>");
define("BIRTHDAY_START","</font></center></center>");
define("BIRTHDAY_END","<HR> <br><H3>Deaths which occurred on ".date("F d").":</H3>");
define("DEATH_START","<HR> <br><H3>Deaths which occurred on ".date("F d").":</H3>");
define("DEATH_END","<HR><IMG align=left src=\"http://www.scopesys.com/flag.gif\">");
define("HOLIDAYS_START",'<i>Note: Some Holidays are only applicable on a given <b>"day of the week"</b></i><br> <br>');
define("HOLIDAYS_END","<HR> <H3>Religious Observances</H3>");
define("RELIGIOUS_START","<HR> <H3>Religious Observances</H3>");
define("RELIGIOUS_END","<HR> <H3>Religious History </h3>");
define("RELHISTORY_START","<HR> <H3>Religious History </h3>");
define("RELHISTORY_END","<BR><BR><font color=red>");
echo "<h2>Today is ".date("F d, Y")."</h2>";
// Fetch the whole page once; every section below is cut out of $data.
$data = file_get_contents("http://www.scopesys.com/today");
// Stop early if the page could not be loaded, since there is nothing to cut.
if ($data === false) {
die("Could not load the source page.");
}
// Each block below runs only when its query parameter is set to 1.
if ($_GET['history']=='1'){
echo "<br/><h2 style='color: green' >Today in history</h2>";
// The end offset is pulled back 15 characters, as in the original script.
$end = strpos($data,MARKER_END)-15;
$start = strpos($data,MARKER_START)+strlen(MARKER_START);
echo substr($data,$start,$end-$start);
}
if ($_GET['born']=='1'){
echo "<br/><h2 style='color: green' >Who's born today</h2>";
$end = strpos($data,BIRTHDAY_END);
$start = strpos($data,BIRTHDAY_START)+strlen(BIRTHDAY_START);
echo substr($data,$start,$end-$start);
}
if ($_GET['died']=='1'){
echo "<br/><h2 style='color: green' >Who died today</h2>";
$end = strpos($data,DEATH_END);
$start = strpos($data,DEATH_START)+strlen(DEATH_START);
echo substr($data,$start,$end-$start);
}
if ($_GET['holiday']=='1'){
echo "<br/><h2 style='color: green' >Where is holiday today</h2>";
$end = strpos($data,HOLIDAYS_END);
$start = strpos($data,HOLIDAYS_START)+strlen(HOLIDAYS_START);
echo substr($data,$start,$end-$start);
}
if ($_GET['religious']=='1'){
echo "<br/><h2 style='color: green' >Religious observance</h2>";
$end = strpos($data,RELIGIOUS_END);
$start = strpos($data,RELIGIOUS_START)+strlen(RELIGIOUS_START);
echo substr($data,$start,$end-$start);
}
if ($_GET['relhistory']=='1'){
echo "<br/><h2 style='color: green' >Religious history</h2>";
$end = strpos($data,RELHISTORY_END);
$start = strpos($data,RELHISTORY_START)+strlen(RELHISTORY_START);
echo substr($data,$start,$end-$start);
}
?>
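Every section in the script is cut out with the same three steps: find the start marker, find the end marker, and take the text in between. As a rough sketch, those steps could be gathered into one function that also copes with a missing marker, which can happen whenever the source page changes its markup. The name extract_between and the sample markup are made up for illustration and are not part of Hasin's script.

```php
<?php
// Hypothetical helper, not part of the original script: returns the text
// between two markers, or null when either marker cannot be found.
function extract_between(string $html, string $startMarker, string $endMarker): ?string
{
    $start = strpos($html, $startMarker);
    if ($start === false) {
        return null; // start marker missing, nothing to cut
    }
    $start += strlen($startMarker);

    // Look for the end marker only after the start marker.
    $end = strpos($html, $endMarker, $start);
    if ($end === false) {
        return null; // end marker missing
    }

    return substr($html, $start, $end - $start);
}

// Made-up sample markup, used only to exercise the helper.
$sample = "<h3>Births</h3>Ada Lovelace<hr><h3>Deaths</h3>Nobody<hr>";

$births = extract_between($sample, "<h3>Births</h3>", "<hr>");
echo $births === null ? "section not found" : $births;
```

With a helper like this, a changed page layout would show up as a "section not found" message instead of a silently wrong slice of HTML.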
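Now, if you want to know who was born on this day in history, just enter todayinhistory.php?born=1 in your browser. Many successful web applications are built by cutting, pasting, and piecing together information from many sources, and this is how they gather their data behind the scenes.
Hasin closes by wishing everyone happy web scraping!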
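Because the script checks each query parameter on its own, several sections can be requested in a single URL. The snippet below is a small sketch of building such a URL; the function name widget_url is hypothetical, and the script path is assumed to be todayinhistory.php as in the listing.

```php
<?php
// Hypothetical helper: builds the widget URL for a list of section names
// such as 'history', 'born', 'died', 'holiday', 'religious', 'relhistory'.
function widget_url(string $script, array $sections): string
{
    // Turn ['born', 'died'] into ['born' => 1, 'died' => 1].
    $params = array_fill_keys($sections, 1);

    // Pass the separator explicitly so the result does not depend on php.ini.
    return $script . '?' . http_build_query($params, '', '&');
}

echo widget_url('todayinhistory.php', ['born', 'died']);
// prints: todayinhistory.php?born=1&died=1
```

Opening the resulting address should show the births section followed by the deaths section, since the script prints sections in the order of its own if blocks, not in the order of the parameters.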