
Quietly sabotaging a site: pages open normally for visitors, but spiders get a 500 error. Adapted from the fix for a code bug.

Original buggy code

            $request_url = $_SERVER["REQUEST_URI"];
            // $useragent is not defined in the original excerpt; assumed to come from the request headers
            $useragent = $_SERVER["HTTP_USER_AGENT"] ?? '';

            if (stripos($useragent, 'googlebot') !== false || stripos($useragent,'mediapartners-google') !== false) {
                $spider = 1; // Google
            } elseif (stripos($useragent,'baiduspider') !== false) {
                $spider = 2; // Baidu
            } elseif (stripos($useragent,'sogou spider') !== false || stripos($useragent,'sogou web') !== false) {
                $spider = 3; // Sogou
            } elseif (stripos($useragent,'360spider') !== false) {
                $spider = 4; // 360
            } elseif (stripos($useragent,'yandexbot') !== false) {
                $spider = 5; // Yandex
            } elseif (stripos($useragent,'bingbot') !== false) {
                $spider = 6; // Microsoft Bing
            } elseif (stripos($useragent,'bytespider') !== false) {
                $spider = 7; // Toutiao
            } elseif (stripos($useragent,'yisouspider') !== false) {
                $spider = 8; // Shenma
            } elseif (stripos($useragent,'youdaobot') !== false || stripos($useragent,'yodaobot') !== false) {
                $spider = 9; // Youdao
            // } elseif (stripos($useragent,'msnbot') !== false || stripos($useragent,'msnbot-media') !== false) {
            //     $spider = 10; // MSN
            } elseif (stripos($useragent,'yahoo!') !== false) {
                $spider = 11; // Yahoo
            } elseif (stripos($useragent,'aspiegelbot') !== false) {
                $spider = 12; // Huawei
            } elseif (stripos($useragent,'bot') !== false) {
                $spider = 99; // Other
            }

            if (!empty($spider) && strlen($_SERVER["REMOTE_ADDR"]) < 20 && strlen($request_url) < 200) {
                // Judging by the fixed version, this is the bug: a URL that passes validation gets a 404
                if ($this->check_valid_url($request_url)) {
                    abort(404);
                } else {
                    if (isset($dataInfo['spiders']) && !in_array($spider, $dataInfo['spiders'])) {
                        return true;
                    }
                }
            }
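
The if/elseif chain above can also be written as a lookup table, which makes the matching order easier to see. This is only a minimal sketch: the function name `detect_spider` and the return value 0 for "not a spider" are assumptions for illustration and do not appear in the original code. The numeric IDs are the ones used above.

```php
<?php
// Hypothetical helper (not from the original post): maps a User-Agent
// string to the numeric spider IDs used in the chain above.
function detect_spider(string $useragent): int
{
    // Order matters: specific tokens first, the generic 'bot' catch-all last.
    $signatures = [
        'googlebot'            => 1,  // Google
        'mediapartners-google' => 1,
        'baiduspider'          => 2,  // Baidu
        'sogou spider'         => 3,  // Sogou
        'sogou web'            => 3,
        '360spider'            => 4,  // 360
        'yandexbot'            => 5,  // Yandex
        'bingbot'              => 6,  // Microsoft Bing
        'bytespider'           => 7,  // Toutiao
        'yisouspider'          => 8,  // Shenma
        'youdaobot'            => 9,  // Youdao
        'yodaobot'             => 9,
        'yahoo!'               => 11, // Yahoo
        'aspiegelbot'          => 12, // Huawei
        'bot'                  => 99, // Other
    ];

    foreach ($signatures as $needle => $id) {
        // stripos() is case-insensitive, like the original checks.
        if (stripos($useragent, $needle) !== false) {
            return $id;
        }
    }

    return 0; // assumed convention: 0 means "not recognized as a spider"
}
```

Because PHP arrays keep insertion order, the catch-all `'bot'` entry is only reached when none of the specific tokens match, which mirrors the order of the original chain.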

Fixed code

$request_url = $_SERVER["REQUEST_URI"];
$useragent = $_SERVER['HTTP_USER_AGENT'] ?? '';

// Spider detection (elided in the source); this chain sets $spider:
// if (stripos($useragent, 'googlebot') !== false || ...) { ... }

// Validate the URL
$valid_url = check_valid_url($request_url);
if (!$valid_url) {
    log_error('INVALID URL:', $request_url);
    header('HTTP/1.1 200 OK');
    die('Sorry, the requested URL is invalid.');
}

// Whitelist mechanism
if (in_array($spider, $white_spiders)) {
    return true;
}

// Record spider information
$saveData = [/* ... */];
save_spider_log($saveData);

// AJAX and JS handling
if (is_ajax_request()) {
    // Return AJAX-compatible content
} else {
    // Return plain HTML content, and avoid serving key content through JS only
}

// Avoid returning 404 directly
header('HTTP/1.1 200 OK');
if (in_array($spider, $block_spiders)) {
    die('Sorry, your request cannot be handled currently.');
}

Sabotage code

$useragent = $_SERVER['HTTP_USER_AGENT'] ?? ''; // the header may be absent

if (stripos($useragent, 'Baiduspider') !== false) {  
    $spider = 'Baidu';  
} elseif (stripos($useragent, '360Spider') !== false) {
    $spider = '360';
} elseif (stripos($useragent, 'ByteDance') !== false ||
         stripos($useragent, 'toutiao-spider') !== false) {
    $spider = 'Toutiao';  
} elseif (stripos($useragent, 'YisouSpider') !== false) {
    $spider = 'Shenma';
} elseif (stripos($useragent, 'YoudaoBot') !== false) {
    $spider = 'Youdao';
} elseif (stripos($useragent, 'sogou spider') !== false ||
         stripos($useragent, 'sogou web spider') !== false) {
    $spider = 'Sogou'; 
} elseif (stripos($useragent, 'bingbot') !== false) {
    $spider = 'Bing';  
} elseif (stripos($useragent, 'bot') !== false) {
    $spider = 'Other';  
} else {
    $spider = false;  
}

if ($spider) {
    if ($spider === '360' || $spider === 'Toutiao' || 
        $spider === 'Shenma' || $spider === 'Youdao' ||
        $spider === 'Sogou' || $spider === 'Baidu' || $spider === 'Bing') {
        header('HTTP/1.1 500 Internal Server Error');
        die('Sorry, HTTP/1.1 500 Internal Server Error.');
    }
    // Other handling...
}
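
Since ordinary visitors never see the 500 response, one way to notice this behavior is to request the same page twice, once with a browser User-Agent and once with a spider User-Agent, and compare the status codes. The sketch below assumes PHP's curl extension is available; the function names `fetch_status` and `looks_sabotaged` are hypothetical and not part of the original post. It only catches checks based on the User-Agent header, like the code above.

```php
<?php
// Hypothetical helper (not from the original post): request $url while
// presenting the given User-Agent and return the HTTP status code.
function fetch_status(string $url, string $useragent): int
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_USERAGENT      => $useragent,
        CURLOPT_RETURNTRANSFER => true, // keep the body out of the output
        CURLOPT_TIMEOUT        => 10,
    ]);
    curl_exec($ch);

    // 0 means the request itself failed (DNS error, timeout, ...).
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}

// True when the browser request succeeds (2xx/3xx) but the spider
// request is answered with a server error (5xx).
function looks_sabotaged(int $browserStatus, int $spiderStatus): bool
{
    return $browserStatus >= 200
        && $browserStatus < 400
        && $spiderStatus >= 500;
}
```

To use it, call `fetch_status()` on the same page with a browser User-Agent and with a spider User-Agent (for example one containing `Baiduspider`), then pass both results to `looks_sabotaged()`. A pair such as 200 and 500 matches the symptom described in the title.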

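Copyright notice: this article is licensed under the Creative Commons Attribution 4.0 International license [BY-NC-SA]
Article title: "Quietly sabotaging a site: pages open normally for visitors, but spiders get a 500 error. Adapted from the fix for a code bug."
Article link: https://www.lolmm.cn/wzsafe/1187.html
Resources on this site are for personal study and exchange only. Please delete them within 24 hours of downloading. Commercial use is not permitted; otherwise you bear any legal consequences yourself.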
