Joomla noindex, follow PHP code

Stack Overflow user

Asked 2011-05-16 14:48:43

3 answers, 1.7K views, 0 followers, 0 votes

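I have a Joomla-based news site with a ton of useless pages showing up in search engine indexes. As a quick fix, at least until I can look into rebuilding the site from scratch, I'd like to implement a NOINDEX, FOLLOW meta tag on all pages except the home page and article pages that end in .html.

After working with various code snippets from here and elsewhere, I came up with the following: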

<?php
if ((JRequest::getVar('view') == "frontpage" ) || ($_SERVER['REQUEST_URI']=='*.html' ))    {
echo "<meta name=\"robots\" content=\"index,follow\"/>\n";
} else {
echo "<meta name=\"robots\" content=\"noindex,follow\"/>\n";
}
?>

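I'm still new to PHP programming and I'm sure I've made a few mistakes, so I was wondering whether a kind soul could give my code a once-over and let me know if it's OK to use before I accidentally break my site.

Thanks,

Tom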

php

joomla

seo

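3 Answers

Stack Overflow user

Answered 2011-05-16 15:39:38

Wouldn't it be better to use a robots.txt file for this?

Some major crawlers support an Allow directive, which can counteract a following Disallow directive. This is useful when one tells robots to avoid an entire directory but still wants some HTML documents in that directory crawled and indexed. While by standard implementation the first matching robots.txt pattern always wins, Google's implementation differs in that Allow patterns with equal or more characters in the directive path win over a matching Disallow pattern. Bing uses either the Allow or the Disallow directive, whichever is more specific.

To be compatible with all robots, if you want to allow single files inside an otherwise disallowed directory, you must place the Allow directive(s) first, followed by the Disallow, for example: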

Allow: /folder1/myfile.html
Disallow: /folder1/

This example will disallow anything in /folder1/ except /folder1/myfile.html, since the latter will match first. For Google, however, the order does not matter.
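To make the difference between the two precedence rules concrete, here is a minimal sketch in PHP. It is a simplified model only: `isAllowed()` and its `$mode` argument are hypothetical names, it handles plain path prefixes, and it ignores wildcards, `$` anchors and user-agent groups that real crawlers support.

```php
<?php
// Illustrative model of two robots.txt precedence rules (not a real parser).

/**
 * @param array  $rules list of array(directive, pathPrefix) pairs in file order
 * @param string $path  URL path to test
 * @param string $mode  'first'   = the first matching rule wins
 *                      'longest' = the longest matching prefix wins, Allow wins ties
 * @return bool true when the path may be crawled
 */
function isAllowed(array $rules, $path, $mode)
{
    $best = null; // winning rule so far (used in 'longest' mode only)
    foreach ($rules as $rule) {
        list($directive, $prefix) = $rule;
        if (strpos($path, $prefix) !== 0) {
            continue; // this rule does not apply to the path
        }
        if ($mode === 'first') {
            return $directive === 'Allow';
        }
        $longer = $best === null || strlen($prefix) > strlen($best[1]);
        $tieAllow = $best !== null
            && strlen($prefix) === strlen($best[1])
            && $directive === 'Allow';
        if ($longer || $tieAllow) {
            $best = $rule;
        }
    }
    // A path that matches no rule is allowed.
    return $best === null ? true : $best[0] === 'Allow';
}

// The Disallow line is deliberately placed first to show why order matters.
$rules = array(
    array('Disallow', '/folder1/'),
    array('Allow', '/folder1/myfile.html'),
);
```

With the rules in this order, the 'first' mode blocks /folder1/myfile.html while the 'longest' mode allows it, which is why putting Allow first is the safer layout.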

Votes: 1

Stack Overflow user

Answered 2011-05-16 15:01:02

This will never match:

$_SERVER['REQUEST_URI']=='*.html'

== is a literal comparison; it does not interpret wildcards. You can use substr to check the end of the string:

substr($_SERVER['REQUEST_URI'], -5) == '.html'

Or you can use a regular expression:

//This will match when .html is anywhere inside the string
preg_match('/\.html/', $_SERVER['REQUEST_URI'])

//This will match when .html is at the end of the string, but the
//substr solution is faster in that case
preg_match('/\.html$/', $_SERVER['REQUEST_URI'])
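One thing to watch with either check: REQUEST_URI also carries the query string, so a URL such as /story.html?page=2 would not end in .html. A minimal sketch of a helper that strips the query string first (the name `endsWithHtml()` is made up for illustration):

```php
<?php
// Hypothetical helper: true when the path part of a request URI ends in ".html".
function endsWithHtml($requestUri)
{
    // Cut the URI at the first "?" so the query string is ignored.
    $pos = strpos($requestUri, '?');
    $path = ($pos === false) ? $requestUri : substr($requestUri, 0, $pos);

    // Same suffix test as the substr() check above, with a strict comparison.
    return substr($path, -5) === '.html';
}
```

It would be called as `endsWithHtml($_SERVER['REQUEST_URI'])` in place of the bare substr comparison.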

Votes: 0

Stack Overflow user

Answered 2011-05-18 17:57:29

Based on the posts here and advice from a friend, I came up with this:

You need to go to /public_html/libraries/joomla/document/html and edit html.php

Replace

//set default document metadata
     $this->setMetaData('Content-Type', $this->_mime . '; charset=' . $this->_charset , true );
     $this->setMetaData('robots', 'index, follow' );

with

//set default document metadata
$this->setMetaData('Content-Type', $this->_mime . '; charset=' . $this->_charset , true );

$queryString = $_SERVER['REQUEST_URI'];
if (( $queryString == '' ) || ( $queryString == 'index.php/National-news' ) || ( $queryString == 'index.php/Business' ) || ( $queryString == 'index.php/Sport' ) || ( substr($queryString, -5 ) == '.html' )) {
$this->setMetaData('robots', 'index, follow' );
}else {
$this->setMetaData('robots', 'noindex, follow' );
}

This updates the meta robots tag on every page of the site, removing all the clutter from the search engines and leaving only what we want to find in the index.
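A caveat worth checking on the test server: REQUEST_URI normally begins with a leading slash (for example /index.php/Sport) and may include a query string, so the exact string comparisons above may not match as written. Below is a hedged sketch of the same decision with the path normalised first. `robotsContentFor()` is a hypothetical helper, the section paths are the ones used above, and it assumes the site lives at the web root.

```php
<?php
// Hypothetical helper: returns the robots meta content for a request URI.
function robotsContentFor($requestUri)
{
    // Ignore the query string, if any.
    $pos = strpos($requestUri, '?');
    $path = ($pos === false) ? $requestUri : substr($requestUri, 0, $pos);

    // Strip leading and trailing slashes: "/index.php/Sport" becomes
    // "index.php/Sport" and the home page "/" becomes "".
    $path = trim($path, '/');

    // Pages that should stay in the index (taken from the code above).
    $indexable = array(
        '',
        'index.php/National-news',
        'index.php/Business',
        'index.php/Sport',
    );

    if (in_array($path, $indexable, true) || substr($path, -5) === '.html') {
        return 'index, follow';
    }
    return 'noindex, follow';
}
```

The return value could then be passed to the same call used above, as in `$this->setMetaData('robots', robotsContentFor($_SERVER['REQUEST_URI']));`.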

I'll try running it on a test server over the next few days and report back.

Votes: 0

Page content originally provided by Stack Overflow; translation support from the Tencent Cloud Xiaowei IT-domain engine.

Original link: https://stackoverflow.com/questions/6014008
