Robots meta tag and noindex

Started by Putoguiri, October 29, 2012, 05:49:10 PM


Putoguiri

As you may or may not know, SMF nowadays adds a robots meta tag with the value "noindex" to pages it considers "duplicates", such as mobile/WAP versions, etc.

This is a well-made feature, and really useful both to avoid spamming the indexers and to save our bandwidth.

But for us TP users this is a problem. To SMF, index.php (the main site) and index.php?action=forum (the forum view) are the same page, so the forum view gets a noindex tag! This means that on forums running TP, the main forum view NEVER gets indexed by crawlers.


I don't know if this is known or if there is a fix for it, but this is definitely not the behavior I would expect.

IchBin

I have no clue when it comes to robot stuff. But I do wonder how you think that index.php is the same as index.php?action=forum. Those are different pages, right? If you want me to understand the problem, though, you'll have to get into some good detail so I know what you are talking about.

Putoguiri

Yes, for those of us who use TP, index.php != index.php?action=forum, but for a normal SMF user those are the same.

And that's exactly the problem.

From index.template.php:

   
    // Please don't index these Mr Robot.
    if (!empty($context['robot_no_index']))
        echo '
    <meta name="robots" content="noindex" />';
 


Since for the forum (plain SMF) both are the same page, the meta tag will be shown on index.php?action=forum but not on index.php. Or at least those are my conclusions after observing this behaviour.


As for the robot stuff, it's a simple tag that prevents crawlers from indexing that page. SMF does this to prevent the same page from being indexed multiple times (I checked on their forums).

A quick fix would be excluding index.php?action=forum from the "noindex" list, wherever this is located. But I think TP should do this by default on installation.

Right now, the forum index page won't be indexed/crawled by bots with this configuration.

Edit: I've checked the page source of tinyportal.net and the same thing happens there: the main forum view has the noindex tag active.


IchBin

OK, just to make sure I understand you: I know what content="noindex" is for. But you are saying that TP's frontpage does not get indexed. That is most likely because TP does not set anything for these types of things. It does not modify the SMF code for any crawler options.

You could try adding a piece of code to the doTPfrontpage() function in TPortal.php to prevent it, if you want to test it. It probably depends on when things are loaded, and at the moment I do not know without looking deep into the code.

Just add this line in the function, right after the first return; statement you see:
$context['robot_no_index'] = '';

Putoguiri

No, the TP frontpage gets indexed just fine. The one that doesn't get indexed is the main forum view (which, without TP, was previously the output of calling index.php without any parameters).

IchBin

SMF checks for any $_GET parameters in the URL on the board index. If there are any, it sets robot_no_index to true. I do not want to modify any SMF code, as I'm trying to remove as many edits from the SMF files as possible. If you have any ideas, I am all ears.
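
The check described above can be sketched roughly as follows. This is only a minimal, self-contained illustration of the behaviour, not SMF's actual source: the function name boardIndexNoIndexFlag is hypothetical, and the array argument stands in for $_GET.

```php
<?php
// Hypothetical helper (not part of SMF): mirrors the behaviour described above,
// where any query parameter on the board index switches the noindex flag on.
function boardIndexNoIndexFlag(array $get): bool
{
    // Plain index.php has no parameters, so it stays indexable.
    return !empty($get);
}

// Plain SMF board index: index.php
var_dump(boardIndexNoIndexFlag([]));                    // bool(false)

// TinyPortal forum view: index.php?action=forum
var_dump(boardIndexNoIndexFlag(['action' => 'forum'])); // bool(true)
```

With TP installed, the forum view always carries action=forum, which is why it always ends up flagged.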

Putoguiri

By no means am I telling you what to do, what to code or how to code it.

I just posted here what I think is a bug or improper behaviour. I told you what happens, what is expected to happen and even why I think this error/bug happens. It is up to you to decide whether it really is one and whether it needs fixing.

So please don't take anything I posted as an offense because it wasn't meant that way.


Edit:

In my forums I added something like:

// TinyPortal's forum view: clear the flag so the noindex meta tag is not printed.
if (isset($_GET['action']) && $_GET['action'] == 'forum') {
    $context['robot_no_index'] = '';
}


to the index.template.php as a quick fix.

IchBin

I know you weren't telling me how to code or anything. I was sincerely asking if you had any ideas on how to fix it, in particular without editing SMF code. Of course you can modify it like you did, and that works fine as a workaround. I'm just putting it out there to see if anyone can think of something I haven't, is all.

Putoguiri

The only way to fix it would be editing SMF files, I guess. Either editing the template as a quick fix, or searching for the "blacklist", the list of duplicate pages SMF uses to decide which ones get noindexed. Both involve editing SMF files.

But, from what I read on SMF's forums, it's something they don't want users to touch.

IchBin

It will probably have to wait until SMF 2.1, as I think there are hooks in there that load much later, and we could possibly use one of those hooks to set the robots index parameter at that point. In the meantime, people can just use your workaround if they are concerned about it. Thanks for bringing attention to this.
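
A hook-based version of the workaround might look something like the sketch below. It assumes a hook exists that fires after SMF has already set the flag; which hook that would be is not established in this thread. The function name tp_allow_forum_index is hypothetical. To keep the sketch runnable outside SMF, the request parameters and $context are passed in as plain arrays instead of being read from $_GET and the global $context.

```php
<?php
// Hypothetical callback (not part of SMF or TinyPortal): clears the noindex
// flag for the TinyPortal forum view and leaves every other page untouched.
function tp_allow_forum_index(array $get, array $context): array
{
    // Assumption: only index.php?action=forum with no other parameters is
    // treated as the canonical forum view. This is stricter than the
    // template workaround, which ignores any additional parameters.
    if ($get === ['action' => 'forum']) {
        $context['robot_no_index'] = false;
    }

    return $context;
}

// SMF has flagged the forum view as a duplicate...
$context = ['robot_no_index' => true];

// ...and the callback resets the flag before the template is rendered.
$context = tp_allow_forum_index(['action' => 'forum'], $context);

var_dump(empty($context['robot_no_index'])); // bool(true): the meta tag is not printed
```

Because index.template.php only prints the tag when robot_no_index is non-empty, setting it to false has the same effect as the empty string used in the template workaround.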