[Auto-FIXED] [Xhtml Error] - Front-Page Posts - Strange Cause

Started by Elrond, November 21, 2007, 12:19:22 AM

Elrond

My apologies if this has been posted elsewhere, looked for it and didn't find this issue listed anywhere.

Search parameters (make it easy to find for those looking for info on this issue): { Fixed Errors, Xhtml validation, Front-page blocks, Articles, Recent Topics } - I'm such a geek.

I have completed conversion of my TP + SMF to Xhtml 1.1. It is valid even on admin pages when served as application/xhtml+xml, but at the moment it is served as text/html, except on one special theme I use to test the application/xhtml+xml mimetype.

Well anyway, I had heart palpitations today when I validated my page and all of a sudden I was like ACK! Validation error: 4 errors! Good thing I wasn't serving the application/xhtml+xml theme by default, as that would have had some...interesting consequences, to say the least.

The source of the problem was the character truncation limit. It usually works fine, but if you truncate a post after it has been turned from bbcode back into html (xhtml), you might cut an element off before its tag is complete.

The tag that was truncated was <br /> - it was truncated to <br - and it is obvious what that does in any validation, html 4.01 strict or xhtml 1.0 transitional, strict, 1.1 as text/html, and 1.1 as application/xhtml+xml. It is also important to note what that would mean in a feed for a post that is truncated after parsing the bbcode to html.

My maximum post length on front page setting is set to 500 characters, well something like that. Point is, at the last character, it cuts off the /> from the <br /> tag.
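
A minimal sketch of how this happens, assuming the body has already been parsed from bbcode to xhtml before the cut. The sample string and the limit of 21 are made up for illustration and are not taken from TPortal.php.

```php
<?php
// Hypothetical post body, already converted from bbcode to xhtml.
$body = 'First line of text<br />Second line';

// Made-up limit, chosen so that the cut lands inside the <br /> tag.
$limit = 21;

// A plain character cut knows nothing about tags.
$truncated = substr($body, 0, $limit);

echo $truncated, "\n"; // First line of text<br
```

The output ends in a half-open tag, which is exactly what the validator complained about.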

The fix is pretty simple. All it requires is:

Access to Sources/TPortal.php
Knowledge of the max, strrpos, and substr functions, the latter two being, quite obviously, string functions.

Open: {root to your forum/tp installation}/Sources/TPortal.php (aka $sourcedir . '/TPortal.php')

Find:

// The first space or line break. (<br />, etc.)
$cutoff = max(strrpos($row['body'], ' '), strrpos($row['body'], '<'));

if ($cutoff !== false)
$row['body'] = substr($row['body'], 0, $cutoff);
$row['body'] .= '...';


Replace with:

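// The last tag boundary ('>' or '<') in the already-shortened body.
$cutoff = max(strrpos($row['body'], '>'), strrpos($row['body'], '<'));

// Keep a complete tag's closing '>', but drop a dangling '<' and what follows it.
if ($cutoff !== false)
    $row['body'] = substr($row['body'], 0, $row['body'][$cutoff] === '>' ? $cutoff + 1 : $cutoff);
$row['body'] .= '...';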
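
For anyone who wants to try the idea outside of TPortal.php, here is a rough standalone sketch. The function name trim_to_tag_boundary is hypothetical (it is not part of TinyPortal or SMF), and it is a slightly more conservative variant than the replacement above: it only removes a half-open tag at the end and leaves any text after the last complete tag in place.

```php
<?php
/**
 * Hypothetical helper (not part of TinyPortal or SMF): given a body that was
 * already cut to a character limit, remove a half-open tag at the end, if any.
 */
function trim_to_tag_boundary(string $body): string
{
    $lt = strrpos($body, '<');
    $gt = strrpos($body, '>');

    // No '<' at all: no tag can be half-open.
    if ($lt === false) {
        return $body;
    }

    // The last '<' has no '>' after it, so the final tag was cut in half.
    // Drop the partial tag.
    if ($gt === false || $lt > $gt) {
        return substr($body, 0, $lt);
    }

    // The last tag is complete; leave the text as it is.
    return $body;
}

echo trim_to_tag_boundary('First line of text<br'), "...\n";     // First line of text...
echo trim_to_tag_boundary('First line<br />Second li'), "...\n"; // First line<br />Second li...
```

Note that this only guards against a tag being cut in half. It does not close elements that were opened before the cut, such as a <div> whose closing tag falls beyond the limit.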


The first change is documented as follows:

$cutoff = max(strrpos($row['body'], ' '), strrpos($row['body'], '<'));
...to...
$cutoff = max(strrpos($row['body'], '>'), strrpos($row['body'], '<'));

The lines are nearly the same, except that in the modified line (the second one), '>' replaces ' '. Why is this? Well, a space can sit inside a tag (as in <br />), so cutting at the last space can still leave a half-open tag such as <br behind. If the '>' is not included, it is possible, however unlikely in most posts, that an element may be truncated. In the case of html code put in posts by an administrator, this problem may be magnified if the element is block level and not a single line entry (such as img (soon to be deprecated in xhtml2), <br /> (soon to be replaced by <l> and </l> as block-level line elements in xhtml2), and input elements, for example (to be replaced, along with the rest of the form layout in xhtml2, by xforms)).

That problem would look like this:


<div>
....* content here *...
* line truncated here at a certain character limit, set in the admin panel under tp settings *

Now in this case the html bbcode wouldn't be parsed at all - I'm not quite sure what happens then, I'll have to experiment there. I think it would pretty much be spit out as tag soup in utf-8 entities or something (similar to converting html code to entities using htmlspecialchars in php). So that problem may not even be of any concern to begin with, especially given the "disabled-after" feature in bbcode, or whatever it's called (too lazy to look that up at the moment, particularly since it is not actually relevant in the context of this subject).

For the most part, the modification should cover the front page posts tag truncation error, so that converting pages to xhtml is less confusing. An error such as <br... would show up in html 4.01, html 4.01 strict, xhtml 1.0 transitional, frameset, and strict, xhtml 1.1 served as text/html, and xhtml 1.1 served as application/xhtml+xml (the last resulting in an xml parsing error that would render a site served as application/xhtml+xml useless).
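
To check a truncated body before it ever reaches a page served as application/xhtml+xml, something like the following sketch could be used. The function name is_well_formed_fragment is hypothetical and not from TinyPortal; it relies on PHP's DOM extension and only tests XML well-formedness, not validity against the xhtml DTD.

```php
<?php
/**
 * Hypothetical check (not from TinyPortal): is this fragment well-formed XML?
 * The fragment is wrapped in a dummy root element so that plain text and
 * sibling elements are allowed at the top level.
 */
function is_well_formed_fragment(string $fragment): bool
{
    // Collect parser errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $ok = $doc->loadXML('<root>' . $fragment . '</root>');

    // Restore the previous error handling state.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok;
}

var_dump(is_well_formed_fragment('First line<br />Second line...')); // bool(true)
var_dump(is_well_formed_fragment('First line<br...'));               // bool(false)
```

One caveat: since no DTD is loaded here, html named entities such as &nbsp; would also be reported as errors, so this is only a rough test.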


But anyway, hey, no big deal. Just wanted to bring that to the attention of anyone who may have (or may in the future have) the same issue. I've been working on some tinyportal sites for 2 weeks and 3 days so far and have found very few difficulties with anything. As none of us are perfect, it is to be expected that something like this comes up now and then. So far, TP is the best portal software that can be found for SMF. Installed in an instant, which was wow, amazing. Package manager was always an awesome feature too because of that. And from there not very many problems. The settings were well thought out, I can see, and very well laid out. My other admins found it easy to use as well, and that's rare to find, so awesome work all the way!