Word cloud

Started by JPDeni, August 04, 2006, 08:34:56 PM

JPDeni

Actually, it should be
AND board.permission_mode = 0
which will eliminate announce-only boards.
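
For anyone unsure where that line goes, here is a rough sketch of the query text with the condition added. It only builds the SQL string; cloud_query_sql() is a made-up helper name, 'smf_' is just an example prefix, and any other conditions your own copy of the block already has would be passed in alongside it.

```php
<?php
// Hypothetical helper (not part of the block): builds only the SQL text.
// In the block itself the string would be handed to SMF's db_query().
function cloud_query_sql($db_prefix, array $conditions = array(), $limit = 30)
{
    // Join whatever conditions exist with AND; omit WHERE when there are none
    $where = $conditions ? "\n WHERE " . implode("\n AND ", $conditions) : '';

    return "SELECT body
 FROM {$db_prefix}messages AS mess
 LEFT JOIN {$db_prefix}boards AS board
 ON mess.ID_BOARD = board.ID_BOARD" . $where . "
 ORDER BY posterTime DESC
 LIMIT " . (int) $limit;
}

// 'smf_' is an assumed table prefix used for illustration
echo cloud_query_sql('smf_', array('board.permission_mode = 0')), "\n";
```

If the query already has a WHERE clause, the new condition is simply one more entry in the array.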

gijs

Quote from: JPDeni on October 25, 2006, 02:12:54 PM

BTW, when you posted this, it was 1 am where I was. I haven't been ignoring you. I've been asleep.

Thanks so far, I'll give it a try ASAP. BTW, I just figured that you might not follow all ongoing discussions; I wasn't being impatient :-)

JPDeni

I understand. :)

I was looking at this again and found that most of my words were the same size, or close enough to the same so it didn't make a nice cloud. I changed the printout a bit.

echo '<div style="text-align: center">';
// Build the cloud's HTML
  foreach ($random as $value) {
    $fsize = intval($word_list[$value] / $low_count) *4;
    $fsize = $fsize + 4;
    if ($fsize > 20) { $fsize= 20; }
    echo '<span style="font-size:' . $fsize . 'pt;">' . $value . '</span> ';
  }
echo '</div>';

This gives much more variation while still keeping the range of sizes between 8pt and 20pt.
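
To see how the sizes come out, here is the same arithmetic pulled into a small stand-alone function. This is only a sketch: cloud_font_size() and the word counts are invented for illustration, and the guard against a zero $low_count is an addition that is not in the snippet above.

```php
<?php
// Hypothetical helper: maps a word's usage count to a font size in points,
// using the same formula as the printout code.
function cloud_font_size($count, $low_count, $step = 4, $base = 4, $max = 20)
{
    // Added assumption: never divide by zero if $low_count was not set
    $low_count = max(1, (int) $low_count);
    $size = intval($count / $low_count) * $step + $base;
    return min($size, $max);
}

// Made-up counts; the least-used word in the cloud appears 3 times
$low_count = 3;
foreach (array('paris' => 14, 'hotel' => 7, 'metro' => 3) as $word => $count) {
    echo $word . ': ' . cloud_font_size($count, $low_count) . "pt\n";
}
```

With these counts the three words come out at 20pt, 12pt and 8pt: the least-used word always lands on 8, and anything used four or more times as often hits the 20 cap.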

gijs

It would be great if the words in the cloud were links to a forum search for that word (with 'search order' = most recent topics first).

I guess I can't do that by simply adding something to the URL that is linked to?

JPDeni

That's way beyond the scope of this little snippet and would require a complete rewrite, as in starting from scratch. The only thing you could do, I suppose, would be to link to a search for that word. I'd have to investigate to figure out how to do it, but it most likely could be done.

gijs

Quote from: JPDeni on October 25, 2006, 04:31:12 PM
The only thing you could do, I suppose, would be to link to a search for that word.

I must have explained it wrong, but that's exactly what I was looking for. For example, the word 'paris' in the cloud should link to a search on the word 'paris'.


JPDeni

Oh. I thought you wanted a link directly to the most recent post with that word in it. I guess I'm still not quite awake yet. :)

First, you need to change the first line to
global $db_prefix,$scripturl;
Then change the printout section to
echo '<div style="text-align: center">';
// Build the cloud's HTML
  foreach ($random as $value) {
    $fsize = intval($word_list[$value] / $low_count) *4;
    $fsize = $fsize + 4;
    if ($fsize > 20) { $fsize= 20; }
    echo '<span style="font-size:' . $fsize . 'pt;">' . '<a href="',$scripturl,'?action=search2;search=',$value,';sort=ID_MSG|desc">',$value . '</a></span> ';
  }
echo '</div>';
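
If your cloud can contain words with odd characters in them, a slightly more defensive version of the link might look like this. It is only a sketch: cloud_link() is a made-up name and the example.com address stands in for $scripturl. The search parameters are the same ones used above.

```php
<?php
// Hypothetical helper: returns the HTML for one word in the cloud.
function cloud_link($scripturl, $word, $fsize)
{
    // urlencode() keeps the word from breaking the query string
    $url = $scripturl . '?action=search2;search=' . urlencode($word) . ';sort=ID_MSG|desc';

    // htmlspecialchars() keeps the URL and the word from breaking the HTML
    return '<span style="font-size:' . (int) $fsize . 'pt;">'
        . '<a href="' . htmlspecialchars($url) . '">' . htmlspecialchars($word) . '</a></span> ';
}

// Placeholder forum address, for illustration only
echo cloud_link('http://www.example.com/forum/index.php', 'paris', 12), "\n";
```

Inside the loop you would echo cloud_link($scripturl, $value, $fsize) in place of the long echo line.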


gijs

 ;D I just amazed myself ...

Here is what I was looking for:

global $db_prefix;
$number_of_words = 40;
$min_length = 5;

// This is the list of words to exclude from your cloud
 $exclude_words = array(
   '@http://@',
   '@ about @',
   '@ also @',
   '@ because @',
   '@ been @',
   '@ cant @',
   '@ could @',
   '@ didnt @',
   '@ doesnt @',
   '@ dont @',
   '@ even @',
   '@ from @',
   '@ going @',
   '@ have @',
   '@ havent @',
   '@ here @',
   '@ http_request @',
   '@ into @',
   '@ its @',
   '@ just @',
   '@ like @',
   '@ look @',
   '@ make @',
   '@ many @',
   '@ more @',
   '@ much @',
   '@ must @',
   '@ need @',
   '@ should @',
   '@ shouldnt @',
   '@ some @',
   '@ someone @',
   '@ such @',
   '@ the @',
   '@ take @',
   '@ that @',
   '@ their @',
   '@ then @',
   '@ there @',
   '@ theres @',
   '@ these @',
   '@ they @',
   '@ this @',
   '@ want @',
   '@ well @',
   '@ were @',
   '@ what @',
   '@ when @',
   '@ where @',
   '@ which @',
   '@ will @',
   '@ with @',
   '@ without @',
   '@ would @',
   '@ wouldnt @',
   '@ your @',
   '@ youre @'
 );

// Various punctuation that should be filtered from the cloud
 $exclude_symbs = array('@[0-9]@','@\.@','@\,@','@\:@','@"@','@\?@','@\(@','@\)@','@\!@','@\/@','@\&@');
 $apostrophe = '&#'. '39;';
 $exclamation = '&#'. '33;';
 $nbsp = 'nb' . 'sp;';
 $quot = 'qu' . 'ot;';

// Reset our class globals and other variables
 $cloudy = '';
 $word_list = array();
 $cnt = 0;
 $high_count = 0;
 $low_count = 0;
 $totalwords = '';

 $query = db_query(
   "SELECT body
    FROM {$db_prefix}messages AS mess
    LEFT JOIN {$db_prefix}boards AS board
    ON mess.ID_BOARD = board.ID_BOARD
    ORDER BY posterTime DESC
    LIMIT 30", __FILE__, __LINE__);

 while ($row = mysql_fetch_assoc($query))
 {
   $words = $row['body'];
   $words = parse_bbc($words,1);
   $words = strip_tags($words); // Clean HTML tags
   $words = strtolower($words); // Make all words lower case
   $words = str_replace($apostrophe,'',$words); // remove apostrophes
   $words = str_replace($exclamation,'',$words); // remove exclamations
   $words = str_replace($nbsp,'',$words); // remove non-breaking space
   $words = str_replace($quot,'',$words); // remove quote
   $words = preg_replace($exclude_symbs, ' ', $words); // Strip excluded symbols
   $words = preg_replace($exclude_words, ' ', $words); // Strip excluded words
   $words = preg_replace('/\s\s+/', ' ', $words); // Strip extra white space
   $totalwords .= $words;
 }
 $words = '';
 $wordslist = explode(' ', $totalwords); // Turn it back into an array
 $word_count = array_count_values($wordslist); // Count word usage

// Clear out the big array of words.
 arsort($word_count); // Sort the array by usage count

// Here we build our smaller array of words that will be used.
 foreach ($word_count as $key => $val) {
   if (strlen($key) >= $min_length) {
     if ($high_count == 0)
       $high_count = $val;
     $word_list[$key] = $val;
     $cnt++;
   }
   if ($cnt >= $number_of_words) {
     $low_count = $val;
     break;
   }
 }


// Get the high and low, and calculate the range.
// This is used to weight the size of the words

 $range = ($high_count - $low_count) / 5;

// start form
echo '<script language="JavaScript" type="text/javascript">
<!--
function searchfromcloud ( selectedtype )
{
 document.scl.search.value = selectedtype ;
 document.scl.submit() ;
}
-->
</script>
<div style="text-align: center"><form name="scl" action="http://www.xxxx.com/portal/index.php?action=search2" method="post">
<input type="hidden" name="advanced" value="0"><input type="hidden" name="sort" value="ID_MSG|desc"><input type="hidden" name="search">

';

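// Sort the array randomly for the cloud
// (never ask array_rand() for more words than were actually collected)
 $random = (array) array_rand($word_list, min($number_of_words, count($word_list)));
// If fewer than $number_of_words words were found, $low_count was never set
 if ($low_count < 1) { $low_count = min($word_list); }
// Build the cloud's HTML
 foreach ($random as $value) {
   $fsize = intval($word_list[$value] / $low_count) *4;
   $fsize = $fsize + 3;
   if ($fsize > 17) { $fsize= 17; }
   echo '<a href="javascript:searchfromcloud(\''.$value .'\')" style="font-size:' . $fsize . 'pt;">' . $value . '</a> ';
 }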
echo '</form></div>';


Check the last part, with the JavaScript in there. Linking directly to the most recent post found by the search would be even better. But now you see what I mean  :coolsmiley:

JPDeni

I guess you can do the javascript thing, but the code I posted above will give a link to a search.

Quote: Linking to the most recent post of the search directly, would be even better.
And would require a complete rewrite of the block. There's no way to connect a word to a message number with the way it's written now, and doing a search for each word as it's printed out would be very server-intensive. People complain that it takes too long to run this as it is. :)

gijs

And you amazed me  :D Way simpler that way!

The only thing I added was ;maxage=100 to the search string, to make it faster (and so you don't click on a link in the cloud only to find an ancient post).
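
For reference, one way to put the search string together with the extra parameter is sketched below. cloud_search_url() is a made-up helper and the example.com address is a placeholder; the parameter names are the ones used earlier in this topic. Note that http_build_query() encodes the | in the sort value as %7C, which I would expect the forum to decode again, but I have not tested that on every version.

```php
<?php
// Hypothetical helper: builds the search URL, with optional extra parameters.
function cloud_search_url($scripturl, $word, array $extra = array())
{
    $params = array_merge(
        array('action' => 'search2', 'search' => $word, 'sort' => 'ID_MSG|desc'),
        $extra
    );

    // The third argument makes ';' the separator, as in the forum's own URLs
    return $scripturl . '?' . http_build_query($params, '', ';');
}

// Placeholder forum address, for illustration only
echo cloud_search_url('http://www.example.com/forum/index.php', 'paris', array('maxage' => 100)), "\n";
```

Keeping the extras in an array makes it easy to try other search options later without touching the link code.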