A collection of opinions, thoughts, tricks and misc. information.
Hey everyone. I'm back... I've just been too busy, and have thus ignored this blog. But I have a good present for you today: a very simple relevance-based full-text search engine for your Python-based programs. If you've looked into tsearch2 but are on a hosting provider that might not let you install it, you've had to go looking around (or program one, if you're like me). The stemmer module is not something I wrote; all credits are located in the stemmer module's source.
Here's the source for this one:
search.tar.gz
Update: For best results, modify the frequency ranking so that repeated mentions of a term count for progressively less. The scale I'm using is a simple 1/x^2. This prevents some abuse by repetition of a term again and again, but you might want to try some sort of limiting logarithmic function, or something along those lines, to be a bit more careful.
Let's say you have a forum that you'd like to search, with a post table similar to the following:
Table: forum_posts
id | content
That oversimplifies it, but anyway...
# Populate the search database from the existing forum_posts table
import search
search.populate('forum_posts', 'id', 'content')
So you've populated with the posts that you've already got. Now when you want to add another post:
# Add two posts, id 1034 and 1035 with the text specified.
import search
search.add(1034, 'This is functional, I hope.')
search.add(1035, 'This as well, maybe.')
And when you want to search the database for a post:
import search
res = search.search('functional')
This returns a list of the post ids that match the search, ordered by relevance (descending). There are no boolean-type searches, etc. Everything is done via a simple (incomplete but functional) relevance function. You can edit it if you'd like. It works now, but I plan to tweak it in the future.
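If you do want to tweak the ranking, here's a rough sketch of the damping idea from the update above. It assumes "1/x^2 scale" means the k-th mention of a term adds 1/k^2 to the score, and it shows a logarithmic alternative next to it. The function names (damped_inverse_square, damped_log, score) are made up for illustration and are not part of the search module, and the plain whitespace split stands in for whatever tokenizing and stemming the real module does.

```python
import math
from collections import Counter


def damped_inverse_square(count):
    """Score for `count` mentions, where the k-th mention adds 1/k^2.

    The total is bounded by pi^2 / 6, so repeating a term stops paying off.
    """
    return sum(1.0 / (k * k) for k in range(1, count + 1))


def damped_log(count):
    """Alternative: logarithmic damping (0 for no mentions, 1 for one)."""
    return 1.0 + math.log(count) if count > 0 else 0.0


def score(text, term, damp=damped_inverse_square):
    """Score one document for one term using a damped term frequency.

    Hypothetical helper: tokenizing is a plain lowercase whitespace split.
    """
    words = text.lower().split()
    return damp(Counter(words)[term.lower()])
```

With the inverse-square version, a post that repeats a term hundreds of times can never score above pi^2/6 for that term, while the logarithmic version keeps growing, just very slowly. Which one fits better probably depends on how much repetition you want to tolerate.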
Anyway, have fun!
James