Tidbits @ Kassemi

A collection of opinions, thoughts, tricks and misc. information.

Sunday, April 16, 2006

 

Python-based full-text search for PostgreSQL...

Hey everyone. I'm back... I've just been too busy, and this blog got ignored as a result. But I have a good present for you today: a very simple relevance-based full-text search engine for your Python-based programs. If you've looked into tsearch2 but are on a hosting provider that won't let you install it, you've had to go looking around for an alternative (or program one, if you're like me). The stemmer module is not something I wrote; all credits are located in the stemmer module's source.

Here's the source for this one:

search.tar.gz

Update: For best results, modify the frequency ranking so that each additional mention of a term counts for less than the one before it. The scale I'm using is a simple 1/x^2. This will prevent some issues with abuse by repetition of a term again and again, but you might want to try some sort of limiting logarithmic function or something along those lines to be just a bit more careful.
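To make the idea concrete, here's a rough sketch of the two damping options. It is not the code from search.tar.gz. The function names are made up for illustration, and it assumes one reading of the 1/x^2 scale: the x-th mention of a term adds 1/x^2 to the score.

```python
import math

def damped_score(count):
    """Hypothetical helper: the k-th mention of a term adds 1/k^2 to the score."""
    return sum(1.0 / (k * k) for k in range(1, count + 1))

def log_score(count):
    """Hypothetical alternative: logarithmic damping of the raw mention count."""
    return math.log(1 + count)

# Compare how the two scales reward repetition.
for mentions in (1, 2, 10, 1000):
    print(mentions, damped_score(mentions), log_score(mentions))
```

With the 1/x^2 version the total can never exceed pi^2/6 (about 1.645), no matter how many times a term is repeated. The log version keeps growing, just slowly, so you may still want to cap it.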

Let's say you have a forum that you'd like to search with a post table similar to the following:

Table forum_posts: id | content

That oversimplifies it, but anyway...


# Populate the search database from the forum_posts table.
import search
search.populate('forum_posts', 'id', 'content')


So you've populated the search database with the posts that you've already got. Now when you want to add another post:


# Add two posts, id 1034 and 1035 with the text specified.
import search
search.add(1034, 'This is functional, I hope.')
search.add(1035, 'This as well, maybe.')


And when you want to search the database for a post:


import search
res = search.search('functional')


Which will return a list of the post ids that match the search, ordered by descending relevance... There are no boolean-type searches, etc... Everything is done via a simple (incomplete but functional) relevance function... You can edit it if you'd like. It works now, but I plan to tweak it in the future.
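If you're curious what relevance-ordered ids look like in practice, here's a toy in-memory sketch. It is not the module's actual implementation: the real thing keeps its index in PostgreSQL and runs words through the stemmer, while this version skips both. The names toy_add and toy_search, the extra post 1036, and the tie-breaking by id are all made up for the example.

```python
import re
from collections import Counter, defaultdict

# Hypothetical in-memory index: term -> {post_id: number of mentions}.
_index = defaultdict(dict)

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words (no stemming)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def toy_add(post_id, text):
    """Record how often each term occurs in the given post."""
    for term, count in Counter(tokenize(text)).items():
        _index[term][post_id] = count

def toy_search(query):
    """Return matching post ids, most relevant first (ties broken by id)."""
    scores = defaultdict(float)
    for term in tokenize(query):
        for post_id, count in _index.get(term, {}).items():
            # The k-th mention adds 1/k^2, so repeating a term pays off less each time.
            scores[post_id] += sum(1.0 / (k * k) for k in range(1, count + 1))
    return sorted(scores, key=lambda pid: (-scores[pid], pid))

toy_add(1034, 'This is functional, I hope.')
toy_add(1035, 'This as well, maybe.')
toy_add(1036, 'Functional, functional, functional!')
print(toy_search('functional'))
```

Post 1036 mentions the term three times and so ranks ahead of 1034, but only by a little, since each repeat is worth less than the one before.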

Anyway, have fun!

James
