Tidbits @ Kassemi

A collection of opinions, thoughts, tricks and misc. information.

Sunday, April 16, 2006

 

Python-based full-text search for PostgreSQL...

Hey everyone. I'm back... I've just been too busy, and have thus ignored this blog. But I have a good present for you today: a very simple relevance-based full-text search engine for your Python-based programs... If you've looked into tsearch2 but are on a hosting provider that won't let you install it, you've had to go looking around (or program one, if you're like me). I didn't write the stemmer module; all credits are located in the stemmer module's source.

Here's the source for this one:

search.tar.gz

Update: For best results, modify the frequency ranking so that each additional mention of a term counts for less than the one before it... The scale I'm using is a simple 1/x^2. This will prevent some issues with abuse by repetition of a term again and again, but you might want to try some sort of limiting logarithmic function or something along those lines to be just a bit more careful.
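To make the damping idea concrete, here's a rough sketch of what it could look like. This is not the code from search.tar.gz; the function names are made up for illustration, and I'm assuming "1/x^2 scale" means the k-th mention of a term adds 1/k^2 to its score.

```python
import math


def damped_inverse_square(count):
    """Hypothetical helper: the k-th mention of a term adds 1/k**2.

    The running total can never exceed pi**2 / 6 (about 1.645),
    so repeating a term over and over stops paying off quickly.
    """
    return sum(1.0 / (k * k) for k in range(1, count + 1))


def damped_log(count):
    """Hypothetical alternative: logarithmic damping.

    Keeps growing with the count, but very slowly. Zero mentions score 0.
    """
    return math.log1p(count)


if __name__ == "__main__":
    for n in (1, 2, 10, 1000):
        print(n, damped_inverse_square(n), damped_log(n))
```

The inverse-square version has a hard ceiling, while the logarithmic one never quite flattens out, so which one is "more careful" depends on how much you want a tenth mention to count compared to the first.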

Let's say you have a forum that you'd like to search with a post table similar to the following:

id | content # Name: forum_posts

That oversimplifies it, but anyway...


# Populate the search database
import search
search.populate('forum_posts', 'id', 'content')


So you've populated with the posts that you've already got. Now when you want to add another post:


# Add two posts, id 1034 and 1035 with the text specified.
import search
search.add(1034, 'This is functional, I hope.')
search.add(1035, 'This as well, maybe.')


And when you want to search the database for a post:


import search
res = search.search('functional')


Which will return a list of the post ids that match the search, ordered by descending relevance... There are no boolean-type searches, etc... Everything is done via a simple (incomplete but functional) relevance function... You can edit it if you'd like. It works now, but I plan to tweak it in the future.
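If you're curious what a relevance-ordered lookup looks like in general, here's a tiny in-memory sketch. It is not how the search module works internally: `TinyIndex` and its methods are hypothetical names, there's no database and no stemming, and relevance here is just the raw count of matching terms.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Hypothetical in-memory index: term -> {doc_id: occurrence count}."""

    def __init__(self):
        self.postings = defaultdict(Counter)

    def add(self, doc_id, text):
        for term in tokenize(text):
            self.postings[term][doc_id] += 1

    def search(self, query):
        """Return matching doc ids, most relevant first (ties by id)."""
        scores = Counter()
        for term in tokenize(query):
            for doc_id, count in self.postings.get(term, {}).items():
                scores[doc_id] += count
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [doc_id for doc_id, _ in ranked]


if __name__ == "__main__":
    index = TinyIndex()
    index.add(1034, 'This is functional, I hope.')
    index.add(1035, 'This as well, maybe.')
    print(index.search('functional'))
```

Swapping the raw count for a damped score is the obvious next step if repeated terms start dominating the results.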

Anyway, have fun!

James

Sunday, April 02, 2006

 

Matchstd.com Hits CNN!

Hello everyone,

My first big site ever, http://www.matchstd.com, has made CNN headline news this morning... It's great to see that happen... I only wish that I had been informed. Late into last night I was programming when I had an urge to eat. So I went to the kitchen, opened the fridge, pulled out a block of swiss, a tortilla and a hard-boiled egg... I cut the swiss into tiny blocks, placed it in the middle of the tortilla, peeled the shell off of the egg, and crumbled its contents on top of the cheese... Opened the microwave, shoved it in, set it for a minute, and turned the TV on to HLN... I usually watch that and then a few hours of C-SPAN before going to bed...

The microwave beeped and I went to get my food, as I heard over the speakers: "... new service allows people with certain medical attentions to find each other, free, and anonymous..." I rushed over to the TV, hit the rewind button on the TiVo, and sure as hell we'd made the news!

So I ran into my roommate's room, and yelled at her to get up. We watched the segment, took a look at our membership numbers, and found that we'd gotten about 40 members since we had checked earlier in the morning! We're experiencing a heavy load at the moment... At JUST the wrong time, too. You heard me ramble on about the new Zippo for mod_python... Well, that's not ready yet. I was hoping to get a new site rolled out by Wed that would handle that kind of load, but now I'm really rushing it...

Anyway, good news... Just not great timing. mpZippo will be running a version of the site by Wednesday, hopefully. It seems to be a lot faster with certain tasks (like displaying raw data), and a little slower with others (continuations, etc)... But it's a hell of a lot more stable all the way around, and we won't be giving users any nasty little service drops...

Take it easy everyone,
James

Saturday, April 01, 2006

 

April Fool's on /.

Hahahahaha!

I still can't stop laughing. This has got to be the funniest thing I've seen in a while now... It's probably because I'm real tired, but before it disappears into the dark abyss of internet history, here's slashdot today:



Happy April fool's day everyone!

James

Oh heck, almost forgot... My room-mate/business partner just completed a few interviews today for the matchstd.com site, and we made the news! We were on channel 7 and channel 4 news here in Albuquerque. As soon as I get some video I'll be sure to post it.
