Dav Yaginuma;
Husband, Father, Hacker, Thinker, Maker;
San Francisco.



    Comments


    Dan

    Came here via your link in the boingboing comments. Interesting stuff! In a quick look over the document at headmap you linked, two things stand out:

    1) Just as the propagation of significance through a word is scaled down by how common (document-diverse) the word is, propagation through a document should be scaled down by how word-diverse the document is. This would help maintain specificity.

    2) Instead of dividing by the square root of word occurrences as a scaling factor, I'm recklessly guessing, based on information theory, that it should probably be something related to -log(probability of word occurrence). Same-but-reversed for documents: divide by -log(number of indexed words in document / total indexed words). I know, I know, math first, then post. Sorry.

    Dan

    Did I say divide by -log(P)? I meant multiply.
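A minimal sketch of the weighting Dan suggests, with the "multiply" correction applied. It is an illustration under assumptions, not the code from the post: the function name `information_weights` and the toy corpus are hypothetical, and "probability of word occurrence" is read here as the fraction of documents that contain the word, which is only one possible interpretation.

```python
import math
from collections import Counter


def information_weights(docs):
    """Return (word_weight, doc_weight) for docs: {doc_id: [indexed words]}.

    Hypothetical helper. Both weights are meant to be used as
    multipliers on whatever significance is propagated through a
    word or a document.
    """
    total_docs = len(docs)
    total_words = sum(len(words) for words in docs.values())

    # Document frequency: how many documents each word appears in.
    doc_freq = Counter()
    for words in docs.values():
        doc_freq.update(set(words))

    # Assumption: P(word) = fraction of documents containing the word.
    # A word found in every document gets weight 0 and propagates nothing.
    word_weight = {
        word: -math.log(df / total_docs) for word, df in doc_freq.items()
    }

    # Document side, as stated in the comment:
    # -log(indexed words in document / total indexed words).
    # Documents holding a larger share of the index get a smaller weight.
    doc_weight = {
        doc_id: -math.log(len(words) / total_words)
        for doc_id, words in docs.items()
        if words  # skip empty documents to avoid log(0)
    }
    return word_weight, doc_weight


if __name__ == "__main__":
    corpus = {
        "a": ["cat", "dog"],
        "b": ["cat", "fish", "bird", "ant"],
    }
    words, documents = information_weights(corpus)
    print(words)
    print(documents)
```

In this toy corpus, "cat" appears in both documents, so its weight is zero, and the shorter document "a" ends up with a larger weight than the longer document "b".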

    Maciej Ceglowski

    You can check out my Search::ContextGraph module (http://www.cpan.org) for an example of how to add local and global term weighting into the model. It's Perl, but the weighting code will be analogous in Java or C#.

    Dav

    Dan, thanks, I'll try this out.

    And Maciej, I'll also look at your Perl module, thanks!
