ParaCite Abstract Searching Announced

The ParaCite article-locating resource now supports searching by abstract, in addition to the existing facility for locating articles by citation. The new interface is available from the ParaCite front page:

http://paracite.eprints.org

The abstract search uses a common word discovery technique as a very simple (yet very effective) approach to identifying documents: the most common English words are eliminated, and the article is reduced to 10 keywords. These keywords can be stored as a persistent representation of the document or passed to other search engines. Initial tests with Google were very successful, with a search on the 10 keywords frequently returning the full article as the top result.
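
As a rough illustration of the keyword reduction described above, the following Python sketch strips common English words and keeps 10 of the remaining ones. It is not the ParaCite or ParaTools code. The function name keyword_signature, the tiny stop-word list, and the choice to rank the remaining words by frequency are all assumptions made for this example; the announcement does not say how the 10 keywords are selected.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption for this sketch);
# a real list of common English words would be far longer.
STOP_WORDS = frozenset(
    "a an and are as at be by for from has in is it of on or "
    "that the this to was were with".split()
)


def keyword_signature(text, size=10):
    """Reduce text to at most `size` keywords.

    Assumption: after removing stop words, the most frequent remaining
    words are kept. Ties are broken alphabetically so the result is
    deterministic.
    """
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(
        w for w in words if w not in STOP_WORDS and len(w) > 2
    )
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:size]]
```

Calling keyword_signature(abstract_text) returns a list of up to 10 lowercase words, which could then be stored alongside the document or joined into a search query.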

The ParaCite interface offers a direct link to the Google search once an abstract has been processed, and also checks a database of pre-parsed documents for possible immediate matches. This database currently contains a large portion of the archives based at the University of Southampton, and it will be expanded to provide wider coverage of existing open repositories.
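
The two lookup paths described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the ParaCite implementation: the function names, the in-memory dictionary standing in for the database of pre-parsed documents, the overlap-count matching rule, and the min_overlap threshold are all hypothetical. The Google URL is the ordinary public search form with a q parameter.

```python
from urllib.parse import urlencode


def google_search_url(keywords):
    """Build a Google search link from a list of keywords."""
    return "https://www.google.com/search?" + urlencode(
        {"q": " ".join(keywords)}
    )


def best_matches(query_keywords, database, min_overlap=5):
    """Return ids of stored documents whose keyword sets overlap the query.

    `database` maps a document id to its stored keyword list (a
    hypothetical stand-in for the pre-parsed document database).
    Matching by number of shared keywords is an assumption made for
    this sketch.
    """
    query = set(query_keywords)
    scored = []
    for doc_id, keywords in database.items():
        overlap = len(query & set(keywords))
        if overlap >= min_overlap:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]
```

A caller might first try best_matches against the local store and fall back to the link from google_search_url when nothing is found.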

The next release of the ParaTools Perl modules will include the code for the generation of the keywords from documents, and the web service will also be updated to allow for searching by keyword signature.