Borort

The sky’s the limit…

Archive for the ‘Search Engines’ Category

Freebase Parallax

with 2 comments

Written by borort

December 4, 2008 at 4:07 am

Solr post.jar – post to a Solr port other than 8983

with 2 comments

What if you want to post data to a Solr instance whose HTTP port is not 8983, using the post.jar that comes with the Solr package?

For example, suppose your Solr HTTP address is: http://localhost:8080/solr

To post data to this Solr index, override the update URL with the url system property: java -Durl=http://localhost:8080/solr/update -jar post.jar *.xml
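If you would rather not use post.jar at all, the same thing can be sketched in a few lines of Python. This is only a rough sketch, assuming the XML update handler lives at /update under the base address from the example above; the function name build_update_request is made up for illustration and is not part of Solr.

```python
import urllib.request


def build_update_request(xml_payload: str,
                         base_url: str = "http://localhost:8080/solr") -> urllib.request.Request:
    """Build (but do not send) a POST request for Solr's XML update handler.

    base_url is the example address from the post; change the port as needed.
    """
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/update",
        data=xml_payload.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_update_request("<commit/>")
    print(req.get_method(), req.full_url)
    # Sending it needs a running Solr instance, so it is left commented out:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.status)
```

Building the request separately from sending it makes it easy to check the URL and port before anything touches the index.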

That’s all… :)

Written by borort

November 19, 2008 at 5:14 pm

Posted in Search Engines, Solr

Select/delete all items in Solr

with one comment

To select all documents that have a value for a given field in Solr, you can use the query some_item:[* TO *]. Documents in which this field is missing will not be selected.

To select all documents, use the unique key field defined in /conf/schema.xml. For example, with <uniqueKey>solr_id</uniqueKey> you can use solr_id:[* TO *].
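The range query has to be URL-encoded when it goes into a request, which is easy to get wrong by hand. Here is a small Python sketch of that step. It assumes the standard /select handler and reuses the localhost:8080 address from the earlier post; build_select_url and the rows default are my own illustrative choices.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_select_url(field: str,
                     base_url: str = "http://localhost:8080/solr",
                     rows: int = 10) -> str:
    """Return a select URL matching every document that has a value for `field`."""
    query = f"{field}:[* TO *]"  # open-ended range: any value at all
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/select?{params}"


if __name__ == "__main__":
    url = build_select_url("solr_id")
    print(url)
    # Decode the query string again to confirm the range query survived encoding.
    print(parse_qs(urlsplit(url).query)["q"][0])
```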

Now that you have all documents selected, you can delete them :D

To delete all documents in Solr, send this update XML:

<delete><query>solr_id:[* TO *]</query></delete>

and of course you have to commit:

<commit />
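The two messages can also be generated rather than typed, which avoids small slips in the range syntax. A minimal Python sketch follows, assuming solr_id is your unique key as in the example above; the helper names are hypothetical. It only builds the XML strings, so posting them to the /update handler is still up to you.

```python
import xml.etree.ElementTree as ET


def delete_by_query_xml(query: str) -> str:
    """Build a <delete><query>...</query></delete> update message."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "query").text = query
    return ET.tostring(delete, encoding="unicode")


def commit_xml() -> str:
    """Build the commit message that makes the delete visible."""
    return ET.tostring(ET.Element("commit"), encoding="unicode")


if __name__ == "__main__":
    # solr_id is the uniqueKey from the example schema; use your own field name.
    print(delete_by_query_xml("solr_id:[* TO *]"))
    print(commit_xml())
```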

That’s all !

Source: http://blog.tremend.ro/2007/03/02/selectdelete-all-items-in-solr/

Written by borort

July 13, 2008 at 2:17 pm

Posted in Search Engines, Solr

How Search Engines Work

with one comment

 

“Spiders” take a Web page’s content and create key search words that enable online users to find pages they’re looking for.
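To make the idea of "key search words" a bit more concrete, here is a toy Python sketch of an inverted index: each word points to the pages that contain it. This is my own simplification, not how any particular search engine works. Real spiders also fetch pages, follow links, and rank results, and the page URLs and texts below are invented.

```python
import re
from collections import defaultdict


def build_index(pages: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercase word to the set of page URLs that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return dict(index)


if __name__ == "__main__":
    # Invented sample pages, for illustration only.
    pages = {
        "http://example.com/a": "Solr search tips",
        "http://example.com/b": "How search engines work",
    }
    index = build_index(pages)
    print(sorted(index["search"]))
```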

Source: http://computer.howstuffworks.com/search-engine1.htm

Written by borort

June 17, 2008 at 4:01 am

Facets and Tagging

without comments

Written by borort

June 11, 2008 at 5:15 pm

RawSugar Faceted Search

without comments

Written by borort

June 11, 2008 at 5:01 pm

Re-usable metadata, re-usable content

without comments

Written by borort

May 27, 2008 at 3:21 pm

Carrot2 – Open Source Framework for Building Search Clustering Engines

without comments

Carrot2 is an Open Source Search Results Clustering Engine. It can automatically organize (cluster) search results into thematic categories:

[Screenshot: search results clustered with Carrot2 (live demo)]

Carrot2 provides an architecture for acquiring search results from various sources (YahooAPI, GoogleAPI, MSN Search API, eTools Meta Search, Alexa Web Search, PubMed, OpenSearch, Lucene index, SOLR), clustering the results and visualising the clusters. Currently, 5 clustering algorithms are available that are suitable for different kinds of document clustering tasks.

Thanks to its flexible architecture, high quality and a friendly BSD-like license, Carrot2 has been successfully used in a number of commercial and research applications and resulted in a number of interesting publications. To get started, please have a look at live demos and the downloads section. If you have any questions or comments about Carrot2, please let us know.

For consulting services, installation, maintenance and text mining expertise, please contact the Carrot2 spin-off company called Carrot Search. Carrot Search offers Lingo3G — the third generation high-performance document clustering engine featuring hierarchical clustering, ontologies, synonyms and advanced tuning capabilities.

Source: http://project.carrot2.org/index.html

Written by borort

May 18, 2008 at 6:32 pm