Full text search in JRuby with ActiveLucene

ActiveLucene is a Lucene interface similar to the one used by ActiveRecord and ActiveModel.

This means that you can generate a scaffold in a Rails application, go to the model, replace ActiveRecord::Base with ActiveLucene::Document and everything should be still working with the difference that the model should be being saved in a Lucene index rather than in a relational database.

As the documents are now being saved in a Lucene index, we can find them using the Lucene query syntax, without forgetting that as ActiveLucene has a similar interface to ActiveRecord, we can also find them by an id, find the first, find the last, etc.

The base class of ActiveLucene is called Document because as the documents of another systems, it doesn’t have a defined structure and all the attributes are dynamic, so you do not have to worry about them.

ActiveLucene was extracted from Lunr, a not-enterprise search server from which I will speak in a future post, but as today, it can also be used in applications where having a relational database doesn’t make much sense and especially in the ones where you want to search documents by text.

The restriction of ActiveLucene is that it only works with JRuby, since Lucene is a Java library that runs on the JVM. Case this isn’t a problem, using ActiveLucene should be much simpler and lighter than using a solution such as Sphinx or Solr, depending of the case this may result in great benefits.

Lucene has a lot of features and ActiveLucene doesn’t support them all, but they will be added to the project as needed. As today, there is support for highlighting and a little for paging, as can be seen in the following video, along with a basic demonstration of the tool:

Full text search in JRuby with ActiveLucene from Diego Carrion on Vimeo.

If you appreciate this work, please consider to recommend me at Working With Rails.

ThinkingSphinx exits, enters ActsAsSolrReloaded

I used to work with ThinkingSphinx until the day I needed to index documents with dynamic attributes. As Sphinx indexes data from the result of an SQL query, the goal didn’t seem possible.

I decided then to take another look at Solr. Solr, differently from Sphinx, is an HTTP server and indexes data from posted XML documents. Each document can have a different structure, so it fits perfectly with the model of dynamic attributes.

Thiago Jackwin, aka RailsFreaks, created a plugin that integrates Rails with Solr called acts_as_solr. The plugin is very good, but Thiago disappeared from the map some time ago, he lost the domain, left GitHub and doesn’t answer emails any more. As a result of this, different forks and forks of forks have been created and the Git tree became a mess.

Annoyed with the situation of the project, I decided to fork the fork I liked the most and created a new repository called acts_as_solr_reloaded, with new features. This way, I hope the project gets easier to be found and that it gives more trust. I’m also compromising myself to keep the repository up to date and to pull contributions.

As today, the new features acts_as_solr_reloaded comes with are:

  • support for dynamic attributes
  • geo-localization or geo-spatial search
  • integration with acts-as-taggable-on
  • highlighting
  • relevance ranking

To support geo-localization in Solr, it needed to be updated to the version 1.4 .

To make easier the experience of working with dynamic attributes and geo-localization, a few generators that setup the database were added to the project. You can use them like this:

script/generate dynamic_attributes_migration
script/generate local_migration

You can after define your model this way:

class Document < ActiveRecord::Base
  acts_as_solr :dynamic_attributes => true,
               :spatial => true,
               :taggable => true
end

Note that with :taggable => true you dont need to define your model as acts_as_taggable_on :tags, it’s done automatically.

To better demonstrate the new features in acts_as_solr_reloaded, I recorded a small video of five minutes showing the functionalities in action, hope you like it:

New features in ActsAsSolrReloaded from Diego Carrion on Vimeo.

Note that in the video I used acts_as_taggable_on :tags and :taggable => true, at the time of the recording this both declarations were necessary, not anymore.

If you appreciate this work, please consider to recommend me at Working With Rails.