MongoDB has good support for indexing and search, including prefix matching for AJAX completion lists
I have been spoiled by great support for indexing and search in relational databases (e.g., Sphinx, native search in PostgreSQL and MySQL, etc.)
I was pleased to discover, after a little bit of hacking this morning, how easy it is to do indexing and search using the MongoDB document-centered database. I have two common use cases for search, and MongoDB seems to handle both of them fairly well:
Search for words inside of text fields
Efficient word prefix search to support AJAX "suggest" style lists
My approach does require combining search results for multiple search terms in application code, but that is OK. Assuming the use of MongoRecord, here is a code snippet:
class Recipe < MongoRecord::Base
collection_name :recipes
fields :name, :directions, :words
def to_s
"recipe: #{name} directions: #{directions[0..20]}..."
end
def Recipe.make collection, name, directions
collection.insert({:_id => Mongo::ObjectID.new, :name => name,
:directions => directions,
:words => (name + ' ' + directions).split.uniq})
end
end
host = 'localhost'
port = Mongo::Connection::DEFAULT_PORT
MongoRecord::Base.connection = Mongo::Connection.new(host,port).db('mongorecord-test')
db = MongoRecord::Base.connection
coll = db.collection('recipes')
coll.remove({})
coll.create_index(:words, Mongo::ASCENDING)
Recipe.make coll, 'Rice Soup', 'Cook the rice, then add extra water to thin it out.'
Recipe.make coll, 'Cheese and Rice Crackers', 'Slice the cheese and layer on top of crackers.'
puts "\nSimple find"
puts Recipe.find_by_name(:name => 'Rice Soup').to_s
puts "\nFind recipe by regular expression (ignoring case) in array of words /water/i"
Recipe.find(:all, :conditions => {:words => /^water/i}).each { |row| puts row.to_s }
According to the MongoDB documentation, a regular expression match like /^water/i will use an index just as a relational database match in the form like 'water%' does.
I am still in a learning mode with MongoDB, so I would appreciate any comments on improving this aproach.