Mark Watson’s Artificial Intelligence Books and Blog

Great Java open source project: Nutch search engine

Mark Watson
Oct 31, 2005

Nutch is an Apache-licensed open source search engine project that I have been keeping an eye on for a while. One thing that makes this project especially compelling is that Doug Cutting, the author of the (fabulous) Lucene search library, is also a principal designer and implementer of Nutch. You can grab the source code using Subversion:

svn co http://svn.apache.org/repos/asf/lucene/nutch/

Nutch now contains two new modules: the Nutch Distributed File System (patterned after the Google File System) and a Java implementation of MapReduce (patterned after Google's MapReduce). So far, I have only been reading the source code (I have not built it or played with it yet!), but this stuff looks really good. Anyone want to start a search engine company? :-)
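
For readers new to the MapReduce model, here is a minimal word-count sketch in plain Java. It does not use Nutch's classes or API. The class name `WordCountSketch` and its methods are made up for illustration, and everything runs in memory on one machine, whereas the real thing distributes the map and reduce steps across many nodes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of the map/reduce idea; not Nutch code.
public class WordCountSketch {

    // Map phase: turn one document into a list of (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String document) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : document.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Reduce phase: group the pairs by word and sum the counts per word.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Driver: map every document, collect all pairs, then reduce them.
    static Map<String, Integer> run(List<String> documents) {
        List<Map.Entry<String, Integer>> allPairs = new ArrayList<>();
        for (String document : documents) {
            allPairs.addAll(map(document));
        }
        return reduce(allPairs);
    }

    public static void main(String[] args) {
        List<String> documents = List.of(
                "Nutch builds on Lucene",
                "Lucene is a search library");
        System.out.println(run(documents));
    }
}
```

The point of splitting the work this way is that each `map` call only needs its own document and each word's sum only needs that word's pairs, so both steps can in principle be spread over many machines.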


© 2023 Mark Watson