Mark Watson’s Artificial Intelligence Books and Blog

Nutch: a platform is born

Mark Watson
Jan 2, 2007

I have used Nutch for two contracting jobs and Lucene for many jobs. Until today, I have viewed Nutch simply as:

  • Quick to configure for the websites you want to spider, and easy to administer while spidering

  • Trivial to run as a search web application

  • Web service provider (OpenSearch API)
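
As a rough illustration of the web service use, here is a minimal Python sketch of a client for an OpenSearch-style endpoint. The base URL, the parameter names (`query`, `start`, `hitsPerPage`), and the sample RSS response are assumptions made for illustration and should be checked against your own Nutch installation.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_search_url(base_url, query, start=0, hits_per_page=10):
    """Build a search URL. The parameter names are assumptions, not taken from the post."""
    params = {"query": query, "start": start, "hitsPerPage": hits_per_page}
    return base_url + "?" + urlencode(params)


def parse_results(rss_text):
    """Extract title and link from each <item> in an RSS 2.0 response."""
    root = ET.fromstring(rss_text)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        }
        for item in root.iter("item")
    ]


# Hypothetical response body, hand-written for this example.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Search results (sample)</title>
    <item>
      <title>Plugin overview</title>
      <link>http://example.com/plugins</link>
    </item>
    <item>
      <title>Crawl configuration</title>
      <link>http://example.com/crawl</link>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    print(build_search_url("http://localhost:8080/opensearch", "lucene plugins"))
    for hit in parse_results(SAMPLE_RSS):
        print(hit["title"], hit["link"])
```

In a real client the response text would come from an HTTP request to the URL built above; parsing a canned string keeps the sketch runnable without a server.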

Today, however, I started looking more closely at the underlying Hadoop architecture (modeled on Google's distributed file system and MapReduce library) and at both the available plugins and the plugin architecture. New opinion: Nutch is a platform for building more complex web applications and knowledge management applications.
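
For readers who have not seen the map reduce idea, here is a minimal in-memory Python sketch of the map, shuffle, and reduce steps, counting crawled pages per host. This is not the Hadoop API; the function names and the example URLs are hypothetical, and a real job would spread each phase across many machines.

```python
from collections import defaultdict
from urllib.parse import urlparse


def map_phase(records, mapper):
    """Apply the mapper to every record and yield its (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group all values by key, as the framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer once per key to that key's list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def host_mapper(url):
    """Emit (host, 1) for each crawled URL."""
    yield urlparse(url).netloc, 1


def sum_reducer(key, values):
    """Add up the counts collected for one host."""
    return sum(values)


def count_pages_per_host(urls):
    return reduce_phase(shuffle(map_phase(urls, host_mapper)), sum_reducer)


if __name__ == "__main__":
    crawled = [
        "http://example.com/a",
        "http://example.com/b",
        "http://example.org/index.html",
    ]
    print(count_pages_per_host(crawled))
```

The mapper and reducer are the only parts specific to the problem; the grouping step in between is what a framework like Hadoop handles for you at scale.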
