Apache Nutch®

From Oxxus Wiki


Apache Nutch® is an open-source web-search software project.
It is at its strongest when configured to crawl in local mode and post its results to Apache Solr; alternatively, it can run entirely on Hadoop.

Getting started with Apache Nutch®

To start, first obtain the latest stable release, 1.3, which is available as a binary or a source package. Once the desired release is
downloaded, unpack it at the desired hosting destination. Nutch is developed in Java, and its classes are exposed as command-line options of the /bin/nutch script.
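
As a rough illustration of how the script might be driven in local mode, the Python sketch below only assembles an argument list for a crawl. The helper name build_crawl_command, the install path, the seed directory and the Solr URL are all hypothetical. The crawl command and its -solr, -depth and -topN options are given as recalled from the Nutch 1.x usage text, so check them against the usage output of the script in your own release before relying on them.

```python
from pathlib import PurePosixPath
from typing import List, Optional


def build_crawl_command(
    nutch_home: str,
    seed_dir: str,
    solr_url: Optional[str] = None,
    depth: int = 3,
    top_n: int = 5,
) -> List[str]:
    """Assemble (but do not run) an argument list for a local-mode crawl.

    nutch_home is the directory where the release was unpacked (hypothetical).
    seed_dir is a directory of text files listing the start URLs (hypothetical).
    The option names are assumptions based on the Nutch 1.x usage text.
    """
    # The launcher script sits under bin/ inside the unpacked release.
    cmd = [str(PurePosixPath(nutch_home) / "bin" / "nutch"), "crawl", seed_dir]
    # Only ask Nutch to post results to Solr when a Solr URL was supplied.
    if solr_url is not None:
        cmd += ["-solr", solr_url]
    # Link depth to follow and the number of top-scoring pages per round.
    cmd += ["-depth", str(depth), "-topN", str(top_n)]
    return cmd
```

The resulting list can be handed to subprocess.run(cmd, check=True) once the options have been confirmed for the installed release.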

Each configuration, whether a local-mode crawler or a Hadoop-based project, is explained in detail at Nutch with Solr or Hadoop based.

In both configurations Nutch can be regarded as a subproject, since its results are posted to Solr or Hadoop for final processing.
