October 6, 2009 4:41 AM PDT
Just a quick update to let everyone know that this project is not dead. It may go on hiatus from time to time, but we'll never stop trying to build the world's first non-profit search engine.
Last week we moved the Nutch server into the future home of our community radio station. The room is cleaner, quieter, and (sometimes) cooler than the rest of the ILC. We also installed Windows Server 2003 on the server instead of Linux. We were getting a lot of Java errors during big crawls on Linux, so we're hoping the new OS will magically make some of them go away. We'll eventually move back to a Linux-based system but, for now, we'll use Windows Server.
The server is currently crawling a list of 2.2 million websites. It's on day 3 of a crawl that should take about 5 days. We'll know on Friday whether or not it's successful. If it is, we'll try for 10 million URLs.