The Little Search Engine That Could

    • 3727 posts
    March 7, 2009 6:56 PM PST
    Ever since Ian (hillscott) arrived in Nong Khai we've been busy rebuilding and improving Isara's search engine. Ian has been a HUGE help and we're fairly confident we can get the search engine to provide relevant results. It won't be nearly as robust or accurate as Google, but it will be a nice alternative. Which is all we're shooting for.

    Currently we have our servers crawling the Internet Movie Database and Wikipedia. Those sites have a total of 5,000,000 pages so it's going to take a few days to crawl and index. While that's happening Ian is getting the pagination ( << 1 2 3 4 >> ) to work properly and also a "Did you mean?" function for spelling suggestions.
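    To illustrate what a "Did you mean?" function does, here is a minimal sketch in Python. It is not the code Ian is writing: the vocabulary list, the function name did_you_mean, and the 0.8 similarity cutoff are all assumptions made for the example. It simply suggests the closest known term when the query is not found.

```python
from difflib import get_close_matches

# Hypothetical vocabulary; a real engine would draw its terms from the index.
INDEXED_TERMS = ["casablanca", "wikipedia", "thailand", "search", "engine"]


def did_you_mean(query, vocabulary=INDEXED_TERMS, cutoff=0.8):
    """Return the closest known term, or None if the query is already
    known or nothing in the vocabulary is similar enough."""
    word = query.lower()
    if word in vocabulary:
        return None  # spelled correctly, no suggestion needed
    # get_close_matches ranks candidates by a similarity ratio between 0 and 1.
    matches = get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    print(did_you_mean("casablanka"))
```

    A production version would more likely rank suggestions by how often each term appears in the index, which this sketch ignores.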

    Ever since the "ACORN" controversy from the last US election we've been struggling to think of a new name for our little search engine project. If you have any ideas let us know. Until then it will just be called Isara Search.

    Our previous blog was named "acorn" so I'm in the process of creating another blog which will document the continuing adventures of building a search engine. Will post a link once it's up.
    • 841 posts
    March 7, 2009 8:15 PM PST
    DOUBLE POST ALERT! DOUBLE POST ALERT! FOR SHAME, PK! :D

    I agree that a name change is in order, just to avoid unwanted connections. I will think about it and let you know if I come up with anything. Glad to hear the project is coming along nicely. :)

    Keep up the great work!
    • 3727 posts
    March 9, 2009 12:39 AM PDT
    During the last US election there was a group that called itself ACORN. They were registering a lot of people to vote. The problem was some of the people existed and some didn't (e.g. Mickey Mouse). Also, they typically only wanted you to vote for the candidate they supported. The TV news talked about ACORN and the controversy 24/7.

    It was then we decided the name ACORN had become tainted. So, for now, the search engine project will just be called Isara Search. Pretty creative, I know.

    We have moved the old blog posts over to a new system. It has a new look and a new attitude.
    http://www.isara.org/blog/

    :)
    • 3727 posts
    March 30, 2009 11:08 PM PDT
    The search engine now has a new home, a dedicated internet connection, and some new features (thanks, Ian!).

    You can check out the latest updates at http://www.isara.org/blog/ .
    • 5130 posts
    March 31, 2009 6:25 AM PDT
    Not sure whether to post a follow-up question here or at the link. But is a very small window-unit a/c an option Isara would consider for the server room, to be used only in extreme heat? I don't know what the technical consequences are of running that kind of equipment in temps in excess of 100F.
    • 3727 posts
    March 31, 2009 7:17 AM PDT
    We'll definitely need to get air conditioning for that room or else the equipment will have a very short life. I haven't seen any window units for sale over here. All they have are ones similar to what was in your condo (one unit inside and one outside). They cost about $600. Since the room is so small and sealed fairly well, it shouldn't take a lot of energy to cool it down.
    • 5130 posts
    March 31, 2009 7:26 AM PDT
    I wonder if Julian over at MutMee has any contacts to equipment providers.  I never saw them but I know he has a couple rooms that have a/c.  Not sure what kind of unit he uses, but being in the business he is, maybe he can put you in touch with a supplier.  Just thinking.
    • 3727 posts
    October 6, 2009 4:41 AM PDT
    Just a quick update to let everyone know that this project is not dead. It may go on hiatus from time to time but we'll never stop trying to build the world's first non-profit search engine.

    Last week we moved the Nutch server into the future home of our community radio station. The room is cleaner, quieter, and cooler (sometimes) than the rest of the ILC. We also installed Windows Server 2003, instead of Linux, on the server. We were having a lot of Java errors when doing big crawls with Linux so we're hoping that the new OS will magically solve some of those errors. We'll eventually move back to a Linux-based system but, for now, we'll use Windows Server.

    Currently the server is crawling a list of 2.2 million websites. It's on day 3 of about a 5 day crawl. We'll know on Friday whether or not it's successful. If it is then we'll try for 10 million URLs.

    k
    • 3727 posts
    November 21, 2009 10:03 PM PST
    Just a quick update.
    - In a little over a month we were able to increase our database to 23 million pages and are adding about 1 million pages per day.
    - Still thinking of a name for our search engine. :(
    - After months of saving (and researching) we finally started building a new server, to better handle a 100+ million page database. The new server has a quad-core 2.83 GHz CPU, 8 GB RAM, and 3 TB of storage. We also got a really cool (literally) case so the hardware doesn't burn up in the Thai heat. Our students like the blue lights. (So do I. :))
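    For a rough sense of scale, the figures above (23 million pages now, about 1 million added per day, a 100 million page target) can be turned into a back-of-the-envelope estimate. The sketch below assumes the crawl rate stays constant, which a real crawl is unlikely to do, and the function name days_to_target is made up for the example.

```python
import math


def days_to_target(current_pages, target_pages, pages_per_day):
    """Estimate whole days needed to reach target_pages,
    assuming a constant crawl rate (an idealization)."""
    if pages_per_day <= 0:
        raise ValueError("pages_per_day must be positive")
    remaining = max(0, target_pages - current_pages)
    return math.ceil(remaining / pages_per_day)


if __name__ == "__main__":
    # Figures quoted in the post: 23M indexed, ~1M/day, 100M target.
    print(days_to_target(23_000_000, 100_000_000, 1_000_000))  # prints 77
```

    At that pace the index would pass 100 million pages in roughly eleven weeks, provided the hardware and bandwidth keep up.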
    • 5130 posts
    November 25, 2009 4:14 PM PST
    Kirk (PK) said:
    - Still thinking of a name for our search engine. :(
    How about atgoat.com? All The Good Ones Are Taken. LOL!