Jimmy Wales, the co-founder of Wikipedia, announced today that his for-profit community hosting site Wikia has deepened its investment in developing an open-source Web search engine. Wikia purchased Grub, a company that makes a distributed Web crawling program. Instead of having a single set of computers index the Web, as Google and other search engines do, Grub passes out the indexing work to computers across the globe.
You can download the Grub client to make your own computer pitch in on the indexing work. While you’re not using it, the machine will scan the Web and send back its index to a central server; your scan, combined with input from others running the Grub client, will form the index that will power Wikia’s open-source search engine.
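To make the idea concrete, here is a minimal sketch of how partial indexes from several volunteer machines might be combined on a central server. It is not Grub's actual code or protocol: the function names `crawl_partition` and `merge_indexes`, the in-memory "pages," and the example URLs are all assumptions made up for illustration.

```python
from collections import defaultdict


def crawl_partition(pages):
    """Simulate one volunteer client (hypothetical): build a partial
    inverted index (term -> set of URLs) from the pages assigned to it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in text.lower().split():
            index[term].add(url)
    return index


def merge_indexes(partials):
    """Simulate the central server (hypothetical): union the partial
    indexes sent back by every client into one combined index."""
    merged = defaultdict(set)
    for partial in partials:
        for term, urls in partial.items():
            merged[term] |= urls
    return merged


# Two clients each scan a different slice of the Web (made-up URLs).
client_a = crawl_partition({"http://example.com/a": "open source search"})
client_b = crawl_partition({"http://example.com/b": "distributed search crawler"})

central = merge_indexes([client_a, client_b])
print(sorted(central["search"]))
# -> ['http://example.com/a', 'http://example.com/b']
```

A real distributed crawler would also have to hand out work, cope with duplicate or dishonest submissions, and store far more than a word-to-URL map; the sketch only shows the "many scans, one index" step.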
[Submitted by Imran Asad]
Wales, who was speaking at the O’Reilly Open Source Convention in Portland, Ore., announced that Wikia has turned Grub into an open-source program; the company hopes for input from developers all over the world.
Unlike Wikipedia, Wikia’s search engine will run as a for-profit venture. Gil Penchina, the CEO of Wikia, has said that the company hopes to one day reach 5 percent of the search market, a number that sounds small but that could be quite lucrative. But because the project is open-source, anyone else could build a competing search engine, whether for-profit or nonprofit, based on the same index, Wales pointed out to me this afternoon in the briefest of phone conversations (we had some kind of cell-phone issue).
Wikia sets out several guidelines for its open engine: It will be transparent — the algorithms determining how results are ranked will be visible to all. Google and other engines invest huge sums to develop these algorithms, and they guard them extremely closely. But that’s precisely why Wales believes we need an open search engine — the world, he says, must have an alternative to a Web that’s ranked by “invisible rules inside an algorithmic black box.”
But Wales isn’t looking for transparency for transparency’s sake: the project rests on the idea that community involvement will actually improve upon today’s search results. Whether that’s possible seems a gamble; Wikia has not announced a timeline for the project’s debut. A search engine is a huge undertaking, and there’s something nearly crazy about the idea of doing it with volunteers. But then, so too do Wikipedia and every open-source project seem somewhat impossible; that all those people could make something together doesn’t seem likely. Miraculously, though, these projects work, and the same thing could happen for search.
Update: Wales has clarified that the open search engine will not only take contributions to its source code; community members will also be actively involved in the editorial process governing search engine results.
“The idea would be a wiki-like process where the community can whitelist URLs, blacklist URLs, control for spam, block users who are being bad, that kind of thing,” Wales says.
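As a rough illustration of what community-maintained lists could do to a set of results, here is a small sketch. Wales did not describe any mechanics, so everything here is an assumption: the function name `apply_community_lists`, the choice to match on hostnames, and the rule that whitelisted sites are moved to the front while blacklisted sites are dropped.

```python
from urllib.parse import urlparse


def apply_community_lists(results, whitelist, blacklist):
    """Hypothetical filter: drop results whose host is blacklisted, then
    move whitelisted hosts to the front, keeping the original order
    otherwise (sorted() is stable)."""
    kept = [url for url in results if urlparse(url).hostname not in blacklist]
    return sorted(kept, key=lambda url: urlparse(url).hostname not in whitelist)


# Made-up result list and community lists.
results = [
    "http://spam.example/x",
    "http://b.example/1",
    "http://good.example/2",
]
filtered = apply_community_lists(
    results,
    whitelist={"good.example"},
    blacklist={"spam.example"},
)
print(filtered)
# -> ['http://good.example/2', 'http://b.example/1']
```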
Wales says that Wikia will have the front end for the search engine built by the end of this year — a place where people can “enter a search term and get some results,” very simple. “We expect that it probably won’t be very good at that point, and we’ll probably have to put a big disclaimer on the site, ‘We know this isn’t very good, please help us to make it better.'”
I asked Wales if it’s possible he’s too late in starting this: is Google too entrenched to beat? “Sure,” he said. “I could fail. I have no idea. But I’m going to have fun trying.”
RECENT UPDATE: Wikia, the commercial site led by Wikipedia founder Jimmy Wales, is taking the next step toward launching its open source, human-assisted Web search tool, Search Wikia.
Speaking at the O’Reilly Open Source Convention (OSCON), Wales announced that Wikia has acquired Grub, the distributed search spidering technology previously owned by LookSmart, and will begin using it to build an index for the Search Wikia project.
Grub, which LookSmart bought in January 2003 for $1.4 million, is a distributed crawling service that LookSmart had implemented as a screensaver that would use idle CPU time on a user’s PC to crawl the Web. The data was used to supplement its own centralized crawler’s indexing efforts.
Wikia will immediately release Grub to the open source community, and make both the crawler and source code available at Grub.org. Users who download the application can run it either as a screensaver or a background process while other applications are running.
Specifics of the deal were not revealed, though it is part of a larger advertising deal between Wikia and LookSmart which was announced last week.
Under the deal, LookSmart will provide text and display ads in Wikia’s freely hosted wiki communities, and eventually on the Search Wikia site, Wales said. Ads will be sold by Wikia on either a cost-per-click (CPC) or cost-per-thousand impressions (CPM) model. Inventory not sold by Wikia will be back-filled by ads from LookSmart’s distributed ad network.