[NZLUG] a distributed search engine node in auckland

Robin Paulson robin at bumblepuppy.org
Sun Jun 16 18:43:34 NZST 2013


On Sun, Jun 16, 2013 at 05:13:58PM +1200, Adrian Mageanu wrote:
> Just gave it a try. It works. Slow, but returns results.

yeah, it's on a residential connection unfortunately, and the hardware is a core 
2 duo with 2GB ram, so nothing amazing

> I read a bit on the YaCy.net web page, not being familiar with it, but
> didn't go past the home page for now, is it a domain specific search
> engine or a general purpose one? I noticed the tags on the right hand
> side of the page in [2], that's why I ask.

it's general purpose. a node admin feeds it a list of domains, then it crawls
and indexes them, and shares the index with other nodes.
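
as a rough illustration of that crawl / index / share cycle, here's a toy sketch
in python. it is not yacy's actual code or storage format: the function names
(build_index, merge_index, search) and the example urls are made up for the
example, and a real node fetches pages over http and exchanges index entries
over the network rather than merging dicts in memory.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Build a word -> set-of-urls index from a dict of url -> page text."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def merge_index(local, remote):
    """Fold another node's index into ours (a stand-in for index sharing)."""
    for word, urls in remote.items():
        local[word] |= urls
    return local


def search(index, query):
    """Return the urls that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    return set.intersection(*(index.get(w, set()) for w in words))


# two hypothetical nodes, each having crawled different (made-up) pages
node_a = build_index({
    "http://a.example/shell": "linux one liners for the shell",
})
node_b = build_index({
    "http://b.example/awk": "handy awk one liners",
    "http://b.example/essay": "politics and philosophy",
})

before = search(node_a, "one liners")   # only what node_a crawled itself
merge_index(node_a, node_b)             # node_b shares its index
after = search(node_a, "one liners")    # now includes node_b's page too
print(len(before), len(after))
```

the point of the sketch is only that a node's answers improve as it receives
index entries from other nodes, which is why the number of nodes matters.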

it can be set up to index only one domain and/or not share results, but i'm not 
keen on that idea.

the list of domains is fairly large. it's mostly places which host things i'm 
interested in, so mainly free software, politics, philosophy and maker spaces, 
but there are more general sites in there too

> The results are somehow restricted and less relevant, when compared with
> other search engines. I did a search on "linux one liners" and got only
> 19 results, none of them useful, or at least not what I expected.

yep. yacy has several hundred nodes; google, bing et al have several hundred 
thousand

more nodes, more nodes. i'm doing some experiments/research into cheap arm 
boards at the moment, like this [1]. it crossed my mind i may be able to run 
yacy on them at the same time.

> Will its use increase its efficiency, like with wolframalpha in its
> beginnings?

not as far as i know, although i may be wrong. of course, if you have some 
expertise here, maybe you can contribute code?

> Or is there something users can do to improve it? Or its
> returned results?

host a node, that's the best way to assist [2]. even if the node doesn't allow 
searches to be made on it, it will still transfer its index to other nodes when 
necessary.

this page shows transfers happening in real time (and is also rather hypnotic):
http://101.98.128.100:8090/Network.html

it also lists the number of nodes, pages indexed, indexing speed and other stats 
porn

[1] 
http://www.aliexpress.com/item/MK802-II-Mini-Android-4-0-PC-Android-TV-Box-A10-Cortex-A8-1GB-RAM-4G/952998239.html

[2] importantly, yacy, as with a lot of free software, blows apart the 
distinction between users, coders, admins and hosters, by potentially (skills 
and hardware permitting) allowing anyone to take on any of those roles. so, a 
user can also be someone who hosts a node. ipv6 will help with this by doing 
away with the requirement for NAT [*], at least that's how i understand it

[*] https://en.wikipedia.org/wiki/Ipv6#Privacy

