BotSpot has been updated!
Go to http://www.botspot.com to find what you need

BotSpot® Bi-Weekly Newsletter

April 25, 2001

Entering the Biosphere

By Brian Proffitt

Darkness, with flashes of light.

Its world is one of speed. It is constantly on the move from the beginning of its short life to the end. It knows only one thing: hunting for that which it needs to survive. It has some stores of food to draw upon as it searches for more, but it shares those stores with its siblings and parents, so it has to find more to live.

It is just born, yet it already knows what it needs to find to live. Speeding along the pathway, it suddenly finds a likely source of energy. But it is not--one of its kind has been here before and taken what it needs.

It is growing weaker by the second. The carcass left by the last one here holds some clues that point to another potential source of energy. It leaps onto another path, mindlessly trying to find what it needs.

Another source is found. The source is exactly what it needs, giving it more than enough energy. Different prey is nearby, filling the hunter with an abundance of energy, more than enough to produce offspring. This new generation is born in the space of a millisecond and flashes away from the parent, now in the race for more energy.

The cycle continues.

If, for one moment, you thought you had received a nature newsletter by mistake, you can be forgiven the confusion. But the scenario described above is not a part of the life cycle found in some remote jungle or ocean: it is actually the life cycle exhibited by artificial components called InfoSpiders.

The similarity between the behavior of InfoSpiders and most kinds of life forms is definitely no accident. InfoSpiders' creator Dr. Filippo Menczer planned it this way from the very start.

On the surface, InfoSpiders, the core technology behind the Internet search tool MySpiders, may seem like the standard form of Web crawler. Give them a search term and then turn them loose on the Internet to see what they can find. But the similarity between an ordinary Web crawler and an InfoSpider is like that between a Model T Ford and a Porsche 911--they're both cars and that's about it.

According to Menczer, InfoSpiders are a population of agents working to solve a problem while competing with each other. This model of behavior is governed by what Menczer calls a genetic algorithm.

The genetic algorithm, a mathematical form of Darwin's survival of the fittest, lets the InfoSpiders improve themselves in incremental ways so they can ultimately survive longer and ideally produce offspring that continue to work on the problem alongside the parent.

A bird's eye view of the process reveals a fairly straightforward set of parameters. Upon receiving a problem, usually a search term, the InfoSpider begins, along with its compatriots, to seek instances of the term on the Internet. When a Web page with the appropriate terms is located, the InfoSpider must check it against the pages the other InfoSpiders have already located, which are recorded in a central cache. If the page is unique, the URL is registered in the cache and the InfoSpider is given a reward of "energy" from the cache.

It is this energy that the InfoSpider must have to survive. If it cannot find a page, or if all the pages it does find have already been located by InfoSpiders in this search instance, it will eventually run out of energy and die.
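For readers who think in code, the cache-and-energy cycle described above might look something like the sketch below. It is only an illustration, not Menczer's implementation: the class names, the reward of 10, the visit cost of 4, and the starting energy of 10 are all made up for the example.

```python
from dataclasses import dataclass, field

REWARD = 10     # assumed energy granted for a new, unique page
MOVE_COST = 4   # assumed energy spent on every page visit


@dataclass
class SharedCache:
    """Central record of the URLs already found in this search (hypothetical)."""
    seen: set = field(default_factory=set)

    def register(self, url: str) -> bool:
        """Record the URL and return True only if it was not already known."""
        if url in self.seen:
            return False
        self.seen.add(url)
        return True


@dataclass
class Spider:
    """One agent in the population (hypothetical)."""
    energy: int = 10

    @property
    def alive(self) -> bool:
        return self.energy > 0

    def visit(self, url: str, matches: bool, cache: SharedCache) -> None:
        """Pay the cost of a visit; earn a reward only for a unique match."""
        self.energy -= MOVE_COST
        if matches and cache.register(url):
            self.energy += REWARD


cache = SharedCache()
first, second = Spider(), Spider()
first.visit("http://example.com/a", True, cache)       # unique page: rewarded
for _ in range(3):
    second.visit("http://example.com/a", True, cache)  # duplicate: no reward
```

In this toy run the first spider is rewarded for the unique page, while the second keeps landing on a page that is already in the cache, pays for each visit, and eventually runs out of energy.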

If the InfoSpider is very successful, it will be able to reproduce, creating almost-clones of itself that will continue to search for the term. The "almost" is there because the InfoSpiders also have the capability to perform selective query expansion. At the moment of reproduction, the new generation can look at instances of the search term on the page and see how they relate to other words.

For instance, Menczer explained, if the InfoSpiders are looking for pages with the term "intelligent agents," the new offspring may note that "artificial intelligence" seems to appear on the pages with regularity as well. So, the new generation will add this new term to the list of terms to find.
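A rough sketch of that reproduction step follows. It rests on several assumptions that do not come from Menczer: the offspring simply adopts the single most frequent new word on the page (a real system would weigh phrases and word proximity), reproduction happens at an invented energy threshold of 20, and the parent splits its energy evenly with the child. All names here are hypothetical.

```python
import re
from collections import Counter
from dataclasses import dataclass

REPRODUCE_AT = 20  # assumed energy threshold for reproduction
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "are",
             "for", "on", "with"}


@dataclass
class Spider:
    """One agent: its query terms and its current energy (hypothetical)."""
    terms: tuple
    energy: int


def expand_query(terms, page_text):
    """Return the terms plus the most frequent new word found on the page."""
    words = re.findall(r"[a-z]+", page_text.lower())
    known = {w for term in terms for w in term.lower().split()}
    counts = Counter(w for w in words
                     if w not in known and w not in STOPWORDS)
    if not counts:
        return tuple(terms)
    best, _ = counts.most_common(1)[0]
    return tuple(terms) + (best,)


def reproduce(parent, page_text):
    """Split the parent's energy with an almost-clone, or return None."""
    if parent.energy < REPRODUCE_AT:
        return None
    half = parent.energy // 2
    parent.energy -= half
    return Spider(terms=expand_query(parent.terms, page_text), energy=half)


page = ("Intelligent agents draw on artificial intelligence. Artificial "
        "systems and intelligent agents use artificial methods.")
parent = Spider(terms=("intelligent agents",), energy=24)
child = reproduce(parent, page)
weak_child = reproduce(Spider(terms=("intelligent agents",), energy=5), page)
```

Here the child inherits "intelligent agents" and picks up "artificial" from the page, while the low-energy spider produces no offspring at all.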

All of these lifelike criteria were integrated into the InfoSpider for one purpose: to encourage diversity in results. If a group of InfoSpiders unleashed on the Web returned the same results over and over, it would hardly do a user any good.

Currently, the InfoSpiders are a prototype technology under development by Menczer's students at the University of Iowa. The MySpiders implementation is a scaled-down Java-based client of the same technology. Menczer hopes that further experimentation will determine whether InfoSpiders will be useful in building specialized search engines or in working with more traditional search engines to augment their results with pages that a search engine might not find at first.

Scientists have found life in every corner of the planet Earth. From frigid Arctic climes to boiling hot vents under miles of water, life can be found everywhere. With such a hardy and efficient model, is it any wonder that technology is beginning to emulate the processes of life? As any programmer knows, you build your work based on the best example you can find.

News Stories

Botfolio/diegeldseite: Not Quite Gold Yet
April 24, 2001--Artificial Life has combined chat bot and stock bot technology to bring you two financial management Web sites. But are they ready for prime time?

ChangeTracker Gets Fresh Information
April 20, 2001--WisoSoftCom's latest creation is an app that can monitor Web site changes efficiently.

Robot Rock Critic: Edgy But Accessible
April 20, 2001--It's got a beat and you can dance to it. Something for fun on a fine Spring day.

KarlBot: Next Generation of Virtual Agent
April 18, 2001--In a stunning marriage of LifeFX and Lingubot technology, this virtual representation of Kiwilogic's CEO has to be seen to be believed.

Update: Paula
April 16, 2001--Okay, it's got a one-track mind. But when you get past the innuendo, chat bot Paula is shaping up to be a pretty fast learner.

MySpiders
April 12, 2001--Students from the University of Iowa have put together a real-time set of Web crawlers that have real potential.