Thursday, December 26, 2024

Bright Planet, Deep Web


www.allwatchers.com and www.allreaders.com are web sites in the sense that a file is downloaded to the user’s browser when he or she surfs to these addresses. But that’s where the similarity ends. These web pages are front-ends, gates to underlying databases. The databases contain records regarding the plots, themes, characters and other features of, respectively, movies and books.

Every user query generates a unique web page whose contents are determined by the query parameters. The number of distinct pages that can be generated this way is mind-boggling. Search engines operate on the same principle: vary the search parameters slightly and totally new pages are generated. It is a dynamic, user-responsive and chimerical sort of web.
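To make the idea concrete, here is a minimal sketch in Python of a database front-end that builds a page from the query parameters. The records, field names and the `render_page` function are hypothetical stand-ins, not the actual workings of either site.

```python
from urllib.parse import parse_qs

# Hypothetical in-memory stand-in for a database of plots and themes.
RECORDS = [
    {"title": "Film A", "theme": "revenge", "setting": "city"},
    {"title": "Film B", "theme": "revenge", "setting": "desert"},
    {"title": "Film C", "theme": "love", "setting": "city"},
]


def render_page(query_string: str) -> str:
    """Build a plain-text 'page' whose contents depend only on the query."""
    # Keep the first value of each parameter, e.g. "theme=revenge".
    params = {key: values[0] for key, values in parse_qs(query_string).items()}
    # A record matches when it agrees with every parameter in the query.
    matches = [
        record for record in RECORDS
        if all(record.get(key) == value for key, value in params.items())
    ]
    header = "Query: " + ", ".join(f"{k}={v}" for k, v in sorted(params.items()))
    return "\n".join([header] + [record["title"] for record in matches])


print(render_page("theme=revenge"))
print(render_page("theme=revenge&setting=city"))
```

Each distinct combination of parameters yields a different page, which is why the number of possible pages grows so quickly with the number of searchable fields.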

These are good examples of what www.brightplanet.com calls the “Deep Web” (previously inaccurately described as the “Unknown or Invisible Internet”). BrightPlanet believes that the Deep Web is 500 times the size of the “Surface Internet” (a portion of which is spidered by traditional search engines). This translates to c. 7500 terabytes of data (versus 19 terabytes in the whole known web, excluding the databases of the search engines themselves), or 550 billion documents organized in 100,000 deep web sites. By comparison, Google, the most comprehensive search engine ever, stores 1.4 billion documents in its immense caches at www.google.com. The natural inclination to dismiss these pages of data as mere re-arrangements of the same information is wrong. Actually, this underground ocean of covert intelligence is often more valuable than the information freely available or easily accessible on the surface. Hence the ability of c. 5% of these databases to charge their users subscription and membership fees. The average deep web site receives 50% more traffic than a typical surface site and is much more linked to by other sites. Yet it is invisible to classic search engines and little known to the surfing public.
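As a rough check on the figures quoted above, the ratios can be worked out directly. The variable names are mine, and the second ratio compares the Deep Web with Google's index rather than with the whole surface web, so it is only indicative.

```python
# Figures as quoted in the text above.
deep_terabytes = 7500
surface_terabytes = 19
deep_documents = 550_000_000_000
google_documents = 1_400_000_000

# How many times larger the Deep Web is, by data volume and by document count.
size_ratio = deep_terabytes / surface_terabytes
document_ratio = deep_documents / google_documents

print(f"By size: about {size_ratio:.0f} times")            # about 395 times
print(f"By documents: about {document_ratio:.0f} times")   # about 393 times
```

Both ratios come out near 400, the same order of magnitude as the 500-fold estimate.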

It was only a question of time before someone came up with a search technology to tap these depths (www.completeplanet.com).

LexiBot, in the words of its inventors, is…

“…the first and only search technology capable of identifying, retrieving, qualifying, classifying and organizing “deep” and “surface” content from the World Wide Web. The LexiBot allows searchers to dive deep and explore hidden data from multiple sources simultaneously using directed queries. Businesses, researchers and consumers now have access to the most valuable and hard-to-find information on the Web and can retrieve it with pinpoint accuracy.”

It places dozens of queries in dozens of simultaneous threads and spiders the results (much as a “first generation” search engine would). This could prove very useful with massive databases such as the human genome, weather patterns, simulations of nuclear explosions, thematic, multi-featured databases, intelligent agents (e.g., shopping bots) and third generation search engines. It could also have implications for the wireless internet (for instance, in analysing and generating location-specific advertising) and for e-commerce (which amounts to the dynamic serving of web documents).
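The general technique of sending one directed query to many sources at once can be sketched in Python with a thread pool. This is an illustration only, not LexiBot's actual design: the sources are hypothetical in-memory lists, and `query_source` and `directed_search` are invented names.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical deep-web sources, each a tiny in-memory "database".
SOURCES = {
    "movies": ["revenge in the city", "love in the desert"],
    "books": ["revenge at sea", "a history of the city"],
    "weather": ["city rainfall 2000"],
}


def query_source(name: str, term: str) -> list[tuple[str, str]]:
    """Run one directed query against one source."""
    return [(name, doc) for doc in SOURCES[name] if term in doc]


def directed_search(term: str, max_workers: int = 8) -> list[tuple[str, str]]:
    """Query every source in its own thread, then merge the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_source = pool.map(lambda name: query_source(name, term), SOURCES)
    # Flatten the per-source hit lists and sort them for a stable order.
    return sorted(hit for hits in per_source for hit in hits)


print(directed_search("city"))
```

With real databases each query would spend most of its time waiting on the network, which is where running them in parallel threads pays off.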

This transition from the static to the dynamic, from the given to the generated, from the one-dimensionally linked to the multi-dimensionally hyperlinked, from deterministic content to contingent, heuristically created and uncertain content, is the real revolution and the future of the web. Search engines have lost their efficacy as gateways. Portals have taken over, but most people now use internal links (within the same web site) to get from one place to another. This is where the deep web comes in. Databases are about internal links. Hitherto they existed in splendid isolation, universes closed to all but the most persistent and knowledgeable. This may be about to change. The flood of relevant, quality information this will unleash will dwarf anything that preceded it.

Sam Vaknin ( http://samvak.tripod.com ) is the author of Malignant Self Love – Narcissism Revisited and After the Rain – How the West Lost the East. He served as a columnist for Central Europe Review, PopMatters, Bellaonline, and eBookWeb, a United Press International (UPI) Senior Business Correspondent, and the editor of mental health and Central East Europe categories in The Open Directory and Suite101.

Until recently, he served as the Economic Advisor to the Government of Macedonia.

Visit Sam’s Web site at http://samvak.tripod.com
