Virtual Library Cat's Eye View: Deep Web Research 2010

Posted by Jennifer Eidelman | 22 Feb, 2010

Deep Web Research 2010 is an updated guide on how to find information on the deep or invisible web. These are sites that current search engines cannot find or have trouble accessing. The guide is divided into the following sections:

Articles, papers, forums, audio and videos
Presentations
Cross Database articles
Resources - Deep Web Research
Cross Database Search Services
Resources - Semantic Web Research
Cross Database Search Tools
BOT Research, Resources and Sites
Peer to Peer, File Sharing, Grid/Matrix search engines
Subject Tracer Information blogs

There is even more on the web than you find with casual searching. Go for it!

From Virtual Library Cat's Eye View: Deep Web Research 2010 

Federated Search Engine Technology

Posted by Jennifer Eidelman | 7 Dec, 2009

Federated Search technology will take you to the deep web

Read the full article at: http://www.osti.gov/fedsearch

If you’ve been searching for science information using popular search engines such as Google, Yahoo! and MSN, you may be missing out on the research you need.

That’s because popular search engines generally cannot search in the deep web where most research is found.

The Deep Web is Huge
The deep web is huge – by some estimates, the deep web is more than 500 times the size of the surface web where popular search engines "crawl".

To get to the deep web, you need federated search tools, such as Science Accelerator, Science.gov, and WorldWideScience.org.

With these federated search tools you can find the richest content – the results of billions of dollars worth of government-sponsored scientific research.

In one query, you can search multiple databases at one time, sort through the information in ways that are useful to you, and rapidly return relevant results to your desktop.
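To make the "one query, many databases" idea concrete, here is a minimal sketch in Python. It is not how Science.gov or any of the tools named above are actually built. The three sources are hypothetical in-memory lists standing in for separate databases, and the scoring (a count of matching query words) is an assumption chosen for simplicity.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for separate deep web databases.
SOURCES = {
    "energy_reports": [
        "Solar cell efficiency measurements",
        "Wind turbine blade fatigue study",
    ],
    "medical_abstracts": [
        "Solar radiation exposure and skin health",
        "Trial results for a new vaccine",
    ],
    "patent_index": [
        "Solar cell manufacturing method",
    ],
}


def search_source(name, query):
    """Search one source; score each record by how many query words it contains."""
    words = query.lower().split()
    hits = []
    for title in SOURCES[name]:
        title_words = title.lower().split()
        score = sum(1 for w in words if w in title_words)
        if score > 0:
            hits.append({"source": name, "title": title, "score": score})
    return hits


def federated_search(query):
    """Send one query to every source in parallel and merge the results."""
    with ThreadPoolExecutor() as pool:
        per_source = pool.map(lambda name: search_source(name, query), SOURCES)
        merged = [hit for hits in per_source for hit in hits]
    # Highest score first; ties broken alphabetically by title for stable output.
    return sorted(merged, key=lambda h: (-h["score"], h["title"]))


if __name__ == "__main__":
    for hit in federated_search("solar cell"):
        print(hit["score"], hit["source"], hit["title"])
```

A real federated search tool would replace search_source with a call to each database's own search interface, but the shape stays the same: fan the query out, collect the answers, then merge and sort them in one list.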

Semantic Search Engines

Posted by Jennifer Eidelman | 2 Nov, 2009

This post is from the site: http://www.semanticsearchengine.org/.

What is a semantic search engine? Generally speaking, it delivers results based on the science of meaning in language. Originally, many search engines tried to use phrase matching to deliver results, but were supplanted by Google's use of links and other factors to deliver accurate search results to the user. Unfortunately, it is still possible to get vague (or ambiguous) results for certain terms. A semantic engine uses rules of disambiguation to either find out the context of your search, or present you with different options. For instance, a search on "rock" might get you results for music, for stones of the non-rolling variety, and for an insurance company logo. A semantic search engine might ask you what you are looking for, or show results around the context of other words in your search query, or present the most common items based on past experience in searching.
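The "rock" example can be sketched in a few lines of Python. This is only an illustration of disambiguation by context words, not the method of any engine listed below. The sense table and its clue words are hypothetical, made up for this example.

```python
# Hypothetical sense inventory: each sense of an ambiguous term is listed
# with context words that tend to appear alongside it.
SENSES = {
    "rock": {
        "music": {"band", "album", "concert", "guitar"},
        "geology": {"stone", "granite", "mineral", "climbing"},
        "insurance": {"insurance", "policy", "company", "logo"},
    },
}


def disambiguate(query):
    """Return (term, chosen_sense, options) for the first ambiguous term found.

    chosen_sense is None when the rest of the query gives no context, in which
    case the caller can present the options to the user instead.
    """
    words = query.lower().split()
    for term in words:
        if term not in SENSES:
            continue
        context = set(words) - {term}
        # Score each sense by how many of its clue words appear in the query.
        scores = {
            sense: len(context & clues) for sense, clues in SENSES[term].items()
        }
        best = max(scores, key=scores.get)
        options = sorted(SENSES[term])
        if scores[best] == 0:
            return term, None, options
        return term, best, options
    return None, None, []


if __name__ == "__main__":
    print(disambiguate("rock guitar album"))
    print(disambiguate("rock"))
```

With context words in the query, the sketch picks a sense. With the bare word "rock" it returns no choice and a list of options, which corresponds to the engine asking what you are looking for.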

Some of the top known semantic engines in 2009 are:

Wolfram Alpha - A newly launched search engine.

Hakia - An engine you can ask questions, which lets you limit queries to "credible sites" as well as news and images.

Microsoft "Kumo" - A yet-to-be-released replacement for Live Search.

Powerset - Which processes Wikipedia results and can answer questions based on the "facts" therein.

Google Wonder Wheel - Which (at this date) doesn't even give its own result when you search on it!  

So far, a general Google, Yahoo, or even MSN search still brings up pretty good results compared to these engines. Wolfram Alpha isn't really a search engine because it does not bring up any other web pages, and it is a bit clunky. These engines may be missing the point for the average surfer, who is likely looking for things outside the field of "knowledge." If you ever want to see what PG-rated things people are searching for, check out Google Hot Trends, and after you weep for the future of humanity, you might note that there aren't too many semantic-friendly queries in the list.

Semantic engines are usually popular among academics, and we aren't talking about those people who got a BA degree in distributed studies. If you want to solve a linear equation, or get the mass of the planet so you can calculate orbital velocities, or see your age in minutes, then semantic engines are the way to go. Those of us who got English degrees and had to read Hard Times by Charles Dickens and totally hated it can still remember 20 years later that there was an educator in the book called Mr. Gradgrind, and he only cared about teaching facts. So far, Google is not going anywhere, because it can deliver choices (many of which include the requisite Wikipedia listing) and people still want that.

One last dirty secret for fans of semantic search engines: Searchers aren't that smart. As someone with a background in search, I can tell you that the average query on a search engine is for a website name. Why? Because people know the name of the website, but don't know how to type it into the URL bar. This is probably why Yahoo prefers not to publish an unfiltered list of search queries, because people will go to Yahoo to search for Google. Therefore, when people use a semantic engine and don't get the listing for the site they already typed in, they will not use the engine again. QED.

DeepDyve

Posted by Jennifer Eidelman | 2 Nov, 2009

DeepDyve is a specialised search engine for technical, scientific and medical information. It now offers a service for renting articles for as little as $0.99 each. The company's database covers thousands of journals and some 30 million articles, many of which are available for free. Click here to read an interesting and informative article on what DeepDyve has to offer.

The Invisible web

Posted by Jennifer Eidelman | 30 Oct, 2009

There is a vast amount of information on the world wide web that cannot be found using general purpose search engines like Google. General purpose search engines provide only the tip of the iceberg of what is available on the web. There are alternatives to general purpose search engines:

  • Subject Specific Search Engines - see links to Scirus and Science.gov below.
  • Subject Directories - see links to INFOMINE, Librarians' Internet Index and DMOZ below.
  • Search Engines that will search the invisible web - see links to CompletePlanet and INCYWINCY below.

Subject Specific Search Engine: SCIRUS - Scirus concentrates on scientific information only. It is the most comprehensive scientific research tool on the web. With over 350 million scientific items indexed at last count, it allows researchers to search not only journal content but also scientists' homepages, courseware, pre-print server material, patents, and institutional repository and website information.

Subject Specific Search Engine: SCIENCE - Science.gov searches over 40 databases and 1,950 selected websites, offering 200 million pages of authoritative U.S. government science information, including research and development results. It allows you to Explore Selected Science Websites by Topic.

Subject Directory: INFOMINE - Provides scholarly internet resource collections. It allows you to browse or search by subject category.

Subject Directory: Librarians' Internet Index - One of the best known general directories on the web. "Librarians' Internet Index (LII) is a publicly-funded website and weekly newsletter serving California, the nation, and the world." Click here for more information.

Subject Directory: DMOZ - The Open Directory Project is the largest, most comprehensive human-edited directory of the Web. It is constructed and maintained by a vast, global community of volunteer editors. "The web continues to grow at staggering rates. Automated search engines are increasingly unable to turn up useful results to search queries. The small paid editorial staffs at commercial directory sites can't keep up with submissions, and the quality and comprehensiveness of their directories has suffered. Link rot is setting in and they can't keep pace with the growth of the Internet." For more information click here.

Search Engines that will search the Invisible Web: CompletePlanet - Discover 70,000+ searchable databases and specialty search engines. "There are hundreds of thousands of databases that contain Deep Web content. CompletePlanet is the front door to these Deep Web databases on the Web and to the thousands of regular search engines — it is the first step in trying to find highly topical information. By tracing through CompletePlanet's subject structure or searching Deep Web sites, you can go to various topic areas, such as energy or agriculture or food or medicine, and find rich content sites not accessible using conventional search engines." For more information click here.

Search Engines that will search the Invisible Web: INCYWINCY -

* 200 million pages spidered and indexed
* hundreds of thousands of search engines indexed and searchable
* runs on a cluster of NRS servers running the Linux Operating System
* user listings, premium keyword purchase, and custom website spidering
* personalized search with relevancy boosting for particular content categories
* search types: web, directory, forms, search engines, images, metasearch
* search and page alerts, bookmarks, and mail accounts
For more information click here

Helping to Unravel the Hidden Web of Neuroscience Information

Posted by Jennifer Eidelman | 29 Oct, 2009

UC San Diego Researchers Debut Neuroscience Information Framework NIF 2.0 at “Neurosciences 2009” 

Decades of investment in neuroscience by agencies such as the National Institutes of Health have led to an explosion of new information about the brain. Global communications networks promote around-the-clock collaborations, in a world saturated with information.

Ironically, the tidal wave of data makes it harder for researchers to locate relevant neuroscience resources, such as data, tools and materials, because information is scattered across thousands of databases and billions of web pages.