ComputerBits July 1998 rev 04/2016 

The Needle in the Haystack

Simple tricks to supercharge your Net searches ... by Sal Towse

Like a huge library with book stacks, card catalogs, and helpful reference librarians, the Internet has almost any information you might need.

Unfortunately, you, the user, can't just "thumb through" the Net or browse the Net's book stacks. Instead, you must know precisely what you want or follow a link from another page -- or track down a reference using a search engine.

Search engines, the Net's answer to card catalogs, aren't always straightforward and easy to use. They can be downright mystifying or, more likely, they can be under-utilized, their more powerful capabilities remaining untouched.

Does searching the Net seem to you about as easy as searching for the proverbial needle in a haystack? Make it easier to find your needle. Understand how search engines work. Learn the most-used commands and some simple tricks to supercharge your searches.

Be Specific

Search engines rely on specialized software called "robots" or "spiders" to crawl all over the Internet in search of Web pages. The title and text of those pages are downloaded, indexed, and stored in a database. Some search engines also index text found in META tags, used by Web designers to specify search terms not found in the actual site text.

When searching their databases for sites relevant to your query, most search engines give preference to Web sites that contain your keywords in the title, in the META tags, near the beginning of the site text, or close to each other. They also give preference to sites that contain multiple instances of your keywords.

Knowing this, be as specific as possible. Search for terms most likely to be near the beginning of the site text or mentioned repeatedly. A Web site devoted to tulip planting might not use the word "gardening," but would probably contain "tulip" and "plant" or "planting." So if you want information on planting tulips, don't use general terms like "gardening" or "flowers" or "bulbs" in your search. Pop planting tulips into Google and voila! (Google, like many search engines, ranks the sites it finds and lists first those it deems the most relevant to your search.)

Being specific will also limit the number of sites returned. Use the least common words most likely to be on the page or pages you want to find. Use unambiguous search terms to avoid wading through a slew of off-topic sites. A search for socks may bring up sites for Socks the Cat, foot socks, and bar fights.
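To make the ranking preferences described above concrete, here is a toy scoring sketch. It is not how Google or any real engine ranks pages; the function name, the point values, and the two sample pages are all invented for illustration. It rewards only a keyword in the title, a keyword near the start of the text, and repeated keywords.

```python
import re

def toy_score(title, body, keywords):
    """Toy relevance score; the weights (5, 3, 1) are arbitrary assumptions."""
    title_words = re.findall(r"[a-z0-9]+", title.lower())
    body_words = re.findall(r"[a-z0-9]+", body.lower())
    score = 0
    for kw in keywords:
        kw = kw.lower()
        if kw in title_words:
            score += 5                  # keyword appears in the title
        if kw in body_words[:20]:
            score += 3                  # keyword appears near the beginning
        score += body_words.count(kw)   # one point per occurrence in the text
    return score

# Hypothetical pages: title -> body text
pages = {
    "Planting Tulips in Fall":
        "Plant each tulip bulb six inches deep. Tulip planting works best before frost.",
    "Gardening Basics":
        "Gardening covers flowers, shrubs, bulbs, and vegetables. "
        "Some people also plant a tulip or two.",
}

query = ["tulip", "planting"]
ranked = sorted(pages, key=lambda t: toy_score(t, pages[t], query), reverse=True)
print(ranked)
```

The specific page outranks the general one because the specific search terms show up in its title, early in its text, and more than once.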

Uppercase? When?

Most search engines search for all forms of a word if the word is submitted in lowercase. ford will bring up Web sites devoted to Ford Broncos and river crossings as well as fan sites devoted to Harrison (or Henry!) Ford. Limit your results by using uppercase letters if you are, for example, searching for a personal name or need information on NeXT or Java. For some search engines, it may make a difference.

Wildcards

Learn to use wildcards, a feature available with most search engines. The most frequently used wildcard is the asterisk (*), which stands for "any set of numbers or letters." If you are interested in archaeology and want sites about archaeologists or archaeology or anything archaeological, search for archaeolog* and hunt down all alternatives in one fell swoop.

Be careful using wildcards, though. Don't truncate too much. arch* will bring up Web sites dedicated not only to archaeology but also to architects, the Golden Arches, Archibald MacLeish, and Roman aqueducts.
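The difference between a well-chosen wildcard and an over-truncated one can be sketched with Python's standard fnmatch module, which treats * much as the search engines described here do. The word list is made up for the example, and real engines match against whole indexed pages rather than a short list.

```python
from fnmatch import fnmatchcase

# Hypothetical vocabulary standing in for words found on indexed pages
words = ["archaeology", "archaeologist", "archaeological",
         "architect", "arches", "archibald"]

def wildcard_hits(pattern, vocabulary):
    """Return the words that match a pattern where * means 'any characters'."""
    return [w for w in vocabulary if fnmatchcase(w, pattern)]

narrow = wildcard_hits("archaeolog*", words)   # only the archaeology family
broad = wildcard_hits("arch*", words)          # everything starting with "arch"
print(narrow)
print(broad)
```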

Keywords Or Phrases?

Most search engines allow phrase searching. Looking for information on ship models? Search for "ship models" enclosed in quotation marks. Entering a search for ship models will process each word as a search term and will return Web sites on ship models as well as sites about ships and sites featuring supermodels.

A friend complained that babe sites kept popping up whenever he searched for information about his model ship hobby. He solved his problem by enclosing his search terms in quotation marks. Use the same technique when your grade-schooler is looking for information on sperm whales!
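A rough sketch of why the quotation marks help: a keyword search accepts a page if any of the words appear, while a phrase search requires the words side by side and in order. The helper names and sample pages are assumptions for illustration, and real engines differ in how they treat plain keyword searches.

```python
import re

def tokenize(text):
    """Lowercase a string and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matches_any_word(text, query):
    """Keyword search: true if any query word appears on the page."""
    page = tokenize(text)
    return any(w in page for w in tokenize(query))

def matches_phrase(text, phrase):
    """Phrase search: true only if the words appear adjacent and in order."""
    page, wanted = tokenize(text), tokenize(phrase)
    return any(page[i:i + len(wanted)] == wanted
               for i in range(len(page) - len(wanted) + 1))

hobby_page = "Building ship models from kits"
other_page = "Fashion models pose aboard a cruise ship"

print(matches_any_word(other_page, "ship models"))  # keyword search lets it in
print(matches_phrase(other_page, "ship models"))    # phrase search keeps it out
print(matches_phrase(hobby_page, "ship models"))
```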

Boolean Operators

Many, but not all, search engines let you refine your search using AND (&), OR ( | ), and NOT (!) qualifiers. Some search engines use the plus sign (+) for AND and the minus sign (-) for NOT. Check each search engine's help pages for specifics.

How can AND, OR, and NOT help you refine your search?

Use AND (&) when you want both terms to be on every Web site listed. For engines that use +, a + before a search term means the term must be on each site on the hit list. Proper use of AND can screen out irrelevant sites. If you want information on models of aircraft carriers, search for model* AND "aircraft carrier*" or +model* +"aircraft carrier*". Requiring both search terms decreases the likelihood of your results being cluttered with babe sites and sites dedicated to the USS Carl Vinson.

OR ( | ) requests sites containing either term. Use OR if either term might be used. Curious about the next millennium? Search for millennium OR millenium and you'll locate millennium sites -- even those maintained by folks who can't spell.

NOT (!), or the minus sign (-) on engines that use it, excludes sites containing the given term. Need to know what other dogs have called the White House home? Search for +"White House" +dog -Buddy. Whoops! 246 hits. I forgot the movie Wag The Dog. Change that search to +"White House" +dog -Buddy -"Wag The Dog". Now we have fewer hits and most are about Presidential pets.

If your search engine uses NOT rather than the minus sign (as do Lycos Pro and AltaVista Advanced Search), use AND to ensure some terms are included and AND NOT to exclude others. The syntax for the previous search would be "White House" AND dog AND NOT Buddy AND NOT "Wag The Dog".
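The translation from plus/minus syntax to AND / AND NOT syntax is mechanical, as this small sketch suggests. The function name is made up, and it assumes every term carries a + or - (a bare term is treated as required). It handles only the simple case shown in this article, not any particular engine's full query language.

```python
import shlex

def plus_minus_to_boolean(query):
    """Rewrite a +term -term query using AND and AND NOT (illustrative only)."""
    parts = []
    for token in shlex.split(query):      # shlex keeps quoted phrases together
        excluded = token.startswith("-")
        term = token.lstrip("+-")
        if " " in term:                   # put the quotes back around phrases
            term = f'"{term}"'
        parts.append(("NOT " if excluded else "") + term)
    return " AND ".join(parts)

converted = plus_minus_to_boolean('+"White House" +dog -Buddy -"Wag The Dog"')
print(converted)
```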

Simple, no?

Look through the first few site descriptions returned. Because most search engines prioritize results, you may find your answer there. If you don't, either identify a common unwanted feature and rework your search to exclude that term using the NOT command or refine your search by adding another search term.

Search engines do not think like humans. They are very literal, tremendously powerful, enormously fast, and extraordinarily stupid. You must tell them exactly what you want. Slowly. Use sock puppets if necessary!

More handy-dandy Google search tips here.

Placement Qualifiers

Some search engines use placement qualifiers such as NEAR (~), ADJ (for "adjacent"), FAR, FOLLOWED BY and BEFORE.

The most common placement qualifier is NEAR. Different engines implement NEAR differently. NEAR finds Web sites with the second word (or phrase) within n words of the first. Swapping the order of the search terms may give different results. copyright NEAR law would retrieve sites about "copyright law" and also sites which contain the word "law" within n words following "copyright."
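Here is a minimal sketch of one possible NEAR, assuming the order-sensitive reading given above: the second word must fall within n words after the first. Engines implement NEAR differently, so treat the function and its behavior as an illustration rather than a description of any real engine.

```python
import re

def near(text, first, second, n):
    """True if `second` occurs within n words after `first` (order matters)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    first, second = first.lower(), second.lower()
    for i, word in enumerate(words):
        # Look only at the n words that follow this occurrence of `first`
        if word == first and second in words[i + 1:i + 1 + n]:
            return True
    return False

page = "Copyright and trademark law explained"
print(near(page, "copyright", "law", 3))   # "law" is 3 words after "copyright"
print(near(page, "law", "copyright", 3))   # swapped order: no match in this sketch
```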

FOLLOWED BY and BEFORE are similar search qualifiers.

Google has an undocumented operator, AROUND(n), for finding Web pages in which two words or phrases appear near each other. The number n sets the maximum number of words that may separate the two terms. For example, "huey lewis" AROUND(5) news will bring back 400K+ results, and most of them will be Huey Lewis and the News sites (or Huey Lewis & the News). AROUND must be typed in capital letters. If Google doesn't find matches within that limit, it will return its regular (sans AROUND(n)) results. As the Huey Lewis search shows, this qualifier is useful when you don't know whether a site uses "and" or "&".
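If you build searches in a script, the AROUND query above can be assembled and URL-encoded as in this sketch. The helper names are invented, and the https://www.google.com/search?q=... address format is an assumption here, not something this article documents. Because AROUND is undocumented, nothing guarantees Google will honor it.

```python
from urllib.parse import urlencode

def around_query(left, right, n):
    """Join two terms with AROUND(n); the operator must stay in capitals."""
    return f"{left} AROUND({n}) {right}"

def google_search_url(query):
    """Build a search URL (assumed format) with the query safely encoded."""
    return "https://www.google.com/search?" + urlencode({"q": query})

query = around_query('"huey lewis"', "news", 5)
url = google_search_url(query)
print(query)
print(url)
```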

Non-Text Searches

Some search engines let you search Web sites for non-text items. Check the help pages to see what your search engines offer. You may be able to further refine your search with one of the following commands to search for sites

Command Nesting

Search engines offering nesting describe this feature on their help pages. Briefly, suppose you wanted to find a hymn to Pan written by Aleister Crowley, H.P. Lovecraft, and/or Percy Bysshe Shelley. Bing allows you to search for hymn AND Pan AND (Shelley OR Lovecraft OR Crowley). This search is equivalent to combining the results of three different searches: hymn AND Pan AND Shelley; hymn AND Pan AND Lovecraft; and hymn AND Pan AND Crowley.
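The equivalence between the nested search and the three separate searches can be checked on a toy index. The four pages and their word sets are invented for the example, and the matching is a bare-bones stand-in for what a real engine does.

```python
# Hypothetical index: page name -> set of words on that page
pages = {
    "page1": {"hymn", "pan", "crowley"},
    "page2": {"hymn", "pan", "shelley", "ode"},
    "page3": {"hymn", "lovecraft"},
    "page4": {"pan", "flute"},
}

def search(*terms):
    """AND search: pages containing every term."""
    return {name for name, words in pages.items()
            if all(t in words for t in terms)}

authors = ["shelley", "lovecraft", "crowley"]

# hymn AND pan AND (shelley OR lovecraft OR crowley)
nested = {name for name, words in pages.items()
          if "hymn" in words and "pan" in words
          and any(a in words for a in authors)}

# The same thing as three separate searches with the results combined
combined = set().union(*(search("hymn", "pan", a) for a in authors))

print(sorted(nested))
print(nested == combined)
```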

Surfing

You need more than WWW search skills to get the most from the Net. Learn to surf. Poke around. Explore.

Many search engines have subject-classified annotated URL lists for various subject areas. For immediate information on a specific subject, the search engine is preferable, but if you're interested in genealogy or quilting or skydiving, a browse through the "select" Web sites for your interest can lead you to wondrous places.

Some search engines return a hit list with "more like this" links. If you find a site close to what you want, explore where the "more like this" link takes you.

Learn to move back up the URL if you reach a "404 Not Found" page. I recently came across a reference to a document that looked promising, http://members.theglobe.com/damocles1/wise.html, but when I tried to access it, I was told the page couldn't be located. By backing up one level in the address to http://members.theglobe.com/damocles1/, I found the site's index page and was able to identify the misquoted URL as http://members.theglobe.com/damocles1/projects/wise.html.

Use this technique when you find a really helpful page. Back up a level (or two or three) in the URL and find the site's home page. Many times the home page will lead you to other interesting pages on the site and, if you're lucky, a page full of links to similar sites.
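Backing up a URL one level at a time is easy to do by hand, but here is a sketch of the same idea in code. The function name is made up, and it only builds the list of addresses to try; it does not fetch anything.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

def parent_urls(url):
    """List the URLs reached by backing up one level at a time to the site root."""
    parts = urlsplit(url)
    path = parts.path
    results = []
    while path not in ("", "/"):
        # Drop the last path segment, then keep a trailing slash on the folder
        path = posixpath.dirname(path.rstrip("/"))
        if not path.endswith("/"):
            path += "/"
        results.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return results

levels = parent_urls("http://members.theglobe.com/damocles1/wise.html")
for level in levels:
    print(level)
```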

Read The Help Pages

This quick spin through search engine features cannot begin to cover the capabilities of search engines. Each search engine has a different way of locating Web sites, a different method of indexing those sites, a different search technique, and different commands. Read the help files for the search engines you use and take a peek at Search Engine Watch's roundup (as of Feb 2016) of 14 Alternative (i.e. not Google) Search Engines. Each engine has its strengths and weaknesses. Find several to use regularly and learn which is best for a given type of search.

Play around. Wander. Get caught in the Web!