This is for sure one of the most important parts of searchlores,
and you would be well advised to try some of the incredibly powerful search engines listed below.
If you limit yourself to google (that in November 2004 suddenly doubled its index to 8 billion documents)
or to Yahoo (that in July 2005 claimed to have indexed almost 20 billion documents) you'll cover less than
one half of the visible web (and not even 1/100 of the hidden one).
Just copy this page onto your hard disk as
c:\main.htm (or whatever), and then bookmark it there and
use it (after having edited or thrown away anything you fancy)
in order to perform effective searches on the web
using any main search engine and starting from an unpolluted jumping off place, a page that has as few frills as
possible and as many useful forms as we know of. A page that you can modify -and ameliorate- yourself (feedback, in that case,
would be appreciated).
The main reason you should use more than one main search engine is that
search engines' results overlap FAR less than you would think. Recent studies
point out that around 3/4 of the results of a given search are UNIQUE for each search engine.
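If you want to check the overlap for your own queries, a minimal sketch follows. It assumes you have already collected the result URLs of each engine by hand or with your own bots; the engine names and URLs in `sample` are invented for illustration, and `unique_fractions` is just a hypothetical helper name.

```python
def unique_fractions(results_by_engine):
    """Return, per engine, the fraction of its URLs that no other engine returned."""
    fractions = {}
    for engine, urls in results_by_engine.items():
        own = set(urls)
        others = set()
        for other_engine, other_urls in results_by_engine.items():
            if other_engine != engine:
                others.update(other_urls)
        # Guard against an engine that returned nothing at all.
        fractions[engine] = len(own - others) / len(own) if own else 0.0
    return fractions


# Hypothetical result lists, invented for illustration only.
sample = {
    "engine_a": ["u1", "u2", "u3", "u4"],
    "engine_b": ["u4", "u5", "u6", "u7"],
    "engine_c": ["u7", "u8", "u9", "u10"],
}
fractions = unique_fractions(sample)
for engine, share in fractions.items():
    print(f"{engine}: {share:.0%} of its results are unique")
```

Real result lists would need their URLs normalized first (trailing slashes, http/www variants), otherwise the same page counts as two different results.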
Remember that search engines index only the first part of any BIG DOCUMENT:
the size limit varies from engine to engine.
Google had a famous limit of 101K, which was abolished in January 2005; the new limit should be around 150K. These
limits are very annoying when dealing with large documents (or on-line books).
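To see what such a cap means in practice, here is a small sketch. It assumes that "K" means 1024 bytes and that an engine simply keeps the leading slice of the raw document; both are assumptions for illustration, and the document below is invented filler.

```python
def searchable_part(document_bytes, limit_kb=101):
    """Return the leading slice an engine with the given cap would index."""
    return document_bytes[: limit_kb * 1024]


def is_findable(document_bytes, phrase, limit_kb=101):
    """True if the phrase lies entirely within the indexed slice."""
    return phrase.encode("utf-8") in searchable_part(document_bytes, limit_kb)


# Invented document: filler text with one marker phrase at the very end.
filler = b"lorem ipsum " * 10000  # 120,000 bytes of filler
document = filler + b"rare target phrase"

print(is_findable(document, "rare target phrase", limit_kb=101))
print(is_findable(document, "rare target phrase", limit_kb=150))
```

With the old cap the marker phrase falls outside the indexed slice, so no query could ever find it through that engine; with the larger cap it is inside.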
Note also that just because one, a hundred, or a thousand pages from a given site are crawled and
made searchable through one of the main search engines, this does not guarantee that
every page from an indexed site has really been crawled and indexed. This shortcoming
hits not only 'new' pages, that can take MONTHS to be indexed: beehives
of spiders harvesting
a site often MISS whole subdirectories, old and new. Useful material may
be all but invisible to those that only use 'main' search tools to seek.
Moreover anyone that regularly uses google (for instance, but other search engines are
not that different) will have noticed how polluting the results from commercial sites are nowadays. Should
a search engine introduce a new, simple "please hide all commercial sites from your SERPs" (Search
Engine Result Pages) option, or switch, or slide, it would probably become king of the hill in a couple of months.
Therefore, given the commercially oriented pollution of the web, you are
well advised to use regional engines, usenet and other
specialized or targeted search tools and combing
techniques, and also to rely on your own bots, when searching your various targets.
[Only 400 results viewable]
(search for links to 'text')
(search for links with the description 'text')
(search for given text in the url)
(search files within 'targetdomain')
(search files on 'hostname')
(search 'text' inside the title tags)
(search Java applets named 'text')
(search images with such 'filename')
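The operators themselves seem to have gone missing from the list above. As far as we recall, the field keywords AltaVista documented at the time were, in the same order, `link:`, `anchor:`, `url:`, `domain:`, `host:`, `title:`, `applet:` and `image:`. Treat that mapping as an assumption. The sketch below only composes query strings; `field_term` and `build_query` are hypothetical helper names, and no search URL is implied.

```python
from urllib.parse import quote_plus

# Field keywords as recalled from AltaVista's help pages of the time.
# Assumption: the operators are missing from the list above.
FIELDS = ("link", "anchor", "url", "domain", "host", "title", "applet", "image")


def field_term(field, value):
    """Build one 'field:value' term, quoting values that contain spaces."""
    if field not in FIELDS:
        raise ValueError(f"unknown field keyword: {field}")
    if " " in value:
        value = f'"{value}"'
    return f"{field}:{value}"


def build_query(*terms):
    """Join terms with AND and return (readable query, URL-encoded query)."""
    query = " AND ".join(terms)
    return query, quote_plus(query)


query, encoded = build_query(
    field_term("title", "search engines"),
    field_term("host", "example.com"),
)
print(query)
print(encoded)
```

The encoded form is what you would paste into the query parameter of a form on your own local copy of this page.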
Read the Altavista
in depth page! Spammed as if there were no tomorrow &
very badly commercialized. The idiots behind altavista's marketing managed to
ruin the best search engine of the middle nineties. It is still THE ONLY
search engine which is TRULY BOOLEAN, hence offering truly amazing opportunities to real seekers...
once you have taken care yourself of the spam.
Main drawback is that they are very easy to spam, so you'll
get mostly useless results in the first
positions: "hic alta, hic salta" (a seekers' proverb)...
experienced searchers mostly
jump directly into the middle of altavista's results.
Altavista is the 'dead links
champion' among the 'main' search engines.
Use the Simple search (which defaults to OR) ONLY if you
really know what you are doing :-)
A "Graphical" search engine, rather interesting result clusters. Here follows the text search form,
but by all means try the cartographic interface
Staggering results... once upon a time... now stale and blocked
"MySearch" did let the user register his/her interested terms.
The system will automatically search in the new-page database every day
and notify the user of matches if any of the registered queries are matched.
A Whats'new system was (once upon a time)
The powerful chinese Google alternative... with CACHE!
"...the world's second largest independent search engine..."
(a compound engine with some own and blog results)
IceRocket uses innovative metasearch technology to search the Internet's top search engines, including WiseNut,
Yahoo, MSN, Teoma, Altavista, Alltheweb, Lycos,
and many more.... Based in Dallas, so beware :-)
(hard to say if this is useful or not)
"Save, search and share your Personal Web. Furl it"
"Furl saves a personal copy of any page on the Web and lets you to find it again instantly, from any computer.
Share the sites you find, and discover useful new sites. Become a member to start building your Personal Web"
Fact is you can use some of the 'comments' this s.e. will dig.
This is -for some queries- a very useful search engine, check it!
The Wayback machine
This is not only a -powerful- search engine,
but also an incredible stalking tool! Explore the Net as it was!
Yahoo recognized the tragic mistake of going commercial and
went 'back to basics' in late 2002 (better late than never). It
seems to be gaining momentum as part of the inktomi factories :-)
Note that yahoo recently bought the wondrous fast/alltheweb search engine (and promptly killed it :-(
Yahoo is now one of the three "big players" (google, MSN and
Yahoo) and claims to index 19 billion documents (against google's 8 billion).
[Only 4011 results viewable]
AND,OR,(),NOT,,",
Excite is a classic
example of just another
'ignoble corporate merger'. Just click on the link above and look at it! See?
Idiotic & useless, obsolete (late-nineties)
'portal' approach. As a consequence
it ceased to be a major player in January 2002 when Infospace killed it by injecting tons of
paid search results. This applies to all mergers btw: attempts to escape
the fate of all pyramid schemes
that always forebode catastrophes. Recently the Italians and Germans at Tiscali
have tried to revamp this engine on the sunset boulevard. It is still full of
pay-per-click crap, so
no one in his right mind uses it.
Visit the ad hoc GOOGLE page
WARNING: Google has been moved to its specific page, where you will find a
wealth of information. Here only a few masks:
Advanced GOOGLE (only 3% of users take advantage of it, poor 97% zombies :-)
On 10/NOV/2004, probably as a counter to Microsoft's MSN new beta "super" search, google
*doubled* its indexed pages, claiming now a total of up
to 8 billion pages, which should correspond, approximately, to
1/4 of the web (around 35 billion pages according to our own data). One wonders
where they hid all these billions of pages until November 2004 :-)
LYCOS [As many results viewable as you get!]
"Part Man, Part Machine" ~ Open Directory & DMOZ used. Uses
index, with updates at greater intervals than FAST. Major sin: Has closed the VERY useful Trondheim
Mighty 'raw' access to Inktomi's data (pointed out by Shally)
Visit the ad hoc FAST section
WARNING: Fast is being killed by Yahoo¡ (March 2004)
Fast knowledge has been moved to its specific page, where you will find a
wealth of information. Here only the mask:
ANDO
Pointed out by Nemo
AndoSearch, at Alexa, tries to have exactly the same query syntax as google's;
the biggest difference is the field restriction options.
It has a wealth of parameters and a huge database (modify the
stop word threshold to 100 and count to 1000 for instance... slow but fine!)
This used to be "Go To"; the commercial clowns changed the name because this
"reinforces our leadership in performance-based search", haha :-) Uses Inktomi, like Hotbot.
Ranks results by how much a company is willing to pay for listings and is heavily
infested :-( The mask below is relatively "clean"; you
should NEVER use Overture's own site mask to perform your searches
(it has visitor tracking, sniffing, and annoying logging options aplenty).
[Only 300 results viewable]
defaults to AND
phrase searching: use ""
use - for NOT
use + to force
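The four rules above can be sketched as a tiny query parser. This is a rough illustration of the syntax, not the engine's own implementation: it uses plain case-insensitive substring matching, treats `+` terms and plain terms alike (both must occur, since the default is AND), and `parse_query` and `matches` are hypothetical helper names.

```python
import re

# Optional sign, then either a quoted phrase or a run of non-space characters.
TOKEN = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')


def parse_query(query):
    """Split a query into required (+), excluded (-) and plain terms."""
    parsed = {"required": [], "excluded": [], "plain": []}
    for sign, phrase, word in TOKEN.findall(query):
        term = phrase or word
        if not term:
            continue
        if sign == "+":
            parsed["required"].append(term)
        elif sign == "-":
            parsed["excluded"].append(term)
        else:
            parsed["plain"].append(term)
    return parsed


def matches(text, parsed):
    """Default-AND matching: all wanted terms must occur, no excluded term may."""
    lowered = text.lower()
    wanted = parsed["required"] + parsed["plain"]
    return (all(term.lower() in lowered for term in wanted)
            and not any(term.lower() in lowered for term in parsed["excluded"]))


parsed = parse_query('"search engine" +boolean -spam lore')
print(parsed)
print(matches("A boolean search engine, seekers' lore", parsed))
```

On a real engine `+` matters mostly for words that would otherwise be dropped as stop words; the sketch ignores stop words entirely.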
WiseNut is a new "Korean/Japanese" 'main' search engine. It has good customization features and one single huge database of indexed
Web pages. It
lacks almost all advanced search capabilities, yet it seems useful because it gives results
that you will not find elsewhere.