www.searchlores.org main search engines: Google ~
Version 04,04 ~ April 2005


Introduction     Query parameters     Advanced operators     Various about google     Useful findings     Googlette

The aim of this section, simply stated, is to offer for free, to anyone, a "summa" of all possible knowledge about google. This should make it completely useless -for anyone- to buy books called "google hacks", "google hacking" and similar crap.
My hope, and joy, is to put out of business (and hopefully gently push towards suicide) some of the wankers that stole the free content of some good sites in order to scrape some money out of it.

Ah, yes... also don't believe for a moment that this page will remain as it is, even if it is already quite powerful. It will be constantly updated, enlarged, ameliorated... it is, like most things in life, a work "in fieri".


Since most search engines were just keen on making money no matter how, Google represented a breath of fresh air, and (mostly) held the promise of delivering high relevancy results without all the extraneous and often ridiculous and annoying 'services' of the larger portals. They were the first to understand that the real money was to be made as aggregators of information, not as a search engine. Of course they do use for profit all the data they gather. Every search engine does it. But at least they deliver speedy and often useful results.

Unfortunately with success came problems. The beastly SEO spammers ("search engine optimizers", people that routinely push irrelevant sites up among the front results, tricking and tweaking the search engines' algorithms) tried and try their worst in order to push up their own (and their customers') foetid commercial crap sites. In order to fool Google's listings they have used everything, and then some, from "link farms" to "Trackback", from auto-citation features to blog noise and whatnot.

Confronted with such concerted attacks from SEO spammers and webloggers, Google has taken drastic remedial action, almost abandoning its PageRank algos and even installing at one time (October-November 2003) some brutal emergency filters.
These filters were (and in part still are) activated -mostly- ONLY for the typical zombies' one-word queries. Hence making any search just a little more complex (adding any filetype or site-based search parameter, or for instance just excluding a nonsense string à la -"qwjjqw", a term that until today did not exist in google's index) would and will deactivate such brutal anti-spammer algos.

Note also that a (small) part of the following has been purposely "ripped back" from some books about "google hacking", written by young lamers. After having ripped free knowledge sites for years, some of them didn't even have the decency to cite them. So I'll serve them back. Such books are full of severe errors that, of course, have been corrected in the following. The joy of the web is that you can easily find copycatters and then punish them, exposing their copycatting (and incompetence) at the same time.

Google Query parameters 

q ($query, your query): the search query, your target.
start (0 -- MAX hits): the point in the search results where Google should start. Result 0 is the first result on the first page.
num, maxResults (1 -- 100): number of results presented per page (MAX 100).
filter (0 or 1): false or true? If true (=1) they will "omit some entries very similar to those already displayed" and tell you that "If you like, you can repeat the search with the omitted results included" (thus setting the filter to zero).
restrict ("restrict code", for instance countryAF (Afghanistan), countryAR (Argentina), countryAU (Australia), countryBE (Belgium), countryBM (Bermuda)...): restrict results to a specific country (using country specific IP addresses... google is notoriously unreliable in this). Google also has four topic restricts: U.S. Government unclesam; GNU-Linux linux; Macintosh mac; FreeBSD bsd.
hl (interface language code): at the moment the language codes for interface language are: af, sq, am, ar, az, eu, be, bn, bh, xx-bork, bs, br, bg, ca, zh-CN, zh-TW, hr, cs, da, nl, xx-elmer, en, eo, et, fo, tl, fi, fr, fy, gl, ka, de, el, gn, gu, xx-hacker, iw, hi, hu, is, id, ia, ga, it, ja, jw, kn, xx-klingon, ko, ky, la, lv, lt, mk, ms, ml, mt, mr, ne, no, nn, oc, or, fa, xx-piglatin, pl, pt-BR, pt-PT, pa, ro, ru, gd, sr, sh, st, si, sk, sl, es, su, sw, sv, ta, te, th, ti, tr, tk, tw, uk, ur, uz, vi, cy, xh, yi, zu.
lr (language restrict): only display pages written in this language. Codes: Arabic lang_ar; Chinese (S) lang_zh-CN; Chinese (T) lang_zh-TW; Czech lang_cs; Danish lang_da; Dutch lang_nl; English lang_en; Estonian lang_et; Finnish lang_fi; French lang_fr; German lang_de; Greek lang_el; Hebrew lang_iw; Hungarian lang_hu; Icelandic lang_is; Italian lang_it; Japanese lang_ja; Korean lang_ko; Latvian lang_lv; Lithuanian lang_lt; Norwegian lang_no; Portuguese lang_pt; Polish lang_pl; Romanian lang_ro; Russian lang_ru; Spanish lang_es; Swedish lang_sv; Turkish lang_tr.
ie (UTF-8): the input encoding of Web searches. Google suggests UTF-8.
oe (UTF-8): the output encoding of Web searches. Google suggests UTF-8.
as_epq (exact phrase): Advanced search: "with the exact phrase". The value is submitted as an exact phrase. It is no longer necessary to surround the phrase with quotes.
as_ft (i = include file type; e = exclude file type): Advanced search: File format: Only | Don't.... Include or exclude the file type indicated by as_filetype (see below).
as_filetype (a file extension): Advanced search: File Format: ....return results of the file format. Include or exclude this file type as indicated by the value of as_ft (see above).
as_qdr (m3 = past 3 months; m6 = past 6 months; y = past year): Advanced search: Date: Return web pages updated in the.... Locate pages updated within the specified timeframe.
as_nlo (low number): find numbers between as_nlo and as_nhi.
as_nhi (high number): find numbers between as_nlo and as_nhi.
as_oq (a list of words): find at least one among the words of the list.
as_occt (any = anywhere; title = title of page; body = text of page; url = in the page URL; links = in links to the page): Advanced search: Occurrences: Return results where my terms occur.... Find search term in a specific page location.
as_dt (i = only include site or domain; e = exclude site or domain): Advanced search: Domain: Only | Don't.... Include or exclude searches from the domain specified by as_sitesearch (see below).
as_sitesearch (domain or site): Advanced search: Domain: ...return results from the site or domain. Include or exclude this domain or site as specified by as_dt (see above).
safe (active = enable SafeSearch; off = disable SafeSearch): enables or disables "safe search" (Autocensoring).
as_rq (URL): locate pages similar to this URL.
as_lq (URL): locate pages that link to this URL.
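
The parameters above are simply appended to the search URL as name=value pairs. Here is a minimal Python sketch of that, assuming the http://www.google.com/search address used elsewhere on this page; the function name google_search_url is made up for the example, and nothing guarantees that google still honours every parameter listed here.

```python
from urllib.parse import urlencode

# Base address as it appears in the as_qdr example on this page.
SEARCH_BASE = "http://www.google.com/search"


def google_search_url(query, **params):
    """Build a search URL from q plus any of the parameters in the table.

    `google_search_url` is a hypothetical helper, not part of any google API.
    """
    fields = {"q": query}
    fields.update(params)
    # urlencode takes care of escaping spaces, quotes and so on.
    return SEARCH_BASE + "?" + urlencode(fields)


if __name__ == "__main__":
    # 100 results per page, duplicate filter off, pages updated in the last 6 months
    print(google_search_url("fravia", num=100, filter=0, as_qdr="m6"))
```

Any other parameter from the table (hl, lr, as_sitesearch...) can be passed the same way as a keyword argument.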

Google Advanced operators (Cfr google) 

Note that the syntaxes are often case-sensitive: phonebook, not "Phonebook"
Note that there can be no space between the "operator:" and the following word
[intitle:, allintitle:] Search within the title of a page. Title text is not limited to the TITLE HTML tag. A Web page's document can be generated in any number of ways, and in some cases a Web page might not even have a title at all. The thing to remember is that the title is the text that appears at the top of the Web page, and you can use intitle to locate text in that spot. When using intitle, it's important to pay attention to the syntax of the search string, since only the word or phrase immediately following intitle: is considered the search phrase. Other terms may be found anywhere in the page. Allintitle, on the contrary, tells Google that every single word or phrase that follows is to be found in the title of the page. Therefore putting "intitle:" in front of every word in your query is equivalent to putting "allintitle:" at the front of your query.
[inurl:, allinurl:] Search text within a given URL. This gives you the opportunity to search for specific directories or folders. Extremely useful operator, together with the site and filetype operators. Just like the allintitle search, allinurl tells Google that every single word or phrase that follows is to be found only in the URL of the page. inurl: works only on words, not URL components. In particular, it ignores punctuation and uses only the first word following the "inurl:" operator. To find multiple words in a result URL, use the inurl: operator for each word. Note: Putting inurl: in front of every word in your query is equivalent to putting allinurl: at the front of your query.
[filetype:] Searches for pages that end in a particular file extension. The file extension is the part of the URL following the last period of the filename but before the question mark that begins the parameter list. Here are some of the thousand possible extensions: Adobe Portable Document Format: pdf; Adobe PostScript: ps; Lotus 1-2-3: wk1, wk2, wk3, wk4, wk5, wki, wks, wku; Lotus WordPro: lwp; MacWrite: mw; Microsoft Excel: xls; Microsoft PowerPoint: ppt; Microsoft Word: doc; Microsoft Works: wks, wps, wdb; Microsoft Write: wri; Rich Text Format: rtf; Shockwave Flash: swf; Text ansi: txt.
[intext:, allintext:] Locates a string within the text of a page. The allintext operator is perhaps the simplest operator to use since it performs the function that search engines are most known for: locating a string within the text of the page. Although this advanced operator might seem too generic to be of any real use, it is handy when you know that the text you're looking for should only be found in the text of the page. Use allintext as a type of shorthand for "find this string anywhere except in the title, the URL, and links". Since this operator starts with the word all, every search term provided after the operator is considered part of the operator's search query.
[site:] Narrows a search to specific sites. A subset of inurl and allinurl. Parameters to Google's site operator must end in a valid top-level domain name (org, com, etc).
[link:] Companion to inanchor. The link operator allows you to search for pages that link to other pages. Instead of providing a search term, the link operator requires a URL or server name as an argument. It can include not only basic URLs but complete URLs that include directory names, filenames, parameters, and the like. The syntax must be a correct URL syntax, however. When an invalid link: syntax is provided, Google treats the search term not as a link, but as a phrase search.
[inanchor:] Companion to link. The inanchor operator searches the text representation of a link, not the actual URL. For instance inanchor:webbits would search links like this one: webbits (that actually points to rabbits.htm).
[daterange:] Search for pages published within a certain date range. Google designed the as_qdr field, for its advanced searching mask, to help you locate pages that have been updated within a given time frame (3 months, six months or one year). For example, to find pages that have been updated within the past six months and that contain the word fravia, use the query http://www.google.com/search?q=fravia&as_qdr=m6. Ritz has developed a full-fledged daterange mask for searchlores.
[numrange:] The numrange operator requires two parameters, a low number and a high number, separated by a dash. As the name suggests, numrange can be used to find numbers within a range. For example, to locate the number 3008, a query such as numrange:3007-3009 will work just fine. When searching with numrange Google ignores symbols such as currency markers and commas, making it much easier to search for numbers on a page.
Instead of using the numrange operator, you can of course provide a query with two numbers separated by two periods. The shortened version of the query just mentioned would be 3007..3009. Notice however the difference between numrange and "double periods" queries: with the latter the two limits (here 3007 and 3009) seem to have priority over the included values (here 3008).
[cache:] Used to get to google's cached copy of a page: cache:http://www.fravia.com or cache:http://www.yahoo.com. Just as with the link operator, passing an invalid hostname or URL as a parameter to cache will submit the query as a phrase search.
[info:] The info operator shows the summary information for a site and provides links to other Google searches that might pertain to that site. The parameter to this operator must be a valid URL or site name: info:www.searchlores.org. You can achieve this same functionality by supplying a site name or URL as a search query. Just as with the link and cache operators, passing an invalid hostname or URL as a parameter to info will submit the query as a phrase search.
[related:] The related operator displays sites that Google has determined are related to a site. The parameter to this operator is a valid site name or URL. You can achieve this same functionality by clicking the Similar Pages link from any search results page or by using the "Find pages similar to the page" portion of the advanced search form.
[phonebook:] Searches for business and residential phone listings (only for the United States). For instance you may search a guy named "buster" in Alabama: buster al. Note that google's phonebook stops digging at 600 results (like its search engine stops digging at 1000). Wildcards don't work either. To do a reverse search, just enter the phone number with area code. Lookups without area code will not work: phonebook: (334) 636-2580. Google's "phonebook" is however a very poor way to find data about a specific person, or to stalk someone. See the ad hoc section of searchlores for more effective ways to find a telephone number or an address.
[rphonebook:] White pages: residential phone listings (only for the United States). Wildcards don't work.
[bphonebook:] Yellow pages: business phone listings (only for the United States). Wildcards don't work. Then again, they're not needed; the Google phonebook does all the wildcarding for you. For example, if you want to find shops in New York with "Coffee" in the title, don't bother trying to envision every permutation of "Coffee Shop," "Coffee House," and so on. Just search for bphonebook:coffee new york ny and you'll get a list of any business in New York whose name contains the word "coffee."
[author:] Usenet searching. The author operator will allow you to search for the author of a newsgroup post on usenet. The parameter to this option consists of a name or an e-mail address.
[group:] Usenet searching. This operator allows you to search the title of Google Groups posts for search terms. This is one of the operators that is very compatible with wildcards. For example, to search for groups that have the prefix "comp", a search such as group:comp* works well.
[msgid:] Usenet searching. Locate a group post by message ID. The msgid operator refers to a specific group message identifier, a unique string that identifies a newsgroup post. The format is something like comp-sys-concurrent-intro-1-1061190083@gweep.ca, and you can see it only by checking the complete header of a given message through the "show original" option in groups. Note however that this operator does not work reliably any more in google groups.
[insubject:] Usenet searching. Insubject: searches google groups subject lines (like intitle:).
[stocks:] Search for stock information. Allows those that like to play pyramid schemes to search for information about a particular stock market company. The parameter for this operator must be a valid stock abbreviation (stock ticker). If you provide an invalid stock ticker abbreviation, you will be taken to a YAHOO screen (sic) that allows further searching for a correct ticker symbol.
[define:] Returns definitions for a search term. Arguments to this operator may be a word or phrase. For instance: gross. Very anglophonic-centric feature.
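
The syntax rules above (lowercase operator names, no space after the colon, one intitle:/inurl: per word being the same as the all... form, the two spellings of a number range) can be sketched in a few lines of Python. The helper names op, expand_all and numrange are invented for this example, and quoting a multi-word value is an assumption about how a phrase is best passed to an operator.

```python
def op(name, value):
    """Return 'name:value': lowercase operator, no space after the colon.

    Hypothetical helper; a multi-word value is wrapped in quotes so that
    the whole phrase stays attached to the operator (an assumption).
    """
    value = value.strip()
    if " " in value:
        value = '"' + value + '"'
    return name.lower() + ":" + value


def expand_all(name, words):
    """Spell out an all... operator as one plain operator per word.

    expand_all("intitle", "index of") gives the query that the text
    describes as equivalent to allintitle:index of.
    """
    return " ".join(op(name, word) for word in words.split())


def numrange(low, high, short=False):
    """Return numrange:low-high, or the low..high short form."""
    if short:
        return f"{low}..{high}"
    return f"numrange:{low}-{high}"


if __name__ == "__main__":
    print(op("Phonebook", "buster al"))
    print(expand_all("intitle", "index of"))
    print(numrange(3007, 3009), numrange(3007, 3009, short=True))
```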

Various about google 

Google Web APIs Reference (Must read)
serend_1.htm: Serendipity (an easy way to search Google's cache) by Shoki
The Anatomy of a Large-Scale Hypertextual Web Search Engine, by Sergey Brin and Lawrence Page

Queries are now limited to 32 words (not to 10 any more). So you can now break your search into a series (two or three) of independent "main" searches that the boolean OR (to avoid the default AND) will hold together.
Here is a silly, but useful, example:
("wares" OR "warez" OR "appz" OR "gamez" OR "abandoned" OR "pirate" OR "war3z") ("download" OR "ftp" OR "index of" OR "cracked" OR "release" OR "full") ("nfo" OR "rar" OR "zip" OR "ace")
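
A query of this shape is easy to assemble mechanically. The following Python sketch rebuilds the example above from three word lists and refuses to go past the 32-word limit; the names or_group and build_query are made up here, and counting only the search terms (not the OR keywords) towards the limit is an assumption, since the exact counting rule is not stated.

```python
WORD_LIMIT = 32  # the limit mentioned above


def or_group(terms):
    """Quote each term and hold the group together with OR."""
    return "(" + " OR ".join('"' + term + '"' for term in terms) + ")"


def build_query(*groups, limit=WORD_LIMIT):
    """Join several OR groups; the default AND applies between groups.

    Assumption: only the words of the terms count towards the limit,
    the OR keywords do not.
    """
    words = sum(len(term.split()) for group in groups for term in group)
    if words > limit:
        raise ValueError(f"{words} words, limit is {limit}")
    return " ".join(or_group(group) for group in groups)


if __name__ == "__main__":
    print(build_query(
        ["wares", "warez", "appz", "gamez", "abandoned", "pirate", "war3z"],
        ["download", "ftp", "index of", "cracked", "release", "full"],
        ["nfo", "rar", "zip", "ace"],
    ))
```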

A splendid searchform by Ritz: daterange.htm will allow you to search specific time-"slices" inside google
(and since it is in javascript, you may use it wherever :-)

With engines like google you can forget wacky unstable hyperlinks: just find your target pages by selecting a set of very peculiar words that uniquely identify a given page, and use a google query for those words, NOT the URL, in order to find that page in the future. So we could link to my tadimens.htm page using: "This has of course to do with both the vastness of the web and the fact that people do not know how to search".
Clearly in this case, since we are using this here as an example, you'll fetch this very page as well.

Bye bye, fragile links! (Of course you can do the same with all good search engines :-)

Useful findings 

4) GOOGLE'S BIAS:     5) "GoogleRanking" bookmarklet     6) GOOGLE's WILDCARDS
10) GOOGLE's MOST LINKED     11) GOOGLE's CACHE mysteries     12) IS GOOGLE DANCING?
13) DIFFERENT IPs, DIFFERENT DATACENTERS     14) GOOGLE direct     15) GOOGLE index is stale?
16) GOOGLE's simple success secret     17) GOOGLE cache     18) GOOGLE oddities: the AND operator
19) GOOGLE site ranking:     20) GOOGLE PRINT starting (soon)     21) GOOGLE Newsletters (full of crap, yet with some info snippets)

  1. "ARCHEOLOGICAL" DIGGING (using daterange)
    For instance: fravia daterange:2452275-2452639 (1 Jan 2002 - 31 Dec 2002)

    The Julian date is calculated by the number of days since January 1, 4713 BC. Julian dates (abbreviated JD) are simply a continuous count of days and fractions since noon Universal Time on January 1, 4713 BCE (on the Julian calendar). Almost 2.5 million days have transpired since this date. Julian dates are widely used as time variables within astronomical software. Typically, a 64-bit floating point (double precision) variable can represent an epoch expressed as a Julian date to about 1 millisecond precision.
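
If you don't have a converter at hand, the numbers for daterange: can be computed with a few lines of Python. This is only a sketch: the function names are invented, and the choice of the integer part of the Julian date at 00:00 UT is an assumption made because it reproduces the 2452275-2452639 example above (the noon-based Julian Day Number of the same day is one higher).

```python
from datetime import date

# date.toordinal() counts days starting from 1 for 0001-01-01 (proleptic
# Gregorian calendar). Adding 1721424.5 gives the Julian date at 00:00 UT.
ORDINAL_TO_JD_AT_MIDNIGHT = 1721424.5


def julian_date_at_midnight(day):
    """Julian date (with fraction) at 00:00 UT of the given calendar day."""
    return day.toordinal() + ORDINAL_TO_JD_AT_MIDNIGHT


def daterange(start, end):
    """Build a daterange: term from two calendar days.

    Assumption: the fraction is simply dropped, as the example on this
    page appears to do.
    """
    low = int(julian_date_at_midnight(start))
    high = int(julian_date_at_midnight(end))
    return f"daterange:{low}-{high}"


if __name__ == "__main__":
    print("fravia", daterange(date(2002, 1, 1), date(2002, 12, 31)))
```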


  2. More GOOGLE daterange  (Nemo's useful knowledge):

    Well our webmasterworld 'frends' are having one interesting 'discussion' about this subject:


    They are too miserly to share their findings with you... but the bread crumbs are still very revealing: I've just accidentally discovered a way of getting results from Google with no query string..

    Well!, well!... so it's possible! I tried it some time ago without any success... I looked once more at the available special syntax:

    site:, link:, inurl:, allinurl:, intitle:, allintitle:, intext:, allintext:, filetype:, ext:, inanchor:, allinanchor:, phonebook:, rphonebook:, bphonebook:, daterange:

    and I saw that I forgot to play a little with daterange:, so I tried the following query at Google:


    and bingo! I hit the jackpot! I think this query is better than the following one (see also "Google's most linked", below):


    because, I bet that in the second one, the keyword density of http should play an important role.

  3. GOOGLE'S HIGHLIGHTING TRICK  (Mordred's useful knowledge):

    A trick to make google highlight important parts of the summary with careful usage of asterisks. For example if we're looking for as big a recording of rain as possible, we'd use the "index of" trick like this:

    "index+of/" "rain.wav"

    60 results - but... which one to choose?
    Here is a better way to gather relevant info:

    "index+of/" "rain.wav******"

    Index of /rmx/impregnation
    Index of /rmx/impregnation. ... cymb.wav 14-Mar-2003 08:14 975k hh.wav 14-Mar-2003 08:14
    780k kick.wav 14-Mar-2003 08:14 780k rain.wav 14-Mar-2003 08:15 4.2M sample1 ...
    chronofixion.free.fr/rmx/impregnation/ - 4k - 26 Mar 2003 - Cached - Similar pages

    Mmmm... the six asterisks put the info we need in bold. See the size? 4.2M :)

  4. GOOGLE'S BIAS:
    Pagerank seems to be biased against newly-created pages.
    Google seems to pay a lot of attention to the text in a link's anchor when deciding the relevance of a target page.

  5. "GoogleRanking" bookmarklet:
  6. GOOGLE's WILDCARDS  (Shally Steckler's tip):
    The * happens to be a wildcard that replaces an entire word. This is not a documented Google command, but if you use the * connected by any of the characters Google ignores, like - = , ; \ / < and >, then it acts as a placeholder for "any word", like this: three-*-cats or nice=*=spring or fravia<*>site and so on. An interesting thing is that if you use another connector and another asterisk then it returns results with two words between the first and the last term, like this: fravia-*-*-site.
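
    The pattern is regular enough to be generated. A small Python sketch, where wildcard_query is a made-up name and the same connector is used on both sides of every asterisk (so the fravia<*>site variant is not covered):

```python
def wildcard_query(first, last, gap=1, connector="-"):
    """Join two terms with `gap` whole-word wildcards between them.

    Hypothetical helper: `connector` should be one of the characters
    that google ignores, as listed above.
    """
    if gap < 1:
        raise ValueError("gap must be at least 1")
    return connector.join([first] + ["*"] * gap + [last])


if __name__ == "__main__":
    print(wildcard_query("three", "cats"))
    print(wildcard_query("nice", "spring", connector="="))
    print(wildcard_query("fravia", "site", gap=2))
```
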
  7. TIRED OF CLICKING GOOGLE? (1) (google viewer)
    google viewer: "bettie page" and you won't have to move a finger :-)

    JANUARY 2005: The googleviewer does not seem to work anymore.
    The address was: http://labs.google.com/gviewer.html
    As usual you won't find any explanation whatsoever from google on the reasons for killing this useful feature.
    Use your keyboard to navigate through google's results (unfortunately it does not work with Opera :-(
  9. GOOGLE's WEBQUOTES  (can you find some use for this?):
    advanced searching
  10. GOOGLE's MOST LINKED  (can you find some use for this?):

  11. GOOGLE's CACHE mysteries  (can you find some use for this?):
    If you search for <html> you'll find a list of pages that have two stray characters before the first html tag. Check google's cache of these pages.
    The two characters have the hexadecimal codes FE and FF: "Furthermore, to maximize chances of proper interpretation, it is recommended that documents transmitted as UTF-16 always begin with a ZERO-WIDTH NON-BREAKING SPACE character (hexadecimal FEFF, also called Byte Order Mark (BOM)) which, when byte-reversed, becomes hexadecimal FFFE, a character guaranteed never to be assigned. Thus, a user-agent receiving a hexadecimal FFFE as the first bytes of a text would know that bytes have to be reversed for the remainder of the text."
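
    The check that the quoted recommendation describes can be tried on any saved page. A minimal Python sketch, with an invented function name, that only looks at the first two bytes:

```python
import codecs


def utf16_byte_order(data):
    """Tell from the first two bytes which way a UTF-16 text is ordered.

    Returns "big-endian" for FE FF, "little-endian" for FF FE (the
    byte-reversed mark), and None when no byte order mark is present.
    """
    if data.startswith(codecs.BOM_UTF16_BE):  # b"\xfe\xff"
        return "big-endian"
    if data.startswith(codecs.BOM_UTF16_LE):  # b"\xff\xfe"
        return "little-endian"
    return None


if __name__ == "__main__":
    page = codecs.BOM_UTF16_BE + "<html>".encode("utf-16-be")
    print(utf16_byte_order(page))
```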

    More info here
  12. IS GOOGLE DANCING?  (this is completely useless):
    Every month Google updates its indexes. These updates are known as "Google dances". The indexes are quite large and the calculations take account of a shifting palette of algorithms, and take several days to complete. During this period, the search results are not stabilized and may vary from one minute to the next. That's google's "dance" (towards the end of each month, usually between the 20th and the first days of the following month; the 25th is a good bet). When the dance starts the 'linkers' (the number of sites pointing to a specific site) are different on the three googles: www, www2, and www3 (these two point to the San Jose datacenter, which is supposed to be the 'freshest').
    Since summer 2003, Google dances MUCH less wildly. The indexes are now updated every week, mostly around Monday.

    Check if google is dancing: www    www2    www3

    They have recently introduced a redirection from www2 and www3 to the www.google.
  13. DIFFERENT IPs, DIFFERENT DATACENTERS  (even if it does not dance: hiccups):
    You will often receive different results from google depending on the ip you search from (try the same search with a couple of proxies). When you access google from a different name server you may be sent to a different google data centre. This may also happen when you repeat a search over time.

    Today (1 February 2003):  using a proxy (4840)    searching directly (4860) (see next finding) (4850)
  14. GOOGLE direct  (whenever you are 'stuck' on a crappy MSIE-cookied PC):

    http://ww.google.com (Note: only two "w")

    This was for a while very useful whenever you were 'stuck' on a national-flawed google.
    Now you have to use http://www.google.com/intl/en/ to bypass national crap-googles.
    Note that you can put a specific line into your host file (" g") or (" g") and *THAT datacenter* of google will show up whenever you type "g" into the location bar.
    Note that you can use the same trick with everything else, check the lore of the HOSTS scrolls.

    Direct googles

    They follow patterns: a lot have 64.233, 66.102 or 216.239, then one ODD number, and a second one that is around 100 (98/99/100/102/103/104/105/106/107)

    Note that you can always choose between, say,

    The 64.233s... ~  Checked 2/10/2004 ~  Checked 2/10/2004 ~  Not working 2/10/2004 ~  Not working 2/10/2004 ~  Checked 2/10/2004 ~  Not working 2/10/2004 ~  Not working 2/10/2004 ~  Checked 2/10/2004 ~  Not working 2/10/2004 ~  Not working 2/10/2004 ~  Checked 2/10/2004 ~  Not working 2/10/2004

    The 66.102s... ~    Checked 20/03/2004 ~    Checked 2/10/2004 ~    Checked 20/03/2004 ~  Checked 20/03/2004

    The 216.239s... ~ www-ex.google.com ~  Not working 20/03/2004 ~ www-sj.google.com ~  Not working 20/03/2004 ~ www-va.google.com ~  Not working 20/03/2004 ~  Checked 20/03/2004 ~ www-dc.google.com ~  Not working 20/03/2004 ~  Checked 20/03/2004 ~ www-ab.google.com ~  Not working 20/03/2004 ~  Checked 27/03/2004 ~  Checked 2/10/2004 ~  Checked 2/10/2004 ~ www-in.google.com ~  Not working 20/03/2004 ~  Checked 20/03/2004 ~ www-cw.google.com ~  Not working 20/03/2004 ~  Checked 20/03/2004 ~  Checked 2/10/2004 ~  Checked 2/10/2004 ~  Checked 2/10/2004 ~  Checked 20/03/2004

  15. GOOGLE index is stale?:
    What is sure is that updating once every month does not help a lot re: freshness :-(
  16. GOOGLE's simple success secret:
    Google is THE ONLY main search engine without crap paid results inside SERPs :-)
  17. GOOGLE cache  (useful knowledge :-):
    Hi Fravia,
    Here is a short note concerning the use of Google's cache that may be of interest...
    When searching for files using Google, perhaps using the +"index of /" trick, you often run into the problem that the files cannot be accessed because you do not have permission... need a password or so.
    Sometimes, however, Google has cached the "forbidden" page that lists the file you want, before access restrictions were placed on the URL. By checking the cache, you will then see the page with the file you want listed on it.
    I have found that a surprising number of times, I can simply download the file from Google's cache page. Why? Because the permissions were set on the directories only, and not on every single file within the directory!
    This is especially true for images, but it works for musicz too.
    For example, say you are interested in mp3s by Beck, and Google lists the following site:
    Attempt to access this site directly and you will be denied access.
    Fine. Have a look at Google's cache and you will see a single mp3 listed: "Beck - Loser.mp3". You will find that this file is downloadable.

  18. GOOGLE oddities: the AND operator  (doesn't kick in):

    search AND tips: 5,540,000
    search tips: 5,930,000

    This difference (not only quantitative but also qualitative) means that the AND operator forces an exact phrase search and, contrary to google's statements, that it is not provided by default.
  19. GOOGLE site ranking:

    Spammers (that call themselves SEOs) are investing incredible amounts of work in order to 'rank' in the first positions in google. Seekers could not care less (actually a good reason to jump over the first 200 places in google, now heavily spammed, following the old proverb for spammed altavista, that once upon a time was the best engine around: Hic alta, hic salta) but there are possibilities for checking quickly where you are on a given search, using google's API. Hey, you may even use the 'googlette' at the bottom of this page, or visit http://www.googlerankings.com/, where you will find the following form:

    Keyword(s) to list the sites for:  
    Domain or URL of your website:  
    eg.: google.com or geocities.com/mysite

    For faster results, You may limit your search to the

    The process may take up to 15 seconds

    This is based on this php script
  20. GOOGLE PRINT starting  (soon):

    Google print

  21. GOOGLE Newsletters  (full of crap, yet with some info snippets):




(c) III Millennium: [fravia+], all rights reserved, coupla wrongs reversed