Google Hacks
Hack 8 Mixing Syntaxes
What combinations of search syntaxes will and will not fly in your Google search?

There was a time when you couldn't "mix" Google's special syntaxes [Section 1.5]; you were limited to one per query. And while Google released ever more powerful special syntaxes, not being able to combine them for their composite power stunted many a search.

This has since changed. While there remain some syntaxes that you just can't mix, there are plenty to combine in clever and powerful ways. A thoughtful combination can do wonders to narrow a search.

8.1 The Antisocial Syntaxes

The antisocial syntaxes are the ones that won't mix and should be used individually for maximum effect. If you try to use them with other syntaxes, you won't get any results.

The syntaxes that request special information (stocks: [Hack #18], rphonebook:, bphonebook:, and phonebook: [Hack #17]) are all antisocial syntaxes. You can't mix them and expect to get a reasonable result.

The other antisocial syntax is the link: syntax. The link: syntax shows you which pages have a link to a specified URL. Wouldn't it be great if you could specify what domains you wanted the pages to be from? Sorry, you can't. The link: syntax does not mix.

For example, say you want to find out what pages link to O'Reilly & Associates, but you don't want to include pages from the .edu domain. The query

    link:www.oreilly.com -site:edu

will not work, because the link: syntax doesn't mix with anything else. Well, that's not quite correct. You will get results, but they'll be for the phrase "link www.oreilly.com" from domains that are not .edu.

If you want to search for links and exclude the domain .edu, you have a couple of options. First, you can scrape the list of results [Hack #44] and sort it in a spreadsheet to remove the .edu domain results. If you want to try it through Google, however, there's no command that will absolutely work.
This one's a good one to try:

    inanchor:oreilly -inurl:oreilly -site:edu

This search looks for the word O'Reilly in anchor text, the text that's used to define links. It excludes those pages that contain oreilly in their URL (e.g., oreilly.com). And, finally, it excludes those pages that come from the .edu domain.

But this type of search is nowhere near complete. It only finds those links to O'Reilly whose anchor text includes the string oreilly; if someone creates a link like <a href="http://perl.oreilly.com/">Camel Book</a>, it won't be found by the query above. Furthermore, there are other domains that contain the string oreilly, and possibly domains that link to oreilly that contain the string oreilly but aren't oreilly.com.

You could alter the string slightly, to omit the oreilly.com site itself, but not other sites containing the string oreilly:

    inanchor:oreilly -site:oreilly.com -site:edu

But you'd still be including many O'Reilly sites that aren't at oreilly.com.

So what does mix? Pretty much everything else, but there's a right way and a wrong way to do it.

8.2 How Not to Mix Syntaxes
8.3 How to Mix Syntaxes

If you're trying to narrow down search results, the intitle: and site: syntaxes are your best bet.

8.3.1 Titles and sites

For example, say you want to get an idea of what databases are offered by the state of Texas. Run this search:

    intitle:search intitle:records site:tx.us

You'll find 32 very targeted results. And of course, you can narrow down your search even more by adding keywords:

    birth intitle:search intitle:records site:tx.us

It doesn't seem to matter if you put plain keywords at the beginning or the end of the search query; I put them at the beginning, because they're easier to keep track of there.

The site: syntax, unlike site syntaxes on other search engines, allows you to get as general as a domain suffix (site:com) or as specific as a domain or subdomain (site:thomas.loc.gov). So if you're looking for records in El Paso, you can use this query:

    intitle:records site:el-paso.tx.us

and you'll get seven results.

8.3.2 Title and URL

Sometimes you'll want to find a certain type of information, but you don't want to narrow by type. Instead, you want to narrow by theme of information; say you want help or a search engine. That's when you need to search in the URL.

The inurl: syntax will search for a string in the URL but won't count finding it within a larger word. So, for example, if you search for inurl:research, Google will not find pages from researchbuzz.com, but it would find pages from www.research-councils.ac.uk.

Say you want to find information on biology, with an emphasis on learning or assistance. Try:

    intitle:biology inurl:help

This takes you to a manageable 162 results. The whole point is to get a number of results that finds you what you need but isn't so large as to be overwhelming.
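The whole-word behavior of inurl: described in this section can be approximated locally. The Python sketch below is an illustration only: it assumes a URL breaks into words at every non-alphanumeric character, which is a guess at the behavior described here and not Google's actual matching logic. The helper name inurl_matches is hypothetical.

```python
import re


def inurl_matches(term: str, url: str) -> bool:
    """Approximate inurl: by whole-word matching against the URL.

    Assumption: a URL breaks into words at any non-alphanumeric character.
    """
    words = [w for w in re.split(r"[^a-z0-9]+", url.lower()) if w]
    return term.lower() in words


print(inurl_matches("research", "www.research-councils.ac.uk"))  # True
print(inurl_matches("research", "researchbuzz.com"))  # False
```

Under this assumption, "research" counts as a match only when it stands as its own word between separators such as dots and hyphens, which is why researchbuzz.com is skipped.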
If you find 162 results overwhelming, you can easily add the site: syntax to the search and limit your results to university sites:

    intitle:biology inurl:help site:edu

But beware of using so many special syntaxes, as I mentioned above, that you narrow your search down to no results at all.

8.3.3 All the possibilities

It's possible that I could write down every possible syntax-mixing combination and briefly explain how they might be useful, but if I did that, I'd have no room for the rest of the hacks in this book.

Experiment. Experiment a lot. Always keep in mind that most of these syntaxes do not stand alone, and you can get more done by combining them than by using them one at a time.

Depending on what kind of research you do, different patterns will emerge over time. You may discover that focusing on only PDF documents (filetype:pdf) finds you the results you need. You may discover that you should concentrate on specific file types in specific domains (filetype:ppt site:tompeters.com). Mix up the syntaxes in as many ways as are relevant to your research and see what you get.
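If you experiment a lot, it may help to assemble mixed queries with a small script rather than by hand. The following Python sketch joins plain keywords and syntax:value pairs into a single query string and refuses to combine the antisocial syntaxes from section 8.1 with anything else. The names build_query and search_url are hypothetical helpers, and the search URL format is an assumption about the ordinary web search address, not something this hack specifies.

```python
from urllib.parse import urlencode

# Syntaxes the text calls "antisocial": they should be used on their own.
ANTISOCIAL = {"link", "stocks", "phonebook", "rphonebook", "bphonebook"}


def build_query(keywords=(), **syntaxes):
    """Join plain keywords and syntax:value pairs into one query string.

    Each syntax value may be a single string or a list of strings,
    e.g. intitle=["search", "records"].
    """
    parts = list(keywords)  # plain keywords go first, as the author prefers
    for name, values in syntaxes.items():
        if isinstance(values, str):
            values = [values]
        parts.extend(f"{name}:{value}" for value in values)
    used_antisocial = ANTISOCIAL & set(syntaxes)
    if used_antisocial and len(parts) > 1:
        raise ValueError(f"{sorted(used_antisocial)} should not be mixed")
    return " ".join(parts)


def search_url(query):
    # Assumption: the usual web search address with a "q" parameter.
    return "https://www.google.com/search?" + urlencode({"q": query})


print(build_query(["birth"], intitle=["search", "records"], site="tx.us"))
# birth intitle:search intitle:records site:tx.us
print(build_query(filetype="ppt", site="tompeters.com"))
# filetype:ppt site:tompeters.com
```

A call such as build_query(link="www.oreilly.com", site="edu") raises ValueError in this sketch, mirroring the point that link: does not mix. Exclusions such as -site:edu are not modeled here; they could be passed as plain keywords.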