For years, a simple robots.txt file has kept search engines out of two subdirectories on my Website, which contain nothing but mapping routines that are called, with parameter strings, by visible pages elsewhere on the Website. Suddenly, Google is treating those parameterized URLs as pages that it can present in search results, while claiming that the robots.txt directives prevent Google from giving a preview of the page. Worse yet, Google puts these unwanted items first in a search results list. (Try googling Wappingers site:towerbells.org and then try the same thing on DuckDuckGo or Bing or Yahoo. The first item in Google’s list is actually a link from within the second item!)
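For anyone unfamiliar with such files, the directives in question look something like this (the subdirectory names here are placeholders, not the actual ones on towerbells.org):

    # Applies to all crawlers; subdirectory names are hypothetical.
    User-agent: *
    Disallow: /mapdir1/
    Disallow: /mapdir2/

Every well-behaved Webcrawler reads this file before fetching anything else from a Website, and is supposed to skip the listed subdirectories entirely.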
Google’s explanation of why this happens and what to do about it is classic doublespeak. On the one hand, it claims that a robots.txt file is useful to keep a Website from being overloaded with requests. On the other hand, it says that the only way to keep a Webpage out of Google searches is to put a meta statement with a “noindex” parameter into the HTML for that page, and then remove the robots.txt restriction on that page so Google can crawl it and find that meta statement. What nonsense! Pretending that the existence of a URL pointing into a protected subdirectory justifies presenting that URL in search results is just plain stupid, because such URLs are the only way any Webcrawler has of even noticing that something actually exists within such a subdirectory. And following Google’s suggested “solution” not only increases the workload on the crawled Website; it also increases the workload on both the Webcrawler and the search engine, all for the purpose of presenting to a searcher something that doesn’t belong in a results list anyway!
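For the record, the meta statement Google wants looks like this, placed in the head section of every page to be excluded (which, for parameterized URLs like mine, presumably means every mapping routine would have to emit it in its output):

    <meta name="robots" content="noindex">

Google also accepts an equivalent X-Robots-Tag: noindex HTTP response header, but either way the crawler must be allowed to fetch the page before the directive can be seen, which is exactly the extra workload objected to above.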
Google ought to revert to what it had been doing until recently, and what other Webcrawlers & search engines have done all along: ignore everything that is hidden by a robots.txt directive.
If anyone complains to me that my Website doesn’t follow Google’s recommendations, they’ll get this lesson on Google’s bad judgment.