Tuesday, February 12, 2008
Our search quality and Webmaster Central teams love helping webmasters solve problems. But since
we can't be in all places at all times answering all questions, we also try hard to show you how
to help yourself. We put a lot of work into providing documentation and blog posts to answer your
questions and guide you through the data and tools we provide, and we're constantly looking for
ways to improve the visibility of that information.
While I always encourage people to search our Help Center and blog for answers, there are a few
articles in particular to which I'm constantly referring people. Some are recent and some are
buried in years' worth of archives, but each is worth a read:
[**Googlebot can't access my website:**](/search/blog/2006/09/how-to-verify-googlebot)
Web hosters seem to be getting more aggressive about blocking spam bots and aggressive crawlers
from their servers, which is generally a good thing; however, sometimes they also block
Googlebot without knowing it. If you or your hoster are "allowing" Googlebot through by
allowlisting Googlebot IP addresses, you may still be blocking some of our IPs without knowing
it (since our full IP list isn't public, for reasons explained in the post). To be sure you're
allowing Googlebot access to your site, use the method in this blog post to verify whether a
crawler is Googlebot.
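The verification method that post describes is a two-step DNS check: do a reverse DNS lookup on
the IP address that's crawling you, confirm the hostname it returns is under googlebot.com (or
google.com), then do a forward DNS lookup on that hostname and make sure it resolves back to the
same IP. Here's a minimal Python sketch of that check; the function name and structure are just
illustrative, not anything prescribed by the post:

```python
import socket

def is_googlebot(ip_address):
    """Verify a crawler claiming to be Googlebot via a two-step DNS check."""
    try:
        # Step 1: reverse DNS lookup on the requesting IP.
        hostname, _, _ = socket.gethostbyaddr(ip_address)
    except socket.herror:
        return False  # no reverse-DNS record, so it can't be verified
    # Step 2: a genuine Googlebot hostname is under googlebot.com or google.com.
    if not hostname.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        # Step 3: the forward lookup must map back to the original IP, so a
        # spoofed reverse-DNS record alone isn't enough to pass the check.
        return ip_address in socket.gethostbyname_ex(hostname)[2]
    except socket.gaierror:
        return False
```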
[**URL blocked by robots.txt:**](/search/blog/2006/09/debugging-blocked-urls_19)
Sometimes the web crawl section of Webmaster Tools reports a URL as "blocked by robots.txt",
but your robots.txt file doesn't seem to block crawling of that URL. Check out this list of
troubleshooting tips, especially the part about redirects.
[This thread](https://groups.google.com/group/Google_Webmaster_Help-Tools/browse_thread/thread/7373992320bba7fd/a848d486580e28ba#a848d486580e28ba)
from our Help Group also explains why you may see discrepancies between our web crawl error
reports and our robots.txt analysis tool.
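Before digging into those tips, it can help to see what a standards-following parser makes of
your robots.txt. Python's standard library ships one; this is a minimal sketch, with example.com
standing in for your own site:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical site; substitute your own domain.
robots = RobotFileParser("https://www.example.com/robots.txt")
robots.read()  # fetches and parses the live file

# Would this robots.txt let Googlebot crawl the URL in question?
print(robots.can_fetch("Googlebot", "https://www.example.com/some/page.html"))
```

One caveat that echoes the post's point about redirects: `read()` follows whatever the
underlying HTTP fetch returns, so if your robots.txt URL redirects, you're testing the file at
the redirect target rather than the one you think you're serving.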
[**Why was my URL removal request denied?**](https://support.google.com/webmasters/answer/9689846)
(Okay, I'm cheating a little: this one is a Help Center article and not a blog post.) In order
to remove a URL from Google search results, you need to first put something in place that will
prevent Googlebot from simply picking that URL up again the next time it crawls your site. This
may be a `404` (or `410`) status code, a `noindex` meta tag, or a robots.txt file, depending on
what type of removal request you're submitting. Follow the directions in this article and you
should be good to go.
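To make the first two options concrete, here's a toy Python server (the path and port are made
up for the example) that answers a removed URL with a `410` and shows where a `noindex` meta tag
would go on a page you keep serving:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

REMOVED_PATHS = {"/old-page.html"}  # hypothetical URLs you want removed

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path in REMOVED_PATHS:
            # 410 Gone (a 404 also works) tells Googlebot the page is
            # intentionally gone, so it won't just be picked up again.
            self.send_response(410)
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        # Alternatively, keep serving the page but mark it noindex:
        self.wfile.write(b"<html><head>"
                         b'<meta name="robots" content="noindex">'
                         b"</head><body>Page content</body></html>")

if __name__ == "__main__":
    HTTPServer(("", 8000), Handler).serve_forever()
```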
[**Flash best practices:**](/search/blog/2007/07/best-uses-of-flash)
Flash continues to be a hot topic for webmasters interested in making visually complex content
accessible to search engines. In this post Bergy, our resident Flash expert, outlines best
practices for working with Flash.
[**The supplemental index:**](/search/blog/2007/12/ultimate-fate-of-supplemental-results)
The "supplemental index" was a big topic of conversation in 2007, and it seems some webmasters
are still worried about it. Instead of worrying, point your browser to this post on how we now
search our entire index for every query.
[**Duplicate content:**](/search/blog/2007/09/google-duplicate-content-caused-by-url)
Duplicate content is another perennial concern of webmasters. This post talks in detail about
duplicate content caused by URL parameters, and also references Adam's previous post on
[deftly dealing with duplicate content](/search/blog/2006/12/deftly-dealing-with-duplicate-content),
which gives lots of good suggestions on how to avoid or mitigate problems caused by duplicate
content.
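One common way to mitigate parameter-driven duplicates is to link consistently to a single
canonical spelling of each URL. A sketch of that idea, where the ignored parameter names are
invented examples of parameters that don't change the page's content:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameters that don't affect what the page shows.
IGNORED_PARAMS = {"sessionid", "ref"}

def canonicalize(url):
    """Collapse equivalent URLs onto one canonical spelling by dropping
    content-neutral parameters and sorting the rest."""
    scheme, netloc, path, query, _fragment = urlsplit(url)
    params = [(k, v) for k, v in parse_qsl(query) if k not in IGNORED_PARAMS]
    return urlunsplit((scheme, netloc.lower(), path,
                       urlencode(sorted(params)), ""))

# Both variants collapse to https://www.example.com/shoes?color=red
print(canonicalize("https://www.example.com/shoes?color=red&sessionid=123"))
print(canonicalize("https://WWW.example.com/shoes?sessionid=987&color=red"))
```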
[**Sitemaps FAQs:**](/search/blog/2008/01/sitemaps-faqs) This post
answers the most frequent questions we get about Sitemaps. And I'm not just saying it's great
because I posted it. :-)
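A lot of Sitemaps questions start with what a valid file even looks like, so here's a minimal
sketch that writes one with Python's standard library; the URLs and dates are invented:

```python
import xml.etree.ElementTree as ET

# Hypothetical pages; in practice you'd enumerate these from your site.
pages = [
    ("https://www.example.com/", "2008-02-12"),
    ("https://www.example.com/about.html", "2008-01-15"),
]

# The sitemaps.org namespace declaration is required for a valid Sitemap.
urlset = ET.Element("urlset",
                    xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
for loc, lastmod in pages:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = loc
    ET.SubElement(url, "lastmod").text = lastmod

ET.ElementTree(urlset).write("sitemap.xml",
                             encoding="utf-8", xml_declaration=True)
```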
Sometimes, knowing how to find existing information is the biggest barrier to getting a question
answered. So try searching our [blog](/search/blog),
[Help Center](https://support.google.com/webmasters) and
[Help Group](https://support.google.com/webmasters/community)
next time you have a question, and please
[let us know](https://support.google.com/webmasters/community)
if you can't find a piece of information that you think should be there!
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Missing the information I need","missingTheInformationINeed","thumb-down"],["Too complicated / too many steps","tooComplicatedTooManySteps","thumb-down"],["Out of date","outOfDate","thumb-down"],["Samples / code issue","samplesCodeIssue","thumb-down"],["Other","otherDown","thumb-down"]],[],[[["\u003cp\u003eGoogle provides various resources, including documentation, blog posts, a Help Center, and a Help Group, to assist webmasters in resolving website issues.\u003c/p\u003e\n"],["\u003cp\u003eWebmasters should verify Googlebot access using recommended methods to ensure their website is being crawled correctly and avoid unintentional blocking.\u003c/p\u003e\n"],["\u003cp\u003eUnderstanding how to prevent duplicate content, utilize robots.txt effectively, and address URL removal request denials are crucial for maintaining a healthy website presence in Google Search.\u003c/p\u003e\n"],["\u003cp\u003eBest practices for Flash and Sitemaps, along with information on the supplemental index, are available to help webmasters optimize their content for search engines.\u003c/p\u003e\n"],["\u003cp\u003eWebmasters are encouraged to utilize the available search functionality and resources before seeking direct support, ensuring efficient problem-solving and access to valuable information.\u003c/p\u003e\n"]]],["The post highlights crucial resources for webmasters. Key actions include: verifying Googlebot access, troubleshooting URLs blocked by robots.txt, and understanding URL removal processes. It addresses Flash best practices, the evolution of the supplemental index, and issues of duplicate content, especially those caused by URL parameters. Also, It covers common questions about Sitemaps, encouraging users to utilize the blog, Help Center, and Help Group to find answers to their questions.\n"],null,["Tuesday, February 12, 2008\n\n\nOur search quality and Webmaster Central teams love helping webmasters solve problems. But since\nwe can't be in all places at all times answering all questions, we also try hard to show you how\nto help yourself. We put a lot of work into providing documentation and blog posts to answer your\nquestions and guide you through the data and tools we provide, and we're constantly looking for\nways to improve the visibility of that information.\n\n\nWhile I always encourage people to search our Help Center and blog for answers, there are a few\narticles in particular to which I'm constantly referring people. Some are recent and some are\nburied in years' worth of archives, but each is worth a read:\n\n1. [**Googlebot can't access my website:**](/search/blog/2006/09/how-to-verify-googlebot) Web hosters seem to be getting more aggressive about blocking spam bots and aggressive crawlers from their servers, which is generally a good thing; however, sometimes they also block Googlebot without knowing it. If you or your hoster are \"allowing\" Googlebot through by allowlisting Googlebot IP addresses, you may still be blocking some of our IPs without knowing it (since our full IP list isn't public, for reasons explained in the post). In order to be sure you're allowing Googlebot access to your site, use the method in this blog post to verify whether a crawler is Googlebot.\n2. 
[**URL blocked by robots.txt:**](/search/blog/2006/09/debugging-blocked-urls_19) Sometimes the web crawl section of Webmaster Tools reports a URL as \"blocked by robots.txt\", but your robots.txt file doesn't seem to block crawling of that URL. Check out this list of troubleshooting tips, especially the part about redirects. [This thread](https://groups.google.com/group/Google_Webmaster_Help-Tools/browse_thread/thread/7373992320bba7fd/a848d486580e28ba#a848d486580e28ba) from our Help Group also explains why you may see discrepancies between our web crawl error reports and our robots.txt analysis tool.\n3. [**Why was my URL removal request denied?**](https://support.google.com/webmasters/answer/9689846) (Okay, I'm cheating a little: this one is a Help Center article and not a blog post.) In order to remove a URL from Google search results you need to first put something in place that will prevent Googlebot from simply picking that URL up again the next time it crawls your site. This may be a `404` (or `410`) status code, a `noindex` `meta` tag, or a robots.txt file, depending on what type of removal request you're submitting. Follow the directions in this article and you should be good to go.\n4. [**Flash best practices:**](/search/blog/2007/07/best-uses-of-flash) Flash continues to be a hot topic for webmasters interested in making visually complex content accessible to search engines. In this post Bergy, our resident Flash expert, outlines best practices for working with Flash.\n5. [**The supplemental index:**](/search/blog/2007/12/ultimate-fate-of-supplemental-results) The \"supplemental index\" was a big topic of conversation in 2007, and it seems some webmasters are still worried about it. Instead of worrying, point your browser to this post on how we now search our entire index for every query.\n6. [**Duplicate content:**](/search/blog/2007/09/google-duplicate-content-caused-by-url) Duplicate content---another perennial concern of webmasters. This post talks in detail about duplicate content caused by URL parameters, and also references Adam's previous post on [deftly dealing with duplicate content](/search/blog/2006/12/deftly-dealing-with-duplicate-content), which gives lots of good suggestions on how to avoid or mitigate problems caused by duplicate content.\n7. [**Sitemaps FAQs:**](/search/blog/2008/01/sitemaps-faqs) This post answers the most frequent questions we get about Sitemaps. And I'm not just saying it's great because I posted it. :-)\n\n\nSometimes, knowing how to find existing information is the biggest barrier to getting a question\nanswered. So try searching our [blog](/search/blog),\n[Help Center](https://support.google.com/webmasters) and\n[Help Group](https://support.google.com/webmasters/community)\nnext time you have a question, and please\n[let us know](https://support.google.com/webmasters/community)\nif you can't find a piece of information that you think should be there!\n\nPosted by Susan Moskwa, Webmaster Trends Analyst"]]