March 25, 2013

Manage Your Website in Google Webmaster Tools: Part 2




How are you getting along with Google Webmaster Tools (GWMT)? Last time we discussed using and managing settings in the Configuration section. I hope it helped you understand the settings. At least you know how to get started and what's in the Configuration section, don't you? If you missed the previous post, read Manage Your Website in Google Webmaster Tools: Part 1. Now it's time to move on and continue the series. Let's take a look at the Health section and what it has for bloggers and webmasters.


Health


The Health section is dedicated to displaying the status of your site from Google's point of view, that is, according to the data Google has collected about your site. If Google has identified important issues with your site, such as important pages being blocked from Googlebot or a malware infection, you will be able to find the necessary information here.


Crawl Errors


By now I believe you are familiar with the term crawling. Here you can view the errors Googlebot experienced when crawling your blog or website. Simply put, these are the errors that occurred when Google tried to access your site to scan through your content.

Site Errors


Would you mind if your site had errors? If your site cannot be accessed, obviously your content cannot be crawled either. As a prerequisite, Google examines the status of your site's DNS, hosting server and robots.txt before stepping into the crawling process.


You can click on each button to access the data reported for the last 90 days.

DNS - The Domain Name System, aka DNS, is responsible for translating your domain name into an IP address. Imagine a translator :) Googlebot can't use your human-readable site address (www.mayura4ever.com) like we do; it needs the IP address allocated to it. You know, computer technology is based on 1s and 0s.

If Googlebot can't communicate with the DNS server properly, Google won't be able to access your site. So you will be notified here about any DNS issues being experienced.
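
If you'd like to see this translation for yourself, here is a minimal Python sketch. It is not how Googlebot performs its lookup; it only asks your own computer's resolver for the IP address behind a host name. The function name resolve_host and its resolver parameter are names I made up for this illustration.

```python
import socket


def resolve_host(hostname, resolver=socket.gethostbyname):
    """Return the IPv4 address for hostname, or None if the lookup fails.

    resolver is a hypothetical hook added for this sketch so the lookup
    can be swapped out; by default it is the standard library's
    socket.gethostbyname.
    """
    try:
        return resolver(hostname)
    except socket.gaierror:
        # The name could not be translated, similar in spirit to a DNS error.
        return None


# Example call (needs an internet connection; the address will vary):
# print(resolve_host("www.mayura4ever.com"))
```

If the function returns None for your own domain, your computer couldn't translate the name, which is roughly the kind of trouble a DNS error points to.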


Server Connectivity - Hosting servers are responsible for holding your site data and files, and for offering the services that make your site accessible online. If the hosting server responds properly, Google will be able to access your site content for crawling. Otherwise, Google has to wait until it can access your site.

Common causes of connectivity errors are that your server is completely down or too busy to respond to Googlebot, maybe due to an exceeded bandwidth limit. In addition, there could be server configuration issues that cause conflicts with Googlebot.


Robots.txt Fetch - robots.txt is a simple text file with instructions for search engine bots. Can't search engine bots crawl without it? They CAN, but robots.txt is a way to tell them what NOT to crawl and which bots are NOT allowed to crawl the content of the site.

Googlebot looks for robots.txt before it starts crawling your site, to check whether it is allowed to crawl and which pages are disallowed from crawling. If Googlebot is blocked from crawling your site or robots.txt is inaccessible, Google won't crawl your site at all.

Wanna see the robots.txt file of your site? View it by appending /robots.txt to your site address, e.g. www.mayura4ever.com/robots.txt
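
To make the "what's NOT to crawl" and "who is NOT allowed" parts concrete, here is a small Python sketch using the standard library's urllib.robotparser. The rules, the bot name BadBot and the www.example.com addresses are made up for this example; they are not taken from any real site, and this is only a rough picture of how a well-behaved bot reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, made up for this example only.
SAMPLE_ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Parse robots.txt text and return a parser that is ready to query."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# What's NOT to crawl: every bot is kept out of /private/.
print(parser.can_fetch("Googlebot", "http://www.example.com/private/page.html"))  # False
# Other pages stay open to Googlebot.
print(parser.can_fetch("Googlebot", "http://www.example.com/post.html"))  # True
# Who is NOT allowed: BadBot is shut out of the whole site.
print(parser.can_fetch("BadBot", "http://www.example.com/post.html"))  # False
```

You can paste the content of your own robots.txt into SAMPLE_ROBOTS_TXT and try a few of your URLs the same way.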


URL Errors


You can view URL errors here if Google was unable to crawl pages on your blog or website. It also shows the URL errors that occurred when crawling the mobile version of your site.


Server error - It's similar to what we discussed under Server Connectivity, but here the reporting is specific to individual URLs. GWMT reports the URLs that couldn't be crawled successfully due to server-specific errors. If it reports server errors frequently, you will need to pay attention to the reliability of your hosting partner.


Not found - Not found errors occur when Google tries to crawl a page that doesn't exist on your site. Mostly it will be a page that has been removed from your site. Googlebot may also reach your pages from external sites via backlinks.

If someone has linked to a page removed from your site or misspelled the URL, the link will lead to a non-existent page, which returns the response code 404, aka not found.

Reviewing not found errors is a good opportunity to find out if someone links to a non-existent page on your site. The link may even be coming from your own site, so it's another way to catch some broken links.
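
If you want to hunt for such links in one of your own pages, the idea can be sketched in a few lines of Python. This is only an illustration, not what GWMT does internally. The names LinkCollector and find_broken_links are mine, and get_status is a hypothetical function you would supply yourself to return the HTTP status code of a URL. Here a made-up dictionary of statuses stands in for real requests.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_broken_links(html, get_status):
    """Return the links in html whose status code is 404.

    get_status is a hypothetical callable (URL in, HTTP status code out)
    that you provide, for example a wrapper around urllib.request.
    """
    collector = LinkCollector()
    collector.feed(html)
    return [link for link in collector.links if get_status(link) == 404]


# Made-up page and made-up status codes, for demonstration only.
page = (
    '<a href="http://www.example.com/ok">ok</a> '
    '<a href="http://www.example.com/gone">gone</a>'
)
fake_statuses = {
    "http://www.example.com/ok": 200,
    "http://www.example.com/gone": 404,
}
print(find_broken_links(page, fake_statuses.get))  # ['http://www.example.com/gone']
```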

Just click on the Not found box and you will be able to see the URLs identified as non-existent.


Further, clicking on each URL will allow you to explore more details about the error and how Googlebot found that URL.

Jump to the Linked from tab to find the backlinks pointing to that specific URL. Warning: You will find some broken links ;) After fixing the issue, you can click the Mark as fixed button to make the URL disappear from the list of Not found URLs.



Other - Other errors are errors experienced by Google, other than server and not found errors, that still prevented it from crawling your pages. For example, protected content that requires user credentials to access.

If your site has pages that are not open to the public and they appear under Other, you can ignore such URLs.
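
To sum up the three boxes, here is a rough Python sketch that sorts an HTTP status code into the same kind of groups. The grouping is my own simplification for illustration; GWMT's actual rules may differ, and the function name classify_crawl_result is made up.

```python
def classify_crawl_result(status_code):
    """Sort an HTTP status code into a GWMT-like bucket (a simplification)."""
    if status_code == 404:
        return "Not found"
    if 500 <= status_code <= 599:
        # The server itself failed to handle the request.
        return "Server error"
    if 400 <= status_code <= 499:
        # For example 401 or 403 for content that needs user credentials.
        return "Other"
    # Anything below 400 is treated as "no error" in this sketch.
    return "No error"


for code in (200, 403, 404, 503):
    print(code, classify_crawl_result(code))
```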



Crawl Stats


You can find crawling statistics for your blog or website here. I don't think you will need a detailed explanation, as the graphs reflect it all.

Most importantly, you can view the number of pages crawled per day for the last 90 days. It also shows download information related to the crawling process.



Blocked URLs


Robots.txt is used to tell search engine bots which pages on your site they are not allowed to crawl. Earlier I mentioned how you can view your site's robots.txt file manually.

You can see how many pages are being blocked from Google through the robots.txt file.

Further, you can test your robots.txt file against different URLs of your site and it will show whether Googlebot is allowed to crawl a page or not. Here it's better to be familiar with the use of the robots.txt file, so that you can change values and test them out with Googlebot.


It's only for demonstration purposes and won't make changes to the actual robots.txt file associated with your blog or website. By default, Googlebot is selected, and you can select another Google bot you wish to test too.


Once you test robots.txt against a URL, the results will be shown below.


As you see in the result above, the URL has been blocked from Googlebot and is not indexed by Google. Simply put, that page won't appear in Google search results.



Fetch as Google


This is a very helpful tool if you are experiencing issues with your site or pages in Google search results. Let's say you can't find a specific page listed in Google search results. You can use this tool to fetch it as Google and see if Googlebot can crawl it successfully.


Enter the rest of the URL in the given text field, or just click the Fetch button to crawl the homepage of your blog or website.

You can select the Web option to crawl as Googlebot, or the other options to crawl as Googlebot-Mobile. If all fetch statuses come back as Success, Google is capable of crawling your page.
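
For the curious, here is a rough Python sketch of requesting a page while announcing yourself with Googlebot's user-agent string. It is NOT a replacement for Fetch as Google: the request comes from your own computer, not from Google, so a site may respond differently. The function names and the opener parameter are made up for this illustration.

```python
import urllib.error
import urllib.request

# The user-agent string Googlebot announces itself with.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_googlebot_request(url):
    """Build a request that carries Googlebot's user-agent header."""
    return urllib.request.Request(url, headers={"User-Agent": GOOGLEBOT_UA})


def fetch_status(url, opener=urllib.request.urlopen):
    """Return the HTTP status code the server answers with.

    opener is a hypothetical hook for this sketch; by default it is
    urllib.request.urlopen, which needs an internet connection.
    """
    try:
        with opener(build_googlebot_request(url), timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises for 4xx/5xx answers; the code is still useful.
        return error.code


# Example call (needs an internet connection):
# print(fetch_status("http://www.mayura4ever.com/"))
```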


You can also click on individual fetch statuses to view a detailed report of how Googlebot fetched the particular page.




Index Status


How many pages are indexed by Google right now? You must be curious to know.

Basically, you will see the number of indexed pages on the Basic tab. It doesn't include duplicates and pages that have been blocked from indexing. You can switch to the Advanced view to find more details.


I find the graphical representation very helpful. You can see whether Google keeps indexing your new content, while a drop may indicate a problem with indexing your pages.




Malware


Are you aware of the security of your blog or website? If Google identifies that your site has been infected by malware, you will be able to find information here.

No one can add malware to your blog or website without having access to change the settings or content of your site. The common causes could be:

● A plugin, gadget or code snippet you have added to your site is acting as malicious software

● Someone else has taken control of your site and added malicious content. Simply put, your site has been hacked.

You can find a detailed view of the security of your site via the Google Safe Browsing Diagnostic page. Replace www.mayura4ever.com with your site address in the URL below and navigate to it to see the security status.



Enjoy :-)




Awesome! Thanks for coming by and taking the time to read this post :) I hope you have learnt something today. Now you can share it with your friends, and I'd love to hear from you too.











Comments



Harleena Singh said...

Informative post Mayura!

I remember the first part where you explained everything so well in detail, and now this second part too. I do use Google Webmaster Tools quite often, and I agree - this is one tool that every Blogger should use or at least check off and on if not all that frequently.

I too love Fetch as Google feature, and there was a time when my posts weren't getting indexed, so from that time on I use it now and feel it's at least a surety that things are going to work. :) And yes, sometimes I do have those URL errors showing up and wonder where and how they came up - so need to fix those. I guess all this only happens when we go to Webmasters and see these things - isn't it?

Thanks for sharing this with us. Have a nice week ahead :)

Sapna said...

Hi Mayura,


Great info share!


Google Webmaster Tools and Google Analytics have really become an integral part of my daily rituals, and they (Analytics) help me understand the patterns evolving for the organic and referral traffic. I really didn't notice the malware part; I need to check this one as well.


Thanks for this great share.


Sapna

PibblesNMe said...

Wow Mayura! This is an awesome post! I'm bookmarking it for future use as well as to share with others. I haven't dug into any of this as of yet since Wordpress has so many awesome plugins that help ya with it. But in case I want to try, I have a fab tutorial! Thanks Mayura!

Corina Ramos said...

This is awesome Mayura. I logged in and it was so easy to find all you explained here. Got some crawl errors to check out, that's for sure!


Saving this for sure! You really know your stuff my friend!

Amy Hagerup said...

Absolutely amazing info, my friend. I am going to come back to this so I can follow along step by step as I check out my sites. How do you learn all of this? Thanks!

Mayura De Silva said...

Hi Harleena,

I thought if I published the series of posts consecutively, folks would get fed up with GWMT ;) You know, though I try to make it simple, it's kinda technical and needs some time to get familiar with.

Exactly! If we've never been to GWMT or a similar tool, we won't discover such errors Harleena :) Before GWMT I thought everything was working well for my blog, but then I got to know about crawling errors. Fortunately they were not critical :)

Very good example of fetching as Googlebot dear :) There we don't need to contact Google and ask 'em to do that for us. We can do it on our own.


You can check the URLs and see where they are linked from dear :) If you are confident enough that your blog has no broken links, 404 errors can be ignored.


Thanks for coming by and adding more value to the post through your experiences and thoughts on GWMT Harleena :)


Cheers...

Mayura De Silva said...

Hi Sapna,

We, as bloggers, are always aware of the security and health of our sites :) So GWMT is a great place to learn about security related details and crawling errors experienced by Google dear.

If your site has malware, then Google won't help bring traffic until you fix it :) It's kind of like banning your site for users. So always keep an eye on it and check the Google Safe Browsing Diagnostic page too :)

Thanks for coming by and contributing your thoughts on this topic Sapna :)


Cheers...

Mayura De Silva said...

Hi Brenda,

Thought of getting started with GWMT dear? ;) Well, you can try the Google Safe Browsing Diagnostic page for your blogs right away on the security side.

Yeah, there are a lot of plugins to check security dear :) As it's about Google and the data comes from reliable sources, you can rely on it too. You are most welcome when you start with GWMT ;)


Thanks for coming over and sharing your thoughts on this topic Brenda :)


Cheers...

Mayura De Silva said...

Hi Corina,


Glad to hear that I'm not confusing you with the information here :D lol... Great! You have taken action, and it will help you find out if your site has some hidden errors such as broken links.


If you fix possible crawling errors, you are allowing search engines to crawl smoothly :)


Thanks for coming by and sharing your thoughts after trying it out Corina :)


Cheers...

Mayura De Silva said...

Hi Amy,

You can learn too dear :) I dug into GWMT while working with it, especially for a client website. Had to spend time with the GWMT documentation too. The Internet is full of information to learn anything :)

I hope you will try it out and find how your site's doing with crawling on Google Amy :) Make sure to try out Google Safe Browsing Diagnostic page too.



Thanks for coming by and adding your thoughts on this post Amy :)


Cheers...

kelly thompson said...

do you know what the robot should look like for technorati to be able to access your site?

Mayura De Silva said...

Hi Kelly,

You can check if TechnoratiBot/8.1 is mentioned in robots.txt :)

Cheers...

Neamat Tawadrous said...

Hi Mayura,

Great informative Post as always my friend. Lots to chew on.



I have to bookmark this post and go back to the last two posts and apply the steps there first and then come back to this one. I guess I have lots of homework to do.


Thanks for sharing this great information Mayura. How would we know all of this without you, my friend? You're one of a kind Mayura. God Bless.




Neamat

Sylviane Nuccio said...

Hi Mayura,


Oh gosh, this is very impressive but it also gives me a headache :) Wow, do I have to do all that?


I would have to read it all again slowly and go step by step I guess, and as busy as I am I don't know when this will happen. Anyway, is all of this really important to do?


This was a great post Mayura. What a great job you've done!

Donna Merrill said...

How much fun is this Mayura.

I went to check my blog at robots.txt and got this:

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/

Must be ok?

Then I went to Safe Browsing and I'm not suspicious, no malware and

Has this site hosted malware?

No, this site has not hosted malicious software over the past 90 days.


I guess this is a good thing right?



Everything else looks fine. Thanks so much. I had fun following this step by step.


Donna

Angela McCall said...

I guess I have no malicious malware. Thanks for explaining this to us step-by-step. Um, I'm not sure why my facebook "photo profile" appears here instead of my gravatar. Um, sometimes this is a pain in the neck.


Angela

Mayura De Silva said...

Hi Neamat,


Absolutely there's a lot dear :) Hey, Don't rush. Take your time and manage settings slowly.


When you get to know about it all, you don't need to learn again as you use GWMT :) It's a long term investment.


Keep your blog healthy and safe :)


Thanks for coming over and sharing your thoughts to the conversation Neamat :)


Cheers...

Mayura De Silva said...

Hi Sylviane,


lol :D Don't take on the whole picture at once; work on it step by step dear. You just need to do a few things, and most are for the purpose of learning. Bloggers are learners, no?

Well, security is a MUST, and you can find out if Google is crawling your blogs as it should dear :) You might find answers to questions you have too.


Thanks for coming by and sharing your thoughts on this topic Sylviane :)


Cheers...

Mayura De Silva said...

Hi Donna,


That's fun and fun only, isn't it? ;)


You are awesome, and I'm glad you took action right away dear :) See, you are becoming more of a techie. Should I say a doer?

Of course, there's no problem with your robots.txt and your blog was safe enough Donna :) Keep up with GWMT and you will be notified if you need attention on anything dear.


Thanks for contributing your thoughts and experiences after following the steps Donna :)


Cheers...

Mayura De Silva said...

Hi Angela,

I hope you reviewed the status via the Google Safe Browsing Diagnostic page :) As such issues might not be visible to us, it's a great way to find out dear.

Ah... Seems you signed up for Disqus via Facebook account Angela :) Just go to Disqus Avatar settings page and select Gravatar ;)

Thanks for coming by and commenting on this topic Angela :)


Cheers...

Lisa Buben said...

Mayura, great tips. I do this every morning for my retail sites. Question: What can you do if they do not index all your pages? Those crawl errors are so critical to find. Love how you make it in easy to follow steps.

Mayura De Silva said...

Hi Lisa,


Ah... You have a lot of retail sites to check on :) Not only Google's, but every webmaster tool offered by search engines gives us enormous information about our sites.


Well, the Sitemap feature in GWMT is for Googlebot to learn about all the pages you have dear :) I hope you have already submitted sitemaps there.


If you have an HTML or XML sitemap with all pages implemented on your sites, every search engine bot would love to crawl it rather than individual pages, and it also helps them discover pages that are not in their index.


Both ways are more effective for indexing all pages Lisa :) Well, except for blocked pages.

Thanks for coming by and sharing your thoughts dear :) You always got very helpful questions for the conversation.


Cheers...

Lisa Buben said...

Interesting Mayura, I was checking this a.m. and not all the pages are indexed, what do you make of that? And I don't have blocked pages. It is XML sitemaps I use on the retail sites. And my WordPress sites are all indexed. Making me think.

Mayura De Silva said...

Did you check Index Status in GWMT dear? :)

However, take a look at the XML sitemap and see if it holds links to all the pages Lisa :)


Cheers...

Mosam Gor said...

Hi Mayura,

Great Share!

Detailed tutorial on Google Webmaster Tools. Thanks for sharing such a nice, informative post! And yes, you wrote a wonderful article about Google Webmaster Tools. Keep writing such articles. Thank you.


Mosam

Angela McCall said...

Hi Mayura, I FIXED it. YAY!!!!!!!!!!!!!!!!!!!!!


Angela

Sue Price said...

Hi Mayura, I went to read this one first but had missed Part 1 so read that first. Now I see that Donna has taken action so maybe I could try it. I laugh Donna and Sylviane are both friends and I am always glad when I see other people who do not like tech stuff either.


I really appreciated that you explained all the definitions here as I have such gaps in my knowledge. Have I ever told you that when I came online I did not know what a browser was? :-) I knew nothing. I had worked in a professional office with IT people and I had a personal assistant so I was very hands off :-)


Thanks for this and I may follow Donna's lead.

Sue

Mayura De Silva said...

Awesome :) Glad you fixed it and enjoy now Angela.


Cheers...

Mayura De Silva said...

Hi Mosam,


I hope you will check out GWMT for your site too and make necessary changes to keep it healthy :) It's a great way to know about the health.


Thanks for coming and sharing your comment mate :)


Cheers...

Mayura De Silva said...

Hi Sue,


Of course ;) See, Donna has made it and she used to claim herself as a non-techie too. But it's wrong. You can follow and check on your site like Donna did. If you need help, you can always come back and ask here :)

lol :D I came online in 2011 and until then I didn't know a lot of stuff either Sue. But we can look back and see how far we have come and how much we have grasped, no? There's a lot to learn.

Follow and learn Sue :) I bet, you can do it and see yourself.


Thanks for coming over and contributing your thoughts to the discussion Sue :)


Cheers...

Mayura De Silva said...

Hi Sue,

Awesome :) Glad you went there and checked on health matters in GWMT after reading.

No dear, as long as it's 404 errors you don't need to worry much unless they are linked from your own blog dear :)


You know, you have no control over links placed on external sites if they are not owned by you. If they are linked from your own site, it possibly means you have some broken links to fix.

When you check Linked from tab, check for links originating from your site Sue.

However, if you can't find a clue, I mean a link to the non-existent page, mark 'em as fixed and see if it appears again under Not found after a few days. If it appears again, then you can start searching again :)



If you need help, you can always come back and ask here Sue :) You will be familiar with it more as you use it further.


Thanks for coming over and adding your comment after following the steps Sue :) Appreciate your question as someone else can use it as a help here.


Cheers...

Mayura De Silva said...

Hi Adrienne,

Nice timing I guess :)

Ah... Broken links never keep anyone happy, as they can lead to a bad user experience. Better to get rid of all we can find.

However the Broken Link tool you have introduced on your blog has some limitations too dear. Remember it checks for broken links on your site only, which matters most.


GWMT shows external misspelled links and links to non-existent pages too. So we can ignore external links that we have no control over :)



I hope your group will find the series helpful to get most out of GWMT dear :)


Thanks for coming over and sharing your thoughts on this topic Adrienne :) I appreciate sharing the posts with your group too.


Cheers...

RobG said...

Hello Mayura, Thanks so much for a very helpful post. I've learned that we need to know where our sources are coming from; this way we can improve the way we do things online and focus on our building.


Thanks so much for the share..

Sue Neal said...

Thanks very much, Mayura - two are supposed to be coming from my site, so I'll mark them as fixed and see what happens. Should I also mark the one from the external site as fixed or just ignore it?

Mayura De Silva said...

You're welcome dear :)

You can mark it as fixed too :) But probably it will appear again if Google finds the link again Sue.

Sometimes I mark such URLs as fixed just to see no errors there :D lol...

Cheers...

Mayura De Silva said...

Hey Rob,



Site health always matters, nah? :) GWMT shows what Googlebots find regarding your sites and you can fix errors it experiences mate.

Yeah, we can find backlinks and broken links but some will be out of our control mate :) Anyway don't need to worry much.


Thanks for coming over and sharing your thoughts and views on GWMT Rob :)


Hope you are doing great over there :)


Cheers...

Sue Neal said...

Cheers, Mayura - will do :)

Oluwaseun Babajide said...

I have been looking for something like this for a while. Very well written and explanatory!
Thanks for sharing.

Seun

Mayura De Silva said...

Hi Oluwaseun,


Glad to hear this is of help to you mate :) Keep checking on your site health and stay away from critical errors.


Thanks for coming over and sharing your thoughts on this topic :)


Cheers...

Angela McCall said...

I just wanna tell you that I finally did it. And now I'm VERIFIED @ Google Webmaster Tools. Obviously their instructions there are very EASY to follow. I just have to FOCUS more on their website there. I tell ya...knowing Google is the largest search engine...I'm surprised their website is NOT that great as far as design is concerned. They are RICH!!! And they could afford web designers...I just wish that they were more User Friendly. Anyhoo, I finally did it and that's all that matters!! :)

Mayura De Silva said...

Fantabulous! Really glad you got through it and verified Angela :)

Ha ha... I like the simplicity of their sites, and actually you need to know some prerequisites before you start with GWMT. I was confused at first too, but now I feel that's normal as it's more of a technical kinda thing. Of course, they can improve the user experience :)

However, I wish there was a default place to enter the GWMT verification tag in the self-hosted WordPress platform, like in WordPress.com blogs.

Now keep monitoring and check out how GWMT can help you Angela :)

Cheers...

Vikash Khetan said...

Extremely Informative and very nicely explained. I always treat Health section as the most important part in webmasters, because this is the area which raises alarm when anything is wrong with the site, be that 404 errors, server errors, unnecessary indexing and lots more.

Mayura De Silva said...

Hi Vikash,



Absolutely, site health is critical and we need to pay attention every time mate :) With no site, there's nothing to manage, right?

You are right. GWMT notifies you when there's any problem with your site that needs attention. Glad you are already watching that stuff; you will know when there are any health issues and can take action rather than letting them vanish into thin air :)


Thanks for coming over and sharing your views and thoughts on site health Vikash :)

Cheers...

Carolyn Nicander Mohr said...

Hi Mayura, Once again, soooo helpful! I have two questions. First, should we manually add articles to GWMT to be indexed?


Second, if a link is broken, how do we fix it? I used to have the date inserted in my URL but changed it to get rid of that. When my blog moved to a different host, I got different URL's even for the old posts. So how do I do a redirect?

Barry Wells said...

Hey Mayura, you're a god send my friend :)


I've been having issues with Googlebots reporting that they're not able to crawl my site and it's been driving me potty, as stated in my last comment.


However, following your series and checking/completing my settings as I go, I'm seeing that Googlebots ARE accessing my site and pulling in all the info as detailed in the post.


So far I'm showing a good bill of health (my site not me ha ha) other than the message saying the bots can't access the site.


The Fetch as Google was really helpful as I have been concerned about Google not displaying my blog or posts on a variety of searches. However, fetching as Google did exactly that and displayed ok..... Still a wonder why Google doesn't display some of my blog posts though :(


On to the next post now :)


Barry

Mayura De Silva said...

Hi Carolyn,

Sorry for being late to reply dear.

No dear, you don't need to add articles manually in GWMT. But it's always best to submit sitemaps, as it is much quicker when it comes to indexing. You can also find out if Google couldn't index any of the pages on your blog. You can find more information about submitting sitemaps in Part 4 :)

Mmm... Changing host and URL structure. Changing the URL structure especially is a big process to deal with Carolyn. A structural change :)


Actually I think you are worrying about redirecting old URLs. You can use a plugin for a few URLs, or you'll have to edit .htaccess and write some lines of code to do the redirect Carolyn. Well, I can't give an exact code snippet, 'cause I'd need to get in there, write the code and test it out for the old URLs :)


After that, you have to remove the old URLs from search engines and submit sitemaps, if possible. Otherwise it will take a long time to crawl all the URLs from scratch. You already know how to do it for Google from Part 4 of this series.


Let me know if you need help with redirection Carolyn :)


Thanks for coming by and sharing your thoughts about the post dear :)


Cheers...

Mayura De Silva said...

Hi Barry,


I'm late to reply here. Sorry about that mate.


Cool... So Googlebots can crawl your pages without any errors. It's really nice of you to check it out. I think you probably need to submit a sitemap to Google via GWMT Barry. Submit it and check if it indicates that all the URLs were indexed or there's more to be indexed.


Right after that, we can check for other possible causes mate :)


Let me know if you are having issues after submitting sitemaps Barry :) Well, it takes some time to index everything though. You can see the progress in GWMT.


Thanks for coming over and sharing your thoughts and the experiences you had with your blog Barry :) I'm really glad to see you are making use of the series mate.


Cheers...