Monday, May 30, 2011

How To Make Your AJAX Site Crawlable: 3 Simple Steps

One of the most important aspects of any search-engine-optimized website is that its content is accessible to search engine bots.  If your site loads content with JavaScript, there is a good chance that some of that content is not being indexed, which can lower your rankings on Google and other search engines.

A Google Solution
Luckily, Google has a specification for Making AJAX Applications Crawlable.  My implementation of the specification is outlined in the three steps below:

1) Hash Fragments That Begin With !

The first requirement is that all JavaScript-based links (i.e., links that perform an AJAX call and display the returned content) contain '!' as the first character of the hash fragment:

<!-- Example link from my site -->
<a href="#!about">about</a>

For history management I'm using jQuery Address with crawlable set to true.  Whatever library you use, make sure your JavaScript code handles links of this form correctly.

2) Create Static HTML Pages

For this solution to work, you need some HTML to serve when Googlebot makes a request.  For each AJAX link from step 1, generate a separate HTML page.  For simplicity, name each page after its hash fragment (e.g., #!about => about.html).  Each page should contain whatever content results from clicking the AJAX link.  NOTE: These static pages are served only to bots.
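If you have many AJAX links, generating the static pages by hand gets tedious. Here is a minimal Java sketch of one way to script the naming rule from this step. The class name StaticPageWriter, its method names, and the sample page content are all hypothetical; the page body should be whatever markup your AJAX call actually displays.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any library.
class StaticPageWriter {

    /** Applies the naming rule from this step: "#!about" => "about.html". */
    static String fileNameFor(String hashFragment) {
        String name = hashFragment.startsWith("#!") ? hashFragment.substring(2) : hashFragment;
        return name + ".html";
    }

    /** Wraps the markup that the AJAX call would display in a minimal HTML document. */
    static String renderPage(String title, String bodyHtml) {
        return "<!DOCTYPE html>\n"
             + "<html>\n<head><meta charset=\"UTF-8\"><title>" + title + "</title></head>\n"
             + "<body>\n" + bodyHtml + "\n</body>\n</html>\n";
    }

    /** Writes one static page per hash fragment into outputDir. */
    static void writePages(Path outputDir, Map<String, String> bodyByFragment) throws IOException {
        Files.createDirectories(outputDir);
        for (Map.Entry<String, String> page : bodyByFragment.entrySet()) {
            String html = renderPage(page.getKey(), page.getValue());
            Path target = outputDir.resolve(fileNameFor(page.getKey()));
            Files.write(target, html.getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed sample content; replace with what each AJAX link really shows.
        Map<String, String> pages = new LinkedHashMap<>();
        pages.put("#!about", "<h1>About</h1>");
        pages.put("#!contact", "<h1>Contact</h1>");

        Path outputDir = Files.createTempDirectory("static-pages");
        writePages(outputDir, pages);
        System.out.println("Wrote " + pages.size() + " pages to " + outputDir);
    }
}
```

Keeping the file name rule in one small method means the generator and the server-side redirect can agree on the same mapping.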

3) Handle "_escaped_fragment_" Server Side

Google's specification is slightly ambiguous about the mapping between #! and _escaped_fragment_.  What it boils down to is that when Googlebot crawls your site, it replaces #! in a URL with ?_escaped_fragment_=.  In other words, if you have a URL containing #!<some_value>, Googlebot will index it by sending your server a request with the parameter '_escaped_fragment_' set to '<some_value>'.  My primary JSP, which handles all requests, looks for this parameter before doing anything else and redirects if necessary, as follows:


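<%@ page import="java.net.URLDecoder" %>
<%
// Googlebot requests "#!about" as "?_escaped_fragment_=about".
String escapedFragment = request.getParameter("_escaped_fragment_");
if (escapedFragment != null) {
    // Decode the fragment, then redirect to the matching static page.
    String decodedEscapedFrag = URLDecoder.decode(escapedFragment, "UTF-8");
    response.sendRedirect(App.context() + decodedEscapedFrag + ".html");
    return;
}
%>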



In the above code snippet, notice that I am appending '.html' to the end of decodedEscapedFrag.  This code takes the hash fragment and redirects to the appropriate static HTML page.  For instance, if the link has href="#!about", then request.getParameter("_escaped_fragment_") will be equal to "about", and the resulting redirect will point to "/about.html".
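Because the redirect target is built from a request parameter, it may be worth checking the value before using it. The following is a minimal sketch of that idea, not the code running on my site: the class FragmentPages, the contextPath parameter, and the rule that page names contain only letters, digits, '_' and '-' are all assumptions made for illustration.

```java
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class FragmentPages {

    // Assumption: static pages have simple names such as "about".
    private static final Pattern SAFE_NAME = Pattern.compile("[A-Za-z0-9_-]+");

    /**
     * Maps an _escaped_fragment_ value to a static page path, or returns
     * empty if the value does not look like a simple page name.
     */
    static Optional<String> toStaticPage(String contextPath, String fragment) {
        if (fragment == null) {
            return Optional.empty();
        }
        // Tolerate a leading slash so "#!/about" and "#!about" map to the same page.
        String name = fragment.startsWith("/") ? fragment.substring(1) : fragment;
        if (!SAFE_NAME.matcher(name).matches()) {
            return Optional.empty();
        }
        return Optional.of(contextPath + "/" + name + ".html");
    }

    public static void main(String[] args) {
        System.out.println(toStaticPage("", "about").orElse("(no match)"));     // /about.html
        System.out.println(toStaticPage("", "../secret").orElse("(no match)")); // (no match)
    }
}
```

In a JSP like the one above, you would redirect only when the returned Optional is present and fall through to the normal page otherwise.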

Test It Out!
Now that you are finished, there are a couple of ways to check that your content is accessible.  For each URL containing your new hash fragment, replace "#!" with "?_escaped_fragment_=" and make sure your server returns the static HTML page.  You can try this out at my website.  You'll notice that http://dsswebdesign.com/#!/about can be changed to http://dsswebdesign.com/?_escaped_fragment_=about, and the page looks the same.
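If you have a long list of URLs to check, the substitution can be scripted. Below is a rough Java sketch (it needs Java 10 or later for the URLEncoder overload used). The class name EscapedFragmentUrl and the example.com URLs are made up for illustration, and the form-style encoding from URLEncoder is only an approximation of the escaping Google's specification describes.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
class EscapedFragmentUrl {

    /** Rewrites a "#!" URL into the "_escaped_fragment_" form that a crawler would request. */
    static String toCrawlerUrl(String url) {
        int hashBang = url.indexOf("#!");
        if (hashBang < 0) {
            return url; // No "#!" fragment, nothing to rewrite.
        }
        String base = url.substring(0, hashBang);
        String fragment = url.substring(hashBang + 2);
        // Append to an existing query string if the URL already has one.
        String separator = base.contains("?") ? "&" : "?";
        return base + separator + "_escaped_fragment_="
                + URLEncoder.encode(fragment, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Prints http://example.com/?_escaped_fragment_=about
        System.out.println(toCrawlerUrl("http://example.com/#!about"));
        // Prints http://example.com/?_escaped_fragment_=%2Fabout
        System.out.println(toCrawlerUrl("http://example.com/#!/about"));
    }
}
```

Paste each rewritten URL into a browser, or fetch it from a script, and confirm that the static page comes back.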

A second way to test this is to add your site to Webmaster Tools.  Under Diagnostics there is an option to 'Fetch as Googlebot'.  From there you can make sure that each of your #! links is reached successfully by Googlebot.

23 comments:

  1. Hi,
    cool article.
    I am currently playing around with jQuery Address and wanted to ask you a few questions.
    1. You wrote that the links in the HTML have to look like "#!index", but on their demo pages the links all look like this: "/crawling?_escaped_fragment_=%2F%3Fpage%3D%2Fgetting-started".
    I am a bit confused.

    2. The JSP code you use to redirect the requests: could it be written inside a .htaccess file? If so, what would it look like?
    Thanks!

  2. 1) If you inspect the code from the jQuery Address Crawling demo (http://www.asual.com/jquery/address/samples/crawling) you will see that the href attributes begin with "#!" (i.e., #!/?page=/getting-started). You should never see "_escaped_fragment_=" in the URL unless you request it explicitly. The example on the Asual site paints an unclear picture of how to make your site crawlable because it unnecessarily passes "_escaped_fragment_" to the server as a request parameter via JavaScript. The key points to remember about Googlebot are that it will not execute JavaScript, but it will send the "_escaped_fragment_" request parameter when it finds hrefs with "#!". Therefore, your code should only handle "_escaped_fragment_" on the server (it's not needed in your client code).

    2) I believe there is a way to specify redirects in .htaccess based on request parameters. You should be able to use a RewriteRule with a conditional that includes %{QUERY_STRING}. Have a look at this Apache documentation for more details: http://httpd.apache.org/docs/trunk/mod/mod_rewrite.html#rewriterule
