Monday, May 30, 2011

How To Make Your AJAX Site Crawlable: 3 Simple Steps

One of the most important aspects of any search engine optimized website is that its content is accessible to search engine bots. If your site uses JavaScript, there is a good chance that some of your content is not being indexed, which will result in lower rankings on Google and other search engines.

A Google Solution
Luckily, Google has a specification for Making AJAX Applications Crawlable. My implementation of the specification is outlined in the three steps below:

1) Hash Fragments That Begin With !

The first requirement is that all JavaScript-based links (i.e., links that perform an AJAX call and return content as a result) contain '!' as the first character of the hash fragment:

<!-- example link from my site -->
<a href="#!about">about</a>

For history management I'm using jQuery Address with 'crawlable' set to true. You will need to make sure that your JavaScript code can handle links of this form.

2) Create Static HTML Pages

In order for this solution to work, you need some HTML to serve when requests are made by Googlebot. For each AJAX link from step 1, generate a separate HTML page. For simplicity, name the HTML page after the hash fragment (i.e., #!about => about.html). Each .html page should contain whatever content results from clicking the AJAX link. NOTE: These static pages will be used only by bots.
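As a rough sketch of step 2 (this is not the code from my site; the renderer stub and directory layout are placeholders), snapshot generation could look something like this:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SnapshotGenerator {

    // Stub renderer: a real site would produce the same markup the AJAX
    // call renders for this fragment. The markup below is a placeholder.
    static String renderContentFor(String fragment) {
        return "<html><body><h1>" + fragment + "</h1></body></html>";
    }

    // Write <fragment>.html into outputDir (e.g. #!about => about.html).
    static Path writeSnapshot(Path outputDir, String fragment) throws IOException {
        Path page = outputDir.resolve(fragment + ".html");
        Files.write(page, renderContentFor(fragment).getBytes(StandardCharsets.UTF_8));
        return page;
    }
}
```

You could run this as a build step, once per hash fragment your site exposes.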

3) Handle "_escaped_fragment_" Server Side

There is a slight bit of ambiguity in Google's specification about the mapping between #! and _escaped_fragment_. It boils down to this: when Googlebot scans your site, URLs containing #! are rewritten to use ?_escaped_fragment_=. In other words, if you have a URL containing #!<some_value>, you can be sure that when Googlebot attempts to index that URL, your server will receive a request with the parameter '_escaped_fragment_' set to '<some_value>'. My primary JSP, which handles all requests, first looks for this parameter and redirects if necessary:


<%@ page import="java.net.URLDecoder" %>
<%
    // If Googlebot sent the _escaped_fragment_ parameter, redirect to the
    // static HTML snapshot generated in step 2.
    if (request.getParameter("_escaped_fragment_") != null) {
        String escapedFragment = request.getParameter("_escaped_fragment_");
        String decodedEscapedFragment = URLDecoder.decode(escapedFragment, "UTF-8");
        response.sendRedirect(App.context() + decodedEscapedFragment + ".html");
        return;
    }
%>



In the above code snippet, notice that I append '.html' to the end of the decoded escaped fragment. The code takes the hash fragment and redirects to the corresponding static HTML page. For instance, if the clicked link has href="#!about", then request.getParameter("_escaped_fragment_") will return "about", and the resulting redirect will point to "/about.html".
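The same mapping can be written as a small plain-Java helper, which makes the decode-then-append logic easy to unit test (a sketch; the class and method names here are mine, not from my site's code):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class EscapedFragmentMapper {

    // Decode the raw _escaped_fragment_ value and build the path of the
    // matching static page, following the naming convention from step 2.
    static String staticPagePath(String escapedFragment) throws UnsupportedEncodingException {
        String decoded = URLDecoder.decode(escapedFragment, "UTF-8");
        return "/" + decoded + ".html";
    }
}
```

Note that the parameter value arrives URL-encoded, which is why the decode step matters for fragments containing special characters.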

Test It Out!
Now that you are finished, there are a couple of ways to check whether your content is accessible. For each URL containing your new hash fragment, replace "#!" with "?_escaped_fragment_=" and make sure your server returns the static HTML page. You can try this out on my website. You'll notice that http://dsswebdesign.com/#!/about can be changed to http://dsswebdesign.com/?_escaped_fragment_=about, and the page looks the same.
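If you want to derive those test URLs programmatically, the rewrite Googlebot performs can be sketched as follows (my own illustration, simplified to assume the URL has no existing query string; per the spec, the fragment value is URL-encoded in the rewritten URL):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class CrawlerUrlMapper {

    // Rewrite "...#!<value>" into "...?_escaped_fragment_=<encoded value>",
    // the form Googlebot actually requests from the server.
    static String toEscapedFragmentUrl(String url) throws UnsupportedEncodingException {
        int bang = url.indexOf("#!");
        if (bang < 0) {
            return url; // no crawlable fragment; URL is unchanged
        }
        String base = url.substring(0, bang);
        String fragment = url.substring(bang + 2);
        return base + "?_escaped_fragment_=" + URLEncoder.encode(fragment, "UTF-8");
    }
}
```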

A second way to test this is to add your site to Webmaster Tools. Under Diagnostics there is an option called 'Fetch as Googlebot'. From there you can verify that each of your #! links is reached successfully by Googlebot.

17 comments:

  1. Hi,
    cool article.
    I am currently playing around with jQuery Address and wanted to ask you a few questions.
    1. You wrote that the links in the HTML have to look like "#!index", but on their demo pages the links all look like this: "/crawling?_escaped_fragment_=%2F%3Fpage%3D%2Fgetting-started".
    I am a bit confused.

    2. The JSP code you use to redirect the requests - could it be written inside a .htaccess file? If so, what would it look like?
    Thnx!

  2. 1) If you inspect the code from the jQuery Address crawling demo (http://www.asual.com/jquery/address/samples/crawling) you will see that the href attributes begin with "#!" (i.e., #!/?page=/getting-started). You should never see "_escaped_fragment_=" in the URL unless you request it explicitly. The example on the Asual site paints an unclear picture of how to make your site crawlable because it unnecessarily passes "_escaped_fragment_" to the server as a request parameter via JavaScript. The key points to remember about Googlebot are that it will not execute JavaScript, but it will send the "_escaped_fragment_" request parameter when it finds hrefs containing "#!". Therefore, your code should only deal with "_escaped_fragment_" on the server (it's not needed in your client code).

    2) I believe there is a way to specify redirects in .htaccess based on request parameters. You should be able to use a RewriteRule with a conditional that includes %{QUERY_STRING}. Have a look at this Apache documentation for more details: http://httpd.apache.org/docs/trunk/mod/mod_rewrite.html#rewriterule
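    As a rough, untested sketch (the patterns and flags here are my assumptions, and you would need to adjust them to your URL layout):

    ```apache
    # Rewrite ?_escaped_fragment_=<name> on the root URL to the
    # static /<name>.html snapshot from step 2.
    RewriteEngine On
    RewriteCond %{QUERY_STRING} ^_escaped_fragment_=(.+)$
    RewriteRule ^$ /%1.html? [R=302,L]
    ```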

  3. Thanks for your reply!
