How to handle escaped fragments for AJAX crawlers for better SEO?
JAVASCRIPT INFRASTRUCTURE IN MODERN WEBSITES
Unlike a traditional website, where the server returns fully built pages, a JavaScript application (AJAX and other web apps) uses a "client-side rendering" approach, which means:
The initial request loads the page layout, CSS, and JavaScript, but often little or none of the content. The JavaScript then sends further requests to the server, which responds with JSON, and the required HTML is generated in the browser.
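As a minimal sketch of this flow (the /api/articles endpoint, element IDs, and JSON fields are hypothetical), client-side rendering looks roughly like this in the browser:

// Sketch: the initial HTML ships an empty container; the content arrives as JSON
// and is turned into HTML by the browser (endpoint and field names are assumptions).
document.addEventListener('DOMContentLoaded', function () {
  fetch('/api/articles')                                   // second request, answered in JSON
    .then(function (response) { return response.json(); })
    .then(function (articles) {
      var container = document.getElementById('content');
      container.innerHTML = articles
        .map(function (a) {
          return '<article><h2>' + a.title + '</h2><p>' + a.body + '</p></article>';
        })
        .join('');
    });
});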
JavaScript frameworks let developers build web applications that run entirely within a single web page, known as SPAs (Single Page Applications). Twitter, for example, works this way on both mobile and desktop.
However, this approach clashes with SEO, because search engine bots attach their signals to individual pages and URLs.
The SSR (server-side rendering) approach addresses this: the user receives a complete page in the browser, with the HTML generated from the JavaScript on the server rather than in the browser.
On the client side, frameworks such as Backbone.js, Knockout.js, Ember.js, and Google's AngularJS compete with one another.
Several issues arise as a result of this development strategy.
*1* To explore an entire JS site, the bot downloads the HTML code and evaluates the content it finds there.
With the exception of Googlebot, which renders pages in a headless browser, most crawlers ignore JS and AJAX content.
*2* The second challenge is deciding what to index, since interactions change the content constantly.
What are the signals? How do you evaluate a page's links and content?
How do these signals relate to one another? How can the current model keep working when the notion of a page tied to a URL has disappeared?
*3* A full AngularJS site can only be rendered if JavaScript is enabled; otherwise, the site is simply invisible. Because the source code exposes only template variables, it is also hard for search engines to reach the content of these sites in order to index it.
As a result, it is no surprise to see a major impact on these sites' SEO visibility.

Can Google Crawl AJAX? AJAX SEO
AJAX is widely used by web developers who build Single Page Applications with frameworks like React and Angular. It makes it possible to create interactive, user-friendly, seamless web applications that behave much like dedicated desktop programs.
However, that user-friendliness and usefulness come at the expense of your site's SEO. For search engine crawlers, AJAX isn't that different from Flash: broken website navigation that loads every page at the same URL, cloaking issues, and back, next, and refresh buttons that no longer work.
AJAX's reputation is changing, though, as the major search engines develop solutions for webmasters and help improve user experiences. So far this has received little attention, and most AJAX websites remain unoptimized or unindexed in Google search results. If your site is one of them, this guide should help you get it indexed.
Google's engineers have worked hard to improve the crawling and indexing of AJAX pages and have come up with a number of solutions along the way, including the escaped fragment in the URL, which has helped thousands of AJAX sites gain visibility in Google; more recently, they have begun pre-rendering these pages on Google's side for a much better result.
We now know that Google executes the JavaScript that generates the title, meta description, robots directives, and other meta tags; tests confirm this.
Google also interprets CSS and JS files, for example to recognize a mobile-friendly page or to detect hidden content.
Google can follow onclick events, which means it can index content generated dynamically by JavaScript.
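For instance, the kind of JavaScript-generated metadata Google is now able to execute and pick up looks roughly like this (a sketch only; the values are placeholders, and generating all metadata in JS is not a recommendation):

// Sketch: metadata created entirely in JavaScript, which Google can execute and read.
document.title = 'Product page - rendered client-side';

var description = document.createElement('meta');
description.name = 'description';
description.content = 'Description injected by JavaScript after the page loads.';
document.head.appendChild(description);

var robots = document.createElement('meta');
robots.name = 'robots';
robots.content = 'index, follow';
document.head.appendChild(robots);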
However, it would be a mistake to believe that Google can read everything. The engine does not fire the user events that trigger AJAX requests, so content that only loads after an interaction may never be fetched.
In 2009, Google introduced the hashbang (#!) technique to better crawl and index AJAX. When the robot encounters a URL containing a hashbang, it crawls the page by replacing the hashbang with the ?_escaped_fragment_= parameter, then indexes the page under its original URL. However, this scheme is now deprecated, since Google says it can understand AJAX-generated content on its own.
The HTML5 History API pushState technique offers a better long-term option, because the URLs remain real, stable URLs during dynamic navigation. It also has the advantage of keeping navigation possible even when the user has disabled JavaScript, since the server can respond to those URLs directly.
pushState() changes the URL path shown in the address bar of the user's browser. It is ideal for SEO, since the engines can crawl these URLs and tell them apart from one another.
Note, too, that Google recommends using pushState to make infinite scroll SEO-friendly.
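As a minimal sketch of the idea (the paths, element IDs, and function names here are illustrative), each time content is loaded dynamically, the URL in the address bar is updated to a real, crawlable path:

// Sketch: update the address bar with history.pushState() when content loads via AJAX,
// so each state of the page has its own crawlable URL (paths and IDs are assumptions).
function renderPage(path) {
  return fetch(path)
    .then(function (response) { return response.text(); })
    .then(function (html) {
      document.getElementById('content').innerHTML = html;
    });
}

function navigateTo(path) {
  renderPage(path).then(function () {
    history.pushState({ path: path }, '', path);   // URL changes without a full reload
  });
}

// Keep the back and forward buttons working.
window.addEventListener('popstate', function (event) {
  if (event.state && event.state.path) {
    renderPage(event.state.path);
  }
});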

Prerendering the Escaped Fragment
prerender.io's middleware, installed on your server, checks every incoming request. If the middleware detects that the request comes from a crawler, it fetches the page's static HTML from the prerender service. Otherwise, the request is passed on to the normal server routes.
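The logic can be sketched roughly as follows; this is an Express-style illustration of the idea, not the actual prerender.io package, and the bot pattern, service URL, and domain are all placeholder assumptions:

// Sketch of prerender-style middleware: crawlers get a pre-rendered static HTML copy,
// everyone else falls through to the regular routes (all names below are assumptions).
const express = require('express');
const https = require('https');

const BOT_PATTERN = /googlebot|bingbot|yandexbot|baiduspider|duckduckbot/i;
const RENDER_SERVICE = 'https://prerender.example.com/render?url=';   // placeholder service

const app = express();

app.use(function (req, res, next) {
  const userAgent = req.headers['user-agent'] || '';
  if (!BOT_PATTERN.test(userAgent)) {
    return next();                              // normal visitors use the regular routes
  }
  // Crawler detected: fetch the pre-rendered snapshot and return it instead.
  const target = RENDER_SERVICE + encodeURIComponent('https://www.example.com' + req.originalUrl);
  https.get(target, function (snapshot) {
    let html = '';
    snapshot.on('data', function (chunk) { html += chunk; });
    snapshot.on('end', function () { res.send(html); });
  }).on('error', next);
});

app.get('*', function (req, res) {
  res.sendFile(__dirname + '/index.html');      // the client-side-rendered shell
});

app.listen(3000);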
When building websites with AJAX, SEOs frequently run into a problem: these sites offer a great user experience and load content into the page much faster, but Google does not index them well, so the site's SEO suffers. Thankfully, Google proposed a scheme that lets webmasters have the best of both worlds. Sites that implement it serve two versions of their content:
A dynamic, 'AJAX-style' version for users with JavaScript enabled
Traditional static URLs for search engines
The 'hashbang' protocol mandates the use of a hash plus an exclamation mark, #!. When you include the hashbang in a page URL, Google recognizes that you are following the protocol and treats the URL differently: it takes everything after the hashbang and passes it to the website as a URL parameter, which it calls the escaped fragment. The URL is then rewritten, and Google requests the content from the static page.
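As a simplified illustration of that mapping (in practice the crawler also percent-escapes special characters in the fragment), the rewrite looks like this:

// Sketch of the (now deprecated) AJAX crawling scheme mapping: everything after #!
// is handed to the server in the _escaped_fragment_ query parameter.
function toCrawlerUrl(prettyUrl) {
  var parts = prettyUrl.split('#!');
  if (parts.length < 2) {
    return prettyUrl;                                  // no hashbang, nothing to rewrite
  }
  var separator = parts[0].indexOf('?') === -1 ? '?' : '&';
  return parts[0] + separator + '_escaped_fragment_=' + parts[1];
}

// 'https://www.example.com/index.php#!site=about'
// becomes
// 'https://www.example.com/index.php?_escaped_fragment_=site=about'
console.log(toCrawlerUrl('https://www.example.com/index.php#!site=about'));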
For the solution to work, there are two requirements. The first is that the site opts in to the AJAX crawling scheme, which allows the crawler to request the ugly URLs; this is done by adding a trigger, the <meta name="fragment" content="!"> tag, to the page head. If a page's URL contains no hashbang but the page does include this directive in its head, the crawler requests it with an empty ?_escaped_fragment_= parameter. The second requirement is that the server returns a static HTML snapshot when it receives an ugly URL.
To conclude, there are several ways to get your AJAX website properly ranked in Google. It starts with telling Google that your pages are rendered with JavaScript; Google then requests your URLs with the escaped fragment parameter, and finally you deliver static HTML to the crawlers.
It may seem difficult, but there are tools that make it easier, and Google itself provides documentation and tools to help you.

JAVASCRIPT FRAMEWORKS AND SITE POSITIONING
It is feasible to crawl and index a site built with a JavaScript framework, but what interests us is how to rank our pages.
First, if the Fetch as Google test does not produce the expected result, there is a problem.
The answer may be found in the headless browser. A "headless" browser is a command-line software environment with a JavaScript API that can render a complete HTML page by executing HTML, CSS, and JS just as a browser does. We call it headless because it has no graphical user interface.
By simulating everything that happens in the browser, we can test the applications produced by JS and AJAX frameworks.
Note that this kind of crawl generates "false visits" in web analytics tools, because every script is executed without exception.
PhantomJS and CasperJS are both headless browsing tools; the latter is built on top of PhantomJS and adds scraping and navigation helpers.
Screaming Frog in "JavaScript crawl" mode can also be used for audits.
Screaming Frog is an SEO crawler that recently added a PhantomJS-based "headless browser" mode. As a result, it can crawl an entire Angular site.
In the spider configuration, make sure JS, CSS, and images are checked.
Botify has also offered this option since January 2017.
Solution 1: Code differently, using progressive enhancement.
The progressive enhancement approach means building your site in three layers (a sketch of the script layer follows this list):
The HTML contains the material (content and functionality accessible to everyone).
External cascading style sheets (CSS) control the presentation.
Enhanced behavior lives in a separate script layer, i.e. JavaScript.
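A minimal sketch of that script layer (the link markup, element IDs, and class names are assumptions): the markup layer is an ordinary link such as <a href="/contact" class="navlink">Contact</a> that works on its own, and JavaScript only enhances it.

// Sketch: the behaviour layer enhances plain links. With JS enabled, content is loaded
// via AJAX and the URL is updated; without JS, the normal href still works for users and crawlers.
document.addEventListener('click', function (event) {
  var link = event.target.closest('a.navlink');
  if (!link) {
    return;
  }
  event.preventDefault();                                // enhancement only: the href remains usable
  var path = link.getAttribute('href');
  fetch(path)
    .then(function (response) { return response.text(); })
    .then(function (html) {
      document.getElementById('content').innerHTML = html;
      history.pushState(null, '', path);
    });
});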
Solution 2: Create your own HTML snapshots. This is the solution Google recommends for AJAX.
In short, the JS code is executed server-side in a headless browser to produce the HTML that the JS would otherwise generate in the user's browser.
This rendered HTML is "captured" before being delivered, just like a regular HTML page.
This strategy, however, loses some of the appeal of “client side rendering,” as the page becomes static.
A common compromise is to serve these snapshots only to search engine bots.
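A minimal sketch of such a snapshot script, using PhantomJS (the URL, the fixed delay, and the readiness handling are simplifying assumptions; a real setup would wait for the app to signal that rendering is done):

// snapshot.js - PhantomJS sketch: load a JS-rendered page and print the final HTML.
// Run with:  phantomjs snapshot.js http://www.example.com/#!site=about
var page = require('webpage').create();
var system = require('system');
var url = system.args[1];

page.open(url, function (status) {
  if (status !== 'success') {
    console.log('Failed to load ' + url);
    phantom.exit(1);
  } else {
    // Give the client-side framework a moment to render before capturing the DOM.
    setTimeout(function () {
      console.log(page.content);   // the fully rendered HTML, ready to serve to crawlers
      phantom.exit();
    }, 2000);
  }
});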
Solution 3: Use a third-party prerendering server. Page pre-rendering services make crawling our site easier: the prerendering server renders your pages as static HTML in advance, and as soon as it recognizes that a crawler is visiting your site, it serves that crawler the final static HTML rendering.
Prerender, SEO4Ajax (cocorico!), and Brombone are the most popular third-party prerendering solutions.
Keep in mind that this approach might appear to be cloaking even if it isn't.
Note also that prerendering servers make AJAX crawlable via the deprecated approaches. Be aware of the following:
Either the hashbang and escaped fragment technique (as seen previously);
Or the escaped fragment technique triggered with a meta tag like this: <meta name="fragment" content="!">.
Be aware that the HTML5 approach based on the pushState() function is not deprecated, unlike the others; for the time being, it is the strategy that works across all search engines.

What would you learn from this topic?
I'm struggling to make an AJAX-based website SEO-friendly. As recommended in tutorials on the web, I've added "pretty" href attributes to links:

<a href="#!site=contact" data-id="contact" class="navlink">контакт</a>

and, in a div where content is loaded with AJAX by default, a PHP script for crawlers:

// Build a whitelist of available page names from ./pages/*.php
$files = glob('./pages/*.php');
foreach ($files as &$file) {
    // strip the leading './pages/' and the trailing '.php'
    $file = substr($file, 8, -4);
}
if (isset($_GET['site'])) {
    if (in_array($_GET['site'], $files)) {
        include("./pages/".$_GET['site'].".php");
    }
}
I have a feeling that at the beginning I additionally need to cut the _escaped_fragment_= part from (...)/index.php?_escaped_fragment_=site=about, because otherwise the script won't be able to GET the site value from the URL. Am I right?
But anyway, how do I know that the crawler transforms pretty links (those with #!) into ugly links (containing ?_escaped_fragment_=)? I've been told it happens automatically and that I don't need to provide this mapping, but Fetch as Googlebot doesn't give me any information about what happens to the URL.
Googlebot will automatically query the ?_escaped_fragment_= URLs.
So from www.example.com/index.php#!site=about, Googlebot will query: www.example.com/index.php?_escaped_fragment_=site=about
On the PHP side you will get it as $_GET['_escaped_fragment_'] = "site=about".
If you want to get the value of "site", you need to do something like this:

// Googlebot requests e.g. index.php?_escaped_fragment_=site=about
if (isset($_GET['_escaped_fragment_'])) {
    // explode() turns "site=about" into array("site", "about")
    $escaped = explode("=", $_GET['_escaped_fragment_']);
    if (isset($escaped[1]) && in_array($escaped[1], $files)) {
        include("./pages/".$escaped[1].".php");
    }
}