The Checkbot SEO Guide covers everything you need to know to optimise the on-page SEO of your website. By following this guide, your pages will rank higher in search results, users will be more likely to click your links in search listings and visitors will get more out of your page content. We'll cover topics such as how to use HTML to structure your pages in a machine readable way, best practices for writing human readable URLs and guidelines on how to configure your site to be search crawler friendly.
Search engine optimization is often about making small modifications to parts of your website. When viewed individually, these changes might seem like incremental improvements, but when combined with other optimizations, they could have a noticeable impact on your site’s user experience and performance in organic search results.
Every page on your site should be given a concise, informative and unique title to improve your search rank and search result click rates.
Titles are critical to giving users a quick insight into the content of a result and why it’s relevant to their query.
Set page titles
Every page should be given a title that describes its content. Well-written titles are vital for improving the search rank of pages because search engines look for keywords in titles to determine how relevant pages are to search queries. Titles are also critical for improving click through rates as titles are displayed in search results and when pages are shared on social networks. Page titles are set using HTML by adding a
<title> tag such as
<title>Page title</title> inside the
<head> tag of each page.
Use optimal length titles
Every page should have a title that isn’t too long or too short. Well-written page titles of a suitable length will help your pages stand out in search results and help search engines understand what your pages are about. Short titles are likely lacking in enough information for both users and search engines. Long titles aren’t displayed in full in search results which can be unhelpful to users. We recommend page titles are between 10 and 60 characters.
Use unique titles
Each page should have a title that isn’t used anywhere else on your site. As titles are prominently displayed in search results, showing results with the same title makes it difficult for users to decide which result is the most relevant to them. Duplicate titles also make it difficult for search engines to determine which page is most relevant to a search query. As each indexable page on your site should contain unique content, you should be able to eliminate duplicate titles by giving each page a more accurate and specific title.
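The three title guidelines above (set a title, keep it between 10 and 60 characters, keep it unique) can be checked with a short script. The following is a minimal sketch using only the Python standard library; the names `TitleParser`, `extract_title` and `title_problems` are illustrative, and the input is assumed to be a dictionary that maps each page URL to its HTML source.

```python
from collections import Counter
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_title(html):
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip()


def title_problems(pages, min_len=10, max_len=60):
    """pages maps URL -> HTML. Returns URL -> list of title problems."""
    titles = {url: extract_title(html) for url, html in pages.items()}
    counts = Counter(title for title in titles.values() if title)
    problems = {}
    for url, title in titles.items():
        found = []
        if not title:
            found.append("missing title")
        elif not min_len <= len(title) <= max_len:
            found.append("title length outside recommended range")
        if title and counts[title] > 1:
            found.append("duplicate title")
        if found:
            problems[url] = found
    return problems
```

Pages with no problems are left out of the result, so an empty dictionary means every page passed all three checks.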
Headings should be added to pages to give their content a hierarchical structure. This helps give search engines and users a better understanding of what each page contains.
Similar to writing an outline for a large paper, put some thought into what the main points and sub-points of the content on the page will be and decide where to use heading tags appropriately.
Set H1 headings
Each page should have a descriptive H1 heading to help search engines and users understand what that page contains. Headings can be added to web pages to give structure to the content in the same way headings are used in books and articles. The most important and highest ranking heading in HTML is called the H1 heading. This is followed down in level of importance by H2, H3, H4, H5 and H6 headings. The H1 heading is like the title heading of an article and similarly should give an accurate and concise description of the entire document to help guide readers. The keywords used in page headings are also treated as a ranking signal by search engines. H1 headings are added to the HTML of a page with an
<h1> tag.
Use one H1 heading per page
Try to use only one H1 heading on each page to clearly indicate to users and search engines what the topic of the page is. Multiple H1 headings make it difficult for readers to know which heading is supposed to be the one that gives a top-level description of the whole document. Using a single H1 heading leaves no room for confusion. Bing and Mozilla both recommend using only a single H1 heading per page to help convey document structure. Mozilla specifically advises ignoring HTML5’s proposed “outline algorithm”, which allows for multiple H1 headings per page, because the outline algorithm hasn’t been widely adopted by browsers or screen readers. We note however that Google only goes as far as recommending you use headings in a way that appropriately describes the hierarchical structure of your content. Fix the potential confusion caused by having multiple
<h1> tags on a page by first choosing one heading to be your main one. You should then reorganise your other headings using
<h2> to <h6> subheadings to give an accurate headings hierarchy to your document.
Use optimal length H1 headings
Each page should have an H1 heading that isn’t excessively long to make it easier for users and search engines to understand the topic of the page. Google specifically recommends avoiding overly long headings so readers can scan your content more easily. Excessively long headings can also be an indication that paragraph text has been unhelpfully tagged as a heading. We recommend making H1 headings no longer than 70 characters.
Use unique H1 headings
Each page should have an H1 heading that is unique across all pages on the site to avoid duplicate content issues. Each indexable page on a site should have unique content and each indexable page should have an H1 heading that accurately describes the topic of that page. This means each indexable page should have an H1 heading that is unique to that page. Duplicate H1 headings can indicate duplicate content issues and, as H1 headings can influence search rankings, duplicate headings are a lost opportunity to signal the topic of your pages to search engines. If the same H1 heading is being shared between pages, you can usually resolve this issue by changing each heading to more accurately describe the page the heading is attached to.
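As a rough illustration of the per-page H1 guidelines above, the sketch below counts the H1 headings in a page and checks them against the 70 character recommendation. `H1Parser` and `h1_problems` are hypothetical names, and only the Python standard library is used. Uniqueness across pages is not covered here, as it needs the headings of every page on the site.

```python
from html.parser import HTMLParser


class H1Parser(HTMLParser):
    """Collects the text of every <h1> element, including nested inline tags."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.in_h1 = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_h1 = False

    def handle_data(self, data):
        if self.in_h1:
            self.headings[-1] += data


def h1_problems(html, max_len=70):
    parser = H1Parser()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so line breaks in the markup do not inflate lengths.
    headings = [" ".join(heading.split()) for heading in parser.headings]
    problems = []
    if not headings:
        problems.append("no H1 heading")
    if len(headings) > 1:
        problems.append("multiple H1 headings")
    if any(len(heading) > max_len for heading in headings):
        problems.append("H1 heading too long")
    return problems
```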
Every page on your site should be given an informative, concise and unique description.
Google will sometimes use the meta description of a page in search results snippets, if we think it gives users a more accurate description than would be possible purely from the on-page content.
Set page descriptions
Every page should have a description that summarises its contents. Page descriptions are displayed in search results and when pages are shared on social media so good descriptions can help improve click-through rates. Keep in mind however that search engines will show their own automatically generated page snippet over your descriptions if they think it will be more relevant to the current search query. Further to this, Google says that page descriptions are not a ranking factor. To set a description for a page, add a description meta tag such as
<meta name="description" content="Page description."> to the
<head> tag of the page.
Use optimal length descriptions
Page descriptions shouldn’t be too long or too short. Long page descriptions will only be partially shown in search results and short descriptions are unlikely to be helpful to users. We recommend page descriptions are between 100 and 320 characters.
Use unique descriptions
Every page should have a description that isn’t used anywhere else on the site. Similar to page titles, it’s unhelpful for users to see duplicate page descriptions in search results and when many pages share the same description that description is less likely to be shown. Google says it’s better to give no description for a page than to have many inaccurate and duplicate descriptions but you should make sure your important pages have well-written unique descriptions.
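The description guidelines can be checked in a similar way. This is a minimal sketch, assuming the page HTML is available as a string. `MetaDescriptionParser` and `description_problems` are illustrative names, and the 100 to 320 character range is the recommendation given above.

```python
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Collects the content of every <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.descriptions = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "description":
            self.descriptions.append((attributes.get("content") or "").strip())


def description_problems(html, min_len=100, max_len=320):
    parser = MetaDescriptionParser()
    parser.feed(html)
    parser.close()
    # Treat an empty content attribute the same as a missing tag.
    descriptions = [text for text in parser.descriptions if text]
    if not descriptions:
        return ["missing description"]
    problems = []
    if len(descriptions[0]) < min_len:
        problems.append("description too short")
    if len(descriptions[0]) > max_len:
        problems.append("description too long")
    return problems
```

To detect duplicate descriptions, the collected text can be counted across pages in the same way as any other per-page value.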
Duplicate page content should be avoided as you will get less control over how your search results are displayed and how backlinks are consolidated.
If you have a single page accessible by multiple URLs … Google sees these as duplicate versions of the same page. Google will choose one URL as the canonical version and crawl that, and all other URLs will be considered duplicate URLs and crawled less often.
Set canonical URLs
All pages should specify a valid canonical URL to get more control over how duplicate URLs are treated by search engines. When a set of URLs on your site return duplicate or near duplicate content, search engines will select a single definitive URL for that content called the canonical URL. This URL will be crawled more often, will take priority in search results over URLs with duplicate content and search rank boosting backlinks to the URLs with duplicate content will be viewed as linking to the canonical URL. Note that “self-canonicalizing” a page by setting its canonical URL to itself is both valid and useful as it can help eliminate potential duplicates such as when pages may be linked to with tracking URL parameters. To suggest the canonical URL for a page you can 1) add a
<link rel="canonical" href="…"> tag inside the page’s
<head> tag (most common) or 2) add a
Link: <…>; rel="canonical" header to the page’s response headers. Google suggests giving absolute canonical URLs over relative ones. Search engines are likely to ignore your canonical URL suggestion if you 1) include multiple canonical URL suggestions per page or 2) suggest a URL that is broken, redirects, isn’t indexable or isn’t itself canonical. Keep in mind that for exact duplicates you should consider whether it is more appropriate to use 301 redirects rather than canonical URLs to consolidate duplicates.
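Some of the canonical URL pitfalls described above can be detected from the page HTML alone. The sketch below looks for canonical link tags and flags a missing tag, multiple tags and relative URLs. `CanonicalParser` and `canonical_problems` are hypothetical names. It does not inspect response headers, and it does not check whether the canonical URL is broken, redirects or is indexable, as that would require fetching the URL.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalParser(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.hrefs.append(attributes.get("href") or "")


def canonical_problems(html):
    parser = CanonicalParser()
    parser.feed(html)
    parser.close()
    if not parser.hrefs:
        return ["no canonical URL"]
    problems = []
    if len(parser.hrefs) > 1:
        problems.append("multiple canonical URLs")
    for href in parser.hrefs:
        parts = urlsplit(href)
        # An absolute URL has both a scheme and a host.
        if not (parts.scheme and parts.netloc):
            problems.append("canonical URL is not absolute")
            break
    return problems
```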
Avoid duplicate page content
Every page should provide unique content that doesn’t appear elsewhere on the site. Search engines are likely to choose not to display pages in search results that are too similar as showing duplicate entries in search results is unhelpful to users. Duplicate pages can also reduce the search rank benefit of backlinks because it’s better to have backlinks to a single URL compared to backlinks spread over a set of duplicate page URLs. Crawling duplicates will also use up the resources search crawlers allocate to crawling your site which means important pages might not be indexed. You can eliminate sets of duplicate pages by consolidating them to a single URL using redirects or canonical tags.
Pages should contain substantial, unique and high-quality content that works well on mobile devices and has accessibility in mind.
Thin content with little or no added value - If you see this message … it means that Google has detected low-quality pages or shallow pages on your site.
Avoid thin content pages
Prefer information-rich pages over pages that lack content. Search engines will penalise pages they think don’t provide enough value as visitors prefer informative and high quality results. Good content will also naturally improve search rankings by attracting more backlinks and social shares. We recommend a minimum of 300 words per page as a rough guideline for identifying pages that are lacking in content.
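The 300 word guideline can be approximated by counting the words in a page's text while skipping script and style content. The following is a rough sketch with hypothetical names (`TextParser`, `word_count`, `is_thin`). It counts all remaining text, including the title and navigation, so treat the result as an estimate.

```python
from html.parser import HTMLParser


class TextParser(HTMLParser):
    """Collects text content, ignoring anything inside <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.chunks.append(data)


def word_count(html):
    parser = TextParser()
    parser.feed(html)
    parser.close()
    return len(" ".join(parser.chunks).split())


def is_thin(html, min_words=300):
    return word_count(html) < min_words
```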
Set image ALT text
Every image included on a page using
<img> tags should be given an accurate description using
alt attributes. Well-written
alt text can improve search rankings because search engines will check these for relevant keywords. Providing
alt text is also important for accessibility because assistive technologies like screen readers rely on this text as an alternative to displaying images. It’s particularly important for links that only contain an image to have descriptive
alt text so screen readers and search engines are able to understand what is being linked to. You can set
alt text by adding an
alt attribute to each image tag. For example
<img src="example.png" alt="Description">. When an image is purely decorative, you should set the
alt attribute to empty (
alt="") so assistive technologies know to ignore that image.
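To find images that lack ALT text, a script can list every image tag that has no alt attribute at all. In this sketch an empty `alt=""` is accepted, since that is the recommended markup for decorative images. `ImageAltParser` and `images_missing_alt` are illustrative names.

```python
from html.parser import HTMLParser


class ImageAltParser(HTMLParser):
    """Records the src of every <img> tag that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        if "alt" not in attributes:
            self.missing.append(attributes.get("src") or "")


def images_missing_alt(html):
    parser = ImageAltParser()
    parser.feed(html)
    parser.close()
    return parser.missing
```

For example, `images_missing_alt('<img src="a.png" alt="Logo"><img src="b.png">')` reports only `b.png`.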
Set mobile scaling
Set mobile page scaling properties on each page so your pages are mobile-friendly. Mobile browsers will by default try to show pages at desktop screen widths which will be hard to read and require manual zooming by the viewer. You should instead indicate to mobile browsers using the
viewport meta tag that a page should adjust its content to match the width of the device. This also signals to search engines that your page is mobile friendly which will boost search rankings on mobiles. A reasonable default tag to use is
<meta name="viewport" content="width=device-width, initial-scale=1"> which sets the page width to the device screen width with the initial zoom level set to 100%.
Avoid the use of browser plugins to display content and prefer cross-browser solutions instead. Browser plugins such as Java, Flash, ActiveX and Silverlight can be used to add dynamic content to pages using tags such as
<applet>. However, plugin usage should be avoided because users who don’t have the required plugins installed won’t be able to view all of your content and some plugins aren’t even available on all platforms. Similarly, search engines may not be equipped to index content that requires plugins. Try to replace plugin usage with solutions that work on most browsers by default and are well supported on mobiles. For example, using Flash (which isn’t available on mobiles) for playing videos should be avoided in favour of the HTML
<video> tag as this tag is widely supported on all platforms and can be understood by search engines.
Each page should have a well-written URL that is short, accurate and friendly for humans to read.
A site’s URL structure should be as simple as possible. Consider organizing your content so that URLs are constructed logically and in a manner that is most intelligible to humans …
Use short URLs
Prefer short but accurate URLs for your pages. Short URLs are more appealing to users in search results, are easier to remember and are simpler to type without making mistakes. Try to keep page URLs short while still making sure they accurately describe the content of each page. For instance, the URL
example.com/how-to-cook-a-whole-roast-chicken could perhaps be better written as
example.com/roast-chicken-recipe. We recommend keeping URLs under 100 characters.
Avoid underscores in URLs
Words in URLs should be separated by hyphens and not underscores. Google recommends this approach for making URLs more human friendly. In particular, words joined with underscores may be viewed as a single word during searches which is rarely what you want. For example, the URL
example.com/pc_laptop_reviews would be better written as
example.com/pc-laptop-reviews by using hyphens.
Avoid URL extensions
Avoid adding unnecessary extensions at the end of URLs. Common extensions that appear at the end of URLs include
.asp. Extensions are usually linked to what backend technology is being used to serve the page. This is rarely relevant to users, could change in the future and makes URLs longer. When displayed in search results, URL extensions are usually meaningless and distracting to users so more human-readable URLs without extensions should be preferred. For instance, a URL with an extension such as
example.com/gallery.html could be rewritten without the .html extension.
Avoid URL parameters
Prefer simpler URLs by avoiding the use of URL parameters where possible. For example, in the URL
example.com/forum?topic=tv-shows the URL parameter
topic=tv-shows can make the URL look unfriendly and complex in search results. Try to eliminate URL parameters where possible, such as by using subfolders instead of URL parameters. For instance, the previous example could be rewritten so the topic appears as a subfolder rather than as a parameter.
Avoid symbols in URLs
Avoid the use of symbols in URLs and prefer more human readable alternatives. Symbols such as
* appearing in URLs can make your search listings look less appealing. URLs generally only need to include letters, digits, slashes and hyphens. For example, the word separators in
example.com/john+smith%20interview could be replaced by hyphens to create the URL
example.com/john-smith-interview which is easier to read. In particular, watch out for
%20 making its way into URLs as this is usually done automatically by backend systems as a valid way to encode a space character.
Use lowercase URLs
Use only lowercase letters in URLs. Some search engines and web servers treat URLs as case sensitive so mixed case URLs can cause issues. URLs that differ only by case which display the same page can create duplicate page issues which can impact your search ranking. Mixed case URLs are also more difficult to manually type in correctly and some servers will fail to serve a page if the casing is wrong. Words that are normally written with uppercase letters should be written in lowercase and you should use hyphens over trying to separate words with capitals. For example,
example.com/Flights/GermanyToUK would be better written in lowercase with hyphens separating the words.
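The hyphen, symbol and lowercase guidelines can be combined into a helper that suggests a cleaner path for an existing URL. This is one possible sketch and not a definitive normalisation scheme. `suggest_path` is a hypothetical name, dots are left untouched so file extensions are out of scope, and any URL parameters should be removed before calling it.

```python
import re
from urllib.parse import unquote


def suggest_path(path):
    """Suggest a lowercase, hyphen-separated version of a URL path."""
    segments = []
    # Decode percent-escapes such as %20 first so they are treated as separators.
    for segment in unquote(path).split("/"):
        # Insert a hyphen where a lowercase letter or digit is followed by a capital.
        segment = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", segment)
        # Replace runs of anything other than letters, digits and dots with one hyphen.
        segment = re.sub(r"[^A-Za-z0-9.]+", "-", segment).strip("-")
        segments.append(segment.lower())
    return "/".join(segments)


print(suggest_path("/Flights/GermanyToUK"))      # /flights/germany-to-uk
print(suggest_path("/pc_laptop_reviews"))        # /pc-laptop-reviews
print(suggest_path("/john+smith%20interview"))   # /john-smith-interview
```

When changing the URLs of pages that are already live, the old URLs should redirect to the new ones so existing links keep working.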
Avoid deeply nested URLs
Prefer simple URL structures that minimise the number of subfolders used. Deeply nested URLs such as
example.com/community/forum/subforum/food/ look long and complex and are hard to read. Try to stick to simpler and more shallow directory structures that help the user understand where they are on the site by grouping related pages into the same folder. For instance, the previous URL could perhaps be rewritten as
example.com/forum/food/ to reduce the number of subfolders from four to two. We recommend that URLs don’t exceed five subfolders.
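The URL guidelines in this section can be gathered into a single check. The sketch below is an illustration under a few assumptions: `url_problems` is a hypothetical name, the limits default to the recommendations above (100 characters and five subfolders), any character other than letters, digits, slashes, hyphens, dots and underscores counts as a symbol, and a dot followed by letters or digits at the end of the path counts as an extension.

```python
import re
from urllib.parse import urlsplit


def url_problems(url, max_length=100, max_depth=5):
    """Return a list of guideline violations for a single URL."""
    # Allow scheme-less URLs such as "example.com/page" by marking the host explicitly.
    parts = urlsplit(url if "://" in url else "//" + url)
    path = parts.path
    segments = [segment for segment in path.split("/") if segment]
    problems = []
    if len(url) > max_length:
        problems.append(f"longer than {max_length} characters")
    if "_" in path:
        problems.append("contains underscores")
    if segments and re.search(r"\.[A-Za-z0-9]+$", segments[-1]):
        problems.append("has a file extension")
    if parts.query:
        problems.append("has URL parameters")
    if re.search(r"[^A-Za-z0-9/_.\-]", path):
        problems.append("contains symbols")
    if path != path.lower():
        problems.append("contains uppercase letters")
    # Every slash after the first one in the path closes a subfolder.
    if max(path.count("/") - 1, 0) > max_depth:
        problems.append(f"more than {max_depth} subfolders")
    return problems
```

A URL that follows all of the guidelines, such as `example.com/roast-chicken-recipe`, produces an empty list.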
While contemporary Web browsers do an increasingly good job of parsing even the worst HTML “tag soup” … different software on different platforms will not handle errors in a similar fashion …
Use valid HTML
Pages should be free of HTML validation errors. Invalid HTML can cause problems to users as pages may not be displayed as you intended. Browsers differ in how they treat invalid code so you should always use valid HTML to avoid browser specific issues. Likewise, search engines trying to interpret invalid HTML may misunderstand the content of a page.
Use valid CSS
CSS files should be free from syntax errors. Invalid CSS can cause pages to be incorrectly displayed which means visitors may not see your content as intended. CSS errors could also impact search rankings as, for example, Google says pages can be penalized if CSS causes page content to be hidden.
Your site should be free of broken links and configured to signal broken links to crawlers using a 404 response status code.
No matter how beautiful and useful your custom 404 page, you probably don’t want it to appear in Google search results. In order to prevent 404 pages from being indexed by Google and other search engines, make sure your webserver returns an actual 404 HTTP status code when a missing page is requested.
Use 404 code for broken URLs
When a URL is requested that doesn’t exist, return a 404 HTTP status code so search bots know the link is broken. If the URL hasn’t been indexed yet, this stops search bots indexing the page which is what you want if the page really doesn’t exist. A working page that’s been indexed already that begins returning a 404 code as opposed to say a 200 success code will eventually be removed from search results so make sure the 404 code is only returned for broken URLs. The 404 code is also important if you want to use tools that scan your site for broken links as there’s no other way for a machine to warn you that broken links exist. For users, when returning a 404 error you should make sure to display a human friendly “not found” page that helps users find what they were looking for. Test your 404 setup by 1) visiting a URL that shouldn’t exist like
/page-not-found-test and 2) verifying the URL returns a 404 status code. If your setup is broken, the solution is often highly specific to the web framework and web server combination you’re using as either or both could be misconfigured. Try searching for a tutorial on setting 404 pages for your particular setup and then investigate how your configuration differs.
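The two-step test above can be automated. The following is a minimal sketch using the standard library's `urllib`. `fetch_status` and `returns_404_for_missing_page` are illustrative names, and the `fetch` parameter is an assumption added so the check can be exercised without network access.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the final HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises HTTPError for 4xx and 5xx responses.
        return error.code


def returns_404_for_missing_page(base_url, test_path="/page-not-found-test",
                                 fetch=fetch_status):
    """Check that a URL which should not exist returns a 404 status code."""
    return fetch(base_url.rstrip("/") + test_path) == 404
```

Because `urlopen` follows redirects, a site that redirects missing pages to a working page will report a 200 status here and fail the check, which is the misconfiguration this test is meant to catch.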
Avoid broken internal links
All internal links on your website should be valid and working. Broken hyperlinks between pages can prevent search engines from finding parts of your site and stop your pages from passing on the boost in search rank that comes from page links. Users will also become frustrated if they can’t view content because of broken URLs.
Avoid broken external links
All links to external websites should be valid and working. Links to sites you don’t control should be regularly monitored and updated as links that used to work may break in the future when external pages are deleted or moved. Broken links can signal to search engines that your site is poor quality and will frustrate users.
Avoid broken page resources
Every subdomain on your site should have a robots.txt file that links to a sitemap and describes any crawler restrictions.
A robots.txt file is a file at the root of your site that indicates those parts of your site you don’t want accessed by search engine crawlers.
Use robots.txt files
Add a robots.txt file to every subdomain so you can specify sitemap locations and set web crawler rules. Robots.txt files are always located in the root folder with the name
robots.txt. Each robots.txt file only applies to URLs with the same protocol, subdomain, domain and port as the robots.txt URL. For example,
http://example.com/robots.txt would be the robots.txt URL for
http://example.com but not
http://www.example.com. Even an empty robots.txt file is useful to have for cleaning up server logs as it will reduce 404 errors from visiting bots. Keep in mind that if you use a robots.txt file to tell search bots not to visit a certain page, that page can still appear in search results if it’s linked to from another page. To hide pages from search results, use
noindex meta tags instead.
Set sitemap locations
Each robots.txt file should specify sitemap file locations. Sitemap files contain a list of page URLs that you want indexed and are read by search bots. These files can also include metadata that describes when pages were last updated and how often different pages are updated to help crawlers index your site more intelligently. A sitemap location should be specified in the robots.txt file with a line such as
Sitemap: http://example.com/sitemap.xml. A robots.txt file can include more than one sitemap reference.
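To see how a crawler might read a robots.txt file, the sketch below parses an example file with Python's standard `urllib.robotparser` module (the `site_maps()` method needs Python 3.8 or later). The file contents, the `/private/` rule and the second sitemap URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt contents; the rules and sitemap URLs are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

Sitemap: http://example.com/sitemap.xml
Sitemap: http://example.com/news-sitemap.xml
"""


def parse_robots(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = parse_robots(ROBOTS_TXT)
print(parser.site_maps())  # the two sitemap URLs listed above
print(parser.can_fetch("*", "http://example.com/private/page"))  # False
print(parser.can_fetch("*", "http://example.com/public/page"))   # True
```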
Redirects are used to signal the URL for a page has changed. These should be used carefully as redirects can influence page rank.
If you need to change the URL of a page as it’s shown in search engine results, we recommend that you use a server-side 301 redirect … The 301 status code means that a page has permanently moved to a new location.
Avoid temporary redirects
Prefer permanent redirects (status 301) over temporary redirects (usually status 302 and 307). A permanent redirect from one URL to another indicates the original URL has changed for good. This causes search engines to update their URLs while passing link equity from the old URL to the new URL. Temporary redirects indicate a page has only moved temporarily so search engines won’t update their links or pass on link equity to the new URL. When moving a page with many backlinks, it’s extra important to use permanent redirects so the search rank of the page is maintained.
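When auditing a list of redirects, temporary ones can be picked out by status code. This is a small sketch that assumes the redirects have already been collected as `(source, status, target)` tuples. `temporary_redirects` is a hypothetical name, and statuses 303 and 308 are included for completeness even though they are not mentioned above.

```python
# Status codes that signal a permanent move (308 is the method-preserving variant of 301).
PERMANENT_REDIRECTS = {301, 308}
# Status codes that signal a temporary move.
TEMPORARY_REDIRECTS = {302, 303, 307}


def temporary_redirects(redirects):
    """Return the (source, status, target) tuples that use a temporary status."""
    return [redirect for redirect in redirects if redirect[1] in TEMPORARY_REDIRECTS]


redirects = [
    ("/old-page", 301, "/new-page"),
    ("/sale", 302, "/summer-sale"),
    ("/login", 307, "/sign-in"),
]
for source, status, target in temporary_redirects(redirects):
    print(f"{source} -> {target} uses temporary status {status}")
```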
Avoid meta redirects
Avoid the use of meta tags for performing redirects and prefer server-side redirects instead. Meta tag redirects are performed using a special kind of HTML tag that instructs browsers to load a new URL. For example, the tag
<meta http-equiv="refresh" content="5;http://example.com/destination"> tells the browser to wait 5 seconds then go to the specified URL. Meta tag redirects are discouraged because some crawlers will ignore them, they break the browser “back” button and it can be confusing for users to see one page load followed quickly by another. Google and the W3C strongly recommend the use of server redirects over meta tag redirects to avoid these issues.
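Pages that still rely on meta tag redirects can be found by scanning their HTML for refresh meta tags. The sketch below uses the standard library's `html.parser`; `MetaRefreshParser` and `find_meta_refresh` are illustrative names.

```python
from html.parser import HTMLParser


class MetaRefreshParser(HTMLParser):
    """Collects the content attribute of every <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.refreshes = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("http-equiv") or "").lower() == "refresh":
            self.refreshes.append(attributes.get("content") or "")


def find_meta_refresh(html):
    parser = MetaRefreshParser()
    parser.feed(html)
    parser.close()
    return parser.refreshes
```

An empty list means no meta tag redirects were found in the page.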