XML and HTML Sitemaps

The difference between an XML and an HTML sitemap is that the XML sitemap is built for search engines and submitted to Google Search Console, whereas the HTML sitemap is built with human visitors in mind and is meant to help them navigate the content.

Here is my HTML sitemap.


What does an HTML sitemap do?

  • Organizes the website’s content into categories.
  • Improves the site’s navigation.
  • Makes large websites easier to browse.
  • Offers a way for visitors to stay on the website when they hit a 404 (Not Found) page. Read below.
  • Acts as an internal link hub, distributing link juice to the listed pages.

Common Mistakes

Blocking the sitemap from being indexed by search robots with noindex + nofollow.
Use this tag in the sitemap page: <meta name="robots" content="index,follow">, and do not block the page in robots.txt.
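If you want to double-check a page for this mistake, the following is a minimal Python sketch that looks for blocking directives in the robots meta tag. The class and function names (RobotsMetaParser, is_blocked) are made up for illustration, and it only inspects the HTML you pass in; it does not look at robots.txt or HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, nofollow".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def is_blocked(html: str) -> bool:
    """True if the page asks robots not to index it or not to follow its links."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return bool(parser.directives & {"noindex", "nofollow", "none"})


print(is_blocked('<meta name="robots" content="noindex, nofollow">'))
print(is_blocked('<meta name="robots" content="index,follow">'))
```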
 
Speaking of robots.txt, here is how you can include your sitemaps. At the end of the file, start a new line and paste something like this (replace domain.com with your website). post-sitemap.xml and page-sitemap.xml are URLs that SEO plugins commonly create; verify manually that they exist before including them, or else you will get errors in the Search Console. category-sitemap.xml is optional and also created by the SEO plugin; you may not have it or need it, as categories don’t need to get indexed.
Sitemap: https://domain.com/sitemap.xml
Sitemap: https://domain.com/post-sitemap.xml
Sitemap: https://domain.com/page-sitemap.xml
Sitemap: https://domain.com/category-sitemap.xml
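To see which sitemaps a robots.txt file actually declares, here is a small Python sketch using the standard library's urllib.robotparser (its site_maps() method needs Python 3.8 or newer). The ROBOTS_TXT sample and the declared_sitemaps helper are assumptions for illustration; it only reads the Sitemap lines and does not check that those URLs exist.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt content; domain.com is a placeholder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/

Sitemap: https://domain.com/sitemap.xml
Sitemap: https://domain.com/post-sitemap.xml
"""


def declared_sitemaps(robots_txt: str) -> list[str]:
    """Return the URLs listed on Sitemap: lines, in file order."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


for url in declared_sitemaps(ROBOTS_TXT):
    print(url)
```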
 
Don’t clutter your page with thousands of links.

Build the sitemap to be helpful for the users, and group it into sections or topics. If there is a need to list many links, it is better to create a multi-level map that will open the sections in stages e.g. via drop-down widgets, or leave enough space between the sections to de-clutter it.
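As a rough illustration of the grouping idea, this Python sketch sorts URLs into sections by their first path segment and renders each section as a collapsible block. The function names, the "general" catch-all section, and the choice of <details> elements for the drop-downs are my assumptions, not the only way to do it.

```python
from collections import defaultdict
from html import escape
from urllib.parse import urlparse


def group_by_section(urls):
    """Group URLs by first path segment; top-level pages go to "general"."""
    sections = defaultdict(list)
    for url in urls:
        parts = [p for p in urlparse(url).path.split("/") if p]
        section = parts[0] if len(parts) > 1 else "general"
        sections[section].append(url)
    return dict(sections)


def render_html(sections):
    """Render one collapsible <details> block per section."""
    lines = ["<ul>"]
    for name in sorted(sections):
        lines.append(f"  <li><details><summary>{escape(name)}</summary>")
        lines.append("    <ul>")
        for url in sections[name]:
            link = escape(url)
            lines.append(f'      <li><a href="{link}">{link}</a></li>')
        lines.append("    </ul>")
        lines.append("  </details></li>")
    lines.append("</ul>")
    return "\n".join(lines)


pages = [
    "https://domain.com/blog/first-post",
    "https://domain.com/blog/second-post",
    "https://domain.com/about",
]
print(render_html(group_by_section(pages)))
```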

 
Make sure to only list links to pages that are available and open for indexing.
No 404 pages, no noindex pages.
Regularly update the sitemap.
Clean up deleted pages and add new ones.

HTML Sitemap Generators

XML-Sitemaps offers a free HTML generator for up to 500 pages. Enter the website address and, when the generator has produced results, go to the bottom of the screen and download the HTML sitemap or all sitemaps (HTML + XML in a zip file). Then upload the file to the root folder of your website.


Note that the Rank Math plugin creates HTML sitemaps too.


XML Sitemap Generators

XML-Sitemaps also offers a free XML generator for up to 500 pages. Enter the website address and, when the generator has produced results, go to the bottom of the screen and download the XML sitemap. Upload it to the root folder of your website and submit it in the Sitemaps section of the Search Console.
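If you already have a list of URLs and would rather build the file yourself, here is a minimal Python sketch of what a generator produces. It writes only the required <loc> element for each page; the build_sitemap name and the sample URLs are placeholders, and a real site would pull the URL list from its CMS or a crawl.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return the sitemap as UTF-8 encoded bytes, XML declaration included."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & when serializing.
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


xml_bytes = build_sitemap([
    "https://domain.com/",
    "https://domain.com/about",
])
print(xml_bytes.decode("utf-8"))
```

You would save the bytes as sitemap.xml before uploading.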

Do you prefer to let AI create your sitemap? I have to warn you it can be messy. Go to your favorite AI tool and give it this prompt:

Develop a complete website sitemap for our website [URL] to improve search engine crawling

Don’t forget to replace [URL] with your website address. On most AI tools, you will need to do some manual work. Yes, I know it sucks, but those bots are still infants and they can’t crawl.


Sitemap Validation

offers a very useful tool to validate your uploaded sitemaps so they don’t get rejected when submitted to the engines.
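For a quick local sanity check before uploading, a few basic rules can be tested in Python. This is only a sketch under my own assumptions: the validate_sitemap name is made up, it covers well-formedness, the root element, absolute <loc> URLs, and the protocol's 50,000-URL limit per file, and it treats a sitemap index file (root <sitemapindex>) as an unexpected root. It is not a substitute for a full validator.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
MAX_URLS = 50_000  # per-file limit in the sitemaps.org protocol


def validate_sitemap(xml_text):
    """Return a list of problems found; an empty list means none were found."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]
    if root.tag != f"{NS}urlset":
        return [f"unexpected root element: {root.tag}"]

    problems = []
    entries = root.findall(f"{NS}url")
    if len(entries) > MAX_URLS:
        problems.append(f"too many URLs: {len(entries)} (limit {MAX_URLS})")
    for i, entry in enumerate(entries, start=1):
        loc = (entry.findtext(f"{NS}loc") or "").strip()
        parsed = urlparse(loc)
        # Every entry needs an absolute http(s) URL.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"entry {i}: missing or relative <loc>: {loc!r}")
    return problems


sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://domain.com/</loc></url>"
    "<url><loc>/about</loc></url>"
    "</urlset>"
)
for problem in validate_sitemap(sample):
    print(problem)
```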



Screaming Frog Sitemap Audit

Open your Screaming Frog software and from the top menu select Mode / List. Then hit Upload and choose to download an XML sitemap. Enter the sitemap address and wait for SF to crawl the URLs listed in it.

To find the sitemap address, you can check the default address, e.g. domain.com/sitemap.xml. If the sitemap is produced by a plugin (very common), try /sitemap_index.xml for the XML Sitemaps, Yoast SEO, and Rank Math plugins.

Other common sitemap addresses are /post-sitemap.xml for posts and /page-sitemap.xml for pages. That’s all you will need.

When you have crawled the sitemap file in Screaming Frog, you will want to order the pages/links by status (click the Status Code column header once). A 200 response code means all is fine. A 301 means the resource has moved (it is redirected). If any 404s show up, that page is broken and you need to fix it (correct the link listed in the file).
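The same triage can be sketched in Python if you export the crawl results as URL and status code pairs. The triage function, the bucket names, and the sample data below are hypothetical; only the three status codes discussed above get their own bucket, and everything else lands in "other" for manual review.

```python
def triage(results):
    """Sort (url, status_code) pairs into buckets, ordered by status code."""
    buckets = {"ok": [], "redirected": [], "broken": [], "other": []}
    for url, status in sorted(results, key=lambda pair: pair[1]):
        if status == 200:
            buckets["ok"].append(url)
        elif status == 301:
            buckets["redirected"].append(url)
        elif status == 404:
            buckets["broken"].append(url)
        else:
            buckets["other"].append(url)
    return buckets


# Made-up crawl results for illustration.
crawl = [
    ("https://domain.com/gone", 404),
    ("https://domain.com/", 200),
    ("https://domain.com/old-page", 301),
]
for bucket, urls in triage(crawl).items():
    print(bucket, urls)
```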


Sitemap Plan

  • 1. Analysis
  • 1.1 Review current website structure
  • 1.2 Identify all existing web pages
  • 1.3 Evaluate current sitemap if present
  • 2. Develop New Sitemap
  • 2.1 Organize webpages in a hierarchical structure
  • 2.2 Ensure all URLs are correct and functioning
  • 2.3 Include only canonical URLs
  • 2.4 Prioritize important and high-traffic pages
  • 3. Validate Sitemap 
  • 3.1 Check sitemap for any errors or issues
  • 3.2 Validate sitemap with a sitemap validator tool
  • 3.3 Make necessary corrections or adjustments
