UPDATE: You should now use the bridgetown-sitemap plugin instead.
While playing with Bridgetown (which I use for this site), I needed to add a sitemap.xml file for search engines.
As Bridgetown did not have an official solution at the time, Konnor recommended a solution by _wout_ (thank you!). The solution described in this post is heavily based on it.
I don’t believe we currently have an official solution. @_wout_ came up with this gist.
— Konnor Rogers, March 22, 2021
The original tweet, _wout_’s X (Twitter) account, and the referenced gist are no longer accessible.
Preparing data for sitemap
I wanted to extend the sitemap.xml file to include lastmod (and changefreq) for each item. First, I needed a way to exclude some of the pages defined in my menu, so I added a sitemap.ignore attribute to each menu item:
- title: Home
  href: "/"
  content: '<i class="fas fa-user-circle"></i>'
  sitemap:
    changefreq: weekly
    ignore: false
- title: Blog
  href: "/blog/"
  content: '<span>Blog</span>'
  sitemap:
    changefreq: daily
    ignore: false
- title: Resume on LinkedIn
  href: "https://www.linkedin.com/in/petr-hlavicka/"
  content: '<span>Resume</span> <i class="fas fa-external-link-square-alt text-xs ml-1"></i>'
  sitemap:
    ignore: true
I also added changefreq, although I am not sure how much weight Google gives this attribute. In the worst case, it will be ignored.
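To make the effect of sitemap.ignore concrete, here is a minimal Ruby sketch of the filtering step, run outside Bridgetown. It assumes a menu data file shaped like the one above (trimmed to the keys the sitemap needs). The `MENU_YAML` constant and the `sitemap_entries` helper are hypothetical names used only for illustration, not part of Bridgetown.

```ruby
require "yaml"

# Inline copy of a menu data file with the same shape as the one in this post.
# Assumption: only the keys relevant to the sitemap are kept here.
MENU_YAML = <<~MENU
  - title: Home
    href: "/"
    sitemap:
      changefreq: weekly
      ignore: false
  - title: Blog
    href: "/blog/"
    sitemap:
      changefreq: daily
      ignore: false
  - title: Resume on LinkedIn
    href: "https://www.linkedin.com/in/petr-hlavicka/"
    sitemap:
      ignore: true
MENU

# Hypothetical helper: drop menu items flagged with sitemap.ignore and
# return only the fields a sitemap entry needs.
def sitemap_entries(menu)
  menu
    .reject { |item| item.dig("sitemap", "ignore") }
    .map { |item| { href: item["href"], changefreq: item.dig("sitemap", "changefreq") } }
end

entries = sitemap_entries(YAML.safe_load(MENU_YAML))
entries.each { |entry| puts "#{entry[:href]} (#{entry[:changefreq]})" }
```

An item without a sitemap key is kept in this sketch, because a missing ignore flag is treated the same as false.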
For the lastmod attribute, I use a last_modified_at front matter field. The date always represents the original publish date (derived from the filename), and last_modified_at is only added when a post has been updated. This convention is also supported by the bridgetown-feed plugin, which uses it for the <updated> tag in the Atom feed.
In the post template, I display the update date when present:
{% if page.last_modified_at %}
  <span>
    Updated on
    <wa-format-date month="long" day="numeric" year="numeric"
      date="{{ page.last_modified_at | date: '%Y-%m-%dT%H:%M:%S' }}">
    </wa-format-date>
  </span>
{% endif %}
Whenever I update a post, I simply add last_modified_at with the update date to the front matter. The date stays as the original publish date.
Sitemap generation
---
layout: false
permalink: "/sitemap.xml"
---
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
                            http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
  {%- assign newest_post = site.posts | sort: "date" | last -%}
  {%- for item in site.data.menu %}
    {%- if item.sitemap.ignore -%}
      {%- continue -%}
    {%- endif -%}
    <url>
      <loc>{{ site.url }}{{ item.href }}</loc>
      <lastmod>{{ newest_post.date | date_to_xmlschema }}</lastmod>
      <changefreq>{{ item.sitemap.changefreq }}</changefreq>
    </url>
  {%- endfor %}
  {%- for post in site.posts -%}
    <url>
      <loc>{{ site.url }}{{ post.url }}</loc>
      <lastmod>{{ post.last_modified_at | default: post.date | date_to_xmlschema }}</lastmod>
      <changefreq>weekly</changefreq>
    </url>
  {%- endfor -%}
</urlset>
The newest_post assignment selects the newest post so that I can use its date as lastmod for all pages in the menu.
The item.sitemap.ignore check skips menu items flagged as ignored, such as external links.
For posts, last_modified_at is used when available, falling back to date otherwise. This ensures the sitemap reflects actual content changes.
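The fallback used for posts can be illustrated with a small Ruby sketch. It is only an approximation of the Liquid expression in the template, not Bridgetown's implementation. It assumes a post is a plain Hash with ISO 8601 date strings, whereas Bridgetown hands the template real date objects. The `lastmod_for` helper and the sample dates are hypothetical.

```ruby
require "time"

# Hypothetical helper approximating the Liquid expression
#   post.last_modified_at | default: post.date | date_to_xmlschema
# Assumption: `post` is a plain Hash with ISO 8601 strings, not a Bridgetown resource.
def lastmod_for(post)
  value = post[:last_modified_at]
  # Like Liquid's `default` filter, fall back when the value is nil or empty.
  value = post[:date] if value.nil? || value.to_s.empty?
  Time.iso8601(value.to_s).xmlschema
end

# Sample data, invented for illustration.
updated  = { date: "2021-03-22T10:00:00+01:00", last_modified_at: "2024-05-01T08:30:00+02:00" }
original = { date: "2021-03-22T10:00:00+01:00" }

puts lastmod_for(updated)   # uses last_modified_at
puts lastmod_for(original)  # falls back to date
```

A post that was never updated therefore keeps its publish date as lastmod, while an edited post reports the date of the edit.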
You can see the generated sitemap.xml of this site here.
If you know about a better way or you have any comments, don’t hesitate to contact me on Mastodon. Thanks!