This note explains the technical details and source code of the Python scripts used to publish my Obsidian vault with Hugo, using the amazing Hugo Bootstrap Theme. The instructions here should work with any Hugo theme, perhaps after minor adjustments. The accompanying Python code can be found in this GitHub repository. The code supports publishing only a subset of the Obsidian vault (notes with publish: true in the frontmatter, minus a manually specified list of files and folders to be excluded from publishing), converting Obsidian -[[wikilinks]]- to regular Hugo links, and adding backlinks.
Why another script?
There are already many ways to publish an Obsidian vault with Hugo. For example, the Hugo quartz theme provides a ready-made Hugo configuration for publishing an Obsidian vault. There are also scripts such as obsidian-to-hugo, obsidian-meets-hugo, obyde, and Obsidian Export that you can use and customize, as well as blog posts on how to use Hugo to publish content written in Obsidian. While it is awesome to have all of this:
- I could not find detailed technical notes on exactly how to export an Obsidian vault for consumption by Hugo (and by ‘detailed notes’, I mean the sort of stuff mentioned in the subsequent paragraphs :-) ).
- I did not want to use a ready-made Hugo template that is different from the one I already use for my Hugo-enabled blog. It is nice if the notes published on this site have the same look-and-feel as the rest of the site.
- I couldn’t find much information on supporting backlinks in the published notes. Some of the Hugo templates use a bunch of Hugo-trickery to generate the backlinks. However, for various reasons I prefer to generate the backlinks outside of Hugo.
Mostly for these reasons, I decided to create my own scripts for publishing an Obsidian vault with Hugo. The scripts described below are meant to run sequentially, as exemplified in the obsidian-export.sh file in the GitHub repository.
Overview of the main steps
The main steps are listed below, along with a link to the part of the source code/script that performs each step.
- Exporting a subset of the Obsidian markdown files to the appropriate location in the Hugo site source (Script: export-files.py)
  - Determining the subset of the Obsidian files that should be published
    - Based on contents of the YAML frontmatter in each note
    - Based on manually specified lists of files and folders to be excluded from publishing
  - Copying over the subset of the Obsidian vault to the appropriate location of the Hugo site
  - Creation of _index.md files as needed, depending on how you are organizing content in Hugo
- Replacing any -[[wikilinks]]- in the Obsidian notes with corresponding links in the Hugo format (Script: process-wikilinks.py)
- Adding backlinks, if any, to each published note (Script: add-backlinks.py)
- Copying any images/attachments associated with to-be-published files to the right locations in the Hugo site (Script: copy-assets.py)
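Since the four scripts must run in a fixed order, the sequence can also be driven from Python. The following is only a sketch of that idea, not the contents of obsidian-export.sh: the script names come from the list above, while `PIPELINE`, `run_pipeline`, and the `script_dir` parameter are hypothetical names, and the scripts are assumed to take no command-line arguments.

```python
import subprocess
import sys
from pathlib import Path

# Script names as listed in the overview; the order matters because
# each step works on the output of the previous one.
PIPELINE = (
    "export-files.py",
    "process-wikilinks.py",
    "add-backlinks.py",
    "copy-assets.py",
)


def run_pipeline(script_dir):
    """Run each script in order and stop at the first failure."""
    for name in PIPELINE:
        script = Path(script_dir) / name
        # check=True raises CalledProcessError on a non-zero exit code,
        # so later steps never run on a half-built export.
        subprocess.run([sys.executable, str(script)], check=True)
```

Calling `run_pipeline("path/to/scripts")` would then rebuild the Hugo content folder in one go.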
Exporting files
Brief explanation of the source code in the script export-files.py. This code ends up creating the folder to be published via Hugo and adds any missing files that Hugo may need.
- Get the locations of the filesystem folder in which the Obsidian vault is stored (aka origin), and the folder in which the corresponding Hugo content is to be stored (aka destination).
- If the Obsidian vault folder aka origin does not exist, throw an error and stop.
- Delete the Hugo content folder aka destination (since we’ll be regenerating it next)
- Create a list of files that should not be published, i.e. mark them for exclusion
  - Start with a manually created list of files to be excluded from publishing
  - Go through each markdown file in the origin folder. If its frontmatter’s publish key has any value except true, or if the publish key is missing, append that file to the list of files to exclude from publishing
- Copy files from the origin folder to the destination folder, excluding the files and folders that are marked for exclusion
- Create _index.md files as needed by your Hugo content organization. In my case, if any copied folder contains even a single markdown file but no _index.md file, the script creates the missing _index.md file in that folder. That file has the folder name as title and the timestamp at which it was created as the created frontmatter variable.
- Similar to the previous point, create an _index.md file in the root (top level) notes folder. In my case, I have a file in my Obsidian vault named root.md whose contents are copied over to the _index.md in the root folder. This has some content like, “Welcome to my Digital Garden”. Please review the Hugo docs to learn more about organizing content and the need/importance of _index.md files.
- Go through each file in the destination folder and add the date frontmatter variable to it if needed, because Hugo needs that variable to be present.
  - The script just copies over and formats the timestamp from the created variable, which must be present.
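The export steps above can be sketched roughly as follows. This is a minimal illustration and not the code in export-files.py: the function names and the `manual_excludes` parameter are hypothetical, the frontmatter is parsed naively line by line instead of with a YAML library, only markdown files are copied, and the root.md and date handling are left out.

```python
import shutil
from datetime import datetime
from pathlib import Path


def is_marked_for_publishing(markdown_text):
    """Return True only if the frontmatter contains `publish: true`."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return False  # no frontmatter, so the publish key is missing
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        key, _, value = line.partition(":")
        if key.strip() == "publish":
            return value.strip().lower() == "true"
    return False


def export_vault(origin, destination, manual_excludes):
    """Copy publishable notes from origin to destination (both are paths)."""
    origin, destination = Path(origin), Path(destination)
    if not origin.is_dir():
        raise FileNotFoundError(f"Obsidian vault not found: {origin}")
    if destination.exists():
        shutil.rmtree(destination)  # the folder is regenerated on every run
    destination.mkdir(parents=True)

    for source in sorted(origin.rglob("*.md")):
        relative = source.relative_to(origin)
        # Assumption: manual_excludes holds bare file and folder names.
        if any(part in manual_excludes for part in relative.parts):
            continue
        if not is_marked_for_publishing(source.read_text(encoding="utf-8")):
            continue
        target = destination / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)

    # Add a minimal _index.md to every copied subfolder that lacks one.
    for folder in [p for p in destination.rglob("*") if p.is_dir()]:
        index_file = folder / "_index.md"
        if any(folder.glob("*.md")) and not index_file.exists():
            created = datetime.now().isoformat(timespec="seconds")
            index_file.write_text(
                f"---\ntitle: {folder.name}\ncreated: {created}\n---\n",
                encoding="utf-8",
            )
```

Filtering while copying avoids building a separate exclusion list up front, at the cost of reading every markdown file in the vault once per run.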
Processing wikilinks
Brief explanation of the source code in the script process-wikilinks.py. This code replaces links to other notes in the Obsidian vault from the -[[wikilink]]- format to Hugo’s links and references format, specifically as [Link text]({{< ref "path/to/file" >}}).
- The script has a regex to detect wikilinks
- The script does some pretty kludgy hackery to avoid matching/replacing wikilinks in fenced code blocks
- Create a sqlite database with a table named links which has columns from, from_title, to, and to_title. See next point for how this is used.
- For each markdown file in the destination folder that contains a regex match
  - Replace the contents of the matched text with equivalent link content in the Hugo links and references format, specifically as [Link text]({{< ref "path/to/file" >}})
  - Add the full file path (e.g. /notes/path/to/file.md), its title (as written in the frontmatter), the full file path of the link’s target file, and the title in the frontmatter of the link’s target file to the sqlite database created previously. This database will then be useful to add backlinks to each file in a subsequent step.
- Known limitations of regex matching: Currently, wikilinks in inline code blocks will still be matched and replaced. Also, wikilinks need a preceding space in order to be matched.
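The steps above can be sketched as follows. This is an illustration under stated assumptions, not the code in process-wikilinks.py: the regex, the function names, and the `notes` lookup (a dict mapping a note name to its path and title) are all hypothetical, and the pattern here does not require a preceding space. Like the original, it skips fenced code blocks but still rewrites wikilinks inside inline code.

```python
import re
import sqlite3

# Matches [[Target]] and [[Target|Link text]]; an assumed pattern.
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")


def create_links_table(conn):
    # "from" and "to" are SQL keywords, so they must be quoted.
    conn.execute(
        'CREATE TABLE IF NOT EXISTS links '
        '("from" TEXT, from_title TEXT, "to" TEXT, to_title TEXT)'
    )


def convert_wikilinks(text, source_path, source_title, notes, conn):
    """Rewrite wikilinks outside fenced code blocks and record each link."""

    def replace(match):
        name = match.group(1).strip()
        if name not in notes:
            return match.group(0)  # leave unresolved links untouched
        target_path, target_title = notes[name]
        label = (match.group(2) or name).strip()
        conn.execute(
            "INSERT INTO links VALUES (?, ?, ?, ?)",
            (source_path, source_title, target_path, target_title),
        )
        return "[" + label + ']({{< ref "' + target_path + '" >}})'

    converted = []
    in_fence = False
    for line in text.splitlines(keepends=True):
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # toggle on opening and closing fences
            converted.append(line)
        elif in_fence:
            converted.append(line)
        else:
            converted.append(WIKILINK.sub(replace, line))
    return "".join(converted)
```

Tracking fences line by line keeps the regex itself simple; handling inline code as well would need a small tokenizer rather than a single pattern.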
Adding backlinks
Brief explanation of the source code in the script add-backlinks.py. This script adds backlinks, if any, to each published file.
- Open the sqlite database previously created in the “Processing wikilinks” step
- For each distinct markdown file in the to column
  - Find the rows in the from column which point to it (i.e. the list of files that have links to the file in the to column)
  - Add a ‘Backlinks’ section to the file
  - Add links (in Hugo ref format) to all the files found in the first sub-step
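A minimal sketch of this step, assuming the links table described in the previous section. The function names, the `## Backlinks` heading level, and the `content_root` parameter (the filesystem folder that site paths such as /notes/path/to/file.md are relative to) are assumptions made for illustration, not details taken from add-backlinks.py.

```python
import sqlite3
from pathlib import Path


def build_backlink_sections(conn):
    """Return a dict mapping each link target to its Backlinks section."""
    sections = {}
    targets = conn.execute(
        'SELECT DISTINCT "to" FROM links ORDER BY "to"'
    ).fetchall()
    for (target,) in targets:
        # All distinct files that link to this target.
        sources = conn.execute(
            'SELECT DISTINCT "from", from_title FROM links '
            'WHERE "to" = ? ORDER BY from_title',
            (target,),
        ).fetchall()
        lines = ["", "## Backlinks", ""]
        for source_path, source_title in sources:
            lines.append(
                "- [" + source_title + ']({{< ref "' + source_path + '" >}})'
            )
        sections[target] = "\n".join(lines) + "\n"
    return sections


def append_backlinks(conn, content_root):
    """Append the generated section to each target file that exists."""
    for target, section in build_backlink_sections(conn).items():
        note = Path(content_root) / target.lstrip("/")
        if note.is_file():
            with note.open("a", encoding="utf-8") as handle:
                handle.write(section)
```

Using SELECT DISTINCT means a note that links to the same target several times is listed only once.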
Copying assets
Brief explanation of the source code in the script copy-assets.py. This script copies the images referenced by the to-be-published files into the appropriate Hugo content folder.
- The script has a regex to detect image references in markdown format
- Open each markdown file in the Obsidian vault folder. If it contains a regex match to an image,
  - If the file is marked for publishing, extract the path to the image file referenced
    - If the referenced image file exists, copy it over to the appropriate folder in Hugo
    - Else skip copying the file
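The image-copying logic can be sketched like this. The regex, the function name, and the choice to mirror each image's relative path under the note's destination folder are assumptions for illustration; copy-assets.py may place files differently. The caller is expected to pass in only notes that are marked for publishing.

```python
import re
import shutil
from pathlib import Path

# Captures the path in markdown image syntax: ![alt text](path/to/image.png)
IMAGE_REF = re.compile(r"!\[[^\]]*\]\(([^)\s]+)")


def copy_referenced_images(note, note_destination_dir):
    """Copy images referenced by `note`; return the list of copied targets."""
    note = Path(note)
    copied = []
    text = note.read_text(encoding="utf-8")
    for reference in IMAGE_REF.findall(text):
        if "://" in reference:
            continue  # remote image, nothing to copy
        source = note.parent / reference
        if not source.is_file():
            continue  # referenced image is missing, so skip it
        target = Path(note_destination_dir) / reference
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)
        copied.append(target)
    return copied
```

Keeping the same relative path on both sides means the image references inside the published note continue to resolve without being rewritten.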