Static site search with Hugo + Algolia

For this week on Frontend Friday, we’ll be covering how to set up lightning ⚡️ fast search for your Hugo site using Algolia, the SaaS (Search as a Service 😉) provider. We published a Jekyll-focused version of this guide last week.

Algolia’s claim to fame is that they are “the most reliable platform for building search into your business,” and honestly, it’s hard to disagree. Forestry’s search is powered by Algolia (just try searching for Algolia in the search above!).

https://res.cloudinary.com/forestry-demo/image/fetch/c_limit,dpr_auto,f_auto,q_80,w_640/https://forestry.io/uploads/2018/02/forestryio-algolia-search.gif

Table of Contents

We’re going to generate a JSON search index for our static site using Hugo’s custom output formats. Then we’ll configure Algolia and send the new index to it using the npm package atomic-algolia. Lastly, we’ll simplify updating your search index using Serverless.

  1. Why Algolia?
  2. Generating Your Search Index
  3. Create Your Index in Algolia
  4. Sending Your Search Index to Algolia
  5. Updating Your Search Index with Serverless Functions
  6. Next Steps

1) Why Algolia?

There are many search solutions for static sites out there. You can roll your own search in frontend JavaScript with tools like Lunr.js or Fuse.js, run a dedicated search engine like Elasticsearch or Amazon CloudSearch, or use a SaaS solution like Algolia.

So the question is, what makes Algolia so great?

The answer comes down to two factors:

How Algolia Works

Algolia provides a REST API to query and update your search indices. All input and output is provided in JSON, making it extremely easy to use in frontend JavaScript.
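To make the JSON-in, JSON-out idea concrete, here is a minimal sketch of how a search request against the REST API could be put together. The helper name buildSearchRequest is ours, and the host name, endpoint path, and header names reflect our understanding of Algolia’s REST API, so check them against the current Algolia documentation before relying on them.

```javascript
// Minimal sketch: describe a search request for Algolia's REST API.
// buildSearchRequest is a hypothetical helper, not part of any Algolia library.
// appId, searchKey, and indexName are placeholders you supply yourself.
function buildSearchRequest(appId, searchKey, indexName, query) {
  return {
    // Assumed endpoint shape: /1/indexes/{indexName}/query on the app's DSN host
    url: `https://${appId}-dsn.algolia.net/1/indexes/${encodeURIComponent(indexName)}/query`,
    options: {
      method: "POST",
      headers: {
        "X-Algolia-Application-Id": appId,
        "X-Algolia-API-Key": searchKey,
        "Content-Type": "application/json",
      },
      // Search parameters are sent as a URL-encoded string inside a JSON body
      body: JSON.stringify({ params: `query=${encodeURIComponent(query)}` }),
    },
  };
}

// Usage (browser, or Node 18+ where fetch is available):
// const { url, options } = buildSearchRequest("YOUR_APP_ID", "YOUR_SEARCH_ONLY_KEY", "YOUR_INDEX_NAME", "hugo");
// fetch(url, options).then((res) => res.json()).then((data) => console.log(data.hits));
```

Use a search-only API key for requests made from the browser, never the Admin API Key.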

In order to create, update, and maintain an Algolia search index, you’ll need to generate a valid JSON array of all of the content in your Hugo site.

We’ll do that in the next step!

2) Generating Your Search Index

To get started with Algolia, the very first thing you’ll need to do is sign up. Once that is out of the way, your next step is to generate your JSON search index.

With Hugo, we’ll do this using the custom output formats feature, which allows us to output an existing document in a different format (in this case, a valid Algolia JSON index).

To get started, open up config.toml. Here, we’ll add the Hugo configuration for your custom output formats.

Don’t have a Hugo site yet? Check out our Up & Running With Hugo series to get started with the static site generator Hugo in less than 30 minutes!

[outputFormats.Algolia]
baseName = "algolia"
isPlainText = true
mediaType = "application/json"
notAlternative = true

[params.algolia]
vars = ["title", "summary", "date", "publishdate", "expirydate", "permalink"]
params = ["categories", "tags"]

In [outputFormats.Algolia]:

  • baseName sets the base file name of the generated file, so the index is written to algolia.json. (Hugo finds the layout by the name of the output format itself, Algolia.)
  • isPlainText tells Hugo to use Go’s plain text template parser for the layout, preventing automatic HTML escaping from ruining your JSON.
  • mediaType tells the output format what kind of file to output.
  • notAlternative tells Hugo not to include this output format when looping over the .AlternativeOutputFormats page variable.

In [params.algolia]:

Creating the JSON Template

The next step is to provide Hugo with the JSON template for your custom output format. To do this, we’ll create a new layout.

In the example above, we named the output format Algolia and gave it a JSON media type, which tells Hugo to look for templates with “algolia” in the filename and a .json extension. For example, list.algolia.json or taxonomy.algolia.json.

Copy the contents below into layouts/_default/list.algolia.json:

{{/* Generates a valid Algolia search index */}}
{{- $.Scratch.Add "index" slice -}}
{{- $section := $.Site.GetPage "section" .Section }}
{{- range .Site.AllPages -}}
  {{- if or (and (.IsDescendant $section) (and (not .Draft) (not .Params.private))) $section.IsHome -}}
    {{- $.Scratch.Add "index" (dict "objectID" .UniqueID "date" .Date.UTC.Unix "description" .Description "dir" .Dir "expirydate" .ExpiryDate.UTC.Unix "fuzzywordcount" .FuzzyWordCount "keywords" .Keywords "kind" .Kind "lang" .Lang "lastmod" .Lastmod.UTC.Unix "permalink" .Permalink "publishdate" .PublishDate "readingtime" .ReadingTime "relpermalink" .RelPermalink "summary" .Summary "title" .Title "type" .Type "url" .URL "weight" .Weight "wordcount" .WordCount "section" .Section "tags" .Params.Tags "categories" .Params.Categories "authors" .Params.Authors)}}
  {{- end -}}
{{- end -}}
{{- $.Scratch.Get "index" | jsonify -}}

In this layout, we loop through all of the current page’s children and do the following:

  • Set the objectID of the Algolia index’s document using the .UniqueID page variable.
  • Loop through the document’s built-in variables and add specific variables to the document.
  • Loop through the document’s custom Front Matter params and add specific params to the document.

The above layout will only add pages that do not have private = true or draft = true in their front matter. This makes it easy to exclude results that shouldn’t be indexed, and prevents drafted content from being included.

Outputting the Index

Now that we’ve created our custom output format and its layout, and configured the variables and page-level params we want included in the index, we have to set up the site to actually create the JSON index!

We can do this in two ways, using the outputs param:

  1. Using site-wide outputs configuration to output indexes for all content of a specific type.
  2. Specifying the outputs on a per-page basis, outputting the index for a specific page.

For the purposes of this guide, we’ll do the former. Open up your config file one more time, and add the following:

[outputs]
home = ["HTML", "RSS", "Algolia"]

This configuration tells Hugo to output the HTML document, the RSS Feed, and an Algolia index for your site’s homepage, which will contain every other page on your site. (That’s perfect!)

In your built site, you’ll now find a file called algolia.json in the root, which we can use to update your index in Algolia.
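Before sending the file anywhere, it can be worth a quick sanity check that it really is a JSON array of records with unique objectID values. The following is a minimal sketch for doing that in Node; findIndexProblems is a name we made up for illustration, and the path in the usage comment assumes Hugo’s default public/ output directory.

```javascript
// Minimal sketch: sanity-check a parsed search index before uploading it.
// findIndexProblems is a hypothetical helper, not part of Hugo or Algolia.
function findIndexProblems(records) {
  if (!Array.isArray(records)) return ["index is not a JSON array"];
  const problems = [];
  const seen = new Set();
  records.forEach((record, i) => {
    const id = record && record.objectID;
    if (typeof id !== "string" || id === "") {
      // Every record needs an objectID so it can be matched on later updates
      problems.push(`record ${i} has no objectID`);
    } else if (seen.has(id)) {
      problems.push(`record ${i} repeats objectID ${id}`);
    } else {
      seen.add(id);
    }
  });
  return problems;
}

// Usage in Node, assuming the default Hugo output directory:
// const fs = require("fs");
// const records = JSON.parse(fs.readFileSync("public/algolia.json", "utf8"));
// console.log(findIndexProblems(records));
```

An empty array means the file passed these basic checks.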

3) Create Your Index in Algolia

Head over to your Algolia app dashboard, and click New Application.

https://res.cloudinary.com/forestry-demo/image/fetch/c_limit,dpr_auto,f_auto,q_80,w_640/https://forestry.io/uploads/2018/02/algolia-screen-1.png

Set the application name to something memorable (e.g., your company name), and choose Community as your plan.

https://res.cloudinary.com/forestry-demo/image/fetch/c_limit,dpr_auto,f_auto,q_80,w_640/https://forestry.io/uploads/2018/02/algolia-screen-2-region.png

Then, choose the region closest to you. (In our case, that’s Canada. 🇨🇦)

https://res.cloudinary.com/forestry-demo/image/fetch/c_limit,dpr_auto,f_auto,q_80,w_640/https://forestry.io/uploads/2018/02/algolia-screen-3-indicies.png

You’ll be redirected to the app’s dashboard. Select the Indices tab on the left, and then click Add New Index. Give it a unique name (e.g., your site’s domain), as this is what we’ll use when updating the index.

https://res.cloudinary.com/forestry-demo/image/fetch/c_limit,dpr_auto,f_auto,q_80,w_640/https://forestry.io/uploads/2018/02/algolia-screen-4-api-keys.png

Finally, select the API Keys tab on the left, and copy the Application ID and Admin API Key, as we’ll need these to update the index.

4) Sending Your Search Index to Algolia

The next step is sending your search index to Algolia. For this article, we’ll be using a great npm package to do this: atomic-algolia.

atomic-algolia is an npm package that does atomic updates to an Algolia index. This means that it only updates changed records, adds new records, or deletes expired records, and does it all at once, so that your index is never out-of-sync with your website’s content.

This is important because Algolia’s plans are based on the number of operations and searches on your index, and this package ensures you use as few operations as possible! Our user @budparr ran a quick test to find out just how many operations can be saved using atomic-algolia. The results are impressive: hugo-algolia generated 4,613 operations vs. atomic-algolia’s 911 operations.

https://res.cloudinary.com/forestry-demo/image/fetch/c_limit,dpr_auto,f_auto,q_80,w_640/https://forestry.io/uploads/2018/03/atomic-algolia-vs-hugo-algolia-test.png
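To see why this kind of update saves operations, here is a rough sketch of the idea: compare the records already in the index with the records in your new algolia.json, and only touch what differs. This is our own illustration of the general technique, not atomic-algolia’s actual code, and diffRecords is a made-up name.

```javascript
// Illustrative only: NOT atomic-algolia's implementation.
// Compares remote records with local records by objectID and returns
// the smallest set of changes needed to bring the index up to date.
function diffRecords(remoteRecords, localRecords) {
  const remoteById = new Map(remoteRecords.map((r) => [r.objectID, r]));
  const localIds = new Set(localRecords.map((r) => r.objectID));
  const toAdd = [];
  const toUpdate = [];

  for (const record of localRecords) {
    const existing = remoteById.get(record.objectID);
    if (!existing) {
      toAdd.push(record); // new page
    } else if (JSON.stringify(existing) !== JSON.stringify(record)) {
      // Naive comparison: sensitive to key order, good enough for a sketch
      toUpdate.push(record); // changed page
    }
  }

  // Anything in the index that no longer exists locally should be removed
  const toDelete = remoteRecords
    .filter((r) => !localIds.has(r.objectID))
    .map((r) => r.objectID);

  return { toAdd, toUpdate, toDelete };
}

// Usage:
// const changes = diffRecords(recordsFromAlgolia, recordsFromAlgoliaJson);
// console.log(changes.toAdd.length, changes.toUpdate.length, changes.toDelete.length);
```

Unchanged records appear in none of the three lists, so they cost no operations at all.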

To get started, make sure you have Node installed. If you don’t, you can do so by downloading an installer for your operating system.

Update Your Index

Now that you have an Algolia index, open up your terminal, navigate to your Hugo project, and run the following command:

npm install atomic-algolia --save

This will install the atomic-algolia package to a local node_modules folder and make it available for use in your Hugo project.

Next, open up the newly created package.json, where we’ll add an npm script to update your index. Find "scripts", and add the following:

"algolia": "atomic-algolia"

Now, you can update your index by running the following command:

ALGOLIA_APP_ID={{ YOUR_APP_ID }} ALGOLIA_ADMIN_KEY={{ YOUR_ADMIN_KEY }} ALGOLIA_INDEX_NAME={{ YOUR_INDEX_NAME }} ALGOLIA_INDEX_FILE={{ PATH/TO/algolia.json }} npm run algolia

The path to the index file in the Hugo Boilerplate is dist/algolia.json, whereas the path in a default Hugo site is public/algolia.json.

Using a .env File

Passing in the environment variables to the npm script each time you call it isn’t ideal. That’s why atomic-algolia supports a .env file.

Create a new file in the root of your Hugo project called .env, and add the following contents:

ALGOLIA_APP_ID={{ YOUR_APP_ID }}
ALGOLIA_ADMIN_KEY={{ YOUR_ADMIN_KEY }}
ALGOLIA_INDEX_NAME={{ YOUR_INDEX_NAME }}
ALGOLIA_INDEX_FILE={{ PATH/TO/algolia.json }}

Now you can update your index more simply by running:

npm run algolia
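If the script complains about missing settings, a small check like the one below can tell you which of the four variables is absent from your .env file. This is a minimal sketch with a deliberately simple KEY=VALUE parser of our own; it is not how atomic-algolia itself reads the file, and parseEnv and missingVars are names we made up.

```javascript
// Minimal sketch: find out which required settings are missing from a .env file.
// parseEnv and missingVars are hypothetical helpers for illustration only.
const REQUIRED_VARS = [
  "ALGOLIA_APP_ID",
  "ALGOLIA_ADMIN_KEY",
  "ALGOLIA_INDEX_NAME",
  "ALGOLIA_INDEX_FILE",
];

// Parses simple KEY=VALUE lines; skips blank lines and # comments.
function parseEnv(text) {
  const vars = {};
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.startsWith("#")) continue;
    const eq = trimmed.indexOf("=");
    if (eq === -1) continue;
    vars[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return vars;
}

// Returns the names of required variables that are absent or empty.
function missingVars(vars) {
  return REQUIRED_VARS.filter((name) => !vars[name]);
}

// Usage in Node, from the root of your Hugo project:
// const fs = require("fs");
// console.log(missingVars(parseEnv(fs.readFileSync(".env", "utf8"))));
```

Remember to keep .env out of version control, since it contains your Admin API Key.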

5) Updating Your Search Index with Serverless Functions

Having to run the npm script manually each time your site changes isn’t ideal, especially when using services like the Forestry.io CMS.

That’s why we’ve created an open-source template for creating a Serverless Webtask Function that uses webhooks to automatically update your Algolia index each time your site is updated.

What is serverless?

With Serverless, you can set up functions that run in the cloud and don’t require a full-blown backend server running PHP, Node, etc. These functions are perfect for performing tasks like updating your Algolia index. Serverless infrastructure is a great combination with static sites. For more background, check out this article by Auth0.

Setting Up

To get started, clone the template repository to your local machine by running:

git clone https://github.com/forestryio-templates/serverless-atomic-algolia.git

Don’t have or want to use Git? Feel free to download the template as a zip instead.

Then, navigate to the template directory and install the dependencies:

cd serverless-atomic-algolia
npm install serverless -g && npm install

Configuring

Next, if you don’t already have a Webtasks profile set up, you’ll need to do so. This can be done directly from the command line.

serverless config credentials --provider webtasks

You will be asked for a phone number or email. You’ll immediately receive a verification code. Enter the verification code and your profile will be fully set up and ready to use.

Next, you’ll need to configure the function with your indices and Algolia app information.

First, copy config/secrets.yml.stub to config/secrets.yml and then open it up in your favorite text editor.

ALGOLIA_APP_ID: {{ YOUR_APP_ID }}
ALGOLIA_ADMIN_KEY: {{ YOUR_ADMIN_KEY }}
DEBOUNCE: 0

Then, open up config/index.js in your favorite text editor:

module.exports = () => {
  var indexes = [
    {
      name: "YOUR_INDEX_NAME",
      url: "PUBLIC_URL_OF_INDEX"
    }
  ]

  return JSON.stringify(indexes)
}

Update name to the name of the index you set up earlier, and url to yourdomain.com/algolia.json, replacing yourdomain.com with your site’s domain.

Deploying the Function

Now we can deploy the function by running:

serverless deploy

In the terminal, you’ll receive an output for the success of your deployment, including the public URL for your new function. Copy that to the clipboard, as this is the URL we’ll trigger with a webhook when changes are made to the site.

Setting Up A Webhook

Finally, our last step is to set up a post-deployment webhook with your deployment tool. Setup for each tool will be different, but we’ll provide setup for:

  • Forestry.io

Forestry.io

https://res.cloudinary.com/forestry-demo/image/fetch/c_limit,dpr_auto,f_auto,q_80,w_640/https://forestry.io/uploads/2018/01/settings-webhook.png

Head over to the Settings page of your site in Forestry, and scroll down to the Webhook URL setting.

Enter the URL you received when deploying your function, and then click Save Settings.

Now each time Forestry finishes deploying your site, your function will be invoked to update your Algolia index.

This setup only works when Forestry is set up to handle the build and deployment of your site. If you’re using a third-party CI service to build your site (like GitLab CI or Netlify), you will need to use their webhook features to trigger your function.

6) Next Steps

Now that you’ve successfully set up search indexing for your static site, it’s time to add the actual search interface to your website!

Algolia has a fantastic library called InstantSearch.js for implementing search on the web, and provides a full tutorial for implementing search from scratch.

Join us every Friday 📅

Frontend Friday is a weekly series where we write in-depth posts about modern web development.

Next week: Up & Running with Hugo: Building Your First Site

Last week: We wrote a Jekyll-focused version of this article: Jekyll Search with Algolia and Webtasks.

Have something to add?

Discuss on Hacker News