Thornton 2 Library of Scraps
1K38F

How To Add Site Search to a Static Website

Adding search to a static website is surprisingly easy. All you have to do is borrow an established search engine that already indexes your site. This article shows you how in four general steps.

Before beginning, don't forget to make your website search-friendly. It might help to keep your site map up to date, or if you use a static website generator, make sure it generates search-friendly site maps whenever you update your site.

Choose your search engine

First, choose your search engine. All of the major search engines and most of the minor ones allow you to restrict search results to only those on a certain domain by using "site:example.com" as a search term. For example, to search my website for pages containing unix, you would search for "site:thornton2.com unix". (Typically, the search term order doesn't matter.)
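
As a rough illustration of how such a site-restricted search address is put together, here is a minimal shell sketch. The function name build_search_url is my own invention for this example, and it assumes the search terms are plain words that need no URL encoding.

```shell
#!/bin/sh
# Build a site-restricted DuckDuckGo search URL.
# build_search_url is a hypothetical helper, not part of any tool.
# $1 = domain to restrict results to; remaining args = search terms.
# Assumption: terms are plain words that need no URL encoding.
build_search_url() {
    domain=$1
    shift
    query="site:${domain}"
    for term in "$@"
    do
        # Terms are joined with "+", the form-style encoding of a space.
        query="${query}+${term}"
    done
    printf '%s\n' "https://duckduckgo.com/?q=${query}"
}

build_search_url thornton2.com unix
```

Run as written, this prints https://duckduckgo.com/?q=site:thornton2.com+unix, which is the same search as typing "site:thornton2.com unix" into the search box.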

Choose your search engine carefully. Your site's users, not you alone, are going to use it. Google is the obvious and convenient choice. However, the company behind it tracks and captures as much data as it possibly can about everyone using it, piecing all the collected data together to build a picture of every individual that can be shockingly complete. Depending on your intended audience, many of your site's visitors may find this too creepy to tolerate.

At the other extreme, Startpage promises user privacy while using Google as its search backend. However, its results may at times lag by several days. Depending on how frequently you create and edit pages, this lag may or may not be acceptable.

My personal preference is DuckDuckGo, which provides most of Google's useful features without its filter bubbling or privacy issues. That's the engine I use in my examples below, but if you choose another search engine, edit my examples to fit your choice.

Find the name of the search term parameter in the query string

Run a search on the engine you decided on and take a look at the resulting URL. Everything after the "?" is the query string, a series of "name=value" pairs separated by "&". One of those pairs will have your search terms as the field value; note the field name.

For most search engines, the search terms are assigned to the name "q". Startpage is a notable exception, naming its search terms "query" instead.

Since I'm using DuckDuckGo, I need to use the name "q" for search terms.
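
If the URL is long, picking out the field names by eye can be tedious. The following is a small sketch that lists them for you. The function name query_param_names is hypothetical, and the sample URL (including its second parameter) is only illustrative.

```shell
#!/bin/sh
# Print the field names found in a URL's query string, one per line.
# query_param_names is a hypothetical helper for illustration.
query_param_names() {
    case $1 in
        *[?]*) ;;           # URL has a query string, carry on
        *) return 0 ;;      # no "?" means there is nothing to list
    esac
    # Keep the text after the first "?", drop any "#fragment",
    # split the pairs on "&", and keep what precedes each "=".
    printf '%s\n' "${1#*[?]}" | sed 's/#.*$//' | tr '&' '\n' | cut -d= -f1
}

# Illustrative URL; paste the one from your own browser instead.
query_param_names 'https://duckduckgo.com/?q=site:thornton2.com+unix&ia=web'
```

For the sample URL this lists "q" and "ia". The name whose value holds your search terms is the one you want.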

Create your search page

I decided that /search.html is as good a page address as any, so that's where I put mine. Create that page, and give it this HTML form somewhere in the page body:

<div id="searchform" style="text-align: center;">
  <form method="get" id="search" action="/do/search">
    <input type="text" name="q" maxlength="255" />
    <input type="submit" value="Search" />
  </form>
</div>

Note the "action" attribute of the "form" element. The query is going to be sent to a URL on your own website, not the actual search engine site. I'll get to that in a moment.

Note also the text input field. This is where the search terms go, so this form field has to have the same name as the search term parameter.

The rest is basic HTML stuff. Add to this however you see fit, and rename and restyle the form and its container div however you want.

Redirect queries to your search engine

Depending on your web host and what they're using, there are five ways I know of to redirect queries to your search engine. They are, in order of most to least preferable to me:

  1. With .htaccess on Apache and similar with mod_rewrite,
  2. With Nginx,
  3. With a small CGI script,
  4. With a special DuckDuckGo form, or
  5. With JavaScript

Redirect queries to your search engine with .htaccess

If your site host uses Apache and has mod_rewrite enabled, edit the .htaccess file in your website's root directory. If it doesn't exist, create it. Add the following lines:

RewriteCond %{QUERY_STRING} (?:^|&)q=([^&]*)
RewriteRule ^do/search$ https://duckduckgo.com/?q=site:thornton2.com+%1 [NC,R=302,L]

Replace the "thornton2.com" bit on the RewriteRule line with your website's domain name.

Depending on your host and the existing contents of .htaccess, you may need to add a "RewriteEngine on" line just before these lines.

Also, confession time: This part was lifted from this Stack Overflow answer. However, not only did that user give an elegant answer, they explained what the PCRE expression is doing. What you'll have to edit, if necessary, is the "q=" part in the expression, because that's the query string field name that this expression is plucking out.

The condition looks in the query string for the search term field name and captures its value as %1. The rule matches the form action URL, builds a new query string with the site restriction placed in front of the captured search terms, and redirects the result to your search engine of choice.

The flags at the end of the RewriteRule line make the form action URL match case-insensitive (NC), make the redirect to the search engine a temporary 302 redirect (R=302), and stop the rewrite process (L). See RewriteCond and RewriteRule in the Apache documentation for details.

Because of mod_rewrite, there's no need to create an actual "/do/search" page.

That's it, you're done! Upload these files to your server, and you're good to go!
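
Once the files are uploaded, one way to confirm the redirect without opening a browser is to look at the response headers. This is only a sketch: location_of is a hypothetical helper name, your-web.site is a placeholder for your own domain, and it assumes curl is installed on your machine.

```shell
#!/bin/sh
# Print the redirect target found in HTTP response headers read from
# standard input. location_of is a hypothetical helper.
location_of() {
    # Header names are case-insensitive; also strip the carriage
    # return that ends each HTTP header line.
    tr -d '\r' | sed -n 's/^[Ll][Oo][Cc][Aa][Tt][Ii][Oo][Nn]: *//p' | head -n 1
}

# Usage against a live site (your-web.site is a placeholder):
#   curl -sI 'https://your-web.site/do/search?q=unix' | location_of
```

If the rewrite is working, the printed address should be a DuckDuckGo URL whose q value starts with your site restriction. If nothing is printed, the server did not answer with a redirect.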

Redirect queries to your search engine with Nginx

I don't have any experience with Nginx, so I don't know how. However, in researching this topic and making my own search page, I found this page: Proper Search Engine for a Static Website powered by DuckDuckGo (and similar).

Redirect queries to your search engine with a small CGI script

The last server-based option, if you can't use any sort of URL rewriting for your website, is creating a CGI script to do the redirect for you.

Check your website host's support site for whether you can use custom CGI scripts, whether or not they have to be named "something.cgi" to work, and whether or not they have to live in a special "/cgi-bin" directory. The answers for Dreamhost sites are yes, yes, and no.

If your CGI script has to be in /cgi-bin, then you need to edit your form action URL to "/cgi-bin/search".

If your CGI script has to have a ".cgi" extension, then you need to edit your form action URL to "/search.cgi".

If both of the above, then you need to edit your form action URL to "/cgi-bin/search.cgi". While researching and troubleshooting this option, I did this even though I didn't need a cgi-bin directory.

Create your CGI script with whatever name and directory you need to give it, and fill it with this, changing the values of site, prefix, and suffix as needed:

#!/bin/sh -p

# A CGI script to redirect a personal website search form to DuckDuckGo.
# This script takes a query that would search the entire Internet and
# adds a search term that restricts search results to only those pages
# under that personal website's domain.  Use an HTML form like this on
# your website:
#
# <div id="search-div" style="text-align: center;">
#   <form method="get" id="search" action="/cgi-bin/search.cgi">
#     <input type="text" name="q" maxlength="255" />
#     <input type="submit" value="Search" />
#   </form>
# </div>
#
# On Dreamhost by default, this script only needs to have a .cgi
# extension and mode 755 to be run as CGI.
#
# Copyright 2019 Ariel Millennium Thornton
# Licensed to you under the terms of my license at
# https://thornton2.com/documents/license.txt

# Your website's actual domain name
#
site="your-web.site"

# Error page address to redirect to if the query is basically bad
#
err403="/403.html"

# Queries that look good get redirected to a URL like:
# https://duckduckgo.com/?q=site:your-web.site+search+query+terms
#
prefix="https://duckduckgo.com/?q=site:${site}+"
# Search terms in between prefix and suffix
suffix=""

# Business logic:

ok="" # Use an 'okay' sentinel: 0 = okay, null = not okay.

rawquery="${QUERY_STRING}"; basequery=""; newquery=""

if [ -n "$rawquery" ]
then
        # Cut out any params before and/or after the q param, as well as
        # the q param itself, leaving only the query.
        # (Ampersands in the search query are encoded as %26.)
        # The q= must start the string or follow an ampersand, so a
        # parameter such as "seq=" is not mistaken for it.
        basequery=`echo "$rawquery"|\
        sed -n 's/^\(.*&\)\{0,1\}q=//p'|sed 's/&.*$//'`
        if [ "$?" -eq 0 ]; then ok=0; fi
fi

if [ $ok ]
then
        # Test if the query has terms, assume an empty query is
        # either a blank search or basically malformed.
        if [ -z "$basequery" ]; then ok=""; fi
fi

if [ $ok ]
then
        # Test if the query was actually isolated.
        # The strings shouldn't be identical if it was.
        if [ "$rawquery" != "$basequery" ]
        then
                # Yes, it was.  Compose search redirect destination.
                newquery="${prefix}${basequery}${suffix}"
        else
                ok=""
        fi
fi

if [ $ok ]
then
        echo "location: ${newquery}"; echo "" # Redirect to DuckDuckGo.
else
        echo "location: ${err403}"; echo "" # Redirect to our 403.
fi
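
Before uploading, it can help to try the script from a terminal. A web server hands the query string to a CGI script through the QUERY_STRING environment variable, so setting that variable by hand is a fair imitation. The sketch below assumes the script was saved as ./search.cgi with mode 755; try_cgi is a hypothetical helper name.

```shell
#!/bin/sh
# Run a CGI script roughly the way a web server would for a GET
# request and print the first header line it emits.
# try_cgi is a hypothetical helper.
# $1 = path to the script, $2 = raw query string
try_cgi() {
    REQUEST_METHOD=GET QUERY_STRING=$2 "$1" | head -n 1
}

# Usage, assuming the script was saved as ./search.cgi with mode 755:
#   try_cgi ./search.cgi 'q=unix+shell'
#   try_cgi ./search.cgi ''
```

The first call should print a location line pointing at DuckDuckGo with the search terms attached, and the second should print a location line pointing at the error page.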

Redirect queries to your search engine with a special DuckDuckGo form

For this option, edit your search form page to look like this instead:

<div id="searchform" style="text-align: center;">
  <form method="get" id="search" action="https://duckduckgo.com/">
    <input type="hidden" name="sites" value="thornton2.com" />
    <input type="text" name="q" maxlength="255" />
    <input type="submit" value="Search" />
  </form>
</div>

Change the value of the sites input element to your domain name.

This option has two major drawbacks. First, DuckDuckGo is the only search engine I found that lets you specify a site restriction as a separate field in the query string. Second, visitors using older browsers or modern browsers without JavaScript support will get redirected to DuckDuckGo's HTML or "lite" search versions, which don't respect this form of site restriction (they'll get whole-Internet search results instead).

Redirect queries to your search engine with JavaScript

If none of the ways listed above is suitable, then this is the last option I know of.

DuckDuckGo has a search box creation tool and wizard that will let you create a search box for your website. Instead of creating an HTML form, paste the code the wizard gives you, and you'll have an iframe with the search box instead.

While convenient, this means only visitors whose browsers support JavaScript can search your website. Those without it, and those who have it turned off for security reasons, won't be able to use the search box. Also, again, DuckDuckGo is the only search engine I found with a build-your-own search box wizard.

A JavaScript-based approach that lets you keep and use your own search form is described here: Adding DuckDuckGo Search to Your Website. This has the same "must have JavaScript" restriction that DuckDuckGo's wizard has, but you can edit the JavaScript function easily to use another search engine instead.