The QuickSearch module for J1 Theme is based on the search engine Lunr, fully integrated with J1 Theme. Lunr is designed to be lightweight yet full-featured to provide a great search experience. There is no need for complex external, server-side search engines or commercial internet services like Google. Searching a website with QuickSearch differs from using search engines like Google or Microsoft Bing. Those platforms offer the public a simple interface but rely on complex algorithms and many artificial intelligence (AI) methods to make sense of the handful of words given for a search.
Nevertheless, QuickSearch, the J1 implementation of Lunr, is as simple to use as a Google search but offers additional features for more specific searches if wanted. In any case, QuickSearch provides an easy-to-use query language for better results.
Core concepts
Understanding some of the concepts and terminology that QuickSearch (Lunr) uses helps users write more powerful searches and get more relevant results.
Indexing documents
QuickSearch searches all documents of the website generated by J1, but only for this site. An advantage is that searches need no internet access. Searches are based on a pre-built, local full-text index of the site that the browser loads on a page request. The index for a site is generated by the (Jekyll) plugin lunr_index.rb located in the _plugins folder.
The full-text index is always generated by Jekyll at build-time:
Startup the site ..
Configuration file: ...
Incremental build: enabled
Generating...
J1 QuickSearch: creating search index ...
J1 QuickSearch: finished, index ready.
....
Or, if you’re running a website in development mode, the index is refreshed whenever files are added or modified.
site: Regenerating: n file(s) changed at ...
site: ...
site: J1 QuickSearch: creating search index ...
site: J1 QuickSearch: finished, index ready.
...
Documents
The searchable data in an index is organized as documents containing the text and the words (terms) you want to search on. A document is a data set (JSON object) with fields that are processed to create the result list for a search.
A document data set might look like this:
{
  "id": 3,
  "title": "Roundtrip",
  "tagline": "present images",
  "url": "/pages/public/learn/roundtrip/present_images/",
  "date": "2020-11-03 00:00:00 +0100",
  "tags": [
    "Introduction",
    "Module",
    "Image"
  ],
  "categories": [
    "Roundtrip"
  ],
  "description": "Welcome to the preview page ... and galleries.\n",
  "is_post": false
}
This document has several fields, like title, tagline, or description, that can be used for full-text searches. Additional fields, like tags or categories, can be used for more specific searches based on identifiers.
The document content is collected by the (intrinsic) field body. To limit the index data loaded by the browser, the body field is removed from a document. The body field is not available as an explicit field for searches, but the content is still fully searchable.
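The idea of indexing the body while not shipping it with the document can be illustrated with a minimal Python sketch. This is not the code of the lunr_index.rb plugin or of Lunr itself. The function name build_index, the simple whitespace tokenizer, and the sample body text are assumptions made for illustration only.

```python
from collections import defaultdict

def build_index(documents):
    """Return (inverted_index, stored_docs) for a list of document dicts."""
    inverted = defaultdict(set)   # term -> set of document ids
    stored = {}                   # id -> document without the body field
    for doc in documents:
        for field in ("title", "tagline", "description", "body"):
            for term in str(doc.get(field, "")).lower().split():
                inverted[term].add(doc["id"])
        # Drop the body from the stored document; its terms stay in the
        # inverted index and therefore remain searchable.
        stored[doc["id"]] = {k: v for k, v in doc.items() if k != "body"}
    return inverted, stored

# Hypothetical sample document (the body text is made up for this sketch).
docs = [
    {"id": 3, "title": "Roundtrip", "tagline": "present images",
     "description": "Welcome to the preview page",
     "body": "lightbox galleries and more"},
]
inverted, stored = build_index(docs)
print(sorted(inverted["galleries"]))   # [3]
print("body" in stored[3])             # False
```

The term galleries occurs only in the body, yet it still leads to document 3, while the stored document no longer carries the body text.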
For simple full-text searches as well as more specific ones, the QuickSearch core engine Lunr offers a query language, a DSL (domain-specific language). Find more about QuickSearch (Lunr) DSL queries in the section Searching.
Scoring
The relevance (the score) is calculated based on an algorithm called BM25, along with other factors. You don’t need to worry too much about the details of how this technique works. To summarize: the more often a search term occurs in a single document, the more that term increases the document’s score, but the more often a term occurs in the overall collection of documents, the less it increases a document’s score. In other words, rare words count more and increase the score.
Scoring information generated by the BM25 algorithm is added to the (local) search index and allows a very fast calculation of the relevance of documents for queries.
Imagine your website contains documents about Jekyll. The term Jekyll may occur very frequently throughout the entire website because it is used in most of the content. So finding a document that mentions the term Jekyll isn’t very significant for a search.
However, if you’re searching for Jekyll Generator, only some documents of the website contain the word Generator. That raises the score (relevance) of documents containing both words and brings them higher up in the search results.
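To make the effect of rare and frequent terms concrete, here is a small Python sketch of a BM25-style term score. It follows the commonly published BM25 formula with the widely used parameters k1 = 1.2 and b = 0.75. Whether QuickSearch uses exactly these values is an assumption, and the document counts below are hypothetical.

```python
import math

def idf(n_docs, doc_freq):
    # Inverse document frequency: terms found in few documents weigh more.
    return math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))

def bm25_term(tf, doc_len, avg_len, n_docs, doc_freq, k1=1.2, b=0.75):
    # The term-frequency part grows with tf but saturates, and it is
    # normalized by the document length relative to the average length.
    norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_len))
    return idf(n_docs, doc_freq) * norm

# Hypothetical site: 100 documents, "jekyll" in 90 of them, "generator" in 5.
N = 100
common = bm25_term(tf=1, doc_len=200, avg_len=200, n_docs=N, doc_freq=90)
rare = bm25_term(tf=1, doc_len=200, avg_len=200, n_docs=N, doc_freq=5)
print(rare > common)   # True
```

With these numbers, one occurrence of generator contributes far more to a document's score than one occurrence of jekyll, which is why documents containing both words rise to the top.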
Matching and scoring are used by all search engines, and J1 QuickSearch is no exception. QuickSearch sorts search results in a way you already know from commercial internet search engines like Google: the top results are the more relevant ones.
Searching
To access QuickSearch, a magnifier button is available in the Quicklinks area in the menu bar at the top-right of every page.
A mouse click on the magnifier button opens the search input and disables all other navigation so that you can focus on what you intend to do: searching.
Search queries look like simple text. But the search engine under the hood of QuickSearch always transforms the given search string (text) into a search query. Search queries support a special syntax, the DSL, for defining more complex queries for better (scored) results.
As always: start simple!
Simple searches
The simplest way to run a search is to pass the text (words, terms) you want to search for:
jekyll
The above will return all documents that match the term jekyll. Searches for multiple terms (words) are also supported. If a document matches at least one of the search terms, it will show in the results. The search terms are combined by a logical OR.
jekyll tutorial
The above example will match documents that contain either jekyll OR tutorial. Documents that contain both score higher and are returned first.
In contrast to a Google search, which combines terms by a logical AND, QuickSearch combines the terms by an OR.
To combine search terms in a QuickSearch query by a logical AND, prepend a plus sign (+) to each term to mark it as required in the QuickSearch query (DSL):
+jekyll +tutorial
Wildcards
QuickSearch supports wildcards when performing searches. A wildcard is represented as an asterisk (*) and can appear anywhere in a search term. For example, the following will match all documents with words beginning with Jek:
jek*
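How a wildcard term expands to the words in an index can be sketched with Python's standard fnmatch module. This only illustrates the matching idea and is not how Lunr resolves wildcards internally. The vocabulary is hypothetical, and fnmatch also treats ? and [] as special characters, which the QuickSearch syntax described here does not.

```python
from fnmatch import fnmatchcase

def expand_wildcard(pattern, vocabulary):
    """Return the indexed terms that match a pattern such as 'jek*'."""
    pattern = pattern.lower()
    return sorted(term for term in vocabulary if fnmatchcase(term, pattern))

# Hypothetical set of terms found in a site index.
vocabulary = {"jekyll", "jekyllone", "generator", "project"}
print(expand_wildcard("jek*", vocabulary))    # ['jekyll', 'jekyllone']
print(expand_wildcard("*ject", vocabulary))   # ['project']
```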
Language grammar rules are not relevant for searches. For simplification, all words (terms) are transformed to lower case. As a result, the word Jekyll is the same as jekyll from a search engine’s perspective. Language variations like Jekyll’s or plurals like Generators are reduced to their base form. For searches, you don’t need to care about grammar rules, only about the spelling. If you’re unsure about the spelling of a word, use wildcards.
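The normalization described in this note can be approximated with a short Python sketch. It is a deliberately crude stand-in for a real stemmer: it only lowercases a word and strips a possessive or a simple plural ending. The function name normalize and its rules are assumptions for illustration and do not reproduce the stemming QuickSearch applies.

```python
import re

def normalize(word):
    """Lowercase a word and strip a possessive or simple plural ending."""
    word = word.lower().replace("’", "'")
    word = re.sub(r"[^a-z0-9']", "", word)      # drop punctuation
    if word.endswith("'s"):
        word = word[:-2]                        # jekyll's -> jekyll
    elif word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        word = word[:-1]                        # generators -> generator
    return word.strip("'")

print(normalize("Jekyll"))       # jekyll
print(normalize("Jekyll’s"))     # jekyll
print(normalize("Generators"))   # generator
```

Because the query and the indexed text pass through the same normalization, Jekyll, jekyll, and Jekyll’s all end up as the same term.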
Fields
By default, Lunr will search all fields in a document for the given query terms, but it is possible to restrict a term to a specific field. The following example searches for the term jekyll in the field title:
title:jekyll
The search term is prefixed with the field’s name, followed by a colon (:). The field must be one of the fields defined when building the index. Unrecognized fields will lead to an error.
Search queries based on fields can be combined with all other term modifiers like wildcards. For example, to search for words beginning with jek in the title AND words matching the wildcard coll* anywhere in a document, the following query can be used:
+title:jek* +coll*
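The structure of such a query can be made visible with a small Python sketch that splits each clause into its parts. The parser is hypothetical and far simpler than Lunr's own query parser. The set of known fields is taken from the fields named in this document, and rejecting unknown fields with a ValueError is an assumption that mirrors the error behavior described above.

```python
KNOWN_FIELDS = {"title", "tagline", "tags", "categories", "description"}

def parse_clause(clause):
    """Split a clause like '+title:jek*' into presence, field, and term."""
    presence = "optional"
    if clause.startswith("+"):
        presence, clause = "required", clause[1:]
    elif clause.startswith("-"):
        presence, clause = "prohibited", clause[1:]
    field = None                      # None means: search all fields
    if ":" in clause:
        field, clause = clause.split(":", 1)
        if field not in KNOWN_FIELDS:
            raise ValueError(f"unrecognized field: {field}")
    return {"presence": presence, "field": field, "term": clause.lower()}

for clause in "+title:jek* +coll*".split():
    print(parse_clause(clause))
```

The first clause is restricted to the field title, while the second has no field and therefore applies to all fields.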
Besides the document body, an intrinsic field used to create the full-text index from the document content, some more specific fields are available for searches.
Name | Description | Examples |
---|---|---|
title | The headline of a document (article, post) | QuickSearch |
tagline | The subtitle of a document (article, post) | full index search |
tags | Tags describe the content of a document. | Roundtrip, QuickSearch |
categories | Categories describe the group of documents a document belongs to. | Search |
description | The description is given by the author for a document. It gives a brief summary of what the document is all about. | QuickSearch is based on the search engine Lunr, fully integrated with J1 Theme … |
Term presence
By default, Lunr combines multiple terms in a search with a logical OR. That is, a search for jekyll collections will match documents that contain jekyll or contain collections or contain both. This behavior is controllable at the term level, i.e., the presence of each term in matching documents can be specified.
By default, each term is optional in a matching document, though a document must have at least one matching term. It is possible to specify that a term must be present in matching documents or that it must be absent in matching documents.
To indicate that a term must be present in matching documents (required), prefix the term with a plus sign (+). To indicate that a term must be absent (not wanted), prefix the term with a minus sign (-).
The example below searches for documents that must contain jekyll and must not contain the word collection:
+jekyll -collection
To simulate a logical AND search for documents that contain the word jekyll AND the word collection, mark both terms as required:
+jekyll +collection
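The presence rules can be summarized in a short Python sketch that decides whether a single document matches a query. It is an illustration under simplifying assumptions, not Lunr's matching code: the function name matches is hypothetical, words are compared literally without stemming or wildcards, and fields are ignored.

```python
def matches(query, text):
    """Check a text against +required, -prohibited, and optional terms."""
    words = set(text.lower().split())
    required, prohibited, optional = [], [], []
    for clause in query.lower().split():
        if clause.startswith("+"):
            required.append(clause[1:])
        elif clause.startswith("-"):
            prohibited.append(clause[1:])
        else:
            optional.append(clause)
    if any(term in words for term in prohibited):
        return False                  # an unwanted term is present
    if not all(term in words for term in required):
        return False                  # a required term is missing
    # With only optional terms, at least one of them has to match.
    if optional and not required:
        return any(term in words for term in optional)
    return True

print(matches("+jekyll -collection", "Jekyll generator plugin"))    # True
print(matches("+jekyll -collection", "Jekyll collection basics"))   # False
print(matches("+jekyll +collection", "Jekyll collection basics"))   # True
```

Marking every term as required yields the AND behavior, while leaving the markers off falls back to the default OR.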
What next
You reached the end of the roundtrip. Hopefully, you enjoyed exploring what J1 can do for your new website. To make things real for your site, go for J1 in a Day. This tutorial guides you through all the steps to set up your environment, manage and build a website, and create content.
It’s a pleasant journey to learn what modern static websites can offer today. Start your journey from here: J1 in a Day.
Have fun!