Lucene Search Module in Sitecore - Part 1: Installing the Module

Jun 25, 2012
Carlos Martinez

Lucene is an open source search library, supported by the Apache Software Foundation, that Sitecore uses to index and search the contents of a web site. Sitecore wraps the Lucene engine in its own API. Both the original API (Lucene.Net) and the Sitecore API (Sitecore.Search) are available to developers who want to extend their indexing and search capabilities [1].

The Lucene Search module is taken from the Sitecore Starter Kit and made into a separate module [2]. You can download the module from the TRAC Web site. Because it is a part of the Starter Kit, it also uses the Shared Source License.

Installing the Module

To install the Lucene Search module, follow the same procedure as for any other Sitecore package:

  • Go to Sitecore > Development Tools > Installation Wizard.
  • Browse to the location where you downloaded the LuceneSearch-1.1.zip file.
  • Follow the instructions in the wizard.

When you install this module the following files and items are added to your installation:

Files

/bin/LuceneSearch.dll
/images/search.gif
/LuceneSearch.css
/sitecore modules/LuceneSearch/
/sitecore modules/LuceneSearch/CommonText.cs
/sitecore modules/LuceneSearch/LuceneSearchBox.ascx
/sitecore modules/LuceneSearch/LuceneSearchBox.ascx.cs
/sitecore modules/LuceneSearch/LuceneSearchBox.ascx.designer.cs
/sitecore modules/LuceneSearch/LuceneSearchResults.ascx
/sitecore modules/LuceneSearch/LuceneSearchResults.ascx.cs
/sitecore modules/LuceneSearch/LuceneSearchResults.ascx.designer.cs
/sitecore modules/LuceneSearch/SearchManager.cs

Items

/sitecore/Content/Settings/Common Text

Items that allow you to customize the search behavior and the messages presented to the user.

/sitecore/Layout/Sublayouts/LuceneSearch

The two sublayouts needed to provide search functionality on your site.

/sitecore/Content/Home/Standard_Items

Item used to display the search results.

/sitecore/Templates/Starter Kit/Meta-Data

Template for the items in the Common Text folder.

When you install the Lucene Search module, you get two sublayouts, LuceneSearchBox and LuceneSearchResults, to place on your web site. LuceneSearchBox is the search box, which you would typically place near the top of the page; LuceneSearchResults is where the search results are displayed.

The LuceneSearchBox redirects to the content item /sitecore/Content/Home/Standard_Items/Search_Results to display search hits.

If you want to use the default styling, remember to add a reference to the LuceneSearch stylesheet on the same layout where you placed the search results sublayout.
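
As a minimal sketch, assuming LuceneSearch.css stays in the site root where the package installs it (see the file list above), the reference inside the <head> element of that layout could look like this:

<link rel="stylesheet" type="text/css" href="/LuceneSearch.css" />

Adjust the href if you move the stylesheet elsewhere.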

Creating the Index

Sitecore maintains indexes by scanning items in Sitecore databases. Every time you update, create or delete an item Sitecore runs a job that updates the indexes. The process is usually complete by the time you have saved or published an item.

The web database does not have a search index by default, so you will need to create one to enable search functionality on your published site.

Indexes are created in the web.config file under the node /sitecore/search/configuration/indexes.
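
Read as a path, that node corresponds to the nesting sketched below. This is only a skeleton to show where an <index> element goes; any attributes and other children that these elements already carry in your web.config are omitted here and should be left untouched.

<sitecore>
    <search>
        <configuration>
            <indexes>
                <!-- <index> definitions go here -->
            </indexes>
        </configuration>
    </search>
</sitecore>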

The following shows a sample index configuration:

<index id="MySearchIndex"
      type="Sitecore.Search.Index, Sitecore.Kernel">
    <param desc="name">$(id)</param>
    <param desc="folder">__mysearchindex</param>
    <Analyzer ref="search/analyzer"/>
    <locations hint="list:AddCrawler">
        <customindex type="Sitecore.Search.Crawlers.DatabaseCrawler, Sitecore.Kernel">
            <Database>web</Database>
            <Tags>My Custom Tag</Tags>
            <Root>/sitecore/content/Home</Root>
            <include hint="list:IncludeTemplate">
                <template>{TemplateId #1}</template>
                <template>{TemplateId #2}</template>
                <!-- ... -->
                <template>{TemplateId #n}</template>
            </include>
        </customindex>
    </locations>
</index>           

Each index you define has a unique identifier, given in the id attribute of the <index> element.

The two <param> elements set the index name (here $(id), which takes the value of the id attribute) and the folder where the index is stored.

The <Analyzer> element references the analyzer to use, in this sample the one configured under search/analyzer.

The <locations> element defines the locations for the index. It's possible to have multiple locations for one index. It's even possible to have content from different databases in the same index.
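
As an illustration, a <locations> element that combines two locations from different databases might look like the sketch below. The child element names (webcontent, mastercontent) and the tag values are made up for this example, and the elements inside each location are described in the options that follow.

<locations hint="list:AddCrawler">
    <!-- Hypothetical location 1: content from the web database -->
    <webcontent type="Sitecore.Search.Crawlers.DatabaseCrawler, Sitecore.Kernel">
        <Database>web</Database>
        <Tags>web content</Tags>
        <Root>/sitecore/content/Home</Root>
    </webcontent>
    <!-- Hypothetical location 2: the same subtree from the master database -->
    <mastercontent type="Sitecore.Search.Crawlers.DatabaseCrawler, Sitecore.Kernel">
        <Database>master</Database>
        <Tags>master content</Tags>
        <Root>/sitecore/content/Home</Root>
    </mastercontent>
</locations>

Giving each location a different tag is one way to tell the two sources apart when searching.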

Every child of the locations node has its own configuration with the following options:

<Database>

Specify which database you want to index.

<Tags>

You can attach a string tag to items from this location making it possible to filter or categorize results during a search.

<Root>

Specify the root node of the content tree to be included in the index. The indexing crawler will index content below this location.

<include>

Here you can list the templates that should be included in, or excluded from, the index.
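
For example, to leave out items based on particular templates rather than listing the ones to keep, a location could use an <exclude> element. This sketch assumes the crawler exposes an ExcludeTemplate method alongside IncludeTemplate, which you should verify against your Sitecore version; the template ID is a placeholder in the same style as the sample above.

<exclude hint="list:ExcludeTemplate">
    <template>{TemplateId to exclude}</template>
</exclude>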

In addition, Sitecore indexes rely on the history engine (HistoryEngine) to create or update the index when an item is created or updated. To enable this for the web database, add the following lines to the web database definition in the /sitecore/databases section of the web.config file.

<Engines.HistoryEngine.Storage>
    <obj type="Sitecore.Data.$(database).$(database)HistoryStorage, Sitecore.Kernel">
        <param connectionStringName="$(id)"/>
        <EntryLifeTime>30.00:00:00</EntryLifeTime>
    </obj>
</Engines.HistoryEngine.Storage>
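
To show where the fragment goes, here is a sketch of the web database definition with the history storage added. The attributes on the <database> element are what a typical configuration contains and may differ in yours; the existing child elements are left out and should stay as they are.

<database id="web" singleInstance="true" type="Sitecore.Data.Database, Sitecore.Kernel">
    <!-- existing parameters and child elements of the web database stay unchanged -->
    <Engines.HistoryEngine.Storage>
        <obj type="Sitecore.Data.$(database).$(database)HistoryStorage, Sitecore.Kernel">
            <param connectionStringName="$(id)"/>
            <EntryLifeTime>30.00:00:00</EntryLifeTime>
        </obj>
    </Engines.HistoryEngine.Storage>
</database>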

Stay tuned for Part Two of this Lucene Search Module series that will focus on how to use this module.

 

REFERENCES

[1] www.sdn.sitecore.net/Reference
[2] http://trac.sitecore.net/LuceneSearch/