BLOG: Web Content Management

Welcome to Oshyn’s Web Content Management Blog where our experts discuss the latest developments and best practices in the Content Management industry with a focus on several leading platforms: Drupal, EPiServer, Jahia, Open Text and Sitecore.

Jahia Custom Searches

Juan Pablo Albuja... - Wednesday, January 27, 2010

Jahia, like other powerful Java-based CMSs such as Alfresco, uses the industry-standard open source Lucene index engine. Lucene is a Java-based search engine for applications that require full-text search. This engine can perform basic and complex searches over the data stored in the Jahia system.


If you are new to template development in Jahia and to implementing basic searches, I strongly recommend you read my Jahia posts “Jahia WCM Quick Review: Maven, Templates and Navigation” and “Jahia Search in the Enterprise”.


The metadata and keywords defined in the Jahia system can be indexed by Lucene in order to perform complex queries. In a Jahia installation, Lucene indexes are stored in the folder /WEB-INF/var/search_indexes/ and can be explored with a Java tool named Luke. This tool can also be used to run test searches over the indexed data. The following picture shows what this tool looks like. All the indexed fields are in the left column, and their values are in the right column.



With this tool, we can find the names of the fields to use in the search query.


Jahia has an administration tool that lets you manage the search engines. From here you can re-index your site if needed. As the following picture shows, you just need to click the “Next step” button to perform a full site re-index.



Metadata can be used to filter information in queries, and you can assign metadata values to pages or containers from the edit mode. Notice in the following picture that from this screen you can assign values to metadata such as Keywords, Categories and Description. The other metadata fields can’t be modified because they are read-only.



Note: If you want to know the names of the indexed metadata fields, you can use Luke.


Query Implementation


Suppose you have already implemented the search form described in my Jahia post “Jahia Search in the Enterprise”. Now the idea is to create a custom query for a given requirement: a weighted query in which a page scores higher if the searched terms appear in its title. So a page with the searched terms in its title will get a higher score than a page that has them only in its content.


If the term “Andromeda Galaxy” is searched, we need to transform that term into the query:


jahia.title:"Andromeda Galaxy"^9 OR (jahia.title:Andromeda^5) OR (jahia.title:Galaxy^5) OR (jahia.containerfield_my_templates_generictext_inserttext:Andromeda^1) OR (jahia.containerfield_my_templates_generictext_inserttext:Galaxy^1)


We are using the index jahia.title, which contains the titles of the pages, and the index jahia.containerfield_my_templates_generictext_inserttext, a specific field defined in the .cnd file that contains the text of the pages. In this query, a weight of 9 is given when the complete phrase appears in the title, a weight of 5 when a single search term appears in the title, and a weight of 1 when a term appears in the body.


This string manipulation can be done by creating a Java function in the template set project, so you can create custom queries depending on your needs. For more information about the available query operators, you can check here.
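As an illustration of the string manipulation described above, here is a minimal sketch of such a function. It is not the code used in the original project: the class name WeightedQueryBuilder, the method names and the constants are hypothetical, and the two field names are simply copied from the example query, so yours will depend on your own .cnd definitions. The sketch also escapes Lucene's special query characters by hand, which is an assumption about what user input needs, not something the requirement above asks for.

```java
// Sketch only: class, method and constant names are hypothetical, not part of the Jahia API.
final class WeightedQueryBuilder {

    // Field names copied from the example query; adjust them to your own index.
    static final String TITLE_FIELD = "jahia.title";
    static final String BODY_FIELD = "jahia.containerfield_my_templates_generictext_inserttext";

    // Weights from the example: phrase in title, single term in title, single term in body.
    static final int PHRASE_IN_TITLE_BOOST = 9;
    static final int TERM_IN_TITLE_BOOST = 5;
    static final int TERM_IN_BODY_BOOST = 1;

    private WeightedQueryBuilder() {
    }

    /** Builds the weighted Lucene query string for the text typed by the user. */
    static String build(String searchText) {
        String trimmed = searchText == null ? "" : searchText.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        String[] terms = trimmed.split("\\s+");

        StringBuilder query = new StringBuilder();
        // Highest weight: the complete phrase appears in the title.
        query.append(TITLE_FIELD).append(":\"")
             .append(escape(String.join(" ", terms)))
             .append("\"^").append(PHRASE_IN_TITLE_BOOST);
        // Medium weight: any single term appears in the title.
        for (String term : terms) {
            appendClause(query, TITLE_FIELD, term, TERM_IN_TITLE_BOOST);
        }
        // Lowest weight: any single term appears in the body text.
        for (String term : terms) {
            appendClause(query, BODY_FIELD, term, TERM_IN_BODY_BOOST);
        }
        return query.toString();
    }

    private static void appendClause(StringBuilder query, String field, String term, int boost) {
        query.append(" OR (").append(field).append(':')
             .append(escape(term)).append('^').append(boost).append(')');
    }

    /** Backslash-escapes characters that have a special meaning in Lucene query syntax. */
    static String escape(String text) {
        StringBuilder escaped = new StringBuilder();
        for (char c : text.toCharArray()) {
            if ("+-&|!(){}[]^\"~*?:\\".indexOf(c) >= 0) {
                escaped.append('\\');
            }
            escaped.append(c);
        }
        return escaped.toString();
    }

    public static void main(String[] args) {
        // Prints the query string shown earlier for this search term.
        System.out.println(build("Andromeda Galaxy"));
    }
}
```

Keeping the weights as constants makes it easy to tune the ranking later without touching the string-building logic. For a single-word search the phrase clause and the title clause both match the same pages, which is harmless but could be simplified if you prefer.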


The following screen shows the results for this query. The first hit has the phrase “Andromeda Galaxy” in its title, the second hit has the word “Andromeda” in its title, and the rest have the word “Galaxy” or “Andromeda” in their body.


2010 Copyright Oshyn. All rights reserved.