<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
   <channel>
      <title>Ard Schrijvers</title>
      <link>http://blogs.onehippo.org/ard/</link>
      <description></description>
      <language>en</language>
      <copyright>Copyright 2010</copyright>
      <lastBuildDate>Wed, 28 Jul 2010 17:17:32 +0100</lastBuildDate>
      <generator>http://www.sixapart.com/movabletype/?v=3.35</generator>
      <docs>http://blogs.law.harvard.edu/tech/rss</docs> 

            <item>
         <title>Relevance scoring your search results with the HST Query</title>
         <description><![CDATA[<p>Typically, customers expect an item whose title matches the query to be scored higher than an item that matches the query only in the document's body.</p>

<p>Now, without going into the subtle details of Lucene scoring: by default all text in a Hippo document is indexed with equal weight (assuming you have not configured this in indexing_configuration.xml). This means that a word in the property 'title' is exactly as important as a word in the 'body'. As a result, short documents containing the query term in their body get ranked higher than long documents containing the query term in the title (see http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html at norm(t,d)).</p>
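
<p>To make the norm(t,d) effect concrete, here is a minimal Java sketch. It assumes the length normalization of Lucene's DefaultSimilarity, which is 1/sqrt(number of terms in the field). The class name and the term counts are made up for illustration, and the real score also depends on term frequency, idf and the lossy one-byte encoding of the norm.</p>

```java
// Sketch only: mimics the length normalization of Lucene's DefaultSimilarity,
// lengthNorm = 1 / sqrt(number of terms in the field).
class LengthNormDemo {

    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        int shortDoc = 25;   // hypothetical: short document, query term only in the body
        int longDoc = 2500;  // hypothetical: long document, query term in the title

        System.out.println("short document: " + lengthNorm(shortDoc)); // 0.2
        System.out.println("long document:  " + lengthNorm(longDoc));  // 0.02
    }
}
```

<p>With everything indexed in one field, the short document starts with a ten times larger length factor, no matter which property the match was in.</p>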

<p>This is what customers don't like. An easy fix is to search not only in 'all-text', but to OR this with a search in the 'title'. This results in *much* higher scores for title matches (my tests showed really good results).</p>

<p>How To?</p>

<p>Where you had something like :</p>

<p>    Filter filter = hstQuery.createFilter();<br />
    filter.addContains(".", query);<br />
    hstQuery.setFilter(filter);</p>

<p>change this into</p>

<p>    Filter filter = hstQuery.createFilter();<br />
    hstQuery.setFilter(filter);</p>

<p>    Filter titleFilter = hstQuery.createFilter();<br />
    titleFilter.addContains("@demosite:title", query);</p>

<p>    Filter fullTextFilter = hstQuery.createFilter();<br />
    fullTextFilter.addContains(".", query);</p>

<p>    filter.addOrFilter(titleFilter);<br />
    filter.addOrFilter(fullTextFilter);</p>

<p>Of course, adding an extra property that is also more important, like 'summary', is straightforward, I assume: create one more filter for it and OR it in the same way.</p>
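
<p>The same pattern for any number of properties could look roughly like the sketch below. Note the assumptions: the Filter and HstQuery interfaces here are stand-ins that mirror only the calls used in this post, DemoFilter and DemoQuery are hypothetical in-memory implementations that just record what was built, and '@demosite:summary' is a made-up property name. In a real project you would use the HST classes, whose methods may declare checked exceptions that this sketch omits.</p>

```java
// Sketch only: everything is nested in one class so it runs without the HST on the classpath.
class BoostedSearch {

    // Stand-in types (hypothetical): only the calls used in this post.
    interface Filter {
        void addContains(String scope, String text);
        void addOrFilter(Filter other);
    }

    interface HstQuery {
        Filter createFilter();
        void setFilter(Filter filter);
    }

    // ORs one contains-constraint per scope; "." is the full-text scope used above.
    static Filter buildFilter(HstQuery hstQuery, String query, String... scopes) {
        Filter filter = hstQuery.createFilter();
        for (String scope : scopes) {
            Filter part = hstQuery.createFilter();
            part.addContains(scope, query);
            filter.addOrFilter(part);
        }
        hstQuery.setFilter(filter);
        return filter;
    }

    // Hypothetical in-memory filter that records the expression as readable text.
    static class DemoFilter implements Filter {
        private final StringBuilder expr = new StringBuilder();

        public void addContains(String scope, String text) {
            append("contains(" + scope + ", '" + text + "')");
        }

        public void addOrFilter(Filter other) {
            append("(" + other + ")");
        }

        private void append(String part) {
            if (expr.length() > 0) {
                expr.append(" or ");
            }
            expr.append(part);
        }

        @Override
        public String toString() {
            return expr.toString();
        }
    }

    // Hypothetical in-memory query that remembers the filter set on it.
    static class DemoQuery implements HstQuery {
        Filter current;

        public Filter createFilter() {
            return new DemoFilter();
        }

        public void setFilter(Filter filter) {
            current = filter;
        }
    }

    public static void main(String[] args) {
        DemoQuery hstQuery = new DemoQuery();
        buildFilter(hstQuery, "hippo", "@demosite:title", "@demosite:summary", ".");
        System.out.println(hstQuery.current);
    }
}
```

<p>Every extra scope you pass is one more chance for a document to match, which is what pushes title and summary hits up in the result list.</p>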

<p>Note that this blog did not describe Lucene index-time or query-time boosting. That is something completely different. With query-time boosting you can, for example, say that some specific word in the query is more important: if my search is [ ard^10 is great ], then the word 'ard' is boosted by a factor of ten in scoring.</p>]]></description>
         <link>http://blogs.onehippo.org/ard/2010/07/relevance_scoring_your_search.html</link>
         <guid>http://blogs.onehippo.org/ard/2010/07/relevance_scoring_your_search.html</guid>
         <category></category>
         <pubDate>Wed, 28 Jul 2010 17:17:32 +0100</pubDate>
      </item>
            <item>
         <title>Cocoon is easy</title>
         <description><![CDATA[<p>I just disagree with people who say that cocoon is hard to learn. If you get it explained by somebody who knows how it works, then it simply can't be hard. Perhaps it is hard for people because nobody takes them by the hand and helps them through the first few steps. After all, what is hard about it? You need to know some sitemap matching, and you need to know how to write some xsls. Knowing only xsl, which you can learn in twice the time it took you to learn html, so about 2 days, and knowing sitemaps, you can make yourself valuable for a project. Not only valuable, but already productive. And after some time, you will be able to build sites yourself very fast. I am convinced that development in cocoon outruns any other framework, and when you start with the right patterns and maintain them, maintenance will be easy as well.</p>

<p>So, what is the problem? Why isn't the entire world using cocoon? Were the founders too superior for other people to be able to understand cocoon? Did they have a brilliant vision, but forget about (average) users?</p>

<p>Lately, I have seen many other projects developed in cocoon, and some things frustrated me extremely. The things the sites all had in common were:</p>

<p>1) The content repository was hippo repository, a repository implementing the WebDAV protocol<br />
2) The customer was in control of maintaining their own sitemenu (all in the repository). <br />
3) Standard cocoon components, either standard in cocoon, or standard added by Hippo</p>

<p>As you can see, 1 and 2 are part of the repository, which is the same for every customer (except for custom backend templates), and 3 is part of a jar we include in our cocoon build. </p>

<p>So, that should give a strong enough base to implement sites through some sort of best practice patterns. But, and that is the weakness of cocoon: it is just too flexible. People do not know how to start. I did not look at <a href="https://forge.pronetics.it/svn/scratchpad/blueprint/">blueprint</a> yet, but it is an attempt to help people get started with cocoon. </p>

<p>Now, for building websites with hippo repository and a sitemenu, we discussed the best way to get started and how to structure your pipelines. The result is something incredibly easy to use. It assumes one thing: you have a repository with some data and a genuine sitemenu. When you do have this, you have your site with a best practice pattern up and running (including paging through archives) within minutes. </p>

<p>All you need to do is check out a project.xml and a properties file and install the latest hippo cocoon plugin. You just deploy your site, and start it. After starting, you open your browser on the right domain and port, and start configuring cocoon from there (like which sitemenus to use, whether your urls have a prefix, whether you have cforms, etc). When ready, you have to redeploy, and start your site again. Then, you have the option to auto-generate a site, which is based on the sitemenu structure(s) found in the repository. All configuration is done for you. We use 5 different stores for cocoon, to optimize performance. All of this is configured and arranged in the default auto-created site. </p>

<p>And more important, it is an incredibly simple sitemap pattern <strong>and</strong>, equally important, a strong and correct xsl pattern, built on the import precedence concept. With this functionality, you help everybody starting with cocoon to learn the patterns, enable incredibly fast development, and enable everybody to understand each other's sites. </p>

<p>I hope to have a demo online soon.</p>]]></description>
         <link>http://blogs.onehippo.org/ard/2006/12/cocoon_is_easy.html</link>
         <guid>http://blogs.onehippo.org/ard/2006/12/cocoon_is_easy.html</guid>
         <category>Hippo CMS</category>
         <pubDate>Mon, 04 Dec 2006 16:31:20 +0100</pubDate>
      </item>
            <item>
         <title>just checking....</title>
         <description></description>
         <link>http://blogs.onehippo.org/ard/2006/04/just_checking.html</link>
         <guid>http://blogs.onehippo.org/ard/2006/04/just_checking.html</guid>
         <category></category>
         <pubDate>Wed, 19 Apr 2006 13:18:19 +0100</pubDate>
      </item>
      
   </channel>
</rss>
