Solr Integration + indexing Channel Manager pages

#1

Hi all,

I am trying to develop search functionality for a Hippo-powered website.

The key requirements are:

  1. Pages are created and edited in the Channel Manager using containers, catalog components, etc.
  2. There is a search tool that returns pages whose content matches the query terms.

While the Solr integration indexes documents very naturally, indexing Channel Manager pages doesn’t seem to be expected behavior. Still, I was able to pull the important bits of content from the workspace JCR nodes for a given page and create a custom ContentBean to index. So far so good.

My problem is that I also need to update my index when Channel Manager changes are published. I am able to capture Channel Manager events using the Hippo Event Bus. However, I don’t see any way to tell, during the publication event, which pages have been changed.

Code snippet:

public void onHippoEvent(HippoEvent event) {
    if ("channel-manager".equals(event.category()) &&
        "publishMount".equals(event.action())) {
        List<Node> updatedPages = ???
        for (Node pageNode : updatedPages) {
            // SolrIndexItem is a custom class implementing 
            // ContentBean and designed to hold all the information 
            // I need from a Channel Manager page 
            SolrIndexItem bean = getBeanForChannelManagerPage(pageNode);
            solrClient.getSolrServer().addBean(bean);
        }
        solrClient.getSolrServer().commit();
    }
}

One option I see is re-indexing the whole channel, but I would prefer not to do that, as I see it becoming expensive if there are many Channel Manager pages.

Is there a good way to obtain changed pages in the Channel Manager, or do I need to re-think my approach to the problem?

#2

I think you should receive more HippoEvent events than just the “publishMount” one when you publish the channel changes.
Can you debug to verify that?

#3

Thank you for the direction. I failed to notice that the “write changes” event is triggered when the channel is published. I had only seen it happening when changes were saved, before the channel was published.

By monitoring for the “write changes” event, and then ignoring preview channel changes and changes to non-page content (e.g. sitemap nodes, which change when a new page is added), I was able to isolate the data I needed.
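
For anyone with the same problem, here is a minimal sketch of that filtering step, not my exact code. It assumes the changed node paths have already been pulled out of the event as plain strings (how to do that is not shown here), that the preview configuration is the one whose name ends in "-preview", and that page nodes live under hst:workspace/hst:pages. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: reduces a list of changed JCR paths to the live pages they belong to.
final class ChangedPageFilter {

    // Assumed repository layout; adjust to match your own configuration.
    private static final String CONFIG_ROOT = "/hst:hst/hst:configurations/";
    private static final String PAGES_SEGMENT = "/hst:workspace/hst:pages/";
    private static final String PREVIEW_SUFFIX = "-preview";

    private ChangedPageFilter() {
    }

    /** Returns the path of the live page containing changedPath, or null if it should be ignored. */
    static String livePageRoot(String changedPath) {
        if (changedPath == null || !changedPath.startsWith(CONFIG_ROOT)) {
            return null; // not an HST configuration change
        }
        int nameEnd = changedPath.indexOf('/', CONFIG_ROOT.length());
        if (nameEnd < 0) {
            return null; // the configuration node itself
        }
        String configName = changedPath.substring(CONFIG_ROOT.length(), nameEnd);
        if (configName.endsWith(PREVIEW_SUFFIX)) {
            return null; // preview channel change
        }
        if (!changedPath.startsWith(PAGES_SEGMENT, nameEnd)) {
            return null; // non-page content, e.g. sitemap nodes
        }
        int pageStart = nameEnd + PAGES_SEGMENT.length();
        if (pageStart >= changedPath.length()) {
            return null; // the pages folder itself, no page name
        }
        // Cut the path back to the page node so container and component changes map to their page.
        int pageEnd = changedPath.indexOf('/', pageStart);
        return pageEnd < 0 ? changedPath : changedPath.substring(0, pageEnd);
    }

    /** Returns the distinct live page paths touched by the given changes, in first-seen order. */
    static List<String> livePageRoots(List<String> changedPaths) {
        Set<String> roots = new LinkedHashSet<>();
        for (String path : changedPaths) {
            String root = livePageRoot(path);
            if (root != null) {
                roots.add(root);
            }
        }
        return new ArrayList<>(roots);
    }
}
```

Each path returned by livePageRoots can then be resolved to a JCR node and passed through getBeanForChannelManagerPage as in my first post, so only the pages that actually changed are re-indexed.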