HippoCMS with Redis instead of file storage

#1

Hello,

I’m wondering if it’s possible to replace Lucene’s filesystem storage with something else, such as an in-memory cache like Redis. Creating the Lucene index (the storage folder) takes a long time during warmup for bigger websites. In some cases I have seen a 40-minute warmup caused by indexing. I think moving the index to Redis might speed up startup.

Is this possible at all? I found that there is a DirectoryManager class, and that we can provide our own implementation of getDirectoryManager() to override the default behaviour and return our own manager implementation.

Do you have any experience with this? Do you think it’s possible, and worth the effort, to keep this index in memory to speed up startup and to share the index between instances in a cluster?

Maybe this is available out of the box in Lucene and I just can’t find it?

Thank you very much in advance for any help/tips.

#2

There’s a Lucene index export add-on (an enterprise offering) that lets you download an index export from one of the instances in the cluster:
https://documentation.bloomreach.com/library/enterprise/enterprise-features/lucene-index-export/lucene-index-export.html

You could use this export to start up faster.

#3

Indexing is a one-time operation; you don’t need to reindex after each restart.

#4

I know, but this system runs in the cloud, and instances can be removed or added at any time.

#5

As Baris already mentioned, we solve that problem with a Lucene index backup and restore solution (in the enterprise stack).

I once wrote some utility scripts to back up and restore the Lucene index easily in that scenario, using the enterprise Lucene export endpoint:

Its dockerization is probably deprecated now, since v13 ships its own Docker support at the product level, but the script (*.sh) files might still be useful to download/back up, restore, or re-initialize the Lucene index folder on startup (when invoked by setenv.sh).
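To illustrate the restore-on-startup idea, here is a minimal sketch in plain Java rather than shell. It is not the scripts mentioned above. It assumes the index export has already been downloaded as a zip archive, and the class name IndexRestore and both paths passed on the command line are hypothetical. It unpacks the archive only when the index folder is missing or empty, so an existing index is never overwritten.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper: restores a previously downloaded index export (assumed
// to be a zip archive) into the index folder before the repository starts.
public class IndexRestore {

    /** Returns true if dir does not exist, is not a directory, or has no entries. */
    public static boolean isMissingOrEmpty(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return true;
        }
        try (Stream<Path> entries = Files.list(dir)) {
            return !entries.findAny().isPresent();
        }
    }

    /**
     * Unpacks exportZip into indexDir, but only when indexDir is missing or
     * empty. Returns true if a restore happened, false if it was skipped.
     */
    public static boolean restoreIfMissing(Path exportZip, Path indexDir) throws IOException {
        if (!isMissingOrEmpty(indexDir)) {
            return false; // an index is already present, leave it alone
        }
        Files.createDirectories(indexDir);
        Path root = indexDir.toAbsolutePath().normalize();
        try (ZipInputStream zip = new ZipInputStream(Files.newInputStream(exportZip))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                Path target = root.resolve(entry.getName()).normalize();
                // Reject entries such as "../x" that would land outside indexDir.
                if (!target.startsWith(root)) {
                    throw new IOException("Entry escapes target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(target);
                } else {
                    Files.createDirectories(target.getParent());
                    Files.copy(zip, target); // reads up to the end of the current entry
                }
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: IndexRestore <export.zip> <index-dir>");
            System.exit(2);
        }
        boolean restored = restoreIfMissing(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(restored ? "Index restored from export" : "Existing index kept");
    }
}
```

Something like this would have to run before the application starts (for example from setenv.sh), so that a freshly added instance finds an index on disk instead of rebuilding it during warmup.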

In Lucene 3.6, which Jackrabbit currently depends on, I see only FSDirectory and RAMDirectory as the essential Directory implementations. Each of those is mapped to a Jackrabbit manager implementation: FSDirectoryManager or RAMDirectoryManager.
So it doesn’t seem that Redis is supported even at the Lucene (3.6) level, which means it would be very difficult.

According to the warning in the JavaDoc of RAMDirectory, it doesn’t sound like it is meant for normal production environments:

Maybe in the latest version, but not in v3.6 as far as I can see.

As Apache Jackrabbit v2 is in maintenance mode, with work mostly done for API compatibility with Apache Jackrabbit Oak, I don’t expect a big new feature or improvement in that area in Jackrabbit v2.

Regards,

Woonsan

#6

Thanks, I’ll check whether we can migrate to Enterprise.