Scheduled publish in a clustered environment

Hi,

We have an issue with scheduled publishes where we see concurrent modification exceptions. In some cases this corrupts a node in the repository, leaving us unable to link to that page in the CMS. We run Bloomreach in our own AWS environment with two site nodes and two CMS nodes.

We disable scheduled jobs on the delivery nodes, as described in “Disabling Scheduled Jobs on the Delivery Nodes”.

I have also disabled all event listeners and custom validation to rule those out as a potential cause, but I still see the issue.

We see two types of exception. The first is:
Execution of scheduled workflow operation publish on /content/documents/mygov/browse/aaaaaa failed
org.hippoecm.repository.api.WorkflowException: Concurrent workflow action detected
at org.onehippo.repository.documentworkflow.task.CopyVariantTask.deleteDuplicateVariant(CopyVariantTask.java:158) ~[hippo-repository-workflow-14.3.2.jar:14.3.2]

and

Execution of scheduled workflow operation publish on /content/documents/mygov/browse/guide1/six failed
org.hippoecm.repository.api.WorkflowException: Unable to update a stale item: item.save()

It is quite easy to reproduce this by scheduling the publication of multiple items at the same time. I am pretty stumped by this and would welcome any guidance on how we might get to the bottom of it!

Thanks,

David