Virtual host configuration and bootstrapping / auto-export behavior

I’ve been semi-successful in managing virtual hosts in the brXM/Hippo platform. By modifying the existing .yaml files that contain the virtual host information, I’ve been able to get changes (and in some cases new hosts) applied to the environments in which the project is deployed and the content and configuration are bootstrapped.

The problem, however, is that when I change virtual hosts or add new virtual host groups through the CMS Console with the auto-export feature enabled, I see what seems to be strange behavior: the new virtual host configuration is written to files under hcm-content, for example repository-data\application\src\main\resources\hcm-content\hst\hosts\www.example.com.yaml.

Reasons this seems to be a concern:

  • From all that I’ve read, virtual host setup should be considered “configuration” and not “content”, and the Bloomreach documentation says in various places that there is supposed to be “strict separation” between configuration and content, so this immediately looks like an issue to me.
  • All of our current host configuration is stored, seemingly redundantly, in two different places (repository-data\application\src\main\resources\hcm-config\hst\hosts.yaml and repository-data\application\src\main\resources\hcm-config\hst\hosts\www.example.com.yaml), so the content-oriented directory above would be a possible third redundant place where virtual host information is expressed in the project’s source code.
  • The files that are being created by the auto-export module do not have the typical definitions: and config: nodes at the top of the file like I’d expect based on the documentation about configuration in general.
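To make the third point concrete, here is a rough sketch of the two file shapes as I understand them. The first is a config definition of the kind found under hcm-config. The second is what a content source under hcm-content tends to look like, with the node path as the single top-level key and no definitions: or config: wrapper. The node and its properties are illustrative only, not copied from an actual export.

```yaml
# Shape 1: config definition, e.g. hcm-config/hst/hosts/www.example.com.yaml
definitions:
  config:
    /hst:hst/hst:hosts/www.example.com:
      jcr:primaryType: hst:virtualhostgroup
---
# Shape 2: content source, e.g. hcm-content/hst/hosts/www.example.com.yaml
# Assumption: the file starts directly at the root node path of the content.
/hst:hst/hst:hosts/www.example.com:
  jcr:primaryType: hst:virtualhostgroup
```

If that reading is right, the missing wrapper is simply how content sources are serialized rather than a sign of a corrupted export.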

Part of the reason why bootstrapping has been difficult is that it’s unclear whether there is:

  1. a problem with the auto-export module in general.
  2. a problem within our configuration which is preventing the auto-export module from working correctly.
  3. an issue in how our established virtual host configuration is set up, in which case the auto-export module may be correct in storing virtual host configuration under the hcm-content subdirectories and the hcm-config files should be removed from application-data.
  4. a problem in our existing hosts.yaml and individual host yaml files (ex. www.example.com.yaml) which were originally migrated from .xml and could be a problem all on their own with regard to bootstrapping regardless of the behavior of the auto-export module.

The confusing thing in general with the documentation on the site is that the example snippets don’t always match up 1:1 with how the actual content or configuration files look after being auto-exported to the file system for version control. For example, some of the examples show all of the hosts configured in a single .yaml file. That seems to match the intent of the hosts.yaml file that we have, but it’s not called out specifically in the documentation. In our case we also have what appear to be individual hst:platform configurations stored under repository-data\application\src\main\resources\hcm-config\hst\hosts\, which I would guess represent the platform configurations normally found in the console under hst:platform in v13+. Even though the directory structure may not be relevant to how configuration files are bootstrapped, it’s not clear whether our current organization is effectively the same as what the auto-export feature would produce, since I can’t yet verify whether auto-export is working correctly in the first place.

Besides the rambling above I suppose it would be easier in some ways to simply ask whether there is a simple best practice or convention on how these virtual host configuration files should be organized in the file system for a typical project? And the secondary question I suppose is what is the expected behavior of the auto-export feature with regard to host configuration and the layout of those configuration files? Based on answers to these two questions I should be able to determine whether there is something specific in our project which is causing us additional problems.

After digging a bit more into the auto-export behavior described above, I found what seems to be a related setting in the autoexport-module.yaml file under repository-data\application\src\main\resources\hcm-config\configuration\modules\autoexport-module.yaml. In this file our project has an autoexport:injectresidualchildnodecategory property whose collection of values includes the following:

'/hst:hst/hst:hosts: content'

Reading up on this on the Automatic Export Add-On page, the Configuration Options section states the following:

A few overrides are added into the configuration model by default, but you can adjust this if your project or workflow so requires. As an example: the default for .meta:residual-child-node-category for the node /hst:hst/hst:configurations is content. This is because new channels created from blueprints should be classified as content rather than config. However, during local development when you add a new channel it is more likely those nodes and properties should go into the config tree. For this reason, the value /hst:hst/hst:configurations: config is added by default through the hst configuration model.

I’ve read the above documentation a few times now and I’m still trying to wrap my head around what it means. In the same paragraph it says that “the default for .meta:residual-child-node-category for the node /hst:hst/hst:configurations is content”, and then this seems to be contradicted by the later statement that “the value /hst:hst/hst:configurations: config is added by default”. From what I can tell by looking at a brand new project created from the v13 archetype, it only produces a very simple and minimal autoexport-module.yaml file.

New autoexport-module.yaml from vanilla project created from v13 archetype:

definitions:
  config:
    /hippo:configuration/hippo:modules/autoexport:
      /hippo:moduleconfig:
        autoexport:modules: ['repository-data/application:/',
                             'repository-data/site:hippov13:/hst:hippov13']

The autoexport-module.yaml configuration from the problematic project noted above:

definitions:
  config:
    /hippo:configuration/hippo:modules/autoexport:
      /hippo:moduleconfig:
        autoexport:excluded:
          operation: override
          type: string
          value: ['/hst:hst/hst:configurations/*-preview', '/hst:hst/hst:configurations/*-preview/**',
            '/hst:hst/hst:configurations/*-v*', '/hst:hst/hst:configurations/*-v*/**',
            '/hst:hst/hst:sites/*/hst:version', '/hst:hst/hst:channels/*-preview',
            '/hst:hst/hst:channels/*-preview/**', /webfiles/**, '/**/hippo:lockExpirationTime',
            '/**/jcr:lockOwner', '/**/jcr:lockIsDeep',
            '/hippo:configuration/hippo:modules/broadcast/**', /collections, /collections/**,
            '/targeting:targeting/targeting:dataflow/targeting:lock/**', '/targeting:targeting/targeting:statistics/**',
            '/targeting:targeting/targeting:dataflow/**', '/hippo:configuration/hippo:users/**']
        autoexport:modules: ['repository-data/development:/content', 'repository-data/application:/']
        autoexport:injectresidualchildnodecategory: ['**/hst:workspace/hst:containers: content',
          '**/hst:workspace/hst:sitemenus: content',
          '**/hst:workspace/hst:abstractpages: content',
          '**/hst:workspace/hst:channel: content',
          '**/hst:workspace/hst:components: content',
          '**/hst:workspace/hst:pages: content',
          '**/hst:workspace/hst:sitemap: content',
          '**/hst:workspace/hst:templates: content',
          '/hst:hst/hst:hosts: content',
          '/targeting:targeting/targeting:characteristics/**[targeting:characteristic]: content']
        autoexport:overrideresidualchildnodecategory: ['/hst:hst/hst:configurations: config']

I’m going to dig around more in the documentation as well as source change history for my project to see if I can understand where this additional configuration above was added and determine whether the current behavior is intentional, desirable, or perhaps even best practice or if it even has any bearing at all on the auto-export behavior that I noted in my original post. Any input / advice would be welcome.

From what I can tell, the /hst:hst/hst:hosts: content value was added to the autoexport:injectresidualchildnodecategory property during a v12 upgrade some years ago, without much of an explanation. After removing this value, rebuilding, and running the project again, I see the same auto-export behavior as before: creating a new hst:virtualhostgroup named www.example.com under /hst:hst/hst:hosts through the CMS Console and then writing to the repository results in a new file exported to repository-data\application\src\main\resources\hcm-content\hst\hosts\www.example.com.yaml. My expectation is that it should have added the node to hosts.yaml, where the other virtual hosts are currently defined in our project, but I still haven’t confirmed whether that is truly the expected behavior or best practice in a standard project.

Hi,

Our best practice is to not modify yaml files by hand when using autoexport. Make the changes in the console. Otherwise you may get conflicts between autoexport behavior and what you do. Autoexport has to structure the files in some way, but it is possible to do this in many ways. Additionally, if using both then your changes to files might get overwritten by changes from autoexport. Do one or the other but not both.

As for the overrides you mentioned. The default for ‘/hst:hst/hst:configurations’ is that children are content, as channels can be created within the experience manager from blueprints. As content it will never be changed by a deploy. However, you may not use blueprints at all, or have some reason to load channels from config. Channels created during development are more properly configuration. As such changes made during development need to be propagated to server environments. So auto-export is set to create yaml files as config.

Unless your intent is to manage channels as content, then I would leave them as config.

Ok, after testing the auto-export behavior in a vanilla v13 project, I’ve confirmed that creating an additional virtual host group www.example.com through the CMS Console and writing it to the repository created the following file.

repository-data\site\src\main\resources\hcm-config\hst\hosts\www.example.com.yaml

definitions:
  config:
    /hst:hst/hst:hosts/www.example.com:
      .meta:residual-child-node-category: content
      jcr:primaryType: hst:virtualhostgroup

This is clearly different behavior from our project and definitely matches more closely to the expectations that I had originally. I’ll have to dig deeper to figure out what is causing the strange content export behavior described in a previous post as it seems to be specific to our project. It still baffles me that removing the /hst:hst/hst:hosts: content value from the set of values under autoexport:injectresidualchildnodecategory property in autoexport-module.yaml didn’t seem to impact this behavior. I’ll have to retest this scenario because perhaps I missed something.

Thank you @jasper.floor for looking at this and for your response. Below are my responses to some of your points.

Our best practice is to not modify yaml files by hand when using autoexport

Yes, I don’t plan on modifying any of the .yaml files by hand if I don’t have to. Currently the auto-export module doesn’t seem to function properly: it’s creating serialized content files when I modify configuration, whereas I’d expect (and confirmed in a fresh project) that by default it should be creating configuration files under repository-data\site\src\main\resources\hcm-config\hst\hosts\.

As for the overrides you mentioned. The default for ‘/hst:hst/hst:configurations’ is that children are content, as channels can be created within the experience manager from blueprints.

At face value it seems like the concept of treating configurations as content is a violation of the strict separation concept mentioned in several places of the Bloomreach documentation. Perhaps this strict separation is a bit less strict depending on whether it is local development or other potential scenarios though :man_shrugging:.

As content it will never be changed by a deploy.

In my understanding, content can be changed by a deploy, either by bootstrapping new content or by writing a custom action to reload or delete specific nodes. If this understanding is correct, then content does get changed by a deploy. I believe, however, that by default existing content isn’t overwritten by content that would otherwise be bootstrapped, unless there are specific customizations.
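For reference, the kind of custom action I have in mind is declared in an hcm-actions.yaml file at the root of a module’s resources. A minimal sketch, assuming a module version key of 1.1 and using a hosts node from our project purely as an example path; the exact rules should be checked against the documentation:

```yaml
# hcm-actions.yaml (sketch; the version key and node path are assumptions)
action-lists:
  - 1.1:
      # reload: remove the existing node and bootstrap it again from the
      # content source; the other action I know of is "delete"
      /hst:hst/hst:hosts/www.example.com: reload
```

As I understand it, each action list runs once per environment, for the version it is keyed under, which is why such a change counts as explicit.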

However, you may not use blueprints at all, or have some reason to load channels from config.

Our project is currently set up as a single channel and there are no plans to expand beyond this.

Channels created during development are more properly configuration. As such changes made during development need to be propagated to server environments. So auto-export is set to create yaml files as config.

Unless your intent is to manage channels as content, then I would leave them as config.

To be honest the points here are a little lost on me only because the terms “content” and “config” seem like extremely overloaded terms. Another point is that I’m not sure that I’m really focusing on the concept of “channels” right now but instead the concept of virtual host configuration. When looking into multi channel scenarios in the past the concept of blueprints and some of the nuances on them sort of made sense but at the moment that isn’t my concern as we have a single channel setup, that is of course unless I should be concerned but just haven’t connected the dots yet.

Ok, I just realized why channels may be coming into play here. I just noticed that in the fresh v13 project the virtualhostgroup was created in a directory under repository-data\site instead of repository-data\application. In our project the site-specific modules were never created as part of the v13 upgrade, in order to avoid additional complexity after a very rough upgrade. I’m not sure whether these modules will need to be added in order to support the auto-export functionality in v13+, but I can start to see why the discussion of channels may have come up if these modules are normally channel specific (I’m not sure whether they typically are).

Right, so there is a strict separation between how content and configuration are handled, but when it comes to deciding what is content and what is config things can get complicated.

You are right that content can be changed on a deploy by adding new content or creating specific actions to force an update. Even there you have to understand when something is seen as new content and when it isn’t. The rules are explained at [1]. But it is simpler for me to say that content isn’t updated by a deploy. I apologize for the simplification; that has no place in a technical discussion.

The question of whether /hst:hst/hst:hosts is config or content is one of those places where content and configuration are mixed. Host configuration created on a live system should be content as then they are safe from accidental changes done on a deploy. Any change will be explicit as you have to add a specific content action that will only be performed once. Host configuration coming from the project is most likely intended as configuration managed by the project. Changes made in production shouldn’t happen or can be seen as hotfixes that need to be included in the project. This allows further development to make changes to that channel. It is absolutely possible to have both in your system.

If your autoexport is not behaving as it is in a fresh project, and that is what you wish, then you should compare the autoexport configurations found at ‘/hippo:configuration/hippo:modules/autoexport/hippo:moduleconfig’.

Autoexport also has specific patterns for how it creates files and directories. If you depart from these, autoexport can mostly adjust, but I can’t guarantee that. So the structure may be different in an existing project compared to a fresh project.

[1] Manage Content - Bloomreach Experience Manager - The Fast and Flexible Headless CMS

Understandable. Likely fixable but that does require a good understanding of the mechanics.

Thanks for the clarifications here, as they help to reinforce my understanding from the documentation. I plan, as you suggest, to compare the fresh project’s /hippo:configuration/hippo:modules/autoexport/hippo:moduleconfig configuration with the one in our project to see if I can determine what is causing the difference in behavior. If it comes down to the lack of the site modules then I may need to go the route of introducing those as well, but I’ll hold off until I exhaust other potential paths.

Alright, some more progress here. After adding the following change to /hippo:configuration/hippo:modules/autoexport/hippo:moduleconfig, exporting the change, rebuilding, and restarting the application locally, both new virtualhostgroup nodes and existing virtualhostgroup nodes that had previously been mapped and exported as content were exported as individual yaml files under repository-data\application\src\main\resources\hcm-config\hst\hosts\ as expected.
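For clarity, the change amounts to one extra value on autoexport:overrideresidualchildnodecategory. Roughly, and assuming the rest of the module configuration stays as shown earlier:

```yaml
definitions:
  config:
    /hippo:configuration/hippo:modules/autoexport:
      /hippo:moduleconfig:
        # The /hst:hst/hst:configurations entry was already present;
        # the /hst:hst/hst:hosts entry is the addition.
        autoexport:overrideresidualchildnodecategory: ['/hst:hst/hst:configurations: config',
          '/hst:hst/hst:hosts: config']
```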

Interestingly I noticed that this has the potential to collide with configurations for nodes under /hst:platform/hst:hosts as these nodes also seem to be auto-exported into the same location… not sure how that will resolve but I’m guessing that the collision will be handled by the difference in the actual config path in the file (ex. /hst:hst/hst:hosts/ vs /hst:platform/hst:hosts/).

It’s still not clear to me why I was only able to get this out-of-the-box auto-export behavior to work after adding the overrideresidualchildnodecategory value of /hst:hst/hst:hosts: config to the auto-export configuration, as this is not how the fresh v13 project is configured and it still has the expected behavior. Perhaps the clue has something to do with the term “residual”, because come to think of it I’m not sure I fully grasp what this property is trying to convey. Take “child node category”: the category in which child nodes will be considered, presumably for the purpose of the auto-export feature. Then “override residual”: where does the residue come from, where is it stored, and where can I see it to know that it is there? Perhaps the fact that children of /hst:hst/hst:hosts were being categorized as content was something “residual” that wasn’t easily visible through the CMS Console. I should note that all of these attempts were done while pointing local development at a copy of our production content database, so perhaps that’s where the residual setting came from?

So far it’s still not clear, but with the above change I may have found at least a possible workaround to get our project to auto-export virtual host configuration, so that we can propagate those changes to our various environments on future deploys as we had originally expected to, in line with what I assume to be the standard project setup and workflow for virtual hosts.

Oh, ok. Things are starting to click now. Initially we had '/hst:hst/hst:hosts: content' under autoexport:injectresidualchildnodecategory, so the setting was “injected” to be “residual” for that node. After removing this configuration it was presumably still “residual”. Subsequently adding '/hst:hst/hst:hosts: config' under autoexport:overrideresidualchildnodecategory seemed to correct the issue because it’s “overriding” the “residual” setting. If my logic, based on inference from naming, discussion, and documentation, is correct, then it leads to two new questions:

  1. When overriding the residual child node category does that also set the overridden setting in a way where it is residual? In other words if I remove this configuration at a later time after an initial bootstrapping will the “category” still stick around for future changes to those child nodes?
  2. By instead changing our existing '/hst:hst/hst:hosts: content' value to '/hst:hst/hst:hosts: config' under autoexport:injectresidualchildnodecategory, would we be able to effectively reset the residual category to the original behavior, as if we had never set this explicitly? There doesn’t appear to be an option to reset or clear the residual category, so perhaps this would be the closest option to doing that.
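Side by side, the two alternatives in these questions would look roughly like this in autoexport-module.yaml. This is only a sketch: the other values of both properties are omitted, and the two options are meant as either/or rather than to be combined.

```yaml
definitions:
  config:
    /hippo:configuration/hippo:modules/autoexport:
      /hippo:moduleconfig:
        # Option A (question 2): change the injected category for hosts
        autoexport:injectresidualchildnodecategory: ['/hst:hst/hst:hosts: config']
        # Option B (the current workaround): override the category for hosts
        autoexport:overrideresidualchildnodecategory: ['/hst:hst/hst:hosts: config']
```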

So the experiment to reset everything and instead change the '/hst:hst/hst:hosts: content' value to '/hst:hst/hst:hosts: config' under autoexport:injectresidualchildnodecategory didn’t have the expected effect. Re-reading now with more context and understanding of how these concepts work, I recall reading somewhere that .meta properties may be a configuration-file-only concept and do not exist in the repository. If that is the case, the concepts of “inject” and “override” make more sense: injection may refer to injecting extra properties when the nodes are exported (serialized) to .yaml files. Following this train of thought, I also realize now that the default hosts.yaml file from the fresh v13 project and the hosts.yaml file from my project both have .meta:residual-child-node-category: content littered all over.

Ex.

definitions:
  config:
    /hst:hst/hst:hosts/dev-localhost:
      .meta:residual-child-node-category: content
      jcr:primaryType: hst:virtualhostgroup
      hst:defaultport: 8080
      /localhost:
        .meta:residual-child-node-category: content
        jcr:primaryType: hst:virtualhost
        /hst:root:
          .meta:residual-child-node-category: content
          jcr:primaryType: hst:mount
          hst:homepage: root
          hst:mountpoint: /hst:hippov13/hst:sites/hippov13

This makes the behavior I’ve been investigating even stranger: the defaults for most of the child nodes within the virtual host configuration are all set to content instead of config, yet in the fresh v13 project auto-export creates config files. Perhaps the cases I’ve tried are too shallow, though. I’ve been focusing on virtualhostgroups, which, if I understand correctly, should be influenced (if at all) by the .meta:residual-child-node-category property on the /hst:hst/hst:hosts node, and in both projects that appears to be unset, within hosts.yaml at least. Perhaps it is being defaulted differently in the two projects?

From peeking at hst-root.yaml it appears that /hst:hosts is set to .meta:residual-child-node-category: content by default which seems to perhaps support the behavior within our project so why doesn’t the fresh v13 project work this way?

Today, while trying to move forward with the above-mentioned workaround that uses autoexport:overrideresidualchildnodecategory to force hosts to be treated as config when exporting, I seem to have run into a separate yet equally tricky problem. It seems that I’m unable to add new hst:virtualhostgroup nodes under /hst:platform/hst:hosts without locking myself out of the CMS. I’ve been using the HstMode=false query string trick to get back in. Without it, I always get a 301 redirect that appears to force me from the HTTPS scheme to HTTP, which isn’t available through the local proxy. We are currently using a modified form of @woonsanko’s hippo7-rproxy-nodejs, which also simulates HTTPS and normally works without a problem.

The question I have, then, is why adding any additional virtualhostgroup under /hst:platform/hst:hosts puts my system in this weird state. I couldn’t find any details on nuances that would make virtual host matching work differently for platform/CMS access compared to site hosts under /hst:hst/hst:hosts. The new virtualhostgroups don’t even need any nodes under them to cause the problem, and the name doesn’t seem to be relevant. Additionally, I have several preexisting virtualhostgroup nodes under /hst:platform/hst:hosts which don’t seem to cause any problem. By using the HstMode=false trick I am able to get back in and delete the problematic node, and then like magic the problem goes away…

Alright, so regarding the most recent issue above, where I was getting locked out of the CMS Console exposed locally through a proxy using HTTPS, I happened to find something that I thought was a bit strange in a related document. We’re not interested in making Tomcat responsible for HTTPS handling in our local environment, because we want to better simulate a production-like setup. Still, the document on how to Configure Cargo for SSL/TLS suggests adding the hst:scheme property directly on /hst:platform/hst:hosts, which I had never seen done before and wasn’t even sure was possible until I tried it.

/hst:platform/hst:hosts
  - hst:scheme = https

Surprisingly after making this simple change then my newly added virtualhostgroup under /hst:platform/hst:hosts no longer seemed to cause any problem and I can now access the CMS console without the HstMode=false workaround.
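In exported YAML form, the property above would presumably look something like this (a sketch; which file it lands in depends on the auto-export setup):

```yaml
definitions:
  config:
    /hst:platform/hst:hosts:
      # Set at the top of the hosts tree, so every virtual host group
      # and host below inherits it unless it sets its own value
      hst:scheme: https
```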

A few thoughts:

  1. It’s strange in my opinion to require setting hst:scheme = https so high up in the hierarchy as it would then apply to all virtual host groups and hosts.
  2. The fact that the hst:scheme property is settable on /hst:platform/hst:hosts, or even /hst:hst/hst:hosts for that matter, is not very discoverable, because it is not settable on hst:virtualhostgroup nodes. That had always led me to believe that it could only be set on hst:virtualhost nodes, so I have always set this property (and currently do) at the highest hst:virtualhost node in the hierarchy.
  3. It’s unclear why creating a new hst:virtualhostgroup node under hst:platform specifically caused the issue for me in the first place, as the setup followed preexisting nodes which were set up in exactly the same way.

In the case of my project in particular we can probably get by by setting the configuration at the /hst:platform/hst:hosts level and then perhaps overriding it for dev-localhost but it seems like a strange thing to have to do and seems more like a bug than a feature to require this.

I may have celebrated too soon because while the above workaround seemed to fix the issue with the CMS & CMS Console access after adding in the new virtualhostgroup nodes it now appears that with this setting on that the channel manager is no longer loading any channels in the CMS.

Another update regarding the virtualhostgroup issue. It turns out that setting the hst:scheme property at a higher level, as mentioned, may not have been entirely necessary, although it alleviated the immediate issue. To try to resolve the new Channel Manager issue where it stopped showing channels, I took a chance on renaming (effectively removing) the localhost hst:virtualhost node under /hst:platform/hst:hosts/dev-localhost. This is the default virtual host that comes with most configurations, and it was added recently during our v13 upgrade. As it turns out, I believe this virtual host was conflicting with another one of the same name but with a different port (8443), which we had set up in the past specifically so that we could use everything through the local secure proxy. I had figured the port was enough to distinguish the two, and had assumed that we would need to retain the dev-localhost virtualhostgroup as it was placed there by those who did the upgrade, but I guess it was causing an issue due to a name conflict, at least once the two were forced under the same scheme. Once I effectively removed dev-localhost, the Channel Manager started working immediately, and after some testing I was able to determine that setting hst:scheme deliberately at the highest points in the hierarchy was unnecessary.