Allowing HTML void elements and entities in hst.headContribution

Hi Bloomreach Community,

We are seeing a lot of log output that points to malformed XML, such as:

[Fatal Error] :1:42: The entity "auml" was referenced, but not declared.

This can be caused by a headContribution like this:

<@hst.headContribution keyHint="foo" category="bar">
    <meta name="author" content="John D&auml;"/>
</@hst.headContribution>

After digging around a bit, we found out that the headContribution tag uses an XMLScanner to parse its element. Because the content is parsed as XML, HTML entities like &auml; and unclosed HTML void elements are not valid input.
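The behaviour can be reproduced outside the CMS. The following is a minimal sketch that uses the JDK's standard DocumentBuilder, on the assumption that the headContribution tag applies comparable XML well-formedness rules. The class and method names (WellFormedCheck, isWellFormedXml) are hypothetical and not part of any Bloomreach API.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, for illustration only.
final class WellFormedCheck {

    // Returns true if the fragment parses as a standalone XML document.
    static boolean isWellFormedXml(String fragment) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing
            // "[Fatal Error] ..." lines to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(fragment)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the fragment.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Named HTML entity that XML does not know: expected to be rejected.
        System.out.println(isWellFormedXml(
                "<meta name=\"author\" content=\"John D&auml;\"/>"));
        // Void element without the closing slash: expected to be rejected.
        System.out.println(isWellFormedXml(
                "<meta name=\"author\" content=\"John Doe\">"));
        // Self-closed element without HTML entities: expected to be accepted.
        System.out.println(isWellFormedXml(
                "<meta name=\"author\" content=\"John Doe\"/>"));
    }
}
```

A check like this could be used in a unit test to catch template fragments that a strict XML parser would refuse.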

Is there a straightforward way to make the headContribution parser not fail when encountering HTML entities or void elements?

Cheers,
Fred

I believe you should be able to use CDATA blocks, so you can skip escaping.

Thanks for the suggestion. However, I cannot really seem to get this working.

The “<![CDATA[” block seems to be illegal when trying to surround the whole meta tag. The implementation seems to only allow !DOCTYPE and comments there, see XMLDocumentScannerImpl::829ff.

<@hst.headContribution keyHint="wat" category="metadata">
    <![CDATA[
    <meta name="author" content="John D&auml;"/>
    ]]>
</@hst.headContribution>

A CDATA section is not allowed as an attribute value in XML, so no luck there, either.

<@hst.headContribution keyHint="wat" category="metadata">
    <meta name="author" content=<![CDATA[John D&auml;]]>/>
</@hst.headContribution>

And an attribute value in quotes must not contain “<”.

<@hst.headContribution keyHint="wat" category="metadata">
    <meta name="author" content="<![CDATA[John D&auml;]]>"/>
</@hst.headContribution>

Maybe you can try to use a variable for the content string.

I already do; I just tried to keep the example here minimal:

    <@hst.headContribution keyHint="wat" category="metadata">
        <meta name="author" content="${author!"foo"}"/>
    </@hst.headContribution>

where author is just

request.setAttribute("author", "John D&auml;");

Try adding:

<!ENTITY auml "&#228;">

or just use the numeric code above (&#228;) directly. The declaration is taken from XHTML 1.0 - DTDs.
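To illustrate both suggestions, here is a small sketch using the JDK's DocumentBuilder. It parses a fragment that uses the numeric character reference for ä (&#228;) and one that declares the lowercase auml entity in an internal DTD subset. Whether the headContribution tag accepts a DOCTYPE inside its body is not confirmed in this thread, so treat the second variant as an assumption. The names EntityWorkarounds and contentOf are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, for illustration only.
final class EntityWorkarounds {

    // Parses the XML and returns the "content" attribute of the root element.
    static String contentOf(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttribute("content");
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Variant 1: numeric character reference, no declaration needed.
        String numeric = "<meta name=\"author\" content=\"John D&#228;\"/>";

        // Variant 2: declare the entity in an internal DTD subset
        // (assumption: the consuming parser allows a DOCTYPE here).
        String declared = "<!DOCTYPE meta [<!ENTITY auml \"&#228;\">]>"
                + "<meta name=\"author\" content=\"John D&auml;\"/>";

        // Both variants resolve to the same attribute value.
        System.out.println(contentOf(numeric).equals(contentOf(declared)));
    }
}
```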

I resolved it by changing the code that passes the content string to FTL.

Short context:

  • We generate breadcrumb structured data for Google from HstSiteMenuItems.
  • We used to HTML-escape the items’ names, which produced entities such as &auml;.
  • After switching to XML escaping, everything works fine.
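For readers who want to do the same, this is a minimal sketch of XML escaping in plain Java. It is not the code from our project; the class and method names (XmlEscaping, escapeXml) are hypothetical. Libraries offer comparable helpers, for example StringEscapeUtils.escapeXml10 in Apache Commons Text.

```java
// Hypothetical helper, for illustration only.
final class XmlEscaping {

    // Escapes the five characters that have special meaning in XML.
    // All other characters, including ä, are passed through unchanged,
    // so no HTML-only entities such as &auml; are produced.
    static String escapeXml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // In a component this value would be passed on to the template, e.g.
        // request.setAttribute("author", XmlEscaping.escapeXml(name));
        System.out.println(escapeXml("John D\u00e4 & Sons"));
    }
}
```

Only the XML-reserved characters are replaced, so the resulting string stays valid inside an attribute value of the headContribution body.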

It would be nice to have this detail in the documentation for headContribution(s) at Head Contributions - Bloomreach Experience Manager (PaaS/Self-Hosted).
I bet we weren’t the first to stumble over this.

Now that I know that I can only pass valid XML, it also makes sense that unclosed HTML void elements are not allowed in headContributions. For example, HTML’s <meta> tag is a void element and breaks headContribution rendering if the closing / is omitted.