Fix: libcrmcommon: Escape newlines and tabs in XML attribute values

Description

Regression introduced by c6d9cea9, sort of.

Before that commit, newlines and tabs were escaped as "\n" and "\t",
respectively. These are not valid XML escape sequences.

We made the decision not to escape newlines and tabs when we added the
pcmk__xml_escape() function, with the caveat that the "right" answer was
unclear:

"If not, the XML will be more visually understandable, but conformant
XML parsers will treat the newline as a space (for example, if something
extracts the text and displays it to the user). If we do, the parser
case may be closer to our original intent, but the XML is harder to
read, and it may not be appropriate (for example, the parsing tool might
prefer to do line wrapping of a textual paragraph on its own)."
https://github.com/ClusterLabs/pacemaker/pull/3323#discussion_r1474790296

However, as a result, any newline or tab in a serialized (dumped-to-text)
attribute value is now replaced by a space character when the XML is
parsed back.

In some cases, an attribute value may contain newlines or tabs that need
to be preserved. So here, we replace them with their respective
character references.
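
The replacement described above can be sketched roughly as follows. This is
a minimal illustration, not Pacemaker's pcmk__xml_escape(): the function
name sketch_xml_escape(), its signature, and the exact reference spelling
(two-digit uppercase hex such as "&#x0A;") are assumptions made for the
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not the real pcmk__xml_escape()): write an escaped
 * copy of `in` into `out`, which has room for `out_size` bytes including
 * the terminating NUL. Returns false if the result does not fit.
 */
static bool
sketch_xml_escape(const char *in, bool for_attr, char *out, size_t out_size)
{
    size_t used = 0;

    assert((in != NULL) && (out != NULL));

    for (const char *p = in; *p != '\0'; p++) {
        char literal[2] = { *p, '\0' };
        const char *piece = literal;    /* default: copy unchanged */
        size_t len = 0;

        switch (*p) {
            case '&':  piece = "&amp;";  break;
            case '<':  piece = "&lt;";   break;
            case '>':  piece = "&gt;";   break;
            case '"':  piece = "&quot;"; break;

            /* Carriage return: always escaped */
            case '\r': piece = "&#x0D;"; break;

            /* Newline and tab: escaped only in attribute values */
            case '\n': if (for_attr) { piece = "&#x0A;"; } break;
            case '\t': if (for_attr) { piece = "&#x09;"; } break;

            default:   break;
        }

        len = strlen(piece);
        if (used + len + 1 > out_size) {
            return false;               /* no room for piece plus NUL */
        }
        memcpy(out + used, piece, len);
        used += len;
    }

    if (out_size == 0) {
        return false;
    }
    out[used] = '\0';
    return true;
}
```

With for_attr set, "a\tb\nc" would come out as "a&#x09;b&#x0A;c"; without
it, the same input is copied through unchanged.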

There's no need to escape newlines and tabs in text nodes, because the
replacement with space (0x20) characters happens only during
attribute-value normalization. It doesn't affect text nodes.
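
To see why a character reference survives parsing while a literal newline
or tab does not, here is a toy model of attribute-value normalization as
the XML 1.0 specification describes it for CDATA attributes: literal
whitespace becomes a space, while a character reference is replaced by the
character it names. The function sketch_normalize_attr() is hypothetical
and deliberately incomplete. It handles only hex references below 0x80,
relies on strtoul() (which is more lenient than a real parser), ignores
entity references, and silently truncates output that does not fit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of what a conformant parser does to a raw attribute
 * value. Not taken from libxml2 or Pacemaker.
 */
static void
sketch_normalize_attr(const char *raw, char *out, size_t out_size)
{
    size_t used = 0;

    assert((raw != NULL) && (out != NULL) && (out_size > 0));

    while ((*raw != '\0') && (used + 1 < out_size)) {
        if (strncmp(raw, "&#x", 3) == 0) {
            char *end = NULL;
            unsigned long cp = strtoul(raw + 3, &end, 16);

            if ((end != raw + 3) && (*end == ';')
                && (cp > 0) && (cp < 0x80)) {
                /* Character reference: append the referenced character
                 * as-is, even if it is whitespace
                 */
                out[used++] = (char) cp;
                raw = end + 1;
                continue;
            }
        }

        if ((*raw == ' ') || (*raw == '\t')
            || (*raw == '\n') || (*raw == '\r')) {
            /* Literal whitespace: replaced by a single space (0x20) */
            out[used++] = ' ';
        } else {
            out[used++] = *raw;
        }
        raw++;
    }
    out[used] = '\0';
}
```

Under this model, the raw value "a<TAB>b<NEWLINE>c" normalizes to "a b c",
whereas "a&#x09;b&#x0A;c" normalizes back to the original tab and newline.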

Also add separate unit test cases for newline and tab, since they're now
handled differently depending on whether for_attr is true. Separate out
the carriage return test from the non-printing characters test, since it
does print (in a sense) and we want to make it explicit that it's
handled differently from the other whitespace characters. We always
escape a carriage return.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>

Details

Provenance
nrwahl2 authored on Mar 21 2024, 4:50 PM
Parents
rP781d971396cd: Refactor: libcrmcommon: Capital letters in hex XML character references