
Handle multi-byte Unicode characters in pcmk__xml_escape() and pcmk__xml_needs_escape()
Closed (Merged), Public

Authored By
nrwahl2
Feb 1 2024, 10:15 PM

Description

See TODOs in these functions.

This will probably be easy to implement, but it might turn out to be trickier than it looks, and it will require some research to ensure we're covering all cases correctly. It's also unclear whether we can assume UTF-8 or need to handle some exotic encoding.

See also the xmlEncodeEntitiesReentrant() function in libxml2. It has problems (see T768) but its Unicode logic might be mostly correct:
https://gitlab.gnome.org/GNOME/libxml2/-/blob/master/entities.c

Alternatively, xmlEncodeSpecialChars() (also problematic) thinks Unicode characters will "just work" somehow... I haven't done any research on this yet.
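
To make the question concrete, here is a minimal sketch of a multi-byte-aware check, assuming the input is UTF-8. It is not the Pacemaker implementation. The names utf8_seq_len() and text_needs_escape_sketch() are hypothetical. The set of single-byte characters flagged here ('<', '>', '&', and control characters other than tab and newline) is an assumption for illustration, as is the choice to treat invalid byte sequences as needing escaping. The sketch checks only lead and continuation byte patterns. It does not reject overlong encodings, surrogates, or code points above U+10FFFF, which is part of the research mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: return the length in bytes of the UTF-8 sequence
 * starting at s, or 0 if the lead byte or a continuation byte is malformed.
 * A NUL byte counts as a 1-byte sequence; the caller checks for it.
 * Overlong forms and surrogates are NOT rejected in this sketch.
 */
static size_t
utf8_seq_len(const unsigned char *s)
{
    size_t len = 0;

    if (s[0] < 0x80) {
        return 1;                       /* ASCII */
    } else if ((s[0] & 0xE0) == 0xC0) {
        len = 2;                        /* 110xxxxx */
    } else if ((s[0] & 0xF0) == 0xE0) {
        len = 3;                        /* 1110xxxx */
    } else if ((s[0] & 0xF8) == 0xF0) {
        len = 4;                        /* 11110xxx */
    } else {
        return 0;                       /* stray continuation or invalid lead */
    }

    /* Each following byte must look like 10xxxxxx. A terminating NUL fails
     * this test, so the loop never reads past the end of the string.
     */
    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) {
            return 0;
        }
    }
    return len;
}

/* Hypothetical sketch: return true if text contains anything that would
 * need escaping under the assumptions stated in the lead-in.
 */
static bool
text_needs_escape_sketch(const char *text)
{
    const unsigned char *p = (const unsigned char *) text;

    assert(text != NULL);

    while (*p != '\0') {
        size_t len = utf8_seq_len(p);

        if (len == 0) {
            return true;                /* assumption: invalid bytes get escaped */
        }
        if (len == 1) {
            switch (*p) {
                case '<':
                case '>':
                case '&':
                    return true;
                case '\n':
                case '\t':
                    break;
                default:
                    if (*p < 0x20) {
                        return true;    /* other ASCII control characters */
                    }
                    break;
            }
        }
        /* Structurally valid multi-byte sequences pass through unchanged */
        p += len;
    }
    return false;
}
```

Under these assumptions, a string such as "caf\xC3\xA9" (with a two-byte "é") is reported as not needing escaping, while "a<b" and a string containing a stray 0xFF byte are. Stepping by whole sequences rather than single bytes keeps continuation bytes from being inspected as if they were standalone characters.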

Event Timeline

nrwahl2 created this task.
nrwahl2 created this object with edit policy "Restricted Project (Project)".

At the time of task creation, these two functions don't exist yet, but they should be merged soon.

kgaillot added projects: Restricted Project, Restricted Project. (Mon, Jul 8, 5:49 PM)