
Try to replace xml.c:utf8_bytes() with GLib UTF-8 functions
Closed (Merged), Public

Assigned To: nrwahl2
Authored By: nrwahl2, Mar 29 2024, 4:29 PM


utf8_bytes() was a hack to get a particular job done in a hopefully portable manner. We needed a way to escape XML special characters, and XML text can include non-ASCII characters encoded as UTF-8, so the escaping code has to step over multi-byte sequences correctly.

The function is awkward, partly because it's paranoid about the value of CHAR_BIT: UTF-8 uses 8-bit bytes, while the number of bits in a C byte is implementation-defined (at least 8, but possibly more).
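To make the CHAR_BIT concern concrete, here is a minimal sketch of deriving a UTF-8 sequence length from its lead byte. It is an illustration only, not the actual utf8_bytes() code from xml.c; the name utf8_seq_len_sketch() is hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical illustration; NOT the actual utf8_bytes() from xml.c.
 *
 * Returns the sequence length implied by the bit pattern of a UTF-8 lead
 * byte, or 0 if the byte cannot start a sequence. It does not check
 * continuation bytes or reject overlong encodings.
 */
static size_t
utf8_seq_len_sketch(unsigned char lead)
{
#if CHAR_BIT > 8
    /* UTF-8 code units are 8 bits wide, so reject anything wider */
    if (lead > 0xFF) {
        return 0;
    }
#endif
    if ((lead & 0x80) == 0x00) {
        return 1;   /* 0xxxxxxx: ASCII */
    }
    if ((lead & 0xE0) == 0xC0) {
        return 2;   /* 110xxxxx */
    }
    if ((lead & 0xF0) == 0xE0) {
        return 3;   /* 1110xxxx */
    }
    if ((lead & 0xF8) == 0xF0) {
        return 4;   /* 11110xxx */
    }
    return 0;       /* 10xxxxxx (continuation byte) or invalid lead */
}
```

Even this small sketch leaves gaps (overlong forms, truncated input), which is the kind of detail a hand-rolled helper can easily get wrong.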

It may also contain errors. Maintainers of a widely used, general-purpose library are more likely to find and fix such errors than our small group of developers, whose code sees very limited use.

This task is to explore using GLib UTF-8/Unicode functions to fill the role of utf8_bytes(). We may find other uses for them as well.

GLib Unicode manipulation functions:

(using developer-old link because there is no corresponding page on the up-to-date docs)


Event Timeline

nrwahl2 created this task.
nrwahl2 created this object with edit policy "Restricted Project (Project)".
nrwahl2 changed the task status from Open to WIP. Edited Apr 1 2024, 5:39 PM
nrwahl2 claimed this task.

This is part of CLPR#3403

kgaillot added projects: Restricted Project, Restricted Project. Mon, Jul 8, 5:47 PM