XMLEncode: encode illegal XML control characters #391

Open
elharo wants to merge 3 commits into master from fix/xmlencode-control-chars

elharo commented Jul 1, 2026

Contributor

XMLEncode.xmlEncodeTextAsPCDATA() passes characters in the range U+0000–U+001F (other than TAB, LF, and CR) through unencoded in the default branch of its switch statement. These characters are not permitted in XML 1.0 documents, so conforming parsers reject the output as not well-formed.

Additionally, needsEncoding() checked only for & and <, so text whose only problematic characters were control characters was written directly and never reached the encoding method.

Fix:

  • In xmlEncodeTextAsPCDATA(), the default case now encodes illegal control characters as &#xHH; numeric character references
  • In needsEncoding(), a check for illegal control characters was added so that such text is routed through the encoding path
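
For reviewers who want the shape of the change without opening the diff, the two fixes above can be sketched roughly as follows. This is an illustrative, standalone approximation and not the actual XMLEncode source: the class name ControlCharSketch, the helper isIllegalControlChar(), and the method encodeTextAsPCDATA() are hypothetical, the two-digit hex width is an assumption based on the &#xHH; notation above, and only & and < are handled besides control characters. One caveat worth keeping in mind: XML 1.0 also disallows character references to these code points, and XML 1.1 permits them for all of the range except U+0000, so whether a given consumer accepts the output depends on the parser and XML version.

```java
// Hypothetical sketch of the described fix; not the real XMLEncode code.
final class ControlCharSketch {

    // True for U+0000 through U+001F, except TAB, LF and CR.
    static boolean isIllegalControlChar(char c) {
        return c < 0x20 && c != '\t' && c != '\n' && c != '\r';
    }

    // Approximates the updated needsEncoding(): control characters
    // now trigger the encoding path, alongside '&' and '<'.
    static boolean needsEncoding(String text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '&' || c == '<' || isIllegalControlChar(c)) {
                return true;
            }
        }
        return false;
    }

    // Approximates the updated default branch: illegal control
    // characters become &#xHH; numeric character references.
    static String encodeTextAsPCDATA(String text) {
        if (!needsEncoding(text)) {
            return text; // fast path: nothing to encode
        }
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                default:
                    if (isIllegalControlChar(c)) {
                        out.append(String.format("&#x%02X;", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }
}
```

Under these assumptions, a string consisting of "a", the character U+0001, and "b" would no longer take the fast path and would be written with the control character replaced by a numeric character reference, while text containing only TAB, LF, or CR is still returned unchanged.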

Fixes #390

slachiewicz added the bug label Jul 2, 2026
Development

Successfully merging this pull request may close these issues.

XMLEncode: illegal XML control characters not encoded