I decided to trim down the groups in the cross-post.
I think the only one really relevant is
so that's the only one I keep in the followup.
About UTF-8: the dsig spec actually only has a "should" about
representing the data in UTF-8; it's C14 (Canonical XML) that really
imposes it.
Dsig only says:
CanonicalizationMethod is a required element that specifies the
canonicalization algorithm applied to the SignedInfo element prior to
performing signature calculations. [...] Text based canonicalization
algorithms (such as CRLF and charset normalization) should be provided
with the UTF-8 octets that represent the well-formed SignedInfo element
6.5 Canonicalization Algorithms
[...] Various canonicalization algorithms require conversion to
[UTF-8]. The two algorithms below understand at least [UTF-8] and
[UTF-16] as input encodings. We RECOMMEND that externally specified
algorithms do the same. Knowledge of other encodings is OPTIONAL.
Then you need to read the C14 spec ... XML strings that already satisfy
certain cleanliness criteria, in a simple document without namespaces,
are not changed at all by C14.
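
A quick way to check that claim, as a minimal sketch in Python. Assumptions on my side: Python 3.8+ and its standard xml.etree.ElementTree.canonicalize(), which implements C14N 2.0 rather than the 1.0 spec discussed here (for a simple case like this one the two should agree), and a made-up sample document.

```python
import xml.etree.ElementTree as ET

# Hypothetical "clean" document: no namespaces, no XML declaration,
# double-quoted attributes already in alphabetical order, no empty-element
# shorthand, no entity references.
clean = '<doc><item id="1" name="x">some value</item></doc>'

# canonicalize() returns the canonical form as a str.
canonical = ET.canonicalize(clean)

# For such an input the canonical form is the input itself.
unchanged = (canonical == clean)
print(unchanged)
```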
C14 imposes UTF-8; the spec gives a sample that describes the result of
3.6 UTF-8 Encoding
Input Document :
<?xml version="1.0" encoding="ISO-8859-1"?><doc>©</doc>
Canonical Form :
<doc>#xC2#xA9</doc>
Note: The content of the doc element is NOT the string #xC2#xA9 but
rather the two octets whose hexadecimal values are C2 and A9,
If the input document were Chinese, all of its Chinese content would
likewise have to be converted to UTF-8 before calculating the signature.
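
To make the octet-level point concrete, here is a small sketch in Python. It assumes Python 3.8+ and the standard library's xml.etree.ElementTree.canonicalize(), which implements C14N 2.0 and not the 1.0 algorithm quoted above; for this sample the output should be the same. SHA-1 is used only as an example digest.

```python
import hashlib
import xml.etree.ElementTree as ET

# The sample document from section 3.6, as ISO-8859-1 octets:
# the copyright sign is the single octet A9.
latin1_doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><doc>\xa9</doc>'

# canonicalize() returns a str without the XML declaration; the octets
# that are actually digested are the UTF-8 encoding of that string.
canonical_octets = ET.canonicalize(latin1_doc).encode("utf-8")

# The copyright sign is now the two octets C2 A9.
print(canonical_octets)

# The digest is computed over the UTF-8 octets, not the original ones.
digest = hashlib.sha1(canonical_octets).hexdigest()
print(digest)
```

The same conversion applies to any input encoding the parser understands, so two documents that differ only in their declared encoding end up with the same digest.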
The other most significant aspects of C14 are the copying of the
document's namespace definitions into the subset you create, the
expansion of entities (all &xxx;) and the "cleaning" of the tags and
attributes, but NOT of the content of the elements.
<e3 name = "elem3" id="elem3" />
changes to :
<e3 id="elem3" name="elem3"></e3>
<e4 id="elem4" name="elem4"></e4>
but the "pretty print" whitespace at the start of the line is *not* removed.
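
The e3 case above can be reproduced with a few lines of Python. This is only a sketch: it assumes Python 3.8+ and xml.etree.ElementTree.canonicalize(), which implements C14N 2.0 rather than 1.0, and the surrounding doc element and its indentation are my own addition to show that leading whitespace survives.

```python
import xml.etree.ElementTree as ET

# The e3 element with untidy spacing, wrapped in a hypothetical doc
# element and indented with three spaces.
source = '<doc>\n   <e3   name = "elem3"   id="elem3"   />\n</doc>'

canonical = ET.canonicalize(source)

# Attributes are sorted and normalized, the empty element becomes a
# start/end tag pair, and the indentation before <e3> is kept.
print(canonical)
```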
The examples in the doc are very explicit, probably more helpful than
reading all the text.