I'm processing an XML file that comes from a "partner" app elsewhere
here.
One of the nodes is our customer number, and it can contain more than
one space, as here:
<custno><![CDATA[008_XY 00020001]]></custno>
We have been told to expect the CDATA, on the assumption that it tells
the parser to leave the content alone.
Now is that a correct assumption? I did a little digging, and it seems
there is some variation in interpretation.
I'm using XML-INTO with the default for the trim option (trim=all,
which removes leading and trailing whitespace and reduces any run of
whitespace inside the value to a single space). I left it this way
because we also get newlines in the data.
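
To make the effect concrete, here is a rough Python emulation of that kind of whitespace normalization. It is not the RPG runtime, the function name collapse_whitespace is made up for illustration, and it assumes the normalization is applied to the value regardless of how it was wrapped in the XML.

```python
def collapse_whitespace(value):
    """Strip leading/trailing whitespace and reduce each interior run
    of whitespace (spaces, tabs, newlines) to a single space."""
    return " ".join(value.split())

# Hypothetical customer number with two interior spaces and a trailing newline.
original = "008_XY  00020001\n"
normalized = collapse_whitespace(original)

print(repr(original))
print(repr(normalized))
```

If the customer number is a fixed-position key, losing one of the interior spaces shifts everything after it, which is why the trimming matters here.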
I would like to know if XML-INTO should leave things alone that are in a
CDATA block - that seems to be generally assumed, but I can easily be
mistaken here.
My main option is to encode these particular spaces - sed should do the
trick with a little effort. The alternative is to get the software on
the other end to do the encoding - good luck! And some consultant would
want us to run the PAYMNY command.
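
For what it's worth, the encoding idea could look something like the sketch below, written in Python rather than sed. Everything in it is an assumption for illustration: the placeholder character "~", the helper names, and the premise that the custno element always appears exactly as in the sample above. The spaces inside the CDATA are swapped for the placeholder before the file is parsed, and swapped back once the value has been extracted.

```python
import re

# Hypothetical placeholder; it must be a character that can never
# occur in a real customer number.
PLACEHOLDER = "~"

# Matches a custno element whose content is a single CDATA section.
CUSTNO_CDATA = re.compile(
    r"(<custno><!\[CDATA\[)(.*?)(\]\]></custno>)", re.DOTALL
)

def protect_spaces(xml_text):
    """Replace each space inside the custno CDATA with the placeholder."""
    def _swap(match):
        opening, content, closing = match.groups()
        return opening + content.replace(" ", PLACEHOLDER) + closing
    return CUSTNO_CDATA.sub(_swap, xml_text)

def restore_spaces(value):
    """Undo protect_spaces on a value extracted after parsing."""
    return value.replace(PLACEHOLDER, " ")

sample = "<custno><![CDATA[008_XY  00020001]]></custno>"
protected = protect_spaces(sample)
print(protected)
print(repr(restore_spaces("008_XY~~00020001")))
```

The restore step would have to happen on the program side after the parse, so this only helps if both ends of the round trip are under our control.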