I have an XML file I'm processing - comes from a "partner" app elsewhere here.

One of the nodes is our customer number, and it can contain more than one space, as here -

<custno><![CDATA[008_XY 00020001]]></custno>

We are told to expect the CDATA wrapper, and we are assuming it tells the parser to leave the content alone.

Now is that a correct assumption? I did a little digging, and it seems there is some variation in interpretation.

XML-INTO is what I'm using, with the default for the trim option (trim=all), which strips leading and trailing whitespace and collapses each run of interior whitespace down to a single space. I left it this way because we also get newlines in the data.

I would like to know if XML-INTO should leave things alone that are in a CDATA block - that seems to be generally assumed, but I can easily be mistaken here.
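
For comparison only, here is a minimal sketch in Python (not RPG, and it says nothing about how XML-INTO is implemented). With the standard-library ElementTree parser, CDATA content comes back as ordinary character data with the spaces intact, and any collapsing is a separate step applied afterwards. The three-space sample value and the helper names parse_custno and collapse_whitespace are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: three spaces inside the CDATA section.
doc = "<custno><![CDATA[008_XY   00020001]]></custno>"


def parse_custno(xml_text):
    """Return the element text exactly as the parser delivers it."""
    return ET.fromstring(xml_text).text


def collapse_whitespace(value):
    """Roughly mimic a trim-all step: strip the ends, squeeze interior runs."""
    return " ".join(value.split())


raw = parse_custno(doc)
collapsed = collapse_whitespace(raw)
print(repr(raw))        # value as delivered by the parser
print(repr(collapsed))  # value after the separate collapsing step
```

If XML-INTO behaves the same way, the CDATA wrapper only protects markup characters such as < and &, and the whitespace handling is decided by the trim option after parsing.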

My main option is to encode these particular spaces - sed should do the trick with a little effort. The alternative is to get the software on the other end to do the encoding - good luck! And some consultant would want us to run the PAYMNY command.
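
A rough sketch of that preprocessing idea, written in Python rather than sed so it is easy to test. It does not encode the spaces as XML entities; instead it swaps them for a placeholder character inside the custno value only, and the receiving program swaps them back after parsing. The placeholder "~" and the function names are assumptions, and the placeholder must be a character that never occurs in a real customer number.

```python
import re

# Hypothetical placeholder; pick a character that never appears in custno values.
PLACEHOLDER = "~"

# Matches the custno element with its CDATA wrapper, capturing the value.
CUSTNO = re.compile(r"(<custno><!\[CDATA\[)(.*?)(\]\]></custno>)", re.DOTALL)


def protect_spaces(xml_text):
    """Replace every space inside custno values with the placeholder."""
    def swap(match):
        value = match.group(2).replace(" ", PLACEHOLDER)
        return match.group(1) + value + match.group(3)
    return CUSTNO.sub(swap, xml_text)


def restore_spaces(value):
    """Undo the placeholder substitution on a parsed custno value."""
    return value.replace(PLACEHOLDER, " ")


sample = "<doc><name>A  B</name><custno><![CDATA[008_XY   00020001]]></custno></doc>"
protected = protect_spaces(sample)
print(protected)
```

Other elements are left untouched, so the default trimming still cleans up the newlines elsewhere in the document.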

Thoughts? Bug? Feature? Options?

Thanks
Vern


This mailing list archive is Copyright 1997-2026 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].