Good comments, Craig, thanks.

As to the occasional stumble of those variant characters - as I recall, that usually happens when text is embedded in source and even when certain characters are used in variable names.

A solution is to put all character values into message files or data files - then flag the message or data with the CCSID you want - then the system just does the conversion.

At least, something like that - Bruce V and others can easily straighten out that description, I hope!

Vern

On 6/20/2018 2:34 AM, Craig Richards wrote:
I spent quite a lot of time working on interfaces passing xml data between
Oracle and a Legacy IBMi system using MQ.
Handling and understanding and not corrupting the UTF-8 data took some
care, especially since the CCSID support for RPGLE at the time wasn't as
good as it has become recently.

The xml messages were in UTF-8 and we stored them in XML Data Types where
they could remain UTF-8.
I wrote an enquiry to take the xml file and write it to a stream file in
UTF-8 where it could be accessed by a web server running on the IBMi, and
then invoke a browser on the client to view.

That was invaluable in being able to understand the data - to be able to
view it and determine how to check and deal with it before trying to bring
it into the CCSIDs of the Jobs and DB2 database.
Browsers are generally very good at handling UTF-8, even showing the
normally non-visual control characters.

Buck - on the System CCSID change.
As it happens, two of the major clients I have worked with in the last 22
years share a similar problem.
One of the Clients has the DEV IBMi set to CCSID 37 and the PROD one to
CCSID 285,
the other has the DEV IBMi set to 285 and the PROD set to 37.

Obviously this wasn't by design and they both still occasionally fall foul
of fairly common characters like £, $ and ] (or is it [ ?). I think one of
the square brackets is the same for both CCSIDs and the other is not.
This is a problem for both DB2 and the IFS and you have to take a bit of
care to avoid tripping up.

But the point I wanted to make is that my understanding is that both of the
clients have spoken on more than one occasion to IBM about trying to sort
it out so that both machines are 285 but in neither case were they willing
to take the risk and do it.
I'd be a bit nervous about the implications of changing QCCSID.
As you rightly point out there are several areas to consider like the CCSID
of the DB2 tables and how it gets set (e.g. is it specified in DDL or
elsewhere) and the CCSID of the Jobs.
I think there is a bunch of other stuff to take into account when you look
into Internationalisation.
Just my thoughts but I'd be doing quite a bit of investigation before I
considered changing that system value.

best regards,
Craig


On 20 June 2018 at 03:43, John Yeung <gallium.arsenide@xxxxxxxxx> wrote:

On Tue, Jun 19, 2018 at 6:42 PM, Buck Calabro <kc2hiz@xxxxxxxxx> wrote:
> On Tue, 19 Jun 2018 at 17:19, John Yeung <gallium.arsenide@xxxxxxxxx>
> wrote:
>> When you're inspecting a file using WRKLNK, option 5 (or,
>> equivalently, using DSPF directly on the file), what you get is a
>> character (not hex) display. In this mode, you CANNOT BE SURE what the
>> hex codes are.
>>
>> You ***MUST*** press F10 to get the hex display. I don't care if you
>> don't see the BOM in character mode. That doesn't matter. Press F10.
>> The extra bytes will be there. EFBBBF. No matter what the CCSID is,
>> those bytes will be there at the beginning. Those three bytes are the
>> BOM for UTF-8, and CHGATR has no effect whatsoever on the bytes.
>
> I don't want to speak for John, but I'm sure I missed out on why his
> advice is, in general, useful and important.

I don't think my advice is *generally* useful and important. But it's
pretty darn close to critical for the times when you really need to be
sure of the exact bytes in a stream file, and it seems that this is
often the case when trying to diagnose encoding issues.

> Likewise, it's /possible/ (but I haven't myself
> tested it) that DSPF, WRKLNK and friends will try to do the conversion
> for you as IBM i tries to display the text from the IFS file onto your
> display screen.

This absolutely happens. If you create a stream file with UTF-8
content, including BOM, and set its CCSID to 1208, then DSPF will not
show you the BOM. It will do you the "favor" of hiding it. Let's say
your stream file has the following content:

3 bytes for BOM, followed by the three letters 'foo', followed by
Windows-style newline. (You can create this with Notepad, for example.
You have to be careful to save it as UTF-8, and if you have to FTP it
to your i, be sure to force binary mode!)

This file is 8 bytes long. If it has the proper CCSID of 1208, DSPF
will show you three visible characters, and the 'f' will be
left-aligned, as though it's the very first byte in the file.

But then press F10. Lo! You will see 16 hex characters, corresponding
to 8 bytes.

If you use CHGATR to set the CCSID to, say, 1252, then the system
won't know to interpret the BOM as a BOM, and it will instead do its
best to actually render those bytes as characters, thus showing the
six visible characters 'ï»¿foo'. But once again, press F10 and the
bytes will be identical to what they were under 1208. Or indeed any
CCSID, because CHGATR has no effect whatsoever on actual bytes.

I urge people to try this for themselves. Until you see it with your
very own eyes, you won't really understand that you CANNOT know what
the bytes are unless you are using hex mode. I know I was astonished.
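
For anyone without an IBM i to hand, the same effect can be approximated
off-system. What follows is only a minimal sketch, assuming Python 3: the
codecs 'utf-8-sig' and 'cp1252' stand in for CCSIDs 1208 and 1252, and the
helper names hex_view and char_view are made up for illustration. It does
not call DSPF or CHGATR; it only shows that one fixed run of bytes can
display differently depending on the encoding the viewer assumes.

```python
# Hypothetical stand-in for the stream file described above:
# UTF-8 BOM (EF BB BF) + the letters 'foo' + a Windows-style newline.
data = b"\xef\xbb\xbf" + "foo".encode("utf-8") + b"\r\n"


def hex_view(raw: bytes) -> str:
    """Return the raw bytes as uppercase hex digits (a rough analogue of F10)."""
    return raw.hex().upper()


def char_view(raw: bytes, encoding: str) -> str:
    """Decode the same bytes under an assumed encoding; the bytes never change."""
    return raw.decode(encoding)


# Assumption: 'utf-8-sig' plays the role of a viewer that recognises the BOM.
as_utf8 = char_view(data, "utf-8-sig")
# Assumption: 'cp1252' plays the role of a viewer told the file is 1252.
as_1252 = char_view(data, "cp1252")

print(len(data))            # number of bytes in the "file"
print(hex_view(data))       # two hex digits per byte
print(ascii(as_utf8))       # BOM swallowed by the decoder
print(ascii(as_1252))       # BOM bytes turned into three extra characters
```

The hex view is the same in both cases because decoding never touches
`data`; only the character view changes with the assumed encoding.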

John Y.
--
This is the RPG programming on the IBM i (AS/400 and iSeries) (RPG400-L)
mailing list


