Re: CCSID - RPG issues -- RPG400-L

Hello Barbara

Thanks for the extended checking...

UTF-8 Hebrew presents no problem, since the built-in RPG conversion
translates it to CCSID(424) without issues.

The problem lies with Spanish, French, Chinese, Russian, Greek, Arabic, etc.

As for the truncation in %subst, I'll deal with it when I reach that bridge.

Gad

date: Wed, 25 Feb 2026 08:03:19 -0500
from: Barbara Morris <bmorris@xxxxxxxxxx>
subject: Re: CCSID - RPG issues

On 2026-02-25 5:41 a.m., Gad Miron wrote:

Hello Barbara

The (source) ds_NMMLIMP2.ITEMNAME looks OK - I'll check again though

Does the %subst function "know" how to deal with VARLEN data?

I'll check the CHARCOUNT thing.

%subst understands VARLEN data.

ds_NMWRKR.ITEMNAME8 = %subst(ds_NMMLIMP2.ITEMNAME :1 : 128) ;

But I don't think you should use %SUBST here.

The target field ds_NMWRKR.ITEMNAME8 has a length of 128 so if the
source field ds_NMMLIMP2.ITEMNAME has data that is longer than 128, it
would be automatically truncated.

If the source field ds_NMMLIMP2.ITEMNAME has data that is shorter than
128, the %SUBST would fail with RNX0100 ("Length or start position is
out of range for the string operation"). The start and length for %SUBST
refer to the current VARLEN length so 128 would not be allowed unless
the length of the source is 128 or greater.

Use an ordinary assignment instead:
ds_NMWRKR.ITEMNAME8 = ds_NMMLIMP2.ITEMNAME;

But this assignment without %SUBST would still be a problem if the data
in ds_NMMLIMP2.ITEMNAME is longer than 128 bytes and bytes 128 and 129
contain a 2-byte Hebrew character. ds_NMWRKR.ITEMNAME8 would only get
the first byte of that character (in CHARCOUNT STDCHARSIZE mode).

I did a little experiment with some CCSID 424 (Hebrew) data and all the
Hebrew characters were 2 bytes in UTF-8. It seems likely that would be
true of all Hebrew characters. If your source string has a mixture of
Hebrew (2-bytes per character) and non-Hebrew (1-byte per character), I
think you'd have a 50/50 chance of having invalid data in the target
string after the assignment in STDCHARSIZE mode.
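To make the byte arithmetic concrete, here is a small sketch in Python (not RPG) of what a byte-wise truncation to 128 bytes does to UTF-8 data. The helper names truncate_bytes and is_valid_utf8 are made up for this illustration, and the slice only stands in for the STDCHARSIZE-style assignment; it is not how RPG implements it.

```python
# Illustration only: a Python stand-in for a byte-wise truncation.
# Helper names are hypothetical; nothing here is an RPG API.

def truncate_bytes(data: bytes, limit: int) -> bytes:
    """Keep the first `limit` bytes, ignoring character boundaries."""
    return data[:limit]


def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


ALEF = "\u05d0"  # Hebrew letter alef, 2 bytes in UTF-8

# 127 one-byte characters, then alef in bytes 128 and 129.
split_case = ("a" * 127 + ALEF).encode("utf-8")

# 126 one-byte characters, then alef in bytes 127 and 128.
safe_case = ("a" * 126 + ALEF).encode("utf-8")

split_target = truncate_bytes(split_case, 128)  # ends with half of alef
safe_target = truncate_bytes(safe_case, 128)    # alef fits completely
```

In the first case the target ends with only the lead byte of the Hebrew character, so it is no longer valid UTF-8. In the second case the character happens to end exactly at byte 128 and the result is fine.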

Adding CHARCOUNTTYPES(*UTF8) and then doing the assignment in CHARCOUNT
NATURAL mode would allow RPG to handle the UTF-8 data correctly and
possibly only assign the first 127 bytes instead of splitting the last
character.
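As a rough sketch of what truncating on a character boundary means, the following Python (again, not RPG) keeps at most 128 bytes but never splits a character. The function name truncate_utf8 is hypothetical, the input is assumed to be valid UTF-8, and this only illustrates the idea; it is not a description of RPG's internal logic.

```python
# Illustration only: truncate UTF-8 bytes on a character boundary.
# Assumes `data` is valid UTF-8; the function name is hypothetical.

def truncate_utf8(data: bytes, limit: int) -> bytes:
    """Keep at most `limit` bytes without splitting a UTF-8 character."""
    if len(data) <= limit:
        return data
    end = limit
    # UTF-8 continuation bytes look like 10xxxxxx. If the first dropped
    # byte is a continuation byte, the cut is inside a character, so
    # step back until the cut falls at the start of a character.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


ALEF = "\u05d0"  # Hebrew letter alef, 2 bytes in UTF-8

# alef occupies bytes 128 and 129, so it cannot fit in 128 bytes.
source = ("a" * 127 + ALEF).encode("utf-8")
target = truncate_utf8(source, 128)
```

Here the target ends up with 127 bytes: the Hebrew character that would have been split is dropped whole, and the result is still valid UTF-8.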

--
Barbara
