On 06-Jun-2014 10:00 -0500, Henrik Rützou wrote:
You can't concatenate SBCS and DBCS data in one string. It doesn't
make sense, since SBCS has only 256 code points in one byte and DBCS
has 64K code points in two bytes, and there is no way to
distinguish whether a character is made of one or two bytes in a
concatenated string.
They can be combined, using _shifted_ DBCS; that is, "mixed data" character
strings. That is not to imply the capability is useful for the OP, just that
the capability does exist, contrary to the above implication.
UTF-8 is basically a one- to four-byte character encoding whose one-byte
encoding shares the 7-bit ASCII character set. UTF-8 has reserved bits
in the first byte that tell how many following bytes (0-3)
make up the "character". <<SNIP>>
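As a rough illustration of the lead-byte scheme described in the quoted
paragraph, here is a minimal Python sketch. The function names
utf8_sequence_length and split_utf8 are made up for this example, and the
sketch does not validate continuation bytes; it only reads the reserved
high bits of the first byte.

```python
def utf8_sequence_length(lead: int) -> int:
    """Return the total length in bytes of a UTF-8 sequence, given its first byte."""
    if (lead & 0x80) == 0x00:   # 0xxxxxxx: one byte, same as 7-bit ASCII
        return 1
    if (lead & 0xE0) == 0xC0:   # 110xxxxx: one continuation byte follows
        return 2
    if (lead & 0xF0) == 0xE0:   # 1110xxxx: two continuation bytes follow
        return 3
    if (lead & 0xF8) == 0xF0:   # 11110xxx: three continuation bytes follow
        return 4
    raise ValueError(f"0x{lead:02X} is not a valid UTF-8 lead byte")


def split_utf8(data: bytes) -> list[bytes]:
    """Split a UTF-8 byte string into one bytes object per character.

    Hypothetical helper: it trusts the lead byte and only checks that
    enough bytes remain; continuation bytes are not validated.
    """
    chars = []
    i = 0
    while i < len(data):
        n = utf8_sequence_length(data[i])
        if i + n > len(data):
            raise ValueError("truncated UTF-8 sequence")
        chars.append(data[i:i + n])
        i += n
    return chars
```

For example, split_utf8("aé".encode("utf-8")) yields one single-byte
element for "a" and one two-byte element for "é".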
Whereas UTF-8 uses reserved bits to indicate how many bytes make up each
character, the /shift characters/ of EBCDIC identify when the stream of
bytes is DBCS vs SBCS; i.e. a shift-out [␎: EBCDIC 0x0E] switches from
SBCS into DBCS, and a shift-in [␏: EBCDIC 0x0F] returns from DBCS to
SBCS.
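
To make the shift-character mechanism concrete, here is a minimal Python
sketch that walks a mixed-data byte string and separates the SBCS runs from
the shift-delimited DBCS runs. The function name split_mixed and the
"SBCS"/"DBCS" labels are made up for this illustration, and no code page
conversion is attempted; only the 0x0E and 0x0F shift values come from the
description above.

```python
SO = 0x0E  # shift-out: leave SBCS, enter DBCS
SI = 0x0F  # shift-in: leave DBCS, return to SBCS


def split_mixed(data: bytes) -> list[tuple[str, bytes]]:
    """Split EBCDIC mixed data into ("SBCS", bytes) and ("DBCS", bytes) runs.

    Hypothetical helper: the shift bytes themselves are dropped from the
    output, and DBCS runs are consumed two bytes at a time.
    """
    segments = []
    sbcs = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b == SI:
            raise ValueError("shift-in without a preceding shift-out")
        if b != SO:
            sbcs.append(b)          # ordinary single-byte character
            i += 1
            continue
        if sbcs:                    # close the SBCS run before the shift-out
            segments.append(("SBCS", bytes(sbcs)))
            sbcs.clear()
        i += 1                      # skip the shift-out byte
        dbcs = bytearray()
        while i < len(data) and data[i] != SI:
            if i + 1 >= len(data):
                raise ValueError("truncated double-byte character")
            dbcs += data[i:i + 2]   # one double-byte character
            i += 2
        if i >= len(data):
            raise ValueError("shift-out without a closing shift-in")
        segments.append(("DBCS", bytes(dbcs)))
        i += 1                      # skip the shift-in byte
    if sbcs:
        segments.append(("SBCS", bytes(sbcs)))
    return segments
```

Given two single-byte characters, a shift-out, two double-byte characters, a
shift-in, and one more single-byte character, the function returns three
runs: SBCS, DBCS, SBCS. The decoder never has to guess the width of a
character, because the shift state tells it.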
--
Regards, Chuck
--
This is the RPG programming on the IBM i (AS/400 and iSeries) (RPG400-L)
mailing list