On Tue, May 20, 2025 at 11:11 PM Barbara Morris <bmorris@xxxxxxxxxx> wrote:

> A CHARACTER(10) does get 10 bytes, possibly fewer characters. The
> length-prefix for a VARCHAR field is measured in bytes.

Thank you, this is what I wanted to find out.

> A rule of thumb I saw once for converting an EBCDIC database field to
> UTF-8 is to triple the length just to be sure.

Wow, that is an engineer's safety margin!
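
For what it's worth, the factor of three is less arbitrary than it looks.
Here is a minimal Python sketch that compares the encoded size of a few
characters. It uses Python's cp1140 codec purely as a stand-in for a
single-byte EBCDIC code page (an assumption for illustration; the CCSID of
a real file may differ), and byte_widths is a hypothetical helper name.

```python
def byte_widths(ch, ebcdic_codec="cp1140"):
    """Return (EBCDIC byte count, UTF-8 byte count) for one character.

    cp1140 is only an example EBCDIC code page (assumption).
    """
    return len(ch.encode(ebcdic_codec)), len(ch.encode("utf-8"))


for ch in ["A", "é", "€"]:
    ebcdic_len, utf8_len = byte_widths(ch)
    print(f"{ch!r}: {ebcdic_len} byte(s) in EBCDIC, {utf8_len} byte(s) in UTF-8")
```

In this code page every character is one byte, while the same characters
take one, two, and three bytes in UTF-8, so tripling covers the worst case
shown here.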

> I'm not expert enough to say how the database handles the different
> character sizes for UTF-8.

Well, if CHARACTER(N) fields are ultimately N bytes, then that says
enough (for my immediate curiosity) about how the database "handles"
UTF-8. Clearly there is an inherent possibility that the number of
*characters* that can be stored is less than what was declared.

I think this would be fairly intuitive for an RPG (or C) programmer.
But for people who "think" in SQL or any other language that largely
abstracts away the raw bytes, it will probably be quite a shock to
discover that their character fields mysteriously drop the last few
characters on occasion.
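
To make the surprise concrete, here is a rough Python sketch of a field that
is sized in bytes rather than characters. The helper fit_to_bytes is
hypothetical, and its policy (drop whole characters that do not fit, then pad
with blanks) is my assumption for illustration, not a statement of what the
database actually does on overflow.

```python
def fit_to_bytes(text, n_bytes):
    """Pack text into a fixed-size UTF-8 field of n_bytes bytes.

    Hypothetical helper: keeps only whole characters that fit,
    then pads the remainder with blanks.
    """
    out = bytearray()
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(out) + len(encoded) > n_bytes:
            break  # the next character would not fit in full
        out += encoded
    return bytes(out.ljust(n_bytes, b" "))


name = "Zoë Müller"                      # 10 characters
print(len(name))                         # character count
print(len(name.encode("utf-8")))         # byte count in UTF-8
print(fit_to_bytes(name, 10).decode("utf-8"))
```

The name is 10 characters but 12 bytes in UTF-8, so a 10-byte field holds
only "Zoë Müll". A pure-ASCII value of the same length fits untouched, which
is why the loss only shows up "on occasion".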

John
