I was wondering why I didn't see the REGEXP_* functions here; the reason is that we don't have the International Components for Unicode (ICU) APIs loaded. So now it makes sense why x'41' matches 'A': x'41' is the Unicode code point for 'A', and REGEXP_LIKE operates on Unicode values. Using the REGEXP_* functions, which appear to assume Field1 is Unicode instead of EBCDIC, could have other unintended consequences, since some Unicode characters take two or more bytes to encode. Or maybe your CCSID is 65535.

And yet that leaves me with even more questions, since the documentation says both the string to be searched and the pattern are converted to UTF-16, and FOR BIT DATA strings are not allowed. It seems that the REGEXP_* functions could be extremely confusing if the data is invalid, and therefore not really a good way to go for correcting that data.
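
For what it's worth, the x'C1' versus x'41' behavior can be reproduced outside the database. The following is only a sketch in Python, assuming that code page 037 stands in for the job CCSID and that Python's re module is an acceptable stand-in for the engine behind REGEXP_LIKE (it is a different engine, so this illustrates the encoding point only). It shows that once the EBCDIC byte for 'A' has been converted to Unicode, the pattern \x41 matches and \xC1 does not.

```python
import re

# Assumption: code page 037 stands in for the real job CCSID.
EBCDIC = "cp037"

# 'A' as it is stored in an EBCDIC character field: a single byte x'C1'.
stored = "A".encode(EBCDIC)

# A Unicode-based regex engine sees the text only after conversion.
text = stored.decode(EBCDIC)

# \x41 in the pattern means the Unicode code point U+0041, which is 'A'.
matches_41 = re.search(r"\x41", text) is not None

# \xC1 in the pattern means U+00C1 (A with acute accent), not EBCDIC 'A'.
matches_c1 = re.search(r"\xC1", text) is not None

print(stored.hex(), matches_41, matches_c1)
```

So the hex escape in the pattern refers to the converted value, not to the byte stored in the file.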

When you look at Field1, does it have an 'A' in there instead of a blank, or is it really an x'41'? Maybe the conversion from your job CCSID to UTF-16 converts x'41' to x'0041'. Or maybe it assumes that a hex constant is already UTF-16 (even if it is only a single byte wide).
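
One way to probe the conversion guess is a small Python sketch, with code page 037 assumed in place of the real job CCSID and with made-up sample bytes. In that code page x'41' is the no-break space, so a conversion to Unicode would turn it into U+00A0 rather than U+0041. If the same holds for the CCSID actually in use, a Unicode-based pattern would have to look for \xA0 to find the stray byte.

```python
import re

# Assumption: code page 037 stands in for the real job CCSID.
EBCDIC = "cp037"

# Hypothetical field content: EBCDIC 'A', the stray x'41', then 'B'.
raw = b"\xc1\x41\xc2"

# Convert to Unicode, as a Unicode-based regex engine would before matching.
text = raw.decode(EBCDIC)

# List the Unicode code points that result from the conversion.
points = [f"U+{ord(ch):04X}" for ch in text]

# The stray byte is now U+00A0, so that is what the pattern must target.
stray = re.findall(r"\xA0", text)

# Replace it with an ordinary space and convert back to EBCDIC.
cleaned = re.sub(r"\xA0", " ", text).encode(EBCDIC)

print(points, len(stray), cleaned.hex())
```

Other CCSIDs may map x'41' differently, so the code point to search for should be checked against the CCSID of the actual field.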

Mark Murphy
STAR BASE Consulting, Inc.
mmurphy@xxxxxxxxxxxxxxx


-----mprice@xxxxxxxxx wrote: -----
To: Midrange Systems Technical Discussion <midrange-l@xxxxxxxxxxxx>
From: mprice@xxxxxxxxx
Date: 02/15/2016 11:23AM
Subject: Re: SQL and Regular expression


When we import data from an external source, we sometimes get a 'bad'
character.

In this particular case I was looking for X'41' (EBCDIC), which gets sent
in place of a ' ' (X'40').

In other words, REGEXP_LIKE(Field1,'\x41') was matching 'A'.

Michael




-----John Yeung <gallium.arsenide@xxxxxxxxx> wrote: -----
To: Midrange Systems Technical Discussion <midrange-l@xxxxxxxxxxxx>
From: John Yeung <gallium.arsenide@xxxxxxxxx>
Sent by: "MIDRANGE-L" <midrange-l-bounces@xxxxxxxxxxxx>
Date: 02/15/2016 10:30 AM
Please respond to: Midrange Systems Technical Discussion <midrange-l@xxxxxxxxxxxx>
Subject: Re: SQL and Regular expression

On Mon, Feb 15, 2016 at 8:34 AM, <mprice@xxxxxxxxx> wrote:
> I was running an SQL statement on a field in a physical file using
> REGEXP_LIKE(Field1,'\xC1') expecting to find 'A'.
>
> After this failed to return the desired results, I discovered that I
> have to use the ASCII equivalent,
>
> i.e. REGEXP_LIKE(Field1,'\x41')
>
> Expected?
> or
> Strange?

To me, SQL is behaving in a way that is expected (or at least not
particularly strange), but what you are trying to do is very strange.
Why are you searching for hex codes?

John Y.

