Re: Weird Characters in table -- MIDRANGE-L

On 09-Nov-2011 11:56 , Ashish Kulkarni wrote:

I am running into weird issue of having some weird characters in
table which are not displayed on any SQL program, for example if I
run query

select * from mycontact where email = 'myemail@xxxxxxxx'; this query
does not return any rows, but if I modify the query like below

select * from mycontact where email like 'myemail@xxxxxxxx%'; it returns
data, so I believe is there is some weird data after ".com", but if I run

select * from mycontact; from 5250 screen or iseries navigator I
don't see any characters after .com

Some "unprintable" characters may not appear as visible when presented at a screen. Most of those are "control characters", some of which can be interpreted as 5250 control sequences. Querying the data on the 5250 device\emulator using the Query/400 report writer [using either WRKQRY\RUNQRY or via STRSQL SELECT output to display] would typically expose the problem characters; possibly with an error about failing to write the data to the screen, and the "bad" characters were replaced by the "dup-character". However there is at least the one other character, code point x'41' which is typically the "required space" character, and that character will be invisible but not as an "unprintable" character. IIRC, running with a Greek setup, the x'41' will appear as a visible glyph.

so how do I find what it is

Using STRSQL or RUNQRY with output to printer when the printer file is defined or overridden to have RPLUNPRT(*YES '!') [or similar, or a better, minimally used character] would allow for a relatively easy F16=Find search within the spooled report.

Generically for all "unprintable" characters, one of the following SELECT statements should assist. Optionally add the x'41' into the long hex string on the TRANSLATE scalar [I had originally written here that the required space character would be *unlikely* as an issue, but a later update by the OP showed that character was at fault for the specific error case; I have adjusted each example to include x'41'].

<code>

select rrn(e), email, hex(email) from mycontact e
where translate( email, x'3F'

,x'000102030405060708090A0B0C0D0E0F101112131415161718191A1B1C1D1E1F202122232425262728292A2B2C2D2E2F303132333435363738393A3B3C3D3E3F41FF'
, x'3F') like '%' concat x'3F' concat '%'

_or perhaps_

with
emailmask (r, email, emailx) as
( select rrn(e), email, trim( trailing x'3F' from
translate(email, x'3F'

,x'000102030405060708090A0B0C0D0E0F101112131415161718191A1B1C1D1E1F202122232425262728292A2B2C2D2E2F303132333435363738393A3B3C3D3E3F41FF'
, x'3F') )
from mycontact e
)
select r, email, hex(substr(email,length(emailx)+1))
from emailmask
where email<>emailx

</code>

and how do i fix this data

While also replacing any embedded control\unprintable characters which may or may not be an issue [but easily detected in a query after the UPDATE], the following should generically strip off unwanted trailing characters. If the data already has the "substitution character" x'3F' in the data, then there is another existing problem, and this /solution/ might obfuscate; so perhaps scanning the data for the x'3F' is worthwhile before, as well as after the UPDATE.?

<code>

update mycontact
set email= rtrim( trim( trailing x'3F' from
translate( email, x'3F'
,x'000102030405060708090A0B0C0D0E0F101112131415161718191A1B1C1D1E1F202122232425262728292A2B2C2D2E2F303132333435363738393A3B3C3D3E3F41FF'
, x'3F'
) /* include x'41' too, perhaps */
)
)

</code>

Regards, Chuck