On 15-Jul-2015 11:07 -0600, Gary Thompson wrote:
I have the following SQL that has been working as expected for about
2 years:
SELECT COALESCE(om8.suptyp,' ')
INTO :wk_OutAsnSts
FROM om01 om1
LEFT JOIN om08 om8
ON om8.outnum = om1.outnum
AND om8.suptyp = 'AS'
AND om8.recsta = '5'
WHERE om1.outnum = :wk_OutNum
The query is intended to return the EDI 856 ASN status of the "Sales
Outlet" specified by "wk_OutNum".
This week we received a problem report from one of our seven
warehouse locations that six outlets did not receive the expected ASN
pallet labels and Document.
This happened on Monday and Tuesday of this week at that one
location, but has yet to be reported by any other warehouse location.
We send 70-90 ASN documents daily.
I have the impression from followup(s) that the row data that should
be selected is static\unchanging; i.e. remains unchanged since a prior
[auto-committed] update performed long before the query runs, and that
the data remains unchanged even until the next run of the query when the
problem did not occur [as noted below].
Probably worth ensuring the query runs with messages_debug active,
and the joblogs are saved to review if the Access Plan was rebuilt
and\or there is a visibly different implementation being logged for the
failing runs as contrasted with the successful runs. The database
monitor feature should assist in that regard too; detailed monitor
against those jobs to catch that query each time it runs. Be sure to
get a Print SQL Information (PRTSQLINF) of the program too; taken after
a successful run.
All location data is processed by one RPGLE program which has the
SQL shown above.
I reviewed all status reports created by this program yesterday and
found six examples where the program reported SQL State = 02000 (no
rows) for an outlet that *is* defined.
Late yesterday, we identified this problem in time to repeat the
procedure that runs the query, and then successfully created and
transmitted the required ASN documents.
Perhaps the Access Plan changed, and the new plan runs a code path
that is incorrectly unable to find the record; i.e. perhaps there is a
defect in the run-time implementing the query with an alternate AP. If
the change date of the program is updated, then the changed\defective
plan or the original\functional plan presumably has been stored; the
Print SQL Information (PRTSQLINF) would record some details of the plan.
If the plan changes for the failing queries, that would support the
idea of a possible defect and either the maintenance is not applied or
the problem is not yet diagnosed. The plan(s) for the statement could
exist only in the cache; e.g. if the user(s) running the program are not
authorized to update the program [associated space (PAS)] to store the
updated plan.
There was no information given about the host variable declaration
(wk_OutNum) or the value of that variable at run-time, nor the DDL for
the column being compared (outnum columns) nor keys\index\constraints.
Note that seemingly unpredictable results can be /normal/ in some
cases. If the data is VARCHAR and the physical data is no properly
trimmed, there is a known issue [though there may be a QAQQINI feature
that might have a toggle to adjust in some way] whereby comparisons via
a keyed Access Path may operate different than non-key compares;
untrimmed and trimmed values [for trailing blanks] may compare equal or
mis-compare when the RTRIMmed values are equivalent. Similar issues may
affect host variable compares. Similarly unpredictable (mis)compares
may be possible for zoned decimal with improper zone portions of the
numerals; i.e. different results for different implementation, keyed vs
non-keyed. If the Sort Sequence (SRTSEQ) is dynamic, then data issues
like capitalization [upper\lower case] might effect comparison
differences for different submissions; e.g. submitted later in response
to a prior failure, that Work Management setup might be different than
the typical WM setup for running the query, and thus the resubmissions
function as expected whereas the others submissions could be properly
failing to find the row.
Repeating the query now also returns the expected "row found"
condition.
Presumably that intends to suggest sqlcode='00000'; arguably, a lack
of a condition\class code indicating a lack of any identifiable issue
with the operation of the SQL statement.?
My question is what, other than an actual "no row" could cause SQL
State = 02000 ?
Only the condition that the row was not found, according to the
specified selection for the SELECT INTO, should be identified by the
SqlState [with Class Code 02 meaning "No Data"] of '02000' with the
associated SQLCODE=100.
Because I often see CPU well above 100%, I suspect that some kind of
"time-out" may cause this ?
Almost surely, No. A timeout should be reported as an SQLSTATE that
reflects whatever was the timeout condition.
If the effect or implementation for the query changes such that the
result is incorrect [perhaps due to environment such as available CPU or
memory], then the effect is clearly a defect for apparently static row data.
As an Amazon Associate we earn from qualifying purchases.