On Sat, 2016-03-12 at 11:32 +0200, Gad Miron wrote:
Hello guys

An SQL question, I hope this is the right form

adding a field (NMDSTTYP) to a 50M records DWH file I pondered what is a
better way to populate this new field:

By a RPG-Chain-like way using a subselct
update DWHD.RTDACTBF_R a
set NMDSTTYP = (select DSTTYP from NMLIBD.NMCOMPNYF b
where b.COMPNY = a.COMPNY)

Or

By doing a MERG with a When Matched clause
(imitating MS SQL Server Update from) :
MERGE INTO DWHD.RTDACTBF_R a
USING NMLIBD.NMCOMPNYF b on b.COMPNY = a.COMPNY
WHEN MATCHED THEN
UPDATE SET a.NMDSTTYP =b.DSTTYP

I figured the 2nd way is better as it should do a one time join
compared to the 1st way that does (well, I don't quite know what it does)

both SQL statements were submitted to batch using SBMJOB RUNSQL with
COMMIT(*NONE)

The 1st RPG-Chain-like Update took 10 minutes
The 2nd MERG Update took 4 hours

anybody care to explain?

I'm guessing a number of things come into play. The first thing to do
would be to get the SQL to explain what its doing, such as "checking
indexes", "index found", "choosing best option" and so on. At least then
you'd know what its chosen and what its rejected. Sadly I can't help on
how to do this step as my knowledge of SQL is lacking; I only know that
most SQL systems allow a method of seeing what it does as a way of
analysing if additional indexes or ways of presenting the SQL might
improve things.

If I was the SQL in the middle, I would do the following:-
1) Process both the files in the CMPNY key order, as you don't say it
must be done in any specific order.
2) Read a NMCOMPNYF record, and save the value of DSTTYP.
3) Read all RTDACTBF records for the key and update NMDSTTYP with stored
DSTTYP.
4) Go back to 2 until done.

or
1) same.
2) Read RTDACTBF
3) On key change only, get NMCOMPNYF and store DSTTYP
4) Update NMDSTTYP with stored DSTTYP
5) Go back to 2 until done.

Either way, its a many update to few gets - really fast.

I don't know how a "merge" works, but I can't see why it would do it
differently to the first one unless one (the first) is making an
assumption that "no records within NMCOMPNYF will/are allowed to be
changed during the run" (repeatable read, kinda) and the other is saying
"always use the latest value in NMCOMPNYF" so is having to go back and
re-get the record for each record read in RTDACTBF.

Then again, it could be doing something totally different such are the
joys of SQL!


Thanks
Gad



As an Amazon Associate we earn from qualifying purchases.

This thread ...

Replies:

Follow On AppleNews
Return to Archive home page | Return to MIDRANGE.COM home page

This mailing list archive is Copyright 1997-2024 by midrange.com and David Gibbs as a compilation work. Use of the archive is restricted to research of a business or technical nature. Any other uses are prohibited. Full details are available on our policy page. If you have questions about this, please contact [javascript protected email address].

Operating expenses for this site are earned using the Amazon Associate program and Google Adsense.