This doesn't address your problem specifically, but if you need a switch
I'd recommend CxTec as a possible source: used equipment at good prices
with a "lifetime" warranty. You'll save a lot of money. www.cxtec.com
elehti@xxxxxxxxxxxxxxxxxx 08/25/2009 1:02:23 PM >>>
Recommendations, anyone?
We know we need to purchase a new network switch for $20,000 because our
switch gets overloaded with traffic every Monday morning at 7:38 A.M. and
occasionally at random intervals during the regular work day. Really
aggravating.
Symptom #1:
Our web server at IP address 10.0.0.39 loses its connection to the IBM i
at about 13:32.
+++++++++++++++++++++
Symptom #2:
During the period that the connection to the web server at 10.0.0.39 is
interrupted, the 15 or so QZDASOINIT jobs on the IBM i go crazy,
growing to 130 or more QZDASOINIT jobs.
We learn about this only because users call telling us that response
time has degraded for jobs on the IBM i and on the web server.
(We are still on V5R3M5 for another 4 weeks, and then will upgrade to
POWER 6 and V5R4M5.)
Step 1. I issue the command DSPACTPJ SBS(QUSRWRK) PGM(QZDASOINIT) to
get a bird's-eye view of things.
Step 2. ENDHOSTSVR SERVER(*DATABASE)
Step 3. ENDPJ SBS(QUSRWRK) PGM(QZDASOINIT) OPTION(*IMMED)
Step 4. Wait 30 seconds for QZDASOINIT jobs to end. I monitor with:
wrkactjob sbs(qusrwrk)
Step 5. STRHOSTSVR SERVER(*DATABASE) to get things going again.
+++++++++++++++++++++
Here is some detail from today's occurrence.
08/25/09 13:32:26.083920 QMHGSD QSYS 0748 QCMD QSYS
Message . . . . : -ENDHOSTSVR SERVER(*DATABASE)
08/25/09 13:32:35.097808 QMHGSD QSYS 0748 QCMD QSYS
Message . . . . : -ENDPJ SBS(QUSRWRK) PGM(QZDASOINIT) OPTION(*IMMED)
00 08/25/09 13:32:35.249968 QWTCCEPJ QSYS 01A6 QCMD QSYS
Message . . . . : End of prestart jobs in progress.
Message ID . . . . . . : CPF0920
Date sent . . . . . . : 08/25/09 Time sent . . . . . . : 13:32:35
Message . . . . : All prestart jobs are ending for program QZDASOINIT in QSYS.
Cause . . . . . : All prestart jobs for program QZDASOINIT in library QSYS
in subsystem QUSRWRK are ending for reason 1. See reason 1 shown below:
1 - The End Prestart Job (ENDPJ) command was entered.
2 - An error occurred when new jobs were being started.
For reason 2, display the job log (DSPJOBLOG command) for the subsystem to
determine the cause of the error.
When all the prestart jobs have ended, message CPC0905 will appear in the
subsystem job log and the system operator message queue.
Recovery . . . : To start jobs, display the system operator (QSYSOPR)
message queue (DSPMSG command). After message CPC0905 appears in the message
Job Log SAMSON 08/25/09 14:01:49
QPADEV003H User . . . . . . : ELEHTI Number . . . . . . . . . . . : 62
ELEHTI Library . . . . . : QGPL
SEV DATE TIME FROM PGM LIBRARY INST TO PGM LIBRARY
Cause . . . . . : The prestart jobs for program QZDASOINIT in library QSYS
in subsystem QUSRWRK are in the process of being ended.
08/25/09 13:33:01.879488 QMHGSD QSYS 0748 QCMD QSYS
Message . . . . : -wrkactjob sbs(qusrwrk)
08/25/09 13:33:28.966936 QMHGSD QSYS 0748 QCMD QSYS
Message . . . . : -STRHOSTSVR SERVER(*DATABASE)
00 08/25/09 13:33:29.024400 QWTCCSPJ QSYS 01E6 QC2SYS QSYS
To module . . . . . . . . . : QC2SYS
To procedure . . . . . . . : system
Statement . . . . . . . . . : 6
Message . . . . : Start of prestart jobs in progress.
Cause . . . . . : The prestart jobs for program QZDASOINIT in library QSYS
in subsystem QUSRWRK are being started.
30 08/25/09 13:33:29.365800 QWDMMSG QSYS 0117 QZSOSVCT QSYS
To module . . . . . . . . . : QZSOSVCT
To procedure . . . . . . . : QzsoAddRoutingTableEnts
Statement . . . . . . . . . : 17
Message . . . . : Routing entry sequence number 600 already exists.
Cause . . . . . : One of the following errors occurred: -- The routing entry
with sequence number 600 already exists. -- The sequence number (SEQNBR
parameter) was not specified correctly. Recovery . . . : Omit the
command, or change the sequence number (SEQNBR parameter) and then try the
command again. To change the routing entry, use the CHGRTGE command.
Job 619183/QUSER/QZDASRVSD ended on 08/25/09 at 13:32:29; 1 seconds used; en
All prestart jobs are ending for program QZDASOINIT in QSYS.
Job 621491/QUSER/QZDASOINIT ended on 08/25/09 at 13:32:35; 1 seconds used; e
Job 621469/QUSER/QZDASOINIT ended on 08/25/09 at 13:32:35; 1 seconds used; e
Job 621461/QUSER/QZDASOINIT ended on 08/25/09 at 13:32:35; 1 seconds used; e
Job 621450/QUSER/QZDASOINIT ended on 08/25/09 at 13:32:35; 1 seconds used; e
Job 621475/QUSER/QZDASOINIT ended on 08/25/09 at 13:32:35; 1 seconds used; e
Job 621439/QUSER/QZDASOINIT ended on 08/25/09 at 13:32:36; 1 seconds used; e
Job 621456/QUSER/QZDASOINIT ended on 08/25/09 at 13:32:36; 1 seconds used; e
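To see how many jobs ended at each second, the job-end lines above can be
tallied with a short script. This is only a rough sketch in Python: the
function name count_ended_jobs and the regular expression are my own
assumptions based on the lines as pasted here, not anything supplied by IBM.

```python
import re
from collections import Counter

# Matches the job-end lines as they appear in this post, e.g.
#   Job 621491/QUSER/QZDASOINIT ended on 08/25/09 at 13:32:35; ...
# Only the part up to the time is matched, so truncated tails are ignored.
JOB_END = re.compile(
    r"Job (?P<number>\d{6})/(?P<user>[^/\s]+)/(?P<name>\S+) "
    r"ended on (?P<date>\d{2}/\d{2}/\d{2}) at (?P<time>\d{2}:\d{2}:\d{2})"
)

def count_ended_jobs(log_text):
    """Return a Counter keyed by (job name, end time)."""
    counts = Counter()
    for match in JOB_END.finditer(log_text):
        counts[(match["name"], match["time"])] += 1
    return counts

# A few of the lines from the job log above, used as sample input.
sample = """\
Job 619183/QUSER/QZDASRVSD ended on 08/25/09 at 13:32:29; 1 seconds used; en
Job 621491/QUSER/QZDASOINIT ended on 08/25/09 at 13:32:35; 1 seconds used; e
Job 621469/QUSER/QZDASOINIT ended on 08/25/09 at 13:32:35; 1 seconds used; e
Job 621439/QUSER/QZDASOINIT ended on 08/25/09 at 13:32:36; 1 seconds used; e
"""

for (name, time), n in sorted(count_ended_jobs(sample).items()):
    print(f"{time}  {name:<12} {n}")
```

Run against a full spooled job log saved as text, this would give a quick
count of how many QZDASOINIT jobs were ended by the ENDPJ.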
And this message finally appears 10 minutes after the initial problem
began:
Message ID . . . . . . : TCP2617
Date sent . . . . . . : 08/25/09 Time sent . . . . . . : 13:42:35
Message . . . . : TCP/IP connection to remote system 10.0.0.39 closed,
reason code 1.
Cause . . . . . : The TCP/IP connection to remote system 10.0.0.39 has been
closed. The connection was closed for reason code 1. Full connection
details for the closed connection include:
- local IP address is 10.0.0.5
- local port is 8471
- remote IP address is 10.0.0.39
- remote port is 1601
Reason codes and their meanings follow:
1 = TCP connection closed due to expiration of 10 minute FINWAIT2 timer.
2 = TCP connection closed due to R2 retry threshold being run.
More...
From job . . . . . . . . . . . : QTCPIP
User . . . . . . . . . . . . : QTCP
Number . . . . . . . . . . . : 613770
From program . . . . . . . . . : QTOCTCPI
Instruction . . . . . . . . : 0764
To message queue . . . . . . . : QSYSOPR
Library . . . . . . . . . . : QSYS
Time sent . . . . . . . . . . : 13:42:35.562688
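Since reason code 1 is the 10 minute FINWAIT2 timer, subtracting 10 minutes
from the time TCP2617 was sent gives an estimate of when the timer started.
A small Python sketch of that arithmetic follows; the function name
finwait2_start is made up for illustration, and the 10 minute value is taken
from the message text above rather than from any system setting I checked.

```python
from datetime import datetime, timedelta

# Timer length as stated in the TCP2617 reason code 1 text.
FINWAIT2_TIMEOUT = timedelta(minutes=10)

def finwait2_start(closed_at, fmt="%m/%d/%y %H:%M:%S"):
    """Estimate when the FINWAIT2 timer started for a connection
    that TCP2617 reported closed at closed_at."""
    return datetime.strptime(closed_at, fmt) - FINWAIT2_TIMEOUT

start = finwait2_start("08/25/09 13:42:35")
print(start.strftime("%H:%M:%S"))  # 13:32:35
```

That lands on 13:32:35, the same second the ENDPJ with OPTION(*IMMED)
appears in the job log. That is only an inference from the timestamps, but
it may mean this particular message reflects the restart of the prestart
jobs rather than the original interruption.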