Below is an update from IBM.
For the archives.
I've staggered my DUPMEDBRM start times by 5 minutes, which should avoid the issue; the issue itself is probably not fixable.
- When an operation needing a tape resource runs and needs a tape drive,
the tape code will attempt to reserve a tape drive. If it cannot get a
reservation on its first choice it will try again to reserve another
drive. It will do this repeatedly until it gets a reservation or until
all drive resources have been tried. Assuming a drive is found that is
available, a 'reservation' is put on the tape drive resource. This
reservation exists on the tape drive itself and is assigned to a
specific adapter port by World Wide Port Name (WWPN). This reservation
will be persistent until the host releases the reservation. If anything
stops the host from releasing the reservation the drive will remain
reserved and can only be used by that specific adapter port.

- Once a drive is reserved and a tape needs to be mounted, tape code
will check to see if the tape is already mounted in another drive. If
the tape is mounted in another drive, the system running the operation
needing a tape resource will attempt to reserve the tape drive that the
tape is mounted in. If it gets the reservation, then for a short period
of time the job will have caused two drives to be reserved. The first
drive will be released shortly; however, if multiple jobs that need
drives are starting at the same time, it is possible that each job could
reserve multiple drives for short periods of time and some job may not
be able to get a drive. Operations requiring more than one tape resource,
such as DUPMEDBRM or DUPTAP, can magnify this issue. Slightly staggering
job start times can reduce this possibility.

- Different types of errors (user, job, device) can cause a reservation
to be left hanging on a drive. Once this happens, only the system using
the adapter port with the proper WWPN will be able to use the tape drive
or release the reservation.

- Although it may not be common, changes to the fabric can make it
impossible for a system to release a reservation. For example:

  Job runs and TAP01 is reserved
  Job ends abnormally and does not release reservation
  Fibre cable is moved to a different adapter port

At this point there is no way any of the host systems can release this
reservation...
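
The contention described above, where jobs that start together briefly over-reserve drives and leave another job without one, can be illustrated with a small model. This is only a sketch and not IBM's tape code: the `Job` fields, the four-drive library, the single extra drive held for two time units, and the `simulate` function are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    start: int          # start time, arbitrary time units
    need: int = 2       # drives needed for the whole run (e.g. source + target)
    extra: int = 1      # drives briefly over-reserved at startup (assumption)
    hold: int = 2       # how long the extra reservation lasts (assumption)
    duration: int = 60  # how long the needed drives stay reserved (assumption)


def simulate(jobs, total_drives):
    """Return the names of jobs that could not reserve the drives they need."""
    reservations = []  # list of (release_time, drive_count)
    failed = []
    for job in sorted(jobs, key=lambda j: j.start):
        # Count drives still reserved at the moment this job starts.
        in_use = sum(count for release, count in reservations if release > job.start)
        free = total_drives - in_use
        if free < job.need:
            failed.append(job.name)
            continue
        reservations.append((job.start + job.duration, job.need))
        # The short-lived extra reservation is taken only if a drive is free.
        transient = min(job.extra, free - job.need)
        if transient:
            reservations.append((job.start + job.hold, transient))
    return failed


# Hypothetical scenario: two LPARs sharing a 4-drive library.
simultaneous = [Job("LPAR1", start=0), Job("LPAR2", start=0)]
staggered = [Job("LPAR1", start=0), Job("LPAR2", start=5)]

print(simulate(simultaneous, total_drives=4))  # ['LPAR2']
print(simulate(staggered, total_drives=4))     # []
```

With these assumed numbers, two jobs starting at the same moment leave the second one short of a drive, while a 5-unit stagger lets both run. A real job may wait and retry rather than fail outright, which this model ignores.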
                                                                        
=============== KMM Information ================                        
Date: 15/04/16                                                          
Time: 1302                                                              
Description: CPS Discussion Item                                        
Action: Used                                                            
Contributed: Yes                                                        
Content Source: CPS                                                     
Content: 9UNPUH                                                         
================================================                     
Paul
-----Original Message-----
From: MIDRANGE-L [mailto:midrange-l-bounces@xxxxxxxxxxxx] On Behalf Of Steinmetz, Paul
Sent: Thursday, April 02, 2015 2:54 PM
To: 'Midrange Systems Technical Discussion'
Subject: RE: CPP6316 - Hardware configuration change detected, followed by CPF414C - Command not allowed
IBM support and development both agree that this is a timing issue.
It's difficult to obtain the required Tape Flight Recorders, which development needs to research and possibly resolve the issue.
It was suggested to change or lengthen the amount of time BRMS is waiting for TAPMLB01.
By default, the TAPMLB01 device description wait times are set to *JOB.
Initial mount wait time  . . . . . :   *JOB
End of volume mount wait time  . . :   *JOB
I don't want to change the TAPMLB01 device description; this setting could get lost on recreates.
The BRMS DUPMEDBRM job wait time is derived from the class for the subsystem in which the job is running, which is currently 30 seconds.
 Default wait time in seconds  . . . . . . . . . . :   30
Increasing the wait time for this class could impact other processes.
How else could I increase the BRMS DUPMEDBRM default wait time of 30 seconds, to say 60 or 120?
Checking all my classes, most set to 30.
I did find class QBATCH, set to 120, but I don't think this is being used.
Decades ago, we created our own versions of QINTER and QBATCH.
I'm not sure why, but all our batch subsystems are running at 30, not 120.
A routing entry for the subsystem determines which Class is used.
I could create a new class for BRMS only and add a new routing entry to the subsystem, so that only BRMS jobs would use the class with the longer default wait time.
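
As a rough model of that idea: a subsystem checks a job's routing data against its routing entries in ascending sequence-number order, and the first matching entry supplies the class. The Python sketch below only imitates that selection logic. The class names `BATCHCLS` and `BRMSCLS`, the compare value `BRMSDUP`, the sequence numbers, and the `select_class` helper are all hypothetical, and the BRMS job would have to carry matching routing data (for example through its job description or the RTGDTA parameter of SBMJOB) for the new entry to apply.

```python
from dataclasses import dataclass


@dataclass
class RoutingEntry:
    seqnbr: int     # routing entry sequence number
    cmpval: str     # compare value, or "*ANY" to match everything
    cls: str        # class used when this entry matches
    start: int = 1  # 1-based starting position within the routing data


def select_class(entries, routing_data):
    """Return the class of the first routing entry that matches, lowest seqnbr first."""
    for entry in sorted(entries, key=lambda e: e.seqnbr):
        if entry.cmpval == "*ANY":
            return entry.cls
        begin = entry.start - 1
        if routing_data[begin:begin + len(entry.cmpval)] == entry.cmpval:
            return entry.cls
    return None


# Hypothetical classes and their default wait times in seconds.
DFTWAIT = {"BATCHCLS": 30, "BRMSCLS": 120}

entries = [
    RoutingEntry(seqnbr=9999, cmpval="*ANY", cls="BATCHCLS"),   # existing catch-all
    RoutingEntry(seqnbr=50, cmpval="BRMSDUP", cls="BRMSCLS"),   # new BRMS-only entry
]

print(DFTWAIT[select_class(entries, "BRMSDUP")])  # 120
print(DFTWAIT[select_class(entries, "QCMDB")])    # 30
```

Because the BRMS-only entry has the lower sequence number, it is tested before the catch-all, so other jobs in the subsystem keep the 30-second class.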
What are others' class default wait times set to?
Your thoughts?
WRKCLS CLS(*ALL/*ALL)
Paul
-----Original Message-----
From: MIDRANGE-L [mailto:midrange-l-bounces@xxxxxxxxxxxx] On Behalf Of rob@xxxxxxxxx
Sent: Wednesday, April 01, 2015 4:01 PM
To: Midrange Systems Technical Discussion
Subject: RE: CPP6316 - Hardware configuration change detected, followed by CPF414C - Command not allowed
In the tape library there is an operation (via the web interface) to download some tape logs.  Normally this is something that I collect and ship off to IBM instead of look at myself.  IDK if you can look through these and see something like:  too many concurrent tape drive operations.
But, seriously, auto tape cleaning was blowing up our backups.  It would actually stop during the middle of the backup, eject the backup tape, and do a cleaning.  It would be nice if it could do this and then continue on with the backup but that was NOT our experience.  There's a difference between "hey a cleaning would be nice around now" and "cough, gag, I'm aborting the backup".  Sort of like the difference between a warning and a hard halt.
I know you said that the previous time cleaning was not an issue, but perhaps there was a resource conflict due to too many simultaneous backups.
Rob Berendt
--
IBM Certified System Administrator - IBM i 6.1
Group Dekko
Dept 1600
Mail to:  2505 Dekko Drive
          Garrett, IN 46738
Ship to:  Dock 108
          6928N 400E
          Kendallville, IN 46755
http://www.dekko.com
From:   "Steinmetz, Paul" <PSteinmetz@xxxxxxxxxx>
To:     "'Midrange Systems Technical Discussion'" <midrange-l@xxxxxxxxxxxx>
Date:   04/01/2015 03:40 PM
Subject:        RE: CPP6316 - Hardware configuration change detected, followed by CPF414C - Command not allowed
Sent by:        "MIDRANGE-L" <midrange-l-bounces@xxxxxxxxxxxx>
Rob,
Auto clean works like a champ, once it was properly configured: proper slot, correct bar code label, etc.
The Jan failure did not include an auto clean, but still same failure.
Paul
-----Original Message-----
From: MIDRANGE-L [mailto:midrange-l-bounces@xxxxxxxxxxxx] On Behalf Of rob@xxxxxxxxx
Sent: Wednesday, April 01, 2015 3:22 PM
To: Midrange Systems Technical Discussion
Subject: RE: CPP6316 - Hardware configuration change detected, followed by CPF414C - Command not allowed
CPP6316 suggests doing an IOP reset because a configuration change may have occurred without doing an IOP reset.  Or it just may be a hardware issue.
We do not autoclean.  And I think that was because of issues like this. We just do a manual clean every so many weeks.
You could add more tape drives, we have 10 in one library and four in another.
Rob Berendt
From:   "Steinmetz, Paul" <PSteinmetz@xxxxxxxxxx>
To:     "'Midrange Systems Technical Discussion'" <midrange-l@xxxxxxxxxxxx>
Date:   04/01/2015 03:07 PM
Subject:        RE: CPP6316 - Hardware configuration change detected, followed by CPF414C - Command not allowed
Sent by:        "MIDRANGE-L" <midrange-l-bounces@xxxxxxxxxxxx>
Rob,
I had a similar issue once when a drive was replaced and then the new 
drive didn't report in properly.
This is not the case.
I think this is a tape library timing issue, where, if the library does 
not complete its request in the allotted time, a failure will occur.
In this case, I think a request was made to mount a volume, the library 
was busy with other processing, so the mount did not occur in the allotted 
time, thus a failure occurred.
Your thoughts, how and who should this be reported to?
Paul
-----Original Message-----
From: MIDRANGE-L [mailto:midrange-l-bounces@xxxxxxxxxxxx] On Behalf Of rob@xxxxxxxxx
Sent: Wednesday, April 01, 2015 2:13 PM
To: Midrange Systems Technical Discussion
Subject: RE: CPP6316 - Hardware configuration change detected, followed by CPF414C - Command not allowed
I just replaced one of the LTO4HH drives in one of our libraries.  Would 
often drop connection well into a save.
When they replaced the drive I had some 'fun'.  The serial number of the 
drive (not the library) changed.
Drive is controlled by VIOS.  That VIOS lpar does the fiber tape drive.
WRKMLBSTS and vary off for all lpars of IBM i.
WRKCFGSTS *DEV TAP* and vary off for all lpars of IBM i.
Replace drive.
Bounce VIOS lpar.  (Probably could have done 'iop reset' if I knew VIOS
better.)
WRKHDWRSC *STG
http://mytaplib, record serial number of drives.
STRSST, delete old resource for replaced drive.  Rename resource of new
drive to that of replaced drive.  I'm OCD on this as I like the resource
names to match the drive names.  And I like them to be consistent across
all lpars.
WRKMLBSTS, ensure that it is still varied off.
WRKCFGSTS *DEV TAP* on just one lpar.  Vary on replaced tape drive (not
library).  Use 'itdt' to stress test drive.  Passed.  Vary off TAP* devices.
WRKMLBSTS and vary it on.
Make sure that only the replaced drive is 'allocated unprotected'.  Do
long extensive BRMS save.  When complete, expire the tape.
WRKMLBSTS for all lpars and make sure library is ready and drives are all
allocated unprotected.
Rob Berendt
From:   "Steinmetz, Paul" <PSteinmetz@xxxxxxxxxx>
To:     "'Midrange Systems Technical Discussion'" <midrange-l@xxxxxxxxxxxx>
Date:   04/01/2015 01:54 PM
Subject:        RE: CPP6316 - Hardware configuration change detected, followed by CPF414C - Command not allowed
Sent by:        "MIDRANGE-L" <midrange-l-bounces@xxxxxxxxxxxx>
I had a repeat of this issue/failure.
We had the exact same series of messages, (from our 2/1 failure) along 
with a DUPMEDBRM failure, different LPAR.
There was also a drive auto clean request issued 2:31 am, may or may not 
be related.
So the library was busy moving volumes for 4 drives, 2 lpars, between 2:30 
and 2:32 am.
CPP6316 - Hardware configuration change detected. 
BRM4138 - Media duplication completed with errors.
CPF414C - Command not allowed 
I think the issue is related to month end tape processing.
Both LPARs were in the process of starting a DUPMEDBRM within the same
minute, 2:30 am.
Unfortunately no drive dumps obtained.

                                    Serial        Resource
Name        Type    Model           Number        Name
TAPMLB01    3573    040             04-7808387    TAPMLB01

Log ID  . . . . . . . . . :   800C83F7   Sequence . . . . . . . :   1641240
Date  . . . . . . . . . . :   04/01/15   Time . . . . . . . . . :   02:32:03
Reference code  . . . . . :   9220       Secondary code . . . . :   00000000
Table ID  . . . . . . . . :   94290310   IPL source/state . . . :   B/3
System Ref Code . . . . . :   94299220
Server of origin  . . . . :   8205-E6C 10-5815R
Class . . . . . . . . . . :   Permanent
Hardware configuration change detected.

Below is IBM's PMR explanation from the 2/1 failure.
"The issue is that the library reported it was offline in the middle of a 
function. The question is why. There is nothing from the IBM i that would 
account for this.
The User command was running, we were processing a mount function, the
library was taking commands and sending data, we were able to identify
the location of the desired tape, we had the tape device, but when we
issued the move command the library returned offline. The library for some
reason decided to respond offline, which then leads to the PAL and the
CPF414C.
We know the PAL is not the correct one. The SK 2/0412 error was mapped to 
a hardware configure error and it should be mapped to a library state 
change. But in either case the end result would be the same... CPF414C 
library not in library mode. We have this PAL change documented.
So if you or your staff can not account for a reason why the library would 
have been offline, then we would need the library and drive  logs 
collected at the time of the failure to provide why the library responded 
the way it did.
If this issue can be reproduced we will need the following collected at 
the time of failure....
 Call QTADMPDV TAPMLBxx
 Library and drive logs captured from the Library GUI Service functions 
(Tape Support would assist with the operation if needed)"
Anyone experience something similar?
Any thoughts on which support to contact?
Paul
-----Original Message-----
From: Steinmetz, Paul 
Sent: Monday, February 02, 2015 1:26 PM
To: 'Midrange Systems Technical Discussion'
Subject: CPP6316 - Hardware configuration change detected, followed by CPF414C - Command not allowed
I had this error occur over the weekend, during a DUPMEDBRM.
The dup failed, and so did the recovery.
TAPMLB01 then recovered with no operator intervention.
The following DUPMEDBRM runs and saves were successful.
Has anyone ever experienced this issue?
Message ID . . . . . . :   CPP6316
Date sent  . . . . . . :   02/01/15      Time sent  . . . . . . :   06:10:20

Message . . . . :   Hardware configuration change detected.

Cause . . . . . :   Device *N has reported an error that indicates that the
  configuration may have been changed without the IOP being reset. There may
  also be a hardware problem with the device.
Recovery  . . . :
    Reset the IOP.  If the problem continues press F14 to work with the
  problem.

Technical description . . . . . . . . :
    IOP resource  . . . . . . . . . . . . : CMB03
    IOA resource  . . . . . . . . . . . . : DC02
    Device type . . . . . . . . . . . . . : 3573
    Reference code  . . . . . . . . . . . : X'9220'
    Error log ID  . . . . . . . . . . . . : X'8021F782'
    Problem log ID  . . . . . . . . . . . : 1503221191

Message ID . . . . . . :   CPF414C
Date sent  . . . . . . :   02/01/15      Time sent  . . . . . . :   06:10:21

Message . . . . :   Command not allowed

Cause . . . . . :   Library device TAPMLB01 is not in library mode.
Recovery  . . . :   Switch library device TAPMLB01 to library mode and retry
  the operation.

 Thank You
_____
Paul Steinmetz 
IBM i Systems Administrator 
Pencor Services, Inc. 
462 Delaware Ave 
Palmerton Pa 18071 
610-826-9117 work 
610-826-9188 fax 
610-349-0913 cell 
610-377-6012 home 
psteinmetz@xxxxxxxxxx 
http://www.pencor.com/