TS014956323
The problem has been seen on z/OS 2.4, z/OS 2.5, and z/OS 3.1, with either SVC or LINKAGE=SYSTEM DEQ. It takes about 4 to 15 runs to recreate in batch REXX/USS.
The recreation occurs with PGM=IKJEFT1A and we are executing an authorized function using REXX.
The system uses only IBM’s GRS, configured as a star complex.
REXX snippet (full REXX, JCL, and SVC dump to follow):
env.0=1
env.1="_BPX_SHAREAS=MUST"
stdin.0=0
cmd = 'sha256 //DD:DASHA '
bpxrc=bpxwunix(cmd,stdin.,stdout.,stderr.,env.);
Description:
Our SSI end-of-task call, SSOBFUNC=SSOBEOT, issues a series of ENQ/DEQ services as it progresses through its logic. The QNAME/RNAME pair is devised to serialize certain processes at task termination for an ASID: the code obtains the ENQ, performs some processing, and then issues the DEQ. As expected, if the DEQ does not release the resource, other terminating tasks that attempt to obtain the resource wait indefinitely until the DEQ releases it. As an experiment, we reset TCBAREQ in one TCB, and the hung batch TSO REXX step then completed normally.
During our research, a ‘D GRS,C’ shows the owning task and the waiting task. In the SUMM FORMAT output we can see that the TCB identified as owning the resource has issued the DEQ, but the DEQ does not appear to have completed. We still see the SVRB with a WLIC of x’00020030’, and by looking at our own working storage we can see that the instruction following the DEQ has not executed.
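For readers less familiar with the WLIC values quoted in this report, the following REXX sketch splits each value into its parts. It assumes the usual RB layout in which the second byte is the instruction length and the low-order halfword is the interrupt code, which for an SVRB is the SVC number (48 for DEQ, 56 for ENQ). The variable names are illustrative only and are not part of our EOT code.

```rexx
/* REXX - sketch: decode the WLIC values quoted in this report        */
/* Assumption: byte 1 = instruction length, bytes 2-3 = interrupt     */
/* code (the SVC number for an SVRB). Variable names are illustrative.*/
wlics = '00020030 00020038'
do i = 1 to words(wlics)
  w   = word(wlics, i)
  ilc = x2d(substr(w, 3, 2))      /* instruction length in bytes      */
  svc = x2d(substr(w, 5, 4))      /* SVC number in decimal            */
  select
    when svc = 48 then name = 'DEQ'
    when svc = 56 then name = 'ENQ'
    otherwise name = 'not decoded in this sketch'
  end
  say 'WLIC' w': ILC='ilc 'SVC='svc '('name')'
end
```

Run as written, this reports SVC 48 (DEQ) for x’00020030’ and SVC 56 (ENQ) for x’00020038’, which matches the owner and waiter roles shown by D GRS,C.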
We also observed that TCBAREQ is on in one or more TCBs in the dump. Since we can intermittently recreate the situation, Pat modified the EOT code to capture TCBNDSP3 prior to issuing the DEQ; the TCBAREQ bit is not on prior to the DEQ. We believe the mixture of authorized and unauthorized tasks executing in the batch TSO address space is related to the hang. It seems possible that our DEQ is not allowed to complete because the task is not dispatchable.
The task waiting for the resource, identified by its WLIC of x’00020038’ and by the D GRS,C command, did not have TCBAREQ on. Below are snippets from a dump that was taken a few seconds after we recognized that a hang had occurred.
Information from the SUMM FORMAT relating to the TCB with the WLIC of x’00020030’ (DEQ)
Task non-dispatchability flags from TCBFLGS5:
Secondary non-dispatchability indicator
Task non-dispatchability flags from TCBNDSP2:
SVC Dump is executing for another task
Task non-dispatchability flags from TCBNDSP3:
TSO authorized request processing active
Task flags from TCBFLGS8:
Task is terminating and its mainline processing will not run again (TCBDYING)
Task is terminating and its mainline processing will not run again (TCBENDNG)
Information from the SUMM FORMAT relating to the TCB with the WLIC of x’00020038’ (ENQ)
Task non-dispatchability flags from TCBFLGS4:
Top RB is in a wait
Task non-dispatchability flags from TCBFLGS5:
Secondary non-dispatchability indicator
Task non-dispatchability flags from TCBNDSP2:
SVC Dump is executing for another task
Task flags from TCBFLGS8:
Task is terminating and its mainline processing will not run again (TCBDYING)
Task is terminating and its mainline processing will not run again (TCBENDNG)
As mentioned above, we are using IBM GRS in star mode.