This repository has been archived by the owner on Jan 31, 2022. It is now read-only.

Feature Request: HwVFAT::biasAllVFATs(...) throws exception if RPC Error Code != 0 #295

Closed
1 of 2 tasks
bdorney opened this issue Jul 9, 2019 · 4 comments

Comments

@bdorney
Contributor

bdorney commented Jul 9, 2019

Brief summary of issue

Would like HwVFAT::biasAllVFATs to check the RPC response and raise an exception if it is nonzero.

Target branch is generic-AMC-RPC-v3-short-term

Types of issue

  • Bug report (report an issue with the code)
  • Feature request (request for change which adds functionality)

Expected Behavior

Change lines:

        # HW Dependent Configuration
        if self.parentOH.parentAMC.fwVersion > 2:
            # Baseline config
            self.confVFAT3s(self.parentOH.link,mask)
            # Run mode
            if(enable):
                self.writeAllVFATs("CFG_RUN",0x1,mask)
            else:
                self.writeAllVFATs("CFG_RUN",0x0,mask)
        else:

To:

        # HW Dependent Configuration
        if self.parentOH.parentAMC.fwVersion > 2:
            # Baseline config
            rpcResp = self.confVFAT3s(self.parentOH.link,mask)
            if rpcResp != 0:
                raise Exception("{}RPC response was non-zero, failed to configure VFATs on OH{}{}".format(colors.RED,self.parentOH.link,colors.ENDC))

            # Run mode
            if(enable):
                self.writeAllVFATs("CFG_RUN",0x1,mask)
            else:
                self.writeAllVFATs("CFG_RUN",0x0,mask)
        else:

This will ensure a failure to configure is properly reported.
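
As a minimal, self-contained sketch of the pattern (not the actual cmsgemos code): `FakeVFATBoard` and `ConfigurationError` below are hypothetical stand-ins. The only thing taken from this issue is the convention that `confVFAT3s` returns 0 on success and a nonzero value on failure.

```python
class ConfigurationError(Exception):
    """Hypothetical exception type; the proposal above raises a plain Exception."""


class FakeVFATBoard(object):
    """Hypothetical stand-in for HwVFAT, used only to illustrate the check."""

    def __init__(self, link, rpc_response):
        self.link = link
        self._rpc_response = rpc_response  # value the fake RPC call will return
        self.run_mode = None

    def confVFAT3s(self, link, mask):
        # Assumed convention from the issue: 0 means success, nonzero means failure.
        return self._rpc_response

    def biasAllVFATs(self, mask=0x0, enable=False):
        rpc_resp = self.confVFAT3s(self.link, mask)
        if rpc_resp != 0:
            raise ConfigurationError(
                "RPC response was non-zero, failed to configure VFATs on OH{}".format(self.link))
        # Only reached when the baseline configuration succeeded.
        self.run_mode = 0x1 if enable else 0x0


# A link whose RPC call succeeds is put into run mode.
good = FakeVFATBoard(link=2, rpc_response=0)
good.biasAllVFATs(enable=True)

# A link whose RPC call fails raises instead of silently continuing.
bad = FakeVFATBoard(link=0, rpc_response=1)
try:
    bad.biasAllVFATs()
    failure_message = None
except ConfigurationError as err:
    failure_message = str(err)
```

Because the exception is raised before the run-mode write, the failing board is never left looking configured.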

Current Behavior

HwVFAT::biasAllVFATs(...) currently executes the following:

    def biasAllVFATs(self, mask=0x0, enable=False):
        # HW Dependent Configuration
        if self.parentOH.parentAMC.fwVersion > 2:
            # Baseline config
            self.confVFAT3s(self.parentOH.link,mask)
            # Run mode
            if(enable):
                self.writeAllVFATs("CFG_RUN",0x1,mask)
            else:
                self.writeAllVFATs("CFG_RUN",0x0,mask)
        else:
            # Run Mode
            if (enable):
                self.writeAllVFATs("ContReg0", 0x37, mask=mask)
            else:
                #what about leaving any other settings?
                #not now, want a reproducible routine
                self.writeAllVFATs("ContReg0", 0x36, mask=mask)
        # User specified values - rely on the user to load self.paramsDefVals
        for key in self.paramsDefVals.keys():
            self.writeAllVFATs(key,self.paramsDefVals[key],mask)
        return

This calls the xhal function from the DAQ machine:

https://github.com/cms-gem-daq-project/xhal/blob/721e3e81684457d6d2a74d50b603aa72a720de9b/xhalcore/src/common/rpc_manager/vfat3.cc#L20-L24

If the error key is detected, this returns 1 instead of 0.

However, HwVFAT::biasAllVFATs never checks whether this call succeeded. In many cases, if this xhal function returns 1 the VFATs will not be properly configured; for example, see the break statement here:

https://github.com/cms-gem-daq-project/ctp7_modules/blob/92eeadc1993dfd85a19cd0a187f76eefb5029e06/src/vfat3.cpp#L148-L162

There will certainly be terminal output, and a "keen" user/developer who notices it can tell that a problem occurred:

[gemuser@gem904qc8daq gemdaq]$ confChamber.py --shelf=1 -s2 -g11
2019.07.09.16.01
Open pickled address table if available  /opt/cmsgemos/etc/maps//amc_address_table_top.pickle...
Initializing AMC gem-shelf01-amc02
opened connection
Configuring VFATs on (shelf1, slot2, OH11) with chamber_vfatDACSettings dictionary values
Caught an error: Error reading settings
biased VFATs on (shelf1, slot2, OH11)
Set CFG_THR_ARM_DAC to 100
Chamber Configured

But this still prints Chamber Configured and no stack trace is shown, so it looks like the call succeeded when it did not.

Context

It's pretty easy to miss a chamber not being configured correctly due to a lack of a check on the RPC response.

Your Environment

@bdorney
Contributor Author

bdorney commented Jul 10, 2019

With the proposed solution, here is an example of a failing call:

% confChamber.py --shelf=1 -s4 -g0
2019.07.10.18.30
Open pickled address table if available  /opt/cmsgemos/etc/maps/amc_address_table_top.pickle...
Initializing AMC gem-shelf01-amc04
opened connection
Caught an error: One of the unmasked VFATs is not Synced. goodVFATs: 0  notmask: ffffff
RPC response was non-zero, failed to configure VFATs on OH0
Traceback (most recent call last):
  File "/path/to/venv/cc7/py2.7/lib/python2.7/site-packages/gempython/scripts/confChamber.py", line 67, in <module>
    configure(args,vfatBoard)
  File "/path/to/venv/cc7/py2.7/lib/python2.7/site-packages/gempython/vfatqc/utils/confUtils.py", line 39, in configure
    vfatBoard.biasAllVFATs(args.vfatmask)
  File "/path/to/vfat_user_functions_xhal.py", line 92, in biasAllVFATs
    raise Exception("RPC response was non-zero, failed to configure VFATs on OH{}".format(self.parentOH.link))
Exception: RPC response was non-zero, failed to configure VFATs on OH0

And an example of a working call:

% confChamber.py --shelf=1 -s4 -g2
2019.07.10.18.31
Open pickled address table if available  /opt/cmsgemos/etc/maps/amc_address_table_top.pickle...
Initializing AMC gem-shelf01-amc04
opened connection
Configuring VFATs on (shelf1, slot4, OH2) with chamber_vfatDACSettings dictionary values
biased VFATs on (shelf1, slot4, OH2)
Set CFG_THR_ARM_DAC to 100
Chamber Configured

bdorney pushed a commit to bdorney/cmsgemos that referenced this issue Jul 10, 2019
bdorney pushed a commit to bdorney/cmsgemos that referenced this issue Jul 10, 2019
@jsturdy
Contributor

jsturdy commented Jul 10, 2019

What is the (desired and current) outcome in a case where an action could be done for, e.g., multiple links and only one would throw?
Would the non-throwing actions complete? Or would everything terminate at the time of the action, requiring all actions to be repeated following any necessary recovery action?

@bdorney
Contributor Author

bdorney commented Jul 11, 2019

What is the (desired and current) outcome in a case where an action could be done for, e.g., multiple links and only one would throw?

Right now the only place where a multi link configuration is attempted in the same call would be:

https://github.com/cms-gem-daq-project/vfatqc-python-scripts/blob/dae6fb9a1d65f0d7081dc040832faf1c5f77123e/confAllChambers.py#L8-L61

Here each link generates a new RPC connection (obviously this is probably not desirable). If a single link's configuration call raises an exception, the try block would catch it here:

https://github.com/cms-gem-daq-project/vfatqc-python-scripts/blob/dae6fb9a1d65f0d7081dc040832faf1c5f77123e/confAllChambers.py#L134-L164

so right now this would kill a configure command for all links.

The desirable action is of course:

  1. for the user to know which links are not configured properly, and
  2. for non-throwing links to be configured

I'm not sure how to have this in the legacy branch(es) without significant refactoring.

Would the non-throwing actions complete? Or would everything terminate at the time of the action, requiring all actions to be repeated following any necessary recovery action?

Non-throwing actions would not complete if they did not already complete by the time the exception was raised.
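
For illustration only, the two desired outcomes listed above (knowing which links failed, and letting the remaining links finish) could be sketched roughly as below. This is not how confAllChambers.py dispatches its per-link calls, and it is not a proposal for the legacy branches; `configure_links`, `configure_one`, and `fake_configure` are hypothetical names introduced for the example.

```python
def configure_links(links, configure_one):
    """Try every link and return (configured, failed) instead of stopping at the first error.

    `configure_one` is a hypothetical callable that takes a link number
    and raises an exception if that link cannot be configured.
    """
    configured = []
    failed = {}  # maps link number -> error message
    for link in links:
        try:
            configure_one(link)
        except Exception as err:
            # Record the failure and keep going with the remaining links.
            failed[link] = str(err)
        else:
            configured.append(link)
    return configured, failed


def fake_configure(link):
    # Hypothetical stand-in: pretend the VFATs on link 0 cannot be configured.
    if link == 0:
        raise Exception(
            "RPC response was non-zero, failed to configure VFATs on OH{}".format(link))


configured, failed = configure_links([0, 1, 2], fake_configure)
for link in sorted(failed):
    print("OH{} not configured: {}".format(link, failed[link]))
```

Collecting failures per link keeps the exception inside `biasAllVFATs` as the single source of truth while moving the decision about whether to continue up to the caller.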

bdorney pushed a commit to bdorney/cmsgemos that referenced this issue Jul 16, 2019
@jsturdy
Contributor

jsturdy commented Jul 31, 2019

Closed by #298

@jsturdy jsturdy closed this as completed Jul 31, 2019