Use the "restartdaq <-w>" command. If you only want to stop the DAQ, call "stopdaq".
Or use the icon labeled "Restart DAQ" on the DAQ console.
Using the script in a terminal will give you a printout.
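For example, in a terminal on a DAQ machine (a minimal sketch of the commands named above):
  restartdaq -w    # restart the DAQ, here with the optional -w flag mentioned above
  stopdaq          # only stop the DAQ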
Use the "startami" command. This will start a second client if you are not on the main DAQ machine and it will restart the DAQ AMI client if run on the main DAQ machine (after asking for confirmation, so you will need to use a terminal). Should the server side by unhappy, you will need to restart the DAQ.
Make sure that all dss nodes are selected. If you need to take a node out due to problems, you have to edit the <hutch>.cnf file: use "serverStat <ip>" to get the name of the node that causes the problem and then edit the list of dss_nodes to exclude this node (see the sketch below). If the problematic node is the last one in the list, you may find that you have to reselect the Bld upon a restart. This means that the DAQ will make you allocate twice (the first time it will fail with a complaint about a Bld change).
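A sketch of what this looks like; the IP address, node names, and list contents below are made-up examples, not the actual values for your hutch:
  serverStat 172.21.22.31    # made-up IP; prints the node name and whether its interfaces answer pings
  # then, in <hutch>.cnf, shorten the dss_nodes list, e.g. (illustrative names only):
  #   dss_nodes = ['daq-xpp-dss01','daq-xpp-dss02','daq-xpp-dss03']
  # becomes
  #   dss_nodes = ['daq-xpp-dss01','daq-xpp-dss03']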
use "serverStat <DAQ device alias> <command>". "cycle" will power cycle the node with some time between off/on in the script. It will tie up the terminal, so if you have to deal with several nodes, you can also call "serverStat <DAQ device alias> off" and "serverStat <DAQ device alias> on" explicitly. Remember to wait a few second between off/on. After the script returns from turning the node(s) on, continue to run "serverStat <ip/node name>" until both pings work. If you can ssh into the node(s), you can restart the DAQ.
Use "serverStat <ip>" to check if both interfaces of the node in question are up. This script will also tell you which machine has the issue.
If one or both of the pings fail: use "serverStat <ip/node name> cycle" to power cycle the machine. After the script returns, continue to run "serverStat <ip/node name>" until both pings work. If you can ssh into the node, you can restart the DAQ.
Otherwise: decide if you'd rather restart the DAQ and hope for the best. Power cycling a machine takes a few minutes.
If the IP is a dss node, you have an additional option: edit the <hutch>.cnf file, look for "dss_nodes = [....]", and take out the problematic node.
Depending on the data rate, you can run with 2 or 3 nodes (cspad + other detectors: 3 nodes, two EPIX: 2 nodes). As we run all the data into a single ami session and the best mapping allows at most one ami node per dss node, you have less ami power if you have fewer dss nodes.
use "serverStat <DAQ device alias>" to check on the health of the node. Most likely it is prudent to power-cycle this node. Does this not help, you should power cycle the detector/camera/device itself as well. If the problematic detector is a big CsPad, please note that after you turn the detector off, you will need to also power cycle the concentrator as it will not see all the quads when you power up again!
This assumes that the detector has recently worked. If the detector configures but produces damage on all events, check the trigger timing settings here:
Should your detector not configure, it is either not turned on, needs power cycling, or has a cable that is not patched correctly. Please let the CDS folks know which detectors you would like to use so we can test them beforehand.
Troubleshooting IPIMBs is described on this page: IPIMB Troubleshooting for Controls IPIMBs
Assuming that there actually are photons on the detector that could be seen, the second thing to check is the timing settings. Find the values you should use here:
Go to the EVR configuration (except for pgp-triggered devices such as the EPIX). If you are using aliases in your DAQ, it should be straightforward to find the right trigger channel; look at all available EVR cards. For the run trigger setting of e.g. CsPad/Cs140k detectors, it might be best to contact your POC.
Check the status of the data movers here: Data Mover Monitoring (this page can also be reached from the main "pswww" page). Take out the nodes with the issue, assuming your problem is limited to a single node. If it is more widespread, it might warrant a call.
The dss nodes in use are listed in the dss_nodes line of the <hutch>.cnf file. For XPP, if you have to take out the first dss_node in the list, you need to kill the source process running on that node; one way to do that is to use serverStat to reboot that node, as sketched below. This process will remain after the DAQ has been stopped, and if you start a second one, weird things will happen.
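A sketch of the serverStat route, with a made-up name for the first node in the dss_nodes list:
  serverStat daq-xpp-dss01 cycle    # hypothetical node name; rebooting the node kills the leftover source process
  serverStat daq-xpp-dss01          # repeat until both pings answer again before restarting the DAQ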
In the Rayonix detector log, this problem shows up as [STATUS_DATA] readings that suddenly become unreasonable (note the temperature values and the time counter in the last field), for example:
13_11:46:45:2015-Nov-13 11:46:45,[STATUS ],[STATUS_DATA],-79.95,-79.35,-79.95,-79.75,-156.15,52.25,-860.95,-870.25,-774.25,33.05,49.85,49.85,276,273,275,36,0,0,1200,00:00:04
12_15:30:30:2015-Nov-12 15:30:30,[VERBOSE ],RxDetector::FrameAcquisitionFrameProcessor(83, 0x7f55d403bc20,0x7f55d42d2230) - starting
12_15:30:30:FrameReady: Frame #84 size=7372800 timestamp=59989 ms
13_11:38:45:2015-Nov-13 11:38:45,[STATUS ],[STATUS_DATA],-79.95,-79.35,-79.95,-79.65,-109.25,-109.95,34.05,42.65,45.45,34.75,49.85,49.85,276,273,275,37,0,0,1200,20:50:31
13_11:39:45:2015-Nov-13 11:39:45,[STATUS ],[STATUS_DATA],-79.95,-79.35,-79.95,-79.65,-109.25,-109.95,34.05,42.35,45.45,34.85,49.85,49.85,276,273,275,36,0,0,1200,20:51:31
13_11:40:45:2015-Nov-13 11:40:45,[STATUS ],[STATUS_DATA],-79.95,-79.35,-79.95,-79.75,-156.15,52.05,-860.85,-870.25,-774.15,34.75,49.85,49.85,276,273,275,37,0,0,1200,00:00:04
13_11:41:45:2015-Nov-13 11:41:45,[STATUS ],[STATUS_DATA],-79.95,-79.35,-79.85,-79.65,-156.15,52.05,-860.85,-870.25,-774.15,33.95,49.85,49.85,276,273,275,36,0,0,1200,00:00:04
To recover from that, you need to stop the DAQ process (ctrl+x in the telnet localhost 30099 terminal window) and open the Rayonix software capxure.
Reboot the detector controller (there is a button for that). Re-enable cooling. You might have to quit this software and start it again to see the now reasonable temperatures. Once this looks fine, you can quit the software and restart the DAQ process (ctrl+R in the telnet localhost 30099 terminal window).
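The corresponding terminal steps, as described above (if you do not already have the console window open, you can attach to it as sketched here; the port number is the one given in the text):
  telnet localhost 30099    # attach to the console of the Rayonix DAQ process
  # Ctrl+X stops the DAQ process (do this before opening the Rayonix software)
  # Ctrl+R restarts the DAQ process once the temperatures look reasonable again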
"serverStat" at this moment works in all hutches when using the machine name or IP, but the "DAQ alias" interpretation feature might not quite work. We hope to improve on this soon.
Expert Troubleshooting (limited permissions)