****************************************************
Monitor the 5.0.x Robot Dewars:

Just like monitor_dewars.com, once a page is sent, the program exits.
Somebody needs to manually go and start it again if you want to get
another page.

You start it like this:
    ssh mcfuser@bl831.als.lbl.gov
    monitor_robots.com >&! /dev/null &
    exit

To stop it just do this:
    ssh mcfuser@bl831.als.lbl.gov
    killall monitor_robots.com
    exit

By default:
- all three robots are monitored
- a page will be sent if any monitored robot level sensor reads > 2.1
  or < 0.4 cm for more than 10 minutes straight.
- the page "victim" is whoever is listed as "on call" in
  http://bcsb.lbl.gov/Schedules/WeeklySchedule.pdf

You can specify which robots you are interested in by naming the
beamlines, and you can specify a different "victim" by mentioning
them by name.  E.g.:
    monitor_dewars.com 503 Snell >&! /dev/null &
will page Gyorgy if 5.0.3's robot misbehaves
    monitor_dewars.com 501 503 >&! /dev/null &
will page Anthony and Corie (at the moment) if either 5.0.1 or 5.0.3
misbehaves (but not 5.0.2)
    monitor_dewars.com Anthony >&! /dev/null &
will page Anthony if any robot misbehaves

*******************************************
To Launch the Pager Program:
    rsh mcfuser@gateway
    page_when_up.com >>& /data/log/page_when_up.log &

*********************************
The sumup.com scripts are running on all beamlines:
501, 502, 503, 821, 822, 831, 1231
(It used to be necessary to "setenv beamline 822" before launching
sumup.com on that beamline.)
Right now, all the sumup.com scripts are running on:
bl      machine  account  directory
5.0.1   bl501k2  jamesh   /home/staff/jamesh/public_html/501
5.0.2   bl502k2  jamesh   /home/staff/jamesh/public_html/502
5.0.3   bl503k2  jamesh   /home/staff/jamesh/public_html/503
8.2.1   bl821k2  jamesh   /home/staff/jamesh/public_html/821
8.2.2   bl822k2  jamesh   /home/staff/jamesh/public_html/822
8.3.1   bl831    mcfuser  /home/mcfuser/public_html
12.3.1  bl1231   sibyls   /home/sibyls/public_html

The restart procedure for the 5.0.x and 8.2.x pages is (e.g., 501):
    ssh root@bl501k2
    su - jamesh
    cd public_html/501
    ../sumup.com continuous < /dev/null >&! sumup.log &
    exit

The restart procedure for 831 is:
    rsh mcfuser@gateway
    killall sumup.com
    cd public_html/
    ./sumup.com continuous >&! /dev/null &

********************************************************
Restarting the LN2 pressure Logger:
    ssh root@gateway
    cd /var/www/html/pressure
    ./lN2_mon.com >& /dev/null &

**************************************
Restarting the axis2 Web Camera Server:
    ~jamesh/beamline/http_bridge.tcl axis2 80 12345 >& /dev/null &

****************************************************
HOW TO CHANGE GATEWAY MACHINE FROM, say, crush8 to crush9

swap the CAT5 line from the outside world hub from the ethernet card
  in crush8 to crush9
log on to crush9 as root
in /root is a file called "become_server.com"
open the file, then cut and paste commands (or just run it, it's now
  an executable)
reboot the computer

**************************
The back way into BL831 if gateway is down is through
bcsb-staff-1.als.lbl.gov, then ssh to bl831a.
bcsb-gateway is currently offline.

*******************************************************
in /root of dataserver is "things_to_restart.com"

*****************************************************
NFS hang problem

Symptoms are:
    adxv is not updating
    "index" hangs (so will "df" and "ps")

Solution:
    ssh root@graphics1
    mount /data

Double-mounts are allowed, and pretty much safe as an interim solution.
To do a clean unmount, you need to issue a "Ctrl-C" or "kill" to
everything trying to use /data, and then execute several
"umount -f /data" commands until it no longer gives you an error.
Then you can "mount /data" again.  This is more complicated, so my
middle-of-the-night advice is just to "mount /data".

From: JMHolton@lbl.gov
Date: Saturday, July 24, 2004 0:19 am
Subject: graphics1 "froze" again

******************************************************************
Here's how to restart detectors without killing and restarting the
WindowsNT "Remote Detector OP" (copied from nuke):

mcfuser@graphics1:/home/mcfuser 2% foreach module ( detector1 detector2 detector3 detector4 )
foreach? # send the reset signal
foreach? echo -n "$module restart "
foreach? echo "restart" | sock_exchange.tcl $module 8038 1
foreach? echo ""
foreach? end
*****************************************************************
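
Sketch of the "clean unmount" retry for the NFS hang problem:

The clean-unmount advice in the NFS section (repeat "umount -f /data"
until it stops giving an error, then "mount /data") could be wrapped
up roughly as below. This is only a sketch, not a script that exists
on graphics1. It is written in POSIX sh rather than the tcsh used
elsewhere in these notes, because it needs shell functions. The names
retry_until_ok and clean_remount are made up for illustration, and the
MAX_TRIES and RETRY_DELAY defaults are assumptions, not values from
these notes. It stops at the first successful umount, and it does not
kill the processes using /data: that still has to be done by hand
first.

```shell
#!/bin/sh
# Sketch only: retry a command until it succeeds, following the
# clean-unmount advice (repeat "umount -f /data" until no error).
# retry_until_ok and clean_remount are hypothetical names; MAX_TRIES and
# RETRY_DELAY are assumed defaults, not values taken from these notes.

MAX_TRIES=${MAX_TRIES:-10}      # give up after this many attempts
RETRY_DELAY=${RETRY_DELAY:-1}   # seconds to wait between attempts

# retry_until_ok CMD [ARGS...]
# Runs CMD up to MAX_TRIES times. Returns 0 as soon as CMD succeeds,
# or 1 if every attempt failed.
retry_until_ok() {
    tries=0
    while [ "$tries" -lt "$MAX_TRIES" ]; do
        if "$@"; then
            return 0
        fi
        tries=$((tries + 1))
        sleep "$RETRY_DELAY"
    done
    return 1
}

# clean_remount MOUNTPOINT
# Assumes everything using MOUNTPOINT has already been Ctrl-C'd or killed.
# Mounts again only if the forced unmount eventually succeeded.
clean_remount() {
    retry_until_ok umount -f "$1" && mount "$1"
}
```

After loading these definitions as root on graphics1, the call would
be "clean_remount /data". If it gives up, the middle-of-the-night
fallback of a plain "mount /data" still applies.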