(a presentation given by Igor on 5/11/2016)

iRods (complexity 8)

file catalog server runs on psexport03 as a daemon 

database in mysql (named irods_icat on node psdb with raid 10 on sad) (complexity 2)

interfaces:

  • command line
  • web-services (web-portal talks to web-services, both on pswww) (complexity 5)
  • pyrods interface (python)

connects to hpss

state of file (on-tape, on-disk) recorded in irods

wilko trying to upgrade from v2 to v4.

checksumming has been a problem in v2, should be better in v3

File Migration

6 main movers plus movers for FFB, run as daemons

neh: psana101,102,103

feh: psana201,202,203

log files are in /u1/psdm/mvr/logs  (useful for debugging)

migration database called regdb (runs on psdb) with 3 tables (dss->ffb, ffb->ana, ana->hpss)

movers also query stuff from irods database

6 instruments * 6 nodes * 2 (smd, xtc) = 72 movers

20 IOC movers

only 36 ffb->ana movers (don't need factor of 2 for (smd,xtc): done serially)
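The mover counts above can be checked with a few lines of Python. This is only a sketch of the arithmetic in these notes; the variable names are made up for illustration.

```python
# Mover counts from the notes; the variable names are illustrative only.
INSTRUMENTS = 6
NODES = 6
STREAM_TYPES = ("smd", "xtc")

# One mover per instrument, node and stream type.
movers_parallel = INSTRUMENTS * NODES * len(STREAM_TYPES)

# ffb->ana handles smd and xtc serially, so there is no factor of 2.
movers_ffb_to_ana = INSTRUMENTS * NODES

print(movers_parallel, movers_ffb_to_ana)
```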

movers are coordinated by database (poll).  records file creation times, copy finish times etc.

all python code except for bbcp

Data migration monitoring is available through pswww

on pswww one can subscribe to email notifications for when transfers get delayed.  there are 6 cases: each of the 3 stages (dss->ffb, ffb->ana, ana->hpss (irods)) has 2 cases ("not started" or "taking too long")
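A sketch of how the two delay cases might be told apart from the recorded timestamps. The function name, its arguments and the thresholds are all hypothetical; the notes do not say how pswww actually decides.

```python
def delay_case(created, started, finished, now, max_wait, max_duration):
    """Return "not started", "taking too long", or None if nothing to report."""
    if finished is not None:
        return None  # transfer completed
    if started is None:
        # File exists but no mover has picked it up yet.
        return "not started" if now - created > max_wait else None
    # Transfer is in progress; flag it only if it exceeds the allowed duration.
    return "taking too long" if now - started > max_duration else None


# Times are in seconds; the thresholds (60 s, 600 s) are made up for the example.
print(delay_case(created=0, started=None, finished=None, now=100,
                 max_wait=60, max_duration=600))
```

The same check would be applied per stage (dss->ffb, ffb->ana, ana->hpss), which gives the 3 x 2 = 6 notification cases.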

don't currently have equivalent monitoring for ana->nersc stage or for hdf translation

Experiment Registration

start with a proposal number (e.g. LL12: first L is "lcls", X is in-house, P is "PCS" with stdconfig, S is SSRL).  L and X are most common.  Proposals live in the URAWI user-portal database (complex)
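The first-letter convention can be written as a small lookup. This is a sketch only: the `proposal_kind` helper is hypothetical, and the mapping contains just the four letters listed in these notes.

```python
# First letter of the proposal number, as described in the notes.
PROPOSAL_PREFIXES = {
    "L": "LCLS",
    "X": "in-house",
    "P": "PCS",
    "S": "SSRL",
}


def proposal_kind(proposal):
    """Classify a proposal number by its first letter."""
    if not proposal:
        raise ValueError("empty proposal number")
    return PROPOSAL_PREFIXES.get(proposal[0].upper(), "unknown")


print(proposal_kind("LL12"))
```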

use URAWI to manage posix groups and schedule of experiment.  Andrea is main URAWI person.

pswww has a page that extracts information from URAWI.  entries where information is missing show up in red.

we use information from URAWI to register proposals in the regdb database (but in different database tables than the movers).  the experiment number (e.g. e0335) is in regdb

we don't write to URAWI only read, and logbook "group manager" adds 

experiments are registered via the "experiment registry database" (visible on pswww)

experiment registry is in regdb

sometimes URAWI is down, or has incorrect information, so registering of the experiments is done by hand.

they go through the pswww interface and fill out forms once per week (10-20 minutes) to register new experiments using the "create new" option

a cron job runs on psexport (not sure which one) that reads the experiment registry database and updates the ExpNameDb file and creates .tar.gz.  .rpm still done manually.

wilko has a script (daemon on psexport04: dm-active-exp-folders) that wakes every few minutes, looks in the experiment registry database for active experiments, and creates directories

"experiment switch" on pswww in regdb

Access Control

for both data files and web-apps and metadata

hierarchical:

  • super-users: wilko, gapon, perazzo, cpo... (ps-data)
  • per-instrument scientists: psd-amo-sci can determine membership of ps-amo
  • per-instrument: ps-amo, ps-cxi,...  (these can do experiment switch)
  • PI can control access to per-experiment groups
  • per-experiment: amo12316...
  • user: alvarosg...

psdatmgr - owns data

web-apps: all the stuff you get to through pswww (logbook, file manager)

pswww has an authorization database manager: provides an interface to write into LDAP

for web apps the access control is controlled by roledb.  roledb has model of "triplets":  "applications" (e.g. logbook) "roles" (e.g. editor) "privileges" (read, post), OR for ldap the triplet might be (ldap/admin/manage).  these triplets are visible/creatable in the authorization database manager on pswww.

user has a uid or posix group.   there is also the idea of a "context": experiment or all-experiments.  this doesn't support "all AMO experiments".  in the database groups are prefixed with "gid:" which is a little clunky.
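A rough sketch of the triplet model with contexts. The `Grant` record and the `is_allowed` helper are hypothetical and flatten the model for illustration; they are not the roledb schema. The "gid:" prefix and the "experiment or all-experiments" context follow the description above.

```python
from typing import NamedTuple, Optional


class Grant(NamedTuple):
    application: str           # e.g. "logbook"
    role: str                  # e.g. "editor"
    privilege: str             # e.g. "read" or "post"
    who: str                   # a uid, or a posix group written as "gid:<group>"
    experiment: Optional[str]  # None means the all-experiments context


def is_allowed(grants, uid, groups, application, privilege, experiment):
    """True if the user, or one of the user's groups, holds the privilege."""
    identities = {uid} | {"gid:" + g for g in groups}
    for grant in grants:
        if (grant.application == application
                and grant.privilege == privilege
                and grant.who in identities
                and grant.experiment in (None, experiment)):
            return True
    return False


grants = [
    Grant("logbook", "editor", "post", "gid:ps-amo", "amo12316"),
    Grant("logbook", "reader", "read", "alvarosg", None),
]
print(is_allowed(grants, "alvarosg", [], "logbook", "read", "amo12316"))
```

Because a context is either one experiment or all of them, a grant for "all AMO experiments" cannot be expressed in this form either.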

still on psdb, but not in regdb

Databases

all on psdb

  • daq_config
  • regdb (many tables, see above)
  • roles
  • logbook
  • interface_db (hdf translation, keeps requests, lists of files in run, state (submitted))
  • iface_db_ffb (monitoring translator)
  • webportal (several different things)

See this Database design slide in confluence.

Web Infrastructure

Production webserver: pswww, but two machines behind the scenes: pswww1/pswww2.  pswww is the only one in the DMZ-pub subnet, with only ports 80 and 443 open (http/https).  Two network interfaces on pswww: eth0 and eth0:0.  Switch the service between the two machines with "ifdown eth0:0" on one machine and "ifup eth0:0" on the other (eth0:0 carries what is called a "virtual IP address").

pswww-dev is also open to internet (used for experimentation).  runs on machine pswww3.  no redundancy for this one, but could be used as a backup for pswww.

authentication is done by pswebkdc.  pswww[1-2] know about this.   points to two machines with active failover: pswebkdc1,2.  both of these talk to kerberos satellite server @PCDS.

pswebkdc-dev is pswebkdc3

psdb points to machines psdb4,5 and there is replication using "SQL MASTER-MASTER REPLICATION".  These two machines are directly connected with a cable for this replication.

there is also a second database alias, psdb-user, which uses the same model with psdb1,2 and is used for less critical stuff.

psdb3 is a development machine (like other dev machines above)

two ldap machines: psldap1,2.

What's on PSWWW?

apache 2.2 (rhel5) and 2.4 (rhel7) configuration stored in /etc/httpd/.  main config file is conf/httpd.conf.  It specifies a "DocumentRoot" that maps the "root directory" of URLs to /var/www/html/.  this is a link to NFS /reg/g/psdm/web/pswww/html/.  This stuff is not in SVN but should be.  some account info in apps/config/regdbconfig.inc.php.

RegDB is in svn.  RegDB/web/html is deployed by the deployment script as apps/regdb/html.  This allows php to work.  Similarly RegDB/web/lib is deployed as apps/lib/regdb/.  Igor first deploys it into apps-dev, which is then deployed into apps.  The script that does this is /reg/g/psdm/web/apps/releases/relclone, which clones apps-dev.  /reg/g/psdm/web/apps/releases/development-gapon/.relbase points to afs.  relclone copies from this dev area into production.

Apache

Apache modules are in /etc/httpd/modules/.  there is also a conf.modules.d used by rhel7.  these .conf modules are loaded by apache in numeric order, as specified by the first two numbers in the filename.  conf.d/ssl.crt has the https ssl certificates, and must be updated once per year.  john bartelt is the person to ask for these ssl certificates.  conf/kerberos has "key tab" for remctl to talk to NIS server to create groups (file is named maint-pcds.keytab).  HTTP.keytab is for talking to the KDC web server?
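A sketch of the load order described above for conf.modules.d. The `load_order` helper is hypothetical, the file names are illustrative, and sorting by the two-digit prefix is this example's reading of "numeric order", not a statement about how Apache implements it.

```python
import re


def load_order(filenames):
    """Order .conf files by their two-digit prefix, then by name."""
    def key(name):
        match = re.match(r"(\d{2})-", name)
        # Files without a numeric prefix sort after the numbered ones.
        prefix = int(match.group(1)) if match else 99
        return (prefix, name)

    return sorted((n for n in filenames if n.endswith(".conf")), key=key)


# Illustrative file names only.
print(load_order(["01-cgi.conf", "00-proxy.conf", "00-base.conf", "README"]))
```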

We use "pluggable" authentication servers (e.g. ws-auth, ws-pam).  conf.d/.htpasswd manages authentication techniques for various special accounts.  use htpasswd to update this file.  "icws" maps to ws-auth.  a special account for a particular service must be registered in ws-auth/<your-service>.  URLs containing /ws-auth/myproxy are forwarded to a destination address for a service (e.g. http://psanaphi103:5060) using the "ProxyPass" keyword for apache.  This configuration might be in /etc/httpd/conf.d/psdm_ws.conf.

kerberos is configured in /etc/krb5.conf (e.g. default "realm" is "SLAC.STANFORD.EDU").

SVN: /reg/g/psdm/svnrepo/conf/auth-kerb.conf has the list of people allowed to access svn stuff.

/etc/httpd/conf.d/svn.conf has some other SVN configuration stuff.  /reg/g/psdm/svnrepo/conf/auth-readonly.conf has everything readonly.

/etc/httpd/conf.d/auth_kerb.conf loads something.  buildbot.conf forwards to http://psdev106:8010/ where buildbot runs.  userdir.conf has the redirect for users' public_html.  For pam had to create special file /etc/pam.d/httpd, also /etc/pam.d/sshd (given to Igor by central IT people).  web servers are started by /etc/rc.local (doesn't work for new RHEL7 systemd stuff).

/reg/g/psdm/psdm/psdatmgr/startup/$HOSTNAME has startup stuff.

LogBook

Has many services.  can see what services are triggered by looking at browser debugging communication (igor uses chrome for this).  Inventory database ("IREP") client-API is implemented in python.  Could copy ideas from irep code to get fuller python interface for logbook (Chuck has a primitive version for his hashtag/bot prototype).

Small H5 Translation Advice

click translate (look at "developer tools" in chrome)

hdf5_request_new (same service for standard/monitoring, but have different parameters)

pswww.slac.stanford.edu/apps/portal/ws/hdf_request_new.php, check authentication, invokes icws

apache has "icws" in a package called interface-controller-web-service written in python by Andy ("pylons" precursor to flask).  should rewrite icws in flask

there is a service polling the database (runs as a daemon on psexport03 (standard) and psexport04 (monitoring)) that does the bsub.

the daemon monitors jobs and queries the status of jobs

the advantage of going through the web service 

javascript /var/www/html/apps/portal/js/HDF5_Translator.js is used for both monitoring/standard

new table in mysql database

make sure searching for hashtags is fast using database "indexing".
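A small sketch of the indexing idea. The table `message_tags`, its columns and the index name are hypothetical, and sqlite3 is used as a stand-in for the MySQL server so the example is self-contained; the same `CREATE INDEX` statement form exists in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table linking logbook messages to hashtags.
conn.execute("CREATE TABLE message_tags (message_id INTEGER, tag TEXT)")
conn.executemany(
    "INSERT INTO message_tags VALUES (?, ?)",
    [(i, "#tag%d" % (i % 50)) for i in range(1000)],
)

# The index lets lookups by tag avoid scanning the whole table.
conn.execute("CREATE INDEX idx_message_tags_tag ON message_tags (tag)")

# Ask the database how it would run the search; the last column is the description.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT message_id FROM message_tags WHERE tag = ?",
    ("#tag7",),
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)

matches = conn.execute(
    "SELECT COUNT(*) FROM message_tags WHERE tag = ?", ("#tag7",)
).fetchone()[0]

print(plan_text)
print(matches)
```

Checking the query plan is a quick way to confirm that a hashtag search actually uses the index.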

icws talks to mysql "interface_db"
