Blog

SCS Bullet points for week ending 2013/11/15

Scientific Computing Services has implemented DNS split zone views to improve SLAC's security by not exposing data for non-internet-routable machines. This also conforms to best-practice recommendations for DNS configuration.
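
As a simple illustration of the split-view behavior (not SLAC's actual configuration), the Python sketch below uses the dnspython library to query the internal and external views for the same name; the hostname and resolver addresses are placeholders.

    # Illustrative only: an internal-only name should resolve in the internal
    # view but not in the external view. All names and addresses below are
    # placeholders, not actual SLAC values. Requires the dnspython package.
    import dns.resolver

    INTERNAL_NS = "192.0.2.10"     # placeholder internal resolver
    EXTERNAL_NS = "198.51.100.10"  # placeholder external-facing resolver
    NAME = "internal-host.slac.stanford.edu"  # hypothetical internal-only host

    def lookup(nameserver, name):
        """Return the A records for name as seen by the given nameserver."""
        resolver = dns.resolver.Resolver(configure=False)
        resolver.nameservers = [nameserver]
        try:
            return [rr.address for rr in resolver.query(name, "A")]
        except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
            return []

    # Expect a non-empty answer from the internal view and an empty one externally.
    print("internal view:", lookup(INTERNAL_NS, NAME))
    print("external view:", lookup(EXTERNAL_NS, NAME))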

James Williams and several members of Scientific Computing Services attended a meeting with Rosio Alvarez, Adam Stone, and Gary Jung from LBNL to compare scientific computing at both labs.   The discussion, which included computing technologies, a comparison of services, and cost recovery models, will be valuable in formulating directions for scientific computing services at SLAC. 

Scientific Computing Services updated the Hierarchical Storage Interface (HSI) to the current release (4.0.1.3.p1). HSI is a command-line interface for administrators and users of the High Performance Storage System (HPSS) and offers a familiar Unix-style environment for working within HPSS. This software provides SLAC with efficient access to database information and the ability to script data migrations.
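
To illustrate the scripting capability (the paths and workflow are hypothetical, not an actual SLAC procedure), the sketch below drives HSI from Python to copy a file into HPSS and then list it, assuming the hsi client is on the PATH and accepts a quoted command string non-interactively.

    # Minimal sketch of scripting an HPSS data move with HSI.
    # The local and HPSS paths are hypothetical examples.
    import subprocess

    def hsi(command):
        """Run a single HSI command non-interactively and fail loudly on error."""
        subprocess.run(["hsi", command], check=True)

    # Copy a local file into HPSS, then verify that it is listed.
    hsi("put /scratch/run42.dat : /hpss/experiment/run42.dat")
    hsi("ls -l /hpss/experiment/run42.dat")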

Added 41 blade nodes (656 CPU cores) and additional Infiniband network hardware. Installation is complete and the new nodes are in production.

Ordered a storage server configuration with 60 x 4 TB drives. The storage is now in production (/nfs/slac/g/ki/ki23).

 

Wei Yang (Scientific Computing Services), with support from Andrew Hanushevsky and Richard Mount (Scientific Computing Applications), published a paper at CHEP2013 on "Using Solid State Disk Array as a Cache for LHC ATLAS Data Analysis". It described the cache architecture and its positive impact on ATLAS data analysis, which also improves SLAC's batch system utilization.

Scientific Computing Services updated the webauth login pages in support of the Drupal project. The new, improved pages provide a consistent login interface for SLAC web services.

Staff in Scientific Computing Services, Enterprise Applications, and Scientific Computing Applications implemented a process to synchronize Unix account password expiration dates with an internal (RES) database, streamlining the administration of LCLS user accounts. This enhancement also provides better visibility of account information to the wider SLAC community.
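
As a rough illustration of this kind of synchronization (not the actual implementation; the RES table and column names are invented for the example), the sketch below reads expiration data from a database and applies it with the standard chage utility.

    # Illustrative sketch: push password-aging data from a central database
    # out to the Unix shadow settings. The sqlite3 connection stands in for
    # the real RES database, and the schema is hypothetical.
    import sqlite3
    import subprocess

    def sync_expirations(db_path):
        conn = sqlite3.connect(db_path)
        query = "SELECT username, last_change, max_days FROM res_accounts"
        for username, last_change, max_days in conn.execute(query):
            # last_change is a YYYY-MM-DD string; -d sets the last password
            # change date and -M the maximum password age in days.
            subprocess.run(
                ["chage", "-d", last_change, "-M", str(max_days), username],
                check=True)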

Scientific Computing Services upgraded its Automated Cartridge System Library Software (ACSLS) from version 7.3.1 to 8.2. This upgrade keeps SCS on a supported release of the software and dovetails with the release required to interoperate with HPSS. The upgrade also allows the Computing Division to continue with plans to partition the tape library for offsite tape backup copies. Lastly, it permits the use of a new tape drive, the T10000D, which can store 70 percent more data on the same tape cartridge used by our current T10000C drives. Future use of T10000D tape drives would provide greater overall tape capacity without the addition of another tape library.

Scientific Computing Services upgraded its High Performance Storage System (HPSS) software from version 7.3.3.7 to 7.3.3.8.  This provides some bug fixes and will keep HPSS compatible with the recent Automated Cartridge System Library Software (ACSLS) upgrade.  In addition, certain HPSS tape servers had their disk caches reconfigured for increased capacity to allow HPSS to better handle incoming scientific data traffic. 

Scientific Computing Services completed several upgrades to the 1536-core hequ compute cluster. The nodes are now running RHEL6 on a new SLAC subnet with outbound TCP connectivity to support the Open Science Grid. The cluster management interface was also modified to use standard IPMI-based console software, removing the need for obsolete serial cable connections. The increase in the number of RHEL6 cores will benefit ATLAS jobs and MPI users running on the "bullet" Infiniband cluster.
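
For reference, the sketch below shows the style of IPMI serial-over-LAN session that replaces a physical serial console; the node name and credentials are placeholders, and it assumes ipmitool is installed and SOL is enabled on the node's BMC.

    # Illustrative sketch: open a serial-over-LAN console to a node's BMC
    # with ipmitool instead of a physical serial cable. The host name and
    # credentials are placeholders.
    import subprocess

    def open_console(bmc_host, user, password):
        subprocess.run(
            ["ipmitool", "-I", "lanplus", "-H", bmc_host,
             "-U", user, "-P", password, "sol", "activate"],
            check=True)

    open_console("node001-ipmi.example.org", "admin", "secret")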

Scientific Computing Services has automated network link aggregation for UNIX services that need high bandwidth or high availability. These server configurations bond multiple links to the SLAC network and can use independent switch modules for an additional level of redundancy. The associated IP address settings can be easily updated, reducing the time required to move these servers between subnets.
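
As an illustration of what such a bonded setup looks like (not the actual automation, which is driven by the configuration management system), the sketch below generates RHEL-style ifcfg files for a bond; the device names, addresses, and bonding mode are placeholders.

    # Illustrative sketch: generate RHEL-style ifcfg files for a bonded
    # interface. All device names, addresses, and the bonding mode are
    # placeholders chosen for this example.
    import os

    NET_DIR = "/etc/sysconfig/network-scripts"

    def bond_config(ip, netmask):
        lines = [
            "DEVICE=bond0",
            "TYPE=Bond",
            'BONDING_OPTS="mode=active-backup miimon=100"',
            "IPADDR=" + ip,
            "NETMASK=" + netmask,
            "BOOTPROTO=none",
            "ONBOOT=yes",
        ]
        return "\n".join(lines) + "\n"

    def slave_config(dev):
        lines = [
            "DEVICE=" + dev,
            "MASTER=bond0",
            "SLAVE=yes",
            "BOOTPROTO=none",
            "ONBOOT=yes",
        ]
        return "\n".join(lines) + "\n"

    def write_bond(ip, netmask, slaves):
        with open(os.path.join(NET_DIR, "ifcfg-bond0"), "w") as f:
            f.write(bond_config(ip, netmask))
        for dev in slaves:
            with open(os.path.join(NET_DIR, "ifcfg-" + dev), "w") as f:
                f.write(slave_config(dev))

    # Bond two NICs that sit on independent switch modules; moving the server
    # to a new subnet only requires updating IPADDR and NETMASK here.
    write_bond("192.0.2.25", "255.255.255.0", ["eth0", "eth1"])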

Scientific Computing Services installed LSF 9.1.1, an upgrade to the LSF 9.1 software for the batch environment. This version provides additional features, including better support for MPI users.

Scientific Computing Services has provided ongoing Drupal Unix operating system support, including installing and managing PHP, installing two additional Drupal virtual machines, reconfiguring IP networking for 10 existing Drupal hosts, and installing and verifying configuration management infrastructure to ensure identical Drupal installations. Drupal is part of the Lab's Web Intranet Portal initiative.

PPA Lustre filesystem capacity will be doubled from ~170TB to ~340TB usable space. Servers will be relocated and connected to the bullet cluster via 40Gb/s Infiniband.

Filesystem upgrade complete. https://confluence.slac.stanford.edu/display/SCSPub/PPA+Lustre+filesystem+2014+upgrade

Purchase requisition created for a compute cluster expansion of 1648 additional cores with Infiniband. The networking hardware has arrived and is being installed.

Expansion is complete and new cores are now in production.

 

RT462268 Storage for MCC

Ordered 240TB storage configuration for MCC

Storage server with ~28TB has been ordered for ACD

Scientific Computing Services has deployed new interactive login Virtual Machines for KIPAC. These VMs are a lifecycle replacement for an older pool of machines and provide customers with more network bandwidth and compute power.

Scientific Computing Services completed the Unix infrastructure sections of the Quarterly FISMA Data Call.   SCS staff modified some of the existing reporting tools, providing more readable reports and records for these data calls, and improving the general auditing process.

Scientific Computing Services has acquired the hardware and software for a development GPFS parallel file system.  Installation and setup of this development environment will begin this month, with GPFS software testing to commence soon thereafter.  This will enable SCS team members to learn how to use GPFS to manage future disk storage for SLAC's scientific community. 

Scientific Computing Services completed its work to support the IPv6 project. This included adding IPv6 configuration support to the configuration management system and adding IPv6 support to SLAC's outgoing DNS servers.  This moves SLAC toward compliance with the DOE mandate for IPv6 readiness.
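
As a simple illustration, the sketch below uses only the Python standard library to check whether a host now publishes an IPv6 (AAAA) address; the hostname is a placeholder.

    # Illustrative sketch: return any AAAA (IPv6) addresses published for a host.
    import socket

    def ipv6_addresses(hostname):
        """Return the host's IPv6 addresses, or an empty list if none exist."""
        try:
            infos = socket.getaddrinfo(hostname, None, socket.AF_INET6)
        except socket.gaierror:
            return []
        return sorted({info[4][0] for info in infos})

    # Hypothetical host name; a non-empty result means an AAAA record is published.
    print(ipv6_addresses("ipv6-test.slac.stanford.edu"))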