...

Machine Learning at NERSC: Strategy, Tools and Applications

Date: May 16, 2pm

Speaker: Prabhat (NERSC)


 

Past Seminars

...

Optimization for Transportation Efficiency

Date: May 2, 2pm

Location: Sycamore Conference Room (040-195)

Speaker: John Fox

Abstract: Plug-in hybrid and all-electric vehicles offer the potential to shift energy demand from liquid petroleum fuels to grid-sourced electricity. We are investigating optimization methods to improve the efficiency and resource utilization of plug-in hybrid electric vehicles (PHEVs). Our optimization uses information about a known or estimated vehicle route to predict energy demands and to manage on-board battery and fuel resources so that grid-sourced electricity is used maximally and petroleum minimally over the route. Our convex optimization method uses a simplified car model to find the optimal strategy over the whole route, which allows re-optimization on the fly as updated route information becomes available. We validated the simplified model against a more complete vehicle technology simulation developed at Argonne National Laboratory by "driving" the complete car simulation with the simplified control model. By driving routes with the same total energy demand but different demand profiles, we show fuel-efficiency gains of 5-15% on mixed urban/suburban routes compared to a charge-depleting/charge-sustaining (CDCS) battery controller. The method can also optimize the economic lifetime of the vehicle battery by including the stress from charge and discharge cycles in the resource optimization.
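As a rough illustration of the route-based resource split (not the speaker's actual vehicle model), the simplified problem can be posed as a small linear program: choose per-segment fuel energy to minimize total fuel use, subject to a battery capacity budget and a hypothetical per-segment battery power limit. All demand numbers and limits below are made up.

```python
import numpy as np
from scipy.optimize import linprog

def optimal_fuel_split(demand, battery_capacity, battery_power_max):
    """Minimize total fuel energy over a route with known per-segment
    energy demands d_i, preferring grid-charged battery energy.

    Linear program in fuel_i:
        fuel_i in [max(0, d_i - b_max), d_i]   (battery per-segment limit)
        sum(d_i - fuel_i) <= capacity          (battery energy budget)
    """
    d = np.asarray(demand, dtype=float)
    n = len(d)
    c = np.ones(n)  # objective: total fuel energy
    # sum(fuel) >= sum(d) - capacity  <=>  -sum(fuel) <= capacity - sum(d)
    A_ub = [-np.ones(n)]
    b_ub = [battery_capacity - d.sum()]
    bounds = [(max(0.0, di - battery_power_max), di) for di in d]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x, res.fun

# Toy route: four segments with energy demands (arbitrary units).
fuel, total_fuel = optimal_fuel_split([4, 2, 6, 3], battery_capacity=8,
                                      battery_power_max=5)
```

Because the objective and constraints here are linear, any LP solver suffices; the talk's convex formulation presumably carries richer (e.g. battery-stress) terms that would need a general convex solver.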

View file: Jfoxhybrid.pdf

 


Detecting Simultaneous Changepoints Across Multiple Data Sequences

...

Abstract: Motivated by applications in genomics, finance, and biomolecular simulation, we introduce a Bayesian model called BASIC for changepoints that tend to co-occur across multiple related data sequences. We design efficient algorithms to infer changepoint locations by sampling from and maximizing over the posterior changepoint distribution. We further develop a Monte Carlo expectation-maximization procedure for estimating unknown prior hyperparameters from data. The resulting framework accommodates a broad range of data and changepoint types, including real-valued sequences with changing mean or variance and sequences of counts or binary observations. We apply BASIC to analyze DNA copy number variations in the NCI-60 cancer cell lines and to identify important events that affected the price volatility of S&P 500 stocks from 2000 to 2009.

 

View file: ZhouFan_BASIC.pdf

...

Abstract: Recent advances in machine learning have made significant contributions to drug discovery. Deep neural networks in particular have been demonstrated to provide significant boosts in predictive power when inferring the properties and activities of small-molecule compounds. However, the applicability of deep learning has been limited by the requirement for large amounts of training data. In this work, we demonstrate how one-shot learning can be used to significantly lower the amount of data required to make meaningful predictions in drug discovery applications. We introduce a new architecture, the iterative refinement long short-term memory, that, when combined with graph convolutional neural networks, significantly improves learning of meaningful distance metrics over small molecules. Our models are open-sourced as part of DeepChem, an open framework for deep learning in drug discovery and quantum chemistry.
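The iterative refinement LSTM itself is beyond a short sketch, but the underlying one-shot idea, classifying a query by attention over a small labeled support set in a learned embedding space (matching-network style), can be illustrated in plain NumPy. The embeddings and labels below are toy stand-ins, not molecular features.

```python
import numpy as np

def matching_predict(query, support, support_labels, temp=1.0):
    """One-shot prediction by attention over a support set: softmax
    weights over negative embedding distances combine support labels
    into a soft prediction (probability of class 1)."""
    d = np.linalg.norm(support - query, axis=1)  # distance to each support point
    w = np.exp(-d / temp)
    w /= w.sum()                                 # attention weights
    return float(w @ support_labels)

# Two support "molecules" with one example per class (one-shot).
support = np.array([[0.0, 0.0], [1.0, 1.0]])
labels = np.array([0.0, 1.0])
p_near0 = matching_predict(np.array([0.1, 0.0]), support, labels)
p_near1 = matching_predict(np.array([0.9, 1.0]), support, labels)
```

In the paper's setting the embedding (graph convolutions plus the refinement LSTM) is learned end-to-end so that this attention step generalizes from very few labeled compounds.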

 

Bio: Bharath Ramsundar received a BA and BS from UC Berkeley in EECS and Mathematics and was valedictorian of his graduating class in mathematics. He is currently a PhD student in computer science at Stanford University with the Pande group. His research focuses on the application of deep learning to drug discovery. In particular, Bharath is the creator and lead developer of DeepChem, an open source package that aims to democratize the use of deep learning in drug discovery and quantum chemistry. He is supported by a Hertz Fellowship, the most selective graduate fellowship in the sciences.

...

View file: gan_presentation_SLAC.pdf

 

Models and Algorithms for Solving Sequential Decision Problems under Uncertainty

...

Abstract: Today's state-of-the-art machine learning models require massive labeled training sets, which usually do not exist for real-world applications. Instead, I'll discuss a newly proposed machine learning paradigm, data programming, and a system built around it, Snorkel, in which the developer focuses on writing a set of labeling functions: scripts that programmatically label data. The resulting labels are noisy, but we model this as a generative process, learning, in effect, which labeling functions are more accurate than others, and then use this to train an end discriminative model (for example, a deep neural network in TensorFlow). Under certain conditions, we show that this method has the same asymptotic scaling with respect to generalization error as directly supervised approaches. Empirically, we find that by modeling the noisy training-set creation process in this way, we can take potentially low-quality labeling functions from the user and use them to train high-quality end models. We see this as a general framework for many weak-supervision techniques and, at a higher level, as a new programming model for weakly supervised machine learning systems.
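A minimal sketch of the labeling-function idea, with made-up functions for a toy spam task and a plain unweighted vote standing in for the generative model that data programming actually fits:

```python
import numpy as np

# Hypothetical labeling functions for a toy spam task; each returns
# +1 (spam), -1 (not spam), or 0 (abstain).
def lf_keyword(text):
    return 1 if "free money" in text else 0

def lf_length(text):
    return -1 if len(text.split()) > 8 else 0

def lf_exclaim(text):
    return 1 if text.count("!") >= 2 else 0

def label_matrix(texts, lfs):
    """Rows: examples, columns: labeling-function outputs."""
    return np.array([[lf(t) for lf in lfs] for t in texts])

def majority_label(L):
    """Unweighted vote over non-abstaining LFs. Data programming
    instead learns per-LF accuracies from L's agreement structure
    and weights the vote accordingly."""
    return np.sign(L.sum(axis=1))

texts = ["win free money now!!",
         "see you at the meeting tomorrow about the report"]
L = label_matrix(texts, [lf_keyword, lf_length, lf_exclaim])
y = majority_label(L)
```

The resulting noisy labels y would then train a discriminative end model; the point of the generative step is that it recovers LF accuracies without any ground truth.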

View file: AlexRatner_SLAC_ml_reading_share.pptx

ProxImaL: Efficient Image Optimization using Proximal Algorithms

...