Upcoming Seminar

Sept 5: MacroBase: A Search Engine for Fast Data Streams

Date: Sept. 5, 2pm

Speaker: Sahaana Suri (Stanford)

While data volumes generated by sensors, automated processes, and application telemetry continue to rise, the capacity of human attention remains limited. To harness the potential of these large-scale data streams, machines must step in by processing, aggregating, and contextualizing significant behaviors within these data streams. This talk will describe progress towards achieving this goal via MacroBase, a new analytics engine for prioritizing attention in this large-scale "fast data" that has begun to deliver results in several production environments. Key to this progress are new methods for constructing cascades of analytic operators for classification, aggregation, and high-dimensional feature selection; when combined, these cascades yield new opportunities for dramatic scalability improvements via end-to-end optimization for streams spanning time-series, video, and structured data. MacroBase is a core component of the Stanford DAWN project (http://dawn.cs.stanford.edu/), a new research initiative designed to enable more usable and efficient machine learning infrastructure.

 

Sept. 26: Optimal Segmentation with Pruned Dynamic Programming

Date: Sept. 26, 2pm

Speaker: Jeffrey Scargle (NASA)

Bayesian Blocks (arXiv:1207.5578) is an O(N**2) dynamic programming algorithm to compute exact global optimal segmentations of sequential data of arbitrary mode and dimensionality. Multivariate data, generalized block shapes, and higher-dimensional data are easily treated. Incorporating a simple pruning method yields a (still exact) O(N) algorithm allowing fast analysis of series of ~100M data points. Sample applications include analysis of X- and gamma-ray time series, identification of GC-islands in the human genome, data-adaptive triggers and histograms, and elucidating the Cosmic Web from 3D galaxy redshift data.
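
For readers unfamiliar with this kind of dynamic program, the following is a minimal sketch of the unpruned O(N**2) recursion, not the speaker's implementation. It assumes point measurements with unit-variance Gaussian noise and a piecewise-constant model. The function name optimal_segmentation and the per-block penalty value are illustrative assumptions, and the pruning step mentioned in the abstract is not included.

```python
import numpy as np

def optimal_segmentation(x, penalty=4.0):
    """Return the start indices of the blocks in the best piecewise-constant fit.

    Assumption: unit-variance Gaussian noise, so the fitness of a block is
    (sum of values)**2 / (2 * length), minus a fixed penalty per block.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    csum = np.concatenate(([0.0], np.cumsum(x)))  # csum[k] = sum of x[:k]
    best = np.zeros(n + 1)         # best[k]: optimal total fitness of x[:k]
    last = np.zeros(n, dtype=int)  # last[k-1]: start of final block for x[:k]

    for k in range(1, n + 1):
        starts = np.arange(k)              # candidate starts of the final block
        sums = csum[k] - csum[starts]      # sum of x[start:k] for each candidate
        lengths = k - starts
        fitness = sums**2 / (2.0 * lengths) - penalty
        total = best[starts] + fitness     # best fit before the block + block fit
        j = int(np.argmax(total))
        best[k] = total[j]
        last[k - 1] = j

    # Walk backwards through the stored block starts to recover the segmentation.
    change_points = []
    k = n
    while k > 0:
        j = int(last[k - 1])
        change_points.append(j)
        k = j
    return change_points[::-1]
```

Each step k reuses the optimal solutions for all shorter prefixes, which is what makes the result globally optimal; the inner search over all candidate block starts is the source of the quadratic cost.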


Past Seminars

...

Object-Centric Machine Learning

Date: Aug. 29, 2pm

Speaker: Leo Guibas

Deep knowledge of the world is necessary if we are to have autonomous and intelligent agents and artifacts that can assist us in everyday activities, or even carry out tasks entirely independently. One way to factorize the complexity of the world is to associate information and knowledge with stable entities, animate or inanimate, such as persons or vehicles, etc -- what we generally refer to as "objects."

In this talk I'll survey a number of recent efforts whose aim is to create and annotate reference representations for (inanimate) objects based on 3D models with the aim of delivering such information to new observations, as needed. In this object-centric view, the goal is to learn about object geometry, appearance, articulation, materials, physical properties, affordances, and functionality. We acquire such information in a multitude of ways, both from crowd-sourcing and from establishing direct links between models and signals, such as images, videos, and 3D scans -- and through these to language and text. The purity of the 3D representation allows us to establish robust maps and correspondences for transferring information among the 3D models themselves -- making our current 3D repository, ShapeNet, a true network.

While neural network architectures have had tremendous impact in image understanding and language processing, their adaptation to 3D data is not entirely straightforward. The talk will also briefly discuss current approaches in designing deep nets appropriate for operating directly on irregular 3D data representations, such as meshes or point clouds, both for analysis and synthesis -- as well as ways to learn object function from observing multiple action sequences involving objects -- in support of the above program.

Reconstruction Algorithms for Next-Generation Imaging: Multi-Tiered Iterative Phasing for Fluctuation X-ray Scattering and Single-Particle Diffraction

Date: Aug. 15, 2pm

Location: Tulare (B53-4006) (NOTE CHANGE IN ROOM!)

Speaker: Jeffrey Donatelli (CAMERA, Berkeley)

Abstract: The development of X-ray free-electron lasers has enabled new experiments to study uncrystallized biomolecules that were previously infeasible with traditional X-ray sources. One such emerging experimental technique is fluctuation X-ray scattering (FXS), where one collects a series of diffraction patterns, each from multiple particles in solution, using ultrashort X-ray pulses that allow snapshots to be taken below rotational diffusion times of the particles. The resulting images contain angularly varying information from which angular correlations can be computed, yielding several orders of magnitude more information than traditional solution scattering methods. However, determining molecular structure from FXS data introduces several challenges, since, in addition to the classical phase problem, one must also solve a hyper-phase problem to determine the 3D intensity function from the correlation data. In another technique known as single-particle diffraction (SPD), several diffraction patterns are collected, each from an individual particle. However, the samples are delivered to the beam at unknown orientations and may also be present in several different conformational states. In order to reconstruct structural information from SPD, one must determine the orientation and state for each image, extract an accurate 3D model of the intensity function from the images, and solve for the missing complex phases, which are not measured in diffraction images.

In this talk, we present the multi-tiered iterative phasing (M-TIP) algorithm for determining molecular structure from both FXS and SPD data. This algorithm breaks up the associated reconstruction problems into a set of simpler subproblems that can be efficiently solved by applying a series of projection operators. These operators are combined in a modular iterative framework which is able to simultaneously determine missing parameters, the 3D intensity function, the complex phases, and the underlying structure from the data. In particular, this approach is able to leverage prior knowledge about the structural model, such as shape or symmetry, to obtain a reconstruction from very limited data with excellent global convergence properties and high computational efficiency. We show results from applying M-TIP to determine molecular structure from both simulated data and experimental data collected at the Linac Coherent Light Source (LCLS).

File: Donatelli_Phasing.pdf

Exploratory Studies in Neural Network-based Modeling and Control of Particle Accelerators

Date: Aug. 1, 2pm

Speaker: Auralee Edelen (CSU)

Particle accelerators are host to myriad control challenges: they involve a multitude of interacting systems, are often subject to tight performance demands, in many cases exhibit nonlinear behavior, sometimes are not well-characterized due to practical and/or fundamental limitations, and should be able to run for extended periods of time with minimal interruption. One avenue toward improving the way these systems are controlled is to incorporate techniques from machine learning. Within machine learning, neural networks in particular are appealing because they are highly flexible, they are well-suited to problems with nonlinear behavior and large parameter spaces, and their recent success in other fields (driven largely by algorithmic advances, greater availability of large data sets, and improvements in high performance computing resources) is an encouraging indicator that they are now technologically mature enough to be fruitfully applied to particle accelerators. This talk will highlight a few recent efforts in this area that were focused on exploring neural network-based approaches for modeling and control of several particle accelerator subsystems, both through simulation and experimental studies. 

File: Aug1_20170801_Edelen_AI_Seminar_SLAC_slides.pdf

Estimating behind-the-meter solar generation with existing measurement infrastructure

Date: July 11, 2pm

Speaker: Emre Kara

Real-time PV generation information is crucial for distribution system operations such as switching, state estimation, and voltage management. However, most behind-the-meter solar installations are not monitored. Typically, the only information available to the distribution system operator is the installed capacity of solar behind each meter, though in many cases even the presence of solar may be unknown. We present a method for disaggregating behind-the-meter solar generation using only information that is already available in most distribution systems. Specifically, we present a contextually supervised source separation strategy adopted to address the behind-the-meter solar disaggregation problem. We evaluate the model sensitivities to different input parameters such as the number of solar proxy measurements, number of days in the training set, and region size.

File: EmreKara_Estimating the behind-the-meter solar generation with existing infrastructure.pdf

Development and Application of Online Optimization Algorithms

Date: June 27, 3pm

Location: Kings River, B52-306 (Note change in time and place!)

Speaker: Xiaobiao Huang

Automated tuning is an online optimization process. It can be faster and more efficient than manual tuning and can lead to better performance. It may also substitute for or improve upon model-based methods. Noise tolerance is a fundamental challenge for online optimization algorithms. We discuss our experience in developing a high-efficiency, noise-tolerant optimization algorithm, the RCDS method, and the successful application of the algorithm to various real-life accelerator problems. Experience with a few other online optimization algorithms is also discussed.

File: XiaobiaoHuang_RCDS.pdf

Machine Learning at NERSC: Past, Present, and Future

...