NERSC: Powering Scientific Discovery Since 1974

2020 NERSC Summer Internships

NERSC hosts a number of internships every summer. Applicants must be students actively enrolled in undergraduate or graduate programs. These are paid internships, but we are unable to provide additional support for travel or housing. Desired technical qualifications are specified with each project description.

To apply for one of these internships, please reach out to the listed NERSC mentors directly and send your CV/resume.

Application Performance

Automating Performance Analysis

This position has been filled for Summer 2020.

Project description: Many HPC codes that run at NERSC and elsewhere need to quantify their performance regularly. The standard approach is to implement tooling manually (instrumentation APIs) and to use profilers interactively (generally via dynamic instrumentation). NERSC has been developing a generic adaptor framework that simplifies recording these metrics (timing, memory, hardware counters, rooflines) and uploads them to a database that tracks them over time. The framework currently supports stand-alone and hybrid C/C++/CUDA/Python/Fortran applications via manual instrumentation and binary rewriting. The end goal is full support for dynamic instrumentation as well as instrumentation at the compiler level. The candidate(s) will have a number of sub-projects to choose from, including but not limited to: streamlining continuous integration (Python, CMake/CTest/CDash), improving visualization (Python/Jupyter, web), adding tools/capabilities (C++), writing/extending cross-language bindings (C, Python, Fortran), and validation.
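As a hedged illustration of the manual-instrumentation pattern the framework automates (the framework's actual API differs; the names below are hypothetical), an application-side timing region might look like:

```python
import time
from contextlib import contextmanager

METRICS = {}  # stand-in for the database that tracks metrics over time

@contextmanager
def instrument(label):
    """Record wall time for a labeled region; a real adaptor would also
    collect memory usage, hardware counters, rooflines, etc."""
    start = time.perf_counter()
    try:
        yield
    finally:
        METRICS.setdefault(label, []).append(time.perf_counter() - start)

# Wrap a region of interest; each run appends one sample for "solver".
with instrument("solver"):
    total = sum(i * i for i in range(100_000))
```

The appeal of the adaptor approach is that the `with instrument(...)` call sites stay fixed while the set of collectors behind them (and the upload destination) can change.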
Desired Skills: Performance analysis, experience in one or more of the following: C, C++, Python, CUDA, Fortran, GPUs, web, CMake/CTest 
NERSC mentors: Jonathan Madsen, Brandon Cook

Advanced Technologies

Power management in high-performance computing (HPC)

CS domain: Energy-efficient high-performance computing
Project description: The first of the Department of Energy's exascale systems will consume over 20 megawatts of power. Power consumption is a growing concern in supercomputing that limits the size and scale of systems. In this project, the summer intern will explore and evaluate the efficacy of power management methods in the context of hardware and applications of interest at the National Energy Research Scientific Computing Center (NERSC). This work aims to inform improvements in the design and operation of existing and future systems. The exact project will be determined with the supervisor but will likely involve one of the following two areas.

1. Evaluate power controls and counters on general-purpose Graphics Processing Units (GPUs)

A growing share of the floating-point operations per second (FLOP/s) delivered to HPC applications on today's top supercomputers comes from the GPUs in systems provisioned with both CPUs and GPUs. It is necessary to understand the power consumption of representative HPC applications/benchmarks on GPUs, and how the controls in the hardware affect it along with performance. This information can help the design of sophisticated power management schemes in the future. A further goal of the project is to understand the maturity of support for, and the granularity of, these power controls and counters in the GPUs.

2. Understanding the opportunities for power-steering between CPUs and GPUs for NERSC applications

Given that a sizable portion of the FLOPs in GPU-enabled HPC applications come from the GPUs, CPUs often sit waiting, while still dissipating power, for the GPUs to complete. We want to evaluate the possibility of throttling CPU power in such scenarios in the context of NERSC applications. Further, we want to investigate whether the power saved by CPU throttling can be redirected to the GPUs for additional performance gains. This will entail either power-performance analysis leading to a simulation/model, or developing a prototype framework that allows CPU/GPU throttling and power-steering.
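As a back-of-the-envelope sketch of the steering idea (all names and numbers here are hypothetical, not measurements or controls from any NERSC system), a simple model might look like:

```python
def steer_power(node_cap_w, cpu_w, gpu_w, cpu_idle_frac, cpu_floor_w):
    """Throttle a CPU that idles while waiting on the GPU, and redirect
    the saved power to the GPU without exceeding the node power cap."""
    # Lower CPU power in proportion to its idle fraction, down to a floor.
    cpu_new = cpu_w - cpu_idle_frac * (cpu_w - cpu_floor_w)
    saved = cpu_w - cpu_new
    # Give the GPU the savings, limited by remaining headroom under the cap.
    headroom = node_cap_w - (cpu_new + gpu_w)
    gpu_new = gpu_w + max(0.0, min(saved, headroom))
    return cpu_new, gpu_new

# CPU idle 80% of the time at 300 W (floor 100 W), GPU at 600 W, 1 kW cap.
cpu_new, gpu_new = steer_power(1000.0, 300.0, 600.0, 0.8, 100.0)
print(cpu_new, gpu_new)  # 140.0 760.0
```

The internship's analysis would replace these made-up inputs with measured power traces and real hardware throttling controls.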

Desired Skills/Background: Familiarity with Linux environments and programming in C/C++ or Python; familiarity with running HPC applications that use MPI, OpenMP, or heterogeneous programming paradigms (strongly preferred); familiarity with performance modeling and analysis, or with framework development using hardware counters or controls (highly beneficial); senior undergraduate or graduate student in computer science or a related field
NERSC/ATG mentors: Sridutt Bhalachandra <[email protected]>, Nicholas Wright <[email protected]>

Performance analysis for next-generation HPC architectures

CS domain: Performance benchmarking and analysis
Project description: Understanding and predicting the performance of NERSC's workload on various processors and networks has been a fundamental step in the selection of every NERSC HPC system. However, hardware trends toward complex heterogeneous architectures with specialized accelerators now require more advanced modeling and analysis to produce accurate predictions. This project will improve NERSC's performance modeling capabilities through a combination of

1) expanding the suite of runtime experiments, 2) applying new and existing performance analyses to NERSC applications, 3) developing tools and procedures to measure specific performance properties, or 4) identifying ways to derive useful insights from system-wide monitoring (e.g., LDMS).
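One of the simplest performance models in this space is the roofline model; a minimal sketch (with purely illustrative numbers, not the specs of any NERSC system) is:

```python
def roofline_gflops(peak_gflops, mem_bw_gbs, arithmetic_intensity):
    """Attainable GFLOP/s for a kernel with the given arithmetic
    intensity (FLOPs per byte moved to/from memory)."""
    return min(peak_gflops, mem_bw_gbs * arithmetic_intensity)

peak, bw = 3000.0, 900.0  # hypothetical node: 3 TFLOP/s peak, 900 GB/s
for ai in (0.25, 1.0, 10.0):
    print(ai, roofline_gflops(peak, bw, ai))
# Low-intensity kernels are bandwidth-bound; high-intensity ones compute-bound.
```

Real modeling work layers measured counters and runtime experiments on top of simple closed forms like this one.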

Desired Skills/Background: Familiarity with Linux environments and programming in C/C++ or Python; familiarity with running HPC applications that use MPI; familiarity with basic concepts in performance profiling and performance models; senior undergraduate or graduate student in computer science or computational science
NERSC/ATG mentors: Brian Austin <[email protected]>, Nicholas Wright <[email protected]>

Data, Analytics, and Machine Learning

Deep Learning benchmarking and infrastructure

Science/CS domain: software development, benchmarking, reproducible workflows
Project Description: Help support our scientific deep learning workflows on world-class supercomputers! We need an enthusiastic intern to contribute to one or more of the following areas: developing and running HPC ML benchmarks from MLPerf and scientific applications, developing and evaluating productive workflow solutions for distributed training and hyper-parameter optimization, and profiling and optimizing real scientific applications.
Desired Skills/Background: DL libraries (TensorFlow, PyTorch), benchmarking, GPUs, hyper-parameter optimization, HPC
NERSC/DAS mentors: Steve Farrell ([email protected]), Mustafa Mustafa ([email protected]), Wahid Bhimji ([email protected])

Deep Learning for protein understanding and design

Science/CS domain: bio-engineering, deep learning, sequential models
Project Description: Synthetic biology is a rapidly growing field with many promising applications in energy, medicine, and materials. In this project, in collaboration with scientists from the Joint BioEnergy Institute and the Agile Biofoundry, you will work on developing machine learning models to aid in the design of proteins for producing renewable biofuels and other bioproducts. You will explore models such as state-of-the-art deep neural networks for sequential data and natural language tasks, and techniques such as self-supervised learning, to extract knowledge from protein sequence data.
Desired Skills/Background: Deep Learning, interest or experience with proteins/biosciences
NERSC/DAS mentors: Steve Farrell ([email protected]), Wahid Bhimji ([email protected])

Faster PDE Solutions using Deep Learning

Many scientific codes running at NERSC involve solving some sort of partial differential equation (PDE). Modern PDE solver libraries [1,2] are based on iterative techniques such as GMRES, whose performance (i.e., the number of iterations) is highly dependent on the quality of the preconditioner. The current state of the art constructs preconditioners from physical and empirical insight, and requires an experienced user. In this project we will use a simple yet important PDE as a test bed: the Stokes equation. This PDE appears in a plethora of problems throughout science, so faster times-to-solution will have a significant impact. We will use deep learning (DL) frameworks such as PyTorch and TensorFlow to monitor an implementation of GMRES in situ as it simulates solutions to the Stokes equation, and train a model to improve the preconditioner. The end goal of this project is a DL library, deployed on NERSC's Cori-GPU system, that performs in-situ performance tuning of the GMRES PDE solver.
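To see why the preconditioner matters, here is a toy illustration (not the project code, and using a simple stationary iteration rather than GMRES, though GMRES behaves analogously): a Jacobi preconditioner sharply cuts the iteration count on a small symmetric system.

```python
def richardson(A, b, M_inv, omega, tol=1e-8, max_iter=1000):
    """Solve A x = b with preconditioned Richardson iteration.
    M_inv is the inverse of a diagonal preconditioner, applied to the
    residual each step. Returns (solution, iterations used)."""
    n = len(b)
    x = [0.0] * n
    for k in range(1, max_iter + 1):
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        if sum(ri * ri for ri in r) ** 0.5 < tol:
            return x, k
        x = [x[i] + omega * M_inv[i] * r[i] for i in range(n)]
    return x, max_iter

A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]

# No preconditioning (identity); omega kept small enough to converge.
_, iters_plain = richardson(A, b, M_inv=[1.0, 1.0], omega=0.2)
# Jacobi preconditioning: scale the residual by 1 / diag(A).
_, iters_jacobi = richardson(A, b, M_inv=[0.25, 1.0 / 3.0], omega=1.0)

print(iters_plain, iters_jacobi)  # Jacobi needs far fewer iterations
```

The project replaces the hand-picked diagonal scaling with a learned preconditioner, aiming for the same effect on the much harder Stokes problem.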
Desired Skills: C++, Python, some familiarity with DL libraries (TensorFlow, PyTorch) and GPUs
NERSC mentors: Johannes Blaschke ([email protected])

[1]: M. Cai, A. Nonaka, B. E. Griffith, J. B. Bell, and A. Donev, "Efficient Variable-Coefficient Finite-Volume Stokes Solvers," Commun. Comput. Phys., 16, 1263-1297, 2014 https://doi.org/10.4208/cicp.070114.170614a
[2]: Source code available here: https://github.com/AMReX-FHD/FHDeX, documentation on linear PDE solvers available here: https://amrex-codes.github.io/amrex/docs_html/LinearSolvers_Chapter.html

Spatial and temporal super-resolution for climate science

CS Domain: Deep Learning
Project description: High-resolution, high-fidelity climate models are extremely expensive to simulate, even with advances in HPC technology. Downscaling (or super-resolving) climate datasets using physical or statistical models has remained a challenging task. In this project we will use existing SRGAN architectures to super-resolve regional climate model output for critical variables such as winds, humidity, and precipitation. Such highly resolved datasets will serve the community well for making predictions of extreme weather events at scales of relevance to society for planning, mitigation, and adaptation.
Desired Skills / Background: Deep Learning, Machine Learning, Python, Climate Science (not necessary but beneficial)
NERSC/DAS mentors: Karthik Kashinath ([email protected]), Mustafa Mustafa ([email protected]), Adrian Albert ([email protected])

Segmentation of hi-resolution sparse climate data

CS Domain: Deep Learning
Project description: The ClimateNet project is generating an expert-labeled, high-quality, curated dataset for extreme weather phenomena in global climate model output. Recent work has shown that transfer learning can be used successfully for important deep learning tasks in climate science (classification and regression); see, for example, the recent Nature article on DL for ENSO forecasting (https://www.nature.com/articles/s41586-019-1559-7). Here we propose to train DL segmentation models in a two-step fashion: first with heuristic-based labels (from the leading heuristic algorithms for tropical cyclones, atmospheric rivers, and extra-tropical cyclones), and then with the expert-labeled curated dataset from ClimateNet. We will show how the two can be combined to fine-tune high-fidelity segmentation models. A larger goal of this project is to examine how segmentation methods can be extended to high-dimensional data with sparse signals. By leveraging ideas from signal processing and geometric deep learning, we hope to segment and track events with much better success.
Desired Skills / Background: Deep Learning, Signal Processing, Geometric Learning, Python, Climate Science (not necessary but beneficial)
NERSC/DAS mentors: Karthik Kashinath ([email protected]), Adrian Albert ([email protected]), Mayur Mudigonda ([email protected]), Mustafa Mustafa ([email protected])

Spatio-temporal deep learning for turbulent flows

CS Domain: Deep Learning
Project description: Simulating turbulent flows is extremely expensive given the wide range of scales to be simulated and the complex models involved for sub-grid-scale physics. ML and DL are beginning to show promise for replacing or augmenting complex turbulent flow models (CFD). However, much work remains before these become operational, including enforcing physical laws (conservation laws and other physical constraints). In this project we will develop physics-informed ML and DL models for turbulent flow modeling, building upon previous work in our group in this area.
Desired Skills / Background: Deep Learning, Machine Learning, Python, Climate Science (not necessary but beneficial)
NERSC/DAS mentors: Karthik Kashinath ([email protected]), Mustafa Mustafa ([email protected]), Adrian Albert ([email protected])

Self-supervised representation learning of photometric redshifts

CS Domain: Deep Learning

Project Description: Sky surveys supply us with large amounts of data that can be used to glean information about the content, evolution, and expansion rate of the universe. One important task is determining distances to observed galaxies. This, however, is challenging to do accurately using only the photometric data collected in these large sky surveys. Other methods, such as spectroscopic redshift measurements, provide very accurate distances but at a higher cost, and are thus done for a limited number of galaxies. The challenge is to use this limited number of “labels” to estimate galaxy distances from the photometric data collected in sky surveys. In this project we aim to use self-supervised representation learning methods to achieve that.

Desired Skills/Background: experience with TensorFlow and/or PyTorch. Experience in implementing deep learning models and reproducing literature results. Experience in physics or cosmology is a plus but is not required.

Mentors: Mustafa Mustafa ([email protected]), George Stein ([email protected]), Peter Harrington ([email protected]), Zarija Lukic ([email protected])

GPU I/O Evaluation for ML & AI workloads

Science/CS domain: GPU I/O performance evaluation and optimization
Project Description: In current ML and AI workloads, as well as in large-scale simulations that use GPUs, the CPU (host) memory or a memory shared between CPU and GPU is used as a buffer for I/O between GPUs and file systems. With increasing data sizes, this intermediate buffer becomes a bottleneck. New technologies, such as NVIDIA's GPUDirect Storage, offer solutions for transferring data between GPUs and file systems directly. In this internship, the student will first evaluate the current solutions by profiling I/O performance, and will then evaluate the impact of new optimization strategies such as asynchronous I/O. The project will use ML & AI workloads such as CosmoFlow and a subsurface simulation workload on a GPU testbed at NERSC. The I/O software includes HDF5 and the latest version of the asynchronous I/O virtual object layer (VOL) connector for HDF5. Depending on the availability of GPUDirect Storage technology, the student may further evaluate I/O in the chosen workloads and its potential benefits.
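As a hedged sketch of the core idea behind asynchronous I/O (the real work would go through HDF5 and its async VOL connector; this uses a background thread and an in-memory list standing in for the file system):

```python
import threading

storage = []  # stand-in for the file system

def write_chunk(chunk):
    storage.append(list(chunk))

pending = None
for step in range(3):
    chunk = [step] * 4                # "compute" the next chunk of results
    if pending is not None:
        pending.join()                # only now wait on the previous write
    pending = threading.Thread(target=write_chunk, args=(chunk,))
    pending.start()                   # write proceeds while we compute
if pending is not None:
    pending.join()                    # flush the final write

print(storage)  # [[0, 0, 0, 0], [1, 1, 1, 1], [2, 2, 2, 2]]
```

The point the internship would quantify is how much of the write time this overlap hides for real GPU-resident data.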
Desired Skills/Background: Good understanding of GPUs and I/O, and a background in Python
NERSC/DAS mentor: Quincey Koziol <[email protected]> and Suren Byna ([email protected])

Building a Multi-Petabyte Data Portal

CS domain: Software development, web front end, databases
Project description: Scientists at NERSC have PBs of data they need to manage and search across multiple storage layers of the file system. We would like to work with a motivated intern to help develop a responsive and intuitive web portal that can help scientists manage millions of files and guide them to the appropriate destination. The ideal intern would be able to code for the full web stack, from front-end HTML/CSS to back-end data wrangling.
Desired Skills/Background: React, D3, Python, Spark/Dask or similar, databases
NERSC/DAS mentors: Annette Greiner <[email protected]>, Lisa Gerhardt <[email protected]>

Data Science Engagement

A Compute Portal for Experimental Facilities

CS Domain: Software Development, Web, Databases, Authentication
Project Description: Experimental and observational facilities use NERSC to process PBs of data each year. These facilities require fine-grained, custom access to their data, while their workflows are largely reusable. This project aims to provide a portal for 1-3 standardized workflows (with visual feedback) for an experimental facility that can scale to hundreds of users without them ever touching the scheduler or a terminal. The intern will benefit from and/or contribute to NERSC's APIs and container solutions like SPIN or SHIFTER. Frequent interaction with an experimental facility and its domain scientists will guide the development of the interface. A successful project might even become a permanently visible service on the nersc.gov domain.
It’s time to give the power of HPC to the experimentalist, don’t you agree?
Desired Skills/Background: Python, Flask, Databases, Docker, Globus
NERSC/DSEG mentors: Bjoern Enders <[email protected]>, Deborah Bard <[email protected]>