2018 Advance Program

Please visit the IPDPS website regularly for updates, since there may be schedule revisions. Authors who have corrections should send email to contact@ipdps.org giving full details.

MONDAY - 21 May 2018

MONDAY WORKSHOPS
ALL DAY*
* See each individual workshop program for schedule details

 

 

IPDPS 2018 WORKSHOPS – MONDAY 21 MAY
1 HCW - Heterogeneity in Computing Workshop
2 RAW - Reconfigurable Architectures Workshop
3 HiCOMB - High Performance Computational Biology
4 GABB - Graph Algorithms Building Blocks
5 EduPar - NSF/TCPP Workshop on Parallel and Distributed Computing Education
6 HIPS - High Level Programming Models and Supportive Environments
7 HPBDC - High-Performance Big Data, Deep Learning, and Cloud Computing
8 AsHES - Accelerators and Hybrid Exascale Systems
9 PDCO - Parallel / Distributed Computing and Optimization
10 HPPAC - High-Performance, Power-Aware Computing
11 APDCM - Advances in Parallel and Distributed Computational Models
12 ParLearning - Parallel and Distributed Computing for Large-Scale Machine Learning and Big Data Analytics

 

TUESDAY - 22 May 2018

Opening Session
8:00 AM - 8:30 AM

Opening Session

Keynote Session
8:30 AM - 9:30 AM

Keynote

Session Chair: Anne Benoit


Michael Bender

Stony Brook University

 

The Algorithmics of Write Optimization

 

Abstract: Write-optimized dictionaries (WODs), such as LSM trees and B^epsilon trees, are increasingly used in databases and file systems.

Morning Break 9:30 AM - 10:00 AM

PhD Forum
All day

PhD Forum Posters

On Display All Day Tuesday and Wednesday

Parallel Technical
Sessions 1, 2, 3, & 4

10:00 AM - 12:00 PM

SESSION 1: Graph Algorithms 1

Session Chair: Johannes Langguth

 

MIDAS: Multilinear Detection at Scale
Saliya Ekanayake (Virginia Tech), Jose Cadena (Virginia Tech), Udayanga Wickramasinghe (Indiana University Bloomington), Anil Kumar Vullikanti (Virginia Tech)

 

Optimizing Parallel Graph Connectivity Computation via Subgraph Sampling
Michael Sutton (The Hebrew University of Jerusalem), Tal Ben-Nun (ETH Zurich), Amnon Barak (The Hebrew University of Jerusalem)

 

Parallel Algorithms through Approximation: b-Edge Cover
Alex Pothen (Purdue University), Arif Khan (Pacific Northwest National Lab), S. M. Ferdous (Purdue University)

 

A Parallel Algorithm for Bayesian Network Inference using Arithmetic Circuits
Md Vasimuddin (Indian Institute of Technology Bombay), Sriram P. Chockalingam (Georgia Institute of Technology), Srinivas Aluru (Georgia Institute of Technology)

 

 

SESSION 2: Large-Scale Applications 1

Session Chair: Alan Sussman

 

Cataloging the Visible Universe through Bayesian Inference at Petascale

Jeffrey Regier (University of California, Berkeley), Kiran Pamnany (Intel), Keno Fischer (Julia Computing), Andreas Noack (Massachusetts Institute of Technology), Maximilian Lam (University of California, Berkeley), Jarrett Revels (Massachusetts Institute of Technology), Steve Howard (University of California, Berkeley), Ryan Giordano (University of California, Berkeley), David Schlegel (Lawrence Berkeley National Laboratory), Jon McAuliffe (University of California, Berkeley), Rollin Thomas (Lawrence Berkeley National Laboratory), Prabhat (Lawrence Berkeley National Laboratory)

 

Efficient, Parallel At-Scale Correlation Analysis for Atom Probe Tomography on Hybrid Architectures
Hao Lu (Oak Ridge National Laboratory), Sudip Seal (Oak Ridge National Laboratory), Gregory Muzyn (University of Tennessee), Wei Guo (Oak Ridge National Laboratory), Jonathan Poplawsky (Oak Ridge National Laboratory)

 

A Fast and Massively-Parallel Solver for Nonlinear Tomographic Image Reconstruction
Mert Hidayetoglu (University of Illinois Urbana-Champaign), Carl Pearson (University of Illinois Urbana-Champaign), Izzat El Hajj (University of Illinois Urbana-Champaign), Levent Gurel (University of Illinois Urbana-Champaign), Weng Cho Chew (University of Illinois Urbana-Champaign), Wen-Mei Hwu (University of Illinois Urbana-Champaign)

 

Real-Time Massively Distributed Multi-Object Adaptive Optics Simulations for the European Extremely Large Telescope
Hatem Ltaief (KAUST), Ali Charara (KAUST), Damien Gratadour (LESIA - Observatoire de Paris), Nicolas Doucet (LESIA - Observatoire de Paris), Bilel Hadri (KAUST Supercomputing Lab), Eric Gendron (LESIA - Observatoire de Paris), Saber Feki (KAUST), David Keyes (KAUST)

 

 

SESSION 3: Performance / QoS / Resilience

Session Chair: Shuaiwen Leon Song

 

Performance Isolation of Data-Intensive Scale-out Applications in a Multi-tenant Cloud
Palden Lama (University of Texas at San Antonio), Shaoqi Wang (University of Colorado, Colorado Springs), Xiaobo Zhou (University of Colorado, Colorado Springs), Dazhao Cheng (UNC Charlotte)

 

QoS Support for Scientific Workflows using Software-Defined Storage Resource Enclaves
Suman Karki (Washington State University), Bao Nguyen (Washington State University), Xuechen Zhang (Washington State University)

 

Scalable Data Resilience for In-Memory Data Staging
Shaohua Duan (Rutgers Discovery Informatics Institute), Pradeep Subedi (Rutgers Discovery Informatics Institute), Keita Teranishi (Sandia National Laboratories), Philip Davis (Rutgers Discovery Informatics Institute), Hemanth Kolla (Sandia National Laboratories), Marc Gamell (Intel), Manish Parashar (Rutgers Discovery Informatics Institute)

 

Performance and Scalability of Lightweight Multi-Kernel based Operating Systems
Balazs Gerofi (RIKEN Advanced Institute For Computational Science), Rolf Riesen (Intel), Masamichi Takagi (RIKEN Advanced Institute For Computational Science), Taisuke Boku (University of Tsukuba), Yutaka Ishikawa (RIKEN Advanced Institute For Computational Science), Robert W. Wisniewski (Intel)

 

SESSION 4: Memory Designs and Optimizations

Session Chair: Antonino Tumeo

 

Architectural support for unlimited memory versioning and renaming
Eran Gilad (Technion), Tehila Mayzels (Technion - Israel Institute of Technology), Elazar Raab (Technion - Israel Institute of Technology), Mark Oskin (University of Washington), Yoav Etsion (Technion - Israel Institute of Technology)

 

CTA-Aware Prefetching and Scheduling for GPU
Gunjae Koo (University of Southern California), Hyeran Jeon (San Jose State University), Zhenhong Liu (University of Illinois at Urbana–Champaign), Nam Sung Kim (University of Illinois at Urbana–Champaign), Murali Annavaram (University of Southern California)

 

CIAO: Cache Interference-Aware Throughput-Oriented Architecture and Scheduling for GPUs
Jie Zhang (Yonsei University), Shuwen Gao (Intel), Nam Sung Kim (University of Illinois at Urbana-Champaign), Myoungsoo Jung (Yonsei University)

 

Millipede: Die-Stacked Memory Optimizations for Big Data Machine Learning Analytics
Nitin (NVIDIA), Mithuna Thottethodi (Purdue University), T. N. Vijaykumar (Purdue University)

Parallel Technical Sessions 5, 6, 7, & 8
1:30 PM - 3:30 PM

SESSION 5: Scheduling

Session Chair: Ioana Banicescu

 

Scheduling Monotone Moldable Jobs in Linear Time
Klaus Jansen (University of Kiel), Felix Land (Christian-Albrechts-Universität zu Kiel)

 

The Power to Schedule a Parallel Program
Kunal Agrawal (Washington University in Saint Louis), Seth Gilbert (National University of Singapore)

 

Scheduling Parallel Tasks under Multiple Resources: List Scheduling vs. Pack Scheduling
Hongyang Sun (Vanderbilt University), Redouane Elghazi (ENS Lyon), Ana Gainaru (Vanderbilt University), Guillaume Aupy (INRIA), Padma Raghavan (Vanderbilt University)

 

Parallel scheduling of DAGs under memory constraints
Loris Marchal (CNRS), Hanna Nagy (Technical University of Cluj-Napoca), Bertrand Simon (ENS Lyon), Frédéric Vivien (INRIA)

 

SESSION 6: Learning

Session Chair: Assefaw Gebremedhin

 

Evaluating Active Learning with Cost and Memory Awareness
Dmitry Duplyakin (University of Utah), Jed Brown (University of Colorado Boulder), Donna Calhoun (Boise State University)

 

Semantics-Preserving Parallelization of Stochastic Gradient Descent
Saeed Maleki (Microsoft), Madanlal Musuvathi (Microsoft), Todd Mytkowicz (Microsoft)

 

Efficient Gradient Boosted Decision Tree Training on GPUs
Zeyi Wen (National University of Singapore), Bingsheng He (National University of Singapore), Ramamohanarao Kotagiri (The University of Melbourne), Shengliang Lu (National University of Singapore), Jiashuai Shi (National University of Singapore)

 

BitFlow: Exploiting Vector Parallelism for Binary Neural Networks on CPU
Yuwei Hu (TuSimple Inc), Jidong Zhai (Tsinghua University), Dinghua Li (TuSimple Inc), Yifan Gong (TuSimple Inc), Yuhao Zhu (University of Rochester), Wei Liu (TuSimple Inc), Lei Su (TuSimple Inc), Jiangming Jin (TuSimple Inc)

 

SESSION 7: Compilers and Libraries

Session Chair: Frank Mueller

 

Lightweight MPI Communicators with Applications to Perfectly Balanced Quicksort
Michael Axtmann (Karlsruhe Institute of Technology), Armin Wiebigke (Karlsruhe Institute of Technology), Peter Sanders (Karlsruhe Institute of Technology)

 

Improving Network Throughput with Global Communication Reordering
Wim Lavrijsen (LBNL), Costin Iancu (LBNL), Xing Pan (NC State)

 

Highly Efficient Compensation-based Parallelism for Wavefront Loops on GPUs
Kaixi Hou (Virginia Tech), Hao Wang (Virginia Tech), Wu-Chun Feng (Virginia Tech), Jeffrey Vetter (Oak Ridge National Lab), Seyong Lee (Oak Ridge National Lab)

 

Development and application of a hybrid programming environment on an ARM/DSP system for High Performance Computing
Gaurav Mitra (Texas Instruments Inc.), Jonathan Bohmann (Southwest Research Institute), Ian Lintault (nCore HPC), Alistair Rendell (Australian National University)

 

SESSION 8: Optimizations for Emerging Storage Systems

Session Chair: Gokhan Memik

 

GC-aware Request Steering with Improved Performance and Reliability for SSD-based RAIDs
Suzhen Wu (Xiamen University), Weidong Zhu (Xiamen University), Guixin Liu (Xiamen University), Hong Jiang (University of Texas-Arlington), Bo Mao (Xiamen University)

 

A Set-aware Key-Value Store on Shingled Magnetic Recording Drives with Dynamic Band
Ting Yao (Wuhan National Laboratory For Optoelectronics, Huazhong University of Science & Technology), Jiguang Wan (Wuhan National Laboratory For Optoelectronics, Huazhong University of Science & Technology), Ping Huang (Temple University), Xubin He (Temple University), Yiwen Zhang (Wuhan National Laboratory For Optoelectronics, Huazhong University of Science & Technology), Zhihu Tan (Huazhong University of Science and Technology), Changsheng Xie (Huazhong University of Science and Technology)

 

Software-Hardware Managed Last-level Cache Allocation Scheme for Large-Scale NVRAM-based Multicores Executing Parallel Data Analytics Applications
Masab Ahmad (University of Connecticut), Halit Dogan (University of Connecticut), Fabio Checconi (IBM), Xinyu Que (IBM), Daniele Buono (IBM), Omer Khan (University of Connecticut)

 

MOCA: Memory Object Classification and Allocation in Heterogeneous Memory Systems
Aditya Narayan (Boston University), Tiansheng Zhang (Boston University), Shaizeen Aga (University of Michigan), Satish Narayanasamy (University of Michigan), Ayse Coskun (Boston University)

Afternoon Break 3:30 PM - 4:00 PM

PLENARY SESSION:
Best Papers

4:00 PM - 6:00 PM

Best Paper Nominees - Plenary

Session Chair: Ümit Çatalyürek

 

Communication-free Massively Parallel Graph Generation
Daniel Funke (Karlsruhe Institute of Technology), Sebastian Lamm (Karlsruhe Institute of Technology), Peter Sanders (Karlsruhe Institute of Technology), Christian Schulz (University of Vienna), Darren Strash (Colgate University), Moritz von Looz (Karlsruhe Institute of Technology)

 

Understanding and Modeling Lossy Compression Schemes on HPC Scientific Data
Tao Lu (New Jersey Institute of Technology), Qing Liu (New Jersey Institute of Technology), Xubin He (Temple University), Huizhang Luo (New Jersey Institute of Technology), Eric Suchyta (Oak Ridge National Laboratory), Norbert Podhorszki (Oak Ridge National Laboratory), Scott Klasky (Oak Ridge National Laboratory), Matthew Wolf (Oak Ridge National Laboratory), Tong Liu (Temple University)

 

UBIS: Utilization-aware cluster scheduling
Karthik Kambatla (Facebook), Vamsee Yarlagadda (Cloudera), Íñigo Goiri (Microsoft), Ananth Grama (Purdue University)

 

Hardware Transactional Memory meets Persistent Memory
Daniel Castro (Instituto Superior Técnico & INESC-ID), Paolo Romano (Instituto Superior Técnico & INESC-ID), João Barreto (Instituto Superior Técnico & INESC-ID)

Industry Event
7:00 PM

Emu Technology

 

Innovative Algorithms Held Back by Conventional Computers?

Learn about a new Big Data platform with 10x+ performance improvements for addressing the challenges of “data intensive problems”

 

Presented by Janice McMahon of Emu Technology

 


WEDNESDAY - 23 May 2018

Keynote Session
8:30 AM – 9:30 AM

Keynote

Session Chair: Viktor Prasanna


Keren Bergman

Columbia University

 

Empowering flexible and scalable high performance architectures with embedded photonics

 

Abstract: The recent explosive growth in data analytics applications that rely on machine and deep learning techniques is seismically changing the landscape of high performance architectures.

Morning Break 9:30 AM - 10:00 AM

PhD Forum
All day

PhD Forum Posters

On Display All Day Tuesday and Wednesday

Parallel Technical
Sessions 9, 10, 11, & 12

10:00 AM - 12:00 PM

SESSION 9: Numerical Algorithms

Session Chair: Saday Sadayappan

 

Large Bandwidth-Efficient FFTs on Multicore and Multi-Socket Systems
Doru Thom Popovici (Carnegie Mellon University), Tze Meng Low (Carnegie Mellon University), Franz Franchetti (Carnegie Mellon University)

 

Lattice H-Matrices on Distributed-Memory Systems
Akihiro Ida (The University of Tokyo)

 

Evaluating the Performance and Cost of Accelerating Seismic Processing with CUDA, OpenCL, OpenACC, and OpenMP
Tiago Lobato Gimenes (HPG Lab), Flávia Pisani (LMCAD-Unicamp), Edson Borin (Unicamp)

 

Avoiding Synchronization in First-Order Methods for Sparse Convex Optimization
Aditya Devarakonda (University of California, Berkeley), Kimon Fountoulakis (University of California, Berkeley), James Demmel (University of California, Berkeley), Michael Mahoney (University of California, Berkeley)

 

SESSION 10: GPU Hashing and Searching

Session Chair: George Slota

 

A Dynamic Hash Table for the GPU
Saman Ashkiani (University of California, Davis), Martin Farach-Colton (Rutgers University), John D. Owens (University of California, Davis)

 

GPU LSM: A Dynamic Dictionary Data Structure for the GPU
Saman Ashkiani (University of California, Davis), Shengren Li (University of California, Davis), Martin Farach-Colton (Rutgers University), Nina Amenta (University of California, Davis), John D. Owens (University of California, Davis)

 

WarpDrive: Massively Parallel Hashing on Multi-GPU Nodes
Daniel Jünger (University of Mainz), Christian Hundt (University of Mainz), Bertil Schmidt (University of Mainz)

 

Quotient Filters: Approximate Membership Queries on the GPU
Afton Geil (University of California, Davis), Martin Farach-Colton (Rutgers University), John Owens (University of California, Davis)

 

SESSION 11: Domain-Specific, Runtime and Autotuning

Session Chair: Cosmin Oancea

 

BabelFlow: An Embedded Domain Specific Language for Parallel Analysis and Visualization
Steve Petruzza (SCI Institute - University of Utah), Sean Treichler (Stanford University), Valerio Pascucci (SCI Institute - University of Utah), Peer-Timo Bremer (Lawrence Livermore National Lab)

 

Online Tuning of Parallelism Degree in Parallel Nesting Transactional Memory
Jingna Zeng (IST), Paolo Romano (INESC-ID/IST), Joao Barreto (INESC-ID/Technical University Lisbon), Luis Rodrigues (IST/INESC-ID), Seif Haridi (SICS)

 

Work-Stealing, Locality-Aware Actor Scheduling
Saman Barghi (University of Waterloo), Martin Karsten (University of Waterloo)

 

Indigo: A Domain-Specific Language for Fast, Portable Image Reconstruction
Michael Driscoll (University of California, Berkeley), Benjamin Brock (University of California, Berkeley), Frank Ong (University of California, Berkeley), Jonathan Tamir (University of California, Berkeley), Hsiou-Yuan Liu (University of California, Berkeley), Michael Lustig (University of California, Berkeley), Armando Fox (University of California, Berkeley), Katherine Yelick (University of California, Berkeley and Lawrence Berkeley National Laboratory)

 

SESSION 12: Resource Management

Session Chair: Loris Marchal

 

Swallow: Joint Online Scheduling and Coflow Compression in Datacenter Networks
Qihua Zhou (Nanjing University of Posts and Telecommunications), Peng Li (The University of Aizu), Kun Wang (Nanjing University of Posts and Telecommunications), Deze Zeng (China University of Geoscience), Song Guo (The Hong Kong Polytechnic University), Minyi Guo (Shanghai Jiao Tong University)

 

Auto-tuning Streamed Applications on Intel Xeon Phi
Peng Zhang (National University of Defense Technology), Jianbin Fang (National University of Defense Technology), Tao Tang (College of computer science, National University of Defense Technology, China), Canqun Yang (NUDT), Zheng Wang (Lancaster University)

 

Analyzing Resource Trade-offs in Hardware-overprovisioned Supercomputers
Ryuichi Sakamoto (The University of Tokyo), Tapasya Patki (Lawrence Livermore National Laboratory), Thang Cao (The University of Tokyo), Masaaki Kondo (The University of Tokyo), Koji Inoue (Kyushu University), Masatsugu Ueda (Kyushu University), Daniel Ellsworth (Lawrence Livermore National Laboratory), Barry Rountree (Lawrence Livermore National Laboratory), Martin Schulz (Lawrence Livermore National Laboratory)

 

Harnessing the Power of Many: Extensible Toolkit for Scalable Ensemble Applications
Vivek Balasubramanian (Rutgers University), Matteo Turilli (Rutgers University), Weiming Hu (The Pennsylvania State University), Matthieu Lefebvre (Princeton University), Wenjie Lei (Princeton University), Guido Cervone (The Pennsylvania State University), Jeroen Tromp (Princeton University), Shantenu Jha (Rutgers University)

Parallel Technical Sessions 13, 14, 15, & 16
1:30 PM – 3:30 PM

SESSION 13: Tensors

Session Chair: Siva Rajamanickam

 

A Fill Estimation Algorithm for Sparse Matrices and Tensors in Blocked Formats
Peter Ahrens (Massachusetts Institute of Technology), Helen Xu (Massachusetts Institute of Technology), Nicholas Schiefer (Massachusetts Institute of Technology)

 

Communication Lower Bounds for Matricized Tensor Times Khatri-Rao Product
Grey Ballard (Wake Forest University), Nicholas Knight (New York University), Kathryn Rouse (Wake Forest University)

 

Blocking Optimization Techniques for Sparse Tensor Computation
Jee Choi (IBM), Xing Liu (Intel), Shaden Smith (University of Minnesota), Tyler Simon (University of Maryland, Baltimore County)

 

TTLG - An Efficient Tensor Transposition Library for GPUs
Jyothi Vedurada (Indian Institute of Technology Madras), Arjun Suresh (The Ohio State University), Aravind Sukumaran Rajam (The Ohio State University), Jinsung Kim (The Ohio State University), Changwan Hong (The Ohio State University), Sriram Krishnamoorthy (Pacific Northwest National Lab), V. Krishna Nandivada (IIT Madras), Ajay Panyala (Pacific Northwest National Lab), Rohit Srivastava (The Ohio State University), P Sadayappan (The Ohio State University)

 

SESSION 14: Large Scale Applications 2

Session Chair: Taisuke Boku

 

Do Developers Understand Floating Point?
Peter Dinda (Northwestern University), Conor Hetland (Northwestern University)

 

sDPF-RSA: Utilizing Floating-point Computing Power of GPUs for Massive Digital Signature Computations
Jiankuo Dong (School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China), Fangyu Zheng (State Key Laboratory of Information Security, Institute of Information Engineering, CAS, Beijing, China), Niall Emmart (School of Computer Science, University of Massachusetts, Amherst), Jingqiang Lin (State Key Laboratory of Information Security, Institute of Information Engineering, CAS, Beijing, China), Charles Weems (School of Computer Science, University of Massachusetts, Amherst)

 

Rethinking large-scale economic modeling for efficiency: optimizations for GPU and Xeon Phi clusters
Simon Scheidegger (University of Zurich), Dmitry Mikushin (University of Zurich), Felix Kübler (University of Zurich), Olaf Schenk (Institute of Computational Science, Faculty of Informatics, Università della Svizzera italiana)

 

A Fast Scalable Implicit Solver with Concentrated Computation for Nonlinear Time-evolution Problems on Low-order Unstructured Finite Elements
Tsuyoshi Ichimura (The University of Tokyo), Kohei Fujita (The University of Tokyo), Masashi Horikoshi (Software and Solutions Group, Intel K.K.), Larry Meadows (Data Center Group, Intel Corporation), Kengo Nakajima (The University of Tokyo), Takuma Yamaguchi (The University of Tokyo), Kentaro Koyama (Frontier Computing Center, Fujitsu Limited), Hikaru Inoue (Frontier Computing Center, Fujitsu Limited), Akira Naruse (NVIDIA Corporation), Keisuke Katsushima (The University of Tokyo), Muneo Hori (The University of Tokyo), Maddegedara Lalith (The University of Tokyo)

 

SESSION 15: Data Operations

Session Chair: Jianfeng Zhan

 

Characterizing Scheduling Delay for Low-latency Data Analytics Workloads
Wei Chen (University of Colorado, Colorado Springs), Aidi Pi (University of Colorado, Colorado Springs), Shaoqi Wang (University of Colorado, Colorado Springs), Xiaobo Zhou (University of Colorado, Colorado Springs)

 

Runtime Scheduling Policies for Distributed Graph Algorithms
Jesun Firoz (Pacific Northwest National Lab), Marcin Zalewski (Pacific Northwest National Lab), Martina Barnas (Indiana University Bloomington), Andrew Lumsdaine (Pacific Northwest National Lab and University of Washington)

 

Communication Efficient Checking of Big Data Operations
Lorenz Hübschle-Schneider (Karlsruhe Institute of Technology), Peter Sanders (Karlsruhe Institute of Technology)

 

What Size Should your Buffers to Disks be?
Guillaume Aupy (INRIA), Olivier Beaumont (INRIA), Lionel Eyraud-Dubois (INRIA)

 

SESSION 16: Power and Temperature

Session Chair: Wu Feng

 

THOR: THermal-aware Optimizations for extending ReRAM lifetime
Majed Valad Beigi (Northwestern University), Gokhan Memik (Northwestern University)

 

CoolPIM: Thermal-Aware Source Throttling for Efficient PIM Instruction Offloading
Lifeng Nai (Google), Ramyad Hadidi (Georgia Institute of Technology), He Xiao (Georgia Institute of Technology), Hyojong Kim (Georgia Institute of Technology), Jaewoong Sim (Intel), Hyesoon Kim (Georgia Institute of Technology)

 

GreenSprint: Effective Computational Sprinting in Green Data Centers
Haoran Cai (HUST), Qiang Cao (HUST), Hong Jiang (University of Texas at Arlington)

 

Joint Server and Network Energy Saving in Data Centers for Latency-Sensitive Applications
Liang Zhou (University of California Riverside), Chih-Hsun Chou (University of California Riverside), Laxmi Bhuyan (University of California Riverside), K. K. Ramakrishnan (University of California Riverside), Daniel Wong (University of California Riverside)


Afternoon Break 3:30 PM - 4:00 PM

Plenary Program

4:00 PM – 6:00 PM

Details To Be Announced

PhD Forum
Special Session

6:00 PM

Posters on Display

Reception

6:30 PM – 7:30 PM

Details To Be Announced

Symposium Banquet

7:30 PM

Details To Be Announced

 

THURSDAY - 24 May 2018

Keynote Session
8:30 AM - 9:30 AM

Keynote

Session Chair: Bora Uçar

 

Bruce Hendrickson
Lawrence Livermore National Laboratory

 

The Day After Tomorrow: The Looming Post-Exascale Crisis

 

Abstract: This is the best of times for high-performance computing. Our simulations continue to grow in complexity and sophistication, and they drive progress in the physical sciences and engineering, improving the world in myriad ways.

Morning Break 9:30 AM - 10:00 AM

Parallel Technical Sessions 17, 18, 19, & 20
10:00 AM - 12:00 PM

SESSION 17: Graph Algorithms 2

Session Chair: Fredrik Manne

 

Implicit Decomposition for Write-Efficient Connectivity Algorithms
Naama Ben-David (Carnegie Mellon University), Guy Blelloch (Carnegie Mellon University), Jeremy Fineman (Georgetown University), Phillip B. Gibbons (Carnegie Mellon University), Yan Gu (Carnegie Mellon University), Charles McGuffey (Carnegie Mellon University), Julian Shun (Massachusetts Institute of Technology)

 

Distributed Symmetry Breaking in Graphs with Bounded Diversity
Leonid Barenboim (Open University of Israel), Tzalik Maimon (Open University of Israel)

 

Complete Visitability for Autonomous Robots on Graphs
Aisha Aljohani (Kent State University), Pavan Poudel (Kent State University), Gokarna Sharma (Kent State University)

 

Local Mixing Time: Distributed Computation and Applications
Anisur Rahaman Molla (NISER, Bhubaneswar), Gopal Pandurangan (University of Houston)

 

SESSION 18: Performance Modeling and Analysis

Session Chair: Jee Choi

 

Roofline Guided Design and Analysis of a Multi-stencil CFD Solver for Multicore Performance
Bahareh Mostafazadeh Davani (University of California, Irvine), Ferran Marti (University of California, Irvine), Feng Liu (University of California, Irvine), Aparna Chandramowlishwaran (University of California, Irvine)

 

Taming the "Monster": Overcoming Program Optimization Challenges on SW26010 Through Precise Performance Modeling
Shizhen Xu (Tsinghua University), Yuanchao Xu (Tsinghua University), Wei Xue (Tsinghua University), Xipeng Shen (North Carolina State University), Xiaomeng Huang (Tsinghua University), Guangwen Yang (Tsinghua University)

 

Performance and Accuracy Trade-offs of HPC Application Modeling and Simulation
Zhou Tong (Florida State University), Scott Pakin (Los Alamos National Lab), Mike Lang (Los Alamos National Lab), Xin Yuan (Florida State University)

 

PADDLE: Performance Analysis using a Data-driven Learning Environment
Jayaraman J. Thiagarajan (Lawrence Livermore National Laboratory), Rushil Anirudh (Lawrence Livermore National Laboratory), Bhavya Kailkhura (Lawrence Livermore National Laboratory), Nikhil Jain (Lawrence Livermore National Laboratory), Tanzima Islam (Western Washington University), Abhinav Bhatele (Lawrence Livermore National Laboratory), Jae-Seung Yeom (Lawrence Livermore National Laboratory), Todd Gamblin (Lawrence Livermore National Laboratory)

 

SESSION 19: Memory and Data Access

Session Chair: Jeffrey Young

 

Efficient Solving of Scan Primitive on Multi-GPU Systems
Adrian Perez Dieguez (University of Coruña), Margarita Amor (University of Coruña), Ramón Doallo (University of Coruña), Akira Nukada (Tokyo Institute of Technology), Satoshi Matsuoka (Tokyo Institute of Technology)

 

Quantifying the Performance and Energy-Efficiency Impact of Hardware Transactional Memory on Scientific Applications on Large-Scale NUMA Systems
Jinsu Park (UNIST), Woongki Baek (UNIST)

 

GPU-Accelerated Large-Scale Genome Assembly
Sayan Goswami (Louisiana State University), Kisung Lee (Louisiana State University), Shayan Shams (Louisiana State University), Seung-Jong Park (Louisiana State University)

 

GPU Data Access on Complex Geometries for D3Q19 Lattice Boltzmann Method
Gregory Herschlag (Duke University), Seyong Lee (Oak Ridge National Laboratory), Jeffrey Vetter (Oak Ridge National Laboratory), Amanda Randles (Duke University)

 

SESSION 20: Exception Handling & Error Detection

Session Chair: Peter Strazdins

 

SlimFast: Reducing Metadata Redundancy in Sound and Complete Dynamic Data Race Detection
Yuanfeng Peng (University of Pennsylvania), Christian Delozier (University of Pennsylvania), Ariel Eizenberg (University of Pennsylvania), William Mansky (Princeton University), Joseph Devietti (University of Pennsylvania)

 

Sword: A Bounded Memory-Overhead Detector of OpenMP Data Races in Production Runs
Simone Atzeni (University of Utah), Ganesh Gopalakrishnan (University of Utah), Zvonimir Rakamaric (University of Utah), Ignacio Laguna (Lawrence Livermore National Laboratory), Gregory Lee (Lawrence Livermore National Laboratory), Dong Ahn (Lawrence Livermore National Laboratory)

 

Unobtrusive Asynchronous Exception Handling with Standard Java Try/Catch Blocks
Mostafa Mehrabi (The University of Auckland), Nasser Giacaman (The University of Auckland), Oliver Sinnen (The University of Auckland)

 

COMPI: Concolic Testing for MPI Applications
Hongbo Li (University of California, Riverside), Sihuan Li (University of California, Riverside), Zachary Benavides (University of California, Riverside), Zizhong Chen (University of California, Riverside), Rajiv Gupta (University of California, Riverside)

Parallel Technical Sessions 21, 22, 23, & 24
1:30 PM - 3:30 PM

SESSION 21: Graph Algorithms 3

Session Chair: Ana Lucia Varbanescu

 

Experimental Design of Work Chunking for Graph Algorithms on High Bandwidth Memory Architectures
George M Slota (Rensselaer Polytechnic Institute), Siva Rajamanickam (Sandia National Labs)

 

Distributed Louvain Algorithm for Graph Community Detection
Sayan Ghosh (Washington State University), Mahantesh Halappanavar (PNNL), Antonino Tumeo (PNNL), Ananth Kalyanaraman (Washington State University), Hao Lu (ORNL), Daniel Chavarria-Miranda (PNNL), Arif Khan (PNNL), Assefaw Gebremedhin (Washington State University)

 

Application Codesign of Near-Data Processing for Similarity Search
Vincent T. Lee (University of Washington), Amrita Mazumdar (University of Washington), Carlo C. Del Mundo (University of Washington), Armin Alaghi (University of Washington), Luis Ceze (University of Washington), Mark Oskin (University of Washington)

 

SESSION 22: Linear Solvers

Session Chair: Aparna Chandramowlishwaran

 

A Communication-Avoiding 3D LU Factorization Algorithm for Sparse Matrices
Piyush Sao (Georgia Institute of Technology), Sherry Li (Lawrence Berkeley National Laboratory), Richard Vuduc (Georgia Institute of Technology)

 

A new GPU algorithm to compute a level set-based analysis for the parallel solution of sparse triangular systems
Ernesto Dufrechou (Facultad de Ingeniería), Pablo Ezzatti (Udelar)

 

Performance of Hierarchical-matrix BiCGStab Solver on GPU clusters
Ichitaro Yamazaki (University of Tennessee), Ahmad Abdelfattah (University of Tennessee), Akihiro Ida (The University of Tokyo), Satoshi Ohshima (Kyushu University), Stanimire Tomov (University of Tennessee), Rio Yokota (Tokyo Institute of Technology), Jack Dongarra (University of Tennessee)

 

Convergence Models and Surprising Results for the Asynchronous Jacobi Method
Jordi Wolfson-Pou (Georgia Institute of Technology), Edmond Chow (Georgia Institute of Technology)

 

SESSION 23: Runtime Systems and Libraries

Session Chair: Stefan Lankes

 

Overhead-Conscious Format Selection for SpMV-Based Applications
Yue Zhao (North Carolina State University), Weijie Zhou (North Carolina State University), Xipeng Shen (North Carolina State University), Graham Yiu (IBM Toronto Software Lab)

 

Cudele: An API and Framework for Programmable Consistency and Durability in a Global Namespace
Michael Sevilla (University of California, Santa Cruz), Ivo Jimenez (University of California, Santa Cruz), Noah Watkins (University of California, Santa Cruz), Jeff Lefevre (University of California, Santa Cruz), Shel Finkelstein (University of California, Santa Cruz), Peter Alvaro (University of California, Santa Cruz), Patrick Donnelly (Red Hat, Inc.), Carlos Maltzahn (University of California, Santa Cruz)

 

SELECT: A Distributed Publish/Subscribe Notification System for Online Social Networks
Nuno Apolónia (Universitat Politècnica de Catalunya), Stefanos Antaris (University of Cyprus, Cyprus), Sarunas Girdzijauskas (KTH Royal Institute of Technology), George Pallis (University of Cyprus, Cyprus), Mario Dikaiakos (University of Cyprus, Cyprus)

 

A Lightweight Communication Runtime for Distributed Graph Analytics
Hoang-Vu Dang (University of Illinois at Urbana-Champaign), Roshan Dathathri (The University of Texas at Austin), Gurbinder Gill (The University of Texas at Austin), Alex Brooks (University of Illinois at Urbana-Champaign), Nikoli Dryden (University of Illinois at Urbana-Champaign), Andrew Lenharth (Microsoft), Loc Hoang (The University of Texas at Austin), Keshav Pingali (The University of Texas at Austin), Marc Snir (University of Illinois at Urbana-Champaign)

 

SESSION 24: Networks and Communication

Session Chair: Christian Schindelhauer

 

Intra-Cluster Coalescing and CTA Scheduling to Reduce GPU NoC Pressure
Lu Wang (Ghent University), Xia Zhao (Ghent University), David Kaeli (Northeastern University), Lieven Eeckhout (Ghent University)

 

HybridPass: Hybrid Scheduling for Mixed Flows in Datacenter Networks
Bo Peng (Shanghai Jiao Tong University), Jianguo Yao (Shanghai Jiao Tong University), Zhengwei Qi (Shanghai Jiao Tong University), Haibing Guan (Shanghai Jiao Tong University)

 

Scalable Power-Efficient Kilo-Core Photonic-Wireless NoC Architectures
Avinash Kodi (Ohio University), Kyle Shiflett (Ohio University), Savas Kaya (Ohio University), Ahmed Louri (George Washington University), Soumyasanta Laha (Ohio University)

 

Designing Efficient Shared Address Space Reduction Collectives for Multi-/Many-cores
Jahanzeb Maqbool Hashmi (The Ohio State University), Sourav Chakraborty (The Ohio State University), Mohammadreza Bayatpour (The Ohio State University), Hari Subramoni (The Ohio State University), Dhabaleswar Panda (The Ohio State University)


Afternoon Break 3:30 PM - 4:00 PM

Parallel Technical Sessions 25, 26, 27 & 28
4:00 PM - 6:00 PM

SESSION 25: Distributed Computing

Session Chair: Gokarna Sharma

 

Tiny Groups Tackle Byzantine Adversaries
Mercy Jaiyeola (Mississippi State University), Kyle Patron (Palantir Technologies), Jared Saia (University of New Mexico), Qian Zhou (Mississippi State University), Maxwell Young (Mississippi State University)

 

Skueue: A Scalable and Sequentially Consistent Distributed Queue
Michael Feldmann (Paderborn University), Christian Scheideler (Paderborn University), Alexander Setzer (Paderborn University)

 

Self-Stabilizing Supervised Publish-Subscribe Systems
Michael Feldmann (Paderborn University), Christina Kolb (Paderborn University), Christian Scheideler (Paderborn University), Thim Strothmann (Paderborn University)

 

Spartan: A Framework For Sparse Robust Addressable Networks
John Augustine (Indian Institute of Technology Madras), Sumathi Sivasubramanian (Indian Institute of Technology Madras)

 

SESSION 26: Graph Algorithms 4

Session Chair: Ananth Kalyanaraman

 

Beyond binary search: parallel in-place construction of implicit search tree layouts
Kyle Berney (University of Hawaii at Manoa), Henri Casanova (University of Hawaii at Manoa), Alyssa Higuchi (University of Hawaii at Manoa), Ben Karsin (University of Hawaii at Manoa), Nodari Sitchinava (University of Hawaii at Manoa)

 

An Energy-Efficient Single-Source Shortest Path Algorithm
Sara Karamati (Georgia Institute of Technology), Jeffrey Young (Georgia Institute of Technology), Richard Vuduc (Georgia Institute of Technology)

 

Scalable Breadth-First Search on a GPU Cluster
Yuechao Pan (University of California, Davis), Roger Pearce (Lawrence Livermore National Laboratory), John Owens (University of California, Davis)

 

SESSION 27: Communication Performance

Session Chair: DK Panda

 

Chameleon: Online Clustering of MPI Program Traces
Amir Bahmani (North Carolina State University), Frank Mueller (North Carolina State University)

 

Trade-off Study of Localizing Communication and Balancing Network Traffic on Dragonfly System
Xin Wang (Illinois Institute of Technology), Misbah Mubarak (Argonne National Laboratory), Robert Ross (Argonne National Laboratory), Zhiling Lan (Illinois Institute of Technology)

 

Level-Spread: A New Job Allocation Policy for Dragonfly Networks
Yijia Zhang (Boston University), Ozan Tuncer (Boston University), Fulya Kaplan (Boston University), Katzalin Olcoz (Universidad Complutense de Madrid), Vitus J. Leung (Sandia National Laboratories), Ayse K. Coskun (Boston University)

 

SESSION 28: Storage & FileSystem

Session Chair: Xubin He

 

A Migratory Heterogeneity-Aware Data Layout Scheme for Parallel File Systems
Shuibing He (Wuhan University), Xian-He Sun (Illinois Institute of Technology), Yang Wang (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences), Chenzhong Xu (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)

 

LALCA: Locality-Aware Lock Contention Avoidance for NVMe-based Scale-out Storage System
Myoungwon Oh (SK Telecom), Sejin Park (SK Telecom), Jugwan Eom (SK Telecom), Seungmin Kim (SK Telecom), Sangjae Kim (SK Telecom), Kang-Won Lee (SK Telecom), Heon Y Yeom (Seoul National University)

 

Mitigating Traffic-based Side Channel Attacks in Bandwidth-efficient Cloud Storage
Pengfei Zuo (Huazhong University of Science and Technology), Yu Hua (Huazhong University of Science and Technology), Cong Wang (City University of Hong Kong), Wen Xia (Huazhong University of Science and Technology), Shunde Cao (Huazhong University of Science and Technology), Yukun Zhou (Huazhong University of Science and Technology), Yuanyuan Sun (Huazhong University of Science and Technology)

 

Chameleon: An Adaptive Wear Balancer for Flash Clusters
Nannan Zhao (Virginia Tech), Ali Anwar (Virginia Tech), Yue Cheng (George Mason University), Mohammed Salman (Virginia Tech), Daping Li (Huazhong University of Science and Technology), Jiguang Wan (Huazhong University of Science and Technology), Changsheng Xie (Huazhong University of Science and Technology), Xubin He (Temple University), Feiyi Wang (Oak Ridge National Laboratory), Ali R. Butt (Virginia Tech)


FRIDAY - 25 May 2018

FRIDAY WORKSHOPS
ALL DAY*
* See each individual
workshop program
for schedule details

 

IPDPS 2018 WORKSHOPS – FRIDAY 25 MAY
13 CHIUW - Chapel Implementers and Users Workshop
14 PDSEC - Parallel and Distributed Scientific and Engineering Computing
15 JSSPP - Job Scheduling Strategies for Parallel Processing
16 iWAPT - International Workshop on Automatic Performance Tuning
17 ParSocial - Parallel and Distributed Processing for Computational Social Systems
18 GraML - Graph Algorithms and Machine Learning
19 CEBDA - Convergence of Extreme Scale Computing and Big Data Analysis
20 MPP - Parallel Programming Model: Special Edition on Edge/Fog/In-Situ Computing
PASCO - Parallel Symbolic Computation (CANCELLED)
21 PMAW - Programming Models and Algorithms Workshop
22 ROME - Runtime and Operating Systems for the Many-core Era

 

2018 Keynote Speakers


Michael A. Bender
Stony Brook University

Tuesday, May 22nd

Title: The Algorithmics of Write Optimization

Abstract: Write-optimized dictionaries (WODs), such as LSM trees and B^epsilon trees, are increasingly used in databases and file systems. Such data structures support very fast insertions without sacrificing lookup performance.

This talk explains how WODs can substantially reduce the I/O cost of many workloads, enabling some applications to scale by orders of magnitude. In contrast, traditional data structures, such as B-trees, are often I/O bound on these workloads. The talk explores write-optimization from the perspective of foundational theory, parallelization, and applications.

Bio: Michael A. Bender is a professor of computer science at Stony Brook University. His research interests span the areas of data structures and algorithms, I/O-efficient computing, parallel computing, and scheduling. He has coauthored over 130 articles on these and other topics. He has won several awards, including an R&D 100 Award, a Test-of-Time award, two Best Paper Awards, and five awards for graduate and undergraduate teaching.

Bender was Founder and Chief Scientist at Tokutek, Inc., an enterprise database company, which was acquired by Percona in 2015. He has held Visiting Scientist positions at both MIT and King's College London.

Bender received his B.A. in Applied Mathematics from Harvard University in 1992 and obtained a D.E.A. in Computer Science from the Ecole Normale Superieure de Lyon, France in 1993. He completed a Ph.D. on Scheduling Algorithms from Harvard University in 1998.



Keren Bergman
Columbia University

Wednesday, May 23rd

Title: Empowering flexible and scalable high performance architectures with embedded photonics

Abstract: The recent explosive growth in data analytics applications that rely on machine and deep learning techniques is seismically changing the landscape of high performance architectures. These techniques rely on graphics processing units (GPU) and manycore (CPU) technologies whose need for intense performance is pushing current interconnect networks to their limits. Driven by these applications, the execution performance along with the energy consumption of massive parallel systems is increasingly determined by how data is moved among the numerous compute and memory resources. Embedded photonic interconnect technologies can address critical data-movement challenges by delivering higher communication bandwidth densities at significantly improved energy efficiencies. New disaggregated architectures enabled by embedded photonics and optical bandwidth steering can reduce the system-wide energy consumption.

Bio: Keren Bergman is the Charles Batchelor Professor of Electrical Engineering at Columbia University, where she also serves as the Scientific Director of the Columbia Nano Initiative. Prof. Bergman received the B.S. from Bucknell University in 1988, and the M.S. in 1991 and Ph.D. in 1994 from M.I.T., all in Electrical Engineering. At Columbia, Bergman leads the Lightwave Research Laboratory encompassing multiple cross-disciplinary programs at the intersection of computing and photonics. She is a Fellow of the Optical Society of America (OSA) and IEEE.



Bruce Hendrickson
Lawrence Livermore National Lab

Thursday, May 24th

Title: The Day After Tomorrow: The Looming Post-Exascale Crisis

Abstract: This is the best of times for high-performance computing. Our simulations continue to grow in complexity and sophistication, and they drive progress in the physical sciences and engineering, improving the world in myriad ways. Around the globe multiple plans are underway to achieve exascale performance by the early 2020s. Although considerable challenges remain, it seems that we can get to the exascale with familiar technologies and programming models.

But while the immediate future looks bright, significant problems lie ahead. With Dennard scaling long dead, device technologies won't save us from crippling power requirements for a 10 exaflop machine. And with Moore's Law sputtering towards end-of-life, performance is not improving much with each new generation of chips. There has been considerable excitement about the promise of quantum computing, reversible computing and other exotic concepts. But it seems likely that these will not become general-purpose approaches to computing for many years, if ever.

Simultaneously, the demands on our simulations are increasing. As HPC gets used for critical decisions, we need to enhance our capabilities to understand uncertainties and to optimize designs. Meanwhile, new applications in biomedicine, social sciences and other data-centric fields are adding complexity to our ecosystem. These are welcome and exciting opportunities, but they are appearing at a time when our near future is highly uncertain.

In this talk I will provide a perspective on these challenges and possible paths forward. I will argue that dramatic changes in computer architecture are unavoidable, and that these changes will be highly disruptive for our approach to algorithms and software. I will describe some of the research challenges that will need to be overcome to enable continued progress.

As with the change from serial to parallel computing in the early 90s, we are approaching an era of great uncertainty. We will need to be prepared to revisit many of our existing assumptions about High Performance Computing.

Bio: Bruce Hendrickson is Associate Director for Computation at Lawrence Livermore National Lab in Livermore, California. In this role, he leads an organization of more than 1,000 computing professionals with responsibility for the full breadth of the Laboratory's computational needs including research, platforms, and services. He came to Livermore in 2017 after a long career at Sandia National Labs where he led the Center for Computational Research and managed Sandia's Advanced Simulation and Computing program. Hendrickson has degrees in Mathematics and Physics from Brown University and a Ph.D. in Computer Science from Cornell University. His research interests include computational science, parallel algorithms, combinatorial scientific computing, linear algebra, data mining, graph algorithms and computer architecture. He is a highly published and cited scientist and his research has garnered a number of international awards. Hendrickson is a former Hertz Fellow and is a Fellow of the Society for Industrial and Applied Mathematics and of the American Association for the Advancement of Science.



2018 Tutorial Presenter

Janice McMahon
Emu Technology

Tuesday, May 22nd

Bio: Janice McMahon has an extensive background in massively parallel computation, advanced computing architectures, algorithm and application mapping, and signal and image processing embedded computing. After attaining B.S. and M.S. degrees in Computer Science and Engineering from M.I.T., Ms. McMahon has worked in research, industry, and government environments within the high-performance computing industry. Within the research community, she has worked at MIT Lincoln Laboratory, Information Sciences Institute, and Reservoir Labs on advanced algorithms and architectures for a variety of high performance applications, as both researcher and project manager. Within the computing industry, she has worked on state-of-the-art software and hardware architectures at MasPar Computer Corporation, Scientific Computing and Analysis, HPC Project (now Sylvan), and currently, Emu Technology. Throughout her 30-year career, she has been exposed to a large variety of advanced research and commercial computer architectures as well as a broad range of high performance applications and algorithms. Her technical specialties include parallel algorithm mapping and performance analysis.

2018 Registration

March 20th Deadline for Advance Registration
Extended to March 27th

 

Registration Details

IPDPS 2017 Report



31st IEEE International Parallel &
Distributed Processing Symposium 
May 29 – June 2, 2017
Buena Vista Palace Hotel
Orlando, Florida USA

REPORT ON IPDPS 2017