IPDPS 2016 Conference

IN COOPERATION WITH

SIGARCH & SIGHPC

IEEE Computer Society Technical Committee on Computer Architecture

IEEE Computer Society Technical Committee on Distributed Processing

HOST

IPDPS 2016 Advance Program

Please visit the IPDPS website regularly for updates, since there may be schedule revisions. Authors who have corrections should send email to contact@ipdps.org giving full details. Note that paper numbers are listed for easy reference.

MONDAY - 23 May 2016

DAYS • Monday • Tuesday • Wednesday • Thursday • Friday

MONDAY WORKSHOPS
ALL DAY*
* See each individual workshop program for schedule details

IPDPS 2016 WORKSHOPS – MONDAY 23 MAY
1	HCW	Heterogeneity in Computing Workshop
2	RAW	Reconfigurable Architectures Workshop
3	HIPS	Workshop on High-Level Parallel Programming Models & Supportive Environments
4	HiCOMB	Workshop on High Performance Computational Biology
5	APDCM	Advances in Parallel and Distributed Computational Models
6	ASHES (+ PLC)	Accelerators and Hybrid Exascale Systems
7	PCO	Parallel Computing and Optimization
8	GABB	Graph Algorithms Building Blocks
9	EduPar	NSF/TCPP Workshop on Parallel and Distributed Computing Education
10	HPDAV	High Performance Data Analysis and Visualization
11	VarSys	Variability in Parallel and Distributed Systems

Roundtable
5:00 PM

Round-table Workshop II: Heterogeneous Tasking

See Workshops page for details

Light Reception
6:00 PM – 7:30 PM

IPDPS 2016 Welcome Reception & TCPP Annual Meeting

TUESDAY - 24 May 2016

Opening Session
8:00 AM - 8:30 AM

Opening Session: TBA

Keynote Session
8:30 AM - 9:30 AM

Keynote Speech

Session Chair: Xian-He Sun

Kai Li
Princeton University

Disruptive Research and Innovation

Abstract: Ever since Clayton Christensen coined the terms “disruptive technologies” and “disruptive innovations” in the 1990s, researchers and entrepreneurs have loved the word “disruptive” because disrupting current knowledge or products helps us accelerate knowledge discovery and move society into a new era. What is disruptive research? What is disruptive innovation? How do they happen? ... Read more

Morning Break 9:30 AM - 10:00 AM

PhD Forum
Starts on Tuesday

PhD Forum Posters
On Display All Day Tuesday and Wednesday

More details to be announced

Parallel Technical Sessions 1, 2, 3, & 4
10:00 AM - 12:00 PM

Session 1
Graph Algorithms

Session Chair: Umit V. Catalyurek

1570222790
Subgraph Counting: Color Coding Beyond Trees
Venkatesan T Chakaravarthy (IBM Research, India); Mikhail Kapralov (IBM Research, USA); Prakash Murali (IBM Research, India); Fabrizio Petrini and Xinyu Que (IBM T.J. Watson Research Center, USA); Yogish Sabharwal (IBM Research, India); Baruch Schieber (IBM T.J. Watson Research Center, USA)

1570222987
A Practical Parallel Algorithm for Diameter Approximation of Massive Weighted Graphs
Matteo Ceccarello, Andrea Pietracaprina and Geppino Pucci (University of Padova, Italy); Eli Upfal (Brown University, USA)

1570223109
Rabbit Order: Just-in-time Parallel Reordering for Fast Graph Analysis
Junya Arai (Nippon Telegraph and Telephone Corporation, Japan); Hiroaki Shiokawa (University of Tsukuba, Japan); Takeshi Yamamuro (Nippon Telegraph and Telephone Corporation, Japan); Makoto Onizuka (Osaka University, Japan); Sotetsu Iwamura (Nippon Telegraph and Telephone Corporation, Japan)

1570223232
Distributed-Memory Algorithms for Maximum Cardinality Matching in Bipartite Graphs
Ariful Azad and Aydin Buluc (Lawrence Berkeley National Laboratory, USA)

Session 2
Software Environments and Tools

Session Chair: Karen L Karavanic

1570223069
Automatic Parallel Pattern Detection in the Algorithm Structure Design Space
Zia Ul Huda (TU Darmstadt and Laboratory for Parallel Programming, Germany); Ali Jannesari (German Research School for Simulation Sciences and RWTH Aachen University, Germany); Felix Wolf (TU Darmstadt, Germany)

1570223071
ARCHER: Effectively Spotting Data Races in Large OpenMP Applications
Simone Atzeni, Ganesh Gopalakrishnan and Zvonimir Rakamaric (University of Utah, USA); Dong Ahn, Gregory L Lee, Ignacio Laguna and Martin Schulz (Lawrence Livermore National Laboratory, USA); Joachim Protze and Matthias Mueller (RWTH Aachen University, Germany)

1570223206
SEAK: Future-Proof Mission-Centric Benchmarking
Nathan Tallent, Joseph B Manzano, Nitin A. Gawande, Seunghwa Kang, Darren Kerbyson and Adolfy Hoisie (Pacific Northwest National Laboratory, USA); Joseph Cross (DARPA, USA)

1570223307
Design and Implementation of a Parallel Research Kernel for Assessing Dynamic Load-Balancing Capabilities
Evangelos Georganas (University of California, Berkeley, USA); Rob F Van der Wijngaart and Tim Mattson (Intel Corporation, USA)

Session 3
Network Architecture

Session Chair: Ron Brightwell

1570222351
VNRE: Flexible and Efficient Acceleration for Network Redundancy Elimination
Xiongzi Ge (University of Minnesota, Twin Cities, USA); Yi Liu (Huawei Corporation, P.R. China); Chengtao Lu (Xi'an Technological University, P.R. China); Jim Diehl (University of Minnesota, Twin Cities, USA); David Du (University of Minnesota, USA); Liang Zhang and Jian Chen (Huawei Corporation, P.R. China)

1570222825
Analyzing Network Health and Congestion in Dragonfly-based Systems
Abhinav Bhatele (Lawrence Livermore National Laboratory, USA); Nikhil Jain (University of Illinois at Urbana-Champaign, USA); Yarden Livnat and Valerio Pascucci (University of Utah, USA); Peer-Timo Bremer (Lawrence Livermore National Laboratory, USA)

1570223101
Random Regular Graph and Generalized De Bruijn Graph with K-Shortest Path Routing
Peyman Faizian, Md Atiqul Mollah and Xin Yuan (Florida State University, USA); Scott Pakin and Michael Lang (Los Alamos National Laboratory, USA)

1570223261
Deflection Containment for Bufferless Network-on-Chips
Xiyue Xiang and Nian-Feng Tzeng (University of Louisiana at Lafayette, USA)

Session 4
Application Optimization

Session Chair: Shirley V Moore

1570222164
RUPS: Fixing Relative Distances Among Urban Vehicles with Context-Aware Trajectories
Hongzi Zhu (Shanghai Jiao Tong University, P.R. China); Shan Chang (Donghua University, P.R. China); Li Lu (University of Electronic Science and Technology of China, P.R. China); Wei Zhang (Shanghai Jiao Tong University, P.R. China)

1570222486
HDT: A Hybrid Structure for Extreme-Resolution 3D Sparse Data Modeling
Mohammad M Hossain (Georgia Institute of Technology, USA); Thomas Tucker (Tucker Innovations, USA); Thomas Kurfess and Richard W Vuduc (Georgia Institute of Technology, USA)

1570222676
Optimization of an Electromagnetics Code with Multicore Wavefront Diamond Blocking and Multi-Dimensional Intra-Tile Parallelization
Tareq Malas (KAUST, Saudi Arabia); Julian Hornich and Georg Hager (Friedrich-Alexander University of Erlangen-Nuremberg, Germany); Hatem Ltaief (KAUST and Extreme Computing Research Center, Saudi Arabia); Christoph Pflaum (Friedrich-Alexander University of Erlangen-Nuremberg, Germany); David Keyes (KAUST, Saudi Arabia)

1570223280
Order-Invariant Real Number Summation: Circumventing Accuracy Loss for Multimillion Summands on Multiple Parallel Architectures
Patrick E Small, Rajiv Kalia, Aiichiro Nakano and Priya Vashishta (University of Southern California, USA)

Roundtable
Workshop

12:00 PM – 1:30 PM

IPDPS 2016 Round-Table Workshop I:
PDC in Core Undergraduate Education

Dick Brown of St. Olaf College and Suzanne Matthews of West Point will lead a discussion on this topic of interest to the IPDPS community. For details...

Parallel Technical Sessions 5, 6, 7, & 8
1:30 PM - 3:30 PM

Session 5
Linear Algebra & Solvers

Session Chair: Cevdet Aykanat

1570221998
INV-ASKIT: A Parallel Fast Direct Solver for Kernel Matrices
Chenhan Yu and William March (The University of Texas at Austin, USA); Bo Xiao (Georgia Institute of Technology, USA); George Biros (The University of Texas at Austin, USA)

1570222429
A Fast Tridiagonal Solver for Intel MIC Architecture
Xinliang Wang, Wei Xue, Yangtong Xu and Weimin Zheng (Tsinghua University, P.R. China)

1570222694
A Relaxed Synchronization Approach for Solving Parallel Quadratic Programming Problems with Guaranteed Convergence
Kooktae Lee, Raktim Bhattacharya, Jyotikrishna Dass, V N S Prithvi Sakuru and Rabi Mahapatra (Texas A&M University, USA)

1570222703
Enhancing Scalability and Load Balancing of Parallel Selected Inversion Via Tree-Based Asynchronous Communication
Mathias Jacquelin (Lawrence Berkeley National Lab, USA); Lin Lin (University of California Berkeley, USA); Nathan Wichmann (Cray Inc., USA); Chao Yang (Lawrence Berkeley National Lab, USA)

Session 6
Fault Tolerance & Resilience

Session Chair: Kathryn Mohror

1570221913
Optimal Resilience Patterns to Cope with Fail-Stop and Silent Errors
Anne Benoit, Aurelien Cavelan and Yves Robert (ENS Lyon, France); Hongyang Sun (ENS Lyon and INRIA, France)

1570223102
Reducing Waste in Large Scale Systems Through Introspective Analysis
Leonardo Bautista-Gomez (Argonne National Laboratory, USA); Ana Gainaru (University of Illinois at Urbana-Champaign and National Center for Supercomputing Applications, USA); Swann Perarnau (Argonne National Laboratory, USA); Devesh Tiwari and Saurabh Gupta (Oak Ridge National Laboratory, USA); Franck Cappello (Argonne National Laboratory, University of Illinois at Urbana Champaign and Inria, France); Christian Engelmann (Oak Ridge National Laboratory, USA); Marc Snir (Argonne National Laboratory, USA)

1570223110
Fault Modeling of Extreme Scale Applications Using Machine Learning
Abhinav Vishnu (Pacific Northwest National Laboratory, USA); Hubertus J. J. Van Dam (Brookhaven National Laboratory, USA); Nathan Tallent, Darren Kerbyson and Adolfy Hoisie (Pacific Northwest National Laboratory, USA)

1570223151
Efficient Checkpointing of Multi-Threaded Applications as a Tool for Debugging, Performance Tuning, and Resiliency
Max Grossman and Vivek Sarkar (Rice University, USA)

Session 7
Modeling and Evaluation

Session Chair: David Lowenthal

1570222116
X: A Comprehensive Analytic Model for Parallel Machines
Ang Li (Eindhoven University of Technology, The Netherlands); Shuaiwen Song (Pacific Northwest National Laboratory, USA); Eric Brugel (The State University of New Jersey, USA); Akash Kumar (Technische Universität Dresden, Germany); Daniel Gerardo Chavarria (Pacific Northwest National Laboratory, USA); Henk Corporaal (Technical University Eindhoven, The Netherlands)

1570222410
NiMC: Characterizing and Eliminating Network-Induced Memory Contention
Taylor L Groves (Sandia National Laboratories and University of New Mexico, USA); Ryan E Grant (Sandia National Laboratories and Center for Computing Research, USA); Dorian C Arnold (University of New Mexico, USA)

1570222656
An Early Performance Study of Large-scale POWER8 SMP Systems
Xing Liu, Daniele Buono, Fabio Checconi, Jee W Choi, Xinyu Que, Fabrizio Petrini, John Gunnels and Jeff Stuecheli (IBM T. J. Watson Research Center, USA)

1570223244
A Methodology for Modeling Dynamic and Static Power Consumption
Bhavishya Goel and Sally A. McKee (Chalmers University of Technology, Sweden)

Session 8
Graph Applications

Session Chair: Aydin Buluc

1570223087
Algorithmic Techniques for Solving Graph Problems on the Automata Processor
Indranil Roy (Micron Technology, Inc., USA); Nagakishore Jammula (Georgia Institute of Technology, USA); Srinivas Aluru (Georgia Institute of Technology and Indian Institute of Technology Bombay, USA)

1570222224
A Case Study of Complex Graph Analysis in Distributed Memory: Implementation and Optimization
George M Slota (The Pennsylvania State University, USA); Sivasankaran Rajamanickam (Sandia National Laboratories, USA); Kamesh Madduri (The Pennsylvania State University, USA)

1570222730
FastBFS: Fast Breadth-First Graph Search on a Single Server
Shuhan Cheng, Guangyan Zhang, Jiwu Shu and Qingda Hu (Tsinghua University, P.R. China)

1570223136
GraphPad: Optimized Graph Primitives for Parallel and Distributed Platforms
Michael Anderson (Intel Corporation, USA); Narayanan Sundaram (Intel Labs, USA); Nadathur Satish (Intel Corporation, USA); Md. Mostofa Ali Patwary (Intel Labs, USA); Theodore L. Willke, II and Pradeep Dubey (Intel Corporation, USA)

Afternoon Break 3:30 PM - 4:00 PM

Parallel Technical Sessions 9, 10, 11, & 12
4:00 PM - 6:00 PM

Session 9
Cloud Resource Allocation

Session Chair: Xinghui Zhao

1570222099
On First Fit Bin Packing for Online Cloud Server Allocation
Xueyan Tang, Yusen Li, Runtian Ren and Wentong Cai (Nanyang Technological University, Singapore)

1570222556
Smoothed Online Resource Allocation in Multi-Tier Distributed Cloud Networks
Lei Jiao (Bell Labs, Ireland); Antonia Tulino (Bell Labs and Università Federico II, Napoli, USA); Jaime Llorca (Bell Labs, Alcatel-Lucent, USA); Yue Jin (Alcatel-Lucent, Ireland); Alessandra Sala (Bell Labs, Alcatel-Lucent, Ireland)

1570222638
Dynamic Acceleration of Parallel Applications in Cloud Platforms by Adaptive Time-Slice Control
Song Wu, Zhenjiang Xie and Haibao Chen (Huazhong University of Science and Technology, P.R. China); Sheng Di (Argonne National Laboratory, USA); Xinyu Zhao and Hai Jin (Huazhong University of Science and Technology, P.R. China)

1570223282
Mystic: Predictive Scheduling for GPU Based Cloud Servers Using Machine Learning
Yash Ukidave, Xiangyu Li and David Kaeli (Northeastern University, USA)

Session 10
Memory Management

Session Chair: Nikos Hardavellas

1570222349
TintMalloc: Reducing Memory Access Divergence Via Controller-Aware Coloring
Xing Pan, Yasaswini Jyothi Gownivaripalli and Frank Mueller (NCSU, USA)

1570223050
Markov Chain-based Adaptive Scheduling in Software Transactional Memory
Pierangelo Di Sanzo, Marco Sannicandro, Bruno Ciciani and Francesco Quaglia (Sapienza – Università di Roma, Italy)

1570223303
MEMTUNE: Dynamic Memory Management for In-memory Data Analytic Platforms
Luna Xu (Virginia Tech, USA); Min Li and Li Zhang (IBM T. J. Watson Research Center, USA); Ali R. Butt (Virginia Tech, USA); Yandong Wang (IBM T.J. Watson Research Center, USA); Zane Zhenhua Hu (IBM Platform Computing, Canada)

1570223371
High-Performance Hybrid Key-Value Store on Modern Clusters with RDMA Interconnects and SSDs: Non-blocking Extensions, Designs, and Benefits
Dipti Shankar, Xiaoyi Lu, Nusrat Islam, Md. Wasi-ur-Rahman and Dhabaleswar Panda (The Ohio State University, USA)

Session 11
Scheduling and Resource Management

Session Chair: Hank Hoffmann

1570222813
GreenMatch: Renewable-Aware Workload Scheduling for Massive Storage Systems
Xiaoyang Qu and Jiguang Wan (Huazhong University of Science and Technology, P.R. China); Jun Wang (University of Central Florida, USA); Liqiong Liu, Dan Luo and Changsheng Xie (Huazhong University of Science and Technology, P.R. China)

1570222947
CATA: Criticality Aware Task Acceleration for Multicore Processors
Emilio Castillo (Barcelona Supercomputing Center, Spain); Miquel Moreto (Barcelona Supercomputing Center and Universitat Politècnica de Catalunya, Spain); Marc Casas and Lluc Alvarez (Barcelona Supercomputing Center, Spain); Enrique Vallejo (University of Cantabria, Spain); Kallia Chronaki and Rosa M. Badia (Barcelona Supercomputing Center, Spain); Jose L Bosque and Ramon Beivide (University of Cantabria, Spain); Eduard Ayguade (Universitat Politècnica de Catalunya and Barcelona Supercomputing Center, Spain); Jesús Labarta (Barcelona Supercomputing Center, Spain); Mateo Valero (Universidad Politécnica de Cataluña, Spain)

1570222965
TECfan: Coordinating Thermoelectric Cooler, Fan, and DVFS for CMP Energy Optimization
Wenli Zheng, Kai Ma and Xiaorui Wang (The Ohio State University, USA)

1570222631
Utility Maximizing Thread Assignment and Resource Allocation
Pan Lai, Rui Fan, Wei Zhang and Fang Liu (Nanyang Technological University, Singapore)

Session 12
Scientific Applications (1)

Session Chair: Kamesh Madduri

1570222419
A Hybrid Decomposition Parallel Algorithm for Multi-Scale Simulation of Viscoelastic Fluids
Xiao-Wei Guo, Xin-hai Xu, Qian Wang, Hao Li, Xiao-Guang Ren, Liyang Xu and Xuejun Yang (National University of Defense Technology, P.R. China)

1570223174
A Hartree-Fock Application Using UPC++ and the New DArray Library
David Ozog (University of Oregon, USA); Amir Kamil, Yili Zheng and Paul H. Hargrove (Lawrence Berkeley National Laboratory, USA); Jeff Hammond (Intel Labs, USA); Allen D. Malony (University of Oregon, USA); Wibe De Jong and Katherine Yelick (Lawrence Berkeley National Laboratories, USA)

1570223276
A Fast Selected Inversion Algorithm for Green's Function Calculation in Many-body Quantum Monte Carlo Simulations
Chengming Jiang, Zhaojun Bai and Richard Scalettar (University of California, Davis, USA)

Industry
Tutorial

6:00 PM – 8:00 PM

NVIDIA Tutorial for University Educators: Teach GPU-Accelerated Computing with the New NVIDIA Teaching Kit

Dr. Wen-Mei Hwu from University of Illinois (UIUC) will lead a hands-on tutorial that introduces the GPU Teaching Kit for Accelerated Computing for use in university courses… Read more

WEDNESDAY - 25 May 2016

Keynote Session
8:30 AM – 9:30 AM

Keynote Speech

Session Chair: Jeffrey K Hollingsworth

Thomas Pawlowski
Micron

Memory, Storage and Processing in Future Parallel and Distributed Processing Systems

Morning Break 9:30 AM - 10:00 AM

Parallel Technical Sessions 13, 14, 15, & 16
10:00 AM - 12:00 PM

Session 13
Clustering & Partitioning

Session Chair: Ananth Kalyanaraman

1570222892
A New Approximation Algorithm for Matrix Partitioning in Presence of Strongly Heterogeneous Processors
Olivier Beaumont (Inria, France); Lionel Eyraud-Dubois (INRIA Bordeaux Sud-Ouest and University of Bordeaux, France); Thomas Lambert (Inria, France)

1570222968
Structural Clustering: A New Approach to Support Performance Analysis At Scale
Matthias Weber, Ronny Brendel and Tobias Hilbrich (Technische Universität Dresden, Germany); Kathryn Mohror and Martin Schulz (Lawrence Livermore National Laboratory, USA); Holger Brunst (Technische Universitaet Dresden, Germany)

1570223138
PANDA: Extreme Scale Parallel K-Nearest Neighbor on Distributed Architectures
Md. Mostofa Ali Patwary (Intel Labs, USA); Nadathur Satish (Intel Corporation, USA); Narayanan Sundaram (Intel Labs, USA); Jialin Liu (Lawrence Berkeley National Laboratory, USA); Peter Sadowski (UC Irvine, USA); Evan Racah, Surendra Byna, Wahid Bhimji, Craig Tull and Mr Prabhat (Lawrence Berkeley National Laboratory, USA); Pradeep Dubey (Intel Corporation, USA)

1570222754
DataNet: A Data Distribution-aware Method for Sub-dataset Analysis on Distributed File Systems
Jun Wang, Jiangling Yin, Jian Zhou and Xuhong Zhang (University of Central Florida, USA)

Session 14
Accelerated Computing

Session Chair: Erik Saule

1570219787
Synchronization Trade-offs in GPU Implementations of Graph Algorithms
Rashid Kaleem (University of Texas at Austin, USA); Anand Venkat (University of Utah, USA); Sreepathi Pai (ICES, UT Austin, USA); Mary Hall (University of Utah, USA); Keshav Pingali (University of Texas at Austin, USA)

1570221276
Eliminating Intra-warp Load Imbalance in Irregular Nested Patterns Via Collaborative Task Engagement
Farzad Khorasani, Bryan Rowe, Rajiv Gupta and Laxmi Bhuyan (University of California Riverside, USA)

1570221827
Compiler-Assisted Workload Consolidation for Efficient Dynamic Parallelism on GPU
Hancheng Wu, Da Li and Michela Becchi (University of Missouri - Columbia, USA)

1570223116
OpenACC to FPGA: A Framework for Directive-based High-Performance Reconfigurable Computing
Seyong Lee and Jungwon Kim (Oak Ridge National Laboratory, USA); Jeffrey S Vetter (Oak Ridge National Laboratory and Georgia Institute of Technology, USA)

Session 15
Memory Hierarchy

Session Chair: Nathan Tallent

1570222440
Architecting and Programming a Hardware-Incoherent Multiprocessor Cache Hierarchy
Wooil Kim (University of Illinois, USA); Sanket Tavarageri (The Ohio State University, USA); Ponnuswamy Sadayappan (Ohio State University, USA); Josep Torrellas (University of Illinois at Urbana-Champaign, USA)

1570222639
Refree: A Refresh-Free Hybrid DRAM/PCM Main Memory System
Bahareh Pourshirazi and Zhichun Zhu (University of Illinois at Chicago, USA)

1570223085
Re-NUCA: A Practical NUCA Architecture for ReRAM Based Last-Level Caches
Jagadish Kotra, Mohammad Arjomand, Diana Guttman, Mahmut Taylan Kandemir and Chita R. Das (The Pennsylvania State University, USA)

1570223229
Evaluating and Improving Thread-Level Speculation in Hardware Transactional Memories
Juan Salamanca (University of Campinas, Brazil); J. Nelson Amaral (University of Alberta, Canada); Guido Araujo (University of Campinas, Brazil)

Session 16
Optimization Techniques

Session Chair: Martin Schulz

1570222603
Enabling Application Scalability and Reproducibility by Reducing System Noise with SMT
Edgar A. Leon, Ian Karlin and Adam Moody (Lawrence Livermore National Laboratory, USA)

1570223002
Key/Value-enabled Flash Memory for Complex Scientific Workflows with On-line Analysis and Visualization
Stefan Eilemann, Fabien Delalondre, Jon Bernard, Judit Planas and Felix Schürmann (Ecole Polytechnique Fédérale de Lausanne, Switzerland); John Biddiscombe (CSCS, Swiss National Supercomputing Centre, Switzerland); Costas Bekas and Alessandro Curioni (IBM Zurich Research Laboratory, Switzerland); Bernard Metzler (IBM Research GmbH, Switzerland); Peter Kaltstein, Peter Morjan and Joachim Fenkes (IBM Deutschland Research and Development GmbH, Germany); Ralph Bellofatto and Lars Schneidenbach (IBM T. J. Watson Research Center Yorktown Heights, USA); Chris Ward (IBM, United Kingdom); Blake Fitch (IBM, USA)

1570223182
Fast Classification of MPI Applications Using Lamport's Logical Clocks
Zhou Tong (Florida State University, USA); Scott Pakin and Michael Lang (Los Alamos National Laboratory, USA); Xin Yuan (Florida State University, USA)

1570222183
Online-Autotuning of Parallel SAH kD-Trees
Martin Tillmann, Philip Pfaffe, Christopher Kaag and Walter F. Tichy (Karlsruhe Institute of Technology, Germany)

Parallel Technical Sessions 17, 18, 19, & 20
1:30 PM - 3:30 PM

Session 17
Communication Efficiency & Avoidance Algorithms

Session Chair: Sivasankaran Rajamanickam

1570221490
Polynomial-time Construction of Optimal MPI Derived Datatype Trees
Robert Ganian, Martin Kalany and Stefan Szeider (Vienna University of Technology, Austria); Jesper Larsson Träff (Vienna University of Technology and Faculty of Informatics, Institute of Information Systems, Austria)

1570222842
Write-Avoiding Algorithms
Erin Carson (New York University, USA); James Demmel (University of California at Berkeley, USA); Laura Grigori (INRIA, France); Nicholas Knight and Penporn Koanantakool (University of California at Berkeley, USA); Oded Schwartz (Hebrew University, Israel); Harsha Vardhan Simhadri (Lawrence Berkeley National Lab, USA)

1570222974
Communication Efficient Algorithms for Top-k Selection Problems
Lorenz Hübschle-Schneider and Peter Sanders (Karlsruhe Institute of Technology, Germany)

1570223179
Minimal Aggregated Shared Memory Messaging on Distributed Memory Supercomputers
Benjamin Jamroz and John M Dennis (National Center for Atmospheric Research, USA)

Session 18
Distributed Algorithms

Session Chair: Shuaiwen Song

1570221725
Never Say Never: Probabilistic & Temporal Failure Detectors
Dacfey Dzung (ABB Ltd. Corporate Research, Switzerland); Rachid Guerraoui (Swiss Federal Institute of Technology, Switzerland); David Kozhaya (EPFL, Switzerland); Yvonne-Anne Pignolet (ABB Ltd. Corporate Research, Switzerland)

1570222546
Gathering a Closed Chain of Robots on a Grid
Daniel Jung, Matthias Fischer, Friedhelm Meyer auf der Heide, Sebastian Abshoff and Andreas Cord-Landwehr (University of Paderborn, Germany)

1570223003
On Competitive Algorithms for Approximations of Top-k-Position Monitoring of Distributed Streams
Manuel Malatyali, Alexander Mäcker and Friedhelm Meyer auf der Heide (Heinz Nixdorf Institute, University of Paderborn, Germany)

1570223027
Towards a Restrained Use of Non-equivocation for Achieving Iterative Approximate Byzantine Consensus
Li Chuanyou (Southeast University, P.R. China); Michel Hurfin (INRIA, France); Yun Wang (Southeast University, P.R. China); Lei Yu (Wuhan University, P.R. China)

Session 19
I/O and Storage

Session Chair: Fabrizio Petrini

1570223043
Storage-Optimized Data-Atomic Algorithms for Handling Erasures and Errors in Distributed Storage Systems
Erez Kantor (Northeastern, USA); Kishori Konwar and Nancy Lynch (CSAIL, MIT, USA); Muriel Médard and N. Prakash (MIT, USA); Alexander Shvartsman (University of Connecticut, USA)

1570219258
Fast Error-bounded Lossy HPC Data Compression with SZ
Sheng Di (Argonne National Laboratory, USA); Franck Cappello (Argonne National Laboratory, University of Illinois at Urbana Champaign, USA and Inria, France)

1570221528
I/O Aware Power Shifting
Lee Savoie and David Lowenthal (University of Arizona, USA); Bronis R. de Supinski, Tanzima Islam, Kathryn Mohror, Barry L Rountree and Martin Schulz (Lawrence Livermore National Laboratory, USA)

1570222409
On the Root Causes of Cross-application I/O Interference in HPC Storage Systems
Orcun Yildiz (INRIA Rennes, France); Matthieu Dorier (Argonne National Laboratory, USA); Shadi Ibrahim (INRIA Rennes, France); Robert Ross (Argonne National Laboratory, USA); Gabriel Antoniu (INRIA Rennes - Bretagne Atlantique, France)

Session 20
Scientific Applications (2)

Session Chair: Darren Kerbyson

1570222681
Exploiting Variant-based Parallelism for Data Mining of Space Weather Phenomena
Michael Gowanlock, David Blair and Victor Pankratius (Massachusetts Institute of Technology, USA)

1570222898
Solving Open MIP Instances with ParaSCIP on Supercomputers Using Up to 80,000 Cores
Yuji Shinano (Zuse Institute Berlin, Germany); Tobias Achterberg (Gurobi GmbH, Germany); Timo Berthold and Stefan Heinz (Fair Isaac Germany GmbH, Germany); Thorsten Koch (Zuse Institute Berlin, Germany); Michael Winkler (Gurobi GmbH, Germany)

1570223398
AAlign: A SIMD Framework for Pairwise Sequence Alignment on X86-Based Multi- And Many-core Processors
Kaixi Hou, Hao Wang and Wu-chun Feng (Virginia Tech, USA)

1570223482
Mendel: A Distributed Storage Framework for Similarity Searching Over Sequencing Data
Cameron Tolooee, Sangmi Pallickara and Asa Ben-Hur (Colorado State University, USA)

Afternoon Break 3:30 PM - 4:00 PM

Community Summit
4:00 PM – 5:30 PM

COMMUNITY SUMMIT: The Road Ahead for the IPDPS Community

This Community Summit, hosted by the IPDPS Steering Committee, will be an opportunity to discuss ideas for keeping pace with the times and continuing to build on the strengths of IPDPS. We will launch a program for gathering proposals and comments from the community with the plan to "summit" again in 2017 to see where things stand and what we learned. Return here closer to the conference for details.

PhD Forum Special Session

5:30 PM – 7:00 PM

Posters on Display

IPDPS Attendees Invited to View Posters and Talk with Student Presenters

JPDC Reception

6:00 PM – 7:00 PM

Hosted by Elsevier:

Introducing the new edition of Journal of Parallel and Distributed Computing

Symposium Banquet

After 7:00 PM

Banquet will open with short concert by Chinese String Band

Hosted by IPDPS 2016 General Chair Xian-He Sun

THURSDAY - 26 May 2016

Keynote Session
8:30 AM - 9:30 AM

Keynote Speech

Session Chair: Michela Taufer

Katrin Heitmann
Argonne National Laboratory

Unlocking the Mysteries of the Universe with Supercomputers

Abstract: Cosmology is in a scientifically very exciting phase. Two decades of surveying the sky have culminated in the celebrated "Cosmological Standard Model". Yet, two of its key pillars, dark matter and dark energy -- together accounting for 95% of the mass-energy of the Universe -- remain mysterious. Deep fundamental questions… Read More

Morning Break 9:30 AM - 10:00 AM

PLENARY SESSION:
Best Papers
10:00 AM - 12:00 PM

Session Best Papers

Session Chair: Jeff K Hollingsworth

1570222433
ZNN - A Fast and Scalable Algorithm for Training 3D Convolutional Networks on Multi-Core and Many-Core Shared Memory Machines
Aleksandar Zlateski and Kisuk Lee (Massachusetts Institute of Technology, USA); H. Sebastian Seung (Princeton University, USA)

1570222925
Stochastic Matrix-Function Estimators: Scalable Big-Data Kernels with High Performance
Peter Staar and Panagiotis Barkoutsos (IBM Zurich Research Laboratory, Switzerland); Roxana Istrate (IBM ZRL, Switzerland); A. Cristiano I. Malossi (IBM ZRL, Switzerland); Ivano Tavernelli, Nikolaj Moll and Heiner Giefers (IBM ZRL, Switzerland); Christoph Hagleitner (IBM ZRL, Switzerland); Costas Bekas and Alessandro Curioni (IBM ZRL, Switzerland)

1570223076
Discrete Cache Insertion Policies for Shared Last Level Cache Management on Large Multicores
Aswinkumar Sridharan (INRIA, France); André Seznec (Irisa/Inria, France)

1570223145
Massively Parallel First-Principles Simulation of Electron Dynamics in Materials
Erik Draeger and Xavier Andrade (Lawrence Livermore National Laboratory, USA); John Gunnels (IBM T. J. Watson Research Center, USA); Abhinav Bhatele (Lawrence Livermore National Laboratory, USA); Andre Schleife (University of Illinois, Urbana-Champaign, USA); Alfredo Correa (Lawrence Livermore National Laboratory, USA)

Parallel Technical Sessions 21, 22, 23 & 24
1:30 PM - 3:30 PM

Session 21
Numerical Algorithms

Session Chair: Yves Robert

1570223249
Communication-Avoiding Parallel Sparse-Dense Matrix-Matrix Multiplication
Penporn Koanantakool (University of California at Berkeley, USA); Ariful Azad, Aydin Buluc and Dmitriy Morozov (Lawrence Berkeley National Laboratory, USA); Sang-Yun Oh (University of California, Santa Barbara, USA); Leonid Oliker (Lawrence Berkeley National Laboratory, USA); Katherine Yelick (University of California at Berkeley, USA)

1570221707
Petascale Local Time Stepping for the ADER-DG Finite Element Method
Alexander Breuer (Technische Universität München, Germany); Alexander Heinecke (Intel Corporation, USA); Michael Bader (Technische Universität München, Germany)

1570222744
Asymptotic Optimality of Parallel Short Division
Niall Emmart and Charles Weems (University of Massachusetts, USA)

1570223388
High Performance Parallel Stochastic Gradient Descent in Shared Memory
Scott Sallinen (University of British Columbia, Canada); Nadathur Satish (Intel Corporation, USA); Mikhail Smelyanskiy and Samantika Sury (Intel Corporation, USA); Christopher Ré (Stanford University, USA)

Session 22
Graphs and Tensors

Session Chair: Bora Uçar

1570222751
Optimal Algorithms for Graphs and Images on a Shared Memory Mesh
Yujie An and Quentin Stout (University of Michigan, USA)

1570223119
Parallel Graph Coloring for Manycore Architectures
Mehmet Deveci, Erik G. Boman, Karen D Devine and Sivasankaran Rajamanickam (Sandia National Laboratories, USA)

1570223047
A Medium-Grained Algorithm for Distributed Sparse Tensor Factorization
Shaden Smith and George Karypis (University of Minnesota, USA)

1570223065
Parallel Tensor Compression for Large-Scale Scientific Data
Woody Austin (University of Texas, USA); Grey Ballard and Tamara Kolda (Sandia National Laboratories, USA)

Session 23
Runtime Systems

Session Chair: Karen L Karavanic

1570222834
GinFlow: A Decentralised Adaptive Workflow Execution Manager
Javier Rojas Balderrama (University of Rennes 1 / INSERM, France); Matthieu Simonin (INRIA, France); Cedric Tedeschi (University of Rennes I / INRIA, France)

1570222935
Hierarchical Parallel Dynamic Dependence Analysis for Recursively Task-Parallel Programs
Nikolaos Papakonstantinou (FORTH-ICS, Greece); Foivos S. Zakkak (University of Crete and FORTH-ICS, Greece); Polyvios Pratikakis (FORTH-ICS, Greece)

1570223190
MPMD Framework for Offloading Load Balance Computation
Olga Pearce, Todd Gamblin, Bronis R. de Supinski and Martin Schulz (Lawrence Livermore National Laboratory, USA); Nancy Amato (Texas A&M University, USA)

1570223294
Integrating Abstractions to Enhance the Execution of Distributed Applications
Matteo Turilli (Rutgers University, USA); Feng Liu (University of Minnesota, USA); Zhao Zhang (University of California, Berkeley, USA); Andre Merzky (LSU, USA); Michael Wilde (University of Chicago, Argonne National Laboratory, USA); Jon Weissman (University of Minnesota, Twin Cities, USA); Daniel S. Katz (University of Chicago, USA); Shantenu Jha (Rutgers University, USA)

Session 24
GPUs

Session Chair: Michael Lam

1570222398
cusFFT: A High-Performance Sparse Fast Fourier Transform Algorithm on GPUs
Cheng Wang (University of Houston, USA); Sunita Chandrasekaran (University of Delaware, USA); Barbara Chapman (University of Houston, USA)

1570222616
Balancing Scalar and Vector Execution on GPU Architectures
Zhongliang Chen and David Kaeli (Northeastern University, USA)

1570223332
Exploiting Maximal Overlap for Non-Contiguous Data Movement Processing on Modern GPU-enabled System
Ching-Hsiang Chu, Khaled Hamidouche, Akshay Venkatesh, Dip Sankar Banerjee, Hari Subramoni and Dhabaleswar Panda (The Ohio State University, USA)

1570223177
Online Algorithm-Based Fault Tolerance for Cholesky Decomposition on Heterogeneous Systems with GPUs
Jieyang Chen, Xin Liang and Zizhong Chen (University of California, Riverside, USA)

Afternoon Break 3:30 PM - 4:00 PM

Parallel Technical Sessions 25, 26, 27 & 28
4:00 PM - 6:00 PM

Session 25
Scheduling

Session Chair: Sanjay Chatterjee

1570222582
Reusable Resource Scheduling Via Colored Interval Covering
Venkat Chakravarthy and Sreyash D Kenkre (IBM Research, India); Sakib A. Mondal (Flipkart Internet Pvt Ltd, India); Vinayaka D Pandit and Yogish Sabharwal (IBM Research, India)

1570222696
Partitioned Feasibility Tests for Sporadic Tasks on Heterogeneous Machines
Shaurya Ahuja, Kefu Lu and Benjamin Moseley (Washington University in St. Louis, USA)

1570222844
Are Static Schedules So Bad? A Case Study on Cholesky Factorization
Emmanuel Agullo (INRIA / LaBRI, France); Olivier Beaumont (Inria, France); Lionel Eyraud-Dubois (INRIA Bordeaux Sud-Ouest and University of Bordeaux, France); Suraj Kumar (University of Bordeaux and INRIA Bordeaux, France)

Session 26
System Software

Session Chair: Andrew Lumsdaine

1570222819
Optimization of MPI Collective Communication on Fat-tree Networks
Sameer Kumar (IBM Research, India); Sameh Sharkawi (IBM Systems and Technology Group, USA); Nysal K. A. Jan (IBM Systems and Technology Group, India)

1570222828
On the Scalability, Performance Isolation and Device Driver Transparency of the IHK/McKernel Hybrid Lightweight Kernel
Balazs Gerofi, Masamichi Takagi and Atsushi Hori (RIKEN Advanced Institute for Computational Science, Japan); Gou Nakamura and Tomoki Shirasawa (Hitachi Solutions, Ltd., Japan); Yutaka Ishikawa (University of Tokyo, Japan)

1570223126
ZCCloud: Exploring Wasted Green Power for High-Performance Computing
Fan Yang (University of Chicago, USA); Andrew A Chien (University of Chicago and Argonne National Laboratory, USA)

1570223140
Agile Live Migration of Virtual Machines
Umesh Deshpande (IBM Research, USA); Danny Chan, Ten-Young Guh, James Edouard and Kartik Gopalan (State University of New York at Binghamton, USA); Nilton Bila (IBM Research, USA)

Session 27
Security & Fault Tolerance

Session Chair: Frederic Vivien

1570223105
Lazy Repair for Addition of Fault-tolerance
Yiyan Lin, Mohammad Roohitavaf and Sandeep Kulkarni (Michigan State University, USA)

1570222338
Security RBSG: Protecting Phase Change Memory with Security-Level Adjustable Dynamic Mapping
Fangting Huang, Dan Feng and Wen Xia (Huazhong University of Science and Technology, P.R. China); Wen Zhou (Wuhan National Lab for Optoelectronics, School of Computer Science and Technology and Huazhong University of Science and Technology, P.R. China); Yucheng Zhang, Min Fu, Chuntao Jiang and Yukun Zhou (Huazhong University of Science and Technology, P.R. China)

1570223195
Mitigation of Denial of Service Attack with Hardware Trojans in NoC Architectures
Travis Boraten and Avinash Kodi (Ohio University, USA)

1570222408
CRC-based Memory Reliability for Task-parallel HPC Applications
Omer Subasi, Osman Unsal and Jesús Labarta (Barcelona Supercomputing Center, Spain); Gulay Yalcin (Abdullah Gul University and Barcelona Supercomputing Center, Turkey); Adrian Cristal (Barcelona Supercomputing Center, Spain)

Session 28
Data Streaming

Session Chair: Cynthia A Phillips

1570222365
Differentiated Scheduling of Response-Critical and Best-Effort Wide-Area Data Transfers
Rajkumar Kettimuthu (Argonne National Lab, USA); Gagan Agrawal and Ponnuswamy Sadayappan (The Ohio State University, USA); Ian Foster (University of Chicago, USA)

1570223063
High Performance Pattern Matching Using the Automata Processor
Indranil Roy (Micron Technology, Inc., USA); Ankit Srivastava (Georgia Institute of Technology, USA); Marziyeh Nourian and Michela Becchi (University of Missouri-Columbia, USA); Srinivas Aluru (Georgia Institute of Technology and Indian Institute of Technology Bombay, USA)

1570223320
GPU-accelerated Outlier Detection for Continuous Data Streams
Chandima Hewanadungodage, Yuni Xia and John Lee (IUPUI, Indiana University – Purdue University Indianapolis, USA)

1570222234
Neptune: Real Time Stream Processing for Internet of Things and Sensing Environments
Thilina Buddhika and Shrideep Pallickara (Colorado State University, USA)

FRIDAY - 27 May 2016

FRIDAY WORKSHOPS
ALL DAY*
* See each individual workshop program for schedule details

IPDPS 2016 WORKSHOPS – FRIDAY 27 MAY
12	HPPAC	High-Performance, Power-Aware Computing
13	PDSEC	Workshop on Parallel and Distributed Scientific and Engineering Computing
14	DPDNS	Dependable Parallel, Distributed and Network-Centric Systems
15	LSPP	Large-Scale Parallel Processing
16	ParLearning	Parallel and Distributed Computing for Large Scale Machine Learning and Big Data Analytics
17	JSSPP	Workshop on Job Scheduling Strategies for Parallel Processing
18	iWAPT	International Workshop on Automatic Performance Tuning
19	CHIUW	Chapel Implementers and Users Workshop
20	HPBDC	High-Performance Big Data Computing
21	HPCMASPA	Monitoring and Analysis for High Performance Computing Systems Plus Applications
22	IPDRM	Emerging Parallel and Distributed Runtime Systems and Middleware
23	ParSocial	Parallel and Distributed Processing for Computational Social Systems

IPDPS 2016 Information on Keynote Speakers

IPDPS 2016 Tuesday
KEYNOTE SPEAKER

Kai Li
Princeton University
Disruptive Research and Innovation

Abstract: Ever since Clayton Christensen coined the terms "disruptive technologies" and "disruptive innovations" in the 1990s, researchers and entrepreneurs have loved the word "disruptive" because disrupting current knowledge or products helps us accelerate knowledge discovery and move society into a new era. What is disruptive research? What is disruptive innovation? How do they happen? To answer such questions, in this talk I will share my experience from co-leading the ImageNet project, which built a knowledge base for the computer vision and machine learning community, and from co-founding Data Domain, Inc., which built deduplication storage ecosystems to replace tape library infrastructure in data centers.

Bio: Kai Li is a Paul M. Wythes '55, P'86 and Marcia R. Wythes P'86 Professor at Princeton University, where he joined the faculty in 1986. He received his Ph.D. from Yale University, his M.S. from the Chinese Academy of Sciences, and his B.S. from Jilin University. His research areas include operating systems, parallel and distributed systems, storage systems, and analysis of large data. He pioneered Distributed Shared Memory (DSM), allowing shared-memory programming on a cluster of computers. His group proposed a user-level DMA mechanism for efficient cluster communication, which evolved into the RDMA standard of InfiniBand. He co-led the ImageNet project, which enabled the computer vision and machine learning community to accelerate their advances. He co-founded Data Domain, Inc. and led the innovation of deduplication storage system products to replace tape libraries at data centers. For Data Domain, he served as chief executive officer, chief technology officer and chief scientist. He is an ACM Fellow, an IEEE Fellow and a member of the National Academy of Engineering.

IPDPS 2016 Wednesday
KEYNOTE SPEAKER

Thomas Pawlowski
Micron
Memory, Storage and Processing in Future Parallel and Distributed Processing Systems

Abstract: This is perhaps the most exciting time in the short yet eventful 71-year history of Turing-complete computing. We are in the early but visible stage of an exponential explosion of data and analyses thereof. We simultaneously have witnessed the cessation of several exponential scaling-related trends and a slowdown of technology scaling itself. Technology scaling will be discussed in this talk. We will zero in on the salient features of a new epoch in the operation of processing systems. We will discuss the new balance in algorithms, architectures, technology selection, components and their usage. New technologies will be presented, showing the potential of some new concepts. Considerations for memory and storage scale-up and scale-out will be examined. Finally, we will conclude with a view of our challenges and opportunities for research and collaboration.

Bio: J. Thomas Pawlowski is a Fellow and Chief Technologist with Micron's Architecture Development Group. His responsibilities include advising on new technologies, investments and system/memory/storage architectures. For the past twenty-five years at Micron Mr. Pawlowski has had the pleasure of making key technical contributions to many new memory and system architectures such as synchronous burst pipelined SRAM; hierarchical cache systems; Zero Bus Turnaround SRAM; abstracted memory; double data rate memory; Pseudo-Static RAM; high-speed NAND; double address rate memory; quad data rate SRAM; multi-channel memory; memories on SERDES buses; Reduced Latency DRAM; new refresh schemes; 3D memory; the Non-Deterministic Finite Automata Processor; abstraction protocols; new ECC concepts; processing near memory concepts; 3D Xpoint system architecture and others yet to be announced. Mr. Pawlowski earned a bachelor of applied science degree in electrical engineering, summa cum laude, from the University of Waterloo in Canada. He has well over 100 U.S. and in-flight patents and serves on several advisory boards and conference program committees.

IPDPS 2016 Thursday
KEYNOTE SPEAKER

Katrin Heitmann
Argonne National Laboratory
Unlocking the Mysteries of the Universe with Supercomputers

Abstract: Cosmology is in a scientifically very exciting phase. Two decades of surveying the sky have culminated in the celebrated "Cosmological Standard Model". Yet, two of its key pillars, dark matter and dark energy -- together accounting for 95% of the mass-energy of the Universe -- remain mysterious. Deep fundamental questions demand answers; to address these burning questions, survey capabilities are being exponentially improved. The new observations will pose tremendous challenges on many fronts -- from the sheer size of the data that will be collected to its modeling and interpretation. The interpretation of the data requires sophisticated simulations on the world's largest supercomputers.

In this talk I will introduce HACC, the Hardware/Hybrid Accelerated Cosmology Code, which is being developed to combat the tremendous computational challenge to simulate our Universe. HACC is a new and evolving cosmology N-body code framework, designed to run very efficiently on diverse computing architectures and to scale to millions of cores and beyond. HACC can run on all current supercomputer architectures and supports a variety of programming models. HACC's design allows for ease of portability, and at the same time, high levels of sustained performance on the fastest supercomputers available today. I present a description of the design philosophy of HACC and underlying code structure and outline some implementation details. I will also briefly describe the analysis challenges posed by the large data sets that the HACC simulations generate. Finally, I will discuss some results from our recent work on confronting the simulated with the real Universe.

Bio: Katrin Heitmann is a member of the scientific staff at Argonne National Laboratory in High Energy Physics and Mathematics and Computational Science Divisions. She is also a Senior Fellow at the Computation Institute and the Kavli Institute for Cosmological Physics at the University of Chicago. Her research focuses on physical cosmology, advanced statistical methods, and large scale computing. Heitmann received her PhD in 2000 at the University of Dortmund (Germany), held a postdoctoral position and later a staff position at Los Alamos National Laboratory before she joined Argonne in 2011. She is a member of the American Physical Society.