IPDPS 2022 Conference

IPDPS 2022 Advance Program

The Main Conference Advance Program is available here. It includes the virtual program schedule for Tuesday, Wednesday, and Thursday, maps each paper to the technical session in which it will be presented, and provides each paper's abstract.

The IPDPS 2022 Workshops listed here will be held on Monday, May 30th and Friday, June 3rd. See each workshop's website for its program schedule. The Monday schedule will also include events for the PhD Forum.

The full schedule for all events will be available on the Virtual Platform, which all registrants can access with the email address (user ID) they used to register for the conference. Before the conference starts, registrants will receive an email invitation to sign in to the platform. Check your spam folder if you do not see it in your regular mailbox. Registrants will also receive an email giving them access to the proceedings.

The following is a list of the 123 contributed papers accepted for presentation as part of the main conference program to be held virtually on May 31st, June 1st, and June 2nd. The three days will include the keynote presentations as listed on the home page.

Updated April 18, 2022

A Fine-grained Prefetching Scheme for DGEMM Kernels on GPU with Auto-tuning Compatibility	Jialin Li, Computer Network Information Center, Chinese Academy of Sciences Huang Ye, Computer Network Information Center, Chinese Academy of Sciences Shaobo Tian, Computer Network Information Center, Chinese Academy of Sciences Xinyuan Li, Computer Network Information Center, Chinese Academy of Sciences Jian Zhang, Computer Network Information Center, Chinese Academy of Sciences
A Framework to Exploit Data Sparsity in Tile Low-Rank Cholesky Factorization	Qinglei Cao, University of Tennessee Rabab Alomairy, King Abdullah University of Science and Technology Yu Pei, University of Tennessee George Bosilca, University of Tennessee Hatem Ltaief, King Abdullah University of Science and Technology David Keyes, King Abdullah University of Science and Technology Jack Dongarra, University of Tennessee
A General Offloading Approach for Processing-In-Memory Architectures	Dan Chen, Huazhong University of Science and Technology Hai Jin, Huazhong University of Science and Technology Long Zheng, Huazhong University of Science and Technology Yu Huang, Huazhong University of Science and Technology Pengcheng Yao, Huazhong University of Science and Technology Chuangyi Gui, Huazhong University of Science and Technology Qinggang Wang, Huazhong University of Science and Technology Haifeng Liu, Huazhong University of Science and Technology Haiheng He, Huazhong University of Science and Technology Xiaofei Liao, Huazhong University of Science and Technology Ran Zheng, Huazhong University of Science and Technology
A Model-Architecture Co-Design for High Performance Temporal GNN Inference on FPGA	Hongkuan Zhou, University of Southern California Bingyi Zhang, University of Southern California Rajgopal Kannan, US Army Research Lab Viktor Prasanna, University of Southern California Carl Busart, US Army Research Lab
A Quantitative Study of the Spatiotemporal I/O Burstiness of HPC Application	Wenxiang Yang, College of Computer, National University of Defense Technology Xiangke Liao, College of Computer, National University of Defense Technology Dezun Dong, College of Computer, National University of Defense Technology Jie Yu, Computational Aerodynamics Institute, China Aerodynamics Research and Development Center
A Scalable Adaptive-Matrix Solver for Heterogeneous Architectures	Han Tran, University of Utah Milinda Fernando, University of Texas at Austin Kumar Saurabh, Iowa State University Baskar Ganapathysubramanian, Iowa State University Robert Kirby, University of Utah Hari Sundar, University of Utah
A self-stabilizing 1-minimal dominating set algorithm based on loop composition in networks of girth at least 7	Syohei Maruyama, Hiroshima University Yuichi Sudo, Hosei University Sayaka Kamei, Hiroshima University Hirotsugu Kakugawa, Ryukoku University
A Swap Dominated Tensor Re-Generation Strategy for Training Deep Learning Models	Zan Zong, Tsinghua University Lijie Wen, Tsinghua University Li Lin, Tsinghua University Leilei Lin, Capital Normal University
Accelerating Encrypted Computing on Intel GPUs	Yujia Zhai, University of California, Riverside Mohannad Ibrahim, North Carolina State University Yiqin Qiu, Intel Corporation Fabian Boemer, Intel Corporation Zizhong Chen, University of California, Riverside Alexey Titov, Intel Corporation Alexander Lyashevsky, Intel Corporation
Accuracy vs. Cost in Parallel Fixed-Precision Low-Rank Approximations of Sparse Matrices	Robert Ernstbrunner, University of Vienna Viktoria Mayer, University of Vienna Wilfried Gansterer, University of Vienna
Adaptive Verifiable Coded Computing: Towards Fast, Secure and Private Distributed Machine Learning	Tingting Tang, University of Southern California Ramy E. Ali, University of Southern California Hanieh Hashemi, University of Southern California Tynan Gangwani, University of Southern California Salman Avestimehr, University of Southern California Murali Annavaram, University of Southern California
Alias-Chain: Improving Blockchain Scalability via Exploring Content Locality among Transactions	Jintong Liu, Huazhong University of Science and Technology Shenggang Wan, Huazhong University of Science and Technology Xubin He, Temple University
An Efficient Block Validation Mechanism for UTXO-based Blockchains	Xiaohai Dai, Huazhong University of Science and Technology Bin Xiao, The Hong Kong Polytechnic University Jiang Xiao, Huazhong University of Science and Technology Hai Jin, Huazhong University of Science and Technology
An Efficient Vectorization Scheme for Stencil Computation	Kun Li, Institute of Computing Technology of Chinese Academy of Sciences Liang Yuan, Institute of Computing Technology of Chinese Academy of Sciences Yunquan Zhang, Institute of Computing Technology of Chinese Academy of Sciences Yue Yue, Institute of Computing Technology of Chinese Academy of Sciences Hang Cao, Institute of Computing Technology of Chinese Academy of Sciences
An End-to-end and Adaptive I/O Optimization Tool for Modern HPC Storage Systems	Bin Yang, Shandong University Yanliang Zou, ShanghaiTech University Weiguo Liu, Shandong University Wei Xue, Tsinghua University
An Integral-equation-oriented Vectorized SpMV Algorithm and its Application on CT Imaging Reconstructions	Weicai Ye, Sun Yat-sen University Chenghuan Huang, Sun Yat-sen University Jiasheng Huang, Sun Yat-sen University Jiajun Li, Sun Yat-sen University Yao Lu, Sun Yat-sen University Ying Jiang, Sun Yat-sen University
Archpipe: Fast and Flexible Pipelined Erasure-coded Archival Scheme for Heterogeneous Networks	Bin Xu, Huazhong University of Science and Technology Jianzhong Huang, Huazhong University of Science and Technology Qiang Cao, Huazhong University of Science and Technology Xiao Qin, Auburn University
As easy as ABC: Optimal (A)ccountable (B)yzantine (C)onsensus is easy!	Pierre Civit, Sorbonne University Seth Gilbert, NUS Singapore Vincent Gramoli, University of Sydney and EPFL Rachid Guerraoui, EPFL Jovan Komatovic, EPFL
Asynchronous Distributed-Memory Triangle Counting and LCC with RMA Caching	András Strausz, ETH Zurich Flavio Vella, University of Trento, Italy Salvatore Di Girolamo, ETH Zurich Maciej Besta, ETH Zurich Torsten Hoefler, ETH Zurich
AxoNN: An asynchronous, message-driven parallel framework for extreme-scale deep learning	Siddharth Singh, University of Maryland, College Park Abhinav Bhatele, University of Maryland, College Park
Batched sparse iterative solvers on GPU for the collision operator for fusion plasma simulations	Aditya Kashi, Karlsruhe Institute of Technology Pratik Nayak, Karlsruhe Institute of Technology Dhruva Kulkarni, Lawrence Berkeley National Laboratory Aaron Scheinberg, Jubilee Development Paul Lin, Lawrence Berkeley National Laboratory Hartwig Anzt, Karlsruhe Institute of Technology
Bit-GraphBLAS: Bit-Level Optimizations of Matrix-Centric Graph Processing on GPU	Jou-An Chen, North Carolina State University Ang Li, Pacific Northwest National Lab Nathan Tallent, Pacific Northwest National Lab Kevin Barker, Pacific Northwest National Lab Xipeng Shen, North Carolina State University Hsin-Hsuan Sung, North Carolina State University
Booster: An Accelerator for Gradient Boosting Decision Trees Training and Inference	Mingxuan He, Purdue University Mithuna Thottethodi, Purdue University T. N. Vijaykumar, Purdue University
Bounding the Flow Time in Online Scheduling with Structured Processing Sets	Louis-Claude Canon, FEMTO-ST Institute Anthony Dugois, Inria Loris Marchal, CNRS
Co-Designing an OpenMP GPU Runtime and Optimizations for Near-Zero Overhead Execution	Johannes Doerfert, Argonne National Laboratory Atmn Patel, University of Waterloo Joseph Huber, Oak Ridge National Laboratory Shilei Tian, Stony Brook University Jose M. Monsalve Diaz, Argonne National Laboratory Barbara Chapman, Stony Brook University Giorgis Georgakoudis, Lawrence Livermore National Laboratory
Coloring the Vertices of 9-pt and 27-pt Stencils with Intervals	Dante Durrman, UNC Charlotte Erik Saule, UNC Charlotte
Colza: Enabling Elastic In Situ Visualization for High-performance Computing Simulations	Matthieu Dorier, Argonne National Laboratory (ANL) Zhe Wang, Rutgers University Utkarsh Ayachit, Kitware, Inc Shane Snyder, Argonne National Laboratory Rob Ross, Argonne National Laboratory Manish Parashar, University of Utah
Communication-efficient Massively Distributed Connected Components	Sebastian Lamm, Karlsruhe Institute of Technology Peter Sanders, Karlsruhe Institute of Technology
Compiler-Directed Incremental Checkpointing for Low Latency GPU Preemption	Zhuoran Ji, The University of Hong Kong Cho-Li Wang, The University of Hong Kong
Coupling streaming AI and HPC ensembles to achieve 100-1000× faster bio-molecular simulations	Alexander Brace, University of Chicago Shantenu Jha, Brookhaven National Lab Igor Yakushin, Argonne National Laboratory Hyungro Lee, Rutgers University Heng Ma, Argonne National Laboratory Anda Trifan, University of Illinois Urbana Champaign Li Tan, Brookhaven National Laboratory Todd Munson, Argonne National Laboratory Matteo Turilli, Rutgers University Ian Foster, Argonne National Laboratory Arvind Ramanathan, Argonne National Lab
CSC: Collaborative System Configuration for I/O-Intensive Applications in Multi-Tenant Clouds	Haowei Huang, Shanghai Jiao Tong University Pu Pang, Shanghai Jiao Tong University Quan Chen, Shanghai Jiao Tong University Jieru Zhao, Shanghai Jiao Tong University Wenli Zheng, Shanghai Jiao Tong University Minyi Guo, Shanghai Jiao Tong University
CSMV: A Highly Scalable Multi-Versioned Software Transactional Memory for GPUs	Diogo Nunes, IST/INESC-ID Daniel Castro, IST/INESC-ID Paolo Romano, IST/INESC-ID
DEAN: A Lightweight and Resource-efficient Blockchain Protocol for Reliable Edge Computing	Abdullah Al Mamun, University of Nevada, Reno Haoting Shen, University of Nevada, Reno Dongfang Zhao, University of Nevada, Reno
Degree-Aware Kernels for Computing Jaccard Weights on GPUs	Amro Alabsi Aljundi, Sabancı University Taha Atahan Akyıldız, Sabancı University Kamer Kaya, Sabancı University
DeNOVA: Deduplication Extended NOVA File System	Hyungjoon Kwon, Sogang University Yonghyeon Cho, Sogang University Awais Khan, Oak Ridge National Laboratory Yeohyeon Park, Sogang University Youngjae Kim, Sogang University
DFMan: A Graph-based Optimization of Dataflow Scheduling on High-Performance Computing Systems	Fahim Tahmid Chowdhury, Florida State University Francesco Di Natale, Lawrence Livermore National Laboratory Adam Moody, Lawrence Livermore National Laboratory Kathryn Mohror, Lawrence Livermore National Laboratory Weikuan Yu, Florida State University
DGSF: Disaggregated GPUs for Serverless Functions	Henrique Fingler, The University of Texas at Austin Zhiting Zhu, The University of Texas at Austin Esther Yoon, The University of Texas at Austin Zhipeng Jia, The University of Texas at Austin Emmett Witchel, The University of Texas at Austin Christopher J. Rossbach, The University of Texas at Austin
Direct solution of larger coupled sparse/dense linear systems using low-rank compression on single-node multi-core machines in an industrial context	Emmanuel Agullo, Inria Marek Felšöci, Inria Guillaume Sylvand, Airbus Central R & T
DistrEdge: Speeding up Convolutional Neural Network Inference on Distributed Edge Devices	Xueyu Hou, New Jersey Institute of Technology Yongjie Guan, New Jersey Institute of Technology Tao Han, New Jersey Institute of Technology Ning Zhang, University of Windsor
Distributed Memory Sparse Kernels for Machine Learning	Vivek Bharadwaj, University of California, Berkeley Aydın Buluç, Lawrence Berkeley National Laboratory James Demmel, University of California, Berkeley
Dynamic Computation Offloading for Green Things-Edge-Cloud Computing with Local Caching	Xianzhong Tian, Zhejiang University of Technology Huixiao Meng, Zhejiang University of Technology Yanjun Li, Zhejiang University of Technology Pingting Miao, Zhejiang University of Technology Pengcheng Xu, Zhejiang University of Technology
Dynamic Task Shaping for High Throughput Data Analysis Applications in High Energy Physics	Benjamin Tovar, University of Notre Dame Benjamin Lyons, University of Notre Dame Kelci Mohrman, University of Notre Dame Barry Sly-Delgado, University of Notre Dame Kevin Lannon, University of Notre Dame Douglas Thain, University of Notre Dame
Enabling Efficient Request Management through Microservice Level Parallelism	Xinkai Wang, Shanghai Jiao Tong University Chao Li, Shanghai Jiao Tong University Lu Zhang, Shanghai Jiao Tong University Xiaofeng Hou, Hong Kong University of Science and Technology Quan Chen, Shanghai Jiao Tong University Minyi Guo, Shanghai Jiao Tong University
Excavating the Potential of Graph Workload on RDMA-based Far Memory Architecture	Jing Wang, Shanghai Jiao Tong University Chao Li, Shanghai Jiao Tong University Taolei Wang, Shanghai Jiao Tong University Lu Zhang, Shanghai Jiao Tong University Pengyu Wang, Shanghai Jiao Tong University Junyi Mei, Shanghai Jiao Tong University Minyi Guo, Shanghai Jiao Tong University
Exploiting Reduced Precision for GPU-based Time Series Mining	Yi Ju, Max Planck Computing and Data Facility Amir Raoofy, Technical University of Munich Dai Yang, NVIDIA GmbH Erwin Laure, Max Planck Computing and Data Facility Martin Schulz, Technical University of Munich
Falcon: A Timestamp-based Protocol to Maximize the Cache Efficiency in the Distributed Shared Memory	Jin Zhang, Shanghai Jiao Tong University Xiangyao Yu, University of Wisconsin–Madison Zhengwei Qi, Shanghai Jiao Tong University Haibing Guan, Shanghai Jiao Tong University
FAM-Graph: Graph Analytics on Disaggregated Memory	Daniel Zahka, Georgia Institute of Technology Ada Gavrilovska, Georgia Institute of Technology
Fast and High-Quality Influence Maximization on Multiple GPUs	Gökhan Göktürk, Sabancı University Kamer Kaya, Sabancı University
Fast Convergence to Fairness for Reduced Long Flow Tail Latency in Datacenter Networks	John Snyder, Duke University Alvin R. Lebeck, Duke University
Fast Parallel Bayesian Network Structure Learning	Jiantong Jiang, The University of Western Australia Zeyi Wen, The University of Western Australia Ajmal Mian, The University of Western Australia
Fault-tolerant Snapshot Objects in Message Passing Systems	Vijay Garg, UT Austin Saptaparni Kumar, Unaffiliated Lewis Tseng, Boston College Xiong Zheng, Google
Finding Small Vertex Covers in Parallel using GPUs	Peter Yamout, American University of Beirut Karim Barada, American University of Beirut Adnan Jaljuli, American University of Beirut Amer Mouawad, American University of Beirut Izzat El Hajj, American University of Beirut
FlashWalker: An In-Storage Accelerator for Graph Random Walks	Fuping Niu, Huazhong University of Science and Technology Jianhui Yue, Michigan Tech. University Jiangqiu Shen, Michigan Tech. University Xiaofei Liao, Huazhong University of Science and Technology Haikun Liu, Huazhong University of Science and Technology Hai Jin, Huazhong University of Science and Technology
Generalized Flow-Graph Programming Using Template Task-Graphs: Initial Implementation and Assessment	Joseph Schuchart, University of Tennessee, Innovative Computing Laboratory Poornima Nookala, IACS, Stony Brook University Mohammad Mahdi Javanmard, Facebook Inc. Thomas Herault, University of Tennessee, Innovative Computing Laboratory Edward F. Valeev, Department of Chemistry, Virginia Tech George Bosilca, University of Tennessee, Innovative Computing Laboratory Robert J. Harrison, IACS, Stony Brook University
GSpecPal: Speculation-Centric Finite State Machine Parallelization on GPUs	Yuguang Wang, Michigan Technological University Robbie Watling, Michigan Technological University Junqiao Qiu, Michigan Technological University Zhenlin Wang, Michigan Technological University
HACCS: Heterogeneity-Aware Clustered Client Selection for Accelerated Federated Learning	Joel Wolfrath, University of Minnesota Nikhil Sreekumar, University of Minnesota Dhruv Kumar, University of Minnesota Yuanli Wang, University of Minnesota Abhishek Chandra, University of Minnesota
HDagg: Hybrid Aggregation of Loop-carried Dependence Iterations in Sparse Matrix Computations	Behrooz Zarebavani, University of Toronto Kazem Cheshmi, University of Toronto Bangtian Liu, University of Toronto Michelle Mills Strout, University of Arizona Maryam Mehri Dehnavi, University of Toronto
High-order Line Graphs of Non-uniform Hypergraphs: Algorithms, Applications, and Experimental Analysis	Xu Tony Liu, University of Washington Jesun Firoz, Pacific Northwest National Laboratory Andrew Lumsdaine, University of Washington Cliff Joslyn, Pacific Northwest National Lab Sinan Aksoy, Pacific Northwest National Lab Ilya Amburg, Pacific Northwest National Lab Brenda Praggastis, Pacific Northwest National Lab Assefaw Gebremedhin, Washington State University
HRaft: Adaptive Erasure Coded Data Maintenance for Consensus in Distributed Networks	Yulei Jia, Tianjin University of Technology Guangping Xu, Tianjin University of Technology Chi Wan Sung, City University of Hong Kong Salwa Mostafa, City University of Hong Kong Yulei Wu, University of Exeter
HTS: A Threaded Multilevel Sparse Hybrid Solver	Joshua D. Booth, University of Alabama, Huntsville
Hybrid Workload Scheduling on HPC Systems	Yuping Fan, Illinois Institute of Technology Zhiling Lan, Illinois Institute of Technology Paul Rich, Argonne National Laboratory William Allcock, Argonne National Laboratory Michael Papka, Argonne National Laboratory
I/O-optimal Cache-oblivious Sparse Matrix-Sparse Matrix Multiplication	Niels Gleinig, ETH Zurich Maciej Besta, ETH Zurich Torsten Hoefler, ETH Zurich
In-Memory Indexed Caching for Distributed Data Processing	Alexandru Uta, Leiden University Bogdan Ghit, Databricks Ankur Dave, UC Berkeley Jan Rellermeyer, TU Delft Peter Boncz, CWI
Landau collision operator in the CUDA programming model applied to thermal quench plasmas	Mark Adams, Lawrence Berkeley National Laboratory Dylan Brennan, Princeton University Matthew Knepley, University of Buffalo Peng Wang, NVIDIA
Learning Intermediate Representations using Graph Neural Networks for NUMA and Prefetchers Optimization	Ali TehraniJamsaz, Iowa State University Mihail Popov, Inria Akash Dutta, Iowa State University Emmanuelle Saillard, Inria Ali Jannesari, Iowa State University
Lightning: Scaling the GPU Programming Model Beyond a Single GPU	Stijn Heldens, Netherlands eScience Center Pieter Hijma, VU University Amsterdam Ben van Werkhoven, Netherlands eScience Center Jason Maassen, Netherlands eScience Center Rob V. van Nieuwpoort, Netherlands eScience Center
Memory Access Granularity Aware Lossless Compression for GPUs	Sohan Lal, Technical University of Hamburg Manuel Renz, Technical University of Berlin Julian Hartmer, Technical University of Berlin Ben Juurlink, Technical University of Berlin
Memory-Aware Scheduling of Tasks Sharing Data on Multiple GPUs with Dynamic Runtime Systems	Maxime Gonthier, ENS Lyon Loris Marchal, French National Center for Scientific Research Samuel Thibault, Univ. Bordeaux
MICCO: An Enhanced Multi-GPU Scheduling Framework for Many-Body Correlation Functions	Qihan Wang, College of William and Mary Bin Ren, College of William and Mary Jie Chen, Jefferson Lab Robert Edwards, Jefferson Lab
Minerva: Rethinking Secure Architectures for the Era of Fabric-Attached Memory Architectures	Mazen Alwadi, University of Central Florida Rujia Wang, Illinois Institute of Technology David Mohaisen, University of Central Florida Clayton Hughes, Sandia National Laboratories Simon Hammond, Sandia National Laboratories Amro Awad, North Carolina State University
Mixed precision s-step Conjugate Gradient with Residual Replacement on GPUs	Ichitaro Yamazaki, Sandia National Laboratories Erin Carson, Charles University Brian Kelley, Sandia National Laboratories
MLCNN: Cross-Layer Cooperative Optimization and Accelerator Architecture for Speeding Up Deep Learning Applications	Beilei Jiang, University of North Texas Xianwei Cheng, University of North Texas Sihai Tang, University of North Texas Xu Ma, University of North Texas Zhaochen Gu, University of North Texas Song Fu, University of North Texas Qing Yang, University of North Texas Mingxiong Liu, Los Alamos National Laboratory
Mnemonic: A Parallel Subgraph Matching System for Streaming Graphs	Bibek Bhattarai, George Washington University Howie Huang, George Washington University
Modeling Matrix Engines for Portability and Performance	Nicholai Tukanov, Carnegie Mellon University Tze Meng Low, Carnegie Mellon University Jose Moreira, IBM Rajalakshmi Srinivasaraghavan, IBM
Multi-Phase Task-Based HPC Applications: Quickly Learning how to Run Fast	Lucas Leandro Nesi, Institute of Informatics, Federal University of Rio Grande do Sul Lucas Mello Schnorr, Institute of Informatics, Federal University of Rio Grande do Sul Arnaud Legrand, University Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG
Neon: A Multi-GPU Programming Model for Grid-based Computations	Massimiliano Meneghin, Autodesk Research Ahmed Mahmoud, Autodesk Research Pradeep Kumar Jayaraman, Autodesk Research Nigel J. W. Morris, Autodesk Research
Next-Generation Local Time Stepping for the ADER-DG Finite Element Method	Alexander Breuer, Friedrich Schiller University Jena Alexander Heinecke, Intel
OmpSs@cloudFPGA: An FPGA Task-Based Programming Model with Message Passing	Juan Miguel de Haro, Barcelona Supercomputing Center Rubén Cano, Barcelona Supercomputing Center Carlos Álvarez, Universitat Politécnica de Catalunya Daniel Jiménez-González, Universitat Politécnica de Catalunya Xavier Martorell, Universitat Politécnica de Catalunya Eduard Ayguadé, Barcelona Supercomputing Center Jesús Labarta, Barcelona Supercomputing Center Burkhard Ringlein, IBM Research Europe Francois Abel, IBM Research Europe Beat Weiss, IBM Research Europe
On the Parallel Reconstruction from Pooled Data	Oliver Gebhard, Goethe University, Frankfurt Max Hahn-Klimroth, Goethe University, Frankfurt Dominik Kaaser, University of Hamburg Philipp Loick, Goethe University, Frankfurt
Optimal Arbitrary Pattern Formation on a Grid by Asynchronous Autonomous Robots	Rory Hector, Louisiana State University Gokarna Sharma, Kent State University Ramachandran Vaidyanathan, Louisiana State University Jerry L. Trahan, Louisiana State University
Optimizing Huffman Decoding for Error-Bounded Lossy Compression on GPUs	Cody Rivera, University of Alabama Sheng Di, Argonne National Laboratory Xiaodong Yu, Argonne National Laboratory Jiannan Tian, Washington State University Dingwen Tao, Washington State University Franck Cappello, Argonne National Laboratory
Parallel Approximations of the Tukey g-and-h Likelihoods and Predictions for Non-Gaussian Geostatistics	Sagnik Mondal, King Abdullah University of Science and Technology Sameh Abdulah, King Abdullah University of Science and Technology Marc Genton, King Abdullah University of Science and Technology Ying Sun, King Abdullah University of Science and Technology Hatem Ltaief, King Abdullah University of Science and Technology David Keyes, King Abdullah University of Science and Technology
Parallel Fully Dynamic Maintenance of 2-Connected Components	Chirayu Haryan, Indian Institute of Technology Tirupati Ramakrishna G, Indian Institute of Technology Tirupati Kishore Kothapalli, International Institute of Information Technology Hyderabad Dip Sankar Banerjee, Indian Institute of Technology Jodhpur
Parallel Global Edge Switching for the Uniform Sampling of Simple Graphs with Prescribed Degrees	Daniel Allendorf, Goethe University Frankfurt Ulrich Meyer, Goethe University Frankfurt Manuel Penschuck, Goethe University Frankfurt Hung Tran, Goethe University Frankfurt
Parallel Tensor Train Rounding using Gram SVD	Hussam Al Daas, Rutherford Appleton Laboratory Grey Ballard, Wake Forest University Lawton Manning, Wake Forest University
Parallel, Portable Algorithms for Distance-2 Maximal Independent Set and Graph Coarsening	Brian Kelley, Sandia National Laboratories Sivasankaran Rajamanickam, Sandia National Laboratories
Parallelizing and Balancing Large-scale Particle Simulations based on Coupled DSMC/PIC	Haozhong Qiu, College of Computer, National University of Defense Technology Chuanfu Xu, College of Computer, National University of Defense Technology Dali Li, College of Aerospace Science and Engineering, National University of Defense Technology Haoyu Wang, College of Aerospace Science and Engineering, National University of Defense Technology Jie Li, College of Aerospace Science and Engineering, National University of Defense Technology Zheng Wang, University of Leeds
ParaTreeT: A Fast, General Framework for Spatial Tree Traversal	Joseph Hutter, University of Illinois at Urbana-Champaign Justin Szaday, University of Illinois at Urbana-Champaign Jaemin Choi, University of Illinois at Urbana-Champaign Spencer Wallace, University of Washington Simeng Liu, University of Illinois at Urbana-Champaign Laxmikant Kale, University of Illinois at Urbana-Champaign Thomas Quinn, University of Washington
PARSEC: PARallel Subgraph Enumeration in CUDA	Vibhor Dodeja, University of Illinois at Urbana-Champaign Mohammad Almasri, University of Illinois at Urbana-Champaign Rakesh Nagi, University of Illinois at Urbana-Champaign Jinjun Xiong, IBM Thomas J. Watson Research Center Wen-Mei Hwu, University of Illinois at Urbana-Champaign
P-ckpt: Coordinated Prioritized Checkpointing	Subhendu Behera, North Carolina State University Lipeng Wan, Oak Ridge National Laboratory Frank Mueller, North Carolina State University Matthew Wolf, Oak Ridge National Laboratory Scott Klasky, Oak Ridge National Laboratory
pFedGF: Enabling Personalized Federated Learning via Gradient Fusion	Xinghao Wu, State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science, Beihang University Jianwei Niu, State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science, Beihang University Xuefeng Liu, State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science, Beihang University Tao Ren, Hangzhou Innovation Institute, Beihang University, Hangzhou 310051, China Zhangmin Huang, Hangzhou Innovation Institute of Beihang University Zhetao Li, Hunan International Scientific and Technological Cooperation Base of Intelligent Network, Xiangtan University, Xiangtan, Hunan 411105, China
PINT: Parallel INTerval-Based Race Detector	Yifan Xu, Washington University in St. Louis Anchengcheng Zhou, Washington University in St. Louis Kunal Agrawal, Washington University in St. Louis I-Ting Angelina Lee, Washington University in St. Louis
PokéMem: Taming Wild Memory Consumers in Apache Spark	Minhyeok Kweun, Samsung Research Goeun Kim, Samsung Research Byungsoo Oh, Samsung Research Seongho Jung, Samsung Research Taegeon Um, Samsung Research Woo-Yeon Lee, Samsung Research
PowerSpector: Towards Energy Efficiency with Calling-Context-Aware Profiling	Xin You, Beihang University Hailong Yang, Beihang University Zhibo Xuan, Beihang University Zhongzhi Luan, Beihang University Depei Qian, Beihang University
Preprocessing Pipeline Optimization for Scientific Deep-Learning Workloads	Khaled Ibrahim, Lawrence Berkeley National Laboratory Leonid Oliker, Lawrence Berkeley National Laboratory
QoS-awareness of Microservices with Excessive Loads via Inter-Datacenter Scheduling	Jiuchen Shi, Shanghai Jiao Tong University Jiawen Wang, Shanghai Jiao Tong University Kaihua Fu, Shanghai Jiao Tong University Quan Chen, Shanghai Jiao Tong University Deze Zeng, China University of Geosciences Minyi Guo, Shanghai Jiao Tong University
Resource Utilization Aware Job Scheduling to Mitigate Performance Variability	Daniel Nichols, University of Maryland, College Park Aniruddha Marathe, Lawrence Livermore National Laboratory Kathleen Shoga, Lawrence Livermore National Laboratory Todd Gamblin, Lawrence Livermore National Laboratory Abhinav Bhatele, University of Maryland, College Park
RLRP: High-Efficient Data Placement with Reinforcement Learning for Modern Distributed Storage Systems	Kai Lu, Huazhong University of Science and Technology Nannan Zhao, Northwestern Polytechnical University Jiguang Wan, Huazhong University of Science and Technology Changhong Fei, Huazhong University of Science and Technology Wei Zhao, SenseTime Research Tongliang Deng, SenseTime Research
SALoBa: Maximizing Data Locality and Workload Balance for Fast Sequence Alignment on GPUs	Seongyeon Park, CS, Yonsei University Hajin Kim, CS, Yonsei University Tanveer Ahmad, TU Delft Nauman Ahmed, TU Delft Zaid Al-Ars, TU Delft Peter Hofstee, TU Delft Youngsok Kim, CS/AI, Yonsei University Jinho Lee, CS/AI, Yonsei University
Scalable Low-Latency Inter-FPGA Networks	Kien Trung Pham, Graduate University for Advanced Studies, SOKENDAI Thao Nguyen Truong, National Institute of Advanced Industrial Science and Technology (AIST) Hiroshi Yamaguchi, Photonics Electronics Technology Research Association (PETRA) Yutaka Urino, Photonics Electronics Technology Research Association (PETRA) Michihiro Koibuchi, National Institute of Informatics (NII)
Scalable Multi-Versioning Metadata Dictionaries with Persistent Memory Support	Bogdan Nicolae, Argonne National Laboratory (ANL)
Scaling and Selecting GPU Methods for All Pairs Shortest Paths (APSP) Computations	Yang Xia, Ohio State Peng Jiang, University of Iowa Rajiv Ramnath, Ohio State Gagan Agrawal, Augusta University
Scheduling on Uniform and Unrelated Machines with Bipartite Incompatibility Graphs	Tytus Pikies, Gdańsk University of Technology Hanna Furmańczyk, University of Gdańsk
SecFortress: Securing Hypervisor using Cross-layer Isolation	Qihang Zhou, Institute of Information Engineering, Chinese Academy of Sciences Xiaoqi Jia, Institute of Information Engineering, Chinese Academy of Sciences Shengzhi Zhang, Department of Computer Science, Metropolitan College, Boston University, USA Nan Jiang, Institute of Information Engineering, Chinese Academy of Sciences Jiayun Chen, Institute of Information Engineering, Chinese Academy of Sciences Weijuan Zhang, Institute of Information Engineering, Chinese Academy of Sciences
SFP: Service Function Chain Provision on Programmable Switches for Cloud Tenants	Hongyi Huang, Tsinghua University Wenfei Wu, Peking University Zehua Guo, Beijing Institute of Technology Yongchao He, Tsinghua University
"Smarter" NICs for Faster Molecular Dynamics: a Case Study	Sara Karamati, Georgia Institute of Technology Jeffrey Young, Georgia Institute of Technology Richard Vuduc, Georgia Institute of Technology
Sparsity-Aware Tensor Factorization	Sureyya Emre Kurt, University of Utah Saurabh Raje, University of Utah Aravind Sukumaran-Rajam, Washington State University P. Sadayappan, University of Utah
SpectralFly: Ramanujan Graphs as Flexible and Efficient Interconnection Networks	Sinan Aksoy, Pacific Northwest National Lab Stephen Young, Pacific Northwest National Lab Jesun Firoz, Pacific Northwest National Laboratory Roberto Gioiosa, Pacific Northwest National Lab Mark Raugas, Pacific Northwest National Lab Mark Kempton, Brigham Young University Tobias Hagge, Pacific Northwest National Lab Juan Andres Escobedo Contreras, Pacific Northwest National Lab
SPIDER: An Effective, Efficient and Robust Load Scheduler for Real-time Split Frame Rendering	Bingzheng Ma, Nankai University Ziqiang Zhang, Nankai University Yusen Li, Nankai University Wentong Cai, Nanyang Technological University Gang Wang, Nankai University Xiaoguang Liu, Nankai University
SSB-Tree: Making Persistent Memory B+-Trees Crash-Consistent and Concurrent by Lazy-Box	Tongliang Li, Tsinghua University Haixia Wang, Tsinghua University Airan Shao, Tsinghua University Dongsheng Wang, Tsinghua University
StencilMART: Predicting Optimization Selection for Stencil Computations across GPUs	Qingxiao Sun, Beihang University Yi Liu, Beihang University Hailong Yang, Beihang University Zhonghui Jiang, Beihang University Zhongzhi Luan, Beihang University Depei Qian, Beihang University
TagTree: Global Tagging Index with Efficient Querying for Time Series Databases	Jin Xue, The Chinese University of Hong Kong Zhiqi Wang, The Chinese University of Hong Kong Tianyu Wang, The Chinese University of Hong Kong Zili Shao, The Chinese University of Hong Kong
Task-based Acceleration of Bidirectional Recurrent Neural Networks on Multi-core Architectures	Robin Kumar Sharma, Barcelona Supercomputing Center Marc Casas, Barcelona Supercomputing Center
TEE-based decentralized recommender systems: The raw data sharing redemption	Akash Dhasade, EPFL Nevena Dresevic, EPFL Anne-Marie Kermarrec, EPFL Rafael Pires, EPFL
The Fast and Scalable MPI Application Launch of the Tianhe HPC system	Yiqin Dai, National University of Defense Technology Yong Dong, National University of Defense Technology Min Xie, National University of Defense Technology Kai Lu, National University of Defense Technology Ruibo Wang, National University of Defense Technology
The Universal Gossip Fighter	Anastasiia Gorbunova, École polytechnique fédérale de Lausanne Anne-Marie Kermarrec, École polytechnique fédérale de Lausanne Anastasiia Kucherenko, École polytechnique fédérale de Lausanne Rafaël Pinot, École polytechnique fédérale de Lausanne Rachid Guerraoui, École polytechnique fédérale de Lausanne
Top-Down Performance Profiling on NVIDIA’s GPUs	Alvaro Saiz, University of Cantabria Pablo Prieto, University of Cantabria Pablo Abad, University of Cantabria Jose Angel Gregorio, University of Cantabria Valentin Puente, University of Cantabria
Topological Modeling and Parallelization of Multidimensional Data on Microelectrode Arrays	Olamide Tawose, University of Nevada, Reno Bin Li, University of Nevada, Reno Lei Yang, University of Nevada, Reno Feng Yan, University of Nevada, Reno Dongfang Zhao, University of Nevada, Reno
Towards Distributed 2-Approximation Steiner Minimal Trees in Billion-edge Graphs	Tahsin Reza, Lawrence Livermore National Laboratory Geoffrey Sanders, Lawrence Livermore National Laboratory Roger Pearce, Lawrence Livermore National Laboratory
Traffic-Optimal Virtual Network Function Placement and Migration in Dynamic Cloud Data Centers	Vincent Tran, University of California Riverside Jingsong Sun, California State University Dominguez Hills Bin Tang, California State University Dominguez Hills Deng Pan, Florida International University
Understanding the Design-Space of Sparse/Dense Multiphase GNN dataflows on Spatial Accelerators	Raveesh Garg, Georgia Institute of Technology Eric Qin, Georgia Institute of Technology Francisco Muñoz-Martínez, Universidad de Murcia Robert Guirado, Universitat Politecnica de Catalunya Akshay Jain, Neutroon Sergi Abadal, Universitat Politecnica de Catalunya José Abellán, Universidad Católica de Murcia Manuel Acacio, Universidad de Murcia Eduard Alarcon, Universitat Politecnica de Catalunya Sivasankaran Rajamanickam, Sandia National Laboratories Tushar Krishna, Georgia Institute of Technology
Unlocking Personalized Healthcare on Modern CPUs/GPUs: Three-way Gene Interaction Study	Diogo Marques, INESC-ID Rafael Campos, INESC-ID Sergio Santander-Jiménez, Polytechnic School, University of Extremadura Zakhar Matveev, Intel Corporation Leonel Sousa, INESC-ID Aleksandar Ilic, INESC-ID
Why Globally Re-shuffle? Revisiting Data Shuffling in Large Scale Deep Learning	Thao Nguyen Truong, AIST-Tokyo Tech Real World Big-Data Computation Open Innovation Laboratory François Trahay, Télécom SudParis Jens Domke, RIKEN Center for Computational Science Aleksandr Drozd, RIKEN Center for Computational Science Emil Vatai, RIKEN Center for Computational Science Jianwei Liao, College of Computer and Information Science, Southwest University of China Mohamed Wahib, National Institute of Advanced Industrial Science and Technology Balazs Gerofi, RIKEN Center for Computational Science