GradZip: Gradient Compression using Alternating Matrix Factorization for Large-scale Deep Learning: Minsik Cho (IBM); Vinod Muthusamy (IBM Research); Brad Nemanich (IBM); Ruchir Puri (IBM)
PyTorchPipe: a framework for rapid prototyping of pipelines combining language and vision: Tomasz Kornuta (IBM Research, Almaden)
Post-Training 4-bit Quantization on Embedding Tables: Hui Guan (North Carolina State University); Andrey Malevich (Facebook); Jiyan Yang (Facebook); Jongsoo Park (Facebook); Hector Yuen (Facebook)
CrossLang: the system of cross-lingual plagiarism detection: Oleg Bakhteev (Moscow Institute of Physics and Technology); Aleksandr Ogaltsov (Higher School of Economics, Moscow); Andrey Khazov (Antiplagiat Company); Kamil Safin (Moscow Institute of Physics and Technology); Rita Kuznetsova (Moscow Institute of Physics and Technology)
Solving imperfect information games on heterogeneous hardware by operation aggregation: Dong Yan (Tsinghua University); Chengze Cai (Tsinghua University); Kai Su (Tsinghua University); Yaoliang Yang (Tsinghua University); Tongzheng Ren (UT Austin); Jun Zhu (Tsinghua University)
FastEstimator: A Deep Learning Library for Fast Prototyping and Productization: Xiaomeng Dong (GE Healthcare); Hsi-Ming Chang (GE Healthcare); Michael Potter (GE Healthcare); Junpyo Hong (GE Healthcare)
Scale MLPerf-0.6 models on Google TPU-v3 Pods: Sameer Kumar (Google); Victor Bittorf (Google); Dehao Chen (Google); Chiachen Chou (Google); Blake Hechtman (Google); HyoukJoong Lee (Google); Naveen Kumar (Google); Peter Mattson (Google); Shibo Wang (Google); Tao Wang (Google); Yuanzhong Xu (Google); Zongwei Zhou (Google)
GENO – Optimization for Machine Learning Made Fast and Easy: Sören Laue (Friedrich Schiller University Jena / Data Assessment Solutions GmbH Hannover); Matthias Mitterreiter (Friedrich Schiller University Jena); Joachim Giesen (Friedrich Schiller University Jena)
A Programming System for Model Compression: Vinu Joseph (University of Utah); Saurav Muralidharan (NVIDIA); Animesh Garg (University of Toronto, NVIDIA); Michael Garland (NVIDIA); Ganesh Gopalakrishnan (University of Utah)
Alpine Meadow: A System for Interactive AutoML: Zeyuan Shang (MIT); Emanuel Zgraggen (MIT); Tim Kraska (MIT)
SLIDE: Training Deep Neural Networks with Large Outputs on a CPU faster than a V100-GPU: Beidi Chen (Rice University); Tharun Medini (Rice University); James Farwell (Intel Corporation); Sameh Gobriel (Intel Corporation); Charlie Tai (Intel Corporation); Anshumali Shrivastava (Rice University)
Neural Network Surgery with Sets: Susan Zhang (OpenAI); Jonathan Raiman (OpenAI)
Block-distributed Gradient Boosted Trees: Theodore Vasiloudis (Research Institutes of Sweden); Hyunsu Cho (Amazon Web Services); Henrik Bostrom (KTH Royal Institute of Technology)
5 Parallel Prism: A topology for pipelined implementations of convolutional neural networks using computational memory: Martino Dazzi (IBM Research Zurich); Abu Sebastian (IBM Research Zurich); Pier Andrea Francese (IBM Research Zurich); Thomas Parnell (IBM Research); Luca Benini (ETH Zurich, University of Bologna); Evangelos Eleftheriou (IBM Research)
Breadth-first, Depth-next Training of Random Forests: Andreea Anghel (IBM Research); Nikolas Ioannou (IBM Research); Thomas Parnell (IBM Research); Nikolaos Papandreou (IBM Research Zurich); Celestine Mendler-Dünner (UC Berkeley); Haralampos Pozidis (IBM Research Zurich)
Decentralized Distributed PPO: Erik Wijmans (Georgia Tech); Abhishek Kadian (Facebook AI Research); Ari S. Morcos (Facebook AI Research); Stefan Lee (Georgia Institute of Technology); Irfan Essa (Georgia Institute of Technology); Devi Parikh (Georgia Tech & Facebook AI Research); Manolis Savva (Simon Fraser University); Dhruv Batra (Georgia Tech & Facebook AI Research)
Compiling Classical ML Pipelines into Tensor Computations for One-size-fits-all Prediction Serving: Supun Nakandala (University of California, San Diego); Gyeong-In Yu (Seoul National University); Markus Weimer (Microsoft); Matteo Interlandi (Microsoft)
sktime: A Unified Interface for Machine Learning with Time Series: Markus Löning (University College London); Anthony Bagnall (University of East Anglia); Franz J. Kiraly (University College London)
Dali: Scaling Lazy Compilation & JIT Fusion: Jonathan Raiman (OpenAI)
Tiramisu: A Polyhedral Compiler for Dense and Sparse Deep Learning: Riyadh Baghdadi (MIT); Saman Amarasinghe (Massachusetts Institute of Technology); Michael Carbin (MIT); Fatima Zohra Benhamida (ESI); Alex Renda (MIT); Jonathan Frankle (MIT)
Training Kinetics in 15 Minutes: Large-scale Distributed Training on Videos: Ji Lin (MIT); Chuang Gan (MIT-Watson AI Lab); Song Han (MIT)
AuthorGAN: Improving GAN Reproducibility using a Modular GAN Framework: Raunak Sinha (IBM Research); Anush Sankaran (Independent Researcher); Mayank Vatsa (IIIT-Delhi); Richa Singh (IIIT-Delhi)
NeMo: a toolkit for building AI applications using Neural Modules: Oleksii Kuchaiev (NVIDIA); Jason Li (NVIDIA); Huyen Nguyen (NVIDIA); Oleksii Hrinchuk (NVIDIA); Ryan Leary (NVIDIA); Boris Ginsburg (NVIDIA); Samuel Kriman (NVIDIA); Stanislav Beliaev (NVIDIA); Vitaly Lavrukhin (NVIDIA); Jack Cook (NVIDIA); Patrice Castonguay (NVIDIA); Mariya Popova (NVIDIA); Jocelyn Huang (NVIDIA); Jonathan Cohen (NVIDIA)
Stage-based Hyper-parameter Optimization for Deep Learning: Ahnjae Shin (Seoul National University); Dong-Jin Shin (Seoul National University); Sungwoo Cho (Seoul National University); Do Yoon Kim (Seoul National University); Eunji Jeong (Seoul National University); Gyeong-In Yu (Seoul National University); Byung-Gon Chun (Seoul National University)
LISA: Towards Learned DNA Sequence Search: Darryl Ho (MIT); Jialin Ding (MIT); Sanchit Misra (Intel); Nesime Tatbul (Intel Labs and MIT); Vikram Nathan (MIT); Vasimuddin Md (Intel); Tim Kraska (MIT)
Distributed Asynchronous Domain Adaptation: Towards Making Domain Adaptation More Practical in Real-World Systems: Shaoduo Gan (ETH Zurich); Akhil Mathur (Nokia Bell Labs); Anton Isopoussu (Nokia Bell Labs); Nadia Berthouze (University College London); Nicholas Lane (University of Oxford); Fahim Kawsar (Nokia Bell Labs)
Distributed Training Across the World: Ligeng Zhu (MIT); Song Han (MIT)
Reversible Fixup Networks for Memory-Efficient Training: Vithursan Thangarasa (University of Guelph); Kenyon C.-Y. Tsai (N/A); Graham Taylor (University of Guelph); Urs Koster (Cerebras Systems)