MLSys branches and leaves
2023-07-05
Frameworks
- [VLDB '20] PyTorch Distributed: Experiences on Accelerating Data Parallel Training
- [NSDI '19] JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs
- [NeurIPS '19] PyTorch: An Imperative Style, High-Performance Deep Learning Library
- [OSDI '18] Ray: A Distributed Framework for Emerging AI Applications
- [OSDI '16] TensorFlow: A System for Large-Scale Machine Learning
Parallel and Distributed Systems
- [MLSys '23] PipeFisher: Efficient Training of Large Language Models Using Pipelining and Fisher Information Matrices
- [MLSys '23] Tutel: Adaptive Mixture-of-Experts at Scale
- [MLSys '23] Breadth-First Pipeline Parallelism