Denis Barthou
Automatic Parallelization for large AI models
Modern large AI models are designed with a domain-specific language (DSL), and the computing power required for their training is driving the design of dedicated supercomputers, including accelerators tailored to a family of models. This defines a unique playground for software parallelization and optimization, different from that of typical High Performance Computing applications: starting from a high-level description of the large-scale computation, the objective is to automatically organize, parallelize, and schedule all computations, from the compute nodes down to the vector/matrix units of the accelerators. We will describe some of the parallelization techniques developed in the MindSpore AI framework and discuss their limitations, challenges, and perspectives, in particular for memory optimization and for inference.