Enable PyTorch/XLA Fully Sharded Data Parallel (FSDP) for a Specific Class of Transformer Models #20774

Closed
AlexWertheim wants to merge 8 commits into huggingface:main from AlexWertheim:xla-fsdp-changes