noether.core.distributed.gather
Classes

- AllGatherGradAutograd: Gathers tensors from all processes and supports backward propagation of gradients across processes.
Functions

- get_device_and_bfloat16supported
- get_bool_gather_supported
- all_gather_grad
- all_gather_grad_autograd
- all_gather_nograd
- all_gather_nograd_clipped
- all_reduce_sum_nograd
- all_reduce_sum_grad
- reduce_mean_grad
- reduce_mean_nograd
- reduce_max_grad
- reduce_max_nograd
- all_reduce_mean_grad
- all_reduce_mean_nograd
Module Contents
- class noether.core.distributed.gather.AllGatherGradAutograd

  Bases: torch.autograd.Function

  Gathers tensors from all processes and supports backward propagation of gradients across processes.

  - static forward(ctx, x)
  - static backward(ctx, *grads)
- noether.core.distributed.gather.get_device_and_bfloat16supported()
- noether.core.distributed.gather.get_bool_gather_supported()
- noether.core.distributed.gather.all_gather_grad(x, batch_dim=0)
- noether.core.distributed.gather.all_gather_grad_autograd(x)
- noether.core.distributed.gather.all_gather_nograd(x)
- noether.core.distributed.gather.all_gather_nograd_clipped(x, max_length)
- noether.core.distributed.gather.all_reduce_sum_nograd(x)
- noether.core.distributed.gather.all_reduce_sum_grad(x)
- noether.core.distributed.gather.reduce_mean_grad(x, dest_rank=0)
- noether.core.distributed.gather.reduce_mean_nograd(x, dest_rank=0)
- noether.core.distributed.gather.reduce_max_grad(x, dest_rank=0)
- noether.core.distributed.gather.reduce_max_nograd(x, dest_rank=0)
- noether.core.distributed.gather.all_reduce_mean_grad(x)
- noether.core.distributed.gather.all_reduce_mean_nograd(x)