remove unnecessary sync (#461)
The `nop` instruction only synchronizes within the same threadblock.
Cross-threadblock synchronization is handled by the `barrier`
instruction, so insert a `nop` only if the dependency is within the
same threadblock.
Binyang2014 authored Feb 10, 2025
1 parent e7cff89 commit a6e00cc
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions python/mscclpp/language/ir.py
@@ -131,8 +131,8 @@ def ir_to_json(program: Program):
 # Expand extra dependencies into nop operations
 nop = Op(Instruction.nop, -1, None, None, [])
 for i, dep in enumerate(op.depends):
-    # barrier already syncs all threads
-    if dep.inst != Instruction.barrier:
+    # barrier already syncs all threads, only sync within the same threadblock
+    if dep.inst != Instruction.barrier and dep.tb == op.tb:
         nop.depends.append(dep)
 if len(new_ops) > 0 and (
     new_ops[-1].inst == Instruction.barrier or new_ops[-1].inst == Instruction.nop
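
The effect of the new condition can be illustrated with a minimal standalone sketch. The `Instruction` enum and `Op` class below are simplified stand-ins, not the real classes from `ir.py`; only the `inst`, `tb`, and `depends` fields seen in the diff are modeled, and the `put` instruction and the helper name `nop_dependencies` are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Instruction(Enum):
    # Hypothetical subset of the instruction set, for illustration only.
    nop = "nop"
    barrier = "barrier"
    put = "put"


@dataclass
class Op:
    # Simplified stand-in for the IR's Op class (assumption): only the
    # fields that the dependency filter reads.
    inst: Instruction
    tb: int  # id of the threadblock that runs this op
    depends: List["Op"] = field(default_factory=list)


def nop_dependencies(op: Op) -> List[Op]:
    """Return the dependencies of `op` that would be attached to a nop.

    Mirrors the condition in the diff: skip barrier dependencies and
    skip dependencies that live in a different threadblock.
    """
    return [
        dep
        for dep in op.depends
        if dep.inst != Instruction.barrier and dep.tb == op.tb
    ]


same_tb = Op(Instruction.put, tb=0)
other_tb = Op(Instruction.put, tb=1)
barrier = Op(Instruction.barrier, tb=0)
consumer = Op(Instruction.put, tb=0, depends=[same_tb, other_tb, barrier])

kept = nop_dependencies(consumer)
print([(d.inst.name, d.tb) for d in kept])  # [('put', 0)]
```

Under the old condition the cross-threadblock dependency (`other_tb`) would also have been attached to the `nop`; with the added `dep.tb == op.tb` check only the same-threadblock, non-barrier dependency remains.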