This issue tracks the discussion on changing the Patmos ISA to use either deferred or split instructions.
Motivation
Some types of instructions cannot be executed without incurring some delay or latency in the pipeline. One example is load instructions, which currently have a 1-cycle delay slot before the loaded value can be used. Another example is multiply or division instructions, which require multiple cycles to execute.
Deferred/split instructions try to address the inefficiency of instructions with latency by allowing the compiler to decide how to manage this latency.
Split instructions
Split instructions "split" a given instruction into two parts: (1) issue the instruction and (2) get the result.
E.g., loads could be split into issuing the load (lwc, load word from data-cache) and then putting the loaded value into a register (glw, get loaded word).
The compiler can then schedule the two parts of the load independently and try to hide the latency by issuing other instructions between them.
Example:
lwc t1 = [r1]     ; issue load of address in r1 to load-register t1
add r2 = r3, r4   ; do something else
add r2 = r2, r5   ; do something else
glw r1 = t1       ; get loaded value from load-register t1 into register r1
add r2 = r2, r1   ; use loaded value
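To make the scheduling benefit concrete, here is a minimal Python sketch that counts stall cycles for a straight-line sequence with a split load. It is only an illustration under assumptions that are not part of this proposal: one instruction issues per cycle, `glw` stalls until its load-register has been filled, and `load_latency` is a hypothetical number of cycles that must pass between `lwc` and the first stall-free `glw`. The function name `stall_cycles` and the tuple encoding of instructions are likewise made up for this sketch.

```python
def stall_cycles(program, load_latency):
    """Count stall cycles for a straight-line program with split loads.

    program: list of (opcode, dest, sources) tuples.
    Assumption: one instruction issues per cycle, and 'glw' stalls until
    the load-register it reads has been filled.
    """
    ready_at = {}  # load-register -> first cycle at which its value is available
    cycle = 0
    stalls = 0
    for opcode, dest, sources in program:
        if opcode == "glw":
            # Wait until the load issued earlier has completed.
            wait = max(0, ready_at[sources[0]] - cycle)
            stalls += wait
            cycle += wait
        elif opcode == "lwc":
            # The value is available load_latency cycles after the issue cycle.
            ready_at[dest] = cycle + 1 + load_latency
        cycle += 1
    return stalls


# The split-load example from this issue: two adds sit between lwc and glw.
scheduled = [
    ("lwc", "t1", ["r1"]),
    ("add", "r2", ["r3", "r4"]),
    ("add", "r2", ["r2", "r5"]),
    ("glw", "r1", ["t1"]),
    ("add", "r2", ["r2", "r1"]),
]

# The same instructions with glw placed directly after lwc.
unscheduled = [scheduled[0], scheduled[3], scheduled[1], scheduled[2], scheduled[4]]

print(stall_cycles(scheduled, load_latency=2))
print(stall_cycles(unscheduled, load_latency=2))
```

With a hypothetical latency of 2, the two independent `add` instructions fully cover the load in the scheduled order, whereas placing `glw` right after `lwc` leaves the whole latency exposed.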
Deferred instructions
Deferred instructions try to address the same problem with a different approach. In addition to its usual operands, a deferred instruction takes an immediate operand that specifies when the result is expected.
The immediate value defines after how many instruction words the result should be available in the target register. The compiler can then issue the instruction early and set the immediate to match the point in the instruction stream where the result is needed.
Example:
lwc r1 = [r1],3   ; issue a deferred load, with the value available to the third following instruction
add r2 = r3, r4   ; do something else
add r2 = r2, r5   ; do something else
add r2 = r2, r1   ; use loaded value
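As a rough illustration of how a compiler might pick the immediate, the Python sketch below returns the distance from a deferred load to the first later instruction that reads its destination register. It assumes one instruction per instruction word and ignores redefinitions of the destination register, minimum load latency, and the limits of the deferral range. The name `deferral_for` and the tuple encoding are hypothetical and not part of the proposal.

```python
def deferral_for(program, load_index):
    """Distance, in instructions, from a load to the first reader of its result.

    program: list of (opcode, dest, sources) tuples.
    Returns None if no later instruction reads the destination register.
    Assumption: one instruction per instruction word.
    """
    _, dest, _ = program[load_index]
    following = program[load_index + 1:]
    for offset, (_, _, sources) in enumerate(following, start=1):
        if dest in sources:
            return offset
    return None


# The deferred-load example from this issue (immediate left for the sketch to compute).
example = [
    ("lwc", "r1", ["r1"]),
    ("add", "r2", ["r3", "r4"]),
    ("add", "r2", ["r2", "r5"]),
    ("add", "r2", ["r2", "r1"]),
]

print(deferral_for(example, 0))
```

For the example above, the first reader of `r1` is the third instruction after the load, which matches the immediate `3` used in the assembly.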
The deferral range is not specified yet, but suitable ranges could be between 32 and 256.