Add support for new opcodes to AS #1

Open
deanm1278 opened this issue Feb 22, 2018 · 0 comments


Add with carry, subtract with borrow for better 64-bit arithmetic.

Rd = Rx + Ry + AC0

Rd = Rx - Ry + AC0 - 1

Rd = Rx + Ry + AC0 (S)

Rd = Rx - Ry + AC0 - 1 (S)
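
The arithmetic behind these four forms can be modelled in a few lines of Python. This is only a sketch of the formulas listed above, not a description of the hardware. The helper names are made up for illustration, and it assumes AC0 holds the carry out of the preceding low-word operation (set when a subtraction does not borrow), which is what the `+ AC0 - 1` term suggests. The (S) saturating variants are not modelled.

```python
MASK32 = 0xFFFFFFFF

def add_with_carry(rx, ry, ac0):
    """Model of Rd = Rx + Ry + AC0; returns (Rd, carry_out)."""
    total = (rx & MASK32) + (ry & MASK32) + ac0
    return total & MASK32, total >> 32

def sub_with_borrow(rx, ry, ac0):
    """Model of Rd = Rx - Ry + AC0 - 1; returns (Rd, carry_out)."""
    # Modulo 2**32, Rx - Ry + AC0 - 1 equals Rx + ~Ry + AC0.
    total = (rx & MASK32) + (~ry & MASK32) + ac0
    return total & MASK32, total >> 32

def add64(a, b):
    """64-bit add built from two 32-bit steps (hypothetical helper)."""
    # Low word: a plain add, modelled here as add-with-carry with AC0 = 0.
    lo, carry = add_with_carry(a & MASK32, b & MASK32, 0)
    hi, _ = add_with_carry((a >> 32) & MASK32, (b >> 32) & MASK32, carry)
    return (hi << 32) | lo

def sub64(a, b):
    """64-bit subtract built from two 32-bit steps (hypothetical helper)."""
    # Low word: a plain subtract, modelled as subtract-with-borrow with AC0 = 1.
    lo, carry = sub_with_borrow(a & MASK32, b & MASK32, 1)
    hi, _ = sub_with_borrow((a >> 32) & MASK32, (b >> 32) & MASK32, carry)
    return (hi << 32) | lo
```

The point of the new opcodes is the second step in `add64` and `sub64`: the high words are combined with the carry from the low words in a single operation.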

Single cycle 32-bit multiplies and multiply accumulates.

A 32-bit result can be produced in an R-register,

A 64-bit result in an R-register pair, e.g. R3:2, with high bits in odd register.

Accumulation is done in A1:0. The accumulator is actually 72 bits because A0.X is not used.

Rd = Rx * Ry mode

Re:d = Rx * Ry mode

A1:0 = Rx * Ry mode

A1:0 += Rx * Ry mode

A1:0 -= Rx * Ry mode

Rd = (A1:0 = Rx * Ry) mode

Rd = (A1:0 += Rx * Ry) mode

Rd = (A1:0 -= Rx * Ry) mode

Rd = A1:0 mode

Re:d = (A1:0 = Rx * Ry) mode

Re:d = (A1:0 += Rx * Ry) mode

Re:d = (A1:0 -= Rx * Ry) mode

Re:d = A1:0 mode
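
To make the register-pair layout concrete, here is a rough Python model of `Re:d = Rx * Ry` and `A1:0 += Rx * Ry`. It assumes a signed integer, non-saturating interpretation and uses an unbounded Python integer in place of the accumulator pair, so it says nothing about saturation, rounding, or accumulator width. The function names are invented for this sketch.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_signed32(v):
    """Interpret the low 32 bits of v as a two's-complement value."""
    v &= MASK32
    return v - (1 << 32) if v & 0x80000000 else v

def mul_to_pair(rx, ry):
    """Re:d = Rx * Ry as signed integers; returns (odd_reg, even_reg)."""
    product = (to_signed32(rx) * to_signed32(ry)) & MASK64
    # High 32 bits go to the odd register, low 32 bits to the even one.
    return product >> 32, product & MASK32

def mac(acc, rx, ry):
    """A1:0 += Rx * Ry, with acc as a plain Python int (no wrap or saturation)."""
    return acc + to_signed32(rx) * to_signed32(ry)
```

For example, multiplying -1 by 2 gives -2, so the odd register holds `0xFFFFFFFF` and the even register holds `0xFFFFFFFE`.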

These modes are supported for the 32-bit multiplies and multiply accumulates.

default fractional rounding

(T) signed fractional truncating

(IS) signed integer

(IS,NS) signed integer, non saturating

(FU) unsigned fractional rounding

(TFU) unsigned fractional truncating

(IU) unsigned integer saturating

(IU,NS) unsigned integer non-saturating

(M) mixed, signed fractional rounding

(M,T) mixed, signed fractional truncating

(M,IS) mixed, signed integer saturating

(M,IS,NS) mixed, signed integer non-saturating

The fractional rounding modes cannot be used with accumulate and extract instructions.

So "R1:0 = (A1:0 = R3 * R4)" will give an error but "R1:0 = (A1:0 = R3 * R4) (T)" will not.

The existing *= operation is single cycle too in Blackfin+.
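
The rule about rounding modes and accumulate-and-extract instructions is the kind of check the assembler needs to make. A minimal sketch of that check follows. The function name and the way the instruction form is passed in are hypothetical, and the set of rounding modes is simply read off the list above (the default, (FU), and (M)).

```python
# Every mode in the list above, written without parentheses; "" is the default.
ALL_MODES = {
    "", "T", "IS", "IS,NS", "FU", "TFU", "IU", "IU,NS",
    "M", "M,T", "M,IS", "M,IS,NS",
}

# Modes described above as "fractional rounding".
ROUNDING_MODES = {"", "FU", "M"}

def mode_allowed(mode, accumulate_and_extract):
    """Return True if `mode` may be used with this kind of instruction.

    accumulate_and_extract is True for forms such as
    R1:0 = (A1:0 = R3 * R4), which update A1:0 and also write registers.
    """
    mode = mode.replace(" ", "").upper()
    if mode not in ALL_MODES:
        return False
    if accumulate_and_extract and mode in ROUNDING_MODES:
        return False
    return True
```

With this, `mode_allowed("", True)` rejects `R1:0 = (A1:0 = R3 * R4)` while `mode_allowed("T", True)` accepts the `(T)` version, matching the example above.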

## Single cycle Complex multiplication.

Operands are R-registers with the 16-bit imaginary part in the high bits and the 16-bit real part in the low bits.

Results can be the same format or R-register pairs containing 32-bit imaginary in odd register and 32-bit real in the even register.

Accumulation is done in A1:0 with the imaginary part in the full 40-bits of A1 and the real part the full 40-bits of A0.

Rd = CMUL(Rx, Ry) mode

Rd = CMUL(Rx, Ry*) mode

Rd = CMUL(Rx*, Ry*) mode

Re:d = CMUL(Rx, Ry) mode

Re:d = CMUL(Rx, Ry*) mode

Re:d = CMUL(Rx*, Ry*) mode

A1:0 = CMUL(Rx, Ry) mode

A1:0 = CMUL(Rx, Ry*) mode

A1:0 = CMUL(Rx*, Ry*) mode

A1:0 += CMUL(Rx, Ry) mode

A1:0 += CMUL(Rx, Ry*) mode

A1:0 += CMUL(Rx*, Ry*) mode

A1:0 -= CMUL(Rx, Ry) mode

A1:0 -= CMUL(Rx, Ry*) mode

A1:0 -= CMUL(Rx*, Ry*) mode

Rd = (A1:0 = CMUL(Rx, Ry)) mode

Rd = (A1:0 = CMUL(Rx, Ry*)) mode

Rd = (A1:0 = CMUL(Rx*, Ry*)) mode

Rd = (A1:0 += CMUL(Rx, Ry)) mode

Rd = (A1:0 += CMUL(Rx, Ry*)) mode

Rd = (A1:0 += CMUL(Rx*, Ry*)) mode

Rd = (A1:0 -= CMUL(Rx, Ry)) mode

Rd = (A1:0 -= CMUL(Rx, Ry*)) mode

Rd = (A1:0 -= CMUL(Rx*, Ry*)) mode

Re:d = (A1:0 = CMUL(Rx, Ry)) mode

Re:d = (A1:0 = CMUL(Rx, Ry*)) mode

Re:d = (A1:0 = CMUL(Rx*, Ry*)) mode

Re:d = (A1:0 += CMUL(Rx, Ry)) mode

Re:d = (A1:0 += CMUL(Rx, Ry*)) mode

Re:d = (A1:0 += CMUL(Rx*, Ry*)) mode

Re:d = (A1:0 -= CMUL(Rx, Ry)) mode

Re:d = (A1:0 -= CMUL(Rx, Ry*)) mode

Re:d = (A1:0 -= CMUL(Rx*, Ry*)) mode

A * after an operand indicates that the operand to the multiply is the complex conjugate of the value in the register.
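
The operand packing and the effect of the `*` suffix can be sketched in Python as below. This treats the 16-bit parts as plain signed integers and ignores saturation, rounding, and fractional scaling, so it illustrates the data layout and the conjugate handling only. The function names are invented for the example. Note that the instruction list above only has the forms `(Rx, Ry)`, `(Rx, Ry*)`, and `(Rx*, Ry*)`, while the sketch accepts any combination.

```python
def to_signed16(v):
    """Interpret the low 16 bits of v as a two's-complement value."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v

def unpack(r):
    """Split a packed register into (real, imag): imag is the high half."""
    return to_signed16(r), to_signed16(r >> 16)

def cmul(rx, ry, conj_x=False, conj_y=False):
    """Integer complex product of two packed operands; returns (real, imag)."""
    a, b = unpack(rx)
    c, d = unpack(ry)
    if conj_x:
        b = -b  # Rx*: negate the imaginary part of Rx
    if conj_y:
        d = -d  # Ry*: negate the imaginary part of Ry
    # (a + bi)(c + di) = (ac - bd) + (ad + bc)i
    return a * c - b * d, a * d + b * c
```

In the `Re:d` forms the imaginary result would land in the odd register and the real result in the even one.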

These modes are supported for complex multiplies and multiply accumulates.

default signed fractional saturating rounding

(T) signed fractional saturating truncating (extract accumulator to single R-register operations only.)

(IS) signed integer saturating

## New Accumulator Loads.

A couple of new options have been added to the DSP32ALU instruction to help initialize A1:0 for complex and 32-bit multiply accumulates.

A1 = Rx (X), A0 = Ry (Z) // sign extend Rx:y into A1:0

A1 = Rx (X), A0 = Ry (X)

A1 = Rx (Z), A0 = Ry (Z) // zero extend Rx:y into A1:0

A1 = Rx (Z), A0 = Ry (X)

Initializing the accumulator pair to zero is supported by the existing A1 = A0 = 0 instruction.
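
A small Python model of the (X) and (Z) options, assuming each accumulator is 40 bits wide as described for A1 and A0 above. The function name is made up for this sketch.

```python
MASK32 = 0xFFFFFFFF

def load_acc(r, ext):
    """Return the 40-bit accumulator value for A = R (X) or A = R (Z)."""
    if ext not in ("X", "Z"):
        raise ValueError("ext must be 'X' or 'Z'")
    r &= MASK32
    if ext == "X" and r & 0x80000000:
        # Sign extend: copy bit 31 into the top 8 bits of the accumulator.
        return r | 0xFF00000000
    # Zero extend, or a non-negative value being sign extended.
    return r
```

Loading a signed 64-bit value held in Rx:y then corresponds to `load_acc(rx, "X")` for A1 and `load_acc(ry, "Z")` for A0, which is the first form in the list.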

New hardware loop instructions for zero-trip and known iteration loops.

LSETUPZ (lab) LCx=Py // Jumps over loop if Py==0 when executed

LSETUPZ (lab) LCx=Py>>1 // Jumps over loop if Py==0 when executed

LSETUPLEZ (lab) LCx=Py // Jumps over loop if Py<=0 when executed

LSETUPLEZ (lab) LCx=Py>>1 // Jumps over loop if Py<=0 when executed

LSETUP (lab) LCx=imm // Loop with immediate trip count
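
The skip conditions in the comments above can be written out as a short Python sketch. It follows the comments literally: the test is made on Py, not on the shifted value, and the source does not say what happens when Py is non-zero but Py>>1 is zero, so that case is simply passed through. The function name and the return convention are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(v):
    """Interpret the low 32 bits of v as a two's-complement value."""
    v &= MASK32
    return v - (1 << 32) if v & 0x80000000 else v

def loop_setup(kind, py, halve=False):
    """Return (skip_loop, loop_count) for LSETUPZ or LSETUPLEZ.

    halve=True models the LCx=Py>>1 forms. loop_count is None when the
    loop body is jumped over.
    """
    py &= MASK32
    if kind == "LSETUPZ":
        skip = py == 0
    elif kind == "LSETUPLEZ":
        skip = to_signed32(py) <= 0
    else:
        raise ValueError(kind)
    if skip:
        return True, None
    return False, (py >> 1) if halve else py
```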

Jumps and calls with 32-bit immediate target

JUMP.A imm32 // absolute address
JUMP.XL imm32 // PC-relative
CALL.A imm32
CALL.XL imm32

As always, the assembler and linker will choose the right call or jump for you if you use a label and do not specify an extension.
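
One way the choice could work for jumps is to pick the shortest form whose PC-relative range covers the target. The sketch below assumes the usual Blackfin ranges for the existing JUMP.S (13-bit even offset) and JUMP.L (25-bit even offset) forms and falls back to JUMP.XL otherwise. Whether the real assembler and linker prefer JUMP.XL or JUMP.A in that case, and how they relax between forms, is not stated here, so treat this as an illustration rather than the actual algorithm.

```python
def choose_jump(offset):
    """Pick a jump form for a PC-relative byte offset (hypothetical helper)."""
    if offset % 2:
        raise ValueError("jump targets must be 2-byte aligned")
    if -4096 <= offset <= 4094:
        return "JUMP.S"    # assumed range of the 13-bit form
    if -16777216 <= offset <= 16777214:
        return "JUMP.L"    # assumed range of the 25-bit form
    return "JUMP.XL"       # new 32-bit PC-relative form
```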

Move 32-bit value to register

Rd = imm32

Pd = imm32

ureg = imm32 // works with all registers allowed in the register move instruction

Loads and stores with 32-bit immediate address

Rd = [ imm32 ]

Pd = [ imm32 ]

[ imm32 ] = Rx

[ imm32 ] = Px

Rd = W[ imm32 ] (Z)

Rd = W[ imm32 ] (X)

W[ imm32 ] = Rx

Rd = B[ imm32 ] (Z)

Rd = B[ imm32 ] (X)

B[ imm32 ] = Rx

Rd_hi = W[ imm32 ]

Rd_lo = W[ imm32 ]

W[ imm32 ] = Rx_hi
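
For reference, the load variants differ only in access width and in how the value is placed in the destination register. A rough Python model follows, with a `bytes` object standing in for memory and little-endian byte order assumed. The function names are invented for the example.

```python
def load(mem, addr, size, ext):
    """Model Rd = W[addr] / B[addr] with (Z) or (X); size is 2 or 1 bytes."""
    raw = int.from_bytes(mem[addr:addr + size], "little")
    if ext == "X" and raw >> (8 * size - 1):
        raw -= 1 << (8 * size)  # sign extend a negative value
    return raw & 0xFFFFFFFF

def load_half(rd, mem, addr, hi):
    """Model Rd_hi = W[addr] (hi=True) or Rd_lo = W[addr] (hi=False)."""
    half = int.from_bytes(mem[addr:addr + 2], "little")
    if hi:
        return (rd & 0x0000FFFF) | (half << 16)  # low half of Rd kept
    return (rd & 0xFFFF0000) | half              # high half of Rd kept
```

The half-register forms leave the other half of Rd unchanged, which is what separates them from the (Z) and (X) loads.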

A few changes that improve orthogonality

Rd = !CC // filled in an encoding hole

ureg = ureg // restrictions on which registers can be copied to which have been removed

compute || preg-access || preg-load // P-register addressing allowed in both DAG slots

The last of these is quite pervasive as there is now no need to load an I-register to enable dual load/store. I find when writing assembler for BF70x I only use I- and M-registers for circular buffering.

The assembler will accept alternative syntax for immediate shift instructions or new error checks in the assembler

System instructions

STI IDLE Rx // combines STI and IDLE to avoid a race condition
