AVR200: Multiply and Divide Routines: 8-Bit MCU With Downloadable Flash Application Note

The document describes optimized multiplication and division routines for 8-bit and 16-bit signed and unsigned numbers on AVR microcontrollers. It provides subroutines that perform 8x8, 16x16, 8/8, and 16/16 operations with optimized code size and execution speed. An example shows an 8x8 multiplication can be performed in 1.7 microseconds using the speed optimized routine. Booth's algorithm is used for signed 8x8 multiplication.

Uploaded by

Richard Krill

AVR200: Multiply and Divide Routines

Features
• 8 and 16-bit Implementations
• Signed & Unsigned Routines
• Speed & Code Size Optimized Routines
• Runnable Example Programs
• Speed is Comparable with HW Multipliers/Dividers
  Example: 8 × 8 Mul in 1.7 µs, 16 × 16 Mul in 6.8 µs (20 MHz)
• Extremely Compact Code

Introduction
This application note lists subroutines for multiplication and division of 8 and 16-bit signed and unsigned numbers. A listing of all implementations with key performance specifications is given in Table 1.

Table 1. Performance Figures Summary

Application                                        Code Size (Words)  Execution Time (Cycles)
8 x 8 = 16 bit unsigned (Code Optimized)           9                  58
8 x 8 = 16 bit unsigned (Speed Optimized)          34                 34
8 x 8 = 16 bit signed (Code Optimized)             10                 73
16 x 16 = 32 bit unsigned (Code Optimized)         14                 153
16 x 16 = 32 bit unsigned (Speed Optimized)        105                105
16 x 16 = 32 bit signed (Code Optimized)           16                 218
8 / 8 = 8 + 8 bit unsigned (Code Optimized)        14                 97
8 / 8 = 8 + 8 bit unsigned (Speed Optimized)       66                 58
8 / 8 = 8 + 8 bit signed (Code Optimized)          22                 103
16 / 16 = 16 + 16 bit unsigned (Code Optimized)    19                 243
16 / 16 = 16 + 16 bit unsigned (Speed Optimized)   196                173
16 / 16 = 16 + 16 bit signed (Code Optimized)      39                 255
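
The cycle counts in Table 1 become execution times once divided by the clock frequency. The following is a minimal sketch of that conversion in Python; the function name `cycles_to_us` is an illustrative assumption, and the call/return overhead is not included.

```python
def cycles_to_us(cycles, clock_hz):
    """Convert a cycle count into microseconds at the given clock frequency."""
    return cycles / clock_hz * 1e6


# Speed optimized 8 x 8 unsigned multiply: 34 cycles at a 20 MHz clock.
mul8_us = cycles_to_us(34, 20e6)
```

With 34 cycles at 20 MHz this gives roughly the 1.7 µs quoted for the 8 × 8 multiplication.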

The application note listing consists of two files:
“avr200.asm”: Code size optimized multiply and divide routines.
“avr200b.asm”: Speed optimized multiply and divide routines.

8 x 8 = 16 Unsigned Multiplication - “mpy8u”
Both program files contain a routine called “mpy8u” which performs unsigned 8-bit multiplication. Both implementations are based on the same algorithm. The code size optimized implementation, however, uses looped code, whereas the speed optimized code is a straight-line code implementation. Figure 1 shows the flow chart for the code size optimized version.

0936A-A–8/97

Algorithm Description
The algorithm for the Code Size optimized version is as follows:
1. Clear result High byte.
2. Load loop counter with 8.
3. Shift right multiplier.
4. If carry (previous bit 0 of multiplier) set, add multiplicand to result High byte.
5. Shift right result High byte into result Low byte/multiplier.
6. Shift right result Low byte/multiplier.
7. Decrement loop counter.
8. If loop counter not zero, goto Step 4.
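
The eight steps above can be modelled in a few lines of Python. This is only a sketch of the described shift-and-add procedure, not the listing from “avr200.asm”: the variables `high`, `low` and `carry` stand in for the result High byte, the shared multiplier/result Low byte and the carry flag, and those names are assumptions made for illustration.

```python
def mpy8u(multiplicand, multiplier):
    """8 x 8 -> 16-bit unsigned multiply by shift-and-add (steps 1 to 8)."""
    mc = multiplicand & 0xFF
    low = multiplier & 0xFF   # multiplier and result Low byte share one register
    high = 0                  # step 1: clear result High byte
    carry = low & 1           # step 3: shift multiplier right, bit 0 lands in carry
    low >>= 1
    for _ in range(8):        # step 2: loop counter = 8 (steps 7 and 8 close the loop)
        if carry:             # step 4: add multiplicand to result High byte
            high += mc        # bit 8 of this sum plays the role of the carry flag
        bit_out = high & 1    # step 5: rotate High byte right, carry enters bit 7
        high >>= 1
        carry = low & 1       # step 6: rotate Low byte right, next multiplier bit to carry
        low = (low >> 1) | (bit_out << 7)
    return (high << 8) | low
```

Each pass consumes one multiplier bit from the Low byte and frees one bit position for the result, which is why the two can share a register.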

Figure 1. “mpy8u” Flow Chart (Code Size Optimized Implementation)

Usage
The usage of “mpy8u” is the same for both versions:
1. Load register variables “mp8u” and “mc8u” with the multiplier and multiplicand, respectively.
2. Call “mpy8u”.
3. The 16-bit result is found in the two register variables “m8uH” (High byte) and “m8uL” (Low byte).
Observe that to minimize register usage, code and execution time, the multiplier and result Low byte share the same register.
Performance
Table 2. “mpy8u” Register Usage (Code Size Optimized Implementation)

Register  Input                   Internal                  Output
R16       “mc8u” - multiplicand
R17       “mp8u” - multiplier                               “m8uL” - result Low byte
R18                                                         “m8uH” - result High byte
R19                               “mcnt8u” - loop counter

Table 3. “mpy8u” Performance Figures (Code Size Optimized Implementation)

Parameter Value
Code Size (Words) 9 + return
Execution Time (Cycles) 58 + return
Register Usage • Low registers :None
• High registers :4
• Pointers :None
Interrupts Usage None
Peripherals Usage None

Table 4. “mpy8u” Register Usage (Straight-line Implementation)

Register  Input                   Internal                  Output
R16       “mc8u” - multiplicand
R17       “mp8u” - multiplier                               “m8uL” - result Low byte
R18                                                         “m8uH” - result High byte

Table 5. “mpy8u” Performance Figures (Straight-Line Implementation)

Parameter Value
Code Size (Words) 34 + return
Execution Time (Cycles) 34 + return
Register Usage • Low registers :None
• High registers :3
• Pointers :None
Interrupts Usage None
Peripherals Usage None

8 x 8 = 16 Signed Multiplication - “mpy8s”
This subroutine, which is found in “avr200.asm”, implements signed 8 x 8 multiplication. Negative numbers are represented as 2's complement numbers. The application is an implementation of Booth's algorithm. The algorithm provides both small and fast code. However, it has one limitation that the user should bear in mind: if all 16 bits of the result are needed, the algorithm fails when used with the most negative number (-128) as the multiplicand.

Algorithm Description
The algorithm for signed 8 x 8 multiplication is as follows:
1. Clear result High byte and carry.
2. Load loop counter with 8.
3. If carry (previous bit 0 of multiplier) set, add multiplicand to result High byte.
4. If current bit 0 of multiplier set, subtract multiplicand from result High byte.
5. Shift right result High byte into result Low byte/multiplier.
6. Shift right result Low byte/multiplier.
7. Decrement loop counter.
8. If loop counter not zero, goto Step 3.
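
The signed multiplication steps can be sketched in Python as follows. This is a model of the described Booth-style loop under stated assumptions, not the “avr200.asm” listing: registers are emulated with 8-bit masking, the shift of the High byte is assumed to be an arithmetic (sign-preserving) shift, and the variable names are illustrative.

```python
def mpy8s(multiplicand, multiplier):
    """8 x 8 -> 16-bit signed multiply following the Booth-style steps 1 to 8."""
    mc = multiplicand & 0xFF
    low = multiplier & 0xFF            # multiplier and result Low byte share one register
    high = 0                           # step 1: clear result High byte and carry
    carry = 0
    for _ in range(8):                 # step 2: loop counter = 8
        if carry:                      # step 3: previous multiplier bit was set
            high = (high + mc) & 0xFF
        if low & 1:                    # step 4: current multiplier bit is set
            high = (high - mc) & 0xFF
        bit_out = high & 1             # step 5: arithmetic shift right of the High byte
        high = (high >> 1) | (high & 0x80)
        carry = low & 1                # step 6: rotate Low byte right through carry
        low = (low >> 1) | (bit_out << 7)
    result = (high << 8) | low
    return result - 0x10000 if result & 0x8000 else result
```

In this model a multiplicand of -128 makes the 8-bit High byte overflow on the first subtraction, which is the limitation described for this routine; for example `mpy8s(-128, 1)` does not return -128.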

Figure 2. “mpy8s” Flow Chart

Usage
The usage of “mpy8s” is as follows:
1. Load register variables “mp8s” and “mc8s” with the multiplier and multiplicand, respectively.
2. Call “mpy8s”.
3. The 16-bit result is found in the two register variables “m8sH” (High byte) and “m8sL” (Low byte).
Observe that to minimize register usage, code and execution time, the multiplier and result Low byte share the same register.
Performance
Table 6. “mpy8s” Register Usage

Register  Input                   Internal                  Output
R16       “mc8s” - multiplicand
R17       “mp8s” - multiplier                               “m8sL” - result Low byte
R18                                                         “m8sH” - result High byte
R19                               “mcnt8s” - loop counter

Table 7. “mpy8s” Performance Figures

Parameter Value
Code Size (Words) 10 + return
Execution Time (Cycles) 73 + return
Register Usage • Low registers :None
• High registers :4
• Pointers :None
Interrupts Usage None
Peripherals Usage None

16 x 16 = 32 Unsigned Multiplication - “mpy16u”
Both program files contain a routine called “mpy16u” which performs unsigned 16-bit multiplication. Both implementations are based on the same algorithm. The code size optimized implementation, however, uses looped code, whereas the speed optimized code is a straight-line code implementation. Figure 3 shows the flow chart for the Code Size optimized (looped) version.

Algorithm Description
The algorithm for the Code Size optimized version is as follows:
1. Clear result High word (Bytes 2 and 3).
2. Load loop counter with 16.
3. Shift multiplier right.
4. If carry (previous bit 0 of multiplier Low byte) set, add multiplicand to result High word.
5. Shift right result High word into result Low word/multiplier.
6. Shift right Low word/multiplier.
7. Decrement loop counter.
8. If loop counter not zero, goto Step 4.
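
On an 8-bit machine the 16-bit steps have to be carried out one byte at a time, with the carry linking the bytes. The Python sketch below models that at byte level; it is an illustration under assumptions, not the “avr200.asm” listing. The helper `ror` (rotate right through carry) and the local names `m0` to `m3` for result bytes 0 to 3 are hypothetical.

```python
def ror(byte, carry_in):
    """Rotate an 8-bit value right through carry; returns (new_byte, carry_out)."""
    return (byte >> 1) | (carry_in << 7), byte & 1


def mpy16u(multiplicand, multiplier):
    """16 x 16 -> 32-bit unsigned multiply, modelled on four byte-wide registers."""
    mc_lo, mc_hi = multiplicand & 0xFF, (multiplicand >> 8) & 0xFF
    m0, m1 = multiplier & 0xFF, (multiplier >> 8) & 0xFF  # multiplier shares bytes 0 and 1
    m2 = m3 = 0                          # step 1: clear result High word
    m1, carry = ror(m1, 0)               # step 3: shift the 16-bit multiplier right
    m0, carry = ror(m0, carry)
    for _ in range(16):                  # step 2: loop counter = 16
        if carry:                        # step 4: 16-bit add done as add + add-with-carry
            total = m2 + mc_lo
            m2 = total & 0xFF
            total = m3 + mc_hi + (total >> 8)
            m3, carry = total & 0xFF, total >> 8
        m3, carry = ror(m3, carry)       # step 5: rotate High word right
        m2, carry = ror(m2, carry)
        m1, carry = ror(m1, carry)       # step 6: rotate Low word/multiplier right
        m0, carry = ror(m0, carry)
    return (m3 << 24) | (m2 << 16) | (m1 << 8) | m0
```

The carry left by the last rotate of byte 0 is the next multiplier bit, so the loop can return directly to the conditional add.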

Figure 3. “mpy16u” Flow Chart (Code Size Optimized Implementation)

Usage
The usage of “mpy16u” is the same for both versions:
1. Load register variables “mp16uL”/“mp16uH” with multiplier Low and High byte, respectively.
2. Load register variables “mc16uL”/“mc16uH” with multiplicand Low and High byte, respectively.
3. Call “mpy16u”
4. The 32-bit result is found in the four-byte register vari-
able “m16u3:m16u2:m16u1:m16u0”.
Observe that to minimize register usage, code and execu-
tion time, the multiplier and result Low word share the same
registers.

Performance
Table 8. “mpy16u” Register Usage (Code Size Optimized Implementation)

Register  Input                               Internal                    Output
R16       “mc16uL” - multiplicand low byte
R17       “mc16uH” - multiplicand high byte
R18       “mp16uL” - multiplier low byte                                  “m16u0” - result byte 0
R19       “mp16uH” - multiplier high byte                                 “m16u1” - result byte 1
R20                                                                       “m16u2” - result byte 2
R21                                                                       “m16u3” - result byte 3
R22                                           “mcnt16u” - loop counter

Table 9. “mpy16u” Performance Figures (Code Size Optimized Implementation)

Parameter Value
Code Size (Words) 14 + return
Execution Time (Cycles) 153 + return
Register Usage • Low registers :None
• High registers :7
• Pointers :None
Interrupts Usage None
Peripherals Usage None

Table 10. “mpy16u” Register Usage (Straight-line Implementation)

Register  Input                               Internal                    Output
R16       “mc16uL” - multiplicand low byte
R17       “mc16uH” - multiplicand high byte
R18       “mp16uL” - multiplier low byte                                  “m16u0” - result byte 0
R19       “mp16uH” - multiplier high byte                                 “m16u1” - result byte 1
R20                                                                       “m16u2” - result byte 2
R21                                                                       “m16u3” - result byte 3

Table 11. “mpy16u” Performance Figures (Straight-Line Implementation)

Parameter Value
Code Size (Words) 105 + return
Execution Time (Cycles) 105 + return
Register Usage • Low registers :None
• High registers :6
• Pointers :None
Interrupts Usage None
Peripherals Usage None

16 x 16 = 32 Signed Multiplication - “mpy16s”
This subroutine, which is found in “avr200.asm”, implements signed 16 x 16 multiplication. Negative numbers are represented as 2's complement numbers. The application is an implementation of Booth's algorithm. The algorithm provides both small and fast code. However, it has one limitation that the user should bear in mind: if all 32 bits of the result are needed, the algorithm fails when used with the most negative number (-32768) as the multiplicand.

Algorithm Description
The algorithm for signed 16 x 16 multiplication is as follows:
1. Clear result High word (Bytes 2 & 3) and carry.
2. Load loop counter with 16.
3. If carry (previous bit 0 of multiplier Low byte) set, add multiplicand to result High word.
4. If current bit 0 of multiplier Low byte set, subtract multiplicand from result High word.
5. Shift right result High word into result Low word/multiplier.
6. Shift right Low word/multiplier.
7. Decrement loop counter.
8. If loop counter not zero, goto Step 3.

Figure 4. “mpy16s” Flow Chart

Usage
The usage of “mpy16s” is as follows:
1. Load register variables “mp16sL”/“mp16sH” with multiplier Low and High byte, respectively.
2. Load register variables “mc16sL”/“mc16sH” with multiplicand Low and High byte, respectively.
3. Call “mpy16s”.
4. The 32-bit result is found in the four-byte register variable “m16s3:m16s2:m16s1:m16s0”.
Observe that to minimize register usage, code and execution time, the multiplier and result Low word share the same registers.
Performance
Table 12. “mpy16s” Register Usage

Register  Input                               Internal                    Output
R16       “mc16sL” - multiplicand low byte
R17       “mc16sH” - multiplicand high byte
R18       “mp16sL” - multiplier low byte                                  “m16s0” - result byte 0
R19       “mp16sH” - multiplier high byte                                 “m16s1” - result byte 1
R20                                                                       “m16s2” - result byte 2
R21                                                                       “m16s3” - result byte 3
R22                                           “mcnt16s” - loop counter

Table 13. “mpy16s” Performance Figures

Parameter Value
Code Size (Words) 16 + return
Execution Time (Cycles) 218 + return
Register Usage • Low registers :None
• High registers :7
• Pointers :None
Interrupts Usage None
Peripherals Usage None

8 / 8 = 8 + 8 Unsigned Division - “div8u”
Both program files contain a routine called “div8u” which performs unsigned 8-bit division. Both implementations are based on the same algorithm. The code size optimized implementation, however, uses looped code, whereas the speed optimized code is a straight-line code implementation. Figure 5 shows the flow chart for the code size optimized version.

Algorithm Description
The algorithm for unsigned 8 / 8 division (Code Size optimized code) is as follows:
1. Clear remainder and carry.
2. Load loop counter with 9.
3. Shift left dividend into carry.
4. Decrement loop counter.
5. If loop counter = 0, return.
6. Shift left carry (from dividend/result) into remainder.
7. Subtract divisor from remainder.
8. If result negative, add back divisor, clear carry and goto Step 3.
9. Set carry and goto Step 3.
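
The division steps describe a restoring shift-and-subtract loop in which the dividend register gradually turns into the result. A minimal Python sketch of those steps follows; it models the description rather than reproducing the assembly listing, and the variable names are assumptions chosen to echo the register variables.

```python
def div8u(dividend, divisor):
    """8 / 8 unsigned restoring division; returns (result, remainder)."""
    dd = dividend & 0xFF          # dividend register, ends up holding the result
    dv = divisor & 0xFF
    rem = 0                       # step 1: clear remainder and carry
    carry = 0
    count = 9                     # step 2: 8 quotient bits plus one final shift
    while True:
        dd = (dd << 1) | carry    # step 3: rotate dividend left through carry
        carry, dd = dd >> 8, dd & 0xFF
        count -= 1                # step 4
        if count == 0:            # step 5
            return dd, rem
        rem = ((rem << 1) | carry) & 0xFF   # step 6: shift carry into remainder
        rem -= dv                 # step 7
        if rem < 0:               # step 8: restore, quotient bit is 0
            rem += dv
            carry = 0
        else:                     # step 9: quotient bit is 1
            carry = 1
```

The loop counter is 9 rather than 8 because the last quotient bit is still sitting in the carry after the eighth subtraction and needs one more shift into the result.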

Figure 5. “div8u” Flow Chart (Code Size Optimized Implementation)

Usage
The usage of “div8u” is the same for both implementations
and is described in the following procedure:
1. Load register variable “dd8u” with the dividend (the
number to be divided).
2. Load register variable “dv8u” with the divisor (the divid-
ing number).
3. Call “div8u”.
4. The result is found in “dres8u” and the remainder in
“drem8u”.

Performance
Table 14. “div8u” Register Usage (Code Size Optimized Version)

Register  Input               Internal                  Output
R15                                                     “drem8u” - remainder
R16       “dd8u” - dividend                             “dres8u” - result
R17       “dv8u” - divisor
R18                           “dcnt8u” - loop counter

Table 15. “div8u” Performance Figures (Code Size Optimized Version)

Parameter Value
Code Size (Words) 14
Execution Time (Cycles) 97
Register Usage • Low registers :1
• High registers :3
• Pointers :None
Interrupts Usage None
Peripherals Usage None

Table 16. “div8u” Register Usage (Speed Optimized Version)

Table 17. “div8u” Performance Figures (Speed Optimized Version)

Parameter Value
Code Size (Words) 66
Execution Time (Cycles) 58
Register Usage • Low registers :1
• High registers :2
• Pointers :None
Interrupts Usage None
Peripherals Usage None

8 / 8 = 8 + 8 Signed Division - “div8s”

The subroutine “div8s” implements signed 8-bit division. The implementation is Code Size optimized. If negative, the input values shall be represented in 2's complement form.

Algorithm Description
The algorithm for signed 8 / 8 division is as follows:
1. XOR dividend and divisor and store in a Sign register.
2. If MSB of dividend set, negate dividend.
3. If MSB of divisor set, negate divisor.
4. Clear remainder and carry.
5. Load loop counter with 9.
6. Shift left dividend into carry.
7. Decrement loop counter.
8. If loop counter ≠ 0, goto step 11.
9. If MSB of Sign register set, negate result.
10. Return
11. Shift left carry (from dividend/result) into remainder
12. Subtract divisor from remainder.
13. If result negative, add back divisor, clear carry and goto
Step 6.
14. Set carry and goto Step 6.
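
The signed procedure can be sketched in Python as a sign fix-up wrapped around the unsigned loop. This is an illustrative model only, assuming that step 3 negates the divisor when its MSB is set; the variable names are hypothetical. Because the steps negate only the result, the remainder returned here is the remainder of the magnitudes.

```python
def div8s(dividend, divisor):
    """8 / 8 signed division following steps 1 to 14; returns (result, remainder)."""
    dd = dividend & 0xFF
    dv = divisor & 0xFF
    sign = dd ^ dv                      # step 1: MSB set when the signs differ
    if dd & 0x80:                       # step 2: make dividend positive
        dd = (-dd) & 0xFF
    if dv & 0x80:                       # step 3 (assumed): make divisor positive
        dv = (-dv) & 0xFF
    rem, carry = 0, 0                   # step 4
    count = 9                           # step 5
    while True:
        dd = (dd << 1) | carry          # step 6: rotate dividend left through carry
        carry, dd = dd >> 8, dd & 0xFF
        count -= 1                      # step 7
        if count == 0:                  # step 8: fall through to steps 9 and 10
            break
        rem = ((rem << 1) | carry) & 0xFF   # step 11
        rem -= dv                       # step 12
        if rem < 0:                     # step 13: restore, quotient bit is 0
            rem += dv
            carry = 0
        else:                           # step 14: quotient bit is 1
            carry = 1
    if sign & 0x80:                     # step 9: negate result
        dd = (-dd) & 0xFF
    result = dd - 0x100 if dd & 0x80 else dd
    return result, rem                  # step 10
```

For example, `div8s(-7, 2)` yields a result of -3 with a remainder of 1 in this model.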

Figure 6. “div8s” Flow Chart

Usage
The usage of “div8s” follows the procedure below:
1. Load register variable “dd8s” with the dividend (the number to be divided).
2. Load register variable “dv8s” with the divisor (the dividing number).
3. Call “div8s”.
4. The result is found in “dres8s” and the remainder in “drem8s”.
Performance
Table 18. “div8s” Register Usage

Register  Input               Internal                  Output
R14                           “d8s” - sign register
R15                                                     “drem8s” - remainder
R16       “dd8s” - dividend                             “dres8s” - result
R17       “dv8s” - divisor
R18                           “dcnt8s” - loop counter

Table 19. “div8s” Performance Figures

Parameter Value
Code Size (Words) 22
Execution Time (Cycles) 103
Register Usage • Low registers :2
• High registers :3
• Pointers :None
Interrupts Usage None
Peripherals Usage None

16 / 16 = 16 + 16 Unsigned Division - “div16u”
Both program files contain a routine called “div16u” which performs unsigned 16-bit division. Both implementations are based on the same algorithm. The code size optimized implementation, however, uses looped code, whereas the speed optimized code is a straight-line code implementation. Figure 7 shows the flow chart for the code size optimized version.

Algorithm Description
The algorithm for unsigned 16 / 16 division (Code Size optimized code) is as follows:
1. Clear remainder and carry.
2. Load loop counter with 17.
3. Shift left dividend into carry.
4. Decrement loop counter.
5. If loop counter = 0, return.
6. Shift left carry (from dividend/result) into remainder.
7. Subtract divisor from remainder.
8. If result negative, add back divisor, clear carry and goto Step 3.
9. Set carry and goto Step 3.

Figure 7. “div16u” Flow Chart (Code Size Optimized Implementation)

Usage
The usage of “div16u” is the same for both implementations
and is described in the following procedure:
1. Load the 16-bit register variable “dd16uH:dd16uL” with
the dividend (the number to be divided).
2. Load the 16-bit register variable “dv16uH:dv16uL” with
the divisor (the dividing number).
3. Call “div16u”.
4. The result is found in “dres16u” and the remainder in
“drem16u”.

Performance
Table 20. “div16u” Register Usage (Code Size Optimized Version)

Register  Input                           Internal                    Output
R14                                                                   “drem16uL” - remainder low byte
R15                                                                   “drem16uH” - remainder high byte
R16       “dd16uL” - dividend low byte                                “dres16uL” - result low byte
R17       “dd16uH” - dividend high byte                               “dres16uH” - result high byte
R18       “dv16uL” - divisor low byte
R19       “dv16uH” - divisor high byte
R20                                       “dcnt16u” - loop counter

Table 21. “div16u” Performance Figures (Code Size Optimized Version)

Parameter Value
Code Size (Words) 19
Execution Time (Cycles) 243
Register Usage • Low registers :2
• High registers :5
• Pointers :None
Interrupts Usage None
Peripherals Usage None

Table 22. “div16u” Register Usage (Speed Optimized Version)

Register  Input                           Internal                    Output
R14                                                                   “drem16uL” - remainder low byte
R15                                                                   “drem16uH” - remainder high byte
R16       “dd16uL” - dividend low byte                                “dres16uL” - result low byte
R17       “dd16uH” - dividend high byte                               “dres16uH” - result high byte
R18       “dv16uL” - divisor low byte
R19       “dv16uH” - divisor high byte

Table 23. “div16u” Performance Figures (Speed Optimized Version)

Parameter Value
Code Size (Words) 196 + return
Average Execution Time 173
(Cycles)
Register Usage • Low registers :2
• High registers :4
• Pointers :None
Interrupts Usage None
Peripherals Usage None

16 / 16 = 16 + 16 Signed Division - “div16s”
The subroutine “div16s” implements signed 16-bit division. The implementation is Code Size optimized. If negative, the input values shall be represented in 2's complement form.
Algorithm Description
The algorithm for signed 16 / 16 division is as follows:
1. XOR dividend and divisor High bytes and store in a Sign register.
2. If MSB of dividend High byte set, negate dividend.
3. If MSB of divisor High byte set, negate divisor.
4. Clear remainder and carry.
5. Load loop counter with 17.
6. Shift left dividend into carry
7. Decrement loop counter.
8. If loop counter ≠ 0, goto step 11.
9. If MSB of Sign register set, negate result.
10. Return
11. Shift left carry (from dividend/result) into remainder
12. Subtract divisor from remainder.
13. If result negative, add back divisor, clear carry and goto
Step 6.
14. Set carry and goto Step 6.

Figure 8. “div16s” Flow Chart

Usage
The usage of “div16s” is described in the following procedure:
1. Load the 16-bit register variable “dd16sH:dd16sL” with the dividend (the number to be divided).
2. Load the 16-bit register variable “dv16sH:dv16sL” with the divisor (the dividing number).
3. Call “div16s”.
4. The result is found in “dres16s” and the remainder in “drem16s”.
Performance
Table 24. “div16s” Register Usage

Register  Input                           Internal                    Output
R14                                                                   “drem16sL” - remainder low byte
R15                                                                   “drem16sH” - remainder high byte
R16       “dd16sL” - dividend low byte                                “dres16sL” - result low byte
R17       “dd16sH” - dividend high byte                               “dres16sH” - result high byte
R18       “dv16sL” - divisor low byte
R19       “dv16sH” - divisor high byte
R20                                       “dcnt16s” - loop counter

Table 25. “div16s” Performance Figures

Parameter Value
Code Size (Words) 39
Execution Time (Cycles) 255
Register Usage • Low registers :2
• High registers :5
• Pointers :None
Interrupts Usage None
Peripherals Usage None
