PIC32MX: Benchmarking Mathematical Operations
Revision as of 04:17, 10 February 2010
Original Assignment
Do not erase this section!
Your assignment is to empirically test how long it takes to perform add, subtract, multiply, divide, sqrt, sin, and cos operations with the 80 MHz PIC32MX460F512L and our standard code optimization setting. You will do these tests with chars (8-bit integers), shorts (16-bit), integers (32-bit), long long integers (64-bit), floats (32-bit single-precision floating point), and doubles (64-bit double-precision floating point). The integers can be unsigned or signed. Your end result will be a table with the operation on one axis (likely the horizontal axis) and the kind of variable on the other axis, and each cell of the table will have a normalized duration for the operation. The time will be normalized by the fastest operation, so the smallest number in the table will be 1.00. All other numbers will indicate how many times longer that operation takes. All numbers will have two decimal places, e.g., 2.57 or 24.72. You will also give the time that 1.00 corresponds to in nanoseconds.
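The normalization described above can be sketched in C as follows. This is only an illustration, not the code used for this page. The function name `normalize_durations` and its parameters are hypothetical, and any durations passed to it are whatever the caller measured.

```c
#include <stddef.h>

/* Divide every duration by the smallest one, so the fastest entry becomes 1.00.
   Returns the smallest duration in ns (the time that 1.00 corresponds to),
   or 0.0 if the input is empty or the smallest duration is not positive.
   Hypothetical helper; the name and signature are assumptions. */
double normalize_durations(const double *ns, double *out, size_t n)
{
    if (n == 0)
        return 0.0;

    double min = ns[0];
    for (size_t i = 1; i < n; i++)
        if (ns[i] < min)
            min = ns[i];

    if (min <= 0.0)
        return 0.0;

    for (size_t i = 0; i < n; i++)
        out[i] = ns[i] / min;

    return min;
}
```

Printing each normalized value with the format "%.2f" gives the two decimal places the assignment asks for.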
Since bit-shifting left and right correspond to a version of multiplying and dividing, you should also include the operations >>1 and >>4 and <<1 and <<4. (If the results are identical, you can eliminate shift left from your table.)
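As a small illustration of why shifts count as a version of multiplying and dividing, the sketch below shows the four shifts named above on unsigned 32-bit values. The function names are hypothetical. The equivalence is only guaranteed for unsigned operands: right-shifting a negative signed value is implementation-defined in C.

```c
#include <stdint.h>

/* For unsigned operands, shifting left by k multiplies by 2^k (modulo 2^32),
   and shifting right by k divides by 2^k, discarding the remainder.
   Function names are illustrative assumptions. */
uint32_t times_two(uint32_t x)     { return x << 1; }
uint32_t times_sixteen(uint32_t x) { return x << 4; }
uint32_t half(uint32_t x)          { return x >> 1; }
uint32_t sixteenth(uint32_t x)     { return x >> 4; }
```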
To generate this table, you can set an output bit low before the operation, then high immediately after the operation, and measure the time on an oscilloscope. Two things to consider: (1) Time a single operation, over and over, with a short delay between the operation. This should create a pulse train on your oscilloscope. Can you get an accurate estimate of the time this way? You could also try doing five or ten operations between changing the digital output. See if this gives the same estimate. (This estimate might be more accurate as you are essentially averaging over a number of operations.) Avoid using arrays and for loops in your test, as indexing arrays and running the loop each take time. (2) Make sure the compiler doesn't compute the results in advance. You could try testing operations with numbers generated randomly (don't time this operation!) vs. numbers that you just type in manually to make sure that both are giving you the same result.
Overview
We were tasked with determining the real-time cost (measured in nanoseconds) of performing seven basic mathematical operations with each one of the six commonly used ANSI C data types.
The mathematical operations we tested were:
- subtraction
- addition
- multiplication
- division
- square root
- sine
- cosine
The six data types we tested each operation on were:
- char
- short
- integer
- long long
- float
- double
Our testing procedure was simple: throw an output pin high on the NU32 development board, perform a mathematical operation with a given data type, and then pull the same pin low.
Placing the above three steps in an infinite while loop afforded us the opportunity to use an oscilloscope to measure the duration between each high-low pair in the output waveform. After subtracting the time it took for the PIC to raise and lower the voltage on the output pin (something we previously measured), we were able to determine the amount of time required for the PIC32 chip to execute an operation with a high level of accuracy.
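The procedure above might look roughly like the following sketch. It is not the code that produced the results on this page. `PIN_A2` is the pin name used in Test 1, but here it is declared as an ordinary variable so the sketch builds on a desktop compiler. The operand names, the `volatile` qualifiers, and both function names are assumptions.

```c
#include <stdint.h>

/* Stand-in for the board's output pin (assumption); on the NU32 this would
   be the real PIN_A2 definition from the board's header. */
static volatile int PIN_A2;

/* Operands and result are volatile so they are read and written at run time
   (hypothetical names). */
static volatile uint8_t operand_a = 200, operand_b = 58;
static volatile uint8_t result;

/* One pass of the measurement: pin high, one operation, pin low.
   On the board this body would sit inside while (1) to give a pulse train. */
void timed_subtraction_pass(void)
{
    PIN_A2 = 1;
    result = (uint8_t)(operand_a - operand_b);
    PIN_A2 = 0;
}

/* Remove the previously measured pin overhead from an oscilloscope reading. */
double operation_time_ns(double pulse_width_ns, double pin_overhead_ns)
{
    return pulse_width_ns - pin_overhead_ns;
}
```

The pulse width is read from the oscilloscope and the pin overhead comes from Test 1; both are passed in rather than hard-coded, since the exact correction depends on how the scope reading is taken.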
With seven operations to perform on six different data types, we created the following table to help us assign and keep track of the various tests we planned to run:
char (8-bit) | short (16-bit) | int (32-bit) | long long (64-bit) | float (32-bit) | double (64-bit) | |
---|---|---|---|---|---|---|
subtraction | Test 2 | Test 9 | Test 16 | Test 23 | Test 30 | Test 37 |
addition | Test 3 | Test 10 | Test 17 | Test 24 | Test 31 | Test 38 |
multiplication | Test 4 | Test 11 | Test 18 | Test 25 | Test 32 | Test 39 |
division | Test 5 | Test 12 | Test 19 | Test 26 | Test 33 | Test 40 |
square root | Test 6 | Test 13 | Test 20 | Test 27 | Test 34 | Test 41 |
sine | Test 7 | Test 14 | Test 21 | Test 28 | Test 35 | Test 42 |
cosine | Test 8 | Test 15 | Test 22 | Test 29 | Test 36 | Test 43 |
Several tests contained multiple procedures that explored various ways to carry out a given mathematical operation on a given data type. For example, in the multiplication tests, we tested not only the traditional multiplication operator (*) but also the bitwise left shift operator (<<). Our goal was to find out whether one operator was faster than the other. Similarly, we included procedures that performed the above operations on hard-coded numbers (such as 347) as well as on randomly chosen numbers stored in variables (such as 'random_int1'). We wanted to ensure that the compiler didn't compute the results of each operation in advance. While compile-time evaluation can indeed afford welcome reductions in execution time, situations in which the compiler can't optimize the operations ahead of time (for example, when the data to be operated on is not known in advance) are still common and are worth benchmarking.
Accordingly, several tests contain multiple procedures that not only account for multiple methods of performing a particular operation, but multiple sets of numbers to perform those operations on.
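To illustrate the difference between the two kinds of procedure, here is a minimal sketch. The name `random_int1` and the value 347 appear above; `random_int2`, the function names, and the use of `volatile` are assumptions, since the page does not show how the variables were declared.

```c
#include <stdint.h>

/* Hard-coded operands: an optimizing compiler may replace this expression
   with the constant 694 at build time, so no multiply runs on the chip. */
int32_t multiply_hard_coded(void)
{
    return 347 * 2;
}

/* Operands read through volatile variables: the values must be loaded at
   run time, so the product cannot be computed in advance.
   random_int2 and the volatile qualifier are assumptions. */
static volatile int32_t random_int1 = 347;
static volatile int32_t random_int2 = 2;

int32_t multiply_from_variables(void)
{
    return random_int1 * random_int2;
}
```

Both functions return the same value; only the second forces the multiplication to happen at run time, which is the case the "random number" tests are meant to capture.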
Test 1 was used to determine the duration required for the PIC32 to throw a pin high and pull a pin low, while Tests 2 through 43 were used to measure the actual performance of each operation and data-type pair.
Results
Below are the results of each particular test we performed, coupled with a short explanation for each result.
Basic Timing Constants (Test 1)
This test determines the length of time required by the PIC32 chip to push a given output pin high and pull the same pin low.
- Test (a): Time required to throw an output pin high
- Instruction: PIN_A2 = 1;
- Time: 63 ns
- Test (b): Time required to pull an output pin low
- Instruction: PIN_A2 = 0;
- Time: 63 ns
- Test (c): Time required to execute 1 empty while loop cycle
- Instruction: while(1) {}
- Time: 23 ns
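It can help to read these timing constants in units of clock cycles. The sketch below converts nanoseconds to cycles using the 80 MHz clock stated in the assignment, which is 12.5 ns per cycle. The macro and function names are hypothetical, and the conversion assumes the measured code runs at the full core clock.

```c
/* Core clock from the assignment text: 80 MHz, i.e. 12.5 ns per cycle. */
#define CORE_CLOCK_HZ 80000000.0

/* Convert a measured duration in nanoseconds to (fractional) clock cycles.
   Hypothetical helper; assumes execution at the full core clock. */
double ns_to_cycles(double ns)
{
    return ns * CORE_CLOCK_HZ / 1e9;
}
```

By this conversion, 63 ns is about 5.04 cycles and 23 ns is about 1.84 cycles.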
char Performance
A char data type, in ANSI C, is a value holding one byte, or one character code. The actual number of bits in a char in a particular implementation is documented as CHAR_BIT in that implementation's limits.h file. In practice, it is almost always 8 bits, which for an unsigned char corresponds to a decimal range of 0 to 255, inclusive. Unless otherwise noted, tests (a) operate on two predefined ASCII letters, tests (b) on two predefined numbers in the range of 0 to 255, and tests (c) on two random numbers. This lets us compare operations the compiler can evaluate ahead of time against operations that must be carried out on the PIC at run time.
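The limits mentioned above can be checked on any compiler with a few lines of C. This is a generic sketch using the standard limits.h macros; the function names are hypothetical.

```c
#include <limits.h>

/* Number of bits in a char on the compiler in use. */
int char_bits(void)
{
    return CHAR_BIT;
}

/* Largest value an unsigned char can hold: 2^CHAR_BIT - 1. */
unsigned int unsigned_char_max(void)
{
    return UCHAR_MAX;
}

/* Plain char may be signed or unsigned depending on the compiler;
   returns 1 if it is signed here, 0 otherwise. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Whether plain char is signed matters when interpreting results for values above 127, so it is worth checking on the target compiler.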
Subtraction (Test 2)
This test determines the length of time required by the PIC32 chip to subtract one 8-bit number (a char) from another 8-bit number (a char).
- Test a: 50ns
- Test b: 50ns
- Test c: 112ns
Addition (Test 3)
This test determines the length of time required by the PIC32 chip to add one 8-bit number (a char) to another 8-bit number (a char).
- Test a: 50ns
- Test b: 50ns
- Test c: 99ns
Multiplication (Test 4)
This test determines the length of time required by the PIC32 chip to multiply one 8-bit number (a char) by another 8-bit number (a char).
- Test a: 49ns
- Test b: 48ns
- Test c: 137ns
Division (Test 5)
This test determines the length of time required by the PIC32 chip to divide one 8-bit number (a char) by another 8-bit number (a char).
- Test a: 48ns
- Test b: 50ns
- Test c: N/A
Square Root (Test 6)
This test determines the length of time required by the PIC32 chip to take the square root of one 8-bit number (a char). Tests (a) through (c) use the 'sqrt()' method, while tests (d) through (f) raise the char to the 1/2 power.
- Test a: 48ns
- Test b: 48ns
- Test c: 2087ns
- Test d: 48ns
- Test e: 48ns
- Test f: 75ns
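The page does not show how "to the 1/2 power" was written in code, so the following is only a cautionary sketch, not a description of what was actually run. Two C facts are worth keeping in mind when reading tests (d) through (f): the expression 1/2 is integer division, and ^ is bitwise XOR rather than exponentiation. All function names are hypothetical.

```c
/* In C, 1/2 is integer division and evaluates to 0. */
int one_half_as_written(void)
{
    return 1 / 2;
}

/* ^ is bitwise XOR, not exponentiation, so x ^ (1/2) is x ^ 0, which is x. */
unsigned char xor_with_one_half(unsigned char x)
{
    return (unsigned char)(x ^ (1 / 2));
}

/* A fractional exponent needs a floating-point operand. */
double one_half_as_double(void)
{
    return 1.0 / 2;
}
```

If the exponent in those tests was written with integer literals, the measured time would not reflect a true square root; this is an assumption that the test source would need to confirm.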
Sine (Test 7)
This test determines the length of time required by the PIC32 chip to get the sine of one 8-bit number (a char).
- Test a: 9963ns
- Test b: 9550ns
- Test c: 6962ns
Cosine (Test 8)
This test determines the length of time required by the PIC32 chip to get the cosine of one 8-bit number (a char).
- Test a: 9111ns
- Test b: 8724ns
- Test c: 5936ns
short Performance
A short data type, as tested here, is a value that holds 2 bytes, or 16 bits. An unsigned short has a range of 0 to 65535 (2^16 - 1). A signed short in two's complement has a range of -32768 to 32767 (-2^15 to 2^15 - 1). In this series of tests, tests (a) use a predefined number, and tests (b) use a random number. This lets us compare operations the compiler can evaluate ahead of time against operations that must be carried out on the PIC at run time.
Subtraction (Test 9)
This test determines the length of time required by the PIC32 chip to subtract one 16-bit number (a short) from another 16-bit number (a short).
- Test a: 25ns
- Test b: 62ns
Addition (Test 10)
This test determines the length of time required by the PIC32 chip to add one 16-bit number (a short) to another 16-bit number (a short).
- Test a: 50ns
- Test b: 100ns
Multiplication (Test 11)
This test determines the length of time required by the PIC32 chip to multiply one 16-bit number (a short) by another 16-bit number (a short).
- Test a: 24ns
- Test b: 88ns
Division (Test 12)
This test determines the length of time required by the PIC32 chip to divide one 16-bit number (a short) by another 16-bit number (a short).
- Test a: 28ns
- Test b: 300ns
Square Root (Test 13)
This test determines the length of time required by the PIC32 chip to get the square root of one 16-bit number (a short). Tests (a) and (b) use the 'sqrt()' method, while tests (c) and (d) use a number to the 1/2 power.
- Test a: 50ns
- Test b: 8674ns
- Test c: 50ns
- Test d: 76ns
Sine (Test 14)
This test determines the length of time required by the PIC32 chip to get the sine of one 16-bit number (a short).
- Test a: 13014ns
- Test b: 13824ns
Cosine (Test 15)
This test determines the length of time required by the PIC32 chip to get the cosine of one 16-bit number (a short).
- Test a: 12174ns
- Test b: 12924ns
int Performance
An int data type, as tested here, is a value that holds 4 bytes, or 32 bits. An unsigned int has a range of 0 to 4294967295 (2^32 - 1). A signed int in two's complement has a range of -2147483648 to 2147483647 (-2^31 to 2^31 - 1). In this series of tests, tests (a) use a predefined number, and tests (b) use a random number. This lets us compare operations the compiler can evaluate ahead of time against operations that must be carried out on the PIC at run time.
Subtraction (Test 16)
This test determines the length of time required by the PIC32 chip to subtract one 32-bit number (an int) from another 32-bit number (an int).
- Test a: 38ns
- Test b: 64ns
Addition (Test 17)
This test determines the length of time required by the PIC32 chip to add one 32-bit number (an int) to another 32-bit number (an int).
- Test a: 26ns
- Test b: 60ns
Multiplication (Test 18)
This test determines the length of time required by the PIC32 chip to multiply one 32-bit number (an int) by another 32-bit number (an int).
- Test a: 38ns
- Test b: 86ns
Division (Test 19)
This test determines the length of time required by the PIC32 chip to divide one 32-bit number (an int) by another 32-bit number (an int).
- Test a: 38ns
- Test b: 486ns
Square Root (Test 20)
This test determines the length of time required by the PIC32 chip to get the square root of one 32-bit number (an int). Tests (a) and (b) use the 'sqrt()' method, while tests (c) and (d) use a number to the 1/2 power.
- Test a: 50ns
- Test b: 8737ns
- Test c: 88ns
- Test d: 74ns
Sine (Test 21)
This test determines the length of time required by the PIC32 chip to get the sine of one 32-bit number (an int).
- Test a: 19488ns
- Test b: 18988ns
Cosine (Test 22)
This test determines the length of time required by the PIC32 chip to get the cosine of one 32-bit number (an int).
- Test a: 20324ns
- Test b: 19837ns
long long Performance
A long long data type (standardized in C99) is a value that holds 8 bytes, or 64 bits. An unsigned long long has a range of 0 to approximately 1.84467441 × 10^19 (2^64 - 1). A signed long long in two's complement has a range of approximately -9.22337204 × 10^18 to 9.22337204 × 10^18 (-2^63 to 2^63 - 1). In this series of tests, tests (a) use a predefined number, and tests (b) use a random number. This lets us compare operations the compiler can evaluate ahead of time against operations that must be carried out on the PIC at run time.
Subtraction (Test 23)
This test determines the length of time required by the PIC32 chip to subtract one 64-bit number (a long long) from another 64-bit number (a long long).
- Test a: 186ns
- Test b: 150ns
Addition (Test 24)
This test determines the length of time required by the PIC32 chip to add one 64-bit number (a long long) to another 64-bit number (a long long).
- Test a: 88ns
- Test b: 200ns
Multiplication (Test 25)
This test determines the length of time required by the PIC32 chip to multiply one 64-bit number (a long long) by another 64-bit number (a long long).
- Test a: 74ns
- Test b: 398ns
Division (Test 26)
This test determines the length of time required by the PIC32 chip to divide one 64-bit number (a long long) by another 64-bit number (a long long).
- Test a: 74ns
- Test b: 1724ns
Square Root (Test 27)
This test determines the length of time required by the PIC32 chip to get the square root of one 64-bit number (a long long). Tests (a) and (b) use the 'sqrt()' method, while tests (c) and (d) use a number to the 1/2 power.
- Test a: 87ns
- Test b: 16311ns
- Test c: 188ns
- Test d: 74ns
Sine (Test 28)
This test determines the length of time required by the PIC32 chip to get the sine of one 64-bit number (a long long).
- Test a: 23837ns
- Test b: 29898ns
Cosine (Test 29)
This test determines the length of time required by the PIC32 chip to get the cosine of one 64-bit number (a long long).
- Test a: 24611ns
- Test b: 30623ns
float Performance
Subtraction (Test 30)
This test determines the length of time required by the PIC32 chip to subtract one 32-bit number (a float) from another 32-bit number (a float).
- Test a: 100ns
- Test b: 900ns
Addition (Test 31)
This test determines the length of time required by the PIC32 chip to add one 32-bit number (a float) to another 32-bit number (a float).
- Test a: 124ns
- Test b: 1024ns
Multiplication (Test 32)
This test determines the length of time required by the PIC32 chip to multiply one 32-bit number (a float) by another 32-bit number (a float).
- Test a: 124ns
- Test b: 736ns
Division (Test 33)
This test determines the length of time required by the PIC32 chip to divide one 32-bit number (a float) by another 32-bit number (a float).
- Test a: 99ns
- Test b: 1674ns
Square Root (Test 34)
This test determines the length of time required by the PIC32 chip to get the square root of one 32-bit number (a float). Tests (a) and (b) use the 'sqrt()' method, while tests (c) and (d) use a number to the 1/2 power.
- Test a: 99ns
- Test b: 8636ns
- Test c: N/A
- Test d: N/A
Sine (Test 35)
This test determines the length of time required by the PIC32 chip to get the sine of one 32-bit number (a float).
- Test a: 19574ns
- Test b: 19562ns
Cosine (Test 36)
This test determines the length of time required by the PIC32 chip to get the cosine of one 32-bit number (a float).
- Test a: 20311ns
- Test b: 20297ns
double Performance
Subtraction (Test 37)
This test determines the length of time required by the PIC32 chip to subtract one 64-bit number (a double) from another 64-bit number (a double).
- Test a: 199ns
- Test b: 1560ns
Addition (Test 38)
This test determines the length of time required by the PIC32 chip to add one 64-bit number (a double) to another 64-bit number (a double).
- Test a: 199ns
- Test b: 1236ns
Multiplication (Test 39)
This test determines the length of time required by the PIC32 chip to multiply one 64-bit number (a double) by another 64-bit number (a double).
- Test a: 188ns
- Test b: 1438ns
Division (Test 40)
This test determines the length of time required by the PIC32 chip to divide one 64-bit number (a double) by another 64-bit number (a double).
- Test a: 187ns
- Test b: 3184ns
Square Root (Test 41)
This test determines the length of time required by the PIC32 chip to get the square root of one 64-bit number (a double). Tests (a) and (b) use the 'sqrt()' method, while tests (c) and (d) use a number to the 1/2 power.
- Test a: 188ns
- Test b: 7998ns
- Test c: N/A
- Test d: N/A
Sine (Test 42)
This test determines the length of time required by the PIC32 chip to get the sine of one 64-bit number (a double).
- Test a: 20299ns
- Test b: 20624ns
Cosine (Test 43)
This test determines the length of time required by the PIC32 chip to get the cosine of one 64-bit number (a double).
- Test a: 19762ns
- Test b: 20011ns
Code
Test 1
This is the first test.
Test 2
This is the second test.