PIC32MX: Benchmarking Mathematical Operations

Latest revision as of 15:20, 1 March 2010

Overview

We were tasked with determining the real-time cost (measured in nanoseconds) of performing seven basic mathematical operations with each one of the six commonly used ANSI C data types.

The mathematical operations we tested were:

subtraction
addition
multiplication
division
square root
sine
cosine

The six data types we tested each operation on were:

char
short
int
long long
float
double

Our testing procedure was simple: throw an output pin high on the NU32 development board, perform a mathematical operation with a given data type, and then pull the same pin low.

Placing the above three steps in an infinite while loop afforded us the opportunity to use an oscilloscope to measure the duration between each high-low pair in the output waveform. After subtracting the time it took for the PIC to raise and lower the voltage on the output pin (something we previously measured), we were able to determine the amount of time required for the PIC32 chip to execute an operation with a high level of accuracy.
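
The measurement procedure described above can be sketched in C as follows. This is a minimal, host-compilable sketch rather than the code we ran: 'PIN_A2' here is an ordinary volatile variable standing in for the real output latch bit, and the helper names 'timed_add' and 'net_operation_ns' are hypothetical. On the PIC32, the body of 'timed_add' would sit inside an infinite while loop.

```c
#include <assert.h>

/* Stand-in for the NU32 output pin; on hardware this would be a port latch bit. */
static volatile int PIN_A2 = 0;

/* One pass of the benchmark body: pin high, operation under test, pin low. */
static int timed_add(volatile int *a, volatile int *b)
{
    int result;
    PIN_A2 = 1;          /* the scope sees the rising edge here */
    result = *a + *b;    /* operation under test; volatile reads happen at run time */
    PIN_A2 = 0;          /* the scope sees the falling edge here */
    return result;
}

/* Net cost of the operation: measured pulse width minus the pin-toggle overhead. */
static int net_operation_ns(int pulse_width_ns, int pin_overhead_ns)
{
    assert(pulse_width_ns >= pin_overhead_ns);
    return pulse_width_ns - pin_overhead_ns;
}
```

The pulse width read off the oscilloscope is what would be passed as 'pulse_width_ns'; the overhead value comes from the separate pin-toggle measurement.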

With seven operations to perform on six different data types, we created the following table to help us assign and keep track of the various tests we planned to run:

Operation vs. Data type
	char (8-bit)	short (16-bit)	int (32-bit)	long long (64-bit)	float (32-bit)	double (64-bit)
subtraction	Test 2	Test 9	Test 16	Test 23	Test 30	Test 37
addition	Test 3	Test 10	Test 17	Test 24	Test 31	Test 38
multiplication	Test 4	Test 11	Test 18	Test 25	Test 32	Test 39
division	Test 5	Test 12	Test 19	Test 26	Test 33	Test 40
square root	Test 6	Test 13	Test 20	Test 27	Test 34	Test 41
sine	Test 7	Test 14	Test 21	Test 28	Test 35	Test 42
cosine	Test 8	Test 15	Test 22	Test 29	Test 36	Test 43

Several tests contained multiple procedures that explored various ways to carry out a given mathematical operation on a given data type. For example, in the multiplication tests, we tested not only the traditional multiplication operator (*) but also the bitwise left shift operator (<<). Our goal was to find out whether one operator was faster than the other. Similarly, we included procedures that performed the above operations on hard-coded numbers (such as 347) as well as on randomly chosen numbers stored in variables (such as 'random_int1'). We wanted to ensure that the compiler didn't compute the results of each operation in advance. While computing results at compile time can indeed afford welcome decreases in execution time, situations in which the compiler can't optimize the operations ahead of time (for example, situations where the data to be operated on is not known in advance) are still common occurrences and are worth benchmarking.

Accordingly, several tests contain multiple procedures that not only account for multiple methods of performing a particular operation, but multiple sets of numbers to perform those operations on.
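
To illustrate the difference between operands the compiler can fold ahead of time and operands it cannot, here is a small sketch. The function names are hypothetical, and reading the operands through 'volatile' is one common way to force run-time evaluation; it is an assumption of this sketch, not necessarily how our tests guaranteed randomness.

```c
#include <assert.h>

/* Both operands are constants: a compiler is free to fold this to 694 at build time. */
static int multiply_constants(void)
{
    return 347 * 2;
}

/* Operands are read through volatile, so the multiply must happen at run time. */
static int multiply_runtime(volatile int *x, volatile int *y)
{
    return *x * *y;
}

/* Left shift by one bit: same result as multiplying by two,
   provided the value is non-negative and the result does not overflow. */
static int double_by_shift(volatile int *x)
{
    int value = *x;
    assert(value >= 0);
    return value << 1;
}
```

All three functions return the same number for the inputs 347 and 2; what differs is how much of the work can be done before the program ever runs.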

Test 1 was used to determine the duration required for the PIC32 to throw a pin high and pull a pin low, while Tests 2 through 43 were used to measure the actual performance of each operation and data-type pair.

Circuit

The test circuit was simply an oscilloscope connected to the output pin (in this case, pin A2) in order to view the waveform.

Results

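Below is a quick summary of the testing results for each data type and each operation, using the measurement taken with random (not pre-computed) operands. All results are normalized to 60 ns, the time measured for adding two random ints (1.00 = 60 ns).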

Operation vs. Data type
	Subtraction	Addition	Multiplication	Division	Square Root	Sine	Cosine
Char	1.87	1.65	2.28	N/A	34.78	116.03	98.93
Short	1.03	1.67	1.47	5.00	144.57	230.40	215.40
Int	1.07	1.00	1.43	8.10	145.62	316.43	330.62
Long Long	2.50	3.33	6.63	28.73	271.85	498.30	510.38
Float	15.00	17.07	12.27	27.90	143.93	326.03	338.28
Double	26.00	20.60	23.97	53.07	133.30	343.73	333.52
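
As a worked example of the normalization used in the table above, the helper below (a hypothetical name, not part of our test code) divides a measured time by the 60 ns baseline.

```c
#include <assert.h>

/* Baseline used for normalization: 1.00 corresponds to 60 ns. */
#define BASELINE_NS 60.0

/* Convert a measured time in nanoseconds to the normalized figure. */
static double normalize_ns(double time_ns)
{
    assert(time_ns >= 0.0);
    return time_ns / BASELINE_NS;
}
```

For instance, the 486 ns measured for dividing two random ints gives 486 / 60 = 8.10, matching the Int/Division entry.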

Here is the Excel file with the full results, in nanoseconds and normalized. Media:Lab5.xls

Below are the results of each particular test we performed, coupled with a short explanation for each result.

Basic Timing Constants (Test 1)

Test (a): Time required to throw an output pin high

Test (b): Time required to pull an output pin low

Test (c): Time required to execute 1 empty while loop cycle

Actual waveforms as seen on the output pin.

This test determines the length of time required by the PIC32 chip to push a given output pin high and pull the same pin low.

Test (a): Time required to throw an output pin high
- Instruction: PIN_A2 = 1;
- Time: 63 ns
Test (b): Time required to pull an output pin low
- Instruction: PIN_A2 = 0;
- Time: 63 ns
Test (c): Time required to execute 1 empty while loop cycle
- Instruction: while(1){}
- Time: 23 ns

char Performance

A char data type, in ANSI C, is a value holding one byte, or one character code. The actual number of bits in a char in a particular implementation is documented as CHAR_BIT in that implementation's limits.h file. In practice, it is almost always 8 bits, corresponding to a decimal range of 0 to 255 inclusive for an unsigned char. Given that there are many different ways to perform a given operation on a char, we've done our best to include several different methods that we feel are representative of normal coding practices. Depending on how the source code is compiled, these different methods may or may not produce different results. Furthermore, unless otherwise noted, all (a) benchmarks are operations on two predefined (and most likely pre-computed) ASCII letters, all (b) benchmarks are operations on two predefined (and most likely pre-computed) numbers in the range of 0 to 255, and all (c) benchmarks are operations on two random (and most likely not pre-computed) numbers. These multiple benchmarks per test exist to illustrate the differences in execution time between operations that the compiler may have evaluated ahead of time and operations the PIC must perform in real time.
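
The size of a char can be checked directly from limits.h. The sketch below uses hypothetical helper names and assumes CHAR_BIT is smaller than the width of an unsigned int, which holds on typical 8-bit-char platforms.

```c
#include <assert.h>
#include <limits.h>

/* Number of bits in a char on this implementation (at least 8 per the C standard). */
static int bits_per_char(void)
{
    return CHAR_BIT;
}

/* Largest value an unsigned char can hold: 2^CHAR_BIT - 1, i.e. 255 when CHAR_BIT is 8. */
static unsigned int unsigned_char_max(void)
{
    unsigned int max = (1u << CHAR_BIT) - 1u;
    assert(max == UCHAR_MAX);   /* cross-check against limits.h */
    return max;
}
```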

Subtraction (Test 2)

Test (a): Time required to subtract two constant chars

Test (b): Time required to subtract two ints cast into a char

Test (c): Time required to subtract two random chars

This test determines the length of time required by the PIC32 chip to subtract one 8-bit number (a char) from another 8-bit number (a char).

Test (a): Time required to subtract two constant chars (may be pre-computed)
- Instruction: letter_capital_a = 'z'-'7';
- Time: 50 ns
Test (b): Time required to subtract two ints cast into a char (may be pre-computed)
- Instruction: letter_b = 100-2;
- Time: 50 ns
Test (c): Time required to subtract two random chars (guaranteed not to be pre-comp)
- Instruction: random_char = larger-smaller;*
- Time: 112 ns

*See Code section for more details on how randomness was guaranteed.

Addition (Test 3)

Test (a): Time required to add two constant chars

Test (b): Time required to add two ints cast into a char

Test (c): Time required to add two random chars

This test determines the length of time required by the PIC32 chip to add one 8-bit number (a char) to another 8-bit number (a char).

Test (a): Time required to add two constant chars (may be pre-computed)
- Instruction: letter_a = ')'+'8';
- Time: 50 ns
Test (b): Time required to add two ints cast into a char (may be pre-computed)
- Instruction: letter_b = 97+1;
- Time: 50 ns
Test (c): Time required to add two random chars (guaranteed not to be pre-comp)
- Instruction: random_char = random_char1+random_char2;*
- Time: 99 ns

*See Code section for more details on how randomness was guaranteed.

Multiplication (Test 4)

Test (a): Time required to multiply two constant chars

Test (b): Time required to multiply two ints cast into a char

Test (c): Time required to multiply two random chars

This test determines the length of time required by the PIC32 chip to multiply one 8-bit number (a char) by another 8-bit number (a char).

Test (a): Time required to multiply two constant chars (may be pre-computed)
- Instruction: ascii_225 = 'K'*'♥';
- Time: 49 ns
Test (b): Time required to multiply two ints cast into a char (may be pre-computed)
- Instruction: ascii_200 = 100*2;
- Time: 48 ns
Test (c): Time required to multiply two random chars (guaranteed not to be pre-comp)
- Instruction: random_char = larger*smaller;*
- Time: 137 ns

*See Code section for more details on how randomness was guaranteed.

Division (Test 5)

Test (a): Time required to divide two constant chars

Test (b): Time required to divide two ints cast into a char

This test determines the length of time required by the PIC32 chip to divide one 8-bit number (a char) by another 8-bit number (a char).

Test (a): Time required to divide two constant chars (may be pre-computed)
- Instruction: ascii_25 = 'K'/'♥'; //thp
- Time: 48 ns
Test (b): Time required to divide two ints cast into a char (may be pre-computed)
- Instruction: letter_2 = 100/2;
- Time: 50 ns
Test (c): Time required to divide two random chars (guaranteed not to be pre-comp)
- Instruction: random_char = larger/smaller;*
- Time: N/A

* We had great difficulty testing this particular operation. After some investigation with an oscilloscope and voltmeter, it seems that the PIC32 is not capable of dividing chars in this way. More specifically, every time the PIC32 attempts to divide one char by another, all output pins are immediately grounded. We've tested this code in other C environments, and it works as expected, so the error must lie either somewhere within our specific PICs (which would be very unlikely, since we tested 3) or in the silicon architecture of the PIC32 itself (still unlikely, but given the number of PICs we tested, more probable). If you absolutely need to divide chars, cast them to ints first, perform your division, then cast the result back to a char.
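
The cast-based workaround suggested above might look like the following sketch. The function name is hypothetical, and the zero check is an addition of this sketch: division by zero is undefined in C, so the fallback value of 0 is an arbitrary choice. Note that standard C already promotes char operands to int before dividing, so the explicit casts mainly document intent.

```c
#include <assert.h>

/* Divide two chars by widening to int first, then narrowing the quotient. */
static char divide_chars(char numerator, char denominator)
{
    if (denominator == 0) {
        return 0;   /* hypothetical fallback chosen for this sketch */
    }
    return (char)((int)numerator / (int)denominator);
}
```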

Square Root (Test 6)

Test (a): Time required to sqrt() a constant char

Test (b): Time required to sqrt() an int cast into a char

Test (c): Time required to sqrt() a random char

Test (d): Time required to ^(1/2) a constant char

Test (e): Time required to ^(1/2) an int cast into a char

Test (f): Time required to ^(1/2) a random char

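This test determines the length of time required by the PIC32 chip to take the square root of one 8-bit number (a char). Benchmarks (a) through (c) use the sqrt() function, while benchmarks (d) through (f) were intended to raise the operand to the 1/2 power. In C, however, ^ is the bitwise XOR operator and the integer expression 1/2 evaluates to 0, so benchmarks (d) through (f) actually time an XOR with zero (which leaves the operand unchanged) rather than a square root. This explains why (f) is so much faster than (c).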

Test (a): Time required to sqrt() a constant char (may be pre-computed)
- Instruction: ascii_25 = sqrt('u');
- Time: 48 ns
Test (b): Time required to sqrt() an int cast into a char (may be pre-computed)
- Instruction: number_10 = sqrt(100);
- Time: 48 ns
Test (c): Time required to sqrt() a random char (guaranteed not to be pre-comp)
- Instruction: random_char = sqrt(random_char1);
- Time: 2087 ns
Test (d): Time required to ^(1/2) a constant char (may be pre-computed)
- Instruction: ascii_25 = ('u')^(1/2);
- Time: 48 ns
Test (e): Time required to ^(1/2) an int cast into a char (may be pre-computed)
- Instruction: number_10 = (100)^(1/2);
- Time: 48 ns
Test (f): Time required to ^(1/2) a random char (guaranteed not to be pre-comp)
- Instruction: random_char = (random_char2)^(1/2);*
- Time: 75 ns

*See Code section for more details on how randomness was guaranteed.
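
A caution about the ^(1/2) benchmarks: in C, ^ is the bitwise XOR operator and 1/2 is integer division yielding 0, so an expression such as x ^ (1/2) returns x unchanged. The sketch below demonstrates this and contrasts it with an actual integer square root. Both function names are hypothetical, and the counting loop is a simple stand-in chosen for this sketch, not the library sqrt() that the benchmarks call.

```c
#include <assert.h>

/* What "x ^ (1/2)" evaluates to in C: 1/2 is integer division (0), ^ is bitwise XOR. */
static unsigned char xor_with_half(unsigned char x)
{
    return (unsigned char)(x ^ (1 / 2));   /* x ^ 0 == x */
}

/* Integer square root by counting up; fine for 8-bit inputs (at most 15 iterations). */
static unsigned char isqrt_u8(unsigned char x)
{
    unsigned char r = 0;
    while ((unsigned)(r + 1) * (unsigned)(r + 1) <= x) {
        r++;
    }
    assert((unsigned)r * r <= x);   /* r is the largest value whose square fits */
    return r;
}
```

With an input of 100, the XOR version returns 100 while the square root version returns 10.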

Sine (Test 7)

Test (a): Time required to take the sine of a constant char

Test (b): Time required to take the sine of a constant int cast into a char

Test (c): Time required to take the sine of a random char

This test determines the length of time required by the PIC32 chip to take the sine of an 8-bit number (a char).

Test (a): Time required to take the sine of a constant char (may be pre-computed)
- Instruction: ascii_25 = sin('K');
- Time: 9963 ns
Test (b): Time required to take the sine of a constant int cast into a char (may be pre-computed)
- Instruction: letter_2 = sin(50);
- Time: 9550 ns
Test (c): Time required to take the sine of a random char (guaranteed not to be pre-comp)
- Instruction: random_char = sin(larger);*
- Time: 6962 ns

*See Code section for more details on how randomness was guaranteed.

Cosine (Test 8)

Test (a): Time required to take the cosine of a constant char

Test (b): Time required to take the cosine of a constant int cast into a char

Test (c): Time required to take the cosine of a random char

This test determines the length of time required by the PIC32 chip to take the cosine of an 8-bit number (a char).

Test (a): Time required to take the cosine of a constant char (may be pre-computed)
- Instruction: ascii_25 = cos('K');
- Time: 9111 ns
Test (b): Time required to take the cosine of a constant int cast into a char (may be pre-computed)
- Instruction: letter_2 = cos(50);
- Time: 8724 ns
Test (c): Time required to take the cosine of a random char (guaranteed not to be pre-comp)
- Instruction: random_char = cos(larger);*
- Time: 5936 ns

*See Code section for more details on how randomness was guaranteed.

short Performance

A short data type on the PIC32 is a value that holds 2 bytes, or 16 bits (ANSI C guarantees at least 16 bits). For an unsigned short, this corresponds to a range of 0 to 65535 (2^16 - 1). For a signed short, the range is from -32768 to 32767 (-2^15 to 2^15 - 1). Given that there are many different ways to perform a given operation on a short, we've done our best to include several different methods that we feel are representative of normal coding practices. Depending on how the source code is compiled, these different methods may or may not produce different results. Furthermore, unless otherwise noted, all (a) benchmarks are operations on two predefined (and most likely pre-computed) shorts and all (b) benchmarks are operations on two random (and most likely not pre-computed) shorts. These multiple benchmarks per test exist to illustrate the differences in execution time between operations that the compiler may have evaluated ahead of time and operations the PIC must perform in real time.

Subtraction (Test 9)

Test (a): Time required to subtract two constant shorts

Test (b): Time required to subtract two random shorts

This test determines the length of time required by the PIC32 chip to subtract one 16-bit number (a short) from another 16-bit number (a short).

Test (a): Time required to subtract two constant shorts (may be pre-computed)
- Instruction: short1 = 1337-343;
- Time: 25 ns
Test (b): Time required to subtract two random shorts (guaranteed not to be pre-comp)
- Instruction: random_short = larger-smaller;
- Time: 62 ns

Addition (Test 10)

Test (a): Time required to add two constant shorts

Test (b): Time required to add two random shorts

This test determines the length of time required by the PIC32 chip to add one 16-bit number (a short) to another 16-bit number (a short).

Test (a): Time required to add two constant shorts (may be pre-computed)
- Instruction: short1 = 1337+343;
- Time: 50 ns
Test (b): Time required to add two random shorts (guaranteed not to be pre-comp)
- Instruction: random_short = larger+smaller;
- Time: 100 ns

Multiplication (Test 11)

Test (a): Time required to multiply two constant shorts

Test (b): Time required to multiply two random shorts

This test determines the length of time required by the PIC32 chip to multiply one 16-bit number (a short) by another 16-bit number (a short).

Test (a): Time required to multiply two constant shorts (may be pre-computed)
- Instruction: short1 = 47*347;
- Time: 24 ns
Test (b): Time required to multiply two random shorts (guaranteed not to be pre-comp)
- Instruction: random_short = larger*smaller;
- Time: 88 ns

Division (Test 12)

Test (a): Time required to divide two constant shorts

Test (b): Time required to divide two random shorts

This test determines the length of time required by the PIC32 chip to divide one 16-bit number (a short) by another 16-bit number (a short).

Test (a): Time required to divide two constant shorts (may be pre-computed)
- Instruction: short1 = 62488/347;
- Time: 28 ns
Test (b): Time required to divide two random shorts (guaranteed not to be pre-comp)
- Instruction: random_short = larger/smaller;
- Time: 300 ns

Square Root (Test 13)

Test (a): Time required to sqrt() a constant short

Test (b): Time required to sqrt() a random short

Test (c): Time required to ^(1/2) a constant short

Test (d): Time required to ^(1/2) a random short

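This test determines the length of time required by the PIC32 chip to get the square root of one 16-bit number (a short). Tests (a) and (b) use the 'sqrt()' function, while tests (c) and (d) were intended to raise a number to the 1/2 power. In C, however, ^ is the bitwise XOR operator and the integer expression 1/2 evaluates to 0, so tests (c) and (d) time an XOR with zero rather than a square root.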

Test (a): Time required to sqrt() a constant short (may be pre-computed)
- Instruction: short1 = sqrt(31337);
- Time: 50 ns
Test (b): Time required to sqrt() a random short (guaranteed not to be pre-comp)*
- Instruction: short2 = sqrt(random_short1);
- Time: 8674 ns
Test (c): Time required to ^(1/2) a constant short (may be pre-computed)
- Instruction: short3 = (30343)^(1/2);
- Time: 50 ns
Test (d): Time required to ^(1/2) a random short (guaranteed not to be pre-comp)*
- Instruction: short4 = (random_short2)^(1/2);
- Time: 76 ns

*See Code section for more details on how randomness was guaranteed.

Sine (Test 14)

Test (a): Time required to take the sine of a constant short

Test (b): Time required to take the sine of a random short

This test determines the length of time required by the PIC32 chip to get the sine of one 16-bit number (a short).

Test (a): Time required to take the sine of a constant short (may be pre-computed)
- Instruction: short1 = sin(31337);
- Time: 13014 ns
Test (b): Time required to take the sine of a random short (guaranteed not to be pre-comp)
- Instruction: random_short = sin(random_short1);
- Time: 13824 ns

Cosine (Test 15)

Test (a): Time required to take the cosine of a constant short

Test (b): Time required to take the cosine of a random short

This test determines the length of time required by the PIC32 chip to get the cosine of one 16-bit number (a short).

Test (a): Time required to take the cosine of a constant short (may be pre-computed)
- Instruction: short1 = cos(31337);
- Time: 12174 ns
Test (b): Time required to take the cosine of a random short (guaranteed not to be pre-comp)
- Instruction: random_short = cos(random_short1);
- Time: 12924 ns

int Performance

An int data type on the PIC32 is a value that holds 4 bytes, or 32 bits (ANSI C guarantees only at least 16 bits). For an unsigned int, this corresponds to a range of 0 to 4294967295 (2^32 - 1). For a signed int, the range is from -2147483648 to 2147483647 (-2^31 to 2^31 - 1). Given that there are many different ways to perform a given operation on an int, we've done our best to include several different methods that we feel are representative of normal coding practices. Depending on how the source code is compiled, these different methods may or may not produce different results. Furthermore, unless otherwise noted, all (a) benchmarks are operations on two predefined (and most likely pre-computed) ints and all (b) benchmarks are operations on two random (and most likely not pre-computed) ints. These multiple benchmarks per test exist to illustrate the differences in execution time between operations that the compiler may have evaluated ahead of time and operations the PIC must perform in real time.

Subtraction (Test 16)

Test (a): Time required to subtract two constant ints

Test (b): Time required to subtract two random ints

This test determines the length of time required by the PIC32 chip to subtract one 32-bit number (an int) from another 32-bit number (an int).

Test (a): Time required to subtract two constant ints (may be pre-computed)
- Instruction: int1 = 2271988-7889;
- Time: 38 ns
Test (b): Time required to subtract two random ints (guaranteed not to be pre-comp)
- Instruction: random_int = larger-smaller;
- Time: 64 ns

Addition (Test 17)

Test (a): Time required to add two constant ints

Test (b): Time required to add two random ints

Actual waveforms as seen on the output pin.

This test determines the length of time required by the PIC32 chip to add one 32-bit number (an int) to another 32-bit number (an int).

Test (a): Time required to add two constant ints (may be pre-computed)
- Instruction:
- Time: 26 ns
Test (b): Time required to add two random ints (guaranteed not to be pre-comp)
- Instructions:
- Time: 60 ns

Multiplication (Test 18)

Test (a): Time required to multiply two constant ints

Test (b): Time required to multiply two random ints

This test determines the length of time required by the PIC32 chip to multiply one 32-bit number (an int) by another 32-bit number (an int).

Test (a): Time required to multiply two constant ints (may be pre-computed)
- Instruction: int1 = 65500*6550;
- Time: 38 ns
Test (b): Time required to multiply two random ints (guaranteed not to be pre-comp)
- Instruction: random_int = random_int1*random_int2;
- Time: 86 ns

Division (Test 19)

Test (a): Time required to divide two constant ints

Test (b): Time required to divide two random ints

This test determines the length of time required by the PIC32 chip to divide one 32-bit number (an int) by another 32-bit number (an int).

Test (a): Time required to divide two constant ints (may be pre-computed)
- Instruction: int1 = 1943438364/347;
- Time: 38 ns
Test (b): Time required to divide two random ints (guaranteed not to be pre-comp)
- Instruction: random_int = larger/smaller;
- Time: 486 ns

Square Root (Test 20)

Test (a): Time required to sqrt() a constant int

Test (b): Time required to sqrt() a random int

Test (c): Time required to ^(1/2) a constant int

Test (d): Time required to ^(1/2) a random int

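This test determines the length of time required by the PIC32 chip to get the square root of one 32-bit number (an int). Tests (a) and (b) use the 'sqrt()' function, while tests (c) and (d) were intended to raise a number to the 1/2 power. In C, however, ^ is the bitwise XOR operator and the integer expression 1/2 evaluates to 0, so tests (c) and (d) time an XOR with zero rather than a square root.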

Test (a): Time required to sqrt() a constant int (may be pre-computed)
- Instruction: int1 = sqrt(347343777);
- Time: 50 ns
Test (b): Time required to sqrt() a random int (guaranteed not to be pre-comp)
- Instruction: int2 = sqrt(random_int1);
- Time: 8737 ns
Test (c): Time required to ^(1/2) a constant int (may be pre-computed)
- Instruction: int3 = (743347343)^(1/2);
- Time: 88 ns
Test (d): Time required to ^(1/2) a random int (guaranteed not to be pre-comp)
- Instruction: int4 = (random_int2)^(1/2);
- Time: 74 ns

Sine (Test 21)

Test (a): Time required to take the sine of a constant int

Test (b): Time required to take the sine of a random int

This test determines the length of time required by the PIC32 chip to get the sine of one 32-bit number (an int).

Test (a): Time required to take the sine of a constant int (may be pre-computed)
- Instruction: int1 = sin(1347433747);
- Time: 19488 ns
Test (b): Time required to take the sine of a random int (guaranteed not to be pre-comp)
- Instruction: random_int = sin(random_int1);
- Time: 18988 ns

Cosine (Test 22)

Test (a): Time required to take the cosine of a constant int

Test (b): Time required to take the cosine of a random int

This test determines the length of time required by the PIC32 chip to get the cosine of one 32-bit number (an int).

Test (a): Time required to take the cosine of a constant int (may be pre-computed)
- Instruction: int1 = cos(1347433747);
- Time: 20324 ns
Test (b): Time required to take the cosine of a random int (guaranteed not to be pre-comp)
- Instruction: random_int = cos(random_int1);
- Time: 19837 ns

long long Performance

A long long data type is a value that holds 8 bytes, or 64 bits (the type was standardized in C99 rather than in the original ANSI C). For an unsigned long long, this corresponds to a range of 0 to 1.84467441 × 10^19 (2^64 - 1). For a signed long long, the range is from -9.22337204 × 10^18 to 9.22337204 × 10^18 (-2^63 to 2^63 - 1). Given that there are many different ways to perform a given operation on a long long, we've done our best to include several different methods that we feel are representative of normal coding practices. Depending on how the source code is compiled, these different methods may or may not produce different results. Furthermore, unless otherwise noted, all (a) benchmarks are operations on two predefined (and most likely pre-computed) long longs and all (b) benchmarks are operations on two random (and most likely not pre-computed) long longs. These multiple benchmarks per test exist to illustrate the differences in execution time between operations that the compiler may have evaluated ahead of time and operations the PIC must perform in real time.

Subtraction (Test 23)

Test (a): Time required to subtract two constant long longs

Test (b): Time required to subtract two random long longs

This test determines the length of time required by the PIC32 chip to subtract one 64-bit number (a long long) from another 64-bit number (a long long).

Test (a): Time required to subtract two constant long longs (may be pre-computed)
- Instruction: longlong1 = 17179800000LL-7179869184LL;
- Time: 186 ns
Test (b): Time required to subtract two random long longs (guaranteed not to be pre-comp)
- Instruction: random_longlong = larger-smaller;
- Time: 150 ns

Addition (Test 24)

Test (a): Time required to add two constant long longs

Test (b): Time required to add two random long longs

This test determines the length of time required by the PIC32 chip to add one 64-bit number (a long long) to another 64-bit number (a long long).

Test (a): Time required to add two constant long longs (may be pre-computed)
- Instruction: longlong1 = 17179800000LL+179869184LL;
- Time: 88 ns
Test (b): Time required to add two random long longs (guaranteed not to be pre-comp)
- Instruction: random_longlong = larger+smaller;
- Time: 200 ns

Multiplication (Test 25)

Test (a): Time required to multiply two constant long longs

Test (b): Time required to multiply two random long longs

This test determines the length of time required by the PIC32 chip to multiply one 64-bit number (a long long) by another 64-bit number (a long long).

Test (a): Time required to multiply two constant long longs (may be pre-computed)
- Instruction: longlong1 = 171798000LL*7186984LL;
- Time: 74 ns
Test (b): Time required to multiply two random long longs (guaranteed not to be pre-comp)
- Instruction: random_longlong = larger*smaller;
- Time: 398 ns

Division (Test 26)

Test (a): Time required to divide two constant long longs

Test (b): Time required to divide two random long longs

This test determines the length of time required by the PIC32 chip to divide one 64-bit number (a long long) by another 64-bit number (a long long).

Test (a): Time required to divide two constant long longs (may be pre-computed)
- Instruction: longlong1 = 17179800000LL/7179869184LL;
- Time: 74 ns
Test (b): Time required to divide two random long longs (guaranteed not to be pre-comp)
- Instruction: random_longlong = larger/smaller;
- Time: 1724 ns

Square Root (Test 27)

Test (a): Time required to sqrt() a constant long long

Test (b): Time required to sqrt() a random long long

Test (c): Time required to ^(1/2) a constant long long

Test (d): Time required to ^(1/2) a random long long

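This test determines the length of time required by the PIC32 chip to get the square root of one 64-bit number (a long long). Tests (a) and (b) use the 'sqrt()' function, while tests (c) and (d) were intended to raise a number to the 1/2 power. In C, however, ^ is the bitwise XOR operator and the integer expression 1/2 evaluates to 0, so tests (c) and (d) time an XOR with zero rather than a square root.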

Test (a): Time required to sqrt() a constant long long (may be pre-computed)
- Instruction: longlong1 = sqrt(17179800000LL);
- Time: 87 ns
Test (b): Time required to sqrt() a random long long (guaranteed not to be pre-comp)
- Instruction: random_longlong = sqrt(random_longlong1);
- Time: 16311 ns
Test (c): Time required to ^(1/2) a constant long long (may be pre-computed)
- Instruction: longlong1 = 17179800000LL^(1/2);
- Time: 188 ns
Test (d): Time required to ^(1/2) a random long long (guaranteed not to be pre-comp)
- Instruction: random_longlong = random_longlong2^(1/2);
- Time: 74 ns

Sine (Test 28)

Test (a): Time required to take the sine of a constant long long

Test (b): Time required to take the sine of a random long long

This test determines the length of time required by the PIC32 chip to get the sine of one 64-bit number (a long long).

Test (a): Time required to take the sine of a constant long long (may be pre-computed)
- Instruction: longlong1 = sin(1347433747);
- Time: 23837 ns
Test (b): Time required to take the sine of a random long long (guaranteed not to be pre-comp)
- Instruction: random_longlong = sin(random_longlong1);
- Time: 29898 ns

Cosine (Test 29)

Test (a): Time required to take the cosine of a constant long long

Test (b): Time required to take the cosine of a random long long

This test determines the length of time required by the PIC32 chip to get the cosine of one 64-bit number (a long long).

Test (a): Time required to take the cosine of a constant long long (may be pre-computed)
- Instruction: longlong1 = cos(1347433747);
- Time: 24611 ns
Test (b): Time required to take the cosine of a random long long (guaranteed not to be pre-comp)
- Instruction: random_longlong = cos(random_longlong1);
- Time: 30623 ns

float Performance

A float data type, in ANSI C, is a single-precision value represented by 4 bytes, consisting of a sign bit, an 8-bit excess-127 binary exponent, and a 23-bit mantissa. This corresponds to normalized magnitudes of roughly 1.2E-38 to 3.4E+38, either positive or negative; because the sign bit is always present, C has no unsigned floating-point types. The format is covered in depth here: http://msdn.microsoft.com/en-us/library/hd7199ke%28VS.80%29.aspx. Given that there are many different ways to perform a given operation on a float, we've done our best to include several different methods that we feel are representative of normal coding practices. Depending on how the source code is compiled, these different methods may or may not produce different results. Furthermore, unless otherwise noted, all (a) benchmarks are operations on two predefined (and most likely pre-computed) floats and all (b) benchmarks are operations on two random (and most likely not pre-computed) floats. These multiple benchmarks per test exist to illustrate the differences in execution time between operations that the compiler may have evaluated ahead of time and operations the PIC must perform in real time.
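
The three fields of a single-precision float can be pulled apart with a few shifts and masks. The sketch below assumes that float is a 32-bit IEEE 754 value on the target; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    unsigned sign;       /* 1 bit: 0 for positive, 1 for negative */
    unsigned exponent;   /* 8 bits, stored with a bias of 127 */
    uint32_t mantissa;   /* 23 bits of fraction */
} float_fields;

/* Split a float into its sign, biased exponent, and mantissa fields. */
static float_fields split_float(float value)
{
    uint32_t bits;
    float_fields f;

    assert(sizeof(float) == sizeof(uint32_t));   /* assumption of this sketch */
    memcpy(&bits, &value, sizeof bits);          /* copy the raw bit pattern */

    f.sign = (unsigned)(bits >> 31);
    f.exponent = (unsigned)((bits >> 23) & 0xFFu);
    f.mantissa = bits & 0x7FFFFFu;
    return f;
}
```

For example, 1.0 splits into sign 0, exponent 127, and mantissa 0, while -2.0 splits into sign 1, exponent 128, and mantissa 0.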

Subtraction (Test 30)

Test (a): Time required to subtract two constant floats

Test (b): Time required to subtract two random floats

This test determines the length of time required by the PIC32 chip to subtract one 32-bit number (a float) from another 32-bit number (a float).

Test (a): Time required to subtract two constant floats (may be pre-computed)
- Instruction: float1 = 1347433747.0-7889.0;
- Time: 100 ns
Test (b): Time required to subtract two random floats (guaranteed not to be pre-comp)
- Instruction: random_float = larger-smaller;
- Time: 900 ns

Addition (Test 31)

Test (a): Time required to add two constant floats

Test (b): Time required to add two random floats

This test determines the length of time required by the PIC32 chip to add one 32-bit number (a float) to another 32-bit number (a float).

Test (a): Time required to add two constant floats (may be pre-computed)
- Instruction: float1 = 234232397.0+12353235.0;
- Time: 124 ns
Test (b): Time required to add two random floats (guaranteed not to be pre-comp)
- Instruction: random_float = random_float1+random_float2;
- Time: 1024 ns

Multiplication (Test 32)

Test (a): Time required to multiply two constant floats

Test (b): Time required to multiply two random floats

This test determines the length of time required by the PIC32 chip to multiply one 32-bit number (a float) by another 32-bit number (a float).

Test (a): Time required to multiply two constant floats (may be pre-computed)
- Instruction: float1 = 65500.0*650.0;
- Time: 124 ns
Test (b): Time required to multiply two random floats (guaranteed not to be pre-comp)
- Instruction: random_float = random_float1*random_float2;
- Time: 736 ns

Division (Test 33)

Test (a): Time required to divide two constant floats

Test (b): Time required to divide two random floats

This test determines the length of time required by the PIC32 chip to divide one 32-bit number (a float) by another 32-bit number (a float).

Test (a): Time required to divide two constant floats (may be pre-computed)
- Instruction: float1 = 1347433747.0/7889.0;
- Time: 99 ns
Test (b): Time required to divide two random floats (guaranteed not to be pre-comp)
- Instruction: random_float = larger/smaller;
- Time: 1674 ns

Square Root (Test 34)

Test (a): Time required to sqrt() a constant float

Test (b): Time required to sqrt() a random float

This test determines the length of time required by the PIC32 chip to get the square root of one 32-bit number (a float). Tests (a) and (b) use the 'sqrt()' method, while tests (c) and (d) use a number to the 1/2 power.

Test (a): Time required to sqrt() a constant float (may be pre-computed)
- Instruction: float1 = sqrt(1347433747.0);
- Time: 99 ns
Test (b): Time required to sqrt() a random float (guaranteed not to be pre-comp)
- Instruction: random_float = sqrt(random_float1);
- Time: 8636 ns
Test (c): Time required to ^(1/2) a constant float (may be pre-computed)
- Instruction:
- Time: N/A ns*
Test (d): Time required to ^(1/2) a random float (guaranteed not to be pre-comp)
- Instruction:
- Time: N/A ns*

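* We were unable to test these particular operations. C has no exponentiation operator: ^ is the bitwise XOR operator and is not defined for floating-point operands, so an expression such as x^(1/2) does not compile for a float. Raising a float to the 1/2 power requires a library call such as pow() from math.h, which these tests do not cover.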

Sine (Test 35)

Test (a): Time required to take the sine of a constant float

Test (b): Time required to take the sine of a random float

This test determines the length of time required by the PIC32 chip to get the sine of one 32-bit number (a float).

Test (a): Time required to take the sine of a constant float (may be pre-computed)
- Instruction: float1 = sin(1347433747.0);
- Time: 19574 ns
Test (b): Time required to take the sine of a random float (guaranteed not to be pre-comp)
- Instruction: random_float = sin(random_float1);
- Time: 19562 ns

Cosine (Test 36)

Test (a): Time required to take the cosine of a constant float

Test (b): Time required to take the cosine of a random float

This test determines the length of time required by the PIC32 chip to get the cosine of one 32-bit number (a float).

Test (a): Time required to take the cosine of a constant float (may be pre-computed)
- Instruction: float1 = cos(1347433747.0);
- Time: 20311 ns
Test (b): Time required to take the cosine of a random float (guaranteed not to be pre-comp)
- Instruction: random_float = cos(random_float1);
- Time: 20297 ns
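
The original assignment asks for durations normalized by the fastest operation, so that the smallest entry reads 1.00. The following is a minimal sketch of that calculation; normalize_timings is a hypothetical helper, and applying it to the float timings alone (rather than to the whole table) is done here only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Divide every timing by the smallest one, so the fastest entry becomes 1.00.
 * ns holds the measured durations; normalized receives the ratios. */
void normalize_timings(const double *ns, double *normalized, size_t count)
{
    size_t i;
    double fastest;

    assert(count > 0);
    fastest = ns[0];
    for (i = 1; i < count; i++) {
        if (ns[i] < fastest) {
            fastest = ns[i];
        }
    }
    assert(fastest > 0.0);

    for (i = 0; i < count; i++) {
        normalized[i] = ns[i] / fastest;
    }
}
```

With the seven (b) timings above (900, 1024, 736, 1674, 8636, 19562 and 20297 ns), multiplication at 736 ns becomes 1.00, subtraction becomes roughly 1.22 and division roughly 2.27.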

double Performance

A double data type, in ANSI C, is a double-precision value represented by 8 bytes, consisting of a sign bit, an 11-bit excess-1023 binary exponent, and a 52-bit mantissa. This corresponds to magnitudes ranging from approximately 2.22E–308 (the smallest normalized value) to 1.79E+308. The sign bit makes the same range available for negative values; C has no unsigned floating-point types. The format is covered in depth here: http://msdn.microsoft.com/en-us/library/hd7199ke%28VS.80%29.aspx. Given that there are many different ways to perform a given operation on a double, we've done our best to include several different methods that we feel are representative of normal coding practices. Depending on how the source code is compiled, these different methods may or may not produce different results. Furthermore, unless otherwise noted, all (a) benchmarks are operations on predefined (and most likely pre-computed) doubles and all (b) benchmarks are operations on random (and most likely not pre-computed) doubles. These multiple benchmarks per test exist to illustrate the difference in execution time between operations that the compiler may have evaluated ahead of time and operations the PIC must perform in real time.
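The precision and range figures quoted for float and double can be read directly from the implementation's float.h, which is also a quick way to confirm that double really is 64 bits wide under the compiler settings in use. The following is a minimal sketch; the fp_limits type and the two functions are hypothetical names, and the values noted in the usage note assume IEEE 754 formats.

```c
#include <float.h>

/* Precision and range of a floating-point type (hypothetical helper type). */
typedef struct {
    int mantissa_bits;   /* significant bits, including the implicit leading 1 */
    double min_normal;   /* smallest positive normalized value */
    double max_finite;   /* largest finite value */
} fp_limits;

/* Limits of float as reported by float.h. */
fp_limits float_limits(void)
{
    fp_limits l = { FLT_MANT_DIG, FLT_MIN, FLT_MAX };
    return l;
}

/* Limits of double as reported by float.h. */
fp_limits double_limits(void)
{
    fp_limits l = { DBL_MANT_DIG, DBL_MIN, DBL_MAX };
    return l;
}
```

On an IEEE 754 implementation, float_limits() reports 24 significant bits (23 stored plus the implicit one) and double_limits() reports 53 (52 stored plus the implicit one).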

Subtraction (Test 37)

Test (a): Time required to subtract two constant doubles

Test (b): Time required to subtract two random doubles

This test determines the length of time required by the PIC32 chip to subtract one 64-bit number (a double) from another 64-bit number (a double).

Test (a): Time required to subtract two constant doubles (may be pre-computed)
- Instruction: double1 = 17179800000.0-7179869184.0;
- Time: 199 ns
Test (b): Time required to subtract two random doubles (guaranteed not to be pre-comp)
- Instruction: random_double = larger-smaller;*
- Time: 112 ns

*See Code section for more details on how randomness was guaranteed.

Addition (Test 38)

Test (a): Time required to add two constant doubles

Test (b): Time required to add two random doubles

This test determines the length of time required by the PIC32 chip to add one 64-bit number (a double) to another 64-bit number (a double).

Test (a): Time required to add two constant doubles (may be pre-computed)
- Instruction: double1 = 17179800000.0+7179869184.0;
- Time: 199 ns
Test (b): Time required to add two random doubles (guaranteed not to be pre-comp)
- Instruction: random_double = larger+smaller;*
- Time: 1236 ns

*See Code section for more details on how randomness was guaranteed.

Multiplication (Test 39)

Test (a): Time required to multiply two constant doubles

Test (b): Time required to multiply two random doubles

This test determines the length of time required by the PIC32 chip to multiply one 64-bit number (a double) by another 64-bit number (a double).

Test (a): Time required to multiply two constant doubles (may be pre-computed)
- Instruction: double1 = 17179800.0*71798680.0;
- Time: 188 ns
Test (b): Time required to multiply two random doubles (guaranteed not to be pre-comp)
- Instruction: random_double = random_double1*random_double2;*
- Time: 1438 ns

*See Code section for more details on how randomness was guaranteed.

Division (Test 40)

Test (a): Time required to divide two constant doubles

Test (b): Time required to divide two random doubles

This test determines the length of time required by the PIC32 chip to divide one 64-bit number (a double) by another 64-bit number (a double).

Test (a): Time required to divide two constant doubles (may be pre-computed)
- Instruction: double1 = 17179800.0/71798680.0;
- Time: 187 ns
Test (b): Time required to divide two random doubles (guaranteed not to be pre-comp)
- Instruction: random_double = random_double1/random_double2;*
- Time: 3184 ns

*See Code section for more details on how randomness was guaranteed.

Square Root (Test 41)

Test (a): Time required to sqrt() a constant double

Test (b): Time required to sqrt() a random double

This test determines the length of time required by the PIC32 chip to take the square root of a 64-bit number (a double). Benchmarks (a) and (b) use the sqrt() function, while benchmarks (c) and (d) raise the operand to the 1/2 power.

Test (a): Time required to sqrt() a constant double (may be pre-computed)
- Instruction: double1 = sqrt(33359738444.0);
- Time: 188 ns
Test (b): Time required to sqrt() a random double (guaranteed not to be pre-comp)
- Instruction: random_double = sqrt(random_double1);
- Time: 7998 ns
Test (c): Time required to ^(1/2) a constant double (may be pre-computed)
- Instruction:
- Time: N/A ns *
Test (d): Time required to ^(1/2) a random double (guaranteed not to be pre-comp)
- Instruction:
- Time: N/A ns *

* We were unable to test these particular operations. C has no exponentiation operator: ^ is the bitwise XOR operator and is not defined for floating-point operands, so an expression such as x^(1/2) does not compile for a double. Raising a double to the 1/2 power requires a library call such as pow() from math.h, which these tests do not cover.

Sine (Test 42)

Test (a): Time required to take the sine of a constant double

Test (b): Time required to take the sine of a random double

This test determines the length of time required by the PIC32 chip to take the sine of a 64-bit number (a double).

Test (a): Time required to take the sine of a constant double (may be pre-computed)
- Instruction: double1 = sin(33359738444.0);
- Time: 20299 ns
Test (b): Time required to take the sine of a random double (guaranteed not to be pre-comp)
- Instruction: random_double = sin(random_double1);*
- Time: 20624 ns

*See Code section for more details on how randomness was guaranteed.

Cosine (Test 43)

Test (a): Time required to take the cosine of a constant double

Test (b): Time required to take the cosine of a random double

This test determines the length of time required by the PIC32 chip to get the cosine of one 64-bit number (a double).

Test (a): Time required to take the cosine of a constant double (may be pre-computed)
- Instruction: double1 = cos(33359738444.0);
- Time: 19762 ns
Test (b): Time required to take the cosine of a random double (guaranteed not to be pre-comp)
- Instruction: random_double = cos(random_double1);*
- Time: 20011 ns

*See Code section for more details on how randomness was guaranteed.
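
Expressing the measured durations in clock cycles makes them easier to compare with other clock speeds. The following is a minimal sketch, assuming the 80 MHz system clock named in the original assignment; ns_to_cycles is a hypothetical helper, and the result still includes whatever loads and stores surround the operation itself.

```c
/* Convert a measured duration in nanoseconds to CPU clock cycles.
 * clock_hz is the system clock frequency in hertz
 * (assumed to be 80 MHz for the PIC32 used here). */
double ns_to_cycles(double duration_ns, double clock_hz)
{
    return duration_ns * clock_hz / 1e9;
}
```

At 80 MHz one cycle lasts 12.5 ns, so the 1236 ns random double addition corresponds to about 99 cycles and the 3184 ns random double division to about 255 cycles.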

Code

There is a C file for each test that was performed. They are included in a .zip file at the bottom of this section and can be downloaded for further review. Below is an example of the code, in this case for testing the multiplication of two chars.

 /*******************************************************************************
 * PIC32 Benchmarking Test Suite
 * 
 * Test 4 - char Multiplication Duration
 * Version 1.0
 * Copyright (C) 2010 Todd H. Poole, Katy Powers, Max Willer
 * 
 * This test determines the length of time required by the PIC32 chip to multiply
 * one 8-bit number (a char) by another 8-bit number (a char).
 * 
 * A char data type, in ANSI C, is a value holding one byte, or one character
 * code. The actual number of bits in a char in a particular implementation is
 * documented as CHAR_BIT in that implementation's limits.h file. In practice,
 * it is almost always 8 bits, corresponding to a decimal range of 0 to 255
 * for an unsigned char (or -128 to 127 if plain char is signed).
 *
 * Given that there are many different ways to multiply one char by another,
 * we've done our best to include several different multiplication methods that
 * we feel are representative of normal coding practices. Depending on how the
 * source code is compiled, these different methods may or may not produce
 * different results.
 *******************************************************************************/
  
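 	 #include <stdlib.h> //rand(), srand() and RAND_MAX
 	 #include <time.h> //time(), used to seed the random number generator
 	 #include "HardwareProfile.h" 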
 	 #define PIN_A2 LATAbits.LATA2 //Define the specific pin we'll use for our tests
 
 	char ascii_225;
 	char ascii_200;
 	char random_char;
 
 	float zero_to_1;
 	float zero_to_15;
 	int int_zero_to_15;
 	int random_int1;
 	int random_int2;
 	char random_char1;
 	char random_char2;
 	char larger;
 	char smaller;
 
  	int main(void)
 	{
 		SYSTEMConfigPerformance(SYS_FREQ);
 		TRISAbits.TRISA2 = 0; //Setting our testing pin to be an output
 	
 	/*Seed the random number generator so we don't get the same 'random'
 	* numbers over and over again. This is just to ensure that our
 	* C-compiler doesn't try to pull any optimization tricks on us.
 	*/
 		srand(time(NULL));
 	
 	//Finding our first random char
 	
 	/*Take the output of rand() - which is a number [0,RAND_MAX] - and
 	* divide that output by RAND_MAX+1. We now have a number that ranges
 	* from [0,1). Multiply this result by 16. This will give us a random
 	* number in the range of [0,16). (I chose 16 because the maximum value
 	* a char can be is 255. Having our range run from 0 to 16 exclusive will
 	* ensure that our ints - and, later, our chars - will range from [0,15].
 	* Thus, regardless of what values our two ints/chars take, the final 
 	* multiplication will fit in a char, which range from 0 to 255 in decimal).
 	* Cast this result into an int, which will truncate (cut off) the digits
 	* after the decimal. Thus:
 	* 0.00001 through 0.99999 will cast to 0.
 	* 15.00001 through 15.99999 will cast to 15.
 	* Then, cast our int from an int into a char.
 	* PS: Yes, I know this doesn't give us a uniform probability distribution, but
 	* we're not too concerned with true randomness... just enough to ensure nothing
 	* is computed ahead of time by the C-compiler. 
 	*/
 
 		zero_to_1 = rand() / (RAND_MAX + 1.0f); //float in [0,1), barring rounding; the 1.0f forces floating-point division
 		zero_to_15 = zero_to_1 * 16; //float that ranges from [0,16)
 		int_zero_to_15 = zero_to_15; //int that ranges from [0,15]
 		random_int1 = int_zero_to_15; //our first random int
 		random_char1 = int_zero_to_15; //our first random char
 
 	//Finding our second random char 
 	
 		zero_to_1 = rand() / (RAND_MAX + 1.0f); //float in [0,1), barring rounding; the 1.0f forces floating-point division
 		zero_to_15 = zero_to_1 * 16; //float that ranges from [0,16)
 		int_zero_to_15 = zero_to_15; //int that ranges from [0,15]
 		random_int2 = int_zero_to_15; //our second random int
 		random_char2 = int_zero_to_15; //our second random char
 	
 	/*By forcing all of our operations to depend on the results of randomly
 	* generated numbers, we've guaranteed that our C-compiler won't be able to
 	* interfere with our benchmarks by trying to compute our results in advance.
 	* This ensures that all operations are performed on the fly by the PIC, and
 	* that its performance will be similar to what one might encounter in other
 	* projects where all variables and data are not completely known in advance.
 	*/
 	
 	//Time to start the actual benchmarking
 	
 		while(1)
 		{
 		/*We start our while loop off like this so that we can create an easily
 		* recognizable pattern on the oscilloscope. We need to be able to
 		* differentiate between the various tests, and so, after seeing this unique
 		* pattern, we'll know that the next test to follow will be the first test, the
 		* test after that will be the second, etc.
 		*/ 
 		
 		PIN_A2 = 1;
 		PIN_A2 = 0;
 		PIN_A2 = 0;
 		PIN_A2 = 0;
 		PIN_A2 = 1;
 		PIN_A2 = 0;
 		
 		//Test 4(a) - Multiplication of Two Constant chars (may be pre-computed)
 		PIN_A2 = 1;
 		ascii_225 = 'K'*'?'; 
 		PIN_A2 = 0;
 		
 		//Test 4(b) - Multiplication of Two ints Cast into a char (may be pre-computed)
 		PIN_A2 = 1;
 		ascii_200 = 100*2;
 		PIN_A2 = 0;
 		
 		//Test 4(c) - Multiplication of Two Random chars (guaranteed not to be pre-comp)
 		PIN_A2 = 1;
 		random_char = random_char1*random_char2;
 		PIN_A2 = 0;
 		PIN_A2 = 1;		
 		}
  	}
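
The scaling step described in the comments (divide the output of rand() by RAND_MAX+1, then multiply by 16) only yields a fraction if the division is carried out in floating point; with integer operands, rand() / RAND_MAX is 0 for every result except RAND_MAX itself. The following is a minimal sketch of the intended mapping, using double arithmetic so that the result stays strictly below 16; random_0_to_15 is a hypothetical helper and is not part of the downloadable test files.

```c
#include <stdlib.h>

/* Map one rand() result to an integer in [0,15].
 * RAND_MAX + 1.0 is a double, so the quotient is a fraction in [0,1)
 * rather than a truncated integer. */
int random_0_to_15(void)
{
    double zero_to_1 = rand() / (RAND_MAX + 1.0);   /* [0,1)  */
    double zero_to_16 = zero_to_1 * 16.0;           /* [0,16) */
    return (int)zero_to_16;                         /* truncates toward zero */
}
```

Because both operands are at most 15, the product of two such values is at most 225, which fits in an 8-bit unsigned char.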

Media:Kp_mw_tp_ME333PIC32Benchmarking.zip‎

Original Assignment

Your assignment is to empirically test how long it takes to perform add, subtract, multiply, divide, sqrt, sin, and cos operations with the 80 MHz PIC32MX460F512L and our standard code optimization setting. You will do these tests with chars (8-bit integers), shorts (16-bit), integers (32-bit), long long integers (64-bit), floats (32-bit single-precision floating point), and doubles (64-bit double-precision floating point). The integers can be unsigned or signed. Your end result will be a table with the operation on one axis (likely the horizontal axis) and the kind of variable on the other axis, and each cell of the table will have a normalized duration for the operation. The time will be normalized by the fastest operation, so the smallest number in the table will be 1.00. All other numbers will indicate how many times longer that operation takes. All numbers will have two decimal places, e.g., 2.57 or 24.72. You will also give the time that 1.00 corresponds to in nanoseconds.

Since bit-shifting left and right correspond to a version of multiplying and dividing, you should also include the operations >>1 and >>4 and <<1 and <<4. (If the results are identical, you can eliminate shift left from your table.)

To generate this table, you can set an output bit low before the operation, then high immediately after the operation, and measure the time on an oscilloscope. Two things to consider: (1) Time a single operation, over and over, with a short delay between operations. This should create a pulse train on your oscilloscope. Can you get an accurate estimate of the time this way? You could also try doing five or ten operations between changing the digital output. See if this gives the same estimate. (This estimate might be more accurate as you are essentially averaging over a number of operations.) Avoid using arrays and for loops in your test, as indexing arrays and running the loop each take time. (2) Make sure the compiler doesn't compute the results in advance. You could try testing operations with numbers generated randomly (don't time this operation!) vs. numbers that you just type in manually to make sure that both are giving you the same result.
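
The averaging idea in point (1), combined with subtracting the pin-toggle overhead described in the Overview, can be sketched as a small calculation. This is a minimal sketch under stated assumptions: per_operation_ns is a hypothetical helper, and the figures in the usage note are made-up inputs chosen for easy arithmetic, not measurements.

```c
#include <assert.h>

/* Estimate the duration of one operation from a pulse that brackets
 * `count` back-to-back operations, after removing the time taken to
 * raise and lower the output pin. */
double per_operation_ns(double pulse_ns, double overhead_ns, unsigned count)
{
    assert(count > 0);
    return (pulse_ns - overhead_ns) / (double)count;
}
```

For instance, a hypothetical 7400 ns pulse containing ten operations, with 40 ns of pin overhead, would give per_operation_ns(7400.0, 40.0, 10) = 736 ns per operation.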
