Difference between revisions of "Real-Time Linux for TrackCam"

Latest revision as of 21:41, 20 June 2010

Overview

Project by: James Yeung, Master in Electrical and Computer Engineering, 2010.
Last updated: June 11, 2010

The goal of this project was to install and set up an operating system with real-time capabilities to work with Photonfocus' TrackCam and SiliconSoftware's MicroEnable Frame Grabber. This is the continuation of earlier work done with the same hardware but on Windows instead.

So what is a real-time operating system? A key characteristic of a real-time OS is how consistent it is in the amount of time it takes to accept and complete an application's task; the variability in that time is called jitter. A hard real-time operating system has less jitter than a soft real-time operating system. The chief design goal is not high throughput, but rather a guarantee of a soft or hard performance category. A real-time OS that can usually or generally meet a deadline is a soft real-time OS, but if it can meet a deadline deterministically it is a hard real-time OS.
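As a rough illustration of jitter, the sketch below takes a set of measured latencies and reports the spread between the smallest and the largest one. The function name latency_jitter and the max-minus-min definition are assumptions made for this example; other definitions (standard deviation, for instance) are also in use.

```c
#include <stddef.h>

/* Jitter of a set of latency samples, taken here as max - min (in the
 * same unit as the samples). Returns 0 for an empty set. */
long latency_jitter(const long *samples, size_t n)
{
    if (n == 0)
        return 0;

    long min = samples[0];
    long max = samples[0];
    for (size_t i = 1; i < n; i++) {
        if (samples[i] < min)
            min = samples[i];
        if (samples[i] > max)
            max = samples[i];
    }
    return max - min;
}
```

With this definition, a system whose wakeup latencies stay between 5us and 7us has a jitter of 2us, while a single 30us outlier raises the jitter to 25us.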

Implementations of Real-Time Operating Systems

There are currently two main methods of implementing a real-time operating system: a micro-kernel approach and a scheduling approach. We will be using an operating system that implements the scheduling approach because that is what the makers of TrackCam support.

Micro-Kernel

In the micro-kernel approach, there is a very small, simple, real-time operating system underneath the main operating system. The main operating system becomes a task that runs only when there is no real-time task to run, and the micro-kernel will pre-empt the main operating system whenever a real-time task needs the processor. RTAI and RTLinux (not to be confused with the linux-rt patch) are examples of this approach.

Scheduling

In the scheduling approach, the operating system has a scheduling policy where when a task starts running, it continues to run until it voluntarily yields the processor, blocks or is preempted by a higher-priority real-time task. This is the first-in-first-out policy. Another common policy uses a timeslice model where tasks are allotted timeslices based on their priority and run until they exhaust their timeslice. The -rt Linux patch is an example of the scheduling implementation.
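On Linux, the first-in-first-out policy is selected with the POSIX constant SCHED_FIFO, and SCHED_RR is the POSIX round-robin policy, which adds a timeslice among tasks of equal priority. The sketch below only names the policies and reports which one the calling process is using; the helper names policy_name and current_policy_name are made up for this example.

```c
#include <sched.h>

/* Human-readable name for a POSIX scheduling policy constant. */
const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_FIFO:  return "SCHED_FIFO";  /* runs until it yields, blocks or is preempted */
    case SCHED_RR:    return "SCHED_RR";    /* like FIFO, plus a timeslice among equals */
    case SCHED_OTHER: return "SCHED_OTHER"; /* default time-sharing policy */
    default:          return "unknown";
    }
}

/* Policy of the calling process; sched_getscheduler(0) queries the caller.
 * An error return (-1) falls through to "unknown". */
const char *current_policy_name(void)
{
    return policy_name(sched_getscheduler(0));
}
```

A process that has not changed its policy would normally report SCHED_OTHER.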

The scheduler in linux-rt has a total of 140 priority levels, numbered 0 to 139. The lower the level, the higher the priority. Levels 100 to 139 map to the -20 to 19 niceness values. Nice is a program that lets you manually set the priority of a particular process, but it only gives you access to these 40 lowest-priority levels. If you want access to the levels below 100 (the real-time priorities), you will need to use the function "sched_setscheduler" declared in the "sched.h" header (see example code below). Note that the sched_priority value passed to that function runs the other way: for SCHED_FIFO it ranges from 1 (lowest real-time priority) to 99 (highest real-time priority).
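The mapping between user-visible values and the kernel's internal levels can be written down as two small formulas. This is a sketch based on how the Linux kernel numbers its priorities (0 is most urgent, 139 least urgent); the function names are invented for this example.

```c
/* Kernel-internal level for a SCHED_FIFO / SCHED_RR task.
 * sched_priority 1..99 maps to levels 98..0, so a larger
 * sched_priority means a more urgent task. */
int level_from_rt_priority(int sched_priority)
{
    return 99 - sched_priority;
}

/* Kernel-internal level for a normal (SCHED_OTHER) task.
 * nice -20..19 maps to levels 100..139. */
int level_from_nice(int nice_value)
{
    return 120 + nice_value;
}
```

For instance, sched_priority 99 lands on level 0, sched_priority 1 on level 98, and the default niceness of 0 on level 120.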

Our Setup

Since SiliconSoftware provides drivers and support for the -rt Linux operating system, we will be using it for this project. Below is a list of hardware that we will be using.

Computer

CPU - Intel Core 2 Quad Q8400 (2.66 GHz)
RAM - Crucial 4 GB DDR2 800 (PC2 6400)
Motherboard - Supermicro MBD-C2SBE-O
Hard Drive - 500 GB Seagate Barracuda 7200.12, 7200 rpm
Video Card - EVGA 256-P2-N768-Fr GeForce 8600 GTS

Camera

PhotonFocus MV-D1024-TrackCam
SiliconSoftware MicroEnable III Frame Grabber

How To Setup Linux-rt On openSUSE

Overview

For those who are not familiar with Linux, or with how an operating system works, here is a quick rundown of the basics. Strictly speaking, Linux is just a kernel: the piece of software that handles the interaction between hardware and applications. The different Linux distributions such as Ubuntu, Fedora, openSUSE and Debian all use basically the same kernel, with a few tweaks here and there. The main difference between them is the software layered on top of the kernel, such as the GUI, which gives each of them its distinct look and feel. We will be using openSUSE because it has been tested by SiliconSoftware with their MicroEnable Frame Grabber.

Instructions

Use an openSUSE Live CD to install a fresh copy of openSUSE.
- In this guide, we will be using a 64-bit version.
- Follow on screen instructions and note the root password that you set.
- When prompted about partition setup, be sure to use file system format “ext3” for the swap and home partitions. The version of the kernel that we will be building does not properly support “ext4” (or higher).
Check which is the newest kernel version supported by linux-rt and by the other applications/drivers that you will be using.
- For linux-rt, you can check here: http://www.kernel.org/pub/linux/kernel/projects/rt/
- In this guide, we will be using version 2.6.24.7, the highest kernel version that our frame grabber software was tested on.
Download the vanilla kernel and the linux-rt patch
- Open a terminal window. GNOME Terminal would do.
- Go into superuser mode. You will need to know the root password.
- Go into the directory where source code is commonly stored.
- Download the vanilla kernel.
- Download the linux-rt patch for the matching kernel.
Unpack the packages
Make symbolic link to new directory
Copy the config file provided with the menable driver.
Install needed packages
Apply the linux-rt patch. (Note that p1 has a one, not L)
Make oldconfig

When prompted, use default by pressing enter.

Configure the config file through gconfig

Make sure the following sections are untouched:
1. Processor type and features
2. Bus options
3. Kernel hacking
If you know which modules are needed for your hardware, enable them.
If you don't know which modules are needed for your hardware, compile the kernel and error messages from your attempt to install will give you a better idea of which modules need to be enabled.

Save and close out of gconfig
Need to change “getline” to “parseline” in scripts/unifdef.c (There are 3 getline's)
- Type “vi scripts/unifdef.c” to view and edit the file.
- Type “/getline” to search for “getline”
- Hit “i” to get into insert mode
- Change “getline” to “parseline”
- Hit the “Escape” key to get out of insert mode
- Search again until all getlines are changed
- Type “:wq” to save and quit
Need to change “=r” to “=q” in arch/x86/boot/boot.h
- Type “vi arch/x86/boot/boot.h”
- Type “112” to get to line 112
- Hit “i” to get into insert mode
- Change “=r” to “=q”
- Hit the “Escape” key to get out of insert mode
- Type “:wq” to save and quit
Compile & install kernel. (-j 4 uses all 4 CPU cores; the build should take about 15 minutes)

If you didn't enable all required modules, you will get error messages hinting which ones you need here.

Make sure “fstab” has correct paths.

The first couple of lines should look something like this:
If not, change the part after “/dev/” to “sda#” where # is the corresponding number to “-part#”.

You'll need to change the grub config file to match the changes in fstab.
- Grub is the software that controls which OS to boot into during boot up.
- Files to configure Grub are located at /boot/grub/
- You may also want to edit the menu.lst file to your liking.

How To Install MicroEnable Frame Grabber Drivers/Software

This guide assumes that you have downloaded and untarred the following files under /home/lims/Download/menable/
- menable_linuxdrv_3.9.10.tar.bz2
- siso-rt3-meIII-3.2.1-2.i586.rpm
- siso-rt-basesystem-1.0.0-1.i586.rpm
We will be using menable_linuxdrv_3.9.10 with the 2.6.23.7-rt17 kernel.

Drivers

Go into the root of the driver folder.
We need to copy the binary objects from the subdirectory matching our kernel and architecture.
Compile
Install the compiled driver
Confirm the install

The output should look like this:

Software

Go into the directory with the RPM packages.
Install the two rpm packages.
Confirm the install
- Use the following commands to see where the files should have been installed.
- Go into the directories and see if they are there.

How To Install OpenCV on Linux

Make sure the following packages are installed.
- gcc (version 4.x)
- cmake (version 2.6 or higher)
- pkg-config
Download module “FindOpenCV.cmake” and add it to cmake.
- Module can be downloaded via: http://opencv.willowgarage.com/wiki/Getting_started?action=AttachFile&do=view&target=FindOpenCV.cmake
- Download and place file into /usr/share/cmake/Modules.
Download OpenCV source code, configure, compile and install.
- Source can be downloaded via: http://sourceforge.net/projects/opencvlibrary/files/opencv-unix/2.0/
- ./configure
- make
- make install
(Optional) Go through the “Hello World” tutorial to make sure it works.
- http://opencv.willowgarage.com/wiki/Getting_started

Benchmarking

This section was an attempt to see how "real-time" the operating system is.

Cyclictest

Cyclictest is a program available on the rt-wiki for testing the latency of commands on your operating system. All the code and instructions are available on their website.

http://rt.wiki.kernel.org/index.php/Cyclictest

Homemade Test

I found a "Hello World" program for real-time applications on the -rt Linux patch online. I then modified the code so that it runs under SCHED_FIFO with priority level 1 (the lowest real-time priority, which still runs ahead of every normal task) and takes a snapshot of the timer every 10ms.

To compile: g++ main.cpp -o main -lrt

#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <sys/time.h>
#include <sched.h>
#include <sys/mman.h>
#include <string.h>

#define MY_PRIORITY (1) /* SCHED_FIFO priority used in this test;
                           PREEMPT_RT uses 50 as the priority of
                           kernel tasklets and interrupt handlers
                           by default */

#define MAX_SAFE_STACK (8*1024) /* The maximum stack size which is
                                   guaranteed safe to access without
                                   faulting */

#define NSEC_PER_SEC (1000000000) /* The number of nsecs per sec. */

void stack_prefault(void) {
    unsigned char dummy[MAX_SAFE_STACK];
    memset(&dummy, 0, MAX_SAFE_STACK);
    return;
}

int main(int argc, char* argv[]) {

    //int time[1002];
    //int rc[1002];

    struct timespec t;
    struct timespec ti[1002];
    struct timespec tt[1002];
    struct timespec tf[1002];
    struct sched_param param;
    int interval = 10000000; /* 10ms*/

    /* Declare ourself as a real time task */

    param.sched_priority = MY_PRIORITY;
    if(sched_setscheduler(0, SCHED_FIFO, &param) == -1) {
        perror("sched_setscheduler failed");
        exit(-1);
    }

    /* Lock memory */
 
    if(mlockall(MCL_CURRENT|MCL_FUTURE) == -1) {
        perror("mlockall failed");
        exit(-2);
    }

    /* Pre-fault our stack */
 
    stack_prefault();

    clock_gettime(CLOCK_MONOTONIC ,&t);
    /* start after one second */
    t.tv_sec++;
 
    int ii = 0;
    while(ii < 1002) {
    
        tt[ii].tv_nsec = t.tv_nsec; //target time stamp
        clock_gettime(CLOCK_MONOTONIC ,&ti[ii]); //time before releasing CPU

        /* release CPU until target time stamp is reached */
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &t, NULL);
        
        /* do the stuff */
        clock_gettime(CLOCK_MONOTONIC ,&tf[ii]); //time right after waking up
        ii++;

        /* calculate next target */
        t.tv_nsec += interval;
        while (t.tv_nsec >= NSEC_PER_SEC) {
            t.tv_nsec -= NSEC_PER_SEC;
            t.tv_sec++;
        }
    }
    for(int ii = 1; ii < 1002; ii++){
        /* tv_nsec is a long, so print it with %ld */
        printf("%ld %ld %ld %ld\t%ld\n", ti[ii].tv_nsec, tt[ii].tv_nsec, tf[ii].tv_nsec, tf[ii].tv_nsec - tt[ii].tv_nsec, tf[ii].tv_nsec - tf[ii-1].tv_nsec);
    }
}
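The listing above prints differences of the tv_nsec fields only, so a pair of time stamps that straddles a second boundary shows up as a large negative number. A possible helper that uses both fields is sketched here; the name timespec_diff_ns is made up for this example and is not part of the original test.

```c
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Signed difference (a - b) in nanoseconds, using both tv_sec and
 * tv_nsec so the result stays correct across a second boundary. */
long long timespec_diff_ns(const struct timespec *a, const struct timespec *b)
{
    return (long long)(a->tv_sec - b->tv_sec) * NSEC_PER_SEC
         + (a->tv_nsec - b->tv_nsec);
}
```

Calling timespec_diff_ns(&tf[ii], &tf[ii-1]) in place of the raw tv_nsec subtraction would give the interval between consecutive wakeups even when the seconds counter has advanced.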

PThread Test

This is a test that uses multiple threads where only one is set with a high priority and the rest are normal priority. The high priority thread also reads one analog signal from a data acquisition card and writes one analog signal to an output card. In this particular test I had four normal priority threads, each continuously performing one of the basic math operations (+, -, *, /). This was then tested on a system with a quad core CPU. The four normal priority threads are there to ensure that there are more than four active tasks for the CPU to handle. In a true setup, a total of two threads may be enough for your needs. More details about the cards used in this test can be found here.
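The background-load part of this test can be sketched on its own, without the data acquisition card. The version below is an assumption-laden simplification: it uses a single C11 atomic stop flag instead of one plain int per thread, counts loop iterations instead of doing arithmetic on a shared variable, and sleeps where the real-time thread would run. The names load_worker and run_background_load are invented for this example.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

#define NUM_WORKERS 4

static atomic_int stop_workers;

/* Normal-priority busy loop. It counts its iterations so the caller
 * can confirm that the thread really ran. */
static void *load_worker(void *arg)
{
    unsigned long *iterations = arg;
    volatile double acc = 0.0;

    do {
        acc = acc + 1.0;
        (*iterations)++;
    } while (!atomic_load(&stop_workers));
    return NULL;
}

/* Run NUM_WORKERS load threads for roughly run_ms milliseconds.
 * Returns 0 on success, -1 if a thread could not be created. */
int run_background_load(long run_ms, unsigned long counts[NUM_WORKERS])
{
    pthread_t thr[NUM_WORKERS];
    struct timespec nap = { .tv_sec = run_ms / 1000,
                            .tv_nsec = (run_ms % 1000) * 1000000L };
    int started = 0;

    atomic_store(&stop_workers, 0);
    for (int i = 0; i < NUM_WORKERS; i++) {
        counts[i] = 0;
        if (pthread_create(&thr[i], NULL, load_worker, &counts[i]) != 0)
            break;
        started++;
    }

    nanosleep(&nap, NULL);   /* the real-time thread would do its work here */

    atomic_store(&stop_workers, 1);
    for (int i = 0; i < started; i++)
        pthread_join(thr[i], NULL);
    return started == NUM_WORKERS ? 0 : -1;
}
```

Depending on the C library, this may need to be compiled with -pthread. A single shared flag also removes the need to keep four separate end variables in sync with four separate loops.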

The output of the results will be in the form:
interval = nanoseconds per interrupt
duration = number of times to trigger
starttime = system time of first trigger
maxDelay = maximum delay in microseconds experienced in this setup (>99 shows up as 99)
bins[x] = number of occurrences with x microseconds of delay in this setup
earlytime[for the xth trigger] = actual wakeup time (t = target wakeup time) value = microseconds of delay
badtime[xth >30us delay] = actual wakeup time (t = target wakeup time) value = microseconds of delay in this setup
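The bins[x] and maxDelay fields follow from two small rules used in the listing below: each delay is clamped into the range 0 to 99 before it is counted, and maxDelay is the highest bin that received at least one count. A stand-alone sketch of those two rules, with function names invented for this example:

```c
#define NUM_BINS 100

/* Histogram slot for a delay in microseconds: negative values count
 * as 0, and anything above NUM_BINS - 1 is folded into the last slot. */
int delay_to_bin(long delay_us)
{
    if (delay_us < 0)
        return 0;
    if (delay_us > NUM_BINS - 1)
        return NUM_BINS - 1;
    return (int)delay_us;
}

/* Highest occupied slot, which the test reports as maxDelay. */
int max_occupied_bin(const int bins[NUM_BINS])
{
    int max = 0;
    for (int i = 0; i < NUM_BINS; i++) {
        if (bins[i] > 0)
            max = i;
    }
    return max;
}
```

This is why a reported maxDelay of 99 only means "99us or more": the true value of an outlier has to be read from the badtime entries.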

You can download a sample of results here.

To compile: gcc main.c -o main -lrt -lnidaqmx

#define _REENTRANT
#define NSEC_PER_SEC (1000000000) /* The number of nsecs per sec. */
#define MY_PRIORITY (2) /* SCHED_FIFO priority used in this test;
			PREEMPT_RT uses 50 as the priority of
			kernel tasklets and interrupt handlers
			by default */
#define MAX_SAFE_STACK (8*1024) /* The maximum stack size which is
				guaranteed safe to access without
				faulting */
#define DAQmxErrChk(functionCall) if(DAQmxFailed(error=functionCall)) goto Error; else
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/time.h>
#include <sched.h>
#include <sys/mman.h>
#include <string.h>
#include <NIDAQmx.h>

/* function prototypes */
void* rtTask1( );
void* Task2( );
void* Task3( );
void* Task4( );
void* Task5( );

void stack_prefault(void) {
	unsigned char dummy[MAX_SAFE_STACK];
	memset(&dummy, 0, MAX_SAFE_STACK);
	return;
}

/* mutex variables */
pthread_mutex_t printfLock = PTHREAD_MUTEX_INITIALIZER;

int end2 = 0;
int end3 = 0;
int end4 = 0;
int end5 = 0;
double temp = 0;

int interval = 100000;		//100us "interrupt" interval
int duration = 10000*60*60*1;	//10khz for 1 hour

/* global variables for input/output */
TaskHandle inputTaskHandle=0;
TaskHandle outputTaskHandle=0;
int32 error=0;
int32 read;
float64 data;
char errBuff[2048]={'\0'};

int main( void ){
	pthread_t thr1, thr2, thr3, thr4, thr5;
	struct timespec t[14];

	/*********************************************/
	// DAQmx Configure Code
	/*********************************************/
	DAQmxErrChk(DAQmxCreateTask("AnalogIn",&inputTaskHandle));
	DAQmxErrChk(DAQmxCreateTask("AnalogOut",&outputTaskHandle));
	DAQmxErrChk(DAQmxCreateAIVoltageChan(inputTaskHandle,"AnalogIn/ai0","TestChannel",DAQmx_Val_RSE,-10.0,10.0,DAQmx_Val_Volts,NULL));
	DAQmxErrChk(DAQmxCreateAOVoltageChan(outputTaskHandle,"AnalogOut/ao6","",-10.0,10.0,DAQmx_Val_Volts,""));

	/*********************************************/
	// DAQmx Start Code
	/*********************************************/
	DAQmxErrChk(DAQmxStartTask(inputTaskHandle));
	DAQmxErrChk(DAQmxStartTask(outputTaskHandle));

	clock_gettime(CLOCK_MONOTONIC ,&t[0]);
	pthread_create( &thr1, NULL, rtTask1, NULL );
	clock_gettime(CLOCK_MONOTONIC ,&t[1]);
	pthread_create( &thr2, NULL, Task2, NULL );
	clock_gettime(CLOCK_MONOTONIC ,&t[2]);
	pthread_create( &thr3, NULL, Task3, NULL );
	clock_gettime(CLOCK_MONOTONIC ,&t[3]);
	pthread_create( &thr4, NULL, Task4, NULL );
	clock_gettime(CLOCK_MONOTONIC ,&t[4]);
	pthread_create( &thr5, NULL, Task5, NULL );
	clock_gettime(CLOCK_MONOTONIC ,&t[5]);

	pthread_join( thr1, NULL );
	interval = 100000;		//100us "interrupt" interval
	duration = 10000*60*60*5;	//10khz for 5 hours
	clock_gettime(CLOCK_MONOTONIC ,&t[6]);
	pthread_create( &thr1, NULL, rtTask1, NULL );
	clock_gettime(CLOCK_MONOTONIC ,&t[7]);

	pthread_join( thr1, NULL );
	interval = 200000;		//200us "interrupt" interval
	duration = 5000*60*60*5;	//5khz for 5 hours
	clock_gettime(CLOCK_MONOTONIC ,&t[8]);
	pthread_create( &thr1, NULL, rtTask1, NULL );
	clock_gettime(CLOCK_MONOTONIC ,&t[9]);

	pthread_join( thr1, NULL );
	interval = 500000;		//500us "interrupt" interval
	duration = 2000*60*60*5;	//2khz for 5 hours
	clock_gettime(CLOCK_MONOTONIC ,&t[10]);
	pthread_create( &thr1, NULL, rtTask1, NULL );
	clock_gettime(CLOCK_MONOTONIC ,&t[11]);

	pthread_join( thr1, NULL );
	interval = 1000000;		//1ms "interrupt" interval
	duration = 1000*60*60*5;	//1khz for 5 hours
	clock_gettime(CLOCK_MONOTONIC ,&t[12]);
	pthread_create( &thr1, NULL, rtTask1, NULL );
	clock_gettime(CLOCK_MONOTONIC ,&t[13]);

	pthread_join( thr1, NULL );

	end2 = 1;
	end3 = 1;
	end4 = 1;
	end5 = 1;

	pthread_join( thr2, NULL );
	pthread_join( thr3, NULL );
	pthread_join( thr4, NULL );
	pthread_join( thr5, NULL );

	FILE *file;
	file = fopen("results","a+");

	int ii;
	for(ii = 1; ii < 14; ii++){
		fprintf(file, "Task%d %9ld %9ld\n", ii, t[ii].tv_nsec, t[ii].tv_nsec - t[ii-1].tv_nsec);
	}
	fprintf(file, "\n========================================\n");
	fclose(file);

	return 0;

Error:
	if( DAQmxFailed(error) ){
		DAQmxGetExtendedErrorInfo(errBuff,2048);
	}
	if( inputTaskHandle!=0 ){
		DAQmxStopTask(inputTaskHandle);
		DAQmxClearTask(inputTaskHandle);
	}
	if( outputTaskHandle!=0){
		DAQmxStopTask(outputTaskHandle);
		DAQmxClearTask(outputTaskHandle);
	}
	if( DAQmxFailed(error) ){
		printf("DAQmx Error: %s\n",errBuff);
	}
	return 0;
}

void* rtTask1(){

	int bins[100];
	int i, ii, jj;
	for( i = 0; i < 100; i++){
		bins[i] = 0;
	}
	struct timespec t;
	struct timespec timestamp;
	struct timespec starttime;
	struct timespec earlytime[100];
	struct timespec earlytime2[100];
	int earlyvalue[100];
	struct timespec badtime[10000];
	struct timespec badtime2[10000];
	int badvalue[10000];
	struct sched_param param;

	FILE *file;
	file = fopen("results","a+");

	/* Declare ourself as a real time task */
	param.sched_priority = MY_PRIORITY;
	if(sched_setscheduler(0, SCHED_FIFO, &param) == -1) {
		perror("sched_setscheduler failed");
		exit(-1);
	}

	/* Lock memory */
	if(mlockall(MCL_CURRENT|MCL_FUTURE) == -1) {
		perror("mlockall failed");
		exit(-2);
	}

	/* Pre-fault our stack */
	stack_prefault();

	clock_gettime(CLOCK_MONOTONIC,&t);
	/* start after one second */
	t.tv_sec++;
	starttime = t;
	jj = 0;

	for( i = 0; i < duration; ++i ) {
		/* wait until next shot */
		clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &t, NULL);
		clock_gettime(CLOCK_MONOTONIC ,&timestamp);
		if(timestamp.tv_sec == t.tv_sec){
			ii = (int)(((timestamp.tv_nsec - t.tv_nsec)/1000) - 1);
		}else{
			ii = (int)((((timestamp.tv_sec - t.tv_sec) * NSEC_PER_SEC - t.tv_nsec + timestamp.tv_nsec)/1000) - 1);
		}
		// store timestamps if latency > 30
		if(ii > 30){
			if(jj < 10000){
				badtime[jj] = timestamp;
				badtime2[jj] = t;
				badvalue[jj] = ii;
				jj++;
			}
		}
		// store timestamps for first 100 wakeups
		if(i < 100){
			earlytime[i] = timestamp;
			earlytime2[i] = t;
			earlyvalue[i] = ii;
		}
		// cap delay for bins[ii]
		if(ii > 99){
			ii = 99;
		}else if(ii < 0){
			ii = 0;
		}
		bins[ii]++;

		DAQmxErrChk(DAQmxReadAnalogScalarF64(inputTaskHandle,10.0,&data,NULL));
		DAQmxErrChk(DAQmxWriteAnalogScalarF64(outputTaskHandle,1,10.0,data,NULL));

		t.tv_nsec += interval;
		while (t.tv_nsec >= NSEC_PER_SEC) {
			t.tv_nsec -= NSEC_PER_SEC;
			t.tv_sec++;
		}
	}

	fprintf(file, "interval = %d\n", interval);
	fprintf(file, "duration = %d\n", duration);
	fprintf(file, "starttime = %lds\t%ldns\n", (long)starttime.tv_sec, starttime.tv_nsec);
	int maxDelay = 0;
	for( i = 0; i < 100; i++){
		if( bins[i] > 0 ){
			maxDelay = i;
		}
	}
	fprintf(file, "maxDelay = %d\n", maxDelay);
	for( i = 0; i < 100; i++){
		fprintf(file, "bins[%d] = %d\n", i, bins[i]);
	}
	for( i = 0; i < 100; i++){
		fprintf(file, "earlytime[%d] = %lds\t%ldns (t = %lds\t%ldns)\tvalue = %d\n", i, (long)earlytime[i].tv_sec, earlytime[i].tv_nsec, (long)earlytime2[i].tv_sec, earlytime2[i].tv_nsec, earlyvalue[i]);
	}
	for( i = 0; i < jj; i++){
		fprintf(file, "badtime[%d] = %lds\t%ldns (t = %lds\t%ldns)\tvalue = %d\n", i, (long)badtime[i].tv_sec, badtime[i].tv_nsec, (long)badtime2[i].tv_sec, badtime2[i].tv_nsec, badvalue[i]);
	}
	fprintf(file, "\n");
	fclose(file);
	return NULL;

Error:
	if( DAQmxFailed(error) ){
		DAQmxGetExtendedErrorInfo(errBuff,2048);
	}
	if( inputTaskHandle!=0 ){
		DAQmxStopTask(inputTaskHandle);
		DAQmxClearTask(inputTaskHandle);
	}
	if( outputTaskHandle!=0){
		DAQmxStopTask(outputTaskHandle);
		DAQmxClearTask(outputTaskHandle);
	}
	if( DAQmxFailed(error) ){
		printf("DAQmx Error: %s\n",errBuff);
	}
	return NULL;
}

void* Task2(){
	while( end2 == 0 ) {
		temp = temp + rand();
	}
	return NULL;
}

void* Task3(){
	while( end3 == 0 ) {
		temp = temp - rand();
	}
	return NULL;
}

void* Task4(){
	while( end4 == 0 ) {
		temp = temp * rand();
	}
	return NULL;
}

void* Task5(){
	while( end5 == 0 ) {	/* Task5 has its own stop flag */
		temp = temp / rand();
	}
	return NULL;
}

Results

To ensure that the operating system is truly "real-time", I ran the two tests (Cyclictest and Homemade Test) under two conditions, load and no load. The no load condition is simply where except for system processes, only the test programs are running. The load condition is where I compile the source code of OpenCV and the Linux kernel using a niceness of -20.

As for the PThread Test, it was run under the no load condition.

Cyclictest

Using the Cyclictest program, I was able to get some very nice results.

Arguments:
t = number of threads to work
p = priority level to set
n = use clock_nanosleep
i = interval in microseconds
l = number of times to loop

Outputs:
T = thread index (the thread ID follows in parentheses)
P = priority level
I = interval in microseconds
C = count (number of tasks performed, this number counts up during the test)
Min = minimum latency experienced
Act = current latency (this number changes during the test)
Avg = average latency experienced
Max = maximum latency experienced

No Load
- Priority Level = 1
- Priority Level = 2
- Priority Level = 80
Under Load
- Priority Level = 1
- Priority Level = 2
- Priority Level = 80

The most important number to note in the results is the max, because to assure hard real-time performance the system must never have a latency greater than a certain threshold. That threshold is determined by the frequency of the task that you are trying to perform. For example, if you want to perform a task at 100kHz and the worst-case latency is 5us, then your task should take no more than 5us, since there are only 10us between consecutive task starts.
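The arithmetic in that example generalizes to: time budget = period - worst-case latency. A minimal sketch, with the function name compute_budget_ns chosen for this example and rate_hz assumed to be greater than zero:

```c
/* Time left for the task body in each period, in nanoseconds.
 * Returns 0 when the worst-case latency already uses up the period.
 * rate_hz must be greater than zero. */
long compute_budget_ns(long rate_hz, long worst_latency_ns)
{
    long period_ns = 1000000000L / rate_hz;
    long budget_ns = period_ns - worst_latency_ns;

    return budget_ns > 0 ? budget_ns : 0;
}
```

For the case in the text, compute_budget_ns(100000, 5000) gives 5000, i.e. 5us. At 10kHz with a 30us worst case the budget would be 70us.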

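It is interesting to note that a priority level of 1 in Cyclictest gives a much higher max than a priority level of 2 or 80. I still don't have a full explanation for this, but the phenomenon is also apparent in the Homemade Test and PThread Test. One fact that may be relevant is that, for SCHED_FIFO, a larger priority value means a higher priority, so level 1 is the lowest real-time level and can be preempted by every other real-time task.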

Homemade Test

The results of this test can be summarized and better visualized in histograms. Below are two histograms from each of the two conditions. It is important to note that these tests were performed with priority level set at 1. The PThread test is a better measure of how real-time the system is.

Latencies between target time stamp and recorded time stamp under no-load conditions.

Intervals between consecutive time stamps under no-load conditions.

Latencies between target time stamp and recorded time stamp under load conditions.

Intervals between consecutive time stamps under load conditions.

PThread Test

The PThread Test was run at four different frequencies: 1kHz, 2kHz, 5kHz and 10kHz. Each frequency was tested for 5 hours. I ran the test with the priority level set at 2 because of the unexplained higher max latency at priority level 1 seen in Cyclictest. The graphs below show histograms of the latencies experienced, in microseconds. I only kept count of latencies below 100us (anything larger is counted in the last bin) because a latency of 100us or greater would be considered catastrophic. Furthermore, the maximum latency at the start of the test is generally high, so I discarded the first hour as warm-up time for the system. In practice, however, a one-second warm-up will be enough, since the max latency is experienced within the first millisecond. The results from the test are shown below.

Linux-rt delay from wakeup target time at 1kHz.

Linux-rt delay from wakeup target time at 2kHz.

Linux-rt delay from wakeup target time at 5kHz.

Linux-rt delay from wakeup target time at 10kHz.

Plotting Data in Real-Time

Here's a version of the PThread Test code that reads the encoder counts at 10kHz for 1 minute and then at 5kHz for 1 minute. While it is reading, it outputs the motor angle at 10Hz. If you pipe this into the driveGnuPlotStreams.pl script, it will plot the motor angle in real-time at 10Hz.

Click here to download or for more details about the driveGnuPlotStreams.pl script.

To compile: gcc main.c -o main -lrt -lnidaqmx
To run:

sudo ./main | perl ./driveGnuPlotStreams.pl 1 1 50 25000 45000 500x300+0+0 'MotorAngle' 0

Note: You will need to have the script in the same directory.
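The 10Hz output rate comes from printing only every Nth sample of the fast loop, where N depends on the loop interval. The sketch below isolates that calculation and the "stream:value" line format that the listing prints; the function names plot_every and format_sample are made up for this example.

```c
#include <stdio.h>

#define NSEC_PER_SEC 1000000000L

/* Number of loop iterations between two plotted samples, given the
 * loop period in nanoseconds and the desired plot rate in Hz. */
long plot_every(long interval_ns, long plot_hz)
{
    return NSEC_PER_SEC / interval_ns / plot_hz;
}

/* Format one sample as "stream:value\n" into buf.
 * Returns the number of characters snprintf reports. */
int format_sample(char *buf, size_t size, int stream, double value)
{
    return snprintf(buf, size, "%d:%f\n", stream, value);
}
```

At the 100us interval, plot_every(100000, 10) is 1000, so one sample in every 1000 is printed; at the 200us interval it is one in every 500.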

#define _REENTRANT
#define NSEC_PER_SEC (1000000000) /* The number of nsecs per sec. */
#define MY_PRIORITY (2) /* SCHED_FIFO priority used in this test;
			PREEMPT_RT uses 50 as the priority of
			kernel tasklets and interrupt handlers
			by default */
#define MAX_SAFE_STACK (8*1024) /* The maximum stack size which is
				guaranteed safe to access without
				faulting */
#define DAQmxErrChk(functionCall) if(DAQmxFailed(error=functionCall)) goto Error; else
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/time.h>
#include <sched.h>
#include <sys/mman.h>
#include <string.h>
#include <NIDAQmx.h>
#include <math.h>

/* function prototypes */
void* rtTask1( );
void* Task2( );
void* Task3( );
void* Task4( );
void* Task5( );

void stack_prefault(void) {
	unsigned char dummy[MAX_SAFE_STACK];
	memset(&dummy, 0, MAX_SAFE_STACK);
	return;
}

/* mutex variables */
pthread_mutex_t printfLock = PTHREAD_MUTEX_INITIALIZER;

int end2 = 0;
int end3 = 0;
int end4 = 0;
int end5 = 0;
int aa = 0;
double temp = 0;

int interval = 100000;		//100us "interrupt" interval
int duration = 10000;		//10khz for 1 second

/* global variables for input/output */
TaskHandle encoderTaskHandle=0;
int32 error=0;
int32 read;
float64 motorAngle;
char errBuff[2048]={'\0'};

int main( void ){
	pthread_t thr1, thr2, thr3, thr4, thr5;
	struct timespec t[10];

	/*********************************************/
	// DAQmx Configure Code
	/*********************************************/
	DAQmxErrChk(DAQmxCreateTask("EncoderTask",&encoderTaskHandle));
	DAQmxErrChk(DAQmxCreateCIAngEncoderChan(encoderTaskHandle,"AnalogIn/ctr0","Counter",DAQmx_Val_X4,0,0.0,DAQmx_Val_AHighBHigh,DAQmx_Val_Degrees,600,36000.0,""));

	/*********************************************/
	// DAQmx Start Code
	/*********************************************/
	DAQmxErrChk(DAQmxStartTask(encoderTaskHandle));

	clock_gettime(CLOCK_MONOTONIC ,&t[0]);
	pthread_create( &thr1, NULL, rtTask1, NULL );
	clock_gettime(CLOCK_MONOTONIC ,&t[1]);
	pthread_create( &thr2, NULL, Task2, NULL );
	clock_gettime(CLOCK_MONOTONIC ,&t[2]);
	pthread_create( &thr3, NULL, Task3, NULL );
	clock_gettime(CLOCK_MONOTONIC ,&t[3]);
	pthread_create( &thr4, NULL, Task4, NULL );
	clock_gettime(CLOCK_MONOTONIC ,&t[4]);
	pthread_create( &thr5, NULL, Task5, NULL );
	clock_gettime(CLOCK_MONOTONIC ,&t[5]);

	pthread_join( thr1, NULL );
	interval = 100000;		//100us "interrupt" interval
	duration = 10000*60;		//10khz for 1 minute
	clock_gettime(CLOCK_MONOTONIC ,&t[6]);
	pthread_create( &thr1, NULL, rtTask1, NULL );
	clock_gettime(CLOCK_MONOTONIC ,&t[7]);

	pthread_join( thr1, NULL );
	interval = 200000;		//200us "interrupt" interval
	duration = 5000*60;		//5khz for 1 minute
	clock_gettime(CLOCK_MONOTONIC ,&t[8]);
	pthread_create( &thr1, NULL, rtTask1, NULL );
	clock_gettime(CLOCK_MONOTONIC ,&t[9]);

	pthread_join( thr1, NULL );

	end2 = 1;
	end3 = 1;
	end4 = 1;
	end5 = 1;

	pthread_join( thr2, NULL );
	pthread_join( thr3, NULL );
	pthread_join( thr4, NULL );
	pthread_join( thr5, NULL );

	FILE *file;
	file = fopen("encodertiming","a+");

	int ii;
	for(ii = 1; ii < 10; ii++){
		fprintf(file, "Task%d %9ld %9ld\n", ii, t[ii].tv_nsec, t[ii].tv_nsec - t[ii-1].tv_nsec);
	}
	fprintf(file, "\n========================================\n");
	fclose(file);

	return 0;

Error:
	if( DAQmxFailed(error) ){
		DAQmxGetExtendedErrorInfo(errBuff,2048);
	}
	if( encoderTaskHandle!=0 ){
		DAQmxStopTask(encoderTaskHandle);
		DAQmxClearTask(encoderTaskHandle);
	}
	if( DAQmxFailed(error) ){
		printf("DAQmx Error: %s\n",errBuff);
	}
	return 0;
}

void* rtTask1(){

	int bins[100];
	int i, ii, jj;
	for( i = 0; i < 100; i++){
		bins[i] = 0;
	}
	struct timespec t;
	struct timespec timestamp;
	struct timespec starttime;
	struct timespec earlytime[100];
	struct timespec earlytime2[100];
	int earlyvalue[100];
	struct timespec badtime[10000];
	struct timespec badtime2[10000];
	int badvalue[10000];
	struct sched_param param;

	FILE *file;
	file = fopen("encodertiming","a+");
	int plotInterval = NSEC_PER_SEC / interval / 10;

	/* Declare ourself as a real time task */
	param.sched_priority = MY_PRIORITY;
	if(sched_setscheduler(0, SCHED_FIFO, &param) == -1) {
		perror("sched_setscheduler failed");
		exit(-1);
	}

	/* Lock memory */
	if(mlockall(MCL_CURRENT|MCL_FUTURE) == -1) {
		perror("mlockall failed");
		exit(-2);
	}

	/* Pre-fault our stack */
	stack_prefault();

	clock_gettime(CLOCK_MONOTONIC,&t);
	/* start after one second */
	t.tv_sec++;
	starttime = t;
	jj = 0;

	for( i = 0; i < duration; ++i ) {
		/* wait until next shot */
		clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &t, NULL);
		clock_gettime(CLOCK_MONOTONIC ,&timestamp);
		if(timestamp.tv_sec == t.tv_sec){
			ii = (int)(((timestamp.tv_nsec - t.tv_nsec)/1000) - 1);
		}else{
			ii = (int)((((timestamp.tv_sec - t.tv_sec) * NSEC_PER_SEC - t.tv_nsec + timestamp.tv_nsec)/1000) - 1);
		}
		// store timestamps if latency > 30
		if(ii > 30){
			if(jj < 10000){
				badtime[jj] = timestamp;
				badtime2[jj] = t;
				badvalue[jj] = ii;
				jj++;
			}
		}
		// store timestamps for first 100 wakeups
		if(i < 100){
			earlytime[i] = timestamp;
			earlytime2[i] = t;
			earlyvalue[i] = ii;
		}
		// cap delay for bins[ii]
		if(ii > 99){
			ii = 99;
		}else if(ii < 0){
			ii = 0;
		}
		bins[ii]++;

		DAQmxErrChk(DAQmxReadCounterF64(encoderTaskHandle,1,10.0,&motorAngle,1,&read,NULL));

		if(i % plotInterval == 0){
			printf("0:%f\n",motorAngle);
			fflush(stdout);
		}

		t.tv_nsec += interval;
		while (t.tv_nsec >= NSEC_PER_SEC) {
			t.tv_nsec -= NSEC_PER_SEC;
			t.tv_sec++;
		}
	}

	fprintf(file, "interval = %d\n", interval);
	fprintf(file, "duration = %d\n", duration);
	fprintf(file, "starttime = %lds\t%ldns\n", (long)starttime.tv_sec, starttime.tv_nsec);
	int maxDelay = 0;
	for( i = 0; i < 100; i++){
		if( bins[i] > 0 ){
			maxDelay = i;
		}
	}
	fprintf(file, "maxDelay = %d\n", maxDelay);
	for( i = 0; i < 100; i++){
		fprintf(file, "bins[%d] = %d\n", i, bins[i]);
	}
	for( i = 0; i < 100; i++){
		fprintf(file, "earlytime[%d] = %lds\t%ldns (t = %lds\t%ldns)\tvalue = %d\n", i, (long)earlytime[i].tv_sec, earlytime[i].tv_nsec, (long)earlytime2[i].tv_sec, earlytime2[i].tv_nsec, earlyvalue[i]);
	}
	for( i = 0; i < jj; i++){
		fprintf(file, "badtime[%d] = %lds\t%ldns (t = %lds\t%ldns)\tvalue = %d\n", i, (long)badtime[i].tv_sec, badtime[i].tv_nsec, (long)badtime2[i].tv_sec, badtime2[i].tv_nsec, badvalue[i]);
	}
	fprintf(file, "\n");
	fclose(file);
	return NULL;

Error:
	if( DAQmxFailed(error) ){
		DAQmxGetExtendedErrorInfo(errBuff,2048);
	}
	if( encoderTaskHandle!=0 ){
		DAQmxStopTask(encoderTaskHandle);
		DAQmxClearTask(encoderTaskHandle);
	}
	if( DAQmxFailed(error) ){
		printf("DAQmx Error: %s\n",errBuff);
	}
	return NULL;
}

void* Task2(){
	while( end2 == 0 ) {
		temp = temp + rand();
	}
	return NULL;
}

void* Task3(){
	while( end3 == 0 ) {
		temp = temp - rand();
	}
	return NULL;
}

void* Task4(){
	while( end4 == 0 ) {
		temp = temp * rand();
	}
	return NULL;
}

void* Task5(){
	while( end5 == 0 ) {	/* Task5 has its own stop flag */
		temp = temp / rand();
	}
	return NULL;
}

Future Work

Now that we have an operating system with sub 100us latency, our next step is to port the code for the camera over to the Linux side. Once that is done, we would like to be able to capture images at 1000fps or better and track the location of bright spots in the image.

Useful Links

Micro-Kernel Approach

Scheduling Approach

Wiki on -rt Linux

A "Hello World" Example



@@ Line 8: / Line 8: @@
 ==Implementations of Real-Time Operating Systems==
-There are currently two main methods of implementing a real-time operating system, a micro-kernel approach and a scheduling approach.
+There are currently two main methods of implementing a real-time operating system, a micro-kernel approach and a scheduling approach. We will be using an operating system which implements the scheduling approach because it is the operating system that the makers of TrackCam supports.
 ===Micro-Kernel===
@@ Line 350: / Line 350: @@
 ===PThread Test===
-This is a test that uses multiple threads where only one is set with a high priority and the rest are normal priority. The high priority thread is also reading one analog signal from a data acquisition card and writing one analog signal to an output card. In this particular test I had four normal priority threads, each continuously performing of the the basic simple math operation (+, -, *, /). This was then tested on a system with a quad core CPU. The four normal priority threads is to ensure that there are more than four active tasks for the CPU to handle. In a true setup, a total of two threads may be enough for your needs. More details about the cards used in this test can be found here.
+This is a test that uses multiple threads, where only one is set to a high priority and the rest run at normal priority. The high-priority thread also reads one analog signal from a data acquisition card and writes one analog signal to an output card. In this particular test I had four normal-priority threads, each continuously performing one of the basic math operations (+, -, *, /). This was tested on a system with a quad-core CPU; the four normal-priority threads ensure that there are more than four active tasks for the CPU to handle. In a real setup, a total of two threads may be enough for your needs. More details about the cards used in this test can be found [http://hades.mech.northwestern.edu/index.php/NI-DAQ_Cards_on_Linux here].<br>
+<br>
+The output of the results will be in the form:<br>
+interval = nanoseconds per interrupt<br>
+duration = number of times to trigger<br>
+starttime = system time of first trigger<br>
+maxDelay = maximum delay in microseconds experienced in this setup (>99 shows up as 99)<br>
+bins[x] = number of occurrences with x microseconds of delay in this setup<br>
+earlytime[for the xth trigger] = actual wakeup time (t = target wakeup time) value = microseconds of delay<br>
+badtime[xth >30us delay] = actual wakeup time (t = target wakeup time) value = microseconds of delay in this setup<br>
+<br>
+You can download a sample of results [http://hades.mech.northwestern.edu/images/b/b9/RTLinux-PThreadTest-SampleResults.zip here].<br>
+<br>
 To compile: gcc main.c -o main -lrt -lnidaqmx
@@ Line 663: / Line 674: @@
 ==Results==
 To ensure that the operating system is truly "real-time", I ran the two tests (Cyclictest and Homemade Test) under two conditions, load and no load. The no load condition is simply where except for system processes, only the test programs are running. The load condition is where I compile the source code of OpenCV and the Linux kernel using a niceness of -20.
+As for the PThread Test, it was run under the no load condition.
 ===Cyclictest===
 Using the Cyclictest program, I was able to get some very nice results.
-Arguments:
+Arguments:<br>
-t = number of threads to work
+t = number of threads to work<br>
-p = priority level to set
+p = priority level to set<br>
-n = use nano_sleep
+n = use clock_nanosleep<br>
-i = interval in microseconds
+i = interval in microseconds<br>
-l = number of times to loop
+l = number of times to loop<br>
+<br>
+Outputs:<br>
+T = thread index (with the thread ID in parentheses)<br>
+P = priority level<br>
+I = interval in microseconds<br>
+C = count (number of tasks performed, this number counts up during the test)<br>
+Min = minimum latency experienced<br>
+Act = current latency (this number changes during the test)<br>
+Avg = average latency experienced<br>
+Max = maximum latency experienced<br>
 <ul>
@@ Line 708: / Line 731: @@
  </li>
 </ul>
+The most important number in the results is the max, because hard real-time performance requires that the latency never exceed a certain threshold. That threshold is determined by the frequency of the task you are trying to perform. For example, if you want to perform a task at 100kHz and the worst-case latency is 5us, then the task itself should take no more than 5us, since there are only 10us between the starts of consecutive tasks.<br>
-It is interesting to note that when using a priority level of 1 in Cyclictest, we get a much higher Max than a priority level of 2 or 80.
+<br>
+It is interesting to note that when using a priority level of 1 in Cyclictest, we get a much higher max than with a priority level of 2 or 80. I do not yet have an explanation for this, but the same phenomenon also appears in the Homemade Test and the PThread Test.
 ===Homemade Test===
-The results of this test can be summarized and better visualized in histograms. Below are two histograms from each of the two conditions.
+The results of this test are best summarized and visualized as histograms. Below are two histograms for each of the two conditions. It is important to note that these tests were performed with the priority level set to 1. The PThread Test is a better measure of how real-time the system is.
 [[image:RTLinux_Homemade_Latency_Noload.JPG|thumb|500px|Latencies between target time stamp and recorded time stamp under no-load conditions.|left]]
@@ Line 776: / Line 801: @@
 .
-===Pthread Test===
+===PThread Test===
+The PThread Test was run at four different frequencies: 1kHz, 2kHz, 5kHz and 10kHz. Each frequency was tested for 5 hours. I ran the test with the priority level set to 2 because of the unexplained higher max latency at priority level 1 seen in Cyclictest. The graphs below show histograms of the latencies experienced, in microseconds. I only kept count of latencies up to 100us, because a latency of 100us or greater would be considered catastrophic. Furthermore, the latencies experienced at the start of the test are generally high, so I discarded the first hour as warm-up time for the system. In practice, a one-second warm-up is enough, since the max latency occurs within the first millisecond. The results from the test are shown below.
-I used my test program to test "interrupts" of four different frequencies, 1kHz, 2kHz, 5kHz and 10kHz. The results of these test are shown below.
+[[image:RTLinux1kHz.jpg|thumb|500px|Linux-rt delay from wakeup target time at 1kHz.|left]]
+[[image:RTLinux2kHz.jpg|thumb|500px|Linux-rt delay from wakeup target time at 2kHz.|right]]
+[[image:RTLinux5khz.jpg|thumb|500px|Linux-rt delay from wakeup target time at 5kHz.|left]]
+[[image:RTLinux10khz.jpg|thumb|500px|Linux-rt delay from wakeup target time at 10kHz.|right]]
-[[image:RTLinux_Homemade_Latency_Noload.JPG|thumb|500px|Latencies between target time stamp and recorded time stamp under no-load conditions.|left]]
-[[image:RTLinux_Homemade_Interval_Noload.JPG|thumb|500px|Intervals between consecutive time stamps under no-load conditions.|left]]
-[[image:RTLinux_Homemade_Latency_Underload.JPG|thumb|500px|Latencies between target time stamp and recorded time stamp under load conditions.|left]]
-[[image:RTLinux_Homemade_Interval_Underload.JPG|thumb|500px|Intervals between consecutive time stamps under load conditions.|left]]
@@ Line 840: / Line 876: @@
 .
+==Plotting Data in Real-Time==
+Here's a version of the PThread Test code that reads the encoder counts at 10kHz for 1 minute and then at 5kHz for 1 minute. While it is reading, it outputs the motor angle at 10Hz. If you pipe this output into the driveGnuPlotStreams.pl script, it will plot the motor angle in real time at 10Hz.<br>
+<br>
+Click [http://www.lysium.de/blog/index.php?/archives/234-Plotting-data-with-gnuplot-in-real-time.html here] to download the driveGnuPlotStreams.pl script or for more details about it.<br>
+<br>
+To compile: gcc main.c -o main -lrt -lnidaqmx<br>
+To run:
+ sudo ./main | perl ./driveGnuPlotStreams.pl 1 1 50 25000 45000 500x300+0+0 'MotorAngle' 0
+Note: You will need to have the script in the same directory.
+<pre>
+#define _REENTRANT
+#define NSEC_PER_SEC (1000000000) /* The number of nsecs per sec. */
+#define MY_PRIORITY (2) /* priority 1 showed a higher max latency
+			(see Results); PREEMPT_RT uses 50 as
+			the priority of kernel tasklets and
+			interrupt handlers by default */
+#define MAX_SAFE_STACK (8*1024) /* The maximum stack size which is
+				guaranteed safe to access without
+				faulting */
+#define DAQmxErrChk(functionCall) if(DAQmxFailed(error=functionCall)) goto Error; else
+#include <pthread.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <time.h>
+#include <sys/time.h>
+#include <sched.h>
+#include <sys/mman.h>
+#include <string.h>
+#include <NIDAQmx.h>
+#include <math.h>
+/* function prototypes */
+void* rtTask1( );
+void* Task2( );
+void* Task3( );
+void* Task4( );
+void* Task5( );
+void stack_prefault(void) {
+	unsigned char dummy[MAX_SAFE_STACK];
+	memset(&dummy, 0, MAX_SAFE_STACK);
+	return;
+}
+/* mutex variables */
+pthread_mutex_t printfLock = PTHREAD_MUTEX_INITIALIZER;
+int end2 = 0;
+int end3 = 0;
+int end4 = 0;
+int end5 = 0;
+int aa = 0;
+double temp = 0;
+int interval = 100000;		//100us "interrupt" interval
+int duration = 10000;		//10khz for 1 second
+/* global variables for input/output */
+TaskHandle encoderTaskHandle=0;
+int32 error=0;
+int32 read;
+float64 motorAngle;
+char errBuff[2048]={'\0'};
+int main( void ){
+	pthread_t thr1, thr2, thr3, thr4, thr5;
+	struct timespec t[10];
+	/*********************************************/
+	// DAQmx Configure Code
+	/*********************************************/
+	DAQmxErrChk(DAQmxCreateTask("EncoderTask",&encoderTaskHandle));
+	DAQmxErrChk(DAQmxCreateCIAngEncoderChan(encoderTaskHandle,"AnalogIn/ctr0","Counter",DAQmx_Val_X4,0,0.0,DAQmx_Val_AHighBHigh,DAQmx_Val_Degrees,600,36000.0,""));
+	/*********************************************/
+	// DAQmx Start Code
+	/*********************************************/
+	DAQmxErrChk(DAQmxStartTask(encoderTaskHandle));
+	clock_gettime(CLOCK_MONOTONIC ,&t[0]);
+	pthread_create( &thr1, NULL, rtTask1, NULL );
+	clock_gettime(CLOCK_MONOTONIC ,&t[1]);
+	pthread_create( &thr2, NULL, Task2, NULL );
+	clock_gettime(CLOCK_MONOTONIC ,&t[2]);
+	pthread_create( &thr3, NULL, Task3, NULL );
+	clock_gettime(CLOCK_MONOTONIC ,&t[3]);
+	pthread_create( &thr4, NULL, Task4, NULL );
+	clock_gettime(CLOCK_MONOTONIC ,&t[4]);
+	pthread_create( &thr5, NULL, Task5, NULL );
+	clock_gettime(CLOCK_MONOTONIC ,&t[5]);
+	pthread_join( thr1, NULL );
+	interval = 100000;		//100us "interrupt" interval
+	duration = 10000*60;		//10khz for 1 minute
+	clock_gettime(CLOCK_MONOTONIC ,&t[6]);
+	pthread_create( &thr1, NULL, rtTask1, NULL );
+	clock_gettime(CLOCK_MONOTONIC ,&t[7]);
+	pthread_join( thr1, NULL );
+	interval = 200000;		//200us "interrupt" interval
+	duration = 5000*60;		//5khz for 1 minute
+	clock_gettime(CLOCK_MONOTONIC ,&t[8]);
+	pthread_create( &thr1, NULL, rtTask1, NULL );
+	clock_gettime(CLOCK_MONOTONIC ,&t[9]);
+	pthread_join( thr1, NULL );
+	end2 = 1;
+	end3 = 1;
+	end4 = 1;
+	end5 = 1;
+	pthread_join( thr2, NULL );
+	pthread_join( thr3, NULL );
+	pthread_join( thr4, NULL );
+	pthread_join( thr5, NULL );
+	FILE *file;
+	file = fopen("encodertiming","a+");
+	int ii;
+	for(ii = 1; ii < 10; ii++){
+		fprintf(file, "Task%d %9ld %9ld\n", ii, t[ii].tv_nsec, t[ii].tv_nsec - t[ii-1].tv_nsec);
+	}
+	fprintf(file, "\n========================================\n");
+	fclose(file);
+	return 0;
+Error:
+	if( DAQmxFailed(error) ){
+		DAQmxGetExtendedErrorInfo(errBuff,2048);
+	}
+	if( encoderTaskHandle!=0 ){
+		DAQmxStopTask(encoderTaskHandle);
+		DAQmxClearTask(encoderTaskHandle);
+	}
+	if( DAQmxFailed(error) ){
+		printf("DAQmx Error: %s\n",errBuff);
+	}
+	return 0;
+}
+void* rtTask1(){
+	int bins[100];
+	int i, ii, jj;
+	for( i = 0; i < 100; i++){
+		bins[i] = 0;
+	}
+	struct timespec t;
+	struct timespec timestamp;
+	struct timespec starttime;
+	struct timespec earlytime[100];
+	struct timespec earlytime2[100];
+	int earlyvalue[100];
+	struct timespec badtime[10000];
+	struct timespec badtime2[10000];
+	int badvalue[10000];
+	struct sched_param param;
+	FILE *file;
+	file = fopen("encodertiming","a+");
+	int plotInterval = NSEC_PER_SEC / interval / 10;
+	/* Declare ourself as a real time task */
+	param.sched_priority = MY_PRIORITY;
+	if(sched_setscheduler(0, SCHED_FIFO, &param) == -1) {
+		perror("sched_setscheduler failed");
+		exit(-1);
+	}
+	/* Lock memory */
+	if(mlockall(MCL_CURRENT|MCL_FUTURE) == -1) {
+		perror("mlockall failed");
+		exit(-2);
+	}
+	/* Pre-fault our stack */
+	stack_prefault();
+	clock_gettime(CLOCK_MONOTONIC,&t);
+	/* start after one second */
+	t.tv_sec++;
+	starttime = t;
+	jj = 0;
+	for( i = 0; i < duration; ++i ) {
+		/* wait until next shot */
+		clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &t, NULL);
+		clock_gettime(CLOCK_MONOTONIC ,&timestamp);
+		if(timestamp.tv_sec == t.tv_sec){
+			ii = (int)(((timestamp.tv_nsec - t.tv_nsec)/1000) - 1);
+		}else{
+			ii = (int)((((timestamp.tv_sec - t.tv_sec) * NSEC_PER_SEC - t.tv_nsec + timestamp.tv_nsec)/1000) - 1);
+		}
+		// store timestamps if latency > 30
+		if(ii > 30){
+			if(jj < 10000){
+				badtime[jj] = timestamp;
+				badtime2[jj] = t;
+				badvalue[jj] = ii;
+				jj++;
+			}
+		}
+		// store timestamps for first 100 wakeups
+		if(i < 100){
+			earlytime[i] = timestamp;
+			earlytime2[i] = t;
+			earlyvalue[i] = ii;
+		}
+		// cap delay for bins[ii]
+		if(ii > 99){
+			ii = 99;
+		}else if(ii < 0){
+			ii = 0;
+		}
+		bins[ii]++;
+		DAQmxErrChk(DAQmxReadCounterF64(encoderTaskHandle,1,10.0,&motorAngle,1,&read,NULL));
+		if(i % plotInterval == 0){
+			printf("0:%f\n",motorAngle);
+			fflush(stdout);
+		}
+		t.tv_nsec += interval;
+		while (t.tv_nsec >= NSEC_PER_SEC) {
+			t.tv_nsec -= NSEC_PER_SEC;
+			t.tv_sec++;
+		}
+	}
+	fprintf(file, "interval = %d\n", interval);
+	fprintf(file, "duration = %d\n", duration);
+	fprintf(file, "starttime = %lds\t%ldns\n", (long)starttime.tv_sec, starttime.tv_nsec);
+	int maxDelay = 0;
+	for( i = 0; i < 100; i++){
+		if( bins[i] > 0 ){
+			maxDelay = i;
+		}
+	}
+	fprintf(file, "maxDelay = %d\n", maxDelay);
+	for( i = 0; i < 100; i++){
+		fprintf(file, "bins[%d] = %d\n", i, bins[i]);
+	}
+	for( i = 0; i < 100; i++){
+		fprintf(file, "earlytime[%d] = %lds\t%ldns (t = %lds\t%ldns)\tvalue = %d\n", i, (long)earlytime[i].tv_sec, earlytime[i].tv_nsec, (long)earlytime2[i].tv_sec, earlytime2[i].tv_nsec, earlyvalue[i]);
+	}
+	for( i = 0; i < jj; i++){
+		fprintf(file, "badtime[%d] = %lds\t%ldns (t = %lds\t%ldns)\tvalue = %d\n", i, (long)badtime[i].tv_sec, badtime[i].tv_nsec, (long)badtime2[i].tv_sec, badtime2[i].tv_nsec, badvalue[i]);
+	}
+	fprintf(file, "\n");
+	fclose(file);
+	return NULL;
+Error:
+	if( DAQmxFailed(error) ){
+		DAQmxGetExtendedErrorInfo(errBuff,2048);
+	}
+	if( encoderTaskHandle!=0 ){
+		DAQmxStopTask(encoderTaskHandle);
+		DAQmxClearTask(encoderTaskHandle);
+	}
+	if( DAQmxFailed(error) ){
+		/* stdout is piped to the plotting script, so report errors on stderr */
+		fprintf(stderr, "DAQmx Error: %s\n",errBuff);
+	}
+	if( file != NULL ){
+		fclose(file);
+	}
+	return NULL;
+}
+void* Task2(){
+	while( end2 == 0 ) {
+		temp = temp + rand();
+	}
+	return NULL;
+}
+void* Task3(){
+	while( end3 == 0 ) {
+		temp = temp - rand();
+	}
+	return NULL;
+}
+void* Task4(){
+	while( end4 == 0 ) {
+		temp = temp * rand();
+	}
+	return NULL;
+}
+void* Task5(){
+	while( end5 == 0 ) {
+		temp = temp / rand();
+	}
+	return NULL;
+}
+</pre>
 ==Future Work==