...

Currently, the benchmark evaluates the average time (over 1e6 iterations) needed to complete the indicated number of products as well as the sum of the resulting vector. The length of the vector varies with the number of samples. For a fixed number of samples, the code uses if statements to select directly how many filters to use. For the code using the binary decision tree, the number of filters is passed as a preprocessor argument. Once we know how many filters are necessary, this will become a fixed value. The timed work includes one linear transformation and two inner products. Each pixel has its own ring buffer of 10 windows, and each window holds 100 data samples of uint_16. The buffers are refilled with random data as the reader nears the empty flag. The buffers are implemented using the Eigen matrix library.
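To make the buffer layout concrete, here is a minimal sketch of a per-pixel ring buffer of 10 windows with 100 samples each, plus one inner product against a float filter. It is not the benchmarked code: it swaps `std::array` in for the Eigen types so that it compiles without Eigen, and the names `PixelRing`, `Window`, `Filter` and `innerProduct` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>

// Sizes taken from the text: 10 windows per pixel, 100 samples per window.
constexpr std::size_t kWindows = 10;
constexpr std::size_t kSamples = 100;

using Window = std::array<std::uint16_t, kSamples>;  // raw samples
using Filter = std::array<float, kSamples>;          // filter weights

// Hypothetical per-pixel ring buffer (the real code stores Eigen arrays).
class PixelRing {
public:
    std::size_t size() const { return count_; }
    bool empty() const { return count_ == 0; }
    bool full() const { return count_ == kWindows; }

    // Copy a window into the next free slot; returns false if the ring is full.
    bool push(const Window& w) {
        if (full()) return false;
        slots_[(head_ + count_) % kWindows] = w;
        ++count_;
        return true;
    }

    // Oldest unread window; the caller must check empty() first.
    const Window& front() const {
        assert(!empty());
        return slots_[head_];
    }

    // Release the oldest window so its slot can be refilled.
    void pop() {
        assert(!empty());
        head_ = (head_ + 1) % kWindows;
        --count_;
    }

private:
    std::array<Window, kWindows> slots_{};
    std::size_t head_ = 0;   // index of the oldest unread window
    std::size_t count_ = 0;  // number of unread windows
};

// Inner product of one window of raw samples with one float filter.
inline float innerProduct(const Window& w, const Filter& f) {
    return std::inner_product(w.begin(), w.end(), f.begin(), 0.0f);
}
```

A refill policy would call `push` with fresh random windows whenever `size()` drops below some threshold; that threshold is not given in the text and is left out here.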

These benchmarks were obtained on psanagpu116, using -O3 and vectorized optimization. As a reference, 10 kHz operation gives a time window of 100 µs.

Compile flags:

g++ -std=c++11 -O3 -DNDEBUG -march=native -IEigen main.cpp
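The averaging over many iterations and the 100 µs budget at 10 kHz can be sketched as follows. This is an illustrative harness built on `std::chrono`, not the timer used for the numbers on this page; `budgetMicros` and `averageMicros` are hypothetical names.

```cpp
#include <chrono>
#include <cstddef>

// Time budget per window, in microseconds, for a given rate in Hz.
// At 10 kHz this is 1e6 / 1e4 = 100 microseconds.
constexpr double budgetMicros(double rateHz) { return 1.0e6 / rateHz; }

// Average wall-clock time per call of `work`, in microseconds.
template <typename Work>
double averageMicros(Work&& work, std::size_t iterations) {
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

With -O3 the compiler may remove work whose result is never used, so the timed callable should write its result somewhere observable (for example a `volatile` variable). The measured average can then be compared against `budgetMicros(10000.0)`.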

 

All values are in μs.

...

Number of products

Number of samples

...

The following graph shows the calculation rate per pixel as a function of the density of hits. Although data generation is outside the scope of the timer, it may still affect when the timer starts and stops. Clemens' algorithm pregenerates all data and thus times one continuous loop, so if there are no events the loop is very efficient. However, the Eigen library used in the most recent version of the code does accelerate the calculation of the inner products, as shown by the better performance at higher hit rates.

[Figure: calculation rate per pixel as a function of hit density]
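The effect of pregenerating data can be illustrated with a small sketch: the hit pattern is built before timing starts, and the loop to be timed only does work for entries flagged as hits. This is not Clemens' algorithm or the current code, only an assumed minimal form of the idea; `makeHitMask` and `runOverHits` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Pregenerate a hit mask outside the timed region.
// Each entry is true with probability `density` (0.0 = no hits, 1.0 = all hits).
std::vector<bool> makeHitMask(std::size_t n, double density, std::uint32_t seed) {
    std::mt19937 rng(seed);
    std::bernoulli_distribution hit(density);
    std::vector<bool> mask(n);
    for (std::size_t i = 0; i < n; ++i) {
        mask[i] = hit(rng);
    }
    return mask;
}

// Loop to be timed: call `process(i)` only for entries flagged as hits.
// Returns the number of entries processed.
template <typename Process>
std::size_t runOverHits(const std::vector<bool>& mask, Process&& process) {
    std::size_t processed = 0;
    for (std::size_t i = 0; i < mask.size(); ++i) {
        if (mask[i]) {
            process(i);
            ++processed;
        }
    }
    return processed;
}
```

At zero density the timed loop reduces to a scan over the mask, which is why a pregenerated loop looks so cheap when there are no events; at high density the cost is dominated by whatever `process` does, such as the inner products.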

...

 

Latest code:

Latest integration iteration (working copy), 11/7/2018: The current code is functional: it creates random data, selects a case, and runs the inner products on a self-renewing stream. It currently combines Eigen u_int arrays in ring buffers with Eigen float arrays for the filters, using the switch-case strategy to distinguish between the 6 filter cases. However, performance is slower than expected; part of the slowdown has been fixed and the rest is still being tracked down. It appears to be slow loop behavior, but the instruction causing the loop to hang is eluding me...
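One possible shape for the switch-case strategy is sketched below: a runtime switch selects a kernel whose filter count is a compile-time constant, so the compiler can unroll and vectorize each case separately. It assumes that case n means "apply the first n filters", which the text does not state, and it uses `std::array` in place of the Eigen arrays; `applyFilters`, `dispatch` and `FilterBank` are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kSamples = 100;  // samples per window, from the text
constexpr std::size_t kMaxFilters = 6; // assumed: one case per filter count

using Window = std::array<std::uint16_t, kSamples>;
using Filter = std::array<float, kSamples>;
using FilterBank = std::array<Filter, kMaxFilters>;

// Apply the first N filters to one window and return the sum of the
// N inner products. N is known at compile time.
template <std::size_t N>
float applyFilters(const Window& w, const FilterBank& bank) {
    static_assert(N <= kMaxFilters, "more filters requested than available");
    float total = 0.0f;
    for (std::size_t k = 0; k < N; ++k) {
        float acc = 0.0f;
        for (std::size_t i = 0; i < kSamples; ++i) {
            acc += static_cast<float>(w[i]) * bank[k][i];
        }
        total += acc;
    }
    return total;
}

// Runtime selection of the compile-time kernel.
// Unknown cases return 0 (an arbitrary choice for this sketch).
float dispatch(int filterCase, const Window& w, const FilterBank& bank) {
    switch (filterCase) {
        case 1: return applyFilters<1>(w, bank);
        case 2: return applyFilters<2>(w, bank);
        case 3: return applyFilters<3>(w, bank);
        case 4: return applyFilters<4>(w, bank);
        case 5: return applyFilters<5>(w, bank);
        case 6: return applyFilters<6>(w, bank);
        default: return 0.0f;
    }
}
```

Timing `dispatch` for each case in isolation, with the ring-buffer refill disabled, is one way to check whether the slow loop sits in the kernels or in the surrounding stream handling.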

 

Original codes, before they were combined:

...