Parallelism in ImageSDK

ImageSDK is designed around the idea that concurrency is handled per image. This means that multiple threads process a single image, so that all CPU cores can be used efficiently. To do this, ImageSDK maintains a thread pool that, by default, holds twice as many threads as there are CPU cores.

In addition, ImageSDK maintains a line cache that holds frequently accessed lines of the image. This cache is what takes up 2-3 GB of memory. To be most efficient, the cache should hold data from only a single image at a time.

Using the thread pool and line cache, the built-in thread fan-out works like this:

Image conversion/decode thread fan-out

Each available thread in the thread pool processes an independent portion of the image. This happens for both Decode and Convert pipelines.

Since conversion of a single image is intended to use all available CPU resources on the machine, ImageSDK does not account for a scenario where you convert multiple images in parallel.

Converting multiple images in parallel

If two RawImage objects try to Convert (or Decode) in parallel, the requests are serialized, since there is only one global thread pool and one line cache.

This figure illustrates the shared resources used during image conversion:

The shared global resources used in image conversion

If you really wish to process multiple images in parallel, your only option is to do so in separate processes.

There is currently no way to create multiple line caches in ImageSDK’s shared state, so using separate processes is the way to work around this limitation.

In any case, processing images in parallel requires 2-3 GB of RAM per image to be efficient.
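As a rough illustration of the separate-process approach, the sketch below starts one child process per image and waits for all of them. It is POSIX-only (fork and waitpid) and is not taken from ImageSDK. ConvertOneImage and ConvertInSeparateProcesses are hypothetical names, and the actual SDK calls are left out: the child is simply where they would go, so that each process gets its own thread pool and line cache.

```cpp
#include <cassert>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical placeholder for the per-image work. A real child process
// would call into ImageSDK here; those calls are intentionally omitted.
// Returns 0 on success, non-zero on failure.
int ConvertOneImage(const std::string& path) {
    return path.empty() ? 1 : 0;
}

// Starts one child process per image path and waits for all of them.
// Returns the number of children that exited with status 0.
int ConvertInSeparateProcesses(const std::vector<std::string>& paths) {
    std::vector<pid_t> children;
    for (const std::string& path : paths) {
        pid_t pid = fork();
        if (pid == 0) {
            // Child: do the work for exactly one image, then exit without
            // returning into the parent's loop.
            _exit(ConvertOneImage(path));
        }
        if (pid > 0) {
            children.push_back(pid);
        }
        // pid < 0 means fork failed; that image is not counted as succeeded.
    }

    int succeeded = 0;
    for (pid_t pid : children) {
        int status = 0;
        if (waitpid(pid, &status, 0) == pid &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            ++succeeded;
        }
    }
    return succeeded;
}
```

Calling ConvertInSeparateProcesses with a list of file paths returns how many children finished successfully. On platforms without fork, the same idea applies, with a separate worker executable started once per image.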

Throttling Parallelism

It is possible to control the number of threads in ImageSDK’s internal thread pool by providing an upper bound on the number of threads used.

Call this method before any other call to ImageSDK to set the maximum allowed number of threads to 4:

P1::ImageSdk::SetThreadPoolThreadCount(4);
Sdk.SetThreadPoolThreadCount(4);

If you intend to use multiple concurrent processes, you should probably set this value, since the default behavior is to use all available CPU cores.
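One way to pick a value when running several processes is sketched below. It is a planning helper only, not part of ImageSDK: ProcessPlan and PlanProcesses are hypothetical names, the 3 GB default is the upper end of the 2-3 GB per image mentioned earlier, and splitting the single-process default (CPU cores times 2) evenly among processes is an assumption rather than SDK guidance.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical result type: how many worker processes to start and the
// thread cap each one would pass to SetThreadPoolThreadCount.
struct ProcessPlan {
    unsigned processes;
    unsigned threadsPerProcess;
};

// Hypothetical planning helper (not an ImageSDK function).
//   cores          - number of CPU cores on the machine
//   availableRamGb - RAM you are willing to spend on image processing
//   ramPerImageGb  - per-image budget; 3 is an assumed default
ProcessPlan PlanProcesses(unsigned cores,
                          unsigned availableRamGb,
                          unsigned ramPerImageGb = 3) {
    assert(cores > 0 && ramPerImageGb > 0);

    // Memory bound: each process holds its own line cache.
    unsigned byMemory = availableRamGb / ramPerImageGb;

    // Never plan more processes than cores, and always at least one.
    unsigned processes = std::max(1u, std::min(byMemory, cores));

    // Assumption: divide the single-process default (cores * 2) evenly.
    unsigned threads = std::max(1u, (cores * 2) / processes);

    return {processes, threads};
}
```

Each worker process would then pass plan.threadsPerProcess to P1::ImageSdk::SetThreadPoolThreadCount before making any other ImageSDK call. For example, 8 cores with 16 GB available gives 5 processes with 3 threads each under these assumptions.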