Graphics processors offer tremendous processing power, but they only deliver peak performance if a program can be parallelized into thousands of coherently executing threads. This talk focuses on this issue, presenting strategies for unlocking the gates of GPU execution for a new class of algorithms.
With our task-based processing model, dynamic algorithms with varying degrees of parallelism are cast into a description that can be executed efficiently on massively parallel devices. These task descriptors are managed in highly efficient queues, which are designed to handle accesses by thousands of threads in parallel. Furthermore, these queues allow different processes to use a single GPU concurrently, dividing the available processing time fairly between them. Alternatively, priorities can be assigned to favor the execution of certain parts of an algorithm, or of entire algorithms, over others, allowing full control over execution on the GPU. Using our model and execution framework, we were able to extend rendering, visualization, and procedural modeling algorithms with features previously unseen in the context of GPU execution.