Channel: Timothy Lottes

No Traditional Dynamic Memory Allocation

To provide some meaning to a prior tweet: outside of current and prior day jobs, I never use traditional dynamic memory allocation. All allocations are done up front at application load. On modern CPUs the cost of virtual memory is always there, so might as well use it. At load time, allocate virtual address space for the maximum practical memory usage of the various data in the application. The virtual address space is initially backed by a read-only common zero-fill page (no physical memory allocated). On write, the OS modifies the page table and backs the written pages with unique zero-filled physical pages. One can also preempt the OS write-fault and manually force various OSs to pre-back used virtual memory (for real-time applications).

The reduction of complexity, runtime cost, and development cost enabled by this practice is massive.

This practice is a direct analog of what is required for efficient GPU programming: lay out data by usage locality into 1D/2D/3D arrays/tables/textures, with indexes or handles linking different data structures. Programs are designed around transforms of data, by some mix of gather/scatter. Bits of the larger application are cut up into manageable parallel pieces which can be debugged/tested/replaced/optimized individually. Capture and replay of the entire program is relatively easy. Synchronization is factored into coarse-granularity signals and barriers. Scaling this development up to a large project requires an architect who can lay out the high-level network of how data flows through the program, then sub-architects who own self-contained sub-networks, then individuals who provide the programs and details of parts of each sub-network.

My programming language of choice to feed this system is nothing standard, but rather something built around an instant run-time edit/modify/test cycle which works from within the program itself. Something similar to Forth (a tiny, fully expressive programming language which requires no parsing), which can also express assembly that can be run-time assembled into the tiny programs which process data in the application network.

It should be obvious that the practices which enable fast shader development, and fast development of the GPU side of a rendering pipeline, apply directly to everything else as well, and that this approach elegantly solves the problem in a way which scales on a parallel machine.
