Preallocation Performance

Giovanni Ojan, 23 July 2012

Abstract
Array preallocation is a standard and quite well-known technique for improving Matlab loop runtime performance. Today's article will show that there is more than meets the eye for even such a simple coding technique. A note of caution: in the examples that follow, don't take any speedup as an expected actual value; the actual value may well be different on your system. Your mileage may vary. I only mean to display the relative differences between different alternatives.

The underlying problem

Memory management has a direct influence on performance. I have already shown some examples of this in past articles here. Preallocation solves a basic problem in simple program loops, where an array is iteratively enlarged with new data (dynamic array growth). Unlike other programming languages (such as C, C++, C# or Java) that use static typing, Matlab uses dynamic typing. This means that it is natural and easy to modify array size dynamically during program execution. For example:

    fibonacci = [0, 1];
    for idx = 3 : 100
        fibonacci(idx) = fibonacci(idx-1) + fibonacci(idx-2);
    end

While this may be simple to program, it is not wise with regards to performance. The reason is that whenever an array is resized (typically enlarged), Matlab allocates an entirely new contiguous block of memory for the array, copying the old values from the previous block to the new, then releasing the old block for potential reuse. This operation takes time to execute. In some cases, this reallocation might require accessing virtual memory and page swaps, which would take an even longer time to execute. If the operation is done in a loop, then performance could quickly drop off a cliff.

The cost of such naïve array growth is theoretically quadratic. This means that multiplying the number of elements by N multiplies the execution time by about N^2. The reason is that Matlab needs to reallocate N times more often than before, and each reallocation takes N times longer: the average block size multiplies by N, so there are N times more data elements to copy from the old memory block to the new one.

A very interesting discussion of this phenomenon and various solutions can be found in a newsgroup thread from 2005. Three main solutions were presented: preallocation, selective dynamic growth (allocating headroom) and using cell arrays. The best solution among these in terms of ease of use and performance is preallocation.
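The quadratic growth described above can be illustrated with a small sketch. It does not time anything. It only counts element copies under an assumed simplified model in which every append reallocates the array and copies all existing elements. The variable names (sizes, copies, ratios) and the chosen sizes are illustrative, and actual Matlab releases may behave better than this model.

```matlab
% Assumed model: growing an array one element at a time copies all
% existing elements on every append. Real releases may optimize this.
sizes  = [1000, 2000, 4000];        % final array sizes to compare
copies = zeros(size(sizes));        % preallocated result array
for k = 1 : numel(sizes)
    n = sizes(k);
    % Appending element j copies the j-1 elements that already exist,
    % so the total is 0 + 1 + ... + (n-1) = n*(n-1)/2.
    copies(k) = sum(0 : n-1);
end
% Ratio of total copies each time the final size doubles.
ratios = copies(2:end) ./ copies(1:end-1);
disp(ratios)                        % both ratios are close to 4
```

Doubling the final size roughly quadruples the number of copied elements in this model, which matches the N^2 behaviour described in the text.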
The basics of pre-allocation

The basic idea of preallocation is to create a data array in the final expected size before actually starting the processing loop. This saves any reallocations within the loop, since all the data array elements are already available and can be accessed. This solution is useful when the final size is known in advance, as the following snippet illustrates:

    tic
    fibonacci = [0,1];
    for idx = 3 : 40000
        fibonacci(idx) = fibonacci(idx-1) + fibonacci(idx-2);
    end
    toc
    % => Elapsed time is 0.019954 seconds.

    tic
    fibonacci = zeros(40000,1);
    fibonacci(1)=0; fibonacci(2)=1;
    for idx = 3 : 40000
        fibonacci(idx) = fibonacci(idx-1) + fibonacci(idx-2);
    end
    toc
    % => Elapsed time is 0.004132 seconds.

On pre-R2011a releases the effect of preallocation is even more pronounced: I got a 35-times speedup on the same machine using Matlab 7.1 (R14 SP3). R2011a (Matlab 7.12) had a dramatic performance boost for such cases in the internal accelerator, so newer releases are much faster in dynamic allocations, but preallocation is still 5 times faster even on R2011a.

Non-deterministic pre-allocation

Because the effect of preallocation is so dramatic on all Matlab releases, it makes sense to utilize it even in cases where the data array's final size is not known in advance. We can do this by estimating an upper bound to the array's size, preallocating this large size, and removing any excess elements when we're done:

    data = zeros(1000,3000);
    numRows = 0;
    numCols = 0;
    while (someCondition)
        colIdx = someValue1;  numCols = max(numCols,colIdx);
        rowIdx = someValue2;  numRows = max(numRows,rowIdx);
        data(rowIdx,colIdx) = someOtherValue;
    end
    data(:,numCols+1:end) = [];
    data(numRows+1:end,:) = [];

Variants for pre-allocation

It turns out that MathWorks' official suggestion for preallocation, namely using the zeros function, is not the most efficient:

    % Variant #1: zeros
    clear data1, tic, data1 = zeros(1000,3000); toc
    % => Elapsed time is 0.016907 seconds.

    % Variant #2: assign the last element
    clear data1, tic, data1(1000,3000) = 0; toc
    % => Elapsed time is 0.000034 seconds.

The second variant is so much faster because it only allocates the memory, without worrying about the internal values (they get a default of 0, false or '', in case you wondered). On the other hand, zeros has to place a value in each of the allocated locations, which takes precious time. In most cases the differences are immaterial, since the preallocation code would only run once in the program, and an extra 17 ms isn't such a big deal. But in some cases we may need to periodically refresh our data, where the extra run-time could quickly accumulate.

Pre-allocating non-default values

When we need to preallocate a specific value into every data array element, we cannot use Variant #2. The reason is that Variant #2 only sets the very last data element, and all other array elements get assigned the default value (0, '' or false, depending on the array's data type).
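This limitation can be checked with a small sketch. The 3x4 size and the variable names data and numDefault are illustrative assumptions, chosen only to keep the output readable.

```matlab
% Implicit preallocation: assigning the last element creates the whole
% array, and every other element receives the type's default value.
clear data
data(3,4) = pi;                  % creates a 3x4 double array
numDefault = nnz(data == 0);     % elements still holding the default 0
disp(size(data))                 % array dimensions: 3 rows, 4 columns
disp(numDefault)                 % 11 of the 12 elements are 0
```

Only data(3,4) holds the requested value, so a further step is needed whenever every element must start with a non-default value.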
In this case, we can use one of the following alternatives (timed on a 1000x3000 data array):

    scalar = pi;
    data = scalar(ones(1000,3000));            % Variant A
    data(1:1000,1:3000) = scalar;              % Variant B
    data = repmat(scalar,1000,3000);           % Variant C
    data = scalar + zeros(1000,3000);          % Variant D
    data(1000,3000) = 0; data = data+scalar;   % Variant E

In these measurements, Variants C-E are about twice as fast as Variant B, and 5 times faster than Variant A.

Pre-allocating non-double data

When preallocating an array of a type that is not double, we should be careful to create it using the desired type, to prevent memory and/or performance inefficiencies. For example, if we need to process a large array of small integers (int8), it would be inefficient to preallocate an array of doubles and type-convert to/from int8 within every loop iteration. Similarly, it would be inefficient to preallocate the array as a double type and then convert it to int8. Instead, we should create the array as an int8 array in the first place:

    data = int8(zeros(1000,1000));
    % => Elapsed time is 0.008170 seconds.

    data = zeros(1000,1000,'int8');
    % => Elapsed time is 0.000095 seconds.

Pre-allocating cell arrays

To preallocate a cell array we can use the cell function (explicit preallocation), or the maximal cell index (implicit preallocation). Explicit preallocation is faster than implicit preallocation, but functionally equivalent (note: this is contrary to the experience with allocation of numeric arrays and other arrays):

    data = cell(1000,3000);
    % => Elapsed time is 0.004637 seconds.

    clear('data'), data{1000,3000} = [];
    % => Elapsed time is 0.012873 seconds.

Pre-allocating arrays of structs

To preallocate an array of structs or class objects, we can use the repmat function to replicate copies of a single data element (explicit preallocation), or just use the maximal data index (implicit preallocation).
In this case, unlike the case of cell arrays, implicit preallocation is much faster than explicit preallocation, since the single element does not actually need to be copied multiple times (ref):

    element = struct('field1',magic(2), 'field2',[]);
    data = repmat(element, 100, 300);
    % => Elapsed time is 0.002804 seconds.

    element = struct('field1',magic(2), 'field2',[]);
    clear('data'), data(100,300) = element;
    % => Elapsed time is 0.000429 seconds.

When preallocating structs, we can also use a third variant, using the built-in struct feature of replicating the struct when the struct function is passed a cell array. For example, struct('field1',cell(100,1), 'field2',5) will create 100 structs, each of them having the empty field field1 and another field called field2 with value 5. Unfortunately, this variant is slower than both of the previous variants.

Pre-allocating class objects

When preallocating in general, ensure that you are using the maximal expected array size. There is no point in preallocating an empty array or an array having a smaller size than the expected maximum, since dynamic memory reallocation will automatically kick in within the processing loop. For this reason, do not use the empty() method of class objects to preallocate, but rather repmat as explained above.

When using repmat to replicate class objects, always note whether you are replicating the object itself (this happens if your class does NOT derive from handle) or its reference handle (which happens if you derive the class from handle). If you are replicating objects, then you can safely edit any of their properties independently of each other. But if you replicate references, you are merely using multiple copies of the same reference, so that modifying referenced object #1 will also automatically affect all the other referenced objects. This may or may not be suitable for your particular program requirements, so check carefully. If you actually need independent object copies, you will need to call the class constructor multiple times, once for each new independent object.

Next week: what if we can't avoid dynamic array resizing? Apparently, all is not lost. Stay tuned.
