% -*- texinfo -*-
% @deftypefn {Function File} {} speed (@var{f}, @var{init}, @var{max_n}, @var{f2}, @var{tol})
% @deftypefnx {Function File} {[@var{order}, @var{n}, @var{T_f}, @var{T_f2}] =} speed (@dots{})
%
% Determine the execution time of an expression for various @var{n}.
% The @var{n} are log-spaced from 1 to @var{max_n}. For each @var{n},
% an initialization expression is evaluated to create whatever data
% are needed for the test. If a second expression is given, the
% execution times of the two expressions will be compared. Called
% without output arguments, the results are presented graphically.
%
% @table @code
% @item @var{f}
% The expression to evaluate.
%
% @item @var{max_n}
% The maximum test length to run. The default value is 100. Alternatively,
% use @code{[min_n, max_n]} or, for complete control, @code{[n1, n2, @dots{}, nk]}.
%
% @item @var{init}
% Initialization expression for function argument values. Use @var{k}
% for the test number and @var{n} for the size of the test. This should
% compute values for all variables used by the expression @var{f}. Note
% that @var{init} will be evaluated first for @math{k = 0}, so things which
% are constant throughout the test can be computed then. The default value
% is @code{@var{x} = randn (@var{n}, 1);}.
%
% @item @var{f2}
% An alternative expression to evaluate, so that the speed of the two
% can be compared. The default is @code{[]}.
%
% @item @var{tol}
% If @var{tol} is @code{Inf}, then no comparison will be made between the
% results of expression @var{f} and expression @var{f2}. Otherwise,
% expression @var{f} should produce a value @var{v} and expression @var{f2}
% should produce a value @var{v2}, and these will be compared using
% @code{assert (@var{v}, @var{v2}, @var{tol})}. If @var{tol} is positive,
% the tolerance is assumed to be absolute. If @var{tol} is negative,
% the tolerance is assumed to be relative. The default is @code{eps}.
%
% @item @var{order}
% The time complexity of the expression, @code{O(a*n^p)}. This
% is a structure with fields @code{a} and @code{p}.
%
% @item @var{n}
% The values @var{n} for which the expression was evaluated and
% the execution time was greater than zero.
%
% @item @var{T_f}
% The nonzero execution times recorded for the expression @var{f}, in seconds.
%
% @item @var{T_f2}
% The nonzero execution times recorded for the expression @var{f2}, in seconds.
% If it is needed, the mean time ratio is just @code{mean (T_f ./ T_f2)}.
%
% @end table
%
% The slope of the execution time graph shows the approximate
% power of the asymptotic running time @code{O(n^p)}. This
% power is plotted for the region over which it is approximated
% (the latter half of the graph). The estimated power is not
% very accurate, but should be sufficient to determine the
% general order of your algorithm. It should indicate, for
% example, whether your implementation is unexpectedly @code{O(n^2)}
% rather than @code{O(n)} because it extends a vector each
% time through the loop rather than preallocating one which is
% big enough. For example, in the current version of Octave,
% the following is not the expected @code{O(n)}:
%
% @example
% speed ('for i = 1:n, y@{i@} = x(i); end', '', [1000, 10000])
% @end example
%
% but it is if you preallocate the cell array @code{y}:
%
% @example
% @group
% speed ('for i = 1:n, y@{i@} = x(i); end', ...
%        'x = rand (n, 1); y = cell (size (x));', [1000, 10000])
% @end group
% @end example
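%
% Since the estimated power is also returned in @var{order}, you can
% capture it instead of plotting it. For instance (a rough sketch; the
% fitted values will vary from run to run and machine to machine), the
% preallocated loop above should report @code{order.p} close to 1:
%
% @example
% @group
% [order, n, T_f] = speed ('for i = 1:n, y@{i@} = x(i); end', ...
%                          'x = rand (n, 1); y = cell (size (x));', ...
%                          [1000, 10000]);
% order.p   ## approximate power p in O(a*n^p); expect p near 1 here
% @end group
% @end example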
%
% An attempt is made to approximate the cost of the individual
% operations, but it is wildly inaccurate. You can improve the
% stability somewhat by doing more work for each @var{n}. For
% example:
%
% @example
% speed ('airy (x)', 'x = rand (n, 10);', [10000, 100000])
% @end example
%
% When comparing a new and an original expression, the line on the
% speedup ratio graph should be above 1 if the new expression
% is faster. Better algorithms have a shallow slope. Generally,
% vectorizing an algorithm will not change the slope of the execution
% time graph, but it will shift it relative to the original. For
% example:
%
% @example
% @group
% speed ('v = sum (x)', '', [10000, 100000], ...
%        'v = 0; for i = 1:length (x), v += x(i); end')
% @end group
% @end example
%
% As a more complex example, if you had an original version of
% @code{xcorr} using for loops and another version using an FFT, you
% could compare the run speed for various lags, or for a fixed lag
% with varying vector lengths, as follows:
%
% @example
% @group
% speed ('v = xcorr (x, n)', 'x = rand (128, 1);', 100, ...
%        'v2 = xcorr_orig (x, n)', -100*eps)
% speed ('v = xcorr (x, 15)', 'x = rand (20+n, 1);', 100, ...
%        'v2 = xcorr_orig (x, 15)', -100*eps)
% @end group
% @end example
%
% Assuming one of the two versions is in @code{xcorr_orig}, this
% would compare their speed and their output values. Note that the
% FFT version is not exact, so we specify an acceptable tolerance of
% @code{100*eps} for the comparison; because the tolerance is negative,
% the errors are computed relatively, as
% @code{abs ((@var{v} - @var{v2}) ./ @var{v2})}, rather than
% absolutely as @code{abs (@var{v} - @var{v2})}.
%
% Type @code{example ('speed')} to see some real examples. Note that,
% for obscure reasons, you cannot run examples 1 and 2 directly using
% @code{demo ('speed')}. Instead use @code{eval (example ('speed', 1))}
% and @code{eval (example ('speed', 2))}.
% @end deftypefn