
Such an API permits the fine tuning required to minimize redundant data copies to and from the GPU in arbitrarily complicated scenarios such that maximum performance is achieved. But it is less convenient when just a few BLAS routines need to be accelerated (simple data copy) or when vast amounts of code need to be modified (large programmer effort). In these cases it would be useful to have an API which managed the data transfer to and from the GPU automatically and could be used as a direct replacement for CPU BLAS libraries.

Additionally, there is the common case where the input matrices to the BLAS operations are too large to fit on the GPU.
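One common way to cope with operands that do not fit on the GPU is to split the product into tiles, so that only a few blocks need to be resident at once. The following is a minimal pure-Python sketch of that idea, not the API of any particular library: `tiled_matmul` and its `tile` parameter are hypothetical names, and ordinary loops stand in for the device copy and the GPU kernel.

```python
def tiled_matmul(a, b, tile):
    """Multiply a (m x k) by b (k x n), processing one tile-sized block at a time.

    Assumption: in a GPU-backed library each block would be copied to the
    device, multiplied there, and accumulated. Here plain Python loops
    stand in for that step.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            for p0 in range(0, k, tile):
                # Only one block of a, one of b, and one of c are touched here,
                # so only these three would need to fit in device memory.
                for i in range(i0, min(i0 + tile, m)):
                    for p in range(p0, min(p0 + tile, k)):
                        a_ip = a[i][p]
                        for j in range(j0, min(j0 + tile, n)):
                            c[i][j] += a_ip * b[p][j]
    return c
```

The result does not depend on `tile`; the parameter only controls how much data is worked on at once, which is what makes the approach usable when the full matrices exceed the available memory.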

Mon 05:49:15 PM UTC, comment #6:

The memory problem already exists. After one of the operations involving NaN, every single element will become NaN and require storage. In fact, the situation will be worse than a full matrix because there will be the extra overhead of the sparse matrix implementation.

In general, even without sparse_auto_mutate set to true, Octave will convert to full for operations that are likely to result in a full matrix. From the documentation,

22.1.4.2 Return Types of Operators and Functions

The two basic reasons to use sparse matrices are to reduce the memory usage and to not have to do calculations on zero elements. The two are closely related in that the computation time on a sparse matrix operator or function is roughly linear with the number of nonzero elements. Therefore, there is a certain density of nonzero elements of a matrix where it no longer makes sense to store it as a sparse matrix, but rather as a full matrix. For this reason operators and functions that have a high probability of returning a full matrix will always return one. For example adding a scalar constant to a sparse matrix will almost always make it a full matrix.

For addition and subtraction, the sparse matrices are already converted to full, the NaN is added or subtracted, and the operation follows IEEE guidelines.

At a minimum, Octave needs to check whether each operand contains a NaN and it needs to specifically guarantee that those locations result in NaN at the output. The subsequent decision, about whether to convert to full if it would save space, should probably be left up to Octave, or to the sparse_auto_mutate code.

Pseudo-code which might work for Z = X op Y: [...]

Of course, all of that would need to be in C++.

Another choice would be to look carefully at the code which performs the operation 'op'. For sparse matrices, I believe the code is equivalent to

    elements_to_operate_on = intersection (nonzero1, nonzero2)
    Z = X(elements_to_operate_on) op Y(elements_to_operate_on)

The "intersection" function for sparse matrices in C++ could be changed to include elements where only one of the matrices had a NaN.
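To make the NaN requirement concrete, here is a small Python sketch of an elementwise product that operates on the intersection of stored elements and additionally keeps every position where only one operand stores a NaN. It is an illustration under stated assumptions, not Octave's implementation: the dictionary representation `{(row, col): value}` and the name `sparse_times` are hypothetical, and a real fix would live in Octave's C++ sparse code.

```python
import math


def sparse_times(x, y):
    """Elementwise product of two sparse matrices stored as {(row, col): value}.

    Positions missing from a dictionary are implicit zeros.
    """
    result = {}
    # Ordinary case: positions stored in both operands (the intersection).
    for pos in x.keys() & y.keys():
        result[pos] = x[pos] * y[pos]
    # Extra case: a NaN stored in only one operand meets an implicit zero.
    # Under IEEE 754, 0 * NaN is NaN, so the position must appear in the output.
    for a, b in ((x, y), (y, x)):
        for pos, value in a.items():
            if pos not in b and math.isnan(value):
                result[pos] = math.nan
    return result
```

A short usage example: with `x = {(0, 0): 2.0, (0, 1): math.nan}` and `y = {(0, 0): 4.0}`, the result stores the product at `(0, 0)` and a NaN at `(0, 1)`, whereas a pure intersection would silently drop the NaN.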

At least we understand what behavior needs to be implemented.
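On the question of whether converting to full would save space, a rough storage estimate shows where the break-even point lies. The sketch below assumes a compressed-column layout with 8-byte values and 8-byte indices; the actual index width depends on how Octave was built, and the function names are hypothetical.

```python
def sparse_bytes(rows, cols, nnz, value_size=8, index_size=8):
    """Approximate storage for a compressed-column sparse matrix.

    Assumed layout: one value and one row index per stored element,
    plus cols + 1 column pointers. rows is unused but kept for symmetry.
    """
    return nnz * (value_size + index_size) + (cols + 1) * index_size


def full_bytes(rows, cols, value_size=8):
    """Storage for a dense matrix: one value per element."""
    return rows * cols * value_size


def cheaper_as_full(rows, cols, nnz, value_size=8, index_size=8):
    """True when dense storage is no larger than the sparse estimate."""
    return full_bytes(rows, cols, value_size) <= sparse_bytes(
        rows, cols, nnz, value_size, index_size
    )
```

Under these assumptions a 1000 x 1000 matrix with 1000 stored elements is far smaller in sparse form, while the same matrix with every element stored (for example after NaN has spread to every position) needs roughly twice the memory of a full matrix.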
