Example Programs
All of the example code can be found in the examples/ subdirectory under
the installation prefix.
Minimal Example
This example shows the minimal code needed to call a QML function.
It constructs two 1024x1024 matrices, A and B, with every entry set to 1.
It then calls cblas_sgemm to multiply them and store the 1024x1024 result
in matrix C. The first value of the result is printed and should be 1024,
because each entry of C is a sum of 1024 products of ones.
#include <qml.h>
#include <iostream>
using namespace std;
int main()
{
    const uint32_t matrixSize = 1024*1024;
    float *A = new float[matrixSize];
    float *B = new float[matrixSize];
    float *C = new float[matrixSize];
    for(uint32_t i=0; i < matrixSize; i++)
    {
        A[i] = B[i] = C[i] = 1.0;
    }
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, 1024, 1024, 1024,
                1.0, A, 1024, B, 1024, 0.0, C, 1024);
    cout << "Value of C[0] is: " << C[0] << endl;
    delete[] C;
    delete[] B;
    delete[] A;
    return 0;
}
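The expected value of 1024 can be checked without QML. The following is a minimal sketch of a reference row-major product written with plain loops. The helper name reference_gemm is hypothetical and introduced only for illustration; it is not part of QML and says nothing about how QML computes the product. For all-ones inputs of size n x n, every entry of the result equals n.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not part of QML).
// Computes C = A * B for square n x n matrices stored in row-major order.
std::vector<float> reference_gemm(const std::vector<float>& A,
                                  const std::vector<float>& B,
                                  std::size_t n)
{
    assert(A.size() == n * n && B.size() == n * n);
    std::vector<float> C(n * n, 0.0f);
    for (std::size_t i = 0; i < n; i++)
    {
        for (std::size_t k = 0; k < n; k++)
        {
            // Element (i, k) of A scales row k of B into row i of C
            const float a = A[i * n + k];
            for (std::size_t j = 0; j < n; j++)
            {
                C[i * n + j] += a * B[k * n + j];
            }
        }
    }
    return C;
}
```

Calling reference_gemm on two all-ones 16x16 matrices gives 16 in every entry. The same reasoning gives 1024 for the 1024x1024 case in the example.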
CBLAS Example
This example shows how row-major indexing works in CBLAS and how offsets and leading dimensions can be passed to QML functions to operate on subregions of larger matrices.
#include <qml.h>
#include <iostream>
/* Example program that uses the CBLAS interface to permute part of a matrix.
This program constructs a matrix A of size 6x8 with entries:
[ 1, 2, 3, 4, 5, 6, 7, 8; ]
[ 9, 10, 11, 12, 13, 14, 15, 16; ]
[ 17, 18, 19, 20, 21, 22, 23, 24; ]
[ 25, 26, 27, 28, 29, 30, 31, 32; ]
[ 33, 34, 35, 36, 37, 38, 39, 40; ]
[ 41, 42, 43, 44, 45, 46, 47, 48; ]
It constructs an explicit permutation matrix of size 4x4 with entries:
[ 0, 1, 0, 0; ]
[ 1, 0, 0, 0; ]
[ 0, 0, 0, 1; ]
[ 0, 0, 1, 0; ]
The operation to be performed is to multiply the lower-left 4x4 block
of A (rows 3 to 6, columns 1 to 4) by the permutation matrix on the
right and store the result in a new matrix C. Multiplying by P on the
right swaps columns 1 and 2 and columns 3 and 4, so C should be:
[ 17, 18, 19, 20; ]   [ 0, 1, 0, 0; ]   [ 18, 17, 20, 19; ]
[ 25, 26, 27, 28; ] * [ 1, 0, 0, 0; ] = [ 26, 25, 28, 27; ]
[ 33, 34, 35, 36; ]   [ 0, 0, 0, 1; ]   [ 34, 33, 36, 35; ]
[ 41, 42, 43, 44; ]   [ 0, 0, 1, 0; ]   [ 42, 41, 44, 43; ]
All matrices are stored in row-major order in double precision.
*/
int main()
{
    // A is 6 x 8
    // Create on the heap
    const uint32_t A_rows = 6;
    const uint32_t A_cols = 8;
    double *A = new double[A_rows * A_cols];
    const uint32_t LDA = A_cols;
    // Fill out A
    // Row-major means consecutive memory locations run along a row,
    // so the entries increase from left to right, then top to bottom
    for(uint32_t i=0; i < A_rows * A_cols; i++)
    {
        A[i] = i + 1;
    }
    // P is k x k with k = 4
    // Create on the stack
    const uint32_t P_size = 4;
    double P[] = { 0, 1, 0, 0,
                   1, 0, 0, 0,
                   0, 0, 0, 1,
                   0, 0, 1, 0 };
    const uint32_t LDP = P_size;
    // C is k x k
    // Create on the heap
    double *C = new double[P_size * P_size];
    const uint32_t LDC = P_size;
    // CBLAS call to dgemm
    // Operation:
    //   C := 1.0 * A[3..6][1..4] * P + 0.0 * C
    // A + LDA * 2 skips two full rows of A to reach row 3; passing LDA as
    // the leading dimension keeps the row stride of the full matrix
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, P_size, P_size,
                P_size, 1.0, A + LDA * 2, LDA, P, LDP, 0.0, C, LDC);
    // Show result matrix C in row-major order
    for (uint32_t row = 0; row < P_size; row++)
    {
        for (uint32_t col = 0; col < P_size; col++)
        {
            std::cout << C[col + LDC * row] << " ";
        }
        std::cout << std::endl;
    }
    // Cleanup
    delete[] A;
    delete[] C;
    return 0;
}
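To make the roles of the pointer offset and the leading dimension concrete, here is a minimal sketch that reproduces the same block product with plain loops instead of a QML call. The names block_times_matrix and permute_lower_left_block are hypothetical helpers introduced for illustration only. The sketch assumes row-major storage throughout, as in the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of QML).
// Computes C = A_sub * B with all matrices in row-major order. A_sub is an
// m x k block inside a larger matrix: `a` points at the block's first
// element and `lda` is the row stride of the enclosing matrix.
std::vector<double> block_times_matrix(const double* a, std::size_t lda,
                                       const double* b, std::size_t ldb,
                                       std::size_t m, std::size_t n,
                                       std::size_t k)
{
    assert(lda >= k && ldb >= n);
    std::vector<double> c(m * n, 0.0);
    for (std::size_t i = 0; i < m; i++)
    {
        for (std::size_t j = 0; j < n; j++)
        {
            double sum = 0.0;
            for (std::size_t l = 0; l < k; l++)
            {
                // Stepping by lda moves down one row of the full matrix
                sum += a[i * lda + l] * b[l * ldb + j];
            }
            c[i * n + j] = sum;
        }
    }
    return c;
}

// Rebuilds the 6x8 matrix and the 4x4 permutation from the example and
// returns the 4x4 product in row-major order.
std::vector<double> permute_lower_left_block()
{
    const std::size_t rows = 6, cols = 8, p = 4;
    std::vector<double> A(rows * cols);
    for (std::size_t i = 0; i < A.size(); i++)
    {
        A[i] = static_cast<double>(i + 1);
    }
    const double P[] = { 0, 1, 0, 0,
                         1, 0, 0, 0,
                         0, 0, 0, 1,
                         0, 0, 1, 0 };
    // Skip two full rows (2 * cols elements) to reach the row starting at 17
    return block_times_matrix(A.data() + 2 * cols, cols, P, p, p, p, p);
}
```

The first row of the returned matrix is 18, 17, 20, 19, which matches the result described in the comment of the example program.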
BLAS Solve Example
This example shows how to use the Fortran BLAS interface from C++ code to solve a system of linear equations.
#include <qml.h>
#include <iostream>
/* Example program that uses the BLAS interface to solve a system of
equations.
This program constructs a matrix A of size 5x5 with entries:
[ 1, 0, 0, 0, 0 ]
[ 0, 2, -1, 0, 0 ]
[ 0, 0, 1.5, -1, 0 ]
[ 0, 0, 0, 1.333, -1 ]
[ 0, 0, 0, 0, 1 ]
and a vector b with entries:
[ 1 ]
[ 2 ]
[ 2 ]
[ 2.333 ]
[ 1 ]
We would like to solve the system of equations A * x = b for x.
Matrix A is stored in column-major order using single precision.
*/
int main()
{
    // A is 5x5
    // Create on the heap
    const qml_long A_size = 5;
    float *A = new float[A_size * A_size]{}; // Initialize to 0
    const qml_long LDA = A_size;
    // Fill out non-zero A entries
    // Column-major: element (row, col) is at A[row + col * LDA]
    A[0 + 0 * LDA] = 1.0f;
    A[1 + 1 * LDA] = 2.0f;
    A[2 + 2 * LDA] = 1.5f;
    A[3 + 3 * LDA] = 1.33333333f;
    A[4 + 4 * LDA] = 1.0f;
    A[1 + 2 * LDA] = -1.0f;
    A[2 + 3 * LDA] = -1.0f;
    A[3 + 4 * LDA] = -1.0f;
    // b is length 5
    // Create on the stack, named V because it holds b on entry and x on return
    const qml_long V_size = 5;
    float V[] = { 1.0f, 2.0f, 2.0f, 2.33333333f, 1.0f };
    const qml_long INCV = 1;
    // BLAS call to strsv (single precision triangular solver)
    // Operation:
    //   Solve A x = b for x
    //   Upper triangular
    //   Non-transposed solve
    //   Non-unit diagonal
    // Note we pass addresses of scalars following Fortran convention
    strsv("U", "N", "N", &A_size, A, &LDA, V, &INCV);
    // Show result vector
    for (qml_long i = 0; i < V_size; i++)
    {
        std::cout << V[i] << std::endl;
    }
    // Cleanup
    delete[] A;
    return 0;
}
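For reference, an upper-triangular solve amounts to back substitution. The following is a minimal sketch of a column-oriented back substitution on the same column-major data, written without QML. The helpers back_substitute and solve_example_system are hypothetical names introduced for illustration, and the sketch is not a description of QML's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of QML).
// Solves U * x = b in place, where U is n x n, upper triangular with a
// non-unit diagonal, stored in column-major order with leading dimension ldu.
// On entry v holds b; on return it holds x.
void back_substitute(const std::vector<float>& U, std::size_t ldu,
                     std::vector<float>& v)
{
    const std::size_t n = v.size();
    assert(ldu >= n && U.size() >= ldu * n);
    for (std::size_t j = n; j-- > 0;)
    {
        // x_j is known once all later unknowns have been eliminated
        v[j] /= U[j + j * ldu];
        // Remove the contribution of x_j from the rows above it
        for (std::size_t i = 0; i < j; i++)
        {
            v[i] -= v[j] * U[i + j * ldu];
        }
    }
}

// Rebuilds the 5x5 system from the example and returns its solution.
std::vector<float> solve_example_system()
{
    const std::size_t n = 5;
    std::vector<float> A(n * n, 0.0f);
    A[0 + 0 * n] = 1.0f;
    A[1 + 1 * n] = 2.0f;
    A[2 + 2 * n] = 1.5f;
    A[3 + 3 * n] = 1.33333333f;
    A[4 + 4 * n] = 1.0f;
    A[1 + 2 * n] = -1.0f;
    A[2 + 3 * n] = -1.0f;
    A[3 + 4 * n] = -1.0f;
    std::vector<float> v = { 1.0f, 2.0f, 2.0f, 2.33333333f, 1.0f };
    back_substitute(A, n, v);
    return v;
}
```

For this system the solution is approximately 1, 2.5, 3, 2.5, 1, up to single-precision rounding.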
LAPACK Least Squares Example
This example shows how to use the LAPACK interface of QML from C++
code to find the best-fit quadratic equation through a set of points.
The example calls dgels to compute the linear least squares solution.
#include <qml.h>
#include <iostream>
/* Example program that uses the LAPACK interface to compute
the best-fit quadratic equation through a set of data points.
The problem is to find values for beta_0, beta_1, and beta_2 so that
the formula
y = beta_0 + beta_1 * x + beta_2 * x^2
is the "best" approximation to the following observed data points.
x | y
-----+-----
1 | 3
2 | 4
4 | 13
5 | 27
Here "best" means minimizing the sum of the squares of the
differences.
We use the LAPACK GELS function to compute a least-squares solution
to the overdetermined A * x = b.
A is the matrix:
[ 1 1 1 ]
[ 1 2 4 ]
[ 1 4 16 ]
[ 1 5 25 ]
x is the unknown vector:
[ beta_0 ]
[ beta_1 ]
[ beta_2 ]
b is the matrix:
[ 3 ]
[ 4 ]
[ 13 ]
[ 27 ]
Matrix A is stored in column-major order in double precision.
Even though x and b are vectors here, GELS can handle multiple right
hand sides simultaneously so it takes x and b as matrices.
*/
int main()
{
    // A is 4x3
    // Create on the heap
    const qml_long A_rows = 4, A_cols = 3;
    // Column major, each text row below is a column of A
    double *A = new double[A_rows * A_cols]{
        1.0, 1.0, 1.0, 1.0,
        1.0, 2.0, 4.0, 5.0,
        1.0, 4.0, 16.0, 25.0
    };
    const qml_long LDA = A_rows;
    const qml_long NRHS = 1;
    // b is length 4
    // Will contain x after dgels returns
    const qml_long B_size = 4;
    const qml_long X_size = 3;
    double *B = new double[B_size]{ 3.0, 4.0, 13.0, 27.0 };
    const qml_long LDB = B_size;
    // Query required workspace size
    qml_long lwork = -1; // query into first position of workspace
    double opt_worksize;
    qml_long info;
    dgels("N", &A_rows, &A_cols, &NRHS, A, &LDA, B, &LDB, &opt_worksize, &lwork, &info);
    // Allocate optimal scratch space
    // (Need cast to integer type because optimal size is given as a double)
    lwork = static_cast<qml_long>(opt_worksize);
    double *WORK = new double[lwork]{};
    // LAPACK call to dgels (double precision least squares solver)
    dgels("N", &A_rows, &A_cols, &NRHS, A, &LDA, B, &LDB, WORK, &lwork, &info);
    // Check for success
    if (info != 0)
    {
        std::cout << "ERROR computing least squares solution" << std::endl;
        return 1;
    }
    // Show result (should be y = 8.73333 * x^0 + -7.3 * x^1 + 2.16667 * x^2)
    std::cout << "Best fit equation is y = ";
    for (qml_long i = 0; i < X_size; i++)
    {
        std::cout << B[i] << " * x^" << i << " ";
        if (i < X_size - 1)
        {
            std::cout << " + ";
        }
    }
    std::cout << std::endl;
    // Cleanup
    delete[] A;
    delete[] B;
    delete[] WORK;
    return 0;
}
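The expected coefficients can be cross-checked without LAPACK. The following is a minimal sketch that forms the 3x3 normal equations (A^T A) beta = A^T y and solves them by Gaussian elimination. The function fit_quadratic is a hypothetical helper introduced for illustration. This is a different technique from dgels, which works from a QR or LQ factorization; the normal equations are used here only because they are short to write for a small, well-conditioned problem.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of QML).
// Fits y = beta[0] + beta[1] * x + beta[2] * x^2 in the least-squares sense
// by solving the normal equations (A^T A) beta = A^T y.
std::array<double, 3> fit_quadratic(const std::vector<double>& xs,
                                    const std::vector<double>& ys)
{
    assert(xs.size() == ys.size() && xs.size() >= 3);
    // Augmented system [ A^T A | A^T y ]
    double M[3][4] = {};
    for (std::size_t p = 0; p < xs.size(); p++)
    {
        const double basis[3] = { 1.0, xs[p], xs[p] * xs[p] };
        for (int r = 0; r < 3; r++)
        {
            for (int c = 0; c < 3; c++)
            {
                M[r][c] += basis[r] * basis[c];
            }
            M[r][3] += basis[r] * ys[p];
        }
    }
    // Forward elimination with partial pivoting
    for (int col = 0; col < 3; col++)
    {
        int pivot = col;
        for (int r = col + 1; r < 3; r++)
        {
            if (std::fabs(M[r][col]) > std::fabs(M[pivot][col]))
            {
                pivot = r;
            }
        }
        for (int c = 0; c < 4; c++)
        {
            std::swap(M[col][c], M[pivot][c]);
        }
        for (int r = col + 1; r < 3; r++)
        {
            const double factor = M[r][col] / M[col][col];
            for (int c = col; c < 4; c++)
            {
                M[r][c] -= factor * M[col][c];
            }
        }
    }
    // Back substitution on the resulting upper-triangular system
    std::array<double, 3> beta{};
    for (int r = 2; r >= 0; r--)
    {
        double sum = M[r][3];
        for (int c = r + 1; c < 3; c++)
        {
            sum -= M[r][c] * beta[c];
        }
        beta[r] = sum / M[r][r];
    }
    return beta;
}
```

Calling fit_quadratic with x values 1, 2, 4, 5 and y values 3, 4, 13, 27 gives coefficients close to 8.73333, -7.3, and 2.16667, in agreement with the result quoted in the example.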
LAPACK Singular Value Decomposition
This example shows how to use the LAPACK interface of QML from C++
code to find the singular value decomposition of a matrix. It calls
dgesvd to do the decomposition.
#include <qml.h>
#include <iostream>
/* Example program that uses the LAPACK interface to compute
the singular value decomposition (SVD) of a matrix.
The problem is to find U, SIGMA, and V^T such that
U * SIGMA * V^T = [ 3 2 2 ]
[ 2 3 -2 ]
where U is a 2x2 orthogonal matrix, V^T is a 3x3 orthogonal matrix,
and SIGMA is a 2x3 matrix with non-zero elements only on the diagonal.
*/
int main()
{
    // A is 2x3
    // Create on the heap
    const qml_long rows = 2, cols = 3;
    // Column major, each text row below is a column of A
    double *A = new double[rows * cols]{
        3.0, 2.0,
        2.0, 3.0,
        2.0, -2.0
    };
    // S is diagonal entries of SIGMA
    double *S = new double[rows]{};
    // U is 2x2
    double *U = new double[rows * rows]{};
    // VT is 3x3
    double *VT = new double[cols * cols]{};
    // Query required workspace size
    qml_long lwork = -1; // query into first position of workspace
    double opt_worksize;
    qml_long info;
    dgesvd("A", "A", &rows, &cols, A, &rows, S, U, &rows, VT, &cols, &opt_worksize, &lwork, &info);
    // Allocate optimal scratch space
    // (Need cast to integer type because optimal size is given as a double)
    lwork = static_cast<qml_long>(opt_worksize);
    double *WORK = new double[lwork]{};
    // Do SVD
    dgesvd("A", "A", &rows, &cols, A, &rows, S, U, &rows, VT, &cols, WORK, &lwork, &info);
    // Check for success
    if (info != 0)
    {
        std::cout << "ERROR computing SVD" << std::endl;
        return 1;
    }
    // Show singular values (should be [ 5 3 ])
    std::cout << "Singular values are [ ";
    for (qml_long i = 0; i < rows; i++)
    {
        std::cout << S[i] << " ";
    }
    std::cout << "]\n";
    // Show the first left singular vector (should be [ -0.707107 -0.707107 ])
    std::cout << "First left singular vector is [ ";
    for (qml_long i = 0; i < rows; i++)
    {
        std::cout << U[i] << " ";
    }
    std::cout << "]\n";
    // Cleanup
    delete[] A;
    delete[] S;
    delete[] U;
    delete[] VT;
    delete[] WORK;
    return 0;
}
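The expected singular values can be cross-checked by hand for a matrix with only two rows. The following is a minimal sketch that computes them as the square roots of the eigenvalues of the 2x2 matrix A * A^T, using the closed-form eigenvalues of a symmetric 2x2 matrix. The function singular_values_2xn is a hypothetical helper introduced for illustration. This shortcut squares the condition number and is not how dgesvd computes the decomposition; it is suitable only as a sanity check on small, well-conditioned inputs.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of QML).
// Returns the two singular values, largest first, of a 2 x n matrix stored
// in column-major order with leading dimension 2.
std::array<double, 2> singular_values_2xn(const std::vector<double>& A)
{
    assert(!A.empty() && A.size() % 2 == 0);
    // Entries of the symmetric Gram matrix G = A * A^T
    double g00 = 0.0, g01 = 0.0, g11 = 0.0;
    for (std::size_t j = 0; j < A.size() / 2; j++)
    {
        const double top = A[2 * j];        // element (0, j)
        const double bottom = A[2 * j + 1]; // element (1, j)
        g00 += top * top;
        g01 += top * bottom;
        g11 += bottom * bottom;
    }
    // Eigenvalues of [ g00 g01; g01 g11 ] are mean +/- radius
    const double mean = 0.5 * (g00 + g11);
    const double radius =
        std::sqrt(0.25 * (g00 - g11) * (g00 - g11) + g01 * g01);
    // Clamp at zero to guard against a tiny negative value from rounding
    return { std::sqrt(mean + radius),
             std::sqrt(std::max(0.0, mean - radius)) };
}
```

For the matrix in the example, A * A^T is [ 17 8; 8 17 ], whose eigenvalues are 25 and 9, so the singular values are 5 and 3.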
Building the Examples
One way to build your application or library and link against QML
is to use the official Android NDK. You declare that QML is a
pre-built library using directives in an Android.mk file inside the
jni directory of an application tree.
include $(CLEAR_VARS)
LOCAL_MODULE := QML
LOCAL_SRC_FILES := <install-prefix>/lib<name>.so
LOCAL_EXPORT_C_INCLUDES := <install-prefix>/include
include $(PREBUILT_SHARED_LIBRARY)
Once this module has been declared, you can let the build system
know that your application links against QML by adding the following
directive to your application’s Android.mk:
LOCAL_SHARED_LIBRARIES += QML
To have access to newer C++11 features you also need to link against a
full-featured version of the C++ runtime library and enable C++11
features in the compiler. These settings are set in Application.mk.
APP_STL := gnustl_shared
APP_CPPFLAGS += -std=c++11
To build the provided example programs in this way, run ndk-build
from the examples/ directory of the installation for the desired
architecture. This produces executables and library files inside
libs/<arch>/. Within a complete application, these executables and
libraries are packaged into the final APK and placed in the correct
location when the application is installed.
Running the Examples
There are two main ways to run the examples. If you control the Android platform on the device, you can add QML to the system libraries that are available to all applications. If you do not control the platform, include the libraries as part of your application so that they are installed in the private application area during installation.
Platform control
If you control the Android platform, you can install QML
into the system libraries location of the device. The typical
location is /system/vendor/lib/ for 32-bit architectures
or /system/vendor/lib64/ for 64-bit architectures.
With root access in ADB, this can be done for testing purposes using adb push.
Once the libraries are installed in system locations, running the examples only requires copying the executables to any location on the device and running them.
Local install
To run the executables without root access, copy the libraries and executables from
libs/<arch>/ to an accessible directory on the device such as /data/local/test. Once QML
and the test application are in the same directory, run the executable with a command such as:
LD_LIBRARY_PATH=. ./MinimalExample
Complete applications packaged inside an APK will automatically install
the libraries into the correct locations during installation if the libraries
are declared as previously described in the Android.mk file.