File elemwise.h

Custom elementwise operations generator.

Defines

GE_SCALAR 0x0001

Argument is a scalar passed from the CPU, requires nd == 0.

GE_READ 0x0002

Array is read from in the expression.

GE_WRITE 0x0004

Array is written to in the expression.

GE_NOADDR64 0x0001

Don’t precompile kernels for 64-bit addressing.

GE_CONVERT_F16 0x0002

Convert float16 inputs to float32 for computation.

GE_BROADCAST 0x0100

Allow broadcasting of dimensions of size 1.

GE_NOCOLLAPSE 0x0200

Disable dimension collapsing (not recommended).
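The constants above are bit masks that are combined with bitwise OR. Note that GE_SCALAR and GE_NOADDR64 share the value 0x0001, which suggests the defines fall into separate groups (argument flags, GpuElemwise flags, call flags) that belong in different `flags` fields. The following is a minimal sketch under that assumption. The values are copied from this page so the snippet compiles on its own, and the three helper functions are hypothetical, not part of the library.

```c
#include <stdbool.h>

/* Values copied from the Defines above; real code takes them from elemwise.h. */
#define GE_SCALAR      0x0001
#define GE_READ        0x0002
#define GE_WRITE       0x0004
#define GE_NOADDR64    0x0001
#define GE_CONVERT_F16 0x0002
#define GE_BROADCAST   0x0100
#define GE_NOCOLLAPSE  0x0200

/* Hypothetical helper: argument flags for an array updated in place
 * (both read from and written to in the expression). */
int inplace_arg_flags(void) {
  return GE_READ | GE_WRITE;
}

/* Hypothetical helper: true if the argument flags describe an array,
 * i.e. the GE_SCALAR bit is not set. */
bool arg_is_array(int arg_flags) {
  return (arg_flags & GE_SCALAR) == 0;
}

/* Hypothetical helper: call flags that allow broadcasting of size-1
 * dimensions while leaving dimension collapsing enabled. */
int broadcasting_call_flags(void) {
  return GE_BROADCAST;
}
```

Because the groups reuse the same numeric values, passing an argument flag where a GpuElemwise flag is expected would be silently accepted by the compiler, so it is worth keeping the three kinds of flags in clearly named variables.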

Typedefs

typedef struct _GpuElemwise GpuElemwise

Elementwise generator structure.

The contents are private.

Functions

GpuElemwise* GpuElemwise_new(gpucontext * ctx, const char * preamble, const char * expr, unsigned int n, gpuelemwise_arg * args, unsigned int nd, int flags)

Create a new GpuElemwise.

This will allocate and initialize a new GpuElemwise object. This object can be used to run the specified operation on different sets of arrays.

The argument descriptors name the arguments and provide their data types and geometry (arrays or scalars). They also specify whether each argument is used for reading or writing. An argument can be used for both.

The expression is a C-like string performing an operation with scalar values named according to the argument descriptors. All of the indexing and selection of the right values is handled by the GpuElemwise code.

Return
a new GpuElemwise object, or NULL if creation fails
Parameters
  • ctx: the context in which to run the operations
  • preamble: code to be inserted before the kernel code
  • expr: the expression to compute
  • n: the number of arguments
  • args: the argument descriptors
  • nd: the number of dimensions to precompile for
  • flags: see GpuElemwise flags
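As an illustration of how the descriptors and the expression fit together, here is a minimal sketch for `out = alpha * x + y`, where `alpha` is a scalar and the other arguments are arrays. To keep the snippet compilable without the library, it uses a local stand-in struct (`sketch_elemwise_arg`) with the same three members as gpuelemwise_arg, and a placeholder typecode value; both are assumptions for the example. Real code would include elemwise.h and use one of the library's typecode constants.

```c
#include <stddef.h>

/* Flag values copied from this page; real code takes them from elemwise.h. */
#define GE_SCALAR 0x0001
#define GE_READ   0x0002
#define GE_WRITE  0x0004

/* Placeholder only: NOT a real typecode value. Use the library's
 * typecode constant for the element type in real code. */
#define SKETCH_TYPECODE_F32 0

/* Local stand-in with the same members as gpuelemwise_arg. */
typedef struct {
  const char *name;  /* name used inside the expression */
  int typecode;      /* dtype of the contents */
  int flags;         /* argument flags */
} sketch_elemwise_arg;

/* The expression refers to the arguments by their descriptor names. */
const char *sketch_axpy_expr = "out = alpha * x + y";

/* Fill descs[0..3] for the expression above and return the count (n). */
unsigned int sketch_axpy_args(sketch_elemwise_arg descs[4]) {
  descs[0] = (sketch_elemwise_arg){"alpha", SKETCH_TYPECODE_F32, GE_SCALAR};
  descs[1] = (sketch_elemwise_arg){"x",     SKETCH_TYPECODE_F32, GE_READ};
  descs[2] = (sketch_elemwise_arg){"y",     SKETCH_TYPECODE_F32, GE_READ};
  descs[3] = (sketch_elemwise_arg){"out",   SKETCH_TYPECODE_F32, GE_WRITE};
  return 4;
}

/* With the real library and real gpuelemwise_arg descriptors, creation
 * would then look roughly like this (ctx is an existing gpucontext):
 *
 *   GpuElemwise *ge = GpuElemwise_new(ctx, "", sketch_axpy_expr,
 *                                     4, descs, 1, 0);
 *   if (ge == NULL) { ... handle the failure ... }
 *   ...
 *   GpuElemwise_free(ge);
 */
```

The same object can be reused for any set of arrays matching these descriptors, so it is usually created once and kept around rather than rebuilt for every call.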

void GpuElemwise_free(GpuElemwise * ge)

Free all storage associated with a GpuElemwise.

Parameters
  • ge: the GpuElemwise object to free.

int GpuElemwise_call(GpuElemwise * ge, void ** args, int flags)

Run a GpuElemwise on some inputs.

Parameters
  • ge: the GpuElemwise to run
  • args: pointers to the arguments (must match what was described by the argument descriptors)
  • flags: see GpuElemwise call flags
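The sketch below shows one way to assemble the `args` array so that its order matches the descriptors passed at creation time. It assumes that an entry for an array argument points to the array object and an entry for a scalar argument points to the host value; this page does not spell that out, so treat it as an assumption. `sketch_array` is a hypothetical stand-in for the library's array type, used only so the snippet compiles on its own.

```c
#include <stddef.h>

/* Hypothetical stand-in for the library's array type. */
typedef struct {
  int placeholder;
} sketch_array;

/* Pack pointers in the same order as the descriptors used at creation:
 * here alpha (scalar), then x, y and out (arrays). */
void sketch_pack_args(void *packed[4], float *alpha,
                      sketch_array *x, sketch_array *y, sketch_array *out) {
  packed[0] = alpha;  /* scalar: pointer to the host value (assumed) */
  packed[1] = x;      /* array read by the expression */
  packed[2] = y;      /* array read by the expression */
  packed[3] = out;    /* array written by the expression */
}

/* With the real library the call would then look roughly like:
 *
 *   int res = GpuElemwise_call(ge, packed, 0);
 *
 * The meaning of the int result is not documented on this page;
 * it is presumably an error code.
 */
```

Keeping the packing in one small function next to the descriptor setup makes it harder for the two orderings to drift apart.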

struct gpuelemwise_arg
#include <elemwise.h>

Argument information structure for GpuElemwise.

Public Members

const char* name

Name of this argument in the associated expression, mandatory.

int typecode

Type of the argument, mandatory. Use the dtype of the contents, not GA_BUFFER.

int flags

Argument flags, mandatory (see GpuElemwise argument flags).