Mathematica on Twitter

Posted by kashif on May 01, 2009

A great article about Twittering with Mathematica on the Wolfram blog. I had investigated a while ago a Mathematica twitter bot for doing “Micro-calculations” with the results from Mathematica being less than 140 chars. Not very useful but a fun bot.

Anyways if you are interested, I made a gist for it. Its  in Java and uses JLink to communicate with Mathematica. It was never running for long as I suspect it violated some end user license, but basically one would send a Mathematica command to @mathematica and it would tweet you back your result evaluated by the MathKernel. I am hoping Wolfram might create a similar bot themselves for when you need to know the value of a special function quickly :-)

Mathematica Cuda plug-in now on github 2

Posted by kashif on December 23, 2008

I have decided to push the initial Mathematica Cuda plug-in to a public repo on github. Feel free to download or fork it.

The basic structure of the project follows that of the Nvidia’s Cuda SDK, in that the individual projects are in their own folder inside the projects folder. Right now I have the scalarProd example from Nvidia. I have also included Nvidia’s cuda utilities cutils and extended the make system to handle Mathematica template files.

Currently I have tested it only on 64-bit Linux, but hopefully I will see if I can get it working under Mac and Windows. I also plan to add more documentation in the project’s wiki on github, and hopefully  get some more useful examples implemented, perhaps FFT.

Mathematica plug-in for CUDA 3

Posted by kashif on June 26, 2008

Since there is a Matlab plug-in for CUDA that provides some examples of off-loading computation to the GPU, I thought it might be neat to have something similar for Mathematica. So as a start, I decided to try out a simple scalar product example using MathLink.

The initial template of my function is in the scalarProd.tm file:

double scalarProd P(( double*, long, double*, long));

:Begin:
:Function:      scalarProd
:Pattern:       ScalarProd[ x:{___Real}, y:{___Real} ]
:Arguments:     { x, y }
:ArgumentTypes: { RealList,  RealList}
:ReturnType:    Real
:End:

:Evaluate: ScalarProd::usage = "ScalarProd[x, y] gives the scalar product of two real lists {x1,x2,...,xn} and {y1,y2,...yn}."

which describes the ScalarProd[] function in Mathematica, and links it to the scalarProd() C method, which is where we receive the two arrays from Mathematica and use CUDA to calculate their scalar product and send the result back. This and the main() function for Linux and Mac, which is what I was using, are in the scalarProd.cu file. Note that Windows has a slightly different main() method.

#include 
#include 
#include "mathlink.h"
#include "cuda_runtime.h"
int main(int argc, char* argv[])
{
  int deviceCount;
  cudaGetDeviceCount(&deviceCount);
  if (deviceCount == 0) {
    fprintf(stderr, "cutil error: no devices supporting CUDA.n");
    exit(EXIT_FAILURE);
  }
  int dev = 0;
  cudaDeviceProp deviceProp;
  cudaGetDeviceProperties(&deviceProp, dev);
  if (deviceProp.major < 1) {
    fprintf(stderr, "cutil error: device does not support CUDA.n");
    exit(EXIT_FAILURE);
  }
  fprintf(stderr, "Using device %d: %sn", dev, deviceProp.name);
  cudaSetDevice(dev);
 
  return MLMain(argc, argv);
}

and in the same scalarProd.cu we now include the scalarProd_kernel.cu kernel from CUDA’s SDK together with our scalarProd() C function:

#include "scalarProd_kernel.cu"
double scalarProd( double* x, long lenx, double *y, long leny)
{
  float *d_A, *d_B, *d_C;
  float *h_A, *h_B;
  float *h_C_GPU;
  int i;
 
  printf("Initializing data...n");
  h_A     = (float *)malloc(sizeof(float)*lenx);
  h_B     = (float *)malloc(sizeof(float)*leny);
 
  //convert double to float
  for(i = 0; i < lenx; i++){
    h_A[i] = (float)x[i];
    h_B[i] = (float)y[i];
  }
  h_C_GPU = (float *)malloc(sizeof(float));
 
  printf("...allocating GPU memory.n");
  cudaMalloc((void **)&d_A, sizeof(float)*lenx);
  cudaMalloc((void **)&d_B, sizeof(float)*leny);
  cudaMalloc((void **)&d_C, sizeof(float));
  printf("...copying input data to GPU mem.n");
  cudaMemcpy(d_A, h_A, sizeof(float)*lenx, cudaMemcpyHostToDevice);
  cudaMemcpy(d_B, h_B, sizeof(float)*leny, cudaMemcpyHostToDevice);
  printf("Data init done.n");
 
  printf("Executing GPU kernel...n");
  cudaThreadSynchronize();
  scalarProdGPU<<<128, 256>>>(d_C, d_A, d_B, 1, lenx);
  cudaThreadSynchronize();
 
  printf("Reading back GPU result...n");
  cudaMemcpy(h_C_GPU, d_C, sizeof(float), cudaMemcpyDeviceToHost);
 
  cudaFree(d_C);
  cudaFree(d_B);
  cudaFree(d_A);
  free(h_B);
  free(h_A);
 
  return h_C_GPU[0];
}

Now we are ready to run Mathematica’s mprep pre-processor from MathLink to generate a scalarProdtm.cu file, and on this we run CUDA’s compiler nvcc and compile everything with the appropriate CUDA and MathLink libraries to generate our scalarProd binary, which we can now call from within Mathematica:

In[1]:= Install["scalarProd"]
Out[1]= LinkObject[./scalarProd,8,7]

In[2]:= ScalarProd[{2.1,3.2},{4.3,5.4}]
Out[2]= 26.31

In[3]:= UnInstall[%1]
Out[3]= UnInstall[LinkObject[./scalarProd,8,7]]