<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>high performance computing</title>
	<atom:link href="http://hpc.nomad-labs.com/feed" rel="self" type="application/rss+xml" />
	<link>http://hpc.nomad-labs.com</link>
	<description>at nomad-labs</description>
	<lastBuildDate>Tue, 06 Dec 2011 10:59:47 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Slides to my CUDA Deep Dive talk</title>
		<link>http://hpc.nomad-labs.com/archives/133</link>
		<comments>http://hpc.nomad-labs.com/archives/133#comments</comments>
		<pubDate>Wed, 16 Nov 2011 01:10:24 +0000</pubDate>
		<dc:creator>kashif</dc:creator>
				<category><![CDATA[cuda]]></category>
		<category><![CDATA[gpgpu]]></category>
		<category><![CDATA[gpu]]></category>
		<category><![CDATA[nvidia]]></category>
		<category><![CDATA[opengl]]></category>

		<guid isPermaLink="false">http://hpc.nomad-labs.com/?p=133</guid>
		<description><![CDATA[The slides from my CUDA Deep Dive talk are here: CUDA Deep Dive View more presentations from krasul]]></description>
			<content:encoded><![CDATA[<p>The slides from my CUDA Deep Dive talk are here:</p>
<p><strong style="display: block; margin: 12px 0 4px;"><a title="CUDA Deep Dive" href="http://www.slideshare.net/krasul/cuda-deepdive" target="_blank">CUDA Deep Dive</a></strong></p>
<div id="__ss_10084570" style="width: 425px;"><iframe src="http://www.slideshare.net/slideshow/embed_code/10084570" frameborder="0" marginwidth="0" marginheight="0" scrolling="no" width="425" height="355"></iframe>
<div style="padding: 5px 0 12px;">View more <a href="http://www.slideshare.net/" target="_blank">presentations</a> from <a href="http://www.slideshare.net/krasul" target="_blank">krasul</a></div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://hpc.nomad-labs.com/archives/133/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CUDA 4.0 MultiGPU on an Amazon EC2 instance</title>
		<link>http://hpc.nomad-labs.com/archives/65</link>
		<comments>http://hpc.nomad-labs.com/archives/65#comments</comments>
		<pubDate>Thu, 23 Jun 2011 14:30:50 +0000</pubDate>
		<dc:creator>kashif</dc:creator>
				<category><![CDATA[amazon]]></category>
		<category><![CDATA[cuda]]></category>
		<category><![CDATA[ec2]]></category>

		<guid isPermaLink="false">http://hpc.nomad-labs.com/?p=65</guid>
		<description><![CDATA[This post will take you through starting and configuring an Amazon EC2 instance to use the Multi GPU features of CUDA 4.0. Motivation CUDA 4.0 comes with some new exciting features such as: the ability to share GPUs across multiple threads; or use all GPUs in the system concurrently from a single host thread; and unified [...]]]></description>
			<content:encoded><![CDATA[<p>This post will take you through starting and configuring an <a href="http://aws.amazon.com/ec2/">Amazon EC2</a> instance to use the Multi GPU features of <a href="http://developer.nvidia.com/cuda-toolkit-40">CUDA 4.0</a>.</p>
<h2>Motivation</h2>
<p>CUDA 4.0 comes with some new exciting features such as:</p>
<ul>
<li>the ability to share GPUs across multiple threads;</li>
<li>the ability to use all GPUs in the system concurrently from a single host thread;</li>
<li>unified virtual addressing for faster multi-GPU programming;</li>
</ul>
<p>and many more.</p>
<p>The ability to access all the GPUs in a system is particularly nice on Amazon, since the large GPU-enabled instances come with two <a href="http://www.nvidia.com/object/preconfigured-clusters.html">Tesla M2050 Fermi boards</a>, <em>each</em> capable of 1030 GFlops theoretical peak single-precision performance, with 448 cores and 3GB of memory.</p>
<h2>Getting started</h2>
<p>Signing up for Amazon&#8217;s AWS is easy enough with a credit card. Once you are logged in, go to the EC2 tab of your console, which should look something like this:</p>
<div id="attachment_82" class="wp-caption aligncenter" style="width: 610px"><a href="http://hpc.nomad-labs.com/wp-content/uploads/2011/06/ec2-1.png"><img class="size-full wp-image-82" title="EC2 console" src="http://hpc.nomad-labs.com/wp-content/uploads/2011/06/ec2-1.png" alt="The EC2 console page" width="600" height="401" /></a><p class="wp-caption-text">The EC2 console page.</p></div>
<p>Now press the <strong>Launch Instance</strong> button. In the <strong>Community AMIs</strong> tab, set the <strong>Viewing</strong> option to <strong>Amazon Images</strong> and search for <code>gpu</code>. Then <strong>Select</strong> the CentOS 5.5 GPU HVM AMI and press <strong>Continue</strong>:</p>
<div id="attachment_89" class="wp-caption aligncenter" style="width: 610px"><a href="http://hpc.nomad-labs.com/wp-content/uploads/2011/06/ec2-2.png"><img class="size-full wp-image-89" title="Choose an AMI" src="http://hpc.nomad-labs.com/wp-content/uploads/2011/06/ec2-2.png" alt="Choose an AMI" width="600" height="401" /></a><p class="wp-caption-text">Choose the CentOS 5.5 GPU HVM AMI (bottom one).</p></div>
<p>Next we need to select the <strong>Instance Type</strong>. It is important here to select the <strong>Cluster GPU</strong> type before pressing <strong>Continue</strong>:</p>
<div id="attachment_92" class="wp-caption aligncenter" style="width: 610px"><a href="http://hpc.nomad-labs.com/wp-content/uploads/2011/06/ec-3.png"><img class="size-full wp-image-92" title="Instance type" src="http://hpc.nomad-labs.com/wp-content/uploads/2011/06/ec-3.png" alt="Instance type" width="600" height="401" /></a><p class="wp-caption-text">Select the Cluster GPU Instance Type.</p></div>
<p>Next we <strong>Create a New Key Pair</strong> by giving it a name like <code>amazon-gpu</code> and pressing <strong>Create &amp; Download your Key Pair</strong>, which downloads it to your local computer as a file called <code>amazon-gpu.pem</code>:</p>
<div id="attachment_95" class="wp-caption aligncenter" style="width: 610px"><a href="http://hpc.nomad-labs.com/wp-content/uploads/2011/06/ec2-4.png"><img class="size-full wp-image-95" title="Create Key Pair" src="http://hpc.nomad-labs.com/wp-content/uploads/2011/06/ec2-4.png" alt="Create Key Pair" width="600" height="401" /></a><p class="wp-caption-text">Create and download Key Pair.</p></div>
<p>We press <strong>Continue</strong> to go to the Firewall setting. Here we <strong>Create a new Security Group</strong>, give it a name and description, and then <strong>Create a new rule</strong> for <strong>ssh</strong> so that we can log into our instance once it is up and running. Then we press <strong>Continue</strong>:</p>
<div id="attachment_98" class="wp-caption aligncenter" style="width: 610px"><a href="http://hpc.nomad-labs.com/wp-content/uploads/2011/06/ec2-5.png"><img class="size-full wp-image-98" title="Security Group" src="http://hpc.nomad-labs.com/wp-content/uploads/2011/06/ec2-5.png" alt="Security Group" width="600" height="401" /></a><p class="wp-caption-text">Create a new Security Group and a new ssh rule.</p></div>
<p>And finally we can review our settings and <strong>Launch</strong> it:</p>
<div id="attachment_102" class="wp-caption aligncenter" style="width: 610px"><a href="http://hpc.nomad-labs.com/wp-content/uploads/2011/06/ec2-6.png"><img class="size-full wp-image-102" title="Review and Launch" src="http://hpc.nomad-labs.com/wp-content/uploads/2011/06/ec2-6.png" alt="Review and Launch" width="600" height="401" /></a><p class="wp-caption-text">Review and Launch instance.</p></div>
<p>Back in our EC2 console we can go to our <strong>Instances</strong> and see the <strong>Status</strong> of our new instance. It should be <strong>booting</strong> or <strong>running</strong>, rather than <strong>stopped</strong> as in the case below:</p>
<div id="attachment_104" class="wp-caption aligncenter" style="width: 610px"><a href="http://hpc.nomad-labs.com/wp-content/uploads/2011/06/ec2-7.png"><img class="size-full wp-image-104" title="AMI Instance" src="http://hpc.nomad-labs.com/wp-content/uploads/2011/06/ec2-7.png" alt="AMI Instance" width="600" height="401" /></a><p class="wp-caption-text">AMI Instance&#39;s Status and Description.</p></div>
<p>The <strong>Description</strong> tab will also contain the <strong>Public DNS</strong> which we can use together with the Key Pair we downloaded locally to ssh into our instance:</p>
<p><code>$ chmod 400 amazon-gpu.pem<br />
$ ssh root@ec2-50-16-170-159.compute-1.amazonaws.com -i amazon-gpu.pem</code></p>
<p><code>__|  __|_  ) CentOS<br />
_|  (     /    v5.5<br />
___|\___|___| HVMx64 GPU</code></p>
<p><code>Welcome to an EC2 Public Image<br />
Please view /root/README<br />
:-)</code></p>
<p><code>[root@ip-10-16-7-119 ~]#<br />
</code></p>
<h2>Updating CUDA to 4.0</h2>
<p>Now we need to update the CUDA driver and toolkit on our instance. The first step is to update the Linux kernel and then reboot the instance via the web console:</p>
<p><code>[root@ip-10-16-7-119 ~]# yum update kernel kernel-devel kernel-headers<br />
Loaded plugins: fastestmirror<br />
Determining fastest mirrors<br />
* addons: mirror.cogentco.com<br />
* base: mirror.umoss.org<br />
* extras: mirror.symnds.com<br />
* updates: mirror.umoss.org<br />
addons                                                   |  951 B     00:00<br />
base                                                     | 2.1 kB     00:00<br />
base/primary_db                                          | 2.2 MB     00:00<br />
extras                                                   | 2.1 kB     00:00<br />
extras/primary_db                                        | 260 kB     00:00<br />
updates                                                  | 1.9 kB     00:00<br />
updates/primary_db                                       | 635 kB     00:00<br />
Setting up Update Process<br />
Resolving Dependencies<br />
--&gt; Running transaction check<br />
---&gt; Package kernel.x86_64 0:2.6.18-238.12.1.el5 set to be installed<br />
---&gt; Package kernel-devel.x86_64 0:2.6.18-238.12.1.el5 set to be installed<br />
---&gt; Package kernel-headers.x86_64 0:2.6.18-238.12.1.el5 set to be updated<br />
--&gt; Finished Dependency Resolution</code></p>
<p><code>Dependencies Resolved</code></p>
<p><code>================================================================================<br />
Package             Arch        Version                     Repository    Size<br />
================================================================================<br />
Installing:<br />
kernel              x86_64      2.6.18-238.12.1.el5         updates       19 M<br />
kernel-devel        x86_64      2.6.18-238.12.1.el5         updates      5.5 M<br />
Updating:<br />
kernel-headers      x86_64      2.6.18-238.12.1.el5         updates      1.2 M</code></p>
<p><code>Transaction Summary<br />
================================================================================<br />
Install       2 Package(s)<br />
Upgrade       1 Package(s)</p>
<p>Total download size: 26 M<br />
Is this ok [y/N]: y<br />
Downloading Packages:<br />
(1/3): kernel-headers-2.6.18-238.12.1.el5.x86_64.rpm     | 1.2 MB     00:00<br />
(2/3): kernel-devel-2.6.18-238.12.1.el5.x86_64.rpm       | 5.5 MB     00:00<br />
(3/3): kernel-2.6.18-238.12.1.el5.x86_64.rpm             |  19 MB     00:00<br />
--------------------------------------------------------------------------------<br />
Total                                            18 MB/s |  26 MB     00:01<br />
Running rpm_check_debug<br />
Running Transaction Test<br />
Finished Transaction Test<br />
Transaction Test Succeeded<br />
Running Transaction<br />
Installing     : kernel-devel                                             1/4<br />
Installing     : kernel                                                   2/4<br />
Updating       : kernel-headers                                           3/4<br />
Cleanup        : kernel-headers                                           4/4</p>
<p>Installed:<br />
kernel.x86_64 0:2.6.18-238.12.1.el5 kernel-devel.x86_64 0:2.6.18-238.12.1.el5</p>
<p>Updated:<br />
kernel-headers.x86_64 0:2.6.18-238.12.1.el5</p>
<p></code></p>
<p><code>Complete!<br />
</code></p>
<p>I leave it as an exercise to figure out how to reboot the instance from the console. Once it is back up and running, we can ssh back into it to download and install the CUDA 4.0 drivers, toolkit and SDK. For example:</p>
<p><code>[root@ip-10-16-7-119 ~]# wget http://developer.download.nvidia.com/compute/cuda/4_0/toolkit/cudatoolkit_4.0.17_linux_64_rhel5.5.run<br />
--2011-06-23 04:47:05--  http://developer.download.nvidia.com/compute/cuda/4_0/toolkit/cudatoolkit_4.0.17_linux_64_rhel5.5.run<br />
Resolving developer.download.nvidia.com... 168.143.242.144, 168.143.242.203<br />
Connecting to developer.download.nvidia.com|168.143.242.144|:80... connected.<br />
HTTP request sent, awaiting response... 200 OK<br />
Length: 212338897 (203M) [application/octet-stream]<br />
Saving to: `cudatoolkit_4.0.17_linux_64_rhel5.5.run'</code></p>
<p><code>100%[======================================&gt;] 212,338,897 33.2M/s   in 6.3s</code></p>
<p><code>2011-06-23 04:47:12 (32.0 MB/s) - `cudatoolkit_4.0.17_linux_64_rhel5.5.run' saved [212338897/212338897]</p>
<p></code></p>
<p><code>[root@ip-10-16-7-119 ~]# chmod +x cudatoolkit_4.0.17_linux_64_rhel5.5.run<br />
[root@ip-10-16-7-119 ~]# ./cudatoolkit_4.0.17_linux_64_rhel5.5.run<br />
</code></p>
<p>This installs the CUDA toolkit. Install the drivers and SDK in the same way, and finally check that everything is working by typing:</p>
<p><code>[root@ip-10-16-7-119 ~]# nvidia-smi  -a -q</p>
<p>==============NVSMI LOG==============</p>
<p>Timestamp                       : Thu Jun 23 04:46:42 2011</p>
<p>Driver Version                  : 270.41.19</p>
<p>Attached GPUs                   : 2</p>
<p>GPU 0:0:3<br />
    Product Name                : Tesla M2050<br />
    Display Mode                : Disabled<br />
    Persistence Mode            : Disabled<br />
    Driver Model<br />
    ...<br />
GPU 0:0:4<br />
    ....<br />
</code></p>
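<p>If we want to check the GPU count from a script rather than by eye, a minimal sketch is to pull the <code>Attached GPUs</code> line out of the log. The helper name <code>attached_gpus</code> is made up for illustration, and the sketch assumes the log keeps the <code>Attached GPUs : N</code> layout shown above, which may differ between driver versions:</p>
<pre>
```shell
# Read an "nvidia-smi -a -q" style log on stdin and print the attached GPU count.
# Assumption: the log contains a line of the form "Attached GPUs   : N".
attached_gpus() {
    sed -n 's/^Attached GPUs[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*$/\1/p'
}

# Example using two lines copied from the log above instead of a live GPU.
printf '%s\n' 'Driver Version                  : 270.41.19' \
              'Attached GPUs                   : 2' | attached_gpus
```
</pre>
<p>On the instance itself we would pipe the real log through it, i.e. <code>nvidia-smi -a -q | attached_gpus</code>, and expect <code>2</code> on a Cluster GPU instance.</p>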
<h2>MultiGPU example</h2>
<p>Once CUDA 4.0 is installed and working, we can test out the <code>simpleMultiGPU</code> example that comes with the SDK installed earlier. First we need to install the C++ compiler:</p>
<p><code>[root@ip-10-16-7-119 simpleMultiGPU]# yum install gcc-c++</code></p>
<p>and then we need to set our LD_LIBRARY_PATH to include the CUDA libraries:</p>
<p><code>[root@ip-10-16-7-119 release]# export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/lib</code></p>
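<p>One caveat with the export above: if <code>LD_LIBRARY_PATH</code> was empty beforehand, the new value starts with an empty entry, which as far as I know the glibc loader treats as the current directory. A slightly more defensive sketch follows. The helper name <code>add_lib_dir</code> is hypothetical, and the two directories are simply the ones used above:</p>
<pre>
```shell
# Append a directory to LD_LIBRARY_PATH unless it is already listed.
# add_lib_dir is a made-up helper name, not part of CUDA or CentOS.
add_lib_dir() {
    case ":${LD_LIBRARY_PATH}:" in
        *":$1:"*) ;;  # already present, nothing to do
        *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}$1" ;;
    esac
    export LD_LIBRARY_PATH
}

add_lib_dir /usr/local/cuda/lib64
add_lib_dir /usr/local/cuda/lib
```
</pre>
<p>Because the helper skips directories that are already listed, it can go into a shell start-up file and be sourced repeatedly without the variable growing.</p>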
<p>After that, we can go to the <code>NVIDIA_GPU_Computing_SDK/C/</code> folder and type <code>make</code>. The binaries are placed in the <code>NVIDIA_GPU_Computing_SDK/C/bin/linux/release/</code> directory, and from there we can run the <code>simpleMultiGPU</code> example:</p>
<p><code>[root@ip-10-16-7-119 release]# ./simpleMultiGPU<br />
[simpleMultiGPU] starting...<br />
CUDA-capable device count: 2<br />
Generating input data...</p>
<p>Computing with 2 GPU's...<br />
  GPU Processing time: 24.472000 (ms)</p>
<p>Computing with Host CPU...</p>
<p>Comparing GPU and Host CPU results...<br />
  GPU sum: 16777280.000000<br />
  CPU sum: 16777294.395033<br />
  Relative difference: 8.580068E-07 </p>
<p>[simpleMultiGPU] test results...<br />
PASSED</p>
<p>Press ENTER to exit...<br />
</code></p>
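<p>The GPU and CPU sums differ slightly, presumably because the GPU result is accumulated in single precision while the host reference uses double precision. As a quick sanity check we can recompute the relative difference from the two printed sums. This is only a sketch: <code>rel_diff</code> is a made-up helper, and I am assuming the sample divides the absolute difference by the CPU sum:</p>
<pre>
```shell
# Relative difference between the GPU and CPU sums printed by the run above.
# Assumption: relative difference = |gpu - cpu| / cpu.
rel_diff() {
    awk -v gpu="$1" -v cpu="$2" 'BEGIN {
        d = gpu - cpu
        d = sqrt(d * d)          # absolute value of the difference
        printf "%.6E\n", d / cpu
    }'
}

rel_diff 16777280.000000 16777294.395033
```
</pre>
<p>Under that assumption the result agrees with the <code>Relative difference</code> line in the log above, i.e. an error of a little under one part in a million.</p>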
<h2>MultiGPU Cluster Setup</h2>
<p>Using the above setup and this <a href="http://dicjtockkg63v.cloudfront.net/hpc-video-1.mov">video</a>, it is also possible to configure an 8-node cluster of GPU instances as described <a href="http://aws.amazon.com/ec2/hpc-applications/">here</a> for high performance computing applications. I will try to do a MultiGPU and <a href="http://www.open-mpi.org/">Open MPI</a> example in another blog post, so stay tuned.</p>
]]></content:encoded>
			<wfw:commentRss>http://hpc.nomad-labs.com/archives/65/feed</wfw:commentRss>
		<slash:comments>9</slash:comments>
<enclosure url="http://dicjtockkg63v.cloudfront.net/hpc-video-1.mov" length="56884896" type="video/quicktime" />
		</item>
		<item>
		<title>Slides for 2009 GPU Tech. Conf. talk</title>
		<link>http://hpc.nomad-labs.com/archives/54</link>
		<comments>http://hpc.nomad-labs.com/archives/54#comments</comments>
		<pubDate>Mon, 19 Oct 2009 19:31:39 +0000</pubDate>
		<dc:creator>kashif</dc:creator>
				<category><![CDATA[cuda]]></category>
		<category><![CDATA[gtc]]></category>
		<category><![CDATA[mathematica]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[gpu]]></category>
		<category><![CDATA[nvidia]]></category>

		<guid isPermaLink="false">http://hpc.nomad-labs.com/?p=54</guid>
		<description><![CDATA[Here are the slides for the talk as promised. Note that slideshare is not showing some of the images etc., so you may be better off downloading the PDF from slideshare. Using CUDA Within Mathematica View more documents from krasul.]]></description>
			<content:encoded><![CDATA[<p>Here are the slides for the talk as promised. Note that slideshare is not showing some of the images etc., so you may be better off downloading the PDF from slideshare.</p>
<div id="__ss_2282569" style="width: 425px; text-align: left;"><a style="font: 14px Helvetica,Arial,Sans-serif; display: block; margin: 12px 0 3px 0; text-decoration: underline;" title="Using CUDA Within Mathematica" href="http://www.slideshare.net/krasul/using-cuda-within-mathematica-2282569">Using CUDA Within Mathematica</a><object style="margin: 0px;" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="355" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowScriptAccess" value="always" /><param name="src" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=usingcudawithinmathematicalite-091019141441-phpapp01&amp;stripped_title=using-cuda-within-mathematica-2282569" /><param name="allowfullscreen" value="true" /><embed style="margin: 0px;" type="application/x-shockwave-flash" width="425" height="355" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=usingcudawithinmathematicalite-091019141441-phpapp01&amp;stripped_title=using-cuda-within-mathematica-2282569" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<div style="font-size: 11px; font-family: tahoma,arial; height: 26px; padding-top: 2px;">View more <a style="text-decoration: underline;" href="http://www.slideshare.net/">documents</a> from <a style="text-decoration: underline;" href="http://www.slideshare.net/krasul">krasul</a>.</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://hpc.nomad-labs.com/archives/54/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Speaking at the GPU Technology Conference in San Jose</title>
		<link>http://hpc.nomad-labs.com/archives/49</link>
		<comments>http://hpc.nomad-labs.com/archives/49#comments</comments>
		<pubDate>Thu, 24 Sep 2009 20:26:42 +0000</pubDate>
		<dc:creator>kashif</dc:creator>
				<category><![CDATA[cuda]]></category>
		<category><![CDATA[gtc]]></category>
		<category><![CDATA[mathematica]]></category>

		<guid isPermaLink="false">http://hpc.nomad-labs.com/?p=49</guid>
		<description><![CDATA[We will be speaking at this year&#8217;s GPU Tech. Conf. in San Jose, which goes from Sept. 30 to Oct. 2, about using CUDA within Mathematica. The slides are almost ready and we are just organizing some logistics etc. I thought we might write a bit about the talk in order to get some initial [...]]]></description>
			<content:encoded><![CDATA[<p>We will be speaking at this year&#8217;s <a href="http://www.nvidia.com/object/gpu_technology_conference.html">GPU Tech. Conf.</a> in San Jose, which goes from Sept. 30 to Oct. 2, about <a href="http://developer.download.nvidia.com/compute/cuda/docs/GTC09Materials.htm">using CUDA within Mathematica</a>. The slides are almost ready and we are just organizing some logistics etc. I thought we might write a bit about the talk in order to get some initial feedback on the content.</p>
<p>The talk is divided into three parts. First we introduce the structure of Mathematica, in particular its MathLink API, and go into the basic idea of creating a simple C++ application which we can call from Mathematica. Then we discuss the API in a bit more detail, especially receiving and sending arrays to and from Mathematica. It is here that we also discuss how to receive and send complex numbers, which is handy when doing FFTs, for example. We then briefly discuss running MathLink applications on remote computers, which is especially useful if you share your CUDA-enabled computer with others. Finally we go through some basic error and interruption handling in the MathLink API.</p>
<p>The second part then concentrates on the CUDA aspect of the MathLink application, which is in some sense the whole philosophy of the talk: if we create a CUDA application that can receive data from and send data to Mathematica via the MathLink API, then we are done! In particular we give an overview of a simple example using the <code>mathematica_cuda</code> plugin, which lets you do just this. For a more universal solution, one that also works under Windows, there is the excellent CMake module FindCUDA, together with my FindMathLink module which I wrote about previously. We then finish this part by going through a complete example, FFT via CUFFT, and show how one goes about getting it working in Mathematica.</p>
<p>The last part, time permitting, is where we show some of the work we have been doing with sending computations to the GPU from Mathematica. In particular I will show some of the work I have been doing with image deconvolution of confocal and wide-field images. I am using the GPU to do my deconvolution experiments and using Mathematica to read in the images and analyze the results. Shoaib will present his work on calculating the vegetation index in multi- and hyperspectral satellite images.</p>
<p>I hope you find this overview helpful. We will put the slides up here when the tutorial is over, and if you plan to attend the conference it would be great to see you and get your feedback. Also if there is something specific you would like us to cover, you still have a few days to let us know.</p>
]]></content:encoded>
			<wfw:commentRss>http://hpc.nomad-labs.com/archives/49/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CMake module for Mathematica&#8217;s MathLink API</title>
		<link>http://hpc.nomad-labs.com/archives/37</link>
		<comments>http://hpc.nomad-labs.com/archives/37#comments</comments>
		<pubDate>Sat, 05 Sep 2009 10:39:38 +0000</pubDate>
		<dc:creator>kashif</dc:creator>
				<category><![CDATA[cmake]]></category>
		<category><![CDATA[cuda]]></category>
		<category><![CDATA[github]]></category>
		<category><![CDATA[mathematica]]></category>
		<category><![CDATA[FindCUDA]]></category>
		<category><![CDATA[FindMathLink]]></category>

		<guid isPermaLink="false">http://hpc.nomad-labs.com/?p=37</guid>
		<description><![CDATA[In order to get a more universal solution to my mathematica_cuda plugin, one that works on Windows as well as on Mac and Linux, I decided to use CMake, which comes with the excellent FindCUDA module together with a MathLink module which would offer the same functionality as the current mathematica_cuda plugin, plus more. I [...]]]></description>
			<content:encoded><![CDATA[<p>In order to get a more universal solution for my <a href="http://github.com/kashif/mathematica_cuda/tree/master">mathematica_cuda</a> plugin, one that works on Windows as well as on Mac and Linux, I decided to use <a href="http://www.cmake.org/">CMake</a>, which comes with the excellent FindCUDA module, together with a <a href="http://www.wolfram.com/solutions/mathlink/mathlink.html">MathLink</a> module. The combination would offer the same functionality as the current <code>mathematica_cuda</code> plugin, plus more.</p>
<p>I searched the web to see whether someone else had already written such a module for MathLink, and eventually found Erik Franken, who sent me a version he had modified from one by Jan Woetzel and others:<script src="http://gist.github.com/181152.js"></script></p>
<p>By this time I had a version on <a href="https://github.com/kashif">github</a> which I wrote up. Feel free to download it from <a href="http://github.com/kashif/FindMathLink/tree/master">here</a>.</p>
<p>Recently Markus van Almsick sent me a more advanced version which I will integrate into my version soon.</p>
]]></content:encoded>
			<wfw:commentRss>http://hpc.nomad-labs.com/archives/37/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Mathematica on Twitter</title>
		<link>http://hpc.nomad-labs.com/archives/22</link>
		<comments>http://hpc.nomad-labs.com/archives/22#comments</comments>
		<pubDate>Thu, 30 Apr 2009 23:05:28 +0000</pubDate>
		<dc:creator>kashif</dc:creator>
				<category><![CDATA[mathematica]]></category>
		<category><![CDATA[Twitter]]></category>

		<guid isPermaLink="false">http://hpc.nomad-labs.com/?p=22</guid>
		<description><![CDATA[A great article about Twittering with Mathematica on the Wolfram blog. I had investigated a while ago a Mathematica twitter bot for doing &#8220;Micro-calculations&#8221; with the results from Mathematica being less than 140 chars. Not very useful but a fun bot. Anyways if you are interested, I made a gist for it. Its  in Java [...]]]></description>
			<content:encoded><![CDATA[<p>There is a great article about <a title="Twittering with Mathematica" href="http://blog.wolfram.com/2009/04/30/twittering-with-mathematica/">Twittering with Mathematica</a> on the Wolfram blog. A while ago I had investigated a Mathematica Twitter bot for doing &#8220;Micro-calculations&#8221;, with the results from Mathematica being less than 140 characters. Not very useful, but a fun bot.</p>
<p>Anyway, if you are interested, I made a gist for it. It is written in Java and uses J/Link to communicate with Mathematica. It never ran for long, as I suspect it violated some end user license, but basically one would send a Mathematica command to <a href="http://twitter.com/mathematica">@mathematica</a> and it would tweet you back your result evaluated by the MathKernel. I am hoping Wolfram might create a similar bot themselves for when you need to know the value of a special function quickly <img src='http://hpc.nomad-labs.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p><script src="http://gist.github.com/104743.js"></script></p>
]]></content:encoded>
			<wfw:commentRss>http://hpc.nomad-labs.com/archives/22/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Mathematica Cuda plug-in now on github</title>
		<link>http://hpc.nomad-labs.com/archives/8</link>
		<comments>http://hpc.nomad-labs.com/archives/8#comments</comments>
		<pubDate>Tue, 23 Dec 2008 16:45:04 +0000</pubDate>
		<dc:creator>kashif</dc:creator>
				<category><![CDATA[cuda]]></category>
		<category><![CDATA[github]]></category>
		<category><![CDATA[mathematica]]></category>

		<guid isPermaLink="false">http://hpc.nomad-labs.com/?p=8</guid>
		<description><![CDATA[I have decided to push the initial Mathematica Cuda plug-in to a public repo on github. Feel free to download or fork it. The basic structure of the project follows that of the Nvidia&#8217;s Cuda SDK, in that the individual projects are in their own folder inside the projects folder. Right now I have the [...]]]></description>
			<content:encoded><![CDATA[<p>I have decided to push the initial Mathematica Cuda plug-in to a public <a href="http://github.com/kashif/mathematica_cuda/tree/master">repo</a> on <a href="http://github.com/">github</a>. Feel free to download or <strong>fork it</strong>.</p>
<p>The basic structure of the project follows that of Nvidia&#8217;s Cuda SDK, in that the individual projects are in their own folders inside the projects folder. Right now I have the scalarProd example from Nvidia. I have also included Nvidia&#8217;s cuda utilities <strong>cutils</strong> and extended the make system to handle Mathematica template files.</p>
<p>Currently I have tested it only on 64-bit Linux, but I hope to get it working under Mac and Windows too. I also plan to add more documentation in the project&#8217;s <a href="http://github.com/kashif/mathematica_cuda/wikis">wiki</a> on github, and hopefully get some more useful examples implemented, perhaps FFT.</p>
]]></content:encoded>
			<wfw:commentRss>http://hpc.nomad-labs.com/archives/8/feed</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Mathematica plug-in for CUDA</title>
		<link>http://hpc.nomad-labs.com/archives/3</link>
		<comments>http://hpc.nomad-labs.com/archives/3#comments</comments>
		<pubDate>Wed, 25 Jun 2008 22:15:15 +0000</pubDate>
		<dc:creator>kashif</dc:creator>
				<category><![CDATA[cuda]]></category>
		<category><![CDATA[mathematica]]></category>
		<category><![CDATA[mathlink]]></category>

		<guid isPermaLink="false">http://hpc.nomad-labs.com/?p=3</guid>
		<description><![CDATA[Since there is a Matlab plug-in for CUDA that provides some examples of off-loading computation to the GPU, I thought it might be neat to have something similar for Mathematica. So as a start, I decided to try out a simple scalar product example using MathLink. The initial template of my function is in the [...]]]></description>
			<content:encoded><![CDATA[<p>Since there is a <a title="Matlab" href="http://www.mathworks.com/">Matlab</a> <a title="Matlab plug-in for CUDA" href="http://developer.nvidia.com/object/matlab_cuda.html">plug-in</a> for <a title="CUDA" href="http://www.nvidia.com/object/cuda_home.html">CUDA</a> that provides some examples of off-loading computation to the GPU, I thought it might be neat to have something similar for <a title="Mathematica" href="http://www.wolfram.com/">Mathematica</a>. So as a start, I decided to try out a simple scalar product example using <a title="MathLink" href="http://www.wolfram.com/solutions/mathlink/mathlink.html">MathLink</a>.</p>
<p>The initial template of my function is in the <strong>scalarProd.tm</strong> file:</p>
<p><script src="http://gist.github.com/180970.js"></script></p>
<p>which describes the <strong>ScalarProd[]</strong> function in Mathematica and links it to the <strong>scalarProd()</strong> C function, where we receive the two arrays from Mathematica, use CUDA to calculate their scalar product and send the result back. This and the <strong>main()</strong> function for Linux and Mac, which is what I was using, are in the <strong>scalarProd.cu</strong> file. Note that Windows has a slightly different <strong>main()</strong> function.<br />
<script src="http://gist.github.com/180971.js"></script><br />
and in the same <strong>scalarProd.cu</strong> we now include the <strong>scalarProd_kernel.cu</strong> kernel from CUDA&#8217;s SDK together with our <strong>scalarProd()</strong> C function:</p>
<p><script src="http://gist.github.com/180972.js"></script></p>
<p>Now we are ready to run Mathematica&#8217;s <strong>mprep</strong> pre-processor from MathLink to generate a <strong>scalarProdtm.cu</strong> file. We then run CUDA&#8217;s compiler <strong>nvcc</strong> on it, compiling everything with the appropriate CUDA and MathLink libraries to generate our <strong>scalarProd</strong> binary, which we can now call from within Mathematica:</p>
<p><script src="http://gist.github.com/180974.js"></script></p>
]]></content:encoded>
			<wfw:commentRss>http://hpc.nomad-labs.com/archives/3/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

