<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: CUDA 4.0 MultiGPU on an Amazon EC2 instance</title>
	<atom:link href="http://hpc.nomad-labs.com/archives/65/feed" rel="self" type="application/rss+xml" />
	<link>http://hpc.nomad-labs.com/archives/65</link>
	<description>at nomad-labs</description>
	<lastBuildDate>Wed, 16 Nov 2011 02:19:25 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: kashif</title>
		<link>http://hpc.nomad-labs.com/archives/65/comment-page-1#comment-469</link>
		<dc:creator>kashif</dc:creator>
		<pubDate>Wed, 16 Nov 2011 02:19:25 +0000</pubDate>
		<guid isPermaLink="false">http://hpc.nomad-labs.com/?p=65#comment-469</guid>
		<description>Nice! Glad you got it working. I&#039;m working on a GPU cluster tutorial and will post it soon.</description>
		<content:encoded><![CDATA[<p>Nice! Glad you got it working. I&#8217;m working on a GPU cluster tutorial and will post it soon.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael Sorhaindo</title>
		<link>http://hpc.nomad-labs.com/archives/65/comment-page-1#comment-468</link>
		<dc:creator>Michael Sorhaindo</dc:creator>
		<pubDate>Wed, 16 Nov 2011 02:00:05 +0000</pubDate>
		<guid isPermaLink="false">http://hpc.nomad-labs.com/?p=65#comment-468</guid>
		<description>I even got the examples built and running. There was no simpleMultiGPU example in the SDK I built, but another similar program proved that multiple GPUs worked. Anyway, I&#039;m off to bed; enough spamming your blog. Thanks and goodnight.</description>
		<content:encoded><![CDATA[<p>I even got the examples built and running. There was no simpleMultiGPU example in the SDK I built, but another similar program proved that multiple GPUs worked. Anyway, I&#8217;m off to bed; enough spamming your blog. Thanks and goodnight.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael Sorhaindo</title>
		<link>http://hpc.nomad-labs.com/archives/65/comment-page-1#comment-467</link>
		<dc:creator>Michael Sorhaindo</dc:creator>
		<pubDate>Wed, 16 Nov 2011 01:47:08 +0000</pubDate>
		<guid isPermaLink="false">http://hpc.nomad-labs.com/?p=65#comment-467</guid>
		<description>I couldn&#039;t sleep, so I logged in to Amazon and


==============NVSMI LOG==============

Timestamp                       : Tue Nov 15 17:45:08 2011

Driver Version                  : 270.41.19

Attached GPUs                   : 2

GPU 0:0:3
    Product Name                : Tesla M2050
    Display Mode                : Disabled
    Persistence Mode            : Disabled
    Driver Model
        Current                 : N/A
        Pending                 : N/A
    Serial Number               : 0322510018511

...etc...

Thanks Kashif, awesome article!!</description>
		<content:encoded><![CDATA[<p>I couldn&#8217;t sleep, so I logged in to Amazon and</p>
<p>==============NVSMI LOG==============</p>
<p>Timestamp                       : Tue Nov 15 17:45:08 2011</p>
<p>Driver Version                  : 270.41.19</p>
<p>Attached GPUs                   : 2</p>
<p>GPU 0:0:3<br />
    Product Name                : Tesla M2050<br />
    Display Mode                : Disabled<br />
    Persistence Mode            : Disabled<br />
    Driver Model<br />
        Current                 : N/A<br />
        Pending                 : N/A<br />
    Serial Number               : 0322510018511</p>
<p>&#8230;etc&#8230;</p>
<p>Thanks Kashif, awesome article!!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael Sorhaindo</title>
		<link>http://hpc.nomad-labs.com/archives/65/comment-page-1#comment-466</link>
		<dc:creator>Michael Sorhaindo</dc:creator>
		<pubDate>Wed, 16 Nov 2011 00:51:37 +0000</pubDate>
		<guid isPermaLink="false">http://hpc.nomad-labs.com/?p=65#comment-466</guid>
		<description>Thanks Kashif! I&#039;m going to try this out tomorrow!</description>
		<content:encoded><![CDATA[<p>Thanks Kashif! I&#8217;m going to try this out tomorrow!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Robert Oschler</title>
		<link>http://hpc.nomad-labs.com/archives/65/comment-page-1#comment-427</link>
		<dc:creator>Robert Oschler</dc:creator>
		<pubDate>Wed, 29 Jun 2011 07:44:55 +0000</pubDate>
		<guid isPermaLink="false">http://hpc.nomad-labs.com/?p=65#comment-427</guid>
		<description>Thanks kashif.  I&#039;ll be watching your blog to see if you do a PyCUDA post. :)</description>
		<content:encoded><![CDATA[<p>Thanks kashif.  I&#8217;ll be watching your blog to see if you do a PyCUDA post. :)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: kashif</title>
		<link>http://hpc.nomad-labs.com/archives/65/comment-page-1#comment-426</link>
		<dc:creator>kashif</dc:creator>
		<pubDate>Tue, 28 Jun 2011 20:09:55 +0000</pubDate>
		<guid isPermaLink="false">http://hpc.nomad-labs.com/?p=65#comment-426</guid>
		<description>Yes, PyCUDA (or PyOpenCL, for that matter) should work quite well too; it&#039;s just a matter of compiling it on the instances.</description>
		<content:encoded><![CDATA[<p>Yes, PyCUDA (or PyOpenCL, for that matter) should work quite well too; it&#8217;s just a matter of compiling it on the instances.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Robert Oschler</title>
		<link>http://hpc.nomad-labs.com/archives/65/comment-page-1#comment-425</link>
		<dc:creator>Robert Oschler</dc:creator>
		<pubDate>Tue, 28 Jun 2011 00:59:27 +0000</pubDate>
		<guid isPermaLink="false">http://hpc.nomad-labs.com/?p=65#comment-425</guid>
		<description>Thanks Kashif, you answered all my questions.  Nearly 1,000 cores per box, that&#039;s really fantastic.

Has anybody tried using PyCUDA  in this scenario?  If not, do you think it would work?</description>
		<content:encoded><![CDATA[<p>Thanks Kashif, you answered all my questions.  Nearly 1,000 cores per box, that&#8217;s really fantastic.</p>
<p>Has anybody tried using PyCUDA  in this scenario?  If not, do you think it would work?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: kashif</title>
		<link>http://hpc.nomad-labs.com/archives/65/comment-page-1#comment-422</link>
		<dc:creator>kashif</dc:creator>
		<pubDate>Fri, 24 Jun 2011 11:55:40 +0000</pubDate>
		<guid isPermaLink="false">http://hpc.nomad-labs.com/?p=65#comment-422</guid>
		<description>Thanks Robert. To answer your questions:

1. This single node cost me $2.10 per hour. For an 8-node GPU cluster, I suspect the price will be more than $16.80 per hour (8 &#215; $2.10), not counting the price of storage, network access, etc. Check here: &lt;a href=&quot;http://aws.amazon.com/ec2/pricing/&quot; rel=&quot;nofollow&quot;&gt;Amazon EC2 Pricing&lt;/a&gt;.

2. I&#039;m not sure I understand the question. The example above uses all 448*2 cores of the two GPUs on the instance. It&#039;s the programmer&#039;s job to keep these cores &quot;fed&quot; by balancing the latency and computational requirements of the algorithm, and all of that is done in the code when we call the CUDA kernel. I suspect the Fermi chips on these boards are the same as one would get on a Tesla or GeForce board, but I will need to check that... It might be that Nvidia sorts the GPUs by the quality of their working cores, so that the lower-quality chips end up on GeForce cards and the higher-quality ones on the Quadro or Tesla lines... not sure though. Is that what you wanted to know?

3. As far as I know, the Azure platform does not have GPU support yet. One could perhaps try to provision a 64-bit Windows HVM AMI (if such an AMI exists) on the Cluster GPU instances on Amazon, install the CUDA drivers etc., and then copy over your CUDA binary and run it... but I think that&#039;s not possible either. Perhaps you can also check out &lt;a href=&quot;http://www.hoopoe-cloud.com/Features.aspx&quot; rel=&quot;nofollow&quot;&gt;Hoopoe&lt;/a&gt;.

Hope this answers your questions.</description>
		<content:encoded><![CDATA[<p>Thanks Robert. To answer your questions:</p>
<p>1. This single node cost me $2.10 per hour. For an 8-node GPU cluster, I suspect the price will be more than $16.80 per hour (8 &#215; $2.10), not counting the price of storage, network access, etc. Check here: <a href="http://aws.amazon.com/ec2/pricing/" rel="nofollow">Amazon EC2 Pricing</a>.</p>
<p>2. I&#8217;m not sure I understand the question. The example above uses all 448*2 cores of the two GPUs on the instance. It&#8217;s the programmer&#8217;s job to keep these cores &#8220;fed&#8221; by balancing the latency and computational requirements of the algorithm, and all of that is done in the code when we call the CUDA kernel. I suspect the Fermi chips on these boards are the same as one would get on a Tesla or GeForce board, but I will need to check that&#8230; It might be that Nvidia sorts the GPUs by the quality of their working cores, so that the lower-quality chips end up on GeForce cards and the higher-quality ones on the Quadro or Tesla lines&#8230; not sure though. Is that what you wanted to know?</p>
<p>3. As far as I know, the Azure platform does not have GPU support yet. One could perhaps try to provision a 64-bit Windows HVM AMI (if such an AMI exists) on the Cluster GPU instances on Amazon, install the CUDA drivers etc., and then copy over your CUDA binary and run it&#8230; but I think that&#8217;s not possible either. Perhaps you can also check out <a href="http://www.hoopoe-cloud.com/Features.aspx" rel="nofollow">Hoopoe</a>.</p>
<p>Hope this answers your questions.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Robert Oschler</title>
		<link>http://hpc.nomad-labs.com/archives/65/comment-page-1#comment-421</link>
		<dc:creator>Robert Oschler</dc:creator>
		<pubDate>Fri, 24 Jun 2011 10:40:55 +0000</pubDate>
		<guid isPermaLink="false">http://hpc.nomad-labs.com/?p=65#comment-421</guid>
		<description>Great article.  What does an 8-node cluster of GPU instances cost to run per hour on EC2, and how does a single GPU core in this example compare, in core count, to a single-chip Fermi board, for example?  Also, do you know of any CUDA cloud providers that provide Windows instances instead of CentOS?</description>
		<content:encoded><![CDATA[<p>Great article.  What does an 8-node cluster of GPU instances cost to run per hour on EC2, and how does a single GPU core in this example compare, in core count, to a single-chip Fermi board, for example?  Also, do you know of any CUDA cloud providers that provide Windows instances instead of CentOS?</p>
]]></content:encoded>
	</item>
</channel>
</rss>

