Archive for the 'Customers' Category

SuperComputing 2008 Conference

Five of us from Plura attended the SuperComputing 2008 conference in Austin.  Although we did not set up a booth, we still had a great week and talked to lots of potential customers and vendors about Plura and our coming infrastructure needs.  To our new friends, we hope you enjoyed the show as much as we did and had a safe trip home.

I’ve attended several SuperComputing conferences in the past and it is interesting to see how the show has evolved.  Two years ago, GPUs and FPGAs were exploratory.  This year, they are everywhere.  Many people are predicting that the current GPU focus is temporary and will last only until Intel and AMD start embedding many smaller cores on the chip (see Intel Larrabee for Intel’s version).

GigaOm makes a good point that SuperComputing is more and more about software.  Everyone knows how to build large clusters these days.  The software and infrastructure to tie them together to solve meaningful problems is the critical piece.  We’re working hard to solve real problems that are difficult on traditional infrastructures.

Also, one quick brag.  The new Top 500 list of fastest SuperComputers was released during the show.  Even in our limited private beta state, our current Plura capacity would place us in the top 10 SuperComputers in the world based on GFLOPs.  We’re pretty proud of this given that our private beta started less than a month ago.

Example applications using Plura

Plura is a unique blend of processing power and bandwidth.  It is not quite a cluster, and it has features that distinguish it from grid computing projects like Folding@home.

I’d like to give some examples of problems that are being solved with Plura today.  I’ll post more details about these in the future:

Stock Market Simulations – Quant R&D is using Plura to analyze the stock market using an all-vs-all data strategy.  Quant’s problem boils down to wanting to run complex simulations on pairs of stocks from the entire stock market.  Each simulation is relatively expensive, but it can be broken up into pieces.  Quant uses 1-minute stock market data and one stock’s data over the time period they analyze is about 1MB compressed.  To stay within Plura node memory limitations and to minimize bandwidth needs, they want to reuse the stocks as much as possible.  So, they use Plura data groups to ensure Plura nodes get to reuse the stock data they have already downloaded as often as possible.  This means that a typical Plura node will download two 1MB pieces of data and will then proceed to work on WUs (work units) using those two pieces of data for quite some time (probably longer than the life of the node).
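To make the all-vs-all idea concrete, here is a minimal Python sketch of how work units might be grouped by stock pair so that a node only ever needs the two data files for its pair.  This is an illustration under assumptions, not Plura's or Quant's actual code: the function name build_work_units, the dictionary layout, and the ticker symbols are all hypothetical.

```python
from itertools import combinations


def build_work_units(tickers, pieces_per_pair):
    """Group work units by the stock pair whose data they need.

    Hypothetical layout: each key plays the role of a "data group", and
    every work unit under that key needs only that pair's two data files.
    """
    groups = {}
    # combinations() yields each unordered pair exactly once (all-vs-all).
    for pair in combinations(sorted(tickers), 2):
        groups[pair] = [
            {"pair": pair, "piece": i, "of": pieces_per_pair}
            for i in range(pieces_per_pair)
        ]
    return groups


groups = build_work_units(["AAA", "BBB", "CCC", "DDD"], pieces_per_pair=3)
total_wus = sum(len(wus) for wus in groups.values())
print(len(groups), "pairs,", total_wus, "work units")
```

With n stocks there are n*(n-1)/2 pairs, so the number of groups grows quadratically while each node's download stays at two files.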

Custom Web Crawling – 80legs is using Plura to do distributed web crawling.  Rather than having data centers with very fat pipes, they use a portion of the bandwidth of the Plura nodes to crawl the web.  In order to improve the success ratio for each work unit, 80legs sends out Plura WUs with very few URLs to crawl.  The economics of this are dramatically better than paying for data center bandwidth.

Prime Number Search – As a sample application, Plura has created a distributed factoring engine that is doing pre-factoring for large Mersenne prime candidates.  We are using this application to demonstrate a different data model for Plura.  This model has no specific data, but each WU contains a specific amount of factoring to do.  In the coming weeks or months, we will release yet another prime number application that implements the Lucas-Lehmer test for Mersenne primes.  This will show yet another data model for Plura, so stay tuned for more information.
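For readers unfamiliar with the math behind the prime number example, the following Python sketch shows the two standard techniques involved.  It is not Plura's implementation.  It assumes that "pre-factoring" means ordinary trial factoring of a Mersenne number 2^p - 1, and that a work unit is simply a range of k values to test; the function names and the work-unit shape are hypothetical.  The sketch relies on the known fact that any factor q of 2^p - 1 (p an odd prime) has the form q = 2kp + 1 and leaves remainder 1 or 7 when divided by 8.

```python
def trial_factor_range(p, k_start, k_end):
    """Return divisors q = 2*k*p + 1 of 2**p - 1 for k in [k_start, k_end).

    One call stands in for one work unit: no input data, just a fixed
    amount of factoring to do.
    """
    found = []
    for k in range(k_start, k_end):
        q = 2 * k * p + 1
        if q % 8 not in (1, 7):
            continue  # factors of Mersenne numbers are 1 or 7 mod 8
        if pow(2, p, q) == 1:  # q divides 2**p - 1
            found.append(q)
    return found


def lucas_lehmer(p):
    """Lucas-Lehmer test: True if 2**p - 1 is prime, for an odd prime p."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


print(trial_factor_range(11, 1, 5))  # 2**11 - 1 = 2047 = 23 * 89
print(lucas_lehmer(7), lucas_lehmer(11))
```

Trial factoring splits naturally into independent ranges, while the Lucas-Lehmer loop is inherently sequential for a given exponent, which is one reason the two applications would need different data models.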

We have other customers that are in various stages of evaluating Plura for a variety of applications.  We’ll release details on these if and when they agree to do so.  In the meantime, if you want to explore the possibility of running any particular algorithm on Plura, feel free to contact us.

Comparing Plura to Amazon’s EC2 for High Performance Computing

Amazon has recently come out of beta with their EC2 service, as noted by several sources.

EC2 has been extremely successful since its launch.

I’d like to do a quick comparison between Plura and EC2 for high performance computing.  (Note: All EC2 information was taken from Amazon’s website.)

Compute Performance

EC2 High-CPU Medium Instance nodes provide 1.7 GB of memory, 350 GB of instance storage and local network access between EC2 nodes at no charge.  Plura nodes vary in memory and size and we never use the disk.  Plura applications can request nodes with a minimum memory size if necessary.  Also, Plura nodes are not connected to each other.  Each node knows only about its own tasks and can’t share data with others.  This makes Plura best suited to what are called embarrassingly parallel HPC applications.  That said, we have techniques for making some very difficult algorithms embarrassingly parallel.

Plura has a significant advantage over EC2 in terms of the amount of available computing power.  Right now, the maximum number of nodes a customer can have on EC2 is about 1,000 to 3,000.  On Plura, customers have access to the entire node pool, which is currently over 50,000 nodes.  This gives Plura users significantly more compute power.

Cost

Each EC2 High-CPU Medium instance costs $0.20/hour.  This is equivalent to paying $1752/year for a 5-6 GHz CPU.  Plura charges approximately $100/year for a 2 GHz CPU (the average speed of nodes on our network).

The “conversion factor” between Plura nodes and EC2 nodes would be 2.5 Plura nodes = 1 EC2 High-CPU Medium Instance.  So Plura charges $250/year for the equivalent of 1 EC2 node, while Amazon charges $1752/year.
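The arithmetic behind these figures can be checked with a few lines of Python.  This is only a worked calculation using the numbers quoted above; it assumes an instance running 24 hours a day for a 365-day year and takes 5 GHz as the EC2 equivalent, and the variable names are mine.

```python
HOURS_PER_YEAR = 24 * 365  # assumes the instance runs all year

ec2_hourly = 0.20          # $/hour, EC2 High-CPU Medium
plura_node_yearly = 100    # $/year for an average Plura node
plura_node_ghz = 2.0       # average Plura node speed
ec2_equiv_ghz = 5.0        # assumed EC2 equivalent (low end of 5-6 GHz)

ec2_yearly = ec2_hourly * HOURS_PER_YEAR
nodes_per_ec2 = ec2_equiv_ghz / plura_node_ghz
plura_equiv_yearly = nodes_per_ec2 * plura_node_yearly
ratio = ec2_yearly / plura_equiv_yearly

print(f"EC2: ${ec2_yearly:,.0f}/year")
print(f"Plura equivalent: {nodes_per_ec2} nodes, ${plura_equiv_yearly:,.0f}/year")
print(f"EC2 costs about {ratio:.1f}x as much")
```

Using 6 GHz instead of 5 GHz for the EC2 equivalent would raise the Plura figure to 3 nodes and $300/year, so the ratio depends on which end of the 5-6 GHz range you pick.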

Side note: If you need to use EC2 nodes with Internet bandwidth, the cost goes up by $1000s/year for each node.  For now, Plura builds this in for free.

Conclusion

For HPC applications that need a lot of inter-node communication, EC2 will probably suit your needs better even at a higher cost.  However, if your application is suitable for Plura, you can cut your compute costs by about 7X.  If you need the equivalent of 1000 5 GHz nodes for a year, Plura will save you over $1.5 million ($250K vs $1.75M).


