Computing in the Cloud

Page last modified 00:17, 31 Jan 2010 by mndoci

Here is a list of scientific, parallel, and distributed computing frameworks and platforms for Amazon EC2. Because you have root access on EC2 instances, you can deploy and run most distributed/parallel computing frameworks there. The list below covers those that are either designed exclusively for EC2 (usually via some abstraction) or have built-in EC2 support.

Non-commercial

  • cloud-crowd: CloudCrowd is intended to make distributed processing easy for Ruby programmers.
  • ec2cluster: Rails REST web service and dashboard UI for launching MPI clusters on Amazon EC2 and running user-submitted jobs.
  • StarCluster: StarCluster is a utility for creating and managing general-purpose computing clusters hosted on Amazon's Elastic Compute Cloud (EC2). StarCluster minimizes the administrative overhead associated with obtaining, configuring, and managing a traditional computing cluster used in research labs or for general distributed computing applications.
  • ec2mpi: A command-line interface for managing MPI clusters on Amazon EC2.
  • Apache Hadoop: The open-source Hadoop distribution has native support for Amazon EC2.
  • Crane: A Clojure library that sits on top of Cascading and Hadoop. Also drives the statistical learning part of Incanter.
  • Disco: Disco is an open-source implementation of the Map-Reduce framework for distributed computing. Like the original framework, Disco supports parallel computations over large data sets on an unreliable cluster of computers.

Commercial

  • Amazon Elastic MapReduce: Amazon's service to manage and run Hadoop clusters. Also supports Pig, Hive, and Cascading.
  • CycleCloud: CycleCloud takes the delays, configuration, administration, and sunk hardware costs out of grid computing, allowing you to focus on running your jobs. Condor-based.
  • UniCloud: With UniCloud, companies can form an elastic compute infrastructure or cloud environment that unifies provisioning, configuration, and virtualization management with application configuration into a single RESTful web-services-based framework. SGE-based.
  • Sun Grid Engine: Sun Grid Engine software is the world's leading, and most widely deployed, distributed resource manager. SGE 6.2U5 supports multiple EC2 AMIs and also has native Hadoop support.
  • Cloudera Hadoop AMI: Cloudera's Hadoop distribution for Amazon EC2.
  • RightGrid: The Grid Edition lets you control and manage any background or batch processing worker tasks in a scalable, fault-tolerant, and audited environment. It is ideal for processing numerous datasets.

I also maintain a list of Life Science Apps on EC2

My own notes on using and configuring EC2 and other AWS resources can be found here

If you want to help me maintain this page, just let me know: firstname@firstnamelastname.net
