In the recent Amazon AWS Newsletter, they asked the following:
Some customers have asked us about ways to easily convert virtual machines from VMware vSphere, Citrix Xen Server, and Microsoft Hyper-V to Amazon EC2 instances - and vice versa. If this is something that you're interested in, we would like to hear from you. Please send an email to email@example.com describing your needs and use case.
I'll share my reply here for comment!
This is a killer feature that allows a number of important activities.
1. Product VMs. Many suppliers are starting to provide third-party products in the form of VMs instead of software to ease install complexity, or in an attempt to move from a hardware appliance approach to a more-software approach. This pretty much prevents their use in EC2. <cue sad music> As opposed to "Hey, if you can VM-ize your stuff then you're pretty close to being able to offer it as an Amazon AMI or even SaaS offering." <schwing!>
2. Leveraging VM Investments. For any organization that already has a VM infrastructure, it allows for reduction of cost and complexity to be able to manage images in the same way. It also allows for the much promised but under-delivered "cloud bursting" theory where you can run the same systems locally and use Amazon for excess capacity. In the current scheme I could make some AMIs "mostly" like my local VMs - but "close" is not good enough to use in production.
3. Local testing. I'd love to be able to bring my AMIs "down to me" for rapid redeploy. I often find myself having to transfer 2.5 gigs of software up to the cloud, install it, find a problem, have our devs fix it and cut another release, transfer it up again (2 hour wait time again, plus paying $$ for the transfer)...
4. Local troubleshooting. We get an app installed up in the cloud and it's not acting quite right and we need to instrument it somehow to debug. This process is much easier on a local LAN with the developers' PCs with all their stuff installed.
5. Local development. A lot of our development exercises the Amazon APIs. This is one area where Azure has a distinct advantage and can be a threat; in Visual Studio there is a "local Azure fabric" and a dev can write their app and have it running "in Azure" but on their machine, and then when they're ready deploy it up. This is slightly more than VM consumption, it's VMs plus Eucalyptus or similar porting of the Amazon API to the client side, but it's a killer feature.
Xen or VMWare would be fine - frankly this would be big enough for us I'd change virtualization solutions to the one that worked with EC2.
I just asked one of our developers for his take on value for being able to transition between VMs and EC2 to include in this email, and his response is "Well, it's just a no-brainer, right?" Right.
Here's a couple tidbits I've gleaned that are useful.
When you start an "instance-store" Amazon EC2 instance, you get a certain amount of ephemeral storage allocated and mounted automatically. The amount of space varies by instance size and is defined here. The storage location and format also varies by instance size and is defined here.
The upshot is that if you start an "instance-store" small Linux EC2 instance, it automagically has a free 150 GB /mnt disk and a 1 GB swap partition up and runnin' for ya. (mount points vary by image, but that's where they are in the Amazon Fedora starter.)
[root@domU-12-31-39-00-B2-01 ~]# df -k
Filesystem 1K-blocks Used Available Use% Mounted on /dev/sda1 10321208 1636668 8160252 17% / /dev/sda2 153899044 192072 145889348 1% /mnt none 873828 0 873828 0% /dev/shm
[root@domU-12-31-39-00-B2-01 ~]# free total used free shared buffers cached Mem: 1747660 84560 1663100 0 4552 37356 -/+ buffers/cache: 42652 1705008 Swap: 917496 0 917496
But, you say, I am not old or insane! I use EBS-backed images, just as God intended. Well, that's a good point. But when you pull up an EBS image, these ephemeral disk areas are not available to you. The good news is, that's just by default.
The ephemeral storage is still available and can be used (for free!) by an EBS-backed image. You just have to set the block devices up either explicitly when you run the instance or bake them into the image.
You refer to the ephemeral chunks as "ephemeral0", "ephemeral1", etc. - they don't tell you explicitly which is which but basically you just count up based on your instance type (review the doc). For a small image, it has an ephemeral0 (ext3, 15 GB) and an ephemeral1 (swap, 1 GB). To add them to an EBS instance and mount them in the "normal" places, you do:
ec2-run-instances <ami id> -k <your key> --block-device-mapping '/dev/sda2=ephemeral0' --block-device-mapping '/dev/sda3=ephemeral1'
On the instance you have to mount them - add these to /etc/fstab and mount -a or do whatever else it is you like to do:
/dev/sda3 swap swap defaults 0 0 /dev/sda2 /mnt ext3 defaults 0 0
And if you want to turn the swap on immediately, "swapon /dev/sda3".
You can also bake them into an image. Add a fstab like the one above and when you create the image, do it like this, using the exact same --block-device-mapping flag:
ec2-register -n <ami id> -d "AMI Description" --block-device-mapping /dev/sda2=ephemeral0 --block-device-mapping '/dev/sda3=ephemeral1' --snapshot your-snapname --architecture i386 --kernel<aki id> --ramdisk <ari id>
Ta da. Free storage that doesn't persist. Very useful as /tmp space. Opinion is split among the Linuxerati about whether you want swap space nowadays or not; some people say some mix of "if you're using more than 1.8 GB of RAM you're doing it wrong" and "swapping is horrid, just let bad procs die due to lack of memory and fix them." YMMV.
As another helpful tip, let's say you're adding an EBS to an image that you don't want to be persistent when the instance dies. By default, all EBSes are persistent and stick around muddying up your account till you clean them up. If you don't want certain EBS-backed drives to persist, what you do is of the form:
ec2-modify-instance-attribute --block-device-mapping "/dev/sdb=vol-f64c8e9f:true" i-e2a0b08a
Where 'true' means "yes, please, delete me when I'm done." This command throws a stack trace to the tune of
Unexpected error: java.lang.ClassCastException: com.amazon.aes.webservices.client.InstanceBlockDeviceMappingDescription cannot be cast to com.amazon.aes.webservices.client.InstanceBlockDeviceMappingResponseDescription
But it works, that's just a lame API tools bug.
For those of us getting into amazon's Elastic Compute Cloud (ec2), this is a really cool idea. The idea is that your load grows and a new node is ready to handle additional capacity. Once load lessens, boxes are turned off. Integrating this with box stats, response times, monitoring per service makes sense.
I wanted everyone to be thinking of the consumable computing model. Pay as you go for what you use is really attractive. No more do you have to have 10 boxes in your www cluster all day long if your spike is only during 8am to 3pm. Now you can run the 10 boxes during those times and use less boxes during non peak times... Pretty cool. And cheap!