qcow2 image format and cluster_size


There are various things you need to consider when creating a virtual disk for a virtual machine, such as the size of the disk, whether the disk is sparse, compression, and encryption.

Many of these things will depend on the type of load placed upon the disk, and the requirements that load has. For example, you can reduce the physical size of the disk if you use compression, and you can increase the security of your data if you use encryption.

Often the low-level details of virtual disks are overlooked and left at their default values, which may or may not be sensible for your scenario. One such detail is the qcow2 virtual disk property cluster_size.

A virtual disk, much like a physical disk under an operating system, is split up into clusters, with each cluster being a predefined size and holding a single unit of data. A cluster is the smallest amount of data that can be read or written in a single operation. An index lookup, often kept in memory, records what information is stored in each cluster and where that cluster is located.

A qcow2 image is copy-on-write (the ‘cow’ in q’cow’2), which means that if a block of data needs to be altered then the whole block is re-written, rather than just the changed data being updated. This means that if you have a block size of 1024 bytes and you change 1 byte, then 1023 bytes have to be read from the original block and then 1024 bytes have to be written – that’s an overhead of 1023 bytes being read and written on top of the 1 byte change you made. That overhead isn’t too bad in the grand scheme of things, but imagine if you had a block size of 1MB and still only changed a single byte!

On the other hand, with much larger writes another problem can be noticed. If you are constantly writing 1MB files and have a cluster size of 1024 bytes then you’ll have to split each 1MB file into 1024 parts and store each part in its own cluster. Each time a cluster is written to, the metadata must be updated to reflect the newly stored data, so again there is a performance penalty in storing data this way. A more efficient way of writing 1MB files would be to have a cluster size of 1MB, so that each file occupies a single cluster with only one metadata reference.

Testing qcow2 cluster_size

The below table shows how cluster_size affects the performance of a qcow2 virtual disk image. The tests are all performed on the same hardware and on a single hard disk that’s on its own dedicated bus with no other traffic. The disk itself is a Samsung SSD. The tests all use the same virtual disk image size of 4GB, provisioned with preallocation set to full; encryption is disabled and lazy_refcounts are off.
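A disk like those tested can be created with qemu-img, overriding cluster_size at creation time; a sketch (the file name is illustrative, not from the original test script):

```shell
# Create a 4GB qcow2 image with a 2M cluster size, fully preallocated
qemu-img create -f qcow2 -o cluster_size=2M,preallocation=full test-2m.qcow2 4G

# Confirm the cluster size that was applied
qemu-img info test-2m.qcow2
```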

Several qcow2 virtual disks have been created with varying cluster_size attributes and a single 134MB file has been written to each disk. The below table shows various statistics and timings resulting from each test.

cluster_size    Time to create disk    Time to write    MB/s
512             1m41.889s              2.69157s         49.9MB/s
1K              49.697s                2.30576s         58.2MB/s
64K             19.085s                1.69912s         79.0MB/s
2M              1.085s                 1.46358s         91.7MB/s

 

The above table covers the smallest cluster_size of 512, the default of 64K (65536) and the largest possible of 2M.

 

As always, test and tune the parameters for your workload.



qcow2 Physical Size With Different preallocation Settings


The qcow2 image format is the de facto image format for KVM/QEMU virtual machines. The format provides various parameters that can be configured when creating the image, each with its benefits and drawbacks.

The below section describes the preallocation attribute and how it can affect the size and performance of a virtual machine.

Please see this blog post for more information on preallocation, and then continue on to the results!

The below tests are all performed on the same hardware and on a single hard disk that’s on its own dedicated bus with no other traffic. The disk itself is a mechanical Western Digital Green 2TB. I’ve used this rather than an SSD so that the results are more dramatic and it’s easier to see how much of a difference IO performance makes. The tests all use the same virtual disk image size of 4GB; encryption is disabled, cluster_size is the default 65536 and lazy_refcounts are off unless otherwise specified.

Virtual Disk Creation Time

The first example shows how long it takes to create each virtual disk image and how much physical disk space is used/reserved for the image.

preallocation setting    Time to create    Physical size on disk
off                      0.312s            196K
metadata                 0.507s            844K
falloc                   0.015s            4.0G
full                     39.402s           4.0G

As you can see, it takes a huge amount of time to use the full preallocation setting, because the filesystem it’s being written to has to allocate the full size of the file and write empty data to it (in our case around 4GB). The least time is taken by falloc, because qemu-img uses the underlying filesystem’s fallocate function to allocate the disk space without having to write data to consume the full size.
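The creation step can be reproduced with a short loop along these lines; this is a sketch, not the author’s exact script (which is linked separately):

```shell
# Time image creation and report the allocated size for each preallocation mode
for mode in off metadata falloc full; do
  time qemu-img create -f qcow2 -o preallocation=$mode disk-$mode.qcow2 4G
  du -h disk-$mode.qcow2   # physical (allocated) size on disk
done
```

Note that `du` reports the allocated size, while `ls -lh` shows the apparent (virtual) size, which is the same for all four images.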

You can download the bash script used for the above test Disk Test preallocation Disk Size Script.

Virtual Disk Performance

The next thing to consider is the performance of each virtual disk type. For this test each virtual disk is mounted and written to using dd. The performance hit here is when the virtual disk has to expand and allocate physical disk space for new data clusters and new metadata, with metadata creation being by far the biggest overhead.
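One way to run such a write test is to attach the image with qemu-nbd and write to it with dd; this is a sketch under those assumptions (the author’s exact method and script are not shown here), and it requires root and the nbd kernel module:

```shell
# Attach the qcow2 image as a block device
modprobe nbd max_part=8
qemu-nbd --connect=/dev/nbd0 disk-off.qcow2

# Sequential write, bypassing the page cache so the image must allocate clusters
dd if=/dev/zero of=/dev/nbd0 bs=1M count=134 oflag=direct

qemu-nbd --disconnect /dev/nbd0
```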

preallocation setting    Time to write    MB/s
off                      184.23s          729kB/s
metadata                 85.87s           1.6MB/s
falloc                   100.77s          1.3MB/s
full                     84.31s           1.6MB/s

You can immediately see that virtual disks with no preallocation take by far the longest to write to, and virtual disks with full preallocation are the quickest. Interestingly, a preallocation value of metadata is a very close second to full, which indicates that much of the performance hit comes from assigning and managing metadata.

You can download the bash script used for the above test Disk Test preallocation Write Performance.

 

 



Upload OVA to Proxmox/KVM


Proxmox does not have native support for an OVA template, which is surprising considering it’s the open format for creating packaged virtual machines, or virtual appliances as they are often called.

We can still get an OVA template running in Proxmox but it will take a little bit of work to transform it into a functional VM.


First off, let’s get the OVA file uploaded to the Proxmox server; you can do this using SCP or the Proxmox web GUI. If you use the Proxmox web GUI you will need to rename the OVA to end in an .iso extension and upload it as ISO image content. Depending on the size of the OVA file and the bandwidth you have available, it may take a while to upload. The file will then be available in the dump folder of the selected storage.

SSH to your Proxmox server and locate the OVA file. An OVA file is simply a tar archive containing an image file and some configuration for things like CPU, RAM, etc. Run the tar command to extract the components of the OVA file onto your filesystem.
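For example, assuming the appliance was uploaded as appliance.ova (the file name is illustrative):

```shell
# Extract the OVF descriptor and VMDK disk image(s) from the OVA archive
tar -xvf appliance.ova
```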

The output will be two or more files – one will be an OVF file which contains the settings and configuration of the virtual machine and one or more files will be VMDKs which are the disk images of the virtual machine.

Although you can run a VMDK file in Proxmox, it’s recommended to use qcow2, which is the default file format for Proxmox virtual machines. Run the VMDK file through qemu-img convert – note this can take a while with large files.
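A typical conversion command (again, file names are illustrative):

```shell
# Convert the extracted VMDK disk image to qcow2
qemu-img convert -f vmdk -O qcow2 appliance-disk1.vmdk appliance-disk1.qcow2
```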

We now need to get the image into a VM with some hardware so that we can begin to use it. This is where things get tricky – the OVF file is not compatible with Proxmox and needs to be interpreted manually. The principle here is that we use the Proxmox web GUI to create a VM and then replace the empty disk image it creates with our recently converted qcow2 image.

You can use vi to open the OVF file and understand some of the basic settings which are required for the VM. Open the OVF file and look for the following XML tags:

  • OperatingSystemSection
  • VirtualHardwareSection
  • Network
  • StorageControllers

You should be able to get a rough idea of the requirements for the KVM. In the Proxmox web GUI, click on Create VM and create a VM which meets the requirements of the image you converted. Make sure that you select qcow2 for the disk format. After clicking Finish an empty VM will be created – in this example I used local storage and VMID 101 so the disk images are stored in /var/lib/vz/images/101.


Copy the previously converted qcow2 image over the existing image – be sure to overwrite the existing image otherwise your image will not be used and KVM will try to start with a blank, empty image.
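Assuming VMID 101 on local storage as above; the exact disk file name varies between Proxmox versions, so list the directory first and copy over whichever image file the empty VM was created with:

```shell
# Find the disk image the new VM was created with, then overwrite it
ls /var/lib/vz/images/101/
cp appliance-disk1.qcow2 /var/lib/vz/images/101/vm-101-disk-1.qcow2
```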

That’s it – you can now start up the VM from the Proxmox web GUI.



Reclaim disk space from a sparse image file (qcow2/vmdk)


Sparse disk image formats such as qcow2 only consume the physical disk space which they need. For example, if a guest is given a qcow2 image with a size of 100GB but has only written 10GB to it, then only 10GB of physical disk space will be used. There is some slight overhead associated with the format, so the above example may not be strictly true, but you get the idea.

Sparse disk image files allow you to over-allocate virtual disk space – this means that you could allocate 5 virtual machines 100GB of disk space each, even if you only have 300GB of physical disk space. If all the guests need 100% of their 100GB disk space then you will have a problem. If you over-allocate disk space you will need to monitor the physical disk usage very carefully.

There is another problem with sparse disk formats: they don’t automatically shrink. Let’s say you fill 100GB of a sparse disk (we know this will consume roughly 100GB of physical disk space) and then delete some files so that you are only using 50GB. The physical disk space used should be 50GB, right? Wrong. Because the disk image doesn’t shrink, it will always occupy 100GB on the filesystem even if the guest is now using less. The below steps detail how to get around this issue.

On Linux

We need to fill the free space on the guest’s disk with zeros (or any other character) so that the disk image can be re-compressed.

In a terminal, run the below command until you run out of disk space. Before running this, be sure to stop any applications running on the guest otherwise errors may result.
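A typical command for this (the file name and location are arbitrary; dd will keep writing until the disk is full and then error out):

```shell
# Fill free space with zeros; this intentionally runs until "No space left on device"
dd if=/dev/zero of=/zerofile bs=1M
```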

Once the command errors out (this may take a while depending on your disk image size and physical disk speed) delete the file.

Shutdown the guest and follow the steps below under All OS’s.

On Windows

You will need to download a tool called sdelete from Microsoft, which will fill the free space of the disk with zeros so that it can be re-compressed later.

Download: http://technet.microsoft.com/en-gb/sysinternals/bb897443.aspx

Once you have downloaded and extracted sdelete, open a command prompt and enter the following. This assumes that sdelete was extracted into c:\ and that c:\ is the disk you would like to reclaim space from.
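A command along these lines (the flag for zeroing free space varies between sdelete versions; in current releases it is -z):

```
c:\sdelete.exe -z c:
```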

Once this completes (this may take a while depending on your disk image size and physical disk speed), shutdown the guest and follow the below steps under All OS’s.

All OS’s

The rest of the process is done on the host so open up a terminal window and SSH to your Proxmox host. Move to the directory where the disk image is stored and run the below commands.

Make sure you have shut down the virtual machine which is using the qcow2 image file before running the below commands.
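The commands are along these lines, assuming the image is called original_image.qcow2 (substitute your own file name):

```shell
# Set the original image aside, then re-compress it back to its original name
mv original_image.qcow2 original_image.qcow2_backup
qemu-img convert -O qcow2 original_image.qcow2_backup original_image.qcow2
```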

The above commands move the original image file, and then re-compress it to its original name. This will shrink the qcow2 image to consume less physical disk space.

You can now start the guest and check that everything is in working order. If it is, you can remove the original_image.qcow2_backup file.