In this presentation (http://channel9.msdn.com/Shows/TechNet+Radio/TechNet-Radio-IT-Time--Part-5-Real-World-Azure--Provisioning-Storage-for-IO-Intensive-Applications-o) the presenter attaches multiple data disks, created in multiple storage accounts, to a single instance. How is this achieved?
I have been unable to find a mechanism to do so in the portal, the Linux command-line tools, or the PowerShell cmdlets.
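For reference, this is roughly what I expected to be able to do; the VM, storage-account, and blob names below are placeholders, and I am assuming (possibly wrongly) that the optional blob URL argument is how you land the new VHD in a second storage account:

```bash
# Attach a new empty 25 GB data disk to an existing VM, placing its VHD in a
# different storage account by giving an explicit blob URL (names are placeholders).
azure vm disk attach-new my-vm 25 \
  "https://secondaccount.blob.core.windows.net/vhds/my-vm-data1.vhd"
```

My understanding is that the PowerShell equivalent would be `Add-AzureDataDisk -CreateNew` with `-MediaLocation` pointing at the other account, but I have not been able to confirm that either of these actually works across accounts.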
The motivation for doing this is IO performance. I have been having difficulty coaxing IO performance out of Windows Azure. At present I am experimenting with a Large instance carrying eight 25 GB data drives in a RAID0 configuration (built with mdadm under Ubuntu 12.04 LTS) on a storage account with geo-replication disabled.
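Roughly how the array was put together (the device list matches the mdadm output further down; the ext4 filesystem and /mnt/raid mount point are just illustrative):

```bash
# Stripe the eight attached data disks (512K chunks are mdadm's default)
# into a single RAID0 device, then format and mount it.
sudo mdadm --create /dev/md0 --level=0 --raid-devices=8 /dev/sd[c-j]
sudo mkfs.ext4 /dev/md0
sudo mkdir -p /mnt/raid
sudo mount /dev/md0 /mnt/raid
```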
While I get very impressive benchmark results for the VM (using bonnie++, exceeding my SSD-equipped MBP), I get very poor real-world performance when attempting to import a MySQL database. For comparison I have a Small EC2 instance with four 12.5 GB EBS volumes in a RAID10 configuration (since EBS is not as highly reliable as Azure Storage); with identical MySQL memory restrictions it imports a sizable database in 12 minutes versus 30 minutes on my Azure instance. Relaxing the restrictions (e.g. giving MySQL 75% of memory, as would be expected on a dedicated MySQL box) lets the Azure instance beat the EC2 instance, but only by roughly 15 seconds. I understand that a data import runs in a single thread and is therefore limited as much by CPU performance as by IO, but I still find this performance troublesome. For reference, I can do the same import in ~6 minutes on my laptop, which has a fraction of the RAM available to it.
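For concreteness, the timed import is just a plain client-side load of the statement file, sketched below with placeholder names and paths; the SET statements are the generic speedups for a single-threaded load rather than anything specific to my setup, and the memory restriction mentioned above roughly corresponds to innodb_buffer_pool_size in my.cnf:

```bash
# Placeholder database name and dump path; credentials omitted.
# Turning off autocommit and key checks for the load cuts per-statement
# overhead during a single-threaded import; COMMIT flushes it at the end.
time mysql mydb <<'SQL'
SET autocommit=0;
SET unique_checks=0;
SET foreign_key_checks=0;
SOURCE /mnt/raid/dump.sql
COMMIT;
SQL
```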
There is nothing else using this storage account heavily. My understanding is that every storage account is rated for at least 5,000 IOPS (maybe more; I am not sure whether the end-of-2012 scalability targets have actually been met, and the documentation is vague) and every drive is capped at 500 IOPS, so eight drives (8 × 500 = 4,000 IOPS) should keep me under the account limit with a comfortable margin. Some of my colleagues disagree, hence my wanting to actually test a machine with multiple disks spread across multiple storage accounts. As an aside, is there a performance difference that depends on drive size? (For comparison, EBS 1 TB volumes outperform EBS 12.5 GB volumes by a fair margin.)
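To see whether a single member disk (rather than the account) is the bottleneck, watching per-device IOPS during the import with iostat (from the sysstat package) should show if any drive sits near the ~500 IOPS ceiling:

```bash
# Extended device stats every 5 seconds for the array members and the md device;
# r/s + w/s hovering around 500 on a member disk would point at the per-disk cap.
iostat -dxm 5 /dev/sd[c-j] /dev/md0
```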
Any assistance and/or guidance is highly appreciated. In my testing, both the .sql data file (a list of statements) and the database itself live on the RAIDed volume. I have attached details of the configuration below.
Windows Azure
/dev/md0:
Version : 1.2
Creation Time : Thu Aug 1 01:01:42 2013
Raid Level : raid0
Array Size : 209711104 (200.00 GiB 214.74 GB)
Raid Devices : 8
Total Devices : 8
Persistence : Superblock is persistent
Update Time : Thu Aug 1 01:01:42 2013
State : clean
Active Devices : 8
Working Devices : 8
Failed Devices : 0
Spare Devices : 0
Chunk Size : 512K
Name : s-zw-dat-004:0 (local to host s-zw-dat-004)
UUID : 359f8647:9c352ec6:2a6ef10f:42571fcb
Events : 0
Number Major Minor RaidDevice State
0 8 32 0 active sync /dev/sdc
1 8 48 1 active sync /dev/sdd
2 8 64 2 active sync /dev/sde
3 8 80 3 active sync /dev/sdf
4 8 96 4 active sync /dev/sdg
5 8 112 5 active sync /dev/sdh
6 8 128 6 active sync /dev/sdi
7 8 144 7 active sync /dev/sdj
| Version 1.96 | Size | Seq Output Per Char (K/sec, %CPU) | Seq Output Block (K/sec, %CPU) | Seq Output Rewrite (K/sec, %CPU) | Seq Input Per Char (K/sec, %CPU) | Seq Input Block (K/sec, %CPU) | Random Seeks (/sec, %CPU) | Num Files | Seq Create (/sec, %CPU) | Seq Read (/sec, %CPU) | Seq Delete (/sec, %CPU) | Random Create (/sec, %CPU) | Random Read (/sec, %CPU) | Random Delete (/sec, %CPU) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Azure | 2000M | 1170 (95%) | 147668 (19%) | 137878 (16%) | 2767 (99%) | 1882100 (99%) | 7700 (105%) | 16 | 5861 (16%) | +++++ (+++) | 19882 (51%) | 13639 (39%) | +++++ (+++) | 19496 (53%) |
| Latency | | 8428us | 17417us | 21752us | 4955us | 145us | 4297us | | 461us | 270us | 179us | 355us | 159us | 133us |
Amazon EC2 Medium 4 25 GB EBS in RAID10 + Spare
/dev/md0:
Version : 1.2
Creation Time : Tue Aug 21 18:15:34 2012
Raid Level : raid10
Array Size : 25148416 (23.98 GiB 25.75 GB)
Used Dev Size : 12574208 (11.99 GiB 12.88 GB)
Raid Devices : 4
Total Devices : 5
Persistence : Superblock is persistent
Update Time : Thu Aug 1 15:13:32 2013
State : clean
Active Devices : 4
Working Devices : 5
Failed Devices : 0
Spare Devices : 1
Layout : near=2
Chunk Size : 512K
Name : db-staging:0 (local to host db-staging)
UUID : 351a523e:be20087f:dfbc589e:0093362e
Events : 71
Number Major Minor RaidDevice State
0 202 80 0 active sync /dev/xvdf
1 202 96 1 active sync /dev/xvdg
2 202 112 2 active sync /dev/xvdh
3 202 128 3 active sync /dev/xvdi
4 202 144 - spare /dev/xvdj
| Version 1.96 | Size | Seq Output Per Char (K/sec, %CPU) | Seq Output Block (K/sec, %CPU) | Seq Output Rewrite (K/sec, %CPU) | Seq Input Per Char (K/sec, %CPU) | Seq Input Block (K/sec, %CPU) | Random Seeks (/sec, %CPU) | Num Files | Seq Create (/sec, %CPU) | Seq Read (/sec, %CPU) | Seq Delete (/sec, %CPU) | Random Create (/sec, %CPU) | Random Read (/sec, %CPU) | Random Delete (/sec, %CPU) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Amazon | 2000M | 396 (97%) | 57836 (8%) | 48226 (9%) | 568 (99%) | 1256279 (99%) | 10902 (128%) | 16 | 15646 (74%) | +++++ (+++) | 22482 (87%) | 16147 (71%) | +++++ (+++) | 19174 (78%) |
| Latency | | 48278us | 37909us | 367ms | 37005us | 24718us | 89032us | | 53326us | 181us | 39763us | 32377us | 195us | 41141us |
Local Machine (Mid-2010 MBP w/ 240 GB M4 SSD, 2.4 GHz Core i5)
| Version 1.96 | Size | Seq Output Per Char (K/sec, %CPU) | Seq Output Block (K/sec, %CPU) | Seq Output Rewrite (K/sec, %CPU) | Seq Input Per Char (K/sec, %CPU) | Seq Input Block (K/sec, %CPU) | Random Seeks (/sec, %CPU) | Num Files | Seq Create (/sec, %CPU) | Seq Read (/sec, %CPU) | Seq Delete (/sec, %CPU) | Random Create (/sec, %CPU) | Random Read (/sec, %CPU) | Random Delete (/sec, %CPU) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Local | 2000M | 592 (99%) | 257922 (36%) | 116145 (15%) | 1382 (99%) | 224145 (17%) | +++++ (+++) | 16 | 7960 (67%) | +++++ (+++) | 16680 (72%) | 7783 (76%) | +++++ (+++) | 5158 (31%) |
| Latency | | 24348us | 45140us | 43212us | 10129us | 4661us | 2516us | | 10544us | 148us | 10487us | 20255us | 46us | 25662us |
* Benchmarks may not be entirely accurate due to the use of a test file smaller than available memory.
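To address that caveat, the runs could be repeated with a working set at least twice the instance's RAM so the page cache cannot absorb the test; something along these lines, with sizes adjusted per machine (the figures below assume the 7 GB Large instance):

```bash
# -d: test directory, -s: total file size in MB (16 GB, > 2x RAM),
# -r: machine RAM in MB, -n: file count multiplier for the create tests,
# -u: user to run as.
bonnie++ -d /mnt/raid -s 16384 -r 7168 -n 16 -u nobody
```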