Azure – Oracle Cloud (OCI) Interconnect Network Latency Shootout
What you should know: I am currently working for Oracle as Principal Cloud Architect, but any posts on this blog reflect my own views and opinions only.
The interconnect between Microsoft Azure and Oracle Cloud seems to be a hot topic lately. I often hear of the scenario of running an Oracle database on OCI and some applications on Azure, which is interesting to companies moving to the cloud. But one of the major concerns always seems to be that latency between Azure and OCI might be too high for their use cases. Since there are no performance SLAs or official testing results, it is always up to the user to either believe the numbers given or run a test on their own.
To make testing the network connectivity easier, I created a couple of Terraform scripts that set up an environment on Azure and OCI with tooling useful for this purpose. In this post I will show you what I did and what I have found out so far.
Apart from the general disclaimer, I really need to emphasize that the results I will show here are one-shot statistics generated by me in a single testing run. They are in no way official statistics by either Microsoft or Oracle.
For this test I created resources in OCI and Azure in the London (UK South) region. If you try this on your own, the region can easily be changed in my Terraform scripts.
The image below shows the setup in OCI (red) and Azure (blue). In each cloud, two virtual networks were created – one for the general test and one for a VPN used for comparison. Virtual machines are deployed as a client–server latency testing setup. Finally, three types of connections between the clouds are set up:
- Interconnect via FastConnect and ExpressRoute – this is the thing we actually want to test.
- VPN connection as a reference.
- Public internet connections as a reference.
When testing the network in cloud infrastructure, one needs to understand that the providers do some optimizations to increase performance. These optimizations might not apply to all protocols. Therefore not only classic ICMP latency ("ping") must be considered, but TCP and UDP latency as well.
For measuring ICMP I will stick to good old-fashioned ping, while qperf is used for TCP and UDP. The latter tool works with a client–server setup, so it needs to be installed on at least two machines.
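As a sketch, the measurements boil down to a few commands. The IP address below is a placeholder for the server VM's private address, and the exact test names may depend on your qperf version:

```shell
# Placeholder IP for the qperf server VM; adjust to your environment.
SERVER=10.0.1.10

# Server side: run `qperf` with no arguments to start the listener.
# Client side (commented out so the sketch runs without a live server):
#   qperf "$SERVER" tcp_lat udp_lat   # TCP and UDP latency
#   ping -c 60 "$SERVER"              # classic ICMP round-trip latency
echo "client would run: qperf $SERVER tcp_lat udp_lat"
```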
To collect statistics over a period of time, cron jobs are installed on the VMs to execute ping and qperf against their designated destinations. Everything is written to log files, which in the end can easily be transformed into CSV for further analysis.
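The log-to-CSV step can be sketched with awk. The sample lines below mimic ping's output, and the IP and log path are placeholders:

```shell
# Hypothetical crontab entry on a VM (one probe per minute):
#   * * * * * ping -c 1 10.0.1.10 >> /var/log/latency/ping.log

# Extract the rtt value from each ping log line and emit CSV rows.
printf '%s\n' \
  '64 bytes from 10.0.1.10: icmp_seq=1 ttl=64 time=0.97 ms' \
  '64 bytes from 10.0.1.10: icmp_seq=2 ttl=64 time=1.02 ms' |
awk -F'time=' '/time=/ { split($2, a, " "); print "icmp," a[1] }'
# Output:
#   icmp,0.97
#   icmp,1.02
```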
With this setup, data is collected for several routes:
- Latency Azure to Azure (a). This is a reference value for intra-cloud latency between two hosts in separate subnets.
- Latency Azure to OCI via interconnect (b). Here two different configurations are tested: one using a basic ExpressRoute gateway and one using a VNG SKU (currently only UltraPerformance) that supports FastPath. FastPath bypasses some hops between the Azure edge and the Azure VM.
- Latency Azure to OCI via public internet (c). Used as a reference value.
- Latency Azure to OCI via VPN (d). Used as a reference value.
- Latency OCI to OCI (e). This is a reference value for intra-cloud latency between two hosts in separate subnets.
The results show a consistent pattern of latencies for the different configurations.
- Lowest latency is achieved for intra-cloud traffic. Now this shouldn't be a surprise to anyone.
- Next best is a configuration using FastConnect/ExpressRoute with FastPath. This adds around 1 ms to the round-trip (RTT) latency.
- Then come the configurations using FastConnect/ExpressRoute without FastPath and the public internet, which show higher latencies, with the dedicated connection having slightly better values.
- Finally there is the configuration using VPN, which by far leads to the worst latency. This should not come as a surprise either.
The following tables show the results for the different configurations. (For the statistics friends: N was actually too low for a meaningful calculation of variance – so don't take the standard deviation too seriously.)
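The per-route columns in the tables below (mean, standard deviation, min, max) can be reproduced from the raw samples with a one-liner. The rtt values here are made-up example samples, not the measured data:

```shell
# Compute mean, (population) stddev, min, and max from a list of rtt samples in ms.
printf '%s\n' 0.97 1.02 0.87 1.10 |
awk '{ s+=$1; ss+=$1*$1; if (min=="" || $1<min) min=$1; if ($1>max) max=$1; n++ }
     END { m=s/n; sd=sqrt(ss/n - m*m);
           printf "mean=%.2f sd=%.2f min=%.2f max=%.2f\n", m, sd, min, max }'
# Output: mean=0.99 sd=0.08 min=0.87 max=1.10
```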
ICMP latency in ms measured with ping.
| Route | Mean | Std dev | Min | Max |
|---|---|---|---|---|
| Azure - Azure internal | 0.97 | 0.06 | 0.87 | 1.09 |
| Azure - OCI via Interconnect | 2.91 | 0.44 | 2.34 | 3.75 |
| Azure - OCI via Interconnect FastPath | 1.79 | 0.21 | 1.65 | 2.34 |
| Azure - OCI via public internet | 3.24 | 0.86 | 2.35 | 7.22 |
| Azure - OCI via VPN | 5.84 | 1.04 | 4.64 | 7.02 |
| OCI - OCI internal | 0.30 | 0.04 | 0.26 | 0.40 |
TCP latency in ms measured with qperf.
| Route | Mean | Std dev | Min | Max |
|---|---|---|---|---|
| Azure - Azure internal | 0.24 | 0.06 | 0.18 | 0.34 |
| Azure - OCI via Interconnect | 1.26 | 0.14 | 1.08 | 1.49 |
| Azure - OCI via Interconnect FastPath | 0.66 | 0.06 | 0.58 | 0.74 |
| Azure - OCI via public internet | 1.40 | 0.18 | 1.11 | 1.84 |
| Azure - OCI via VPN | 2.46 | 0.18 | 2.25 | 2.74 |
| OCI - OCI internal | 0.08 | 0.01 | 0.07 | 0.11 |
UDP latency in ms measured with qperf.
| Route | Mean | Std dev | Min | Max |
|---|---|---|---|---|
| Azure - Azure internal | 0.23 | 0.06 | 0.17 | 0.33 |
| Azure - OCI via Interconnect | 1.28 | 0.14 | 1.02 | 1.48 |
| Azure - OCI via Interconnect FastPath | 0.63 | 0.06 | 0.57 | 0.71 |
| Azure - OCI via public internet | 1.49 | 0.39 | 1.02 | 2.76 |
| Azure - OCI via VPN | 2.53 | 0.09 | 2.39 | 2.61 |
| OCI - OCI internal | 0.08 | 0.01 | 0.07 | 0.11 |
If you are thinking about a multi-cloud setup with strong interdependencies, consider opting for an ExpressRoute setup on Azure that supports FastPath. All measurements show a considerable performance gain (in terms of latency) with this option. While it still cannot beat the performance within a single network, you get numbers in the ballpark of the read latency of (slow) SSD drives.
The basic interconnect without FastPath will give you only a minor performance gain over the public internet, although the redundant dedicated lines of ExpressRoute and FastConnect will increase the resiliency of your network and the consistency of its performance, and potentially enhance security. Please note that the rather good performance of public internet connectivity might be due to the Azure and OCI data centers being located geographically very close together in most cases.
Using site-to-site VPN connections leads to a considerable increase in latency. If secured connections between data centers are a requirement, consider evaluating a setup using FastConnect and ExpressRoute.
As I already mentioned above, these results are just a single observation from the London (UK South) region. For your specific requirements or region the results may vary, although I believe the general observations will remain valid. Using the scripts I provide, you can easily try setting up an interconnect and get your own numbers. The free trial credits for OCI and Azure will be sufficient for this.