Home Networking. Part 3 - VeloCloud Architecture

Before I blog about my experience in configuring VeloCloud from Orchestrator to Edge, it is important to understand the architecture and how the VeloCloud SD-WAN platform functions. With this knowledge one can make the best decisions about how to configure their SD-WAN. SD-WAN solutions provide the software abstraction to create a network overlay and decouple network software services from the underlying hardware.

There are three major components to the VeloCloud Platform: Orchestrator, Gateways, and Edges. I will describe and summarize their functions and relationships to each other in this blog.

VeloCloud Orchestrator Operator Menu

Orchestrator (VCO).

The VCO is the portal that is used to create, configure, and monitor VeloCloud SD-WANs. VeloCloud Orchestrator is multi-tenant and very powerful. Through a single Orchestrator and its associated Gateways, one can create SD-WANs, or Software-defined Wide Area Networks, for Customers or Partners. A customer is able to manage and monitor their own VeloCloud Edges, Network Profiles, Business Policies, Firewall Rules, and more through the VCO. Partners are able to create their own customers within the VCO and manage their customer environments directly. The VCO is also used to activate and configure Edges. The VCO is a virtual machine that can run on vSphere, KVM, or AWS.

Gateway.

A VeloCloud Gateway, or VCG, is the device that an Edge routes traffic through when the traffic is defined to take a “multi-path” route (there will be more on route types in a future blog) or for non-VeloCloud VPNs. There are two main types of configuration for a Gateway: default and Partner. In the VCO, the VeloCloud Operator creates one or more Gateway Pools and then places Gateways into those pools. Gateways are virtual machines that can run on vSphere, KVM, or AWS.

Gateway Pools are then assigned to Partners and/or Customers.

VCO Gateway Pools

VCO Customers

In a Cloud Hosted Model where Gateways are in default mode, an Edge is assigned a primary and secondary Gateway based on geolocation through the MaxMind database. The Edge’s peered Gateways are the ones geographically closest to that Edge. The Edge device sends all “multi-path” traffic to its primary VCG, and the Gateway then sends the traffic on to the intended destination. Return traffic is sent to the Gateway and then back to the Edge device. If one of the Edge’s Gateways has been unreachable for 60 seconds, the Edge marks that Gateway’s routes as stale. If the VCG is still unavailable after another 60 seconds, the Edge removes the routes for this Gateway. If all Gateways are down, the routes are retained, and the timer is restarted. If the Gateways reconnect, the routes are refreshed on the Edge.
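
To make the timers above easier to picture, here is a minimal Python sketch of the route states for a single Gateway. It is only my own illustration of the behavior described in this paragraph, not VeloCloud code. The names `RouteState` and `route_state` are made up, and I am assuming that routes retained while every Gateway is down are kept in the stale state.

```python
from enum import Enum

STALE_AFTER_S = 60    # unreachable this long: routes are marked stale
REMOVE_AFTER_S = 120  # still unreachable 60 seconds later: routes are removed


class RouteState(Enum):
    ACTIVE = "active"
    STALE = "stale"
    REMOVED = "removed"


def route_state(unreachable_for_s: float, all_gateways_down: bool) -> RouteState:
    """Return the state of the routes learned through one Gateway.

    unreachable_for_s is how long the Edge has been unable to reach that
    Gateway; 0 means the Gateway is reachable.
    """
    if unreachable_for_s < STALE_AFTER_S:
        return RouteState.ACTIVE
    if unreachable_for_s < REMOVE_AFTER_S:
        return RouteState.STALE
    # Assumption: when every Gateway is down the routes are retained,
    # modeled here as staying stale instead of being removed.
    if all_gateways_down:
        return RouteState.STALE
    return RouteState.REMOVED
```

For example, `route_state(90, False)` is stale, while `route_state(150, False)` is removed unless every Gateway is down.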

An SD-WAN with Partner Gateways gives the Operator the ability to route traffic to specific VCGs from Edges based on subnet. This is a value-add beyond the Cloud Hosted Model. A Partner will place Gateways geographically close to the services that they offer. When an Edge peered with a Partner VCG wants to access that service, the Edge leverages the tunnel to the Partner Gateway assigned for that service by subnet. Often Edges that are peered with partner Gateways have an average of 4 Gateways manually assigned. This number generally equals the number and locations of the services that the Partner is providing the customer such as SaaS offerings, Cloud services, etc.

You can see in the screenshot below that I checked the box for Partner Gateway during the Gateway creation and was given an option to define which subnets should be routed by that Gateway.
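
The idea of routing to a Partner Gateway by subnet can be sketched in a few lines of Python. This is a rough illustration and not how the Edge is actually implemented. The Gateway names and subnets are hypothetical, and picking the most specific matching subnet is my assumption.

```python
import ipaddress

# Hypothetical Partner Gateways and the subnets each one is set to route
PARTNER_GATEWAY_SUBNETS = {
    "vcg-saas-east": ["203.0.113.0/24"],
    "vcg-cloud-west": ["198.51.100.0/24", "192.0.2.0/25"],
}
# Hypothetical name for the Gateway used when no Partner subnet matches
DEFAULT_GATEWAY = "vcg-primary"


def pick_gateway(destination: str) -> str:
    """Return the Gateway whose subnet best matches the destination IP."""
    dest = ipaddress.ip_address(destination)
    best_name, best_prefix = DEFAULT_GATEWAY, -1
    for name, subnets in PARTNER_GATEWAY_SUBNETS.items():
        for subnet in subnets:
            net = ipaddress.ip_network(subnet)
            # Assumption: the longest (most specific) matching prefix wins
            if dest in net and net.prefixlen > best_prefix:
                best_name, best_prefix = name, net.prefixlen
    return best_name
```

With this table, traffic to 203.0.113.7 would use `vcg-saas-east`, and anything that matches no Partner subnet falls back to `vcg-primary`.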

It is important to note that VCGs do not talk to each other and are not aware of each other’s state. Traffic is not routed between Gateways. The Edge sends “multi-path” traffic to its Gateway, and the Gateway sends that traffic to its destination. When the destination responds, the reply is routed back through the Gateway to the Edge.

Gateways can be assigned to multiple Gateway Pools. Gateway Pools can be assigned to multiple Customers and Partners within the VCO. Partner Gateways should be placed closest (within 5-10 ms latency) to services that the Edges will access. Default Gateways should be geographically close to the Edges deployed in the Customer SD-WAN. It is not ideal for an Edge on the west coast of the US to send traffic to a Gateway on the east coast of the US before it is routed to its destination, for example.

Edge.

VeloCloud Edge, or VCE, devices are where the magic happens! Edge devices can be physical or virtual. They are implemented in enterprise datacenters, remote locations, and hyperscalers. Edge devices are able to aggregate multiple WAN links from different providers and send traffic on a per packet basis through the best WAN link to its peered Gateway. An Edge can aggregate the multiple WAN links and remediate issues found on public Internet providers such as loss, jitter, and latency. Even if just one WAN Link is connected to a VCE, improvement can be seen because of remediation capabilities of the Edge device.
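
Here is a toy Python sketch of what "best WAN link" could mean when a choice is made from loss, jitter, and latency. The real Edge logic is not described in this post, so the scoring formula, its weights, and the sample link numbers are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class LinkStats:
    name: str
    loss_pct: float
    jitter_ms: float
    latency_ms: float


def link_score(link: LinkStats) -> float:
    # Lower is better. These weights are made up for this sketch.
    return link.loss_pct * 100 + link.jitter_ms * 2 + link.latency_ms


def best_link(links: Iterable[LinkStats]) -> LinkStats:
    """Pick the link with the lowest score for the next packet."""
    return min(links, key=link_score)


# Hypothetical measurements for two WAN links
links = [
    LinkStats("cable", loss_pct=0.1, jitter_ms=4.0, latency_ms=20.0),
    LinkStats("dsl", loss_pct=2.0, jitter_ms=15.0, latency_ms=35.0),
]
```

Calling `best_link(links)` on fresh measurements before each packet is the rough idea behind per-packet steering.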

In this screenshot you can see that VoIP traffic quality was greatly improved by the VCE. This VCE only has one WAN link.

VeloCloud Voice Enhancements

All VCE management is performed via the VCO in the customer portal. The Enterprise Administrator uses Profiles to manage Edge devices. This makes it very easy for thousands of VCEs to be managed with the modification of a single profile. Enterprise Administrators can also override the profile settings to give an individual VCE the unique configuration that its specific site needs.
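
The relationship between a shared Profile and per-Edge overrides can be pictured as a simple merge. This is a minimal sketch and not the VCO data model. The setting names, values, and Edge names are hypothetical.

```python
# Hypothetical profile shared by many Edges
PROFILE = {"wifi_enabled": True, "dns": "10.10.0.10", "vlan_ids": [10, 20]}

# Hypothetical per-Edge overrides, keyed by Edge name
EDGE_OVERRIDES = {
    "branch-042": {"wifi_enabled": False},
}


def effective_config(edge_name: str) -> dict:
    """Start from the profile, then apply that Edge's overrides on top."""
    config = dict(PROFILE)
    config.update(EDGE_OVERRIDES.get(edge_name, {}))
    return config
```

Changing one value in `PROFILE` changes it for every Edge that does not override it, which is why a single profile edit can reach thousands of sites.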

Edges can be configured in three main roles: as a default VCE, a Hub, or an Internet Backhaul. The default VCE routes traffic as described above, leveraging its profile rules and Business Policies. It might connect to a Hub or Internet Backhaul. A Hub is one or more VCEs acting as a central location that other VCEs connect to over VPN. Hubs are generally created at major data centers. An Internet Backhaul is a destination for traffic that is routed via Business Policy rules from VCEs back to a single location such as a data center. This is often used for security or compliance purposes. I will provide more information on Business Policies in a future blog.

VCEs are created within the VCO by the Enterprise Administrator and assigned a profile. This profile includes all configuration items for interfaces, Wi-Fi, static routes, firewall rules, business policies, VPNs, security services, and more. When the VCE is activated by the VCO, all configuration is pushed to the VCE by the VCO, and the VCE is peered with its primary and secondary Gateways and Partner VCGs, if any.

Once the VCE is online, the VCO displays data about the traffic type, source, destination, and quality that passes through each VCE. A world map is displayed that shows all VCE locations and their status in the customer portal.

VCE Monitoring. Applications Tab.

There are three ways that the VCE will route traffic. The way that traffic is routed is determined by Business Policies in the Edge Profile. These three routing types are defined as Network Services. They are Multi-Path, Direct, and Internet Backhaul. Multi-Path means that the VCE determines the best carrier for each packet from all WAN links. Each packet is routed to a Cloud or Partner Gateway. Direct is when the Enterprise Administrator routes specific application traffic by defining a single WAN link and does not route through a VCG. Internet Backhaul is described above.
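
As a rough mental model, a Business Policy maps kinds of traffic to one of the three Network Services. The Python sketch below is only an illustration. The application names are hypothetical, and falling back to Multi-Path for unmatched traffic is my assumption.

```python
from enum import Enum


class NetworkService(Enum):
    MULTI_PATH = "Multi-Path"
    DIRECT = "Direct"
    INTERNET_BACKHAUL = "Internet Backhaul"


# Hypothetical Business Policy: application name -> Network Service
BUSINESS_POLICY = {
    "voip": NetworkService.MULTI_PATH,
    "guest-web": NetworkService.DIRECT,
    "finance-web": NetworkService.INTERNET_BACKHAUL,
}


def describe_path(application: str) -> str:
    """Describe where the Edge sends traffic for an application."""
    # Assumption: traffic with no matching rule falls back to Multi-Path
    service = BUSINESS_POLICY.get(application, NetworkService.MULTI_PATH)
    if service is NetworkService.MULTI_PATH:
        return "best WAN link per packet, then a Cloud or Partner Gateway"
    if service is NetworkService.DIRECT:
        return "one chosen WAN link, no Gateway"
    return "sent to the backhaul site, such as a data center"
```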

The VeloCloud platform is extremely robust and easy to use at the same time. The ability to configure VCEs and provide security and services to 1000s of sites with a few clicks is nothing short of amazing. If you are looking to improve WAN quality, move away from expensive MPLS, aggregate multiple WAN links, create VPNs across the enterprise, provide security services, and have an easy to use portal to accomplish it all, definitely look at VeloCloud.

Thank you for reading! I will provide details on how to deploy and configure VCO, VCGs, and VCEs in the next blog of this series.

I want to give a shoutout to Cliff Lane at VMware for spending a lot of time answering my numerous questions about how VeloCloud works. Without him, this post would not be possible (or at least correct)! Thanks Cliff!

Home Networking. Part 2 – Foundational Configuration.

Now that the UniFi Security Gateway, or USG, and switches were online and updated to the latest firmware, I was anxious to really start using my VeloCloud Edge. I have access to a VeloCloud Orchestrator that is hosted and managed by VMware. But as an Enterprise Administrator, I can only configure and monitor Edges in a customer environment. There was a lot to the platform that I hadn’t seen. I would have Operator privileges in my own environment!

However, my home lab wasn’t ready because the VeloCloud Orchestrator, or VCO, is distributed as an OVF that requires vCenter to set up the VCO before it boots. I was hoping that I could deploy the OVF through the direct host management that I had been using. I gave it a try and was able to deploy the OVF. However, because I was not deploying through vCenter, I wasn’t able to set the host name, password, or SSH keys. After the VCO booted up, I couldn’t log in or do anything. I deleted the VM and turned my attention to VLANs.

Velocloud OVF Configuration

Because the UniFi switches are only layer 2 capable, they cannot route traffic between VLANs. This means all inter-VLAN traffic must be routed through the USG. Because I planned to have at least 5 separate VLANs, I began to feel concerned about the CPU utilization on the USG. It would already be performing DPI and other security features. Now, it will need to route most of the packets on my network. At the time of writing this, less than 25% of my devices are online. Every day a few more are connected. It will be interesting to see how chatty these devices are with just a few human users. Here is the latest usage chart. CPU is sitting at about 25% with a few devices streaming and a couple of people using cell phones and laptops.

USG Performance Chart

Setting up VLANs in the controller software is very easy. It’s configuring firewall rules that I find to be kludgy because they give you about 6 different ways to make the same thing happen. For software that is for home and small/medium business use, I think they should make this simpler and more intuitive. In the screenshot below you see I am creating a new network named Demo. I typed in 10.10.200.1/24, and it automatically populated everything below the Gateway/Subnet box. If the UniFi Site is set up with the correct DHCP and DNS servers, you won’t need to change those settings unless you wish. You’ll notice there are multiple purposes to choose from when creating a new network. To create a VLAN as one might expect to use it in an enterprise environment, select Corporate.
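
If you are curious what can be derived from a value like 10.10.200.1/24, Python's standard `ipaddress` module shows the arithmetic. This is just a sketch of the math and not what the UniFi controller runs internally.

```python
import ipaddress

# The Gateway/Subnet value typed into the controller
iface = ipaddress.ip_interface("10.10.200.1/24")
network = iface.network

gateway_ip = iface.ip                    # 10.10.200.1
network_address = network.network_address  # 10.10.200.0
netmask = network.netmask                # 255.255.255.0
broadcast = network.broadcast_address    # 10.10.200.255
# Usable host addresses exclude the network and broadcast addresses
usable_hosts = network.num_addresses - 2  # 254

print(gateway_ip, network_address, netmask, broadcast, usable_hosts)
```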

Network Creation in UniFi Controller

A network set with the Guest purpose is meant for devices that should not have access to everything, such as the phones of visitors who want to use Wi-Fi instead of their mobile data. If you want to use tokens or hotspot authentication, that is built into the guest profile, and enabling it takes only a few clicks. This is certainly easier than manually setting up those firewall rules.

After creating VLANs for the different types of devices I would have on my network, it was time to prevent communication between the VLANs where it is unnecessary. When I looked at the site settings for routing and firewall, I was amazed. Why does it need to be so complicated? Nine different places where I could create a rule seems excessive. To make matters worse, members of the Ubiquiti community give misinformation in the forums as to how to create a firewall rule, such as creating an “IN” rule when there should truly be an “OUT” rule. I don’t think this is their fault. It is due to how the GUI is built and possibly to how the USG handles rules. For example, you must create the rules in the GUI in the order in which you want the USG to enforce them. You do not get to edit them to reorder them or even set the rule index during creation. This is just silly. There is more flexibility in the CLI, but then we have to get JSON involved for the settings to be remembered whenever the USG is rebooted or provisioned with a new setting. I would suggest a different product if you want simple firewall management at home. I don’t know what that would be. It seems a lot of people like pfSense. I’ve never used it, so I can’t recommend it or advise against it.
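
To show why rule order matters so much, here is a small Python sketch of first-match rule evaluation. It is my own simplified model and not the USG's actual rule engine. I am assuming rules are checked top to bottom and the first match wins, and the subnets and addresses are hypothetical.

```python
import ipaddress


def first_match(rules, src, dst, default="accept"):
    """Return the action of the first rule that matches src and dst."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    for action, src_net, dst_net in rules:
        if (src_ip in ipaddress.ip_network(src_net)
                and dst_ip in ipaddress.ip_network(dst_net)):
            return action
    return default


# Hypothetical rules: (action, source subnet, destination subnet)
ALLOW_IOT_TO_DNS = ("accept", "10.10.30.0/24", "10.10.10.5/32")
DROP_IOT_TO_LAN = ("drop", "10.10.30.0/24", "10.10.0.0/16")

# Same two rules, two different orders, for a packet from an IoT device
in_order = first_match([ALLOW_IOT_TO_DNS, DROP_IOT_TO_LAN],
                       "10.10.30.7", "10.10.10.5")
reversed_order = first_match([DROP_IOT_TO_LAN, ALLOW_IOT_TO_DNS],
                             "10.10.30.7", "10.10.10.5")
```

With the allow rule first the packet is accepted. With the broad drop rule first, the allow rule is never reached, which is why not being able to reorder rules in the GUI is so painful.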

UniFi Firewall Rules

It has been a very long time since I did anything that resembled real network administration. Many years ago, I spent a few days in San Jose to take Cisco ACE training. I am pretty sure administering the ACE was more intuitive than creating firewall rules in the USG’s GUI. This is saying a lot. But I prevailed and my IoT devices no longer had access to internal systems or the internet. No botnets coming from my house! Not that they’d have the bandwidth to do much destruction to the world.

USG Threat Management

Another feature that I’ve decided to turn on in the USG is Threat Management. We all accidentally click a wrong link every so often. Limiting my internet speed to 85 Mbps? No problem! This is another opportunity to look closely at the specs of the USG Pro if you can pull more than 80 Mbps. Since I do not have a Pro, I don’t know what its throughput would be reduced to.

I thought I was finally ready to install vCenter. But alas, I didn’t have a DNS server running on my network. And if I’m going to have a home lab with a bunch of VMs, I certainly need Active Directory. Creating a domain controller for a new forest in a home lab in 2020 is far less nerve-wracking than running DCPromo.exe in 2001 in an enterprise environment, that’s for sure!

After creating A and PTR records in DNS, it was finally time for the VCSA. As all of you probably know, the tiny deployment of vCenter requires 10 GB of RAM. That certainly wasn’t going to fly with my hardware limitations!
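
As a quick aside on the A and PTR records, the PTR name is just the IP address reversed under in-addr.arpa. The Python sketch below builds both records with the standard `ipaddress` module. The host name and address are hypothetical and are not the ones from my lab.

```python
import ipaddress


def dns_records(hostname: str, address: str) -> list:
    """Build the forward (A) and reverse (PTR) records for one host."""
    ip = ipaddress.ip_address(address)
    return [
        (hostname, "A", str(ip)),
        # reverse_pointer yields the in-addr.arpa name for the PTR record
        (ip.reverse_pointer, "PTR", hostname),
    ]


# Hypothetical name and address for the vCenter appliance
records = dns_records("vcsa.lab.example", "10.10.200.10")
for name, record_type, value in records:
    print(name, record_type, value)
```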

Gaming PC ESXi Host

VCSA Config

My host came to a grinding halt. I reduced the VCSA to 6 GB of RAM. It could barely boot and could not load the UI. I set it to 8 GB, and at least it ran with minimal complaining long enough for me to deploy the VeloCloud Orchestrator. After that, it was shut down until it was time to deploy a VeloCloud Gateway, and then it was powered off again.

I was certainly happy to see this login screen after going through host resource gymnastics!

VCO VM

A production VM of VCO wants more resources than my host can provide.

VCO VM Resource Consumption

Luckily, it is well behaved and only consumes what it needs while powered on.

One last note before closing out this blog. VeloCloud Orchestrator must have a publicly accessible IP address. The default route must egress to the internet.

VCO OVF Network Selection

This means if you want to do this in your own environment at scale for true internet routing purposes, you might want to have a separate NIC that isn’t hidden behind NAT from something like a USG, for example! There will be many more things that you would need in addition to this, so it is unlikely that an individual would be running their own VCO instance for true SD-WAN multi-pathing into the world. But running it in your home lab behind a firewall and NAT to familiarize yourself with the platform is just fine.

Thank you for reading Part 2! Part 3 will address the VeloCloud architecture. I will describe what the individual components do and how they talk to each other.