Configuring Failover with 3CX
The Failover feature in 3CX allows you to create a replica of your PBX. In the event that your PBX fails, your replica PBX will become active, minimizing downtime and data loss. Below are the steps required to activate this functionality.
One Enterprise (ENT) or one Professional (PRO) license key is required to enable the failover functionality. With an ENT license key, the DNS TTL for the 3CX-provided FQDN is set to 5 minutes, whereas a PRO license key uses a 6-hour TTL, which causes a much longer reconnect time for IP phones, 3CX Clients, 3CX SBCs or the 3CX WebClient in case of a failure.
3CX uses an active-passive approach with built-in replication of the configuration and a maximum offset of 24 hours. The active host processes calls and presence information, whereas the passive host monitors the active host. If the active host fails (whether due to an application, OS or hardware failure), the passive host stops monitoring and takes over the role of active host. The configuration of the passive host determines under which conditions the active host is declared at fault and the switch is initiated; this is covered later in this article.
The following network failover cases are covered by the failover process:
- On Premise (NAT)
- Scenario A: with only remote extensions
- Scenario B: with local and/or remote extensions
- Cloud (Public Host)
- Scenario C: with only remote extensions as STUN / SBC
A failover from an On Premise host to a Cloud host (and vice versa) is not supported. Documentation and processing are tested and supported solely with 3CX-provided public FQDNs. While it is theoretically possible to use a custom public FQDN for the process, it is the administrator's responsibility to control, update and manage all DNS entries themselves.
Before configuring or enabling 3CX failover on your 2 servers, the 3CX Installations need to have a specific configuration.
- One 3CX Cloud PBX (with a public IP) to another 3CX Cloud PBX (with its own public IP), both installed with an external FQDN
- Both Servers should each be installed with identical settings, including FQDN, SSL Certificate, SIP, Tunnel, web server ports and web server type.
- When you install 3CX, you need to select 3CX FQDN.
- In the IP phone provisioning settings, the “Select Interface” dropdown needs to be set to the FQDN (NOT the IP).
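The requirement that both servers share identical settings can be checked with a small comparison before failover is enabled. The following is a minimal sketch, assuming you have noted down the settings of each installation by hand; the key names are hypothetical and are not read from 3CX itself.

```python
# Hypothetical setting names chosen for this sketch; 3CX does not expose
# its configuration under these keys. Fill the dictionaries in manually.
REQUIRED_MATCHING_KEYS = (
    "fqdn",
    "ssl_certificate_fingerprint",
    "sip_port",
    "tunnel_port",
    "https_port",
    "web_server_type",
)


def find_mismatches(active: dict, passive: dict) -> dict:
    """Return {key: (active_value, passive_value)} for every setting that differs."""
    mismatches = {}
    for key in REQUIRED_MATCHING_KEYS:
        active_value = active.get(key)
        passive_value = passive.get(key)
        if active_value != passive_value:
            mismatches[key] = (active_value, passive_value)
    return mismatches


if __name__ == "__main__":
    # Example values only; replace them with the settings of your servers.
    active = {
        "fqdn": "mycompany.3cx.example",
        "ssl_certificate_fingerprint": "AA:BB:CC",
        "sip_port": 5060,
        "tunnel_port": 5090,
        "https_port": 5001,
        "web_server_type": "nginx",
    }
    passive = dict(active, tunnel_port=5091)  # deliberately different
    for key, (a, p) in find_mismatches(active, passive).items():
        print(f"{key}: active={a!r} passive={p!r}")
```

An empty result means the recorded settings match; any reported key should be corrected on the passive server before continuing.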
You can have other variants like Custom FQDN, with one server in the LAN and another in the Cloud. However, for these configurations you will need to add complex scripting to update your DNS and FQDNs when failover occurs. See bottom of this article for more info.
“Overview of how it works:”
Configuring the Active and Passive Servers
Step 1: Configuring the ACTIVE SERVER
- Go to Server 1. We will assume that this is your production server where 3CX V15 is already pre-installed, with a configuration on it.
- All extensions need to be configured to provision using the FQDN. For each IP phone extension, go to Phone Provisioning > “Select Interface” and select your FQDN from the dropdown.
- Go to Backup and restore > Location and select Google Drive as the backup location (you can also use other backup options). In this example we will store all backups in a Google Drive folder named “SIP3CXCOMBackups”.
- Click on Backup Schedule and configure which backup options you want to include in your backup and when the backup will be made. We recommend doing this daily, at night. In this example, a backup starts at 1:00 AM and is uploaded to Google Drive as “3CXScheduledBackup.zip”. This is a hardcoded name for the latest backup. Press OK.
- Whilst still in the Backup and restore section, click the “Failover” button. Check the “Enable Failover” checkbox and select “Active”. Press OK to save.
The active machine is now configured. Automatic backups will be made and stored in Google Drive. Go to Step 2 to configure the Passive server.
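Because the passive server restores from the scheduled backup, the age of the latest backup determines how far its configuration can lag behind the active server. The sketch below checks a backup timestamp against the 24-hour maximum offset mentioned earlier; how you obtain the timestamp (for example from the file's modification time in your backup location) is left open and is an assumption of this example.

```python
from datetime import datetime, timedelta

# Maximum configuration offset stated in this article.
MAX_OFFSET = timedelta(hours=24)


def is_within_offset(last_backup: datetime, now: datetime,
                     max_offset: timedelta = MAX_OFFSET) -> bool:
    """True if the backup is not from the future and not older than max_offset."""
    age = now - last_backup
    return timedelta(0) <= age <= max_offset


if __name__ == "__main__":
    # Example: a backup taken at 1:00 AM today, checked right now.
    now = datetime.now()
    last_backup = now.replace(hour=1, minute=0, second=0, microsecond=0)
    if last_backup > now:  # it is not 1:00 AM yet, so use yesterday's backup
        last_backup -= timedelta(days=1)
    print("backup age:", now - last_backup)
    print("within 24h offset:", is_within_offset(last_backup, now))
```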
Step 2: Configuring the PASSIVE SERVER
Important: In scenarios where the active and passive servers are behind different public IP addresses, completing the installation of the passive server rewrites your public FQDN to the public IP of the failover server. To make the external FQDN resolve to the public IP of the active server again, go to your primary server, edit the license and click Apply so that the FQDN is switched back to the active server, then restart its services.
Failover Server Configuration
Go to Server 2 and install 3CX Phone System using the same configuration settings as your active server.
- Whilst on server 2 click on backup and restore > “Restore Schedule”. Enable Schedule Restore and set a time when the restore should be applied. Check the option “Do not start services after restore”. Press OK to save.
- Click on the “Failover” button, check “Enable Failover” and select the radio button “Passive”
- Enter the IP Address of the Active server - in this example 126.96.36.199
- Select which services you want to monitor: SIP Server, Web Server or Tunnel Server.
- Select the interval at which the heartbeat checks are made (default: 30 seconds) and configure whether failover should occur if one or all tests fail.
- Press OK to save the configuration and start monitoring.
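The monitoring behaviour configured above can be pictured as a simple polling loop. The following is an illustrative sketch only, not 3CX's actual implementation: it assumes plain TCP reachability as the health test (SIP commonly also runs over UDP, which is not probed here), and the port numbers are assumptions that should be replaced with the ports chosen during installation.

```python
import socket
import time
from typing import Callable, Dict, Optional

# Assumed ports for illustration; use the ports set during your installation.
MONITORED_SERVICES = {"sip": 5060, "web": 5001, "tunnel": 5090}


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_fail_over(results: Dict[str, bool], require_all_failed: bool) -> bool:
    """Decide on failover from per-service results (True means healthy)."""
    failures = [not healthy for healthy in results.values()]
    if not failures:
        return False
    return all(failures) if require_all_failed else any(failures)


def monitor(host: str, interval: float = 30.0, require_all_failed: bool = False,
            probe: Callable[[str, int], bool] = tcp_port_open,
            sleep: Callable[[float], None] = time.sleep,
            max_rounds: Optional[int] = None) -> bool:
    """Poll the active host; return True as soon as failover should be triggered.

    Returns False if max_rounds checks pass without meeting the condition.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        results = {name: probe(host, port)
                   for name, port in MONITORED_SERVICES.items()}
        if should_fail_over(results, require_all_failed):
            return True
        rounds += 1
        sleep(interval)
    return False
```

The `require_all_failed` flag mirrors the "one or all tests fail" choice: triggering on a single failed test reacts faster but is more sensitive to a brief outage of one service, while requiring all tests to fail is more conservative.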
When the active server fails, the passive server detects this and takes over. The backup will already have been restored, and the failover action triggers the DNS change on the 3CX DNS servers, updating the FQDN to the IP address of the new active server.
It is important that the former active server is shut down, because any services still running on it might conflict with the server that just took over.
Note: Gateways (FXS/FXO) are not supported in a failover scenario, as gateways are only supported when local to your system (reachable via the local LAN). Even where there is a site-to-site VPN between NATed servers, gateway registration and failover to the failover server cannot be supported, as this depends on the manufacturer, model and capabilities of the gateway.
For users who want to use a custom FQDN in LAN to LAN or LAN to Cloud scenarios, you will need to make use of advanced scripting and services such as Active Directory to allow PowerShell scripts to run with elevation.
You can find sample scripts for Windows based machines here.
Some scripts might need to run under impersonated user accounts. To achieve this you will need to follow additional steps documented here: https://www.3cx.com/docs/failover-script-user/
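When you manage DNS for a custom FQDN yourself, it helps to verify after a failover that the FQDN actually resolves to the new active server. The sketch below is not one of the 3CX sample scripts and does not update any DNS records; it only checks resolution with the Python standard library. The host name and IP address in the example are placeholders.

```python
import socket
from typing import Callable, List


def resolve_ipv4(fqdn: str) -> List[str]:
    """Return the IPv4 addresses the local resolver reports for fqdn."""
    try:
        return socket.gethostbyname_ex(fqdn)[2]
    except OSError:
        # Covers resolution errors such as an unknown host.
        return []


def fqdn_points_to(fqdn: str, expected_ip: str,
                   resolver: Callable[[str], List[str]] = resolve_ipv4) -> bool:
    """True if expected_ip is among the addresses fqdn resolves to."""
    return expected_ip in resolver(fqdn)


if __name__ == "__main__":
    # Placeholder values; substitute your FQDN and the passive server's IP.
    fqdn = "pbx.example.com"
    new_active_ip = "203.0.113.20"
    if fqdn_points_to(fqdn, new_active_ip):
        print(f"{fqdn} resolves to {new_active_ip}")
    else:
        print(f"{fqdn} does not resolve to {new_active_ip} yet")
```

Keep in mind that cached answers can persist until the record's TTL expires, so a negative result shortly after a DNS change does not necessarily mean the update failed.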