Clustering and High Availability

From MiRTA PBX documentation
Revision as of 21:20, 31 May 2016 by Manager

Several MiRTA PBX servers can work in cooperative mode, forming a cluster that provides better performance and high availability. MiRTA PBX cannot do load balancing without an external tool, but it can be used in a load-sharing cluster. The best way to set up the system is with DNS SRV, which is often referred to as a way to provide high availability. An SRV record is a special DNS record that lists the servers providing a service. Each record carries a “priority” and a “weight”, so the load can be shared among several servers. A typical set of DNS SRV records has the following format (from Wikipedia):

_sip._udp.example.com 86400 IN SRV 10 60 5060 bigbox.example.com.
_sip._udp.example.com 86400 IN SRV 10 20 5060 smallbox1.example.com.
_sip._udp.example.com 86400 IN SRV 10 10 5060 smallbox2.example.com.
_sip._udp.example.com 86400 IN SRV 10 10 5066 smallbox2.example.com.
_sip._udp.example.com 86400 IN SRV 20 0 5060 backupbox.example.com.

The first four records share a priority of 10, so the weight field's value will be used by clients to determine which server (host and port combination) to contact. The sum of all four values is 100, so bigbox.example.com will be used 60% of the time. The two hosts smallbox1 and smallbox2 will be used for 20% of requests each, with half of the requests that are sent to smallbox2 (i.e. 10% of the total requests) going to port 5060 and the remaining half to port 5066. If bigbox is unavailable, these two remaining machines will share the load equally, since they will each be selected 50% of the time. If all four servers with priority 10 are unavailable, the record with the next lowest priority value will be chosen, which is backupbox.example.com. This might be a machine in another physical location, presumably not vulnerable to anything that would cause the first four hosts to become unavailable.
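The selection rule described above can be sketched in Python. This is only an illustration of the priority and weight arithmetic, not the code a real SIP client runs; the names `SrvRecord`, `candidates`, `shares` and `pick` are hypothetical, and availability is modelled as a simple set of unreachable host names.

```python
import random
from collections import namedtuple

# One SRV record: priority, weight, port and target host.
SrvRecord = namedtuple("SrvRecord", "priority weight port target")

# The five example records listed above.
RECORDS = [
    SrvRecord(10, 60, 5060, "bigbox.example.com"),
    SrvRecord(10, 20, 5060, "smallbox1.example.com"),
    SrvRecord(10, 10, 5060, "smallbox2.example.com"),
    SrvRecord(10, 10, 5066, "smallbox2.example.com"),
    SrvRecord(20, 0, 5060, "backupbox.example.com"),
]


def candidates(records, unavailable=frozenset()):
    """Return the reachable records with the lowest priority value."""
    alive = [r for r in records if r.target not in unavailable]
    if not alive:
        return []
    best = min(r.priority for r in alive)
    return [r for r in alive if r.priority == best]


def shares(records, unavailable=frozenset()):
    """Return the expected fraction of requests per (target, port)."""
    group = candidates(records, unavailable)
    total = sum(r.weight for r in group)
    if total == 0:
        # All weights are zero (assumption): treat the records as equally likely.
        return {(r.target, r.port): 1 / len(group) for r in group}
    return {(r.target, r.port): r.weight / total for r in group}


def pick(records, unavailable=frozenset(), rng=random):
    """Pick one (target, port) at random according to the shares."""
    table = shares(records, unavailable)
    if not table:
        raise LookupError("no reachable server")
    keys = list(table)
    return rng.choices(keys, weights=[table[k] for k in keys], k=1)[0]
```

Calling `shares(RECORDS)` reproduces the 60/20/10/10 split, and passing `unavailable={"bigbox.example.com"}` shows smallbox1 and smallbox2 each receiving half of the requests.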

The load balancing provided by SRV records is inherently limited, since the information is essentially static: the current load of the servers is not taken into account. The most common setup for MiRTA PBX comprises two servers, each acting as Asterisk, web and database server. A possible set of DNS SRV records for this setup is the following:

_sip._udp.pbx.domain.com 86400 IN SRV 10 10 5060 voip1.domain.com.
_sip._udp.pbx.domain.com 86400 IN SRV 20 10 5060 voip2.domain.com.

In this way all the phones will register on voip1.domain.com and, in case of any problem, move to voip2.domain.com. If a phone is registered on voip2 and a call arrives from voip1, the system routes the call accordingly and the client does not notice any difference. A tenant can have half its phones on one server and half on another without noticing any difference. Although this configuration is possible, it is not really advisable, because routing calls between the servers adds load. It is better to work towards having all the phones of a tenant on the same server. A more advanced setup consists of creating two pools of servers, as follows:

_sip._udp.pbxA.domain.com 86400 IN SRV 10 10 5060 voip1.domain.com.
_sip._udp.pbxA.domain.com 86400 IN SRV 20 10 5060 voip2.domain.com.
_sip._udp.pbxB.domain.com 86400 IN SRV 20 10 5060 voip1.domain.com.
_sip._udp.pbxB.domain.com 86400 IN SRV 10 10 5060 voip2.domain.com.

The first pool, pbxA.domain.com, lists voip1.domain.com as primary server and voip2.domain.com as secondary server. The second pool, pbxB.domain.com, lists voip2.domain.com as primary and voip1.domain.com as secondary. All the phones using pbxA as DNS SRV address will normally connect to voip1, and all the phones using pbxB will use voip2.domain.com as primary. It is perfectly normal to find around 10% of the phones connected to the secondary server due to normal packet loss. By carefully choosing which pool to configure on each tenant's phones, the load of the system can be effectively shared among multiple servers while providing resilience.
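As a rough planning aid, the effect of assigning tenants to one pool or the other can be estimated with a short Python sketch. The tenant names and phone counts below are invented for illustration, `active_server` and `phones_per_server` are hypothetical helpers, and the model ignores the small share of phones that may sit on the secondary server.

```python
# Priority and target for each pool, taken from the two pools described above.
POOLS = {
    "pbxA.domain.com": [(10, "voip1.domain.com"), (20, "voip2.domain.com")],
    "pbxB.domain.com": [(20, "voip1.domain.com"), (10, "voip2.domain.com")],
}

# Hypothetical tenants: the pool configured on their phones and the phone count.
TENANTS = {
    "tenant1": ("pbxA.domain.com", 40),
    "tenant2": ("pbxB.domain.com", 35),
    "tenant3": ("pbxA.domain.com", 10),
}


def active_server(pool, down=frozenset()):
    """Return the reachable server with the lowest priority value, or None."""
    for _priority, target in sorted(POOLS[pool]):
        if target not in down:
            return target
    return None


def phones_per_server(tenants, down=frozenset()):
    """Sum the phones expected on each server, given the servers that are down."""
    load = {}
    for pool, phones in tenants.values():
        server = active_server(pool, down)
        if server is not None:
            load[server] = load.get(server, 0) + phones
    return load
```

With both servers up, the example tenants place 50 phones on voip1 and 35 on voip2; with `down={"voip1.domain.com"}` all 85 phones land on voip2, which is the load the surviving server has to be sized for.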