What causes call quality problems and what can you do about it?
If you have ever been on a VoIP call, chances are you have experienced call quality issues at some point. In fact, a lot of my work involves helping companies identify why they are having these issues and resolve them. This post will teach a bit of the basic theory behind why this happens and what you can do about it.
VoIP primarily consists of two parts: the “signaling” and the “media”. The signaling controls how the devices connect, and the media is the actual audio that gets transmitted.
SIP communication in a nutshell.
SIP (Session Initiation Protocol) is a communications protocol for signaling and controlling multimedia communication sessions in applications of Internet telephony for voice and video calls, in private IP telephone systems, as well as in instant messaging over Internet Protocol (IP) networks.
Or more simply, it's a series of “signals” or “commands” that get passed back and forth between two devices that are trying to talk to each other. The signaling negotiates what type of compression the two devices can use, which ports and IP addresses the media will use to go back and forth, and other options. Events such as hangups, transfers, and keepalives are also transmitted via the signaling.
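To make the negotiation a little more concrete, here is a minimal sketch in Python. It assumes the call setup carries an SDP (Session Description Protocol) body, which is the usual way SIP describes media but is not something covered above. The sample address, port, and codec list are made-up illustration values, and parse_media_offer is a hypothetical helper, not part of any SIP library.

```python
# Hypothetical SDP body, as might be carried inside a SIP INVITE.
# The address, port, and codec list are made-up illustration values.
SAMPLE_SDP = """\
v=0
o=alice 2890844526 2890844526 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 49170 RTP/AVP 0 8
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
"""


def parse_media_offer(sdp_text):
    """Extract the media IP address, port, and codec names from an SDP body."""
    offer = {"ip": None, "port": None, "codecs": []}
    payload_names = {}
    payload_order = []
    for line in sdp_text.splitlines():
        key, _, value = line.partition("=")
        if key == "c":
            # c=<network type> <address type> <address>
            offer["ip"] = value.split()[2]
        elif key == "m" and value.startswith("audio"):
            # m=audio <port> <transport> <payload types...>
            fields = value.split()
            offer["port"] = int(fields[1])
            payload_order = fields[3:]
        elif key == "a" and value.startswith("rtpmap:"):
            # a=rtpmap:<payload type> <codec name>/<clock rate>
            payload, codec = value[len("rtpmap:"):].split(None, 1)
            payload_names[payload] = codec.split("/")[0]
    # Keep the codecs in the order the sender listed them.
    offer["codecs"] = [payload_names.get(p, p) for p in payload_order]
    return offer


print(parse_media_offer(SAMPLE_SDP))
```

The result is exactly the kind of information the signaling has to settle before any audio flows: where to send the media and which compression to use.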
If we were talking about the pony express right now, then the signaling would be the addresses on the envelope and the riders who take the messages from one place to another. The contents of the letters in this pony express example would be the media.
Types of Call Quality Problems
Dropped calls, audio cutting out, strange sounding robotic voices, etc.
Most of the time when you experience problems on VoIP calls, the cause is the signaling, the media, or both failing to reach their destination, or the “packets” reaching the destination too late or in the wrong order.
To help you visualize this, imagine that we are standing face to face and I have three buckets of ping-pong balls in front of me: one RED, one BLUE, and one GREEN. Let’s assume that the RED balls represent “SIGNALING” information, the BLUE represent “MEDIA”, and the GREEN represent regular internet traffic such as web browsing, email, and downloads.
This would be akin to you using your computer to download an email and speak on a VoIP call at the same time. To get the “packets” (ping-pong balls) to you, I start tossing the different colored ones at you. First I throw a RED, then a BLUE, then a GREEN. You catch each one in turn and put it in the correct bucket in front of you. No problem there.
Now, imagine I just pick random colors and start throwing them at you faster and faster. You’re catching most of them, but every once in a while you drop one. This would be akin to “packet loss”. The packet tried to come in, but since you were too busy or overwhelmed at that moment, you didn’t have a free hand to catch the ball and it got “dropped”. Whatever data was in that ball is lost forever. If that was a fragment of “media”, then maybe you lost part of a word. If that was a fragment of “signaling”, then maybe your phone didn’t get a signal it needed to keep the call active, and it drops the call.
The point is that because I’m throwing the balls at you in no particular order and with no regard for how fast you can catch them, you eventually start dropping them. Dropped media causes loss of sound; worse, dropped signaling can make the entire call malfunction and drop completely.
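The dropped and jumbled ping-pong balls can be put into numbers. The sketch below assumes each packet carries a sequence number starting at 0, so the receiver can tell which ones never arrived and which ones arrived late. The function name and the sample arrival list are hypothetical and only for illustration.

```python
def summarize_arrivals(sent_count, arrived_sequence_numbers):
    """Count lost and out-of-order packets from the sequence numbers that arrived."""
    seen = set()
    highest_seen = -1
    out_of_order = 0
    for seq in arrived_sequence_numbers:
        if seq in seen:
            continue  # ignore duplicate deliveries
        if seq < highest_seen:
            # A later packet already arrived, so this one showed up late.
            out_of_order += 1
        highest_seen = max(highest_seen, seq)
        seen.add(seq)
    lost = sent_count - len(seen)
    return {
        "lost": lost,
        "loss_percent": 100.0 * lost / sent_count,
        "out_of_order": out_of_order,
    }


# Ten packets (0 through 9) were thrown; 6 and 7 were dropped, and 3 arrived after 4.
arrivals = [0, 1, 2, 4, 3, 5, 8, 9]
print(summarize_arrivals(10, arrivals))
# {'lost': 2, 'loss_percent': 20.0, 'out_of_order': 1}
```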
What can be done? Isn’t there a way for the “carrier” who is throwing all the different packets at you to do it in an intelligent way? What if, instead of throwing a random color your way, they always threw the RED balls, which represent the “signaling”, first? If your device always gets the right signaling first, it’s not going to miss a critical signal and make the entire call malfunction. So the signaling packets get prioritized first and foremost. Next, once all the RED signaling packets have been thrown, I check for BLUE packets, which represent “media”. I carefully throw all the BLUE packets your way in the correct order, so all the actual sounds and words being transmitted can be played correctly by the device.
Finally, after I have made sure all the signaling and all the media have been passed your way, only then do I start transmitting the GREEN packets, which represent all the lower priority email and web traffic.
What I have just described is “QoS”, otherwise known as Quality of Service.
Enabling QoS on a data link is what allows a carrier to ensure that higher priority voice packets are sent ahead of lower priority email or web packets. If the voice packets are always sent first and in the correct order, congestion from other traffic is far less likely to degrade your calls, and they will sound clear and function correctly.
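The RED, then BLUE, then GREEN rule can be sketched as a strict-priority queue. This is only an illustration of the ordering idea from the ping-pong example, not how any particular carrier's equipment is configured; the PRIORITY table and the packet names are assumptions made up for the sketch.

```python
import heapq
from itertools import count

# Lower number = sent first. The ranking follows the ping-pong example:
# RED (signaling), then BLUE (media), then GREEN (everything else).
PRIORITY = {"RED": 0, "BLUE": 1, "GREEN": 2}


def strict_priority_order(packets):
    """Return packets in send order: best class first, arrival order within a class."""
    arrival = count()  # tie-breaker that preserves arrival order inside a class
    heap = []
    for color, payload in packets:
        heapq.heappush(heap, (PRIORITY[color], next(arrival), color, payload))
    ordered = []
    while heap:
        _, _, color, payload = heapq.heappop(heap)
        ordered.append((color, payload))
    return ordered


waiting = [
    ("GREEN", "email-1"),
    ("BLUE", "audio-1"),
    ("RED", "keepalive"),
    ("BLUE", "audio-2"),
    ("GREEN", "web-1"),
]
for color, payload in strict_priority_order(waiting):
    print(color, payload)
# RED keepalive
# BLUE audio-1
# BLUE audio-2
# GREEN email-1
# GREEN web-1
```

Note that a purely strict scheme like this one can leave the GREEN traffic waiting indefinitely if the higher classes never go quiet, which is one reason real QoS configurations tend to be more nuanced than this sketch.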
This is a problem for most “VoIP providers”. Since most VoIP-only providers such as RingCentral, Skype, 8x8, and others rely on the open internet to send the packets, they have no ability to control the order in which the packets get sent. They can’t provide QoS and have to rely on “best effort” delivery. To get the QoS that most enterprise customers need, you need to get the phone service and the bandwidth from the same carrier, and that carrier needs a strong grasp of advanced routing and QoS protocols over wide area networks. If the carrier controls the network the packets run over, and has all the equipment configured to send the packets in the correct order, it can guarantee the quality of the voice all the way to your building. You need a voice carrier that is also well versed in bandwidth delivery to do this. Most other VoIP carriers don’t have this ability.
There is a second place that can cause issues with voice calls…
Your INTERNAL NETWORK.
So you have selected a carrier who can provide QoS, and they are prioritizing all the packets all the way to your server room. You should be done, right? Not quite. There is still an internal network to deal with. The packets come in through a router of some kind, then probably go to a switch or two, then over some wires, and finally into your phone. Problems with your internal network that might not otherwise be obvious could be causing packet loss and jitter without you even realizing it.
A common issue with internal networks is the network loop. Let’s say your network is slightly complex: you have a router plugged into switch A, switch A plugged into switch B, and switch A also plugged into switch C. Finally, since you want your switches to be fully redundant, you plug switch B into switch C just so there is an extra path for the signal to travel. That should be OK, right?
Well, no. Unless your switches are running a loop-prevention protocol, you have just introduced a “loop” in your internal network. Switches forward Ethernet frames at Layer 2, and unlike IP packets, those frames carry no hop limit, so nothing stops them from circling. A broadcast sent into switch A gets flooded to B and C, which flood it to each other and back to A, and so on, and the copies keep multiplying. This loop will cause performance degradation, packet loss, and many other strange, unpredictable issues, and it is extremely easy to introduce into a network with more than one switch. Note that dynamic routing protocols such as OSPF do not help here: they choose paths between routers at Layer 3 and do nothing about loops between switches. If you’re using enterprise-grade switches, they will usually have protection against network loops in the form of the Spanning Tree Protocol (STP). With spanning tree turned on, the switches detect the loop and automatically put one of the redundant ports into a blocking state, which breaks the loop while keeping that link in reserve as a backup path. Reconverging after a topology change can briefly interrupt traffic, but at least your entire network will not go down, and the extra cable between B and C then gives you the redundancy you wanted.
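If you keep a list of which device is cabled to which, you can check that list for loops before anything gets plugged in. The sketch below is a planning aid built on a simple union-find, not a description of how spanning tree works inside a switch. The cable list mirrors the router and switch A, B, C example, and find_redundant_links is a hypothetical helper.

```python
def find_redundant_links(links):
    """Return the links that close a loop, using a union-find over device names."""
    parent = {}

    def find(node):
        # Follow parent pointers to the representative of this device's group.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving keeps chains short
            node = parent[node]
        return node

    redundant = []
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            # Both ends are already connected some other way, so this cable closes a loop.
            redundant.append((a, b))
        else:
            parent[root_a] = root_b
    return redundant


cables = [("router", "A"), ("A", "B"), ("A", "C"), ("B", "C")]
print(find_redundant_links(cables))
# [('B', 'C')]
```

Which cable gets flagged depends on the order of the list; the point is only that adding the B to C cable creates a second path between devices that were already connected.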
Another common issue is not having proper wiring in the building. Little “under the desk” switches are everywhere. What harm can they cause, right? Well, each network device adds an extra point of failure and extra latency. The optimal situation is to have every device “home run” back to the core network switches. Sometimes you simply need to get a few more drops run to eliminate the under-the-desk switches.
The lesson here is, proper network design is a nuanced thing which must be taken seriously if you want to ensure that not only your voice calls can work clearly, but your network as a whole can run in the best possible way.
Another advantage of using a company such as ours is that we provide network design services and switch hardware to our enterprise voice customers. We dislike problems just as much as you do. We provide the design and hardware to make our own job easier; it just happens to have the side effect of making the voice work better for our customers. We often augment in-house IT with our expertise in switching and routing, as this is a deep field that is beyond the experience of most “IT generalists”.
In closing, how do you stop call quality problems? Get a voice carrier who understands wide area networks and delivering high quality bandwidth with QoS. Also, make sure your internal network has a correct design and proper config, and is not secretly causing you issues.
If you need help designing or implementing high-performance networks or voice applications please let us know. We would love to help!