Can APNs handle large numbers of VoIP requests in real time?

Question

h-takahashi OP

Created Apr ’25

I am developing a system to send VoIP notifications to terminals with APNs.

I understand that the maximum JSON payload for a VoIP notification is 5 KB.

To send VoIP notifications to 3000 terminals, I am considering sending 3000 requests in parallel from my system to APNs. When APNs receives 3000 requests simultaneously, does it guarantee that the notifications will be sent to each terminal without a significant time lag?

Answer 1

Engineer OP

Apple

Apr ’25

Recommended

3000 notifications in parallel will not be an issue for APNs. But you will want to be sure that you have enough bandwidth for the requests and the responses, and that you have enough HTTP/2 connections, sufficiently spread across multiple APNs hosts. That way you don't DoS yourself by having too few parallel connections, which causes requests to queue, and you don't DoS APNs by pinning to one or only a few APNs hosts.

You would want to use as many hosts as you can afford, open as many connections per host as it has resources for, and use non-cached DNS requests to make sure you get a fresh IP address every time.

To make sure you are fairly balanced across multiple APNs hosts, we suggest the following:

do not bring up all the connections simultaneously
space out the opening of each connection by a few seconds
use uncached DNS requests to make sure you are getting a fresh reply every time
if you are able to, avoid reusing already used IP addresses for each connection (of course this depends on how many total connections you will have - repeats are unavoidable)
do not create a static list of IP addresses for reconnections, always use a fresh DNS query
once you create the connections maintain them as long as possible
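
The suggestions above can be sketched in Python. This is only an illustration under stated assumptions: `resolve_apns` and `plan_connections` are hypothetical helper names, the 2-second stagger is an arbitrary example value, and `socket.getaddrinfo` may still be answered from a local resolver cache, so truly uncached lookups depend on how DNS is configured on your host. The sketch only plans when and where to connect; opening and maintaining the HTTP/2 connections is left to whatever client library you use.

```python
import socket
from collections import Counter
from typing import Callable, List, Tuple

APNS_HOST = "api.push.apple.com"  # production host mentioned in this thread
APNS_PORT = 443


def resolve_apns() -> List[str]:
    """Ask the system resolver for the current APNs addresses.

    Assumption: whether this bypasses DNS caches depends on your resolver setup.
    """
    infos = socket.getaddrinfo(APNS_HOST, APNS_PORT, proto=socket.IPPROTO_TCP)
    addresses: List[str] = []
    for info in infos:
        ip = info[4][0]  # sockaddr is the fifth field; its first item is the IP
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def plan_connections(
    resolve: Callable[[], List[str]],
    count: int,
    stagger_seconds: float = 2.0,
) -> List[Tuple[float, str]]:
    """Return (delay_in_seconds, ip) pairs for `count` connections.

    Each connection gets its own lookup (no static IP list), openings are
    spaced out by `stagger_seconds`, and the least-used address is preferred
    so that repeats only happen when they are unavoidable.
    """
    uses: Counter = Counter()
    plan: List[Tuple[float, str]] = []
    for i in range(count):
        candidates = resolve()  # fresh query for every connection
        if not candidates:
            raise RuntimeError("DNS lookup returned no addresses")
        ip = min(candidates, key=lambda addr: uses[addr])
        uses[ip] += 1
        plan.append((i * stagger_seconds, ip))
    return plan
```

In a real sender you would call `plan_connections(resolve_apns, count)` lazily, performing each lookup right before its connection is opened rather than all at once, and then keep the resulting connections alive for as long as possible.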

How many push requests you can send per second per connection will depend on your available bandwidth.

Argun Tekant /  DTS Engineer / Core Technologies

Answer 2

DTS Engineer OP

Apple

Apr ’25

Argun has covered the general push side of this fairly well, so focusing on the voip side, the main thing I'll say is that whatever problems happen are basically "never" PushKit's or APNs' fault. I've wasted a lot of time investigating "why didn't my push arrive" and the answer has always been one of:

1. The device wasn't connected to a network.
2. The device was connected to a network, but that network was broken in a way that prevented pushes from reaching the device.
3. The push reached the device and the system processed it properly, but the target app then failed in a way that prevented delivery, or the target app had ALREADY failed so many times that the system gave up entirely.

The first two points are particularly important if you're planning to work primarily on WiFi, as that's where "all" of case #2 has occurred.

will be sent to each terminal without a significant time lag

The word "significant" is also tricky. Here's what I'll say:

My rule of thumb is that the latency between a server sending a push and that push reaching the app is "typically" around 4s and up to ~10s. Note that this does NOT include app delays like how long your app takes to launch.
There are cases where voip pushes can be delivered LONG (minutes/hours) after they should have expired. This happens when the device reestablishes its APNs connection and we deliver pushes that were queued at exactly the right/wrong moment.
All voip pushes should be sent with "apns-expiration=0". Using a non-zero expiration will greatly increase the frequency of expired push delivery.
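
As a rough sketch of the last point, the request for a single VoIP push could be assembled like this in Python. `build_voip_request` is a hypothetical helper, the device token and bundle ID are placeholders, and authentication (provider token or certificate) is deliberately left out. The header names and the `/3/device/<token>` path follow the APNs HTTP/2 provider API, and the size check uses the 5 KB VoIP payload limit mentioned in the question.

```python
import json
from typing import Dict, Tuple

MAX_VOIP_PAYLOAD_BYTES = 5 * 1024  # 5 KB limit for VoIP payloads


def build_voip_request(
    device_token: str, bundle_id: str, payload: dict
) -> Tuple[str, Dict[str, str], bytes]:
    """Return (path, headers, body) for one VoIP push request."""
    # Compact separators keep the serialized payload as small as possible.
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    if len(body) > MAX_VOIP_PAYLOAD_BYTES:
        raise ValueError(f"payload is {len(body)} bytes, limit is {MAX_VOIP_PAYLOAD_BYTES}")
    headers = {
        "apns-push-type": "voip",
        "apns-topic": f"{bundle_id}.voip",  # VoIP topic is the bundle ID plus ".voip"
        "apns-expiration": "0",  # do not store the push for later delivery
    }
    return f"/3/device/{device_token}", headers, body
```

The returned triple can be handed to any HTTP/2 client as a `POST` to the APNs host.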

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Answer 3

h-takahashi OP

Apr ’25

Argun and Kevin, thank you for sharing this very useful information.

Please tell me one additional point.

You mentioned that it will be distributed across multiple APNs hosts. If I send to “api.push.apple.com”, will it be distributed on the APNs side?

Answer 4

DTS Engineer OP

Apple

Apr ’25

You mentioned that it will be distributed across multiple APNs hosts. If I send to “api.push.apple.com”, will it be distributed on the APNs side?

One thing to understand here is that the process of submitting pushes to our servers isn't directly connected to the process of sending pushes to their final target. What's actually happening here is:

1. The target iOS device already has a connection (the push communication channel) to one of our servers.
2. You send a push to our server.
3. The push you submitted to server #2 is routed to server #1, which then delivers that payload.

The key point here is that when you submit payload "a" immediately followed by "b", the EXACT timing when they actually reach the target device has FAR more to do with:

Timing differences in the process of routing to their specific target server.
Differences in connection latency between the routes to the different devices.
Random chance.

I've never actually tried this, but my guess is that if you repeatedly sent a series of pushes to the same two devices and carefully tracked delivery time, what you'd see is either:

The "first" arrival changed fairly randomly. This happens when delivery performance of the two devices is similar enough that the "random" factors above determine ordering.
One device consistently arrived first, regardless of submission order ("a" arrives "first", even if you submit as "b, a"). This happens because one of the devices happens to have a "faster" delivery configuration than the other device, so it always wins.

Now, with that background, as the number of pushes you're submitting "at once" increases, the time required to process each payload does become a significant factor. That is, if you submit a single stream of 1 million pushes, it's pretty likely that "push 1" will arrive before "push 1 million". That's what the process Argun is describing addresses. That is, if you want to have "1 million" pushes arrive "at the same time", you do so by setting up multiple connections to our push servers, ideally in a way that ensures each connection is going to a different push server, then spreading your push requests across those servers.
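
One simple way to spread requests across connections is round-robin assignment. The following Python sketch is only an illustration: `spread_round_robin` is a hypothetical helper, and it assumes you already hold one open connection per batch.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def spread_round_robin(items: Sequence[T], connection_count: int) -> List[List[T]]:
    """Split `items` into `connection_count` batches, one per connection."""
    if connection_count < 1:
        raise ValueError("connection_count must be at least 1")
    batches: List[List[T]] = [[] for _ in range(connection_count)]
    for index, item in enumerate(items):
        # Consecutive items go to different connections, so no single
        # connection ends up carrying one long stream of pushes.
        batches[index % connection_count].append(item)
    return batches
```

For example, `spread_round_robin(device_tokens, 4)` yields four batches whose sizes differ by at most one, and each batch can then be sent over its own connection.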

Having said all that, I suspect you don't actually need multiple connections. Again, 3000 pushes isn't actually THAT many pushes, and my intuition is that the factors above will still be what mainly defines delivery order.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware
