Goeller on Telecom
Traffic
A First Course in Telephone Traffic Engineering
Chapter 4: Erlang C and Queuing. Lost Calls Delayed
What Is a Queue?
When people wait in line to be served, they are standing in a
queue. The reason they stand in a queue waiting to be served is
because there are not enough servers to be sure that at least one is
available when somebody requests service. There is nothing wrong
with this. Servers are often expensive, and must be kept busy if
they are to earn their keep. If people are standing in line, as soon
as one is served, the next is in position to keep the server busy.
When it comes to telephone systems, queuing is often advertised
as the latest, greatest thing since sliced bread, guaranteed to save
vast sums of money on long distance calling. One must not be carried
away by such claims. Queuing is a tool, but it is only one of several
used in network design. With proper design, a good deal of money can
often be saved, and queuing may be just one part of that design.
Actually, there are three kinds of queuing: retries, off-hook or
hold-on queuing, and call-back queuing. With retries, you just keep
trying until you get through. With hold-on queuing, you stand there
clutching the phone, listening to commercials on hold, and when the
WATS line comes free and it's your turn, you are connected through.
Finally, with call-back queuing, you place your order and hang up.
When the appropriate facility comes free and you are next in line,
the system calls back; upon your answer, the call is put through to
the called party.
Retries can be used with any system, and without any special
training for the callers. However, the most aggressive callers get
the best service and only a relatively small improvement in circuit
occupancy is possible. Further, under overload conditions, the
number of retries can increase dramatically; since the system
control has to do just about as much work for an unsuccessful
attempt as for one that works, it is easy to overload the system.
This is particularly true with small PBXs that use a single
microprocessor; an 8080 or 6800 can only perform so many operations
per second.
Hold-on queuing is easy from the system design standpoint because
the caller is kept on the phone for the convenience of the system.
He does not make countless retries, and he does not have to be
called back and gotten on the line. When the trunk comes free, the
system connects him through. Typically, hold-on queuing is limited
to something on the order of 30 to 45 seconds. Then the user is
either given overflow tone or else is taken via some more expensive
facility. Because the average wait in queue tends to be so short,
little improvement in facility loading occurs.
Call-back queuing can permit waits of several minutes; thus, it
can be helpful in increasing the loading of expensive facilities,
particularly if they are in small groups. However, call-back may go
astray because hunting or call-forwarding in the PBX may deliver the
call-back to someone else, the PBX may wipe the user out of the
queue without letting him know (some PBXs do this if you make or
receive a call while waiting for call back), or the user may simply
be unavailable by the time his turn comes up.
Very short and very long queues are not particularly effective.
With regard to short queues, it was mentioned in Chapter 1 that the
average wait for the first person in line is the average holding
time of the call divided by the number of servers in the group. For
2 FX lines and 6-minute calls, this comes out to 3 minutes. If the
queue is half a minute long, the odds are one in six of getting a
facility before the queue times out.
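
The arithmetic above can be sketched in a few lines of Python. The
function name first_in_line_wait is our own, and the last figure
rests on an assumption the text does not make: if the wait for the
next free trunk is taken to be exponentially distributed, the chance
comes out a little under one in six.

```python
import math

def first_in_line_wait(holding_time_min, servers):
    """Average wait, in minutes, for the caller at the head of the line."""
    return holding_time_min / servers

wait = first_in_line_wait(6.0, 2)     # 2 FX lines, 6-minute calls
simple_odds = 0.5 / wait              # half-minute queue limit vs. average wait
# Assumption (not in the text): exponentially distributed wait for a free trunk.
exp_odds = 1.0 - math.exp(-0.5 / wait)

print(f"average wait for first in line: {wait:.1f} minutes")
print(f"chance of a trunk within 30 seconds (simple ratio): {simple_odds:.3f}")
print(f"same chance under the exponential assumption: {exp_odds:.3f}")
```

Either way, a half-minute queue on a group this small rarely ends
with the caller getting a trunk.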
For long queues, with waits of ten minutes or so, the trunk
group is running just about full; since a server of any sort cannot
do more than one hour's work in one hour, there is little to be
gained in terms of circuit occupancy by delaying the caller any
longer. Thus practical queuing systems hold people in queue for
perhaps one to five minutes, and then give the caller a chance to go
via some more expensive facility.
Running Erlang C
There are two programs on the disk for Erlang C: ERL-C1 and
ERL-C2. Let's call up ERL-C1 first and see what it gives us. Just as
with ERL-B, the program asks us for offered traffic. One of the
principal assumptions of Erlang C is that nobody drops out of the
queue, and callers use the system for one full holding time once
they are served. This is the "lost calls delayed" assumption.
Offered and carried traffic are, by definition, the same. We know
people drop out of the queue when the wait gets long, but Erlang C
works surprisingly well, all things considered.
Let's give the program the same amount of traffic we used in the
last chapter: 14.63 Erlangs. The computer broods for a few seconds,
and then pumps out a screen-full of data, shown in Printout 4-1.
Table III in Appendix II shows the same sort of information. When we
get it, what do we have? Well, N, as before, is the number of trunks
in the group, and P, as usual, is the probability that all N
trunks in the group are busy. Here, this turns out to be the
probability that a call, arriving at random, will go into the queue.
If we have fewer than N calls in the system, there will naturally be
no queue.
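
The listing of ERL-C1 is not reproduced here, so the following is
only a sketch of how the P column could be computed, assuming the
program uses the standard Erlang C formula. It builds up the Erlang B
blocking probability one trunk at a time and then converts it to the
probability of delay. The function name erlang_c is our own.

```python
def erlang_c(a, n):
    """Probability that an arriving call finds all n trunks busy and queues.

    a is the offered traffic in Erlangs, n the number of trunks (n > a).
    """
    if a >= n:
        raise ValueError("need more trunks than Erlangs for equilibrium")
    b = 1.0                              # Erlang B blocking with zero trunks
    for k in range(1, n + 1):
        b = a * b / (k + a * b)          # Erlang B recursion, one trunk at a time
    return n * b / (n - a * (1.0 - b))   # convert Erlang B to Erlang C

# The chapter's example: 14.63 Erlangs offered to 15, 16 and 17 trunks.
for n in (15, 16, 17):
    print(n, round(erlang_c(14.63, n), 3))
```

The recursion avoids large factorials, which matters little here but
keeps the arithmetic well behaved for bigger trunk groups.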
The next four columns, headed D1, D2, Q1 and Q2 are measures of
Grade of Service. Let's start with Q1 and Q2 first, because they are
easier to explain. Q1 tells us the average number of calls waiting
in queue over the entire hour. But P is less than 100%, so we know
that for some portion of an hour, there will be no queue at all. Q2
allows for this. It is the average number of calls in the queue
during just that part of the hour that all trunks are busy. Because
we have the same number of calls waiting for service, but we now
recognize that they are waiting only over a part of the hour, Q2 is
always bigger than Q1.
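
The relation between Q1 and Q2 can be sketched as follows. This
assumes the standard Erlang C results for mean queue length, which
may or may not be exactly how ERL-C1 arrives at its figures; the
names erlang_c and queue_lengths are our own.

```python
def erlang_c(a, n):
    """Probability of delay for a Erlangs offered to n trunks (n > a)."""
    b = 1.0
    for k in range(1, n + 1):
        b = a * b / (k + a * b)          # Erlang B recursion
    return n * b / (n - a * (1.0 - b))

def queue_lengths(a, n):
    """Return (Q1, Q2): mean calls waiting over the whole hour, and
    mean calls waiting during only the time all trunks are busy."""
    q2 = a / (n - a)                     # queue length while all trunks are busy
    q1 = erlang_c(a, n) * q2             # spread over the entire hour
    return q1, q2

q1, q2 = queue_lengths(14.63, 15)
print(f"Q1 = {q1:.2f} calls, Q2 = {q2:.2f} calls")
```

Since Q1 is just Q2 multiplied by P, and P is less than 1, Q2 comes
out larger for any trunk group.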
Now, let's look at D1 and D2. What are they? Well, D1 is the
average wait in queue for all calls, and D2 is the average wait in
queue for only those calls that get into the queue in the first
place. Note the similarity with Q1 and Q2. But what are the units in
which D1 and D2 are measured? They are not CCS, minutes or hours.
They are "holding times." D1 and D2 measure average delay in terms
of how long, on the average, people talk when they finally get a
circuit. D1 is used to plot the dashed curves in Fig. 3-1. A glance
back at that figure shows that a D1 of 0.1, or about 30 seconds on a
5-minute call, gives about the same occupancy as B.15, while a D1 of
0.5 or 1.0 gives appreciably better occupancy on small trunk
groups.
We see that 14.63 Erlangs offered to 15 trunks produces a D1 of
2.41 holding times. If we know that the average holding time is 5.35
minutes from other measurements, then we know that, on the average,
a call arriving at random can expect to wait for 12.89 minutes, or
about 12 minutes and 53 seconds. Some calls, however, will luck out
and will arrive just when random traffic fluctuations have let the
queue clear out and leave a trunk or two free. This small proportion
of calls will not be subject to any delay. Thus the delay prorated
over just the calls delayed is 2.7 holding times, or about 14.45
minutes.
Adding just one more trunk does wonders for service. D1 drops to
0.47 holding times (2.51 minutes), and D2 becomes 0.73 (3.91
minutes). Both delays are much shorter, but we see the ratio of D2
to D1 increase greatly. That's because, of course, a much smaller
proportion of calls get into queue in the first place.
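
For readers who want to check these figures, here is a minimal
sketch, assuming the standard Erlang C delay formulas: D2 is 1/(N - A)
holding times and D1 is P times D2. The function names and the
HOLD_MIN constant are our own.

```python
def erlang_c(a, n):
    """Probability of delay for a Erlangs offered to n trunks (n > a)."""
    b = 1.0
    for k in range(1, n + 1):
        b = a * b / (k + a * b)          # Erlang B recursion
    return n * b / (n - a * (1.0 - b))

def delays(a, n):
    """Return (D1, D2) in holding times."""
    d2 = 1.0 / (n - a)                   # average wait of calls that do queue
    d1 = erlang_c(a, n) * d2             # average wait over all calls
    return d1, d2

HOLD_MIN = 5.35                          # measured average holding time, minutes
for n in (15, 16):
    d1, d2 = delays(14.63, n)
    print(f"N={n}: D1={d1:.2f} ({d1 * HOLD_MIN:.2f} min), "
          f"D2={d2:.2f} ({d2 * HOLD_MIN:.2f} min), D2/D1={d2 / d1:.2f}")
```

The sketch multiplies unrounded delays by the holding time, so the
last digit of a figure in minutes may differ from one worked out by
hand from the two-decimal values in the printout.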
The remaining columns in the display, P8, P4, P2, P1 and PP, are
simply the probability that a call coming in at random will be
delayed more than an eighth, quarter, half, one or two holding
times. With 15 trunks, we see that the probability of a wait longer
than two holding times (PP) is 0.43; if we add one trunk, that
probability drops to 0.04.
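
These columns can be sketched with the usual Erlang C waiting-time
distribution, which assumes calls are served in order of arrival and
holding times are exponentially distributed. Whether ERL-C1 uses
exactly this expression is an assumption on our part, as are the
function names.

```python
import math

def erlang_c(a, n):
    """Probability of delay for a Erlangs offered to n trunks (n > a)."""
    b = 1.0
    for k in range(1, n + 1):
        b = a * b / (k + a * b)          # Erlang B recursion
    return n * b / (n - a * (1.0 - b))

def prob_wait_exceeds(a, n, t):
    """Probability that a random call waits more than t holding times."""
    return erlang_c(a, n) * math.exp(-(n - a) * t)

# Columns P8, P4, P2, P1 and PP for 14.63 Erlangs on 15 and 16 trunks.
for n in (15, 16):
    row = [prob_wait_exceeds(14.63, n, t) for t in (0.125, 0.25, 0.5, 1.0, 2.0)]
    print(n, " ".join(f"{p:.2f}" for p in row))
```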
Some Problems with Queuing
D1 and D2 help us get a feel for a problem that shows up in
queuing situations and other places as well. The standard way of
defining delay is in terms of holding times. But ten 5-minute calls
and five 10-minute calls represent the same amount of total traffic
in Erlangs (0.83). If we think about this, we see that delay in the
queue, as experienced by the individual caller, is only partially a
function of the total amount of traffic, but it is a direct function
of holding time. Thus 10-minute calls will produce twice the delay
of 5-minute calls.
Let's try just this problem. Run ERL-C1, and give it 0.83 Erlangs
(five 10-minute or ten 5-minute calls). We immediately see that with
a single trunk, D1 will be 4.88 holding times and D2 will be 5.88.
For 5-minute calls, we multiply these values by 5 minutes per
holding time to get better than 24 and 29 minutes delay,
respectively. For 10-minute calls, we multiply by 10 minutes rather
than 5, and get delays of about 49 and 59 minutes. Thus we see that
for the same amount of traffic, a queued caller has to wait twice as
long when calls last twice as long. If we used two trunks, a D1 and
D2 of 0.21 and 0.85 holding times become 1.05
and 4.25 minutes for 5-minute calls, and 2.1 and 8.5 minutes for
10-minute calls. This is a considerable improvement in service, but,
with the same traffic and twice as many trunks, our occupancy has
been cut in half.
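
The comparison can be reproduced with a short sketch, again assuming
the standard Erlang C formulas. The traffic is rounded to 0.83
Erlangs, as in the text, before it is handed to the calculation; the
function names are our own.

```python
def erlang_c(a, n):
    """Probability of delay for a Erlangs offered to n trunks (n > a)."""
    b = 1.0
    for k in range(1, n + 1):
        b = a * b / (k + a * b)          # Erlang B recursion
    return n * b / (n - a * (1.0 - b))

def delays(a, n):
    """Return (D1, D2) in holding times."""
    d2 = 1.0 / (n - a)
    return erlang_c(a, n) * d2, d2

a = round(10 * 5 / 60, 2)    # ten 5-minute calls in the hour, as 0.83 Erlangs
for n in (1, 2):
    d1, d2 = delays(a, n)
    for hold_min in (5, 10):
        print(f"{n} trunk(s), {hold_min:2d}-minute calls: "
              f"D1 = {d1 * hold_min:5.2f} min, D2 = {d2 * hold_min:5.2f} min, "
              f"occupancy = {a / n:.0%}")
```

The delay in holding times depends only on the traffic and the number
of trunks; the holding time enters as a plain multiplier.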
Long holding times make for long delay in queue, and people react
accordingly. It turns out that, the harder it is for people to get a
trunk, the longer they keep it once they have it. Thus a long wait
in a queue tends to make for longer holding times, which increases
the delay in queue. Without some sort of countermeasure, queuing
tends to be self-defeating.
Another problem with queuing involves "tandem" systems, systems
where connections are established through several trunks, one after
another. We might, for instance, use a tie-trunk to go from PBX A to
PBX B, and then use an FX line to get into a Central Office remote
from PBX B (and, presumably, even more remote from PBX A). If we are
queued for access to the tie-trunk and then, when our call arrives
at PBX B, we are again queued for access to the FX line, the
tie-trunk will be occupied with a lot of non-productive holding
time (the time we are waiting in queue for the FX line to come
free). This increases the occupancy of the tie-trunk and thus tends
to reduce its cost per minute, but it gives an incorrect picture of
the costs involved. It does no good to tie up a facility with dead
waiting time; business is not being transacted. Further, while the
tie-trunk is held this way, nobody else can use it to transact
business. Thus,
one should not, in general, queue for facilities used in tandem
connections.
There is one more problem. One should not queue on two-way
facilities such as FX lines and tie-trunks. Consider the first of
two basic situations you can encounter: one PBX is smart and can
queue while the machine on the other end is dumb and cannot. We
queue to get higher facility occupancy. We get the most occupancy
when an all-trunks-busy (ATB) condition obtains. When we do not have
ATB, we do not
have a queuing situation. Thus, it may be said that the purpose of
queuing is to maintain ATB.
Now, assume we have ATB on our trunk group. Both ends have calls
to place. The smart end has them stacked neatly in a queue, while
the other end simply experiences users retrying over and over. As
soon as a trunk comes free, the smart end pounces on it and sets up
the call. People retrying haven't got a ghost of a chance of showing
up just when the trunk comes free and before the smart end reseizes
with electronic speeds. Thus the retrying callers are frozen out.
Now, let us consider the second variation: both ends are smart
switches and have queuing in effect. As soon as a trunk comes free,
both ends pounce on it and connect signaling equipment to tell the
far end what they want. This produces a condition known as glare.
Both senders want to talk, neither wants to listen, and, depending
on the hand-shaking routines in effect, they can have quite a bit of
trouble figuring out what to do next. It follows that one should
not, at the present state of the art, queue on two-way facilities.
In the future, when Common Channel Interoffice Signaling (CCIS)
is more widely available, these problems will be greatly reduced. If
one has a CCIS data link between smart PBX control computers, they
can talk to each other and work out some sort of compromise. But at
present, most PBXs and older COs communicate with each other over
the same trunk that will later be used for human conversation. The
protocols used (wink start and delay dial, in particular) are based
on telegraph techniques in use prior to World War I, and leave
much to be desired. In particular, the signal that tells the distant
switch that a trunk has been seized (off hook) is exactly the same
as the one used to acknowledge seizure from the distant end. Tricky
timing is used, but there are other, better ways to detect and deal
with glare. For instance, off-hook could still be used as a seizure
signal, but dial tone could be used to acknowledge seizure for
machines, just as it is for people. Unfortunately, this idea is
almost impossible to sell.
More on Delay
Having delay expressed in holding times is useful for theoretical
purposes, and for printing out paper tables. But computers can make
the conversion to "real" units such as seconds or minutes far
better and easier than people can. Thus ERL-C2 is provided. It
requests the average holding time in addition to the offered
traffic, and provides D1 and D2 in seconds. Further, the columns
over on the right hand side give the probabilities that a call,
arriving at random, will be delayed longer than 10, 30 and 60
seconds, and 5 and 10 minutes. Run the program for 6 Erlangs and
5-minute calls and then for 10-minute calls, and see how the delays
change (or look at Printout 4-2). For different average holding
times, different delays appear. When using ERL-C2, the "real" units
are quite convenient. However, they depend on a separate measurement
of holding time for their accuracy. Assuming a holding time that is
too long or too short can lead to problems.
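
A rough stand-in for ERL-C2 can be sketched as follows. It assumes
the standard Erlang C formulas with first-come, first-served service,
and the column layout is our own guess rather than a copy of
Printout 4-2. The function names are hypothetical.

```python
import math

def erlang_c(a, n):
    """Probability of delay for a Erlangs offered to n trunks (n > a)."""
    b = 1.0
    for k in range(1, n + 1):
        b = a * b / (k + a * b)          # Erlang B recursion
    return n * b / (n - a * (1.0 - b))

def erl_c2_row(a, n, hold_s):
    """Return P, D1 and D2 in seconds, and the chance of waiting longer
    than 10, 30 and 60 seconds and 5 and 10 minutes."""
    p = erlang_c(a, n)
    d2 = hold_s / (n - a)
    d1 = p * d2
    limits_s = (10, 30, 60, 300, 600)
    tails = [p * math.exp(-(n - a) * t / hold_s) for t in limits_s]
    return p, d1, d2, tails

for hold_s in (300, 600):                # 5-minute, then 10-minute calls
    print(f"6 Erlangs, {hold_s // 60}-minute calls")
    print(" N     P   D1(s)   D2(s)  >10s  >30s  >60s   >5m  >10m")
    for n in range(7, 10):
        p, d1, d2, tails = erl_c2_row(6.0, n, hold_s)
        cols = " ".join(f"{t:5.2f}" for t in tails)
        print(f"{n:2d} {p:5.2f} {d1:7.1f} {d2:7.1f} {cols}")
```

Doubling the holding time doubles D1 and D2 in seconds and raises
every probability of exceeding a fixed wait, even though the traffic
in Erlangs is unchanged.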
Both ERL-C1 and ERL-C2 print out their information starting with
N being the next larger integer than the offered traffic in Erlangs.
Why? The reason lies in one of the basic assumptions made in the
derivation of Erlang C: nobody drops out of the queue. Although this
is an accurate model for cars on the George Washington or Golden
Gate bridges, it is hardly true for telephone callers. But this is
the assumption. Now, if the offered traffic equals the carried
traffic, it follows that one must have more trunks than
there are hours of use per hour. If one has fewer, or even exactly
as many, the queue
will get longer and longer, and the condition of "statistical
equilibrium" cannot be met. If calls come in faster than they go
out, hour after hour, even a queue can't help you.
Let's go one step further. Let's assume our traffic is 5.5
Erlangs. With six trunks handling 5-minute calls, we will have a D1
of less than 8 minutes (469 seconds), and we can handle the load.
But now, suppose we have an unusually busy afternoon so that our
busy hour traffic is 10% heavier than usual. Normally, one might not
think of a 10% overload as being much of a problem. But adding 0.55
Erlangs to the 5.5 for which we designed takes us beyond 6 Erlangs,
and, as a result, the program tells us we need 7 trunks minimum.
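
Both points can be checked with a short sketch: the smallest trunk
group the model will accept, and the 469-second figure for six
trunks. It assumes the standard Erlang C formula; min_trunks is our
own name for the "next larger integer" rule.

```python
import math

def erlang_c(a, n):
    """Probability of delay for a Erlangs offered to n trunks (n > a)."""
    b = 1.0
    for k in range(1, n + 1):
        b = a * b / (k + a * b)          # Erlang B recursion
    return n * b / (n - a * (1.0 - b))

def min_trunks(a):
    """Smallest trunk count the model allows: the next integer above a."""
    return math.floor(a) + 1

design = 5.5                             # design load in Erlangs
busy = design * 1.10                     # an afternoon 10% heavier than usual
d1_s = erlang_c(design, 6) / (6 - design) * 300   # 5-minute calls, in seconds

print(f"design load {design}: at least {min_trunks(design)} trunks")
print(f"10% overload {busy:.2f}: at least {min_trunks(busy)} trunks")
print(f"D1 at design load on 6 trunks: {d1_s:.0f} seconds")
```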
It is instructive to run ERL-C1 or ERL-C2 for 5.5, 5.6, 5.7
Erlangs, etc., and watch D1 and D2 shoot up. It also illustrates a
very important fact: the fuller you run a trunk group during average
times, so that you can get good economies, the more subject you are
to the troubles produced by overload. When designing a system, it
always pays to check what happens with a 5, 10 and 20% overload. If
you use some kind of an "average" busy hour, you can expect some
"busy" busy hours to exceed it as a matter of arithmetic. If you
don't know what to expect, you haven't really designed your network.
Note that overloading is not limited to queuing systems. There are
other ways to pack trunk groups full, and when you do, you have less
reserve space available to help soak up excess traffic under
overload.
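
An overload check of this kind is easy to script. The sketch below
assumes the standard Erlang C formula and 5-minute calls on six
trunks; the function names are our own, and a load the group cannot
carry in equilibrium is reported rather than computed.

```python
def erlang_c(a, n):
    """Probability of delay for a Erlangs offered to n trunks (n > a)."""
    b = 1.0
    for k in range(1, n + 1):
        b = a * b / (k + a * b)          # Erlang B recursion
    return n * b / (n - a * (1.0 - b))

def d1_seconds(a, n, hold_s):
    """Average delay over all calls in seconds, or None when the load
    is too heavy for the group to reach equilibrium."""
    if a >= n:
        return None
    return erlang_c(a, n) * hold_s / (n - a)

design, trunks, hold_s = 5.5, 6, 300
for pct in (0, 5, 10, 20):
    load = design * (1 + pct / 100)
    d1 = d1_seconds(load, trunks, hold_s)
    result = "queue grows without limit" if d1 is None else f"D1 = {d1:.0f} s"
    print(f"{pct:2d}% overload ({load:.2f} Erlangs): {result}")
```

Replacing the percentage loop with loads of 5.5, 5.6, 5.7 and so on
shows the same steep climb in delay as the group approaches 6
Erlangs.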
Queuing produces higher occupancy and, as a result, it can
sometimes produce a lower cost per minute of use. Indeed, we almost
always queue to get lower cost per minute. But when we use
specialized carriers or toll, cost per minute is independent of
circuit occupancy. What then?
Then, of course, there is usually no point in queuing. So it
becomes obvious that the way in which facility costs are billed has
a major impact on how we design networks. There are no abstract
principles of traffic theory that produce ideal solutions all by
themselves. We must understand how pricing techniques and traffic
interact to produce optimization. The next chapter introduces some
of the ideas that relate cost and traffic.
Summary
The Erlang C formulas assume offered = carried traffic; that is,
nobody drops out of the queue. A call stays in the queue until
served, and then uses the facility for one holding time.
Unfortunately, humans delayed in a queue tend to keep the trunk
longer when they finally get it. Because delay is based on average
holding time, queuing tends to be self-defeating. Queuing can
greatly improve occupancy for small trunk groups (see Fig. 3-1) but
does little for larger groups which run naturally at higher
occupancies. No matter how clever the queuing algorithm in modern
telephone equipment, it cannot put 6 Erlangs into 5 trunks in one
hour.