|
I0OJJ > DPBOX 28.03.24 21:05l 297 Lines 14864 Bytes #999 (0) @ WW
BID : S3YI0OJJ_006
Read: GUEST
Subj: this is very good!
Path: IZ3LSV<I0OJJ
Sent: 240328/2005z @:I0OJJ.ITA.EU [Rome] obcm1.08-6-g5b69
From: I0OJJ @ I0OJJ.ITA.EU (Gustavo)
To: DPBOX @ WW
X-Info: No login password
just to read for learning something new after 24 years :)
73 and ciao, gustavo i0ojj/ir0aab/ir0eq
non multa, sed multum
Active Routing Protocol for Amateur Packet Radio BBS Networks
Berlin, Apr-27-2000
Hi,
this is a brief description of the router extension for the WPROT protocol.
The router extension is set up by Joachim Schurig, DL8HBS, Feb. 2000.
This is revision #1 of this protocol extension.
You should have basic skills in WPROT as defined by OE3DZW.
Preliminary:
============
Apart of reliability of operation and error protection in message exchange, the biggest
challenge today for a bbs system operating in store & forward mode is still the routing of
messages.
Many different models of routing algorithms and methods were proposed and used in the past.
Basically, there exist static and dynamic routing models.
Static models use an administrator setup ("forward files") to determine routing of messages.
Mostly, the setup consists of hierarchic definitions of forward zones, sub-zones and direct
neighbours.
Dynamic models try to compute a flexible route map by some kind of connection sampling and
quality determination.
Dynamic models may work passively (without router protocol) or actively (with use of a router
protocol).
Static models are broadly used. They are easy to implement, but their weak point is the
required administrator brainwork and their inflexibility in case of changing interconnection
of the involved stations. Also, they do not use the available ressources efficiently, as
routed traffic always follows the main road, even in case of congestion. The setup by the
administrator is a reasonable source of conflicts in routing. The ping pong and dead end
problems are such conflicts.
Passive dynamic models exist in three implementations:
a) hop counting algorithms
b) traffic time algorithms
c) combination of a + b
Implementation a) counts hops (stations) that messages took to reach the own station,
implementation b) (tries to) measure time that messages needed to reach the own station.
While method b) uses a promising approach, it lacks an exact measuring method for the traffic
time. The only and very unreliable way to get an idea about how long the message travelled is
the extraction of timestamps in the forward headers. We all know about the inaccuracy of
those timestamps. And,
beside of that, two more drawbacks exist: the algorithm can only check _used_ routes, and it
checks the wrong direction. Routing back to another station may
result in totally different parameters.
Nonetheless, there were and are implementations of these methods:
a) is used by TheBox, results are far from being usable. But it was worth the try.
c) is used by DPBOX, it works more or less acceptable in a closed network and with somewhat
correct computer clocks. But it is far away from being perfect,
and with fast networks with message transfer times of some seconds it does not work anymore,
because the timestamps in routing headers suppress seconds.
Basic idea for an active dynamic model:
=======================================
It is evident that real auto routing is only achievable with real throughput tests for
connections in forward direction. The results of those tests have to be communicated to the
other routers in the network, as all of them have to build routing tables.
Tests have shown that throughput tests are easy to implement, even without any kind of router
protocol: the own router only has to check constantly the transmission speed of forwarded
messages sent to its own neighbours. With the exception of some special situations, no extra
data is needed to collect these information.
To implement a router model on this data, a simple communication protocol is needed to
exchange the connection qualities between involved routers.
WPROT offers the flexibility for this task.
See the following network:
A --------------- B ----------------- C
| | |
| | |
D --------------- E F
| | |
| | |
G --------------- H ----------------- I
Each station (A through I) checks the connections to its neighbours.
Now, station A wants to make itself known in the network. It sends a status message with only
three information: its id (callsign), a timestamp, and a quality. When station A sends the
information, the quality is setup with 0.
This status message now is sent to stations B and D. Both stations add the measured quality
for their respective connection to A to the sent quality (0) of A's status message. Now, they
consult their routing table: If the timestamp of this message is newer than the timestamp of
the last received message, the new message is accepted. The old entry in the routing table is
overwritten, it now contains the new quality, and the routing points directly to A. Now,
stations B and D send the updated message to their remaining neighbours (not to A!), but at
this time, the quality in the status message is filled up with the new value computed at
stations B/D.
This status message now travels along the network. Each station adds the measured quality of
the forward connection at which it received the message.
If the message is received another time via another forward connection, the timestamp of the
message is equal to the timestamp in the routing table. Now, the stored routing quality is
compared with the quality of the later received message. If this quality is better than that
one already stored, the routing table is overwritten and the newly received message forwarded
to the remaining neighbours. The routing table now points to the neighbour from which the
later message was received. If the quality of the later received message is worser than the
already known quality of a message with the same timestamp, the message is thrown away.
To inhibit random switching of routing entries, two sets of routing entries are maintained:
one for the currently received routing messages, and another one with the best measurement
from the last broadcast interval. The latter one is actually used for routing of messages,
and its timestamp typically counts 5 hours less than that one of the current broadcast
interval.
Constraints:
============
There is nothing amazingly new with this routing model. It follows the concepts used in many
non-amateur radio networks, and even in amateur radio networks, TNN and Flexnet and possibly
many more use it. The only difference is: Those implementations work very closely with the
transport layer. The use of this concept for mailboxes results in an implementation of a
second network layer.
But stop complaining: nothing else is done (but much worser) by manual configuration of
forward routing by system administrators.
The most logical solution would require a paradigmatic change: It is the concept of store &
forward that requires a second routing layer. Consequently, other networks gave up the store
& forward concept and forward messages directly from station A to station I. Unfortunately,
the amateur radio network is very heterogeneous, and unreliable. Forwarding on a radio
connection from Berlin to Sydney is far from being reality, and even a connection from Berlin
to Vienna is hard to do. The store & forward system easily accomplishes this task.
It is out of the focus of this paper, but it shall be noted that there is another extension
being developed for the WPROT protocol allowing direct forward connections between stations
that easily can get connected, thus using the router of the underlying packet radio network
and no more store & forward.
The routing model described in this paper has the advantage that it is completely independent
of the underlying connection network, and indeed even forward connections by physically
transported floppy disks are thinkable :)
Parameters:
===========
As with all other routing protocols, the usability of this protocol is depending on some
chosen basic parameters. Among these are: update frequency of routing messages, forwarding
priority of routing messages, and chosen throughput measurement method.
These parameters determine the amount of network load created by the routing protocol, the
reliability of routing tables and their ability to adapt to changed network loads and
configuration, and its possible extension.
The possible maximum extension is restricted by the MAXHOPS parameter of the used WPROT
protocol itself. MAXHOPS is 50 (stations).
Broadcast frequency of messages was chosen with 5 hours.
Forward frequency of interchanged broadcast messages was chosen with 30 minutes.
As forward priority of WPROT messages is low, the network is updated in hours and not in
minutes. This meets the reality of packet radio mailboxes.
Implementation:
===============
The routing protocol is implemented as extension of WPROT. Please see the general description
of WPROT at other place.
Routing messages are single WPROT lines in a WPROT message. They use the following format:
<checksum> <Flag R> <Callsign of originating station> <protocol version> <ansi timestamp>
<hops> <quality>
All fields are mandatory.
Example:
F3 R DB0SIF 10 950307915 2 16404
F3 = checksum as usual with WPROT
R = flag for routing message
DB0SIF = call of originating station w/o hierarchical path
10 = protocol version (see WPROT definition)
950307915 = ANSII timestamp (11.02.2000 20:00:04)
2 = Hop count (to be increased by one at every station)
16404 = Quality (Seconds for transmission of 100 KBytes data) (this is an ASCII
representation of an unsigned long internal value)
Connection quality measurement:
===============================
Quality is measured by observation of sent data to neighbours of one's station. The only data
that reliably can be used without creating extra test data is sent messages of the running
forward. The only forward condition in which no additional idle time may be added by the
neighbour bbs is between start of message send (not start of proposal) and reception
acknowledgement. The time between both events is counted, and normalized to the amount of
needed seconds for 100 KBytes of data.
To reflect the gained bandwith when using a compressed forward connection, not the amount of
net data bytes is counted but the uncompressed size of sent
messages. This makes compressed forward links better than a link of same net bandwith with
uncompressed forward. This is desired.
Small amounts of data lower than 512 bytes are not used for measurements, because the
influence of other latency factors than bandwith gets reasonable. If the last measurement is
more than 60 minutes ago, all blocks of data, even small ones, are used until the next block
of data bigger than 512 bytes is sent.
The quality for the forward connection is the mean value of all measurements during the last
5 hours. This smoothens quality changes.
The last mean value is maintained, even when there were no new connections (and measurements)
during that period of 5 hours.
This value ages with 20 % of its initial value per hour, but at least with 1 (one) per hour.
After one day, it is reset to the default value for unchecked forward connections. This is
SHORT_MAX.
After three days, the quality is deleted from any table.
Maximum quality is 1 (i.e. 1 seconds for transmission of 100 KBytes of data)
Minimum quality is SHORT_MAX (i.e. 32767 seconds for transmission of 100 KBytes of data)
(Remind that these qualities are subsequently added when status messages pass the network,
so, data has to be stored as unsigned long)
If a link uses a non-interactive forward method, such as forward via import/export files, it
gets a quality of SHORT_MAX/2. For these links, no real checking is possible.
At each hop, an additional fixed aging of 100 and a percentual aging of 10% is added to the
quality.
There has to be added another aging, depending on the size of the forward queue for private
messages, but at the time being this is not yet implemented.
As DPBOX priorizes short messages and messages from different senders, this ain't worse (and
it renders it difficult to age a link due to the size of its forward queue...).
Obligation:
===========
This protocol requires that a station participating in the exchange of routing messages will
also use them for routing. That means: no forward of routing messages, if you do not auto
route messages for targets in the bypassing routing messages. Otherwise, loops will occur.
If you only want to participate silently, that means, analyzing all routing messages without
being part of the router network, you strictly have to take care that no updated message will
leave your system.
Additional targets:
===================
As not all stations in a network will adapt this protocol, it would at least be desirable if
direct neighbours to stations that use this protocol will create "phantom" messages. That
means: if a station detects that some of its direct neighbours does not create routing
messages by itself, it should create routing messages for this neighbour as if it had created
them by itself. To reduce network load, it is absolutely necessary to truncate the timestamp
of such messages to the basic broadcast interval chosen, so, to 5 hours UTC. Additionally, to
prevent interfering erroneously sent phantom broadcasts for stations that send own
broadcasts, the timestamp of phantom broadcasts is aged by another 3 * 5 hours (means 15
hours).
Conclusio:
==========
This protocol works really nice in praxi. If you decide to implement it in your own software
product, please carefully check the reference implementation in this software (DPBOX). You
have to use the same parameter set, as different parameters, especially for update frequency
and aging, would heavily interfere with this implementation.
The basic timing is defined in the source code file box_wp.h, and the basic processing in
box_wp.c AND box_rout.c. Please check them all, I tried to insert comments at necessary
places...
73 Joachim, DL8HBS
Read previous mail | Read next mail
| |