Acknowledgements: 

A lot of this material is based upon the fine introductory book: 
"An Engineering Approach to Computer Networking," by Srinivasan
Keshav, Addison-Wesley. See the link to it on the suggested texts area
on the course homepage.

==================================================================
1. What is routing ? 

- Routing is the process of finding a path from a source to every
  destination in the network.
- Operationally, it means a protocol to set up forwarding
  tables at all intermediate nodes in the network.

More carefully defined, it is the function of setting up:
     - (layer 3) forwarding tables at routers (intermediate nodes)
        such that, 
     - with a high probability, a (loop free) path from any source to any
       destination can be found 
     - by concatenation of links and forwarding (intra-router links), 
     - and such a path satisfies a well-defined optimization criterion

==================================================================


2. How does it differ from forwarding ? 

- Given a packet at a node, finding which output port it needs to go to is
  called  "forwarding."  
        - It is a per-node function. You need a forwarding
          table for any non trivial forwarding. Forwarding can also be imagined as
          setting up a link from one port to another within a node.

==================================================================
3. Can you explain the "careful definition" of routing even more
carefully ?


Sure !

- We instinctively think of a "path" as having multiple hops.
  Graph theory tells us that a "path" cannot have a loop (nor can
  a tree for that matter)

    - Routing is trivial if it is a single hop -- forwarding is
      enough. For example - routing is trivial on a broadcast LAN or a
      fully connected mesh. Observe that you do not need even
      "intermediate nodes" in this topology. Routing is also trivial
      in a ring topology - esp a unidirectional one. 

    - The first time you require control intelligence is when you
      require a forwarding table - a multi hop network. But "learning
      bridges" provide simple ways of constructing a forwarding
      table. But this is layer-2 forwarding. We know layer-2 ends at
      the boundary of the broadcast domain -- so if you want to cross
      that boundary -- you need layer 3 forwarding (forwarding tables
      and hence routing).  

    - Note that you could get by without routing and a forwarding
      table by forwarding a packet to a random output port and praying
      that it will reach the destination. But this is obviously
      inefficient, and you might run into loops. Routing aims for
      loop-free paths with a high probability.

   - Routing should first minimize the probability of looping, and
     then should build paths such that some optimization criterion is
     satisfied -- eg a least # of hops path is built by the Dijkstra's
     algorithm.

      [More specifically, forwarding is a "local" operation whereas the
       forwarding table must be setup based upon knowledge of the global
       topology. So a routing protocol must communicate global
       topological information to each forwarding node to allow it to
       make a local routing decision (input port to output port)]


Miscl:
   - By definition, routing is a "control path" function, not a "data
     path" function.

    - a "path from source to destination" may be setup dynamically on
      a per-connection basis (eg: virtual circuit routing) or in the
      background using a distributed algorithm. Random routing is
      example of per-packet routing -- not efficient.


==================================================================

4. What are the design constraints in routing ? 

- Minimizing routing table space: the routing table should grow more
  slowly than the number of destinations in the network. Since hashing
  and tree structures are used for route lookup, the lookup complexity
  is dependent upon the size of the table. This is why we need
  aggregation - so that worst case route table sizes become
  proportional to the smaller address space.


- Minimizing control messages: The amount of control messages also
  depends upon the number of destinations (and the route table
  size). This is one of the more pressing scalability reasons why
  aggregation is needed.

- Robustness: the probability of routing loops, routing oscillations
  and misrouting ("black holes") should be minimal. Authentication,
  checksumming, other consistency checks may be required to protect
  against corrupted routing tables, corrupted packets, fake peers etc.


- Optimal paths: routing protocols may try to optimize several metrics
  (distance, cost, resources used for QoS, policy) etc. It is
  important that the routers which cooperate to achieve routing have
  consistent views of the topology in terms of these metrics and use
  consistent interpretations to setup forwarding tables.


==================================================================

5. Ok, so what are the types of routing out there ? 

a) Centralized vs distributed: 
- For small domains which are centrally administered, a server could
  collect topology and find routes and send the tables to each node.
- Centralized routing does not scale and is a single point of
  failure. Most routing in reasonably large domains and internets is
  distributed.

b) Source-based vs hop-by-hop: 
- Source-based: A source may specify entire route. Also called
  explicit routing.
  Loose-source routing => source specifies partial route through which
  packet must pass through. Supported in IPv4 and v6 as options.
- source based routing suffers from unbounded header size; and
  requires the source to be aware of topological changes in the
  network.
- Internet routing is hop by hop where the packet carries the destination
  address and the routers at every hop determine forwarding based upon
  the destination address and routing information setup by a routing
  protocol.
- But source routing is not unreasonable if the route is setup once
  (eg in VCs), and short labels (instead of long addresses) are
  assigned and used at every hop (eg ATM). In fact, explicit routing
  is desired by large ISPs who would like to pin routes for delivering
  QoS, define large traffic aggregates called "trunks" and map them to
  specific paths -- this area is known as "traffic engineering"


c) Stochastic vs deterministic: 
- Deterministic => forwards packets to destination (or packets of a
  given flow/trunk) along exactly one path. 
- Stochastic => randomly choose among a set of paths to forward
  packets. While this may provide some load balancing, the benifit is
  outweighed by the common case problem of out-of-order delivery. As a
  result modern networks use deterministic routing.

d) Single vs multiple path: 
- In multiple-path (typically in telephony and soon to be introduced
  in core domains of the Internet), traffic trunks may go through a
  primary path or a set of alternative paths. This approach requires
  more routing table space and Internet routing so far has used single
  path routing.

e) Dynamic vs Static (aka State-dependent vs State independent):
- State dependent = "dynamic" routing. Do not ignore network
  state. Can find better routes than static routing in the presence of
  changes in the network state, but can have oscillatory problems.
- State independent = static: preconfigured routes

f) Distance vector vs link state: 
- Distance vector and link state are two types of algorithms used in
  packet-switched networks. The reason for their popularity is that
  they allow a router to find *global* routing information by
  exchanging routing information with only its *neighbors*. In
  distance vector you exchange distances to every destination; whereas
  in link state you exchange distances to every neighbor (but this
  information is broadcast to every node in the network).
- Distance vector routing is like setting up signposts at
  intersections of roadways (equivalent of a router in computer
  networks). You set up a signpost with an initial value and then
  update it by looking at your neighbors signposts. A common updation
  algorithm is called Bellman-Ford algorithm.
- Link state routing is like getting an entire map at every
  intersection (i.e. a router) and plotting optimal routes based on
  that map, and then sending the packet along the next hop of the
  optimal route calculated. A common shortest path algorithm is called
  Dijkstra's algorithm.
- Distance vector may converge slowly after link failures (called the
  "count to infinity problem"), but boasts minimal per-router state
  and computation complexity. Distance vectors approach using
  path-vectors (where the entire path to the destination is
  maintained) converges much faster, but maintains more state and has
  increased computational complexity.
- Link state does not have the looping & convergence problems, but has
  a lot of complexity associated with disseminating the "network map"
  to all routers in a robust manner. It allows maintainence of
  multiple maps based upon different metrics, but the consistent use
  of maps during forwarding has to be managed through some other
  means. 

[More about distance vector vs link state is in the class handouts]

==================================================================

5. How does telephony routing work ? 

- Telephone routing is SIMPLE.

- This is because historically the electromechanical relays used for
network control could not execute sophisticated programs. Instead, the
switches and links were made to be extremely reliable.

- Further, the national telephone network is structured as a fairly
rigid three-level hierarchy, where the levels represent subscriber
telephones or modems, central office switches and long-distance or
toll switching offices. The network of toll switches is called the
"core".

- Specifically, subscribers are directly connected to central office
switches (eg: Lucent's 5ESS switches) through the local loop. Many
central offices exist in a Local Exchange Carrier (LEC) such as Bell
Atlantic. Every central office switch logically connects to one (or
rarely, two) nodes at the core.

- The switches in the core are fully connected (a mesh topology) at
great cost! (in comparison Internet cores are sparsely
connected). This full core connectivity combined with reliable
components means that routing is not needed if you just desire
connectivity !

- The reason routing is still needed is because a) a call also
requires resources to be reserved, b) users expect low blocking
probability

- These tasks are also alleviated by the fact that telephone traffic
is very predictable - arrivals are poisson and call duration has a 3
min average. Moreover, call patterns during every interval of every
day of the year is known and they repeat with striking consistency.

- So routes are precomputed and downloaded into switches for different
intervals in a day using this knowledge !

- The default route in the core is always a one-hop logical path. The
only complexity is to allow alternate routes to obtain a low blocking
probability. To do this, the telephone system computes an ordered set
of two-hop paths which are tried in the predefined order if the
one-hop path is full! This is known as "Dynamic nonhierarchical
routing (DNHR)." The process in which a call is rejected on a primary
path and is retried on an alternative path is called "crankback" -
which represents the only "dynamism" in the routing protocol !

- DNHR suffers from a problem known as "metastability" where, under
heavy load, all calls alternate between using a two-hop path or a one
hop path. To counter this a scheme called "Trunk status map routing
(TSMR)" was developed where a centralized server would make minor
modifications based upon measured dynamics if necessary. The latest in
the series of such algorithms is "Real-time network routing (RTNR)"
where the routing changes are more distributed (controlled by the
ingress toll switch). RTNR is very effective: AT&T's network carries
upto 260 million calls on busy days and only about one or two are
blocked at the core !!


Perspective: 
============

A hundred billion $$ question is to use newer switching systems which
use sparser interconnection, and reduce stringent requirements for
connectivity and component reliability, but provide the features of
low blocking probability, very high reliability and global coverage!
ATM provides some of these facilities with flexible QoS routing. But
what is dragging back operators is the complex switching, billing and
signaling systems and not the routing algorithms.

New long distance companies like Qwest (and some transforming
companies like MCI Worldcom/UUnet) have the opportunity to use
advanced Internet routing techniques to bypass the requirements
mentioned above. But the challenge there is to develop the same level
of availability and manageability. By throwing bandwidth at the
problem (2 Tb/s according to Qwest using new fiber and WDM), several
of these problems are simplified. The only hitch now, however, is that
revenue emphasis is moving from voice to data services, i.e. the need
is to provide QoS for data (which is not at all
predictable!!). Diff-serv and MPLS are now coming to the rescue ...

Note that in other countries, telephone is a monopoly, and the cost of
the telephone system includes both the local loop and the long
distance, and 80% of the total system cost is in the local loop, and
90% of that cost is in the labor of installation !!