Monday, September 29, 2008

Solving problems in the early days of wireless

Is it weird that this paper from '94 feels like ancient CS history? Maybe it is just in my head, but reading any paper from PARC causes me to assume the paper is likely old and likely influential.

Summary:
This paper picked up where Phil Karn left off with MACA, which was "just an idea" Karn had introduced to handle the hidden terminal scenario more effectively than CSMA.

They proposed that collisions should be detected and avoided at the receivers, an idea pioneered by Karn with Multiple Access Collision Avoidance ("MACA is an acronym, not a Spanish word"). This is in contrast to detecting collisions from the sender's perspective, as was done with CSMA (Carrier Sense Multiple Access).

As Karn said, "Much simulation and experimental work needs to be done to answer many questions about how well it [MACA] will really work." Shenker et al. took this to heart and proposed MACA for wireless, or MACAW, which addressed some of the issues they observed in MACA, such as its use of Binary Exponential Backoff (BEB), which quickly led to unfairness in collision resolution.

They also address a problem for nodes [A] [B] [C]: if C overhears an RTS from B to A, it cannot tell whether B is actually transmitting, since "C cannot tell if the RTS-CTS exchange was a success and so does not know if B is indeed transmitting data." They chose to solve this by having the sender emit a Data Sending (DS) packet before the data.

They also introduce, with nearly comic academic timing, the Request-for-Request-To-Send (RRTS) packet, which solves another fairness problem: two receivers in range of each other, with senders out of range of each other, where one pair transmits much smaller packets than the other. The unfairness the RRTS solves feels a lot like the problem introduced by BEB.
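For my own reference, here is a toy summary of the control packets (RTS, CTS, DS, plus RRTS) and the deferral rule an overhearing node applies, as I understand the paper. This is just a sketch; the durations are symbolic rather than the actual timings.

```python
# A toy summary of MACAW's control packets and the deferral rule an overhearing
# node applies, as I understand the paper; durations are symbolic, not timings.

DEFER_RULES = {
    "RTS":  "defer long enough for the addressed node's CTS to come back",
    "CTS":  "defer for the whole DATA transmission (you would collide at the receiver)",
    "DS":   "defer for the DATA transmission (the RTS-CTS handshake succeeded)",
    "RRTS": "defer for one RTS-CTS exchange so the requesting receiver can be contacted",
}

def on_overheard(packet_type: str) -> str:
    """What a node that merely overhears a control packet should do."""
    return DEFER_RULES.get(packet_type, "no deferral required")

for pkt in ["RTS", "CTS", "DS", "RRTS"]:
    print(f"{pkt:4s} -> {on_overheard(pkt)}")
```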

Thoughts:
  • Multiplicative Increase Linear Decrease (MILD) of the backoff interval seems very similar in spirit to Additive Increase Multiplicative Decrease (AIMD) of cwnd in TCP congestion control (a sketch comparing the two follows).
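Here is a minimal sketch comparing the two update rules. The x1.5 increase and decrement-by-one follow the MACAW description; the bounds and variable names are just my own illustration.

```python
# A toy comparison of MACAW's MILD backoff and TCP-style AIMD.
# Constants follow the MACAW description (x1.5 on collision, -1 on success);
# backoff bounds are illustrative, not from the paper.

BO_MIN, BO_MAX = 2, 64      # backoff bounds in slot times (assumed)

def mild_update(backoff: float, collided: bool) -> float:
    """MACAW: Multiplicative Increase, Linear Decrease of the backoff interval."""
    if collided:
        return min(backoff * 1.5, BO_MAX)    # multiplicative increase on collision
    return max(backoff - 1, BO_MIN)          # linear decrease on success

def aimd_update(cwnd: float, lost: bool) -> float:
    """TCP congestion avoidance: Additive Increase, Multiplicative Decrease of cwnd."""
    if lost:
        return max(cwnd / 2, 1)              # multiplicative decrease on loss
    return cwnd + 1                          # additive increase per RTT

if __name__ == "__main__":
    bo, cwnd = BO_MIN, 1
    for event in [True, True, False, False, True, False]:
        bo, cwnd = mild_update(bo, event), aimd_update(cwnd, event)
        print(f"collision/loss={event}: backoff={bo:.1f} slots, cwnd={cwnd:.1f} pkts")
```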

Improving TCP for wireless

The authors of A Comparison of Mechanisms for Improving TCP Performance over Wireless Links first build a nice taxonomy to classify 12 schemes used to handle non-congestion-related losses in TCP (see table 1). They then analyze each of the mechanisms in turn.

This paper served as a nice forcing function to solidify some new and formerly vague networking concepts in my head such as SACK, Reno vs. NewReno, and split-connection protocols.

This paper was also an exercise in exploring the sort of trade-offs discussed in the classic end-to-end paper we read earlier this semester. In particular, their findings show that one of the largest performance wins came from pushing knowledge from the ends (TCP) down two layers in the network stack into the link layer. I find this paper to be a wonderful exposition of the final sentence of the end-to-end paper's abstract: "Low level mechanisms to support these functions are justified only as performance enhancements."
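To make the point concrete for myself: the big wins came from link-layer schemes that retransmit locally over the lossy hop so TCP never sees the loss. Below is a toy, hypothetical sketch of that general idea, not the paper's snoop protocol; the loss rate, retry limit, and names are all invented.

```python
import random

# A minimal, hypothetical sketch of local (link-layer) retransmission hiding
# wireless losses from the transport layer. Loss rate and retry limit are
# illustrative; this is not the paper's snoop protocol.

WIRELESS_LOSS_RATE = 0.2
MAX_LOCAL_RETRIES = 5

def send_over_wireless(frame: bytes) -> bool:
    """Simulate one wireless transmission attempt that may be lost."""
    return random.random() > WIRELESS_LOSS_RATE

def link_layer_send(frame: bytes) -> bool:
    """Retransmit locally until the frame gets through or we give up.
    TCP above only sees a (slightly delayed) success, not each loss."""
    for _attempt in range(MAX_LOCAL_RETRIES):
        if send_over_wireless(frame):
            return True           # delivered; no end-to-end timeout triggered
    return False                  # persistent outage; let TCP handle it

if __name__ == "__main__":
    delivered = sum(link_layer_send(b"segment") for _ in range(1000))
    print(f"delivered {delivered}/1000 segments without involving TCP recovery")
```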

Monday, September 22, 2008

Parallelism in the router, i.e. switched backplanes

After reading today's paper on the gigabit switched router, I realized that I should have read it before today's other paper, because this one is more approachable and doesn't assume intimacy with Internet routing technology. Figure 2 even uses a picture to explain the evolution of high-performance routing! What's more, I now know what they were talking about in the other paper when they spoke of router scheduling and VOQs.

They are engineering a 16-port router at 2.4 Gbps per port, for 38.4 Gbps of aggregate bandwidth.

Shared backplanes could (at the time of writing) support about 20 Gbps, which is not enough.

They present iSLIP, a simple, effective router scheduling algorithm.

While iSLIP achieves high (up to 100%) throughput across the router's backplane, it does not address delay, which might be problematic for latency-sensitive applications. Thus, they introduce the intuitive mechanisms of prioritization and speedup (getting all packets to the output port as quickly as possible so they can be scheduled there).
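Here is a single-iteration sketch of iSLIP's request/grant/accept matching as I understand the paper; the port count, the VOQ representation, and the handling of later iterations are simplified for illustration.

```python
# A single-iteration sketch of iSLIP's request/grant/accept matching, as I
# understand the paper. N, the VOQ occupancy format, and pointer-update details
# beyond the first iteration are simplified.

N = 4  # ports

def islip_iteration(voq, grant_ptr, accept_ptr):
    """voq[i][j] > 0 means input i has cells queued for output j (a VOQ).
    Returns a dict {input: output} matched for this time slot."""
    # Step 1: requests -- every input requests every output it has cells for.
    requests = {j: [i for i in range(N) if voq[i][j] > 0] for j in range(N)}

    # Step 2: grants -- each output grants the requesting input that appears
    # next at or after its round-robin grant pointer.
    grants = {}
    for j, reqs in requests.items():
        for k in range(N):
            i = (grant_ptr[j] + k) % N
            if i in reqs:
                grants[j] = i
                break

    # Step 3: accepts -- each input accepts the granting output that appears
    # next at or after its round-robin accept pointer.
    match, granted_by = {}, {}
    for j, i in grants.items():
        granted_by.setdefault(i, []).append(j)
    for i, outs in granted_by.items():
        for k in range(N):
            j = (accept_ptr[i] + k) % N
            if j in outs:
                match[i] = j
                # Pointers advance one past the matched port only on acceptance,
                # which desynchronizes the pointers and avoids starvation.
                grant_ptr[j] = (i + 1) % N
                accept_ptr[i] = (j + 1) % N
                break
    return match

if __name__ == "__main__":
    voq = [[1, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 1]]
    gp, ap = [0] * N, [0] * N
    print(islip_iteration(voq, gp, ap))
```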

Random notes:
  • the average Internet packet is about 250 bytes
Keep this paper on the list! It is very informative for those of us who don't know router architecture coming into the course (which I imagine might be a non-trivial portion of us).

Sunday, September 21, 2008

Blah blah blah, optics, blah blah blah, router.

I nominate Scaling Internet Routers Using Optics as the least fun paper to read so far this semester. It describes, in excruciating detail, two designs for multi-rack routers (1: a hybrid electro-optical switch, 2: a fully optical switch) that achieve 100 Tb/s of load-balanced routing. Their router gives 100% throughput without a scheduler. They address four obstacles to the load-balanced switching approach (a toy sketch of the two-stage load-balancing idea follows the list):
  1. Rapidly reconfiguring the switch fabric is slow or expensive.  Solution: replace the switch fabric with a fixed mesh of optical channels
  2. Packets can be mis-sequenced.  Solution: FOFF (Full Ordered Frames First) bounds the length of the queues at the middle stage and introduces a re-sequencing buffer at a later stage
  3. Periodic traffic can cause bad throughput.  Solution: same as for problem #1 (replace crossbar switches by fixed optical mesh)
  4. Missing/failed linecards are not easily handled.  Solution: in sections 6.2 and 6.3 they present two schemes that allow the switch to be reconfigured to scatter traffic uniformly over whichever linecards are present, in the face of some linecard failures.
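As promised above, here is a toy sketch of the two-stage load-balancing idea: the first stage spreads arriving packets round-robin across intermediate linecards regardless of destination, and a fixed cyclic mesh then drains the intermediate VOQs toward the real outputs. Everything here (N, the timing model, the names) is illustrative, and FOFF's re-sequencing buffer is omitted; running it even shows the mis-sequencing that FOFF exists to fix.

```python
from collections import deque

# A toy sketch of the load-balanced two-stage idea: stage 1 spreads arriving
# packets round-robin over intermediate linecards regardless of destination;
# stage 2's fixed mesh forwards them from the intermediate VOQs to their true
# outputs. FOFF's bounded queues and re-sequencing buffer are omitted.

N = 4
# voq[m][d]: queue at intermediate linecard m holding packets destined to d
voq = [[deque() for _ in range(N)] for _ in range(N)]
spread = [0] * N   # per-input round-robin pointer for stage 1

def stage1_arrival(inp: int, dst: int, pkt) -> None:
    """Spread traffic uniformly: the intermediate card is chosen round-robin,
    independent of the packet's destination."""
    mid = spread[inp]
    spread[inp] = (spread[inp] + 1) % N
    voq[mid][dst].append((inp, pkt))

def stage2_service(slot: int) -> list:
    """In each time slot, intermediate card m is connected to output
    (m + slot) % N by the fixed mesh and sends one packet if it has one."""
    delivered = []
    for mid in range(N):
        dst = (mid + slot) % N
        if voq[mid][dst]:
            delivered.append((dst, voq[mid][dst].popleft()))
    return delivered

if __name__ == "__main__":
    for k in range(8):
        stage1_arrival(inp=0, dst=1, pkt=f"p{k}")   # a single bursty flow
    for slot in range(8):
        # note: packets can come out of order here -- exactly problem #2 above
        print(slot, stage2_service(slot))
```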
They are designing for ~3 years out (in 2003); I wonder whether the future aligned with their predictions.

The paper was a bit dense on technical/hardware details so I ended up being lost for most of it :-/

Thursday, September 18, 2008

Splitting up Spectrum and wireless transmitter fingerprinting

Solutions for sharing bandwidth
  • Assign a different channel to each AP
  • Load balance
  • Restrict range of each AP
  • Scheduling
Using Channels
  • Least congested channel search (LCCS): APs talk to each other
  • LCCS is not enough because clients can end up seeing two APs that can't see each other
  • Solution: get feedback from clients, need to do set coloring (conflict free set coloring)
  • Partially overlapping channels are frequently not used (e.g., ch 1 and ch 2 overlap a LOT, so we don't use both of them); this causes wasted space (a sort of fragmentation) in our spectrum
  • Solution: model interference for partial overlap. I-factor = 0 for a non-overlapping channel, I-factor = 1 for the same channel.
  • I-factor = overlap between the transmit spectrum mask and the receiver's band-pass filter profile (a toy sketch of this overlap computation follows this list)
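A toy sketch of the overlap computation, under loud assumptions: the Gaussian-ish mask shape and the simple normalization are mine, not the measured 802.11 masks used in the actual work, but it shows how partial overlap gives an I-factor strictly between 0 and 1.

```python
import numpy as np

# A toy sketch of the I-factor idea: overlap between a sender's transmit
# spectrum mask and a receiver's band-pass filter, normalized so identical
# channels give 1 and disjoint channels give ~0. The mask shape and the
# 5 MHz channel spacing are illustrative assumptions.

freqs = np.linspace(-50, 50, 2001)           # MHz relative to channel 1 center

def spectrum_mask(center_mhz: float) -> np.ndarray:
    """Very rough stand-in for a transmit mask / receive filter around a center freq."""
    return np.exp(-((freqs - center_mhz) / 9.0) ** 2)

def i_factor(tx_channel: int, rx_channel: int) -> float:
    """Normalized overlap of tx mask and rx filter; channels assumed 5 MHz apart."""
    tx = spectrum_mask(5.0 * (tx_channel - 1))
    rx = spectrum_mask(5.0 * (rx_channel - 1))
    return float(np.sum(tx * rx) / np.sum(tx * tx))

if __name__ == "__main__":
    for ch in (1, 2, 3, 6, 11):
        print(f"I-factor(ch1, ch{ch}) = {i_factor(1, ch):.2f}")
```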


New problem: how to identify who is using our AP with something less spoofable than a MAC address. We could build access control lists out of this that would work much better than MAC-address white-lists.

PAssive RAdiometric Device Identification Scheme (PARADIS)
  • Imperfections in the signal from each card, which are an artifact of manufacturing imperfections, are unique.
  • compute the distance of the "measured signal" from the "ideal signal" along several axes: phase error, magnitude vector, frequency error, SYNC correlation, I/Q origin offset (a toy sketch of this distance-based matching appears after this list)
  • Provides surprisingly accurate identification. They experimented on the Orbit testbed, 130 transmitters all of the same model with sequential serial numbers, and saw a 0.34% error rate, which is very impressive.
  • Audience question: what can a well-funded adversary do? Can he spoof the hardware signature? Answer: the adversary would have to be able to read your signature (like the defender/security system does)
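As mentioned in the list, here is a toy sketch of the distance-based matching: enroll a fingerprint per device by averaging its per-frame feature vectors, then identify a new frame by the nearest fingerprint. The feature values are random stand-ins, and nearest-fingerprint matching is a simplification of the classifiers PARADIS actually uses.

```python
import numpy as np

# A toy sketch of radiometric fingerprinting: each frame yields a small vector
# of deviations from the "ideal" signal, and a device is identified by the
# closest stored fingerprint. Features here are random stand-ins.

FEATURES = ["freq_err", "phase_err", "mag_err", "sync_corr", "iq_offset"]

def enroll(frames_by_device: dict) -> dict:
    """Average each device's per-frame feature vectors into a fingerprint."""
    return {dev: frames.mean(axis=0) for dev, frames in frames_by_device.items()}

def identify(frame_features: np.ndarray, fingerprints: dict) -> str:
    """Return the enrolled device whose fingerprint is closest in feature space."""
    return min(fingerprints,
               key=lambda d: np.linalg.norm(frame_features - fingerprints[d]))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two fake NICs whose manufacturing imperfections differ slightly.
    nic_a = rng.normal([0.02, 0.1, 0.05, 0.9, 0.01], 0.01, size=(50, 5))
    nic_b = rng.normal([0.05, 0.3, 0.02, 0.8, 0.03], 0.01, size=(50, 5))
    prints = enroll({"nic_a": nic_a, "nic_b": nic_b})
    probe = rng.normal([0.05, 0.3, 0.02, 0.8, 0.03], 0.01, size=5)
    print(identify(probe, prints))   # expected: nic_b
```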
Problems to design against
  • Temperature of NIC
  • Mobility (distance from AP?)
  • NIC aging
So, getting back to how we might use this idea for spectrum sharing, we might want to assign spectrum dynamically, and we would need to identify people more accurately to be fair with our allocation policies (i.e. leases).

Real Time Scheduling
  • Hidden Terminal aware Scheduling (HITS)
  • Hidden terminals can be detected by a higher-level router that is sending packets it can tell ahead of time will conflict, even though the lower routers themselves can't see each other (by definition of the hidden terminal problem). When this higher-order router, which has more information, notices that two packets will conflict, it can infer a hidden terminal problem and delay the packets just enough that they are not sent at exactly the same time (a toy sketch of this staggering appears after this list).
  • Problem: we are not able to be accurate enough with our timings
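A toy sketch of the staggering idea from the list above; the data structures and the conflict model are mine, not the actual HITS design.

```python
# A toy sketch of hidden-terminal-aware staggering: an upstream node that knows
# two downstream transmissions will collide at a common receiver delays the
# later one past the earlier one. Structures and conflict model are invented.

def stagger(transmissions, conflicts):
    """transmissions: list of dicts with 'node', 'start', 'duration'.
    conflicts: set of frozensets of node pairs that are hidden from each other
    but collide at a shared receiver."""
    txs = sorted(transmissions, key=lambda t: t["start"])
    for i, a in enumerate(txs):
        for b in txs[i + 1:]:
            if frozenset({a["node"], b["node"]}) in conflicts:
                if b["start"] < a["start"] + a["duration"]:     # would overlap
                    b["start"] = a["start"] + a["duration"]     # push b later
    return txs

if __name__ == "__main__":
    plan = [{"node": "B", "start": 0.0, "duration": 1.0},
            {"node": "C", "start": 0.2, "duration": 1.0}]
    print(stagger(plan, conflicts={frozenset({"B", "C"})}))
```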

The past future of the Internet

Today, we look at a paper written 13 years ago discussing the future of the Internet.

Observations:
  • When the network can't provide enough reliability or low enough variance, application developers will solve the problem themselves (by using bigger application level buffers, etc.)
... Er, this is an incomplete post... More to come (I blame prelims!).

Sunday, September 14, 2008

A clean slate approach to Congestion Control

We've discussed a lot of papers about congestion control! The most recent one (XCP) represents a refreshing clean slate approach to congestion control. You have to respect the theoretical foundation (control theory) used to justify the design principles.

Highlights
  • More than 1 bit of feedback. This was the next logical step to take, now that chip performance can handle it.
  • Separation of the fairness and throughput (efficiency) policies, for independent performance tuning and analysis (a sketch of this split follows)
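As noted above, here is a sketch of how I understand the split: the efficiency controller computes one aggregate feedback value per control interval (phi = alpha * d * S - beta * Q), and the fairness controller divides it among flows AIMD-style. The alpha/beta constants are the paper's stability-derived values; the unit handling and the per-flow "shuffling" details are simplified here.

```python
# A minimal sketch of XCP's efficiency/fairness split, as I understand the
# paper. Constants alpha/beta come from the paper; per-flow shuffling and
# per-packet header handling are omitted.

ALPHA, BETA = 0.4, 0.226

def aggregate_feedback(capacity_bps: float, input_rate_bps: float,
                       queue_bytes: float, avg_rtt_s: float) -> float:
    """Efficiency controller: phi = alpha * d * S - beta * Q (bytes per interval)."""
    spare_bps = capacity_bps - input_rate_bps
    return ALPHA * avg_rtt_s * spare_bps / 8.0 - BETA * queue_bytes

def split_feedback(phi_bytes: float, flow_rates_bps: dict) -> dict:
    """Fairness controller (simplified): positive feedback is split equally
    across flows (additive increase); negative feedback is split in proportion
    to each flow's current rate (multiplicative decrease)."""
    n, total = len(flow_rates_bps), sum(flow_rates_bps.values())
    if phi_bytes >= 0:
        return {f: phi_bytes / n for f in flow_rates_bps}
    return {f: phi_bytes * r / total for f, r in flow_rates_bps.items()}

if __name__ == "__main__":
    phi = aggregate_feedback(100e6, 80e6, queue_bytes=20_000, avg_rtt_s=0.1)
    print(phi, split_feedback(phi, {"flow1": 60e6, "flow2": 20e6}))
```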

Random Early Detection (RED)

The next stop on our tour of the history of the Internet is another congestion avoidance mechanism that lives at the gateway. Random Early Detection sits somewhere between end-point congestion avoidance and Fair Queuing: it runs on routers, but not at the granularity of individual flows. Instead, it operates on a (weighted) average of the aggregate queue.

This work does not introduce the idea of random early detection; rather, it builds upon existing research on RED, contributing algorithms and simulations with more intelligent parameterizations than the existing schemes used.

Highlights:
  • Flexible, works with existing transport protocols
  • Focuses on the average queue size (computed with a low-pass filter), allowing fluctuation in the actual queue size to absorb bursty traffic (see the sketch after this list).
  • Avoids global synchronization by randomly choosing connections to notify of congestion (via dropped packets or Explicit Congestion Notification), weighted by their share of the router's bandwidth.
  • Nice discussion of parameter selection, but (as Mate and I have learned) some real numbers as part of a sensitivity analysis showing the trade-offs would be welcome.
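As flagged in the list, here is a minimal sketch of the two calculations at the heart of RED: the low-pass-filtered average queue size and the drop/mark probability that ramps up between the two thresholds. The parameter values are typical ones from the discussion; the count-since-last-drop correction and idle-time handling are omitted.

```python
import random

# A minimal sketch of RED's bookkeeping: an exponentially weighted moving
# average of the queue size (the "low-pass filter") and a drop/mark
# probability that ramps up between min_th and max_th. The count correction
# and idle-period handling from the paper are omitted.

W_Q = 0.002               # queue weight for the moving average
MIN_TH, MAX_TH = 5, 15    # thresholds in packets
MAX_P = 0.02              # max drop probability as avg approaches MAX_TH

avg = 0.0

def on_packet_arrival(current_queue_len: int) -> bool:
    """Update the average queue size and decide whether to drop/mark this packet."""
    global avg
    avg = (1 - W_Q) * avg + W_Q * current_queue_len     # low-pass filter
    if avg < MIN_TH:
        return False                                     # no congestion signal
    if avg >= MAX_TH:
        return True                                      # always drop/mark
    p_b = MAX_P * (avg - MIN_TH) / (MAX_TH - MIN_TH)     # linear ramp
    return random.random() < p_b

if __name__ == "__main__":
    drops = sum(on_packet_arrival(q) for q in [3, 8, 12, 14, 20, 25, 9, 4] * 100)
    print("signalled congestion", drops, "times")
```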

Thoughts:
  • "Short packets are just a likely to be dropped as long ones"? I wonder, what would be the implications of adding packet length into the random weighting algorithm?

Why it should stay in the syllabus:
Because it fits so nicely into our historical tour de networking.

Thursday, September 11, 2008

Fat Trees for a scalable data center network architecture

Amin Vahdat from UCSD spoke at the UC Berkeley Systems Seminar today; the talk was a SIGCOMM talk. I scribbled some random notes.

Networking is super expensive. Marketing literature brags about being cheap while offering rates of $4,000 per port (which is crazy compared to the ~$2,000 cost of the actual PC).

They propose something like a Network Of Inexpensive Switches (NOIS - not the official name). Essentially their idea is to use a Fat-Tree!
The idea has been around for 50 years.
Why hasn't it been done?

High level issues with fat-tree that need addressing
  • We have path diversity, but existing routing protocols don't take advantage of that.
  • Cabling explodes!
  • We use 8 port switches throughout, lots and lots of them! Too many?
How is Intel doing this?
  • Routing
    • Localized load balancing switch by switch
    • Utilize our global knowledge for routing! Logically centralized routing brain (replicated, of course). The routers report their statuses to the CS (Central Scheduler).
    • Two-level lookup (a two-tier routing table); a hedged sketch follows this group of notes
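The hedged sketch I promised: a prefix table covers the pod-local ("downward") routes, and a default fall-through consults a suffix table keyed on the host byte, which spreads "upward" traffic across the uplinks. The addresses and port numbers are invented for illustration.

```python
import ipaddress

# A hedged sketch of the two-level lookup idea as I understand it: a prefix
# table handles downward routes within a pod, and unmatched destinations fall
# through to a suffix table (keyed on the host byte) that spreads upward
# traffic over the uplinks. Addresses and ports are invented.

PREFIX_TABLE = [              # (prefix, output_port) -- downward routes
    ("10.2.0.0/24", 0),
    ("10.2.1.0/24", 1),
]
SUFFIX_TABLE = {2: 2, 3: 3}   # host-ID byte -> uplink port

def lookup(dst: str) -> int:
    addr = ipaddress.ip_address(dst)
    for prefix, port in PREFIX_TABLE:        # level 1: match a pod-local prefix
        if addr in ipaddress.ip_network(prefix):
            return port
    host_byte = int(dst.split(".")[-1])      # level 2: fall through to the suffix
    return SUFFIX_TABLE[host_byte]           # table, spreading over the uplinks

if __name__ == "__main__":
    for dst in ["10.2.0.7", "10.2.1.9", "10.4.1.2", "10.4.1.3"]:
        print(dst, "-> port", lookup(dst))
```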


  • Simulations
    • Used the Click modular software router
    • The numbers looked good? Need more workloads. They are building a real prototype


  • Multicast
    • Again, use central scheduling
    • Central scheduler sets up routing tables with multicast knowledge (eliminating broadcasts to unnecessary "pods" at the earliest available point in routes)
    • If the multicast pipe grows too large, the CS will know, because it is receiving status reports from all of the routers


  • Cabling
    • 14 tons of cabling in a 27,000 node DC!
    • Addressed by using optical links. This will cost lots and lots of money, but they think people will buy into it.


  • Power and Cooling
    • I didn't catch this part; I think he might have said "we're working on this still"???

  • Work in progress: they're building it as we speak.

My thoughts
  • This seems a little bit like a bunch of small hypercubes?
  • They just cop out and fall back to optical anyway? Won't this cancel out any price wins? Why can't we do better cabling under the floor? A huge matrix of many, many Ethernet-cable equivalents.

James Hamilton on blades in data centers

James Hamilton blogged about a data center architecture question I thought others in the class would find interesting. In short: why not to use blades.

Core-Stateless Fair Queuing

Now we move to an optimized version of Fair Queuing, where only the edge routers keep per-flow state and pass that information along in the headers of the packets, while the core routers can be less intelligent (and thus more efficient).
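A minimal sketch of the core-router side as I understand it: each packet carries the flow's arrival-rate label stamped by an edge router, and the core drops it with probability max(0, 1 - alpha/label), where alpha is the core's current estimate of the fair-share rate. The fair-share estimation and label rewriting are omitted here.

```python
import random

# A minimal sketch of the core-router drop decision in Core-Stateless Fair
# Queuing: drop with probability max(0, 1 - fair_share / labeled_rate).
# Fair-share estimation and label rewriting are omitted.

def core_forward(label_bps: float, fair_share_bps: float) -> bool:
    """Return True if the packet is forwarded, False if it is dropped."""
    drop_prob = max(0.0, 1.0 - fair_share_bps / label_bps)
    return random.random() >= drop_prob

if __name__ == "__main__":
    alpha = 1e6                                 # assumed fair share: 1 Mbps
    for rate in (0.5e6, 1e6, 4e6):              # labeled flow arrival rates
        kept = sum(core_forward(rate, alpha) for _ in range(10_000))
        print(f"label {rate/1e6:.1f} Mbps -> forwarded ~{kept/100:.0f}% of packets")
```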

This represents a natural evolution of the research being done by this set of folks.

Wednesday, September 10, 2008

A third look at congestion avoidance

This time we've moved from the edge into the gateways.

Their idea was great: let's run simulations to show that our modified version of Fair Queueing can outperform traditional congestion control mechanisms.

Much of the writing was comprehensible and well justified. It seems to me that this work was low-hanging fruit, though their categorization of the flaws in Nagle's version of Fair Queuing, and the solutions they provide, supported by simulations showing large performance gains, were certainly well done and non-trivial.

I feel that they used math which was overly complex and under-explained in the performance section. For the majority of the paper, I was unsure what they meant by fairness with regard to promptness allocation, and a section of dense math didn't help much. I had to wait until the end of the section to get an understandable summary of the problem they were solving (i.e., "This ability to provide lower delay to lower throughput sources completely independent of the window sizes of the FTPs is one of the most important features of fair queuing"). Also, Figure 1 was poorly explained.
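To convince myself of that quoted claim, here is a hedged sketch of the core bookkeeping in packet-level Fair Queuing: each arriving packet gets a finish number F = max(last F of its flow, current round number) + packet length, and packets are served in increasing F order. The round number here is crudely approximated and the promptness/delay term from the paper is omitted; even so, the small telnet packets come out ahead of the big FTP packets.

```python
import heapq

# A hedged sketch of packet-level Fair Queuing bookkeeping: finish number
# F = max(previous F of the flow, round number R) + packet_length, served in
# increasing F order. R is crudely approximated here; the paper's exact
# round-number computation and promptness allocation are omitted.

class FairQueue:
    def __init__(self):
        self.last_finish = {}   # flow -> finish number of its previous packet
        self.round = 0.0        # crude stand-in for the bit-by-bit round number
        self.heap = []          # (finish_number, seq, flow, length)
        self.seq = 0

    def enqueue(self, flow: str, length: int) -> None:
        start = max(self.last_finish.get(flow, 0.0), self.round)
        finish = start + length
        self.last_finish[flow] = finish
        heapq.heappush(self.heap, (finish, self.seq, flow, length))
        self.seq += 1

    def dequeue(self):
        finish, _, flow, length = heapq.heappop(self.heap)
        self.round = max(self.round, finish)   # advance the approximate round number
        return flow, length

if __name__ == "__main__":
    fq = FairQueue()
    for pkt in [("ftp", 1000), ("ftp", 1000), ("telnet", 40), ("ftp", 1000), ("telnet", 40)]:
        fq.enqueue(*pkt)
    while fq.heap:
        print(fq.dequeue())   # the telnet packets are served first
```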

Monday, September 8, 2008

A second look at Congestion Avoidance

In his paper Congestion Avoidance and Control, Van Jacobson basically presents the story of most of TCP Tahoe (everything except Karn's clamped retransmit backoff and fast retransmit).

To justify the seven new algorithms he introduces, he identifies three things that cause instability in a network:

  1. not reaching equilibrium

  2. more packets entering than leaving

  3. equilibrium not reached because of data path limitations


The first can be handled with the ironically named slow start mechanism (which takes approximately R * log2(W) time, where R = round-trip time and W = window size).
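A tiny sketch of where the R * log2(W) figure comes from: the window doubles every round-trip time until it reaches the target window W.

```python
import math

# A tiny sketch of why slow start takes roughly R * log2(W): the window
# doubles every round-trip time until it reaches the target window W.

def slow_start_rtts(target_window: int) -> int:
    cwnd, rtts = 1, 0
    while cwnd < target_window:
        cwnd *= 2          # each acked segment grows cwnd by 1 => doubling per RTT
        rtts += 1
    return rtts

if __name__ == "__main__":
    for w in (8, 64, 1024):
        print(f"W={w:5d}: {slow_start_rtts(w)} RTTs (log2(W) = {math.log2(w):.0f})")
```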

The second, by better estimating round trip time for resend policies.

The third item I find a bit confusing. I think what he means by "resource limits along the path" is network congestion, so he solves this with congestion avoidance (i.e. additive increase/multiplicative decrease, which he is quick to point out he did not invent, nor does he even bother to justify in this paper).

He points out that dropped packets, as detected by timeouts, are enough for congestion detection, and that explicit notification of congestion (as described in today's other paper) is therefore not strictly necessary.

Like most of the papers we have read so far, this work is worth reading because it shaped the Internet, which in turn continues to shape the world as we know it.

Friday, September 5, 2008

Why to increase additively and decrease multiplicatively

In section 1.2 of this paper on congestion avoidance in computer networks, the authors present a summarized overview of the types of decision making that a system can incorporate, ranging from centralized to decentralized. They justify their decision to use a decentralized system.

I appreciated the way they articulate the classification of algorithms for managing distributed resources (centralized vs. decentralized). It is essentially one of the questions we are wrestling with in the Chukwa project, and also in the RAD Lab project. I intend to read some of the papers they reference when they describe this spectrum.

Zooming out, the overall thing that makes this paper amazing is the way they formalize, quantify, and visualize their algorithms.

Every systems paper should attempt to emulate the methodology found in this one. They begin by clearly articulating the problem being solved (Section 1.4) and defining the solution space (imagine a matrix with three rows: additive/multiplicative/non-linear, and two columns: increase/decrease). They then construct a concrete mathematical model to quantify the performance impacts of the various points in the design space (the 2D vector representation of the Dynamics (2.1) was genius), which of course leads to an elegant, intuitive solution.
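Here is a small sketch of those dynamics, with invented constants: two users sharing a unit-capacity link increase additively while the link is underloaded and decrease multiplicatively when it is overloaded, and their allocations converge toward fairness no matter where they start (I print Jain's fairness index, from the same authors, to show it).

```python
# A small sketch of Chiu/Jain-style AIMD dynamics: two users sharing a link of
# capacity 1 apply additive increase while total demand is under capacity and
# multiplicative decrease when it is over. Constants are illustrative.

CAPACITY, AI, MD = 1.0, 0.01, 0.5

def step(x1: float, x2: float):
    if x1 + x2 <= CAPACITY:            # underloaded: everyone increases additively
        return x1 + AI, x2 + AI
    return x1 * MD, x2 * MD            # overloaded: everyone decreases multiplicatively

if __name__ == "__main__":
    x1, x2 = 0.05, 0.70                # start far from fairness
    for _ in range(400):
        x1, x2 = step(x1, x2)
    fairness = (x1 + x2) ** 2 / (2 * (x1 * x1 + x2 * x2))   # Jain's fairness index
    print(f"after 400 steps: x1={x1:.2f}, x2={x2:.2f}, fairness index={fairness:.3f}")
```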

This paper seems to be missing empirical experimentation showing that the proposed optimal solution actually works in reality.

Thursday, September 4, 2008

Refocusing Internet Research?

I believe that the motivation for our reading Interdomain Internet Routing (Lecture 4) was to gain insight into the functional aspects of the BGP protocol. A little bit of searching shows that most of what has been published on BGP is of a technical nature.

As the beginning of this reading notes, however, the structure of the Internet--and in turn BGP--is a function of the business model on which it is built. While everyone who writes about BGP is quick to point this fact out, they then proceed to tackle how the technologies facilitate the complex business relationships that exist among the various tiers of ISPs.

I wonder, is there a group of economists and business visionaries that are just as excited about the future of the internet, but perceive the problem from the inverse perspective? After all, technology isn't driving the internet, money is. The technology follows the economics. A lot has changed since the DARPA paper on TCP/IP was written. In those days, the objective of the Internet was to connect people that "needed" to be connected. At that time, researchers were inspired by their national spirit to build something in the name of National Defense. At least that is the romanticized version of the story that I have in my head. Is that how things really were?

Today, we don't even pretend that the Internet is a national icon; we openly admit that the Internet is a conglomeration of independent business entities (which represent their shareholders, not red-blooded Americans everywhere) whose only motivation is to generate a profit.

I bring this up because, in light of this perceived transition (i.e. growing up?) of the Internet early in its formative years, I am surprised at the difficulty I have finding research into the economics that drive the development of the Internet. Rather than founding our guesses for the future of the Internet on things like Moore's Law, Bell's Law, or any other of today's technology trends, shouldn't we be listening more closely to economists (and laws more like Metcalfe's Law), whose job it is to study the nature and behavior of the system that sits at the core of the Internet, which is essentially a business system, not a computer system?




After reading the other paper that Randy assigned for today, I am even more convinced of the sentiments I expressed above.

What made this paper interesting was not the math or the techniques that they used, but rather the big-picture question that they were asking, which nobody had even thought to ask up until that point. They asked: what are the relationships between ASes on the Internet? That this question remained unexplored until 2000 only supports my observations above: we were (and continue to be) treating the technology as the center of interest and the business model as a secondary interest, when it should be the other way around.

Tuesday, September 2, 2008

OSI vs. TCP/IP stacks

During our discussion in class today, I kept meaning to ask why we have two networking stack specifications. We now know a lot more about TCP/IP's history, but why was it even developed if we already had the OSI model? Did we already have the OSI model?

This webpage (the first search result returned by Google) offers a brief explanation of the differences between the two, and explains that "The adoption of TCP/IP does not conflict with the OSI standards because the two protocol stacks were developed concurrently."

The Wikipedia article on the OSI initiative says that "OSI was an industry effort, attempting to get everyone to agree to common network standards to provide multi-vendor interoperability."

The Wikipedia articles on both (TCP/IP Model and OSI Model) are interesting further reading for anybody who is as curious as I was.