Today we released some key news about NetScaler.  First, we announced a new series of NetScaler appliances.  And second, we published a note on the performance of NetScaler in comparison to F5.  This got me reflecting on the very different architectural approaches these two products embody.

In this blog, I’ll talk in particular about the notion of fastpath, and how something that used to connote elegant design and superior performance has been reduced, in the hands of benchmarketers, to little more than a gimmick, a trick, a sleight of hand and a dubious performance claim.

Now, when you hear the word fastpath, whether literally or disguised as a ‘FastXYZ’ profile, a turbo operating mode, and so on, it should immediately raise flags — big red flags — in your head. For fastpath has an evil twin, and it is called slowpath.

Let me explain. When you see a selective claim, one where only some special kind of traffic can benefit from a fastpath implementation, you must ask two questions:

  1. What about the traffic that cannot be handled along the fastpath? By definition, this traffic follows a slower path. Later on, I’ll explain why the slower path tends to get slower as time passes.
  2. Would you accept being limited to the narrow set of constraints that the traffic must adhere to in order to get the fastpath treatment? I’ll explain how these constraints tend to get narrower as time passes.

But first, surely not all fastpath designs are bad. So how does one judge the goodness of a fastpath design? Wikipedia describes fast path pretty well, in particular this excerpt:

“…For a fast path to be effective it must handle the most commonly occurring tasks more efficiently than the ‘normal’ path…”

The goodness of a fastpath design hinges critically on whether it is the most commonly occurring tasks that benefit from the established fast path. A fastpath that does not cover the most commonly occurring tasks is no good. And when it comes at the cost of making the normal path slower, you’ve been had by the evil twin of fastpath.
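To put a number on ‘most commonly occurring’, here is a back-of-the-envelope model (purely illustrative; p, t_f and t_s are made-up symbols, not measurements of any product): if a fraction p of your real traffic actually qualifies for the fast path, and the rest falls through to the normal path, the service time your users see is a blend of the two.

```latex
% Illustrative only: p is the share of real traffic eligible for the fast path,
% t_f and t_s are the per-request service times on the fast and slow paths.
T_{\text{effective}} = p\, t_f + (1 - p)\, t_s
```

If p is small, T_effective is dominated by t_s no matter how impressive t_f looks on a datasheet; and if the fast path was bought by letting t_s grow, the blended number can actually get worse.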

Per F5, and I quote: “the FastHTTP profile is optimized for speed under ideal traffic conditions”. These ideal traffic conditions do not tolerate even things like out-of-order packets or source-IP preservation.

How did it get this bad?

When I first heard about the ‘ideal traffic conditions’ requirements of the F5 FastHTTP vserver profile, I was puzzled. I have tremendous respect for the F5 engineering team, even as we have competed over the years, and I wondered:

“Why would anyone design and market a feature that is only ever useful under ideal traffic conditions (otherwise known as benchmarks…)?”

Well, the straightforward answer is ‘to optimize for benchmarks’. But it usually doesn’t start that way. The fact of the matter is that fastpath design is a sophisticated tradeoff. An engineer can either optimize the existing path, the one that handles the numerous possible real-world traffic scenarios (no mean task), or bolt on a much simpler fastpath and divert selected traffic to it, without ever having to revisit the normal code path.

The simplicity, optimization and engineering efficiency of such a fastpath are a direct result of narrowly defining what is and is not allowed to follow it.
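To make the shape of that tradeoff concrete, here is a minimal sketch of what a bolted-on fastpath dispatcher tends to look like. Everything in it, the Packet type, its field names and the eligibility rules, is hypothetical and chosen only to mirror the kinds of ‘ideal traffic’ constraints discussed above; it is not F5’s or NetScaler’s code.

```go
package main

import "fmt"

// Packet is a deliberately simplified stand-in for a parsed TCP segment.
// The fields are illustrative, not taken from any real product.
type Packet struct {
	InOrder       bool // segment arrived in sequence
	NeedsSourceIP bool // client IP must be preserved end to end
	HasTCPOptions bool // carries options the fast path does not parse
}

// eligibleForFastPath encodes the narrow admission test a bolted-on fast
// path relies on: anything outside the "ideal" case is rejected.
func eligibleForFastPath(p Packet) bool {
	return p.InOrder && !p.NeedsSourceIP && !p.HasTCPOptions
}

func handleFast(p Packet) { fmt.Println("fast path: minimal state, no reassembly") }
func handleSlow(p Packet) { fmt.Println("slow path: full TCP/HTTP processing") }

func main() {
	packets := []Packet{
		{InOrder: true},                      // the benchmark case
		{InOrder: false},                     // out of order: falls off the fast path
		{InOrder: true, NeedsSourceIP: true}, // a common deployment requirement: slow path
	}
	for _, p := range packets {
		if eligibleForFastPath(p) {
			handleFast(p)
		} else {
			handleSlow(p)
		}
	}
}
```

Notice where the engineering effort naturally flows: each new real-world requirement, source-IP preservation, TCP options, out-of-order delivery, is cheapest to handle by tightening eligibleForFastPath rather than by teaching the fast path about it. That is exactly how the constraints narrow, and the slow path accumulates more of the real traffic, over time.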

Without mindful caution and discipline, fastpath can quickly turn from a runtime shortcut into a coding-time shortcut: a misguided tradeoff between engineering effort and usefulness. As a result, you end up with fastpath approaches that are good for not much in the real world.

Any work that goes towards creating and embellishing these mock fastpaths is work not going towards optimizing more real-world scenarios. And you can expect the normal path to get slower over time.

What about NetScaler?

So how do we do it in NetScaler?  We have adopted two guidelines that have helped us greatly.

First, we focus on real-world scenarios.  In our test labs, in our developer harnesses, in our design reviews and in our product management processes, there is an almost obsessive bias towards tuning for real-world traffic scenarios.  Sure, we publish datasheet numbers and benchmarks too, but in our world, these are the outcomes, and not the inputs into product design.

And second, we follow a very simple hardware design philosophy: keep the path between the processor and the NIC as short as possible. When you have a switching ASIC or two, and/or an FPGA or two, in the path of every packet, and each hop can introduce queuing and delay, it is no wonder that you have to find paths that short-circuit some of these hops for the benchmarks.
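The same back-of-the-envelope style applies here (again purely illustrative, not a measurement of any product): the latency a packet sees inside the box is roughly the sum of per-hop processing and queuing delays, so every extra ASIC or FPGA hop adds its own terms.

```latex
% Illustrative only: n hops between the NIC and the processor, each adding
% processing time t_i and queuing delay q_i.
T_{\text{box}} \approx \sum_{i=1}^{n} \left( t_i + q_i \right)
```

Removing hops shrinks the sum for every packet; short-circuiting hops only for benchmark traffic shrinks it for the packets that least need it.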

In closing, evaluate fastpath offerings very carefully. Remember that whenever there is a fast path, there is also a slower path. Ask whether your traffic will follow the fast path or the slow one. If all paths were fast, no one would be calling out a fast path; it would be called TriScale.