June | 2008 | benryanau

This entry was prompted by a question posed on slashdot. It is meant to shed light on the true issues surrounding QoS in most commercial home-grade links.

There’s a huge number of people who don’t know what the issues with QoS actually are – an unfortunately very common misconception is that configuring CPE QoS features == successful QoS. Also common among the ‘enthusiast’ group is the belief that sticking in a linux-based QoS-enabled distro is a panacea for QoS (eg "Put in IPCop == solved’").

In a situation where a link is a simple, as-designed ptp situation with no re-encapsulation (eg E1/SHDSL) QoS can be effectively implemented using in-box configuration. Home-class routers and OS-based control will work provided QoS is configured at each endpoint.

Unfortunately few people have the kind of service which out-of-box QoS can be used effectively.

The root issue of QoS today is the lack of outbound interface queuing. Unless one can ‘see’ the hardware/egress packet queue at its final point, QoS will not work successfully (without creating a ‘dummy’ artificial queue.. see below). When a link is congested, or when smaller frames/packets need to be sent before larger ones, QoS comes into play.

First component of QoS is the "queue". This is the ‘waiting list’ for packet egress (and ingress to a small degree, especially where link-congestion indication is supported). Here is where packets line up to be sent. QoS works by managing the priority of each packet (either by classification or by respecting the IP CoS/DSCP bit in the header). In a simple ptp link, this works well and as intended. Policy is applied to each end, and each router applied a queuing strategy and policy to the packets going out. Bandwidth is known, and the load on the link is directly observable by queue depth. Even applying ‘fancy queuing’ eg weighted round-robin, Random Early Detection is often enough to ensure equitable access.

Unfortunately in this day and age, most link implementations remove the ability to observe queue depth – by either shaping in the middle, by virtualising the terminating interface, or by multi-hopping the termination.. all three at once is common.

I am discussing DSL specifically as it’s the most common access technology, but the same thing applies to cable as rarely can one apply QoS inside the cable CPE.

Originally DSL-based WAN’s were intended to be implemented over an ATM-native network, where the CPE configures one or more PVC’s and this PVC(s) terminates on the decapsulating aggregation router. The PVC (at each end) has known properties (bitrate, class of service). It was also intended to have multiple PVC’s (eg VPI/VCI virtual circuits) with different Class of Service parameters for each to manage QoS.

Sadly this model has been discarded, and almost all DSL services ‘ignore’ the ATM component and just treat it as a ‘throwback’. One CPE has one PVC, and this is terminated on an AVGC far outside the reach of the actual ISP with the ATM component having no real impact other than simply being an access transport. To complicate matters, usually shaping is applied at the AVGC end, and the whole path is a series of virtualised interfaces double and triple encapsulating the link. So we have lost the ability to ‘see’ the actual queues – you can only throw packets at the link and hope they come out the other end.

So now the native ATM opportunity for traffic management is gone, what now? As mentioned, most commercial DSL operates using L2TP multihop and virtual interfaces. A virtual interface has no real queue and in the Cisco world at least, there’s no practical method to do anything with the provider end of the link. The QoS mechanism cannot ‘see’ what the link is doing load-wise and as such can’t effectively manage service.

So, what’s left?

The only avenue open is to create a ‘false queue’. This is effectively an artificial bottleneck.

Here, you implement traffic shaping by taking the maximum amount of bandwidth in one direction (and with the bandwidth tax of multiple encapsulations eg PPP, L2TP, ATM which can often be difficult to determine) and subtracting say 10%. This creates a situation where packets can actually queue up and be seen by the QoS process.

It can be done inside some routers, although many home-grade CPE implementations are unacceptable in that they force you to allocate a fixed percentage of bandwidth to a certain service (and often this is TCP/UDP port based which is very limiting in itself).

This is the only way to make QoS work in the current environment. The limitations are obvious: it must be also be done at the provider end, 10% or more loss of bandwidth is incurred, and if there is any congestion between the two endpoints QoS will not be effective.

There is also the option of the CPE router using coarse techniques to ‘poison’ or control non-sensitive traffic in the presence of time-sensitive traffic. This can be done by using/managing TCP resets, ACK’s and window sizes. This approach has many limitations and I cannot recommend it as a practical alternative.

So, if you’re after QoS, the CPE router must support traffic-shaping and QoS for upstream control. Your provider must also be willing to provide the same service at the head end. You lose bandwidth, and any congestion between ends can’t be accounted for.

Pretty horrible isn’t it? Until the IT industry gets off its’ backside and addresses these issues, the most practical solution for low-latency, low-jitter, low-loss internet is the same as it’s always been: get a bigger pipe.

I’ve a growing collection of the extraordinary Symbol/ARLAN/Telxon/Aironet bridges and AP’s.. 630, 640 series.

These were pre-802.11 radio AP’s and bridges, and are just amazing in their depth of control.

I’ve a few bricked units I’m trying to revive, and I intend to blog the (slow) progress here.

The main unit is an Arlan 640-2400 2.4ghz.

It has a Motorola 68360 series (68EN360FE25C) CPU – 25Mhz Quad Integrated Communications Controller. (datasheet link USER MANUAL)

Flash 2 x AM29F010-120JC (1 Megabit (128 K x 8-Bit) CMOS 5v) (DATASHEET LINK)

Clock source or EEPROM: Intel 14538B

DRAM 2 x 814260-70

RS232 MC145407DW

And some additional glue logic.

Motorola 68360 CPU

25mhz, 4.5 MIPS

8/16/32-bit databus, 32 address lines, glueless SRAM and DRAM interface, IJTAG Access port(!), 7 IRQ’s,

Four Comms Controllers – Ethernet, HDLC/SDLC 2mbit, UART, two TDM controllers (BRI/PRI ISDN)

Parallel/Centronics Interface

Instruction Set: CPU32+ (superset of M68000)

Connectivity to EEPROM/FLASH

EEPROM (8 bit boot) may be regular or flash.

Signals (68k-EEPROM): CSO-CE-Enable, OE-OE(Output Enable), WE0-WE (Write Enable), Data and Address.

WE0 – aka UU-WE, Active-low, Address bus A31, corresponds to data bits 31-24

Most interesting is the SIM60 (System Integration Module) as it controls startup, initialisation.

DEBRICKING

This unit was bricked by too much debug info being logged – into flash I think. This might have caused a wraparound effect – debug wasn’t meant to have full error handling as it’s not user-exposed.

The basic plan is to a) try and get a diag on the cause of the boot hang, b) understand the boot process, and c) remedy the cause.

I suspect flash has been trashed by the wrap (or it’s just plain full) – this would be most easily remedied by unsoldering flash from both bricked and good unit, examining both to see what’s there, and if wrap occurred, dumping into a programmer and cloning. Parity might be an issue too.

If the issue is EEPROM them same process.

Interfacing to Flash

The User’s Guide has the interfacing spec on page 9-6, and EEPROM on 9-8.

Boot Process

Basic init is described in User’s Manual on page 9-10.

<That’s all i have the battery for at the moment!>

Collateral to come:

M68000PM/AD M68000 Family Programmer’s Reference Manual – (link)

M68300 Family CPU32 Reference Manual (link)

—
Update:
Unbricking procedures for other routers – short-flash method
http://www.dd-wrt.com/wiki/index.php/Recover_from_a_Bad_Flash
http://forum.openwrt.org/viewtopic.php?id=1572
http://www.ranvik.net/prosjekter-privat/jtag_for_wrt54g_og_wrt54gs/HairyDairyMaid_WRT54G_v22.pdf – WRT54G EJTAG DeBrick Guide by HairyDairyMaid
http://forum.openwrt.org/viewtopic.php?id=5050 – Bit of discussion about JTAG methods
http://forum.openwrt.org/viewtopic.php?id=664&p=2 – JTAGging other platforms
http://www.dd-wrt.com/phpBB2/viewtopic.php?t=12053 – JTAG pinouts and guides ! !

benryanau ..share what you know..

Archive for June 2008

QoS on non-dedicated internet links Leave a comment

Hacking (debricking) the Aironet/Arlan 640 series bridges/AP’s 1 comment

Recent Posts

What’s Hot

Categories

Archives

Site Admin

GOT AN ISSUE??

hmmm