Vanilla Clients Mailing Clients Archive

Netrek on TCP sucks, here's why ... and how to fix.



A month or two ago, Dave Ahn, who played using TCP through a firewall,
observed that the updates delivered to the client weren't at a regular
rate; they stuttered.  He noticed this as soon as he began play at ten
updates per second.

I was able to easily reproduce the problem.  It is most evident when
moving at, say, warp four, and dropping torps out the back.  The
animation clearly stutters.

Dave and I had a good look at the server code and found nothing
obviously wrong, yet the stutter was clearly happening.

It isn't our code, it's TCP design.  It's the Nagle algorithm, which
defers small packets and amalgamates them for transmission while the
acknowledgement for earlier data has not yet been seen.  Turning this
off using setsockopt() and TCP_NODELAY fixes the stutter.
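
As a minimal sketch of what turning the option off looks like in C,
assuming a POSIX sockets API: the helper name disable_nagle() is
hypothetical and is not taken from the Vanilla server source.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper, not from ntserv: set TCP_NODELAY on a TCP
 * socket so that small writes are sent without waiting for an
 * outstanding acknowledgement.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
int disable_nagle(int sock)
{
    int on = 1;

    return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}
```

A server would call something like this once on each accepted client
socket, before sending the first update.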


Here is how I proved it:

1) added code to the Repair handler in ntserv/socket.c that would toggle
the TCP_NODELAY option on the socket each time I hit Repair,

2) placed a server on a 486/100 running Linux, connected to 10 Mb/s
ethernet,

3) placed a client on an Alpha/266 running Digital UNIX on the same
ethernet,

4) logged in and orbited earth,

5) traced the packets under various conditions.
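
The toggle in step 1 can be sketched roughly as below.  This is not
the actual change made to ntserv/socket.c; the function name
toggle_nodelay() is hypothetical, and the sketch assumes a POSIX
sockets API where the current setting can be read with getsockopt().

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical sketch: flip TCP_NODELAY on a TCP socket.
 * Returns the new setting (1 = Nagle disabled, 0 = Nagle enabled),
 * or -1 if either socket call fails. */
int toggle_nodelay(int sock)
{
    int value = 0;
    socklen_t len = sizeof(value);

    /* read the current state of the option */
    if (getsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &value, &len) < 0)
        return -1;

    /* invert it and write it back */
    value = value ? 0 : 1;
    if (setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &value, sizeof(value)) < 0)
        return -1;

    return value;
}
```

Calling it from a request handler allows the same session to be
traced with the option on and off, without reconnecting.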

In TCP mode, the client did not receive network packets at the update
rate requested.  The packets were being amalgamated.  My conclusions
are summarised thus:

a) mode = tcp, requested rate = 10Hz, actual rate < 5Hz
b) mode = tcp, requested rate = 5Hz,  actual rate < 5Hz
c) mode = udp, requested rate = 10Hz, actual rate = 10Hz
d) mode = udp, requested rate = 5Hz,  actual rate = 5Hz

Packet amalgamation was happening even at five updates per second! (b)

Had the client been sending packets to the server more often than its
ping responses, a higher rate of packet arrival would probably have
been seen in (a), since each client packet carries an acknowledgement
that releases the server's deferred data.

UDP has no deferral algorithm.  (c) & (d).


Other thoughts ...

Perhaps we should enable TCP_NODELAY in the Vanilla server source?

Perhaps the login ghostbusts on slow TCP links are really due to the
deferral of packets?

The traces show that enabling TCP_NODELAY will increase packet load for
the network between the server and the client.  Perhaps this will also
increase the probability of packet loss and subsequent pause and
retransmission?

Interestingly, the COW source code (socket.c) has a disabled code
fragment that sets TCP_NODELAY.  Enabling it would probably not improve
performance, but I'd love to try.  Do any of the other clients use the
option?

This has implications for playing through TCP proxy firewalls, such as
trekhopd.  If the proxy does not set TCP_NODELAY, then packets can be
amalgamated as they pass the proxy.

Packet traces follow ... they are eighty columns wide; you may wish to
compensate.  The trick I find with reading them is to highlight one
whole second of packet trace that does NOT include any data from the
client to the server, and then count the packets seen and their sizes.
The normal update packet data size is 12 bytes, increasing to 16 with
UDP due to the sequence number.
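
The counting can also be done mechanically.  The following is a rough
sketch only, assuming the tcpdump text format seen in these traces
with each packet on a single unwrapped line; the function names
parse_trace_line() and count_data_packets() are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: examine one trace line.  If it is a data packet
 * whose source host starts with src (for example "nestor."), store the
 * time of day in seconds in *t and the payload size in *bytes, and
 * return 1.  Pure acknowledgements and unparsable lines return 0. */
int parse_trace_line(const char *line, const char *src, double *t, int *bytes)
{
    int h, m;
    double s;
    char from[64];
    const char *p;

    if (sscanf(line, "%d:%d:%lf %63s", &h, &m, &s, from) != 4)
        return 0;
    if (strncmp(from, src, strlen(src)) != 0)
        return 0;
    *t = h * 3600.0 + m * 60.0 + s;

    /* UDP lines end in "udp N" */
    p = strstr(line, " udp ");
    if (p != NULL)
        return sscanf(p, " udp %d", bytes) == 1;

    /* TCP data lines carry "first:last(N)"; a bare "(DF)" does not match */
    p = strchr(line, '(');
    if (p == NULL)
        return 0;
    return sscanf(p, "(%d)", bytes) == 1;
}

/* Hypothetical helper: count the data packets sent by src in a set of
 * trace lines, and total their payload bytes. */
void count_data_packets(const char *lines[], int n, const char *src,
                        int *packets, int *bytes)
{
    int i, size;
    double t;

    *packets = 0;
    *bytes = 0;
    for (i = 0; i < n; i++) {
        if (parse_trace_line(lines[i], src, &t, &size)) {
            (*packets)++;
            *bytes += size;
        }
    }
}
```

Feeding it one second of server-to-client lines gives the packet count
for that second; dividing the byte total by the 12 byte update size
(16 for UDP) suggests how many updates were amalgamated.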

normal server, ten updates per second

21:02:34.706616 idda.1033 > nestor.4566: P 60:72(12) ack 1529 win 33580
21:02:34.726616 nestor.4566 > idda.1033: . ack 72 win 32120 (DF)
21:02:34.756601 nestor.4566 > idda.1033: P 1529:1541(12) ack 72 win 32120 (DF)
21:02:34.906526 idda.1033 > nestor.4566: . ack 1541 win 33580
21:02:34.906526 nestor.4566 > idda.1033: P 1541:1553(12) ack 72 win 32120 (DF)
21:02:35.106426 idda.1033 > nestor.4566: . ack 1553 win 33580
21:02:35.106426 nestor.4566 > idda.1033: P 1553:1577(24) ack 72 win 32120 (DF)
21:02:35.306326 idda.1033 > nestor.4566: . ack 1577 win 33580
21:02:35.306326 nestor.4566 > idda.1033: P 1577:1601(24) ack 72 win 32120 (DF)
21:02:35.506264 idda.1033 > nestor.4566: . ack 1601 win 33580
21:02:35.506264 nestor.4566 > idda.1033: P 1601:1625(24) ack 72 win 32120 (DF)
21:02:35.696264 idda.1033 > nestor.4566: . ack 1625 win 33580
21:02:35.696264 nestor.4566 > idda.1033: P 1625:1649(24) ack 72 win 32120 (DF)
21:02:35.896179 idda.1033 > nestor.4566: . ack 1649 win 33580
21:02:35.896179 nestor.4566 > idda.1033: P 1649:1685(36) ack 72 win 32120 (DF)
21:02:36.096079 idda.1033 > nestor.4566: . ack 1685 win 33580
21:02:36.096079 nestor.4566 > idda.1033: P 1685:1709(24) ack 72 win 32120 (DF)
21:02:36.295979 idda.1033 > nestor.4566: . ack 1709 win 33580
21:02:36.295979 nestor.4566 > idda.1033: P 1709:1733(24) ack 72 win 32120 (DF)
21:02:36.495911 idda.1033 > nestor.4566: . ack 1733 win 33580
21:02:36.495911 nestor.4566 > idda.1033: P 1733:1757(24) ack 72 win 32120 (DF)
21:02:36.695911 idda.1033 > nestor.4566: . ack 1757 win 33580
21:02:36.695911 nestor.4566 > idda.1033: P 1757:1817(60) ack 72 win 32120 (DF)
21:02:36.695911 idda.1033 > nestor.4566: P 72:84(12) ack 1817 win 33580


normal server, five updates per second

21:05:20.698132 idda.1033 > nestor.4566: P 12:24(12) ack 237 win 33580
21:05:20.718122 nestor.4566 > idda.1033: . ack 24 win 32120 (DF)
21:05:20.898032 nestor.4566 > idda.1033: P 237:249(12) ack 24 win 32120 (DF)
21:05:21.057952 idda.1033 > nestor.4566: . ack 249 win 33580
21:05:21.097932 nestor.4566 > idda.1033: P 249:261(12) ack 24 win 32120 (DF)
21:05:21.257852 idda.1033 > nestor.4566: . ack 261 win 33580
21:05:21.297832 nestor.4566 > idda.1033: P 261:273(12) ack 24 win 32120 (DF)
21:05:21.457785 idda.1033 > nestor.4566: . ack 273 win 33580
21:05:21.497785 nestor.4566 > idda.1033: P 273:285(12) ack 24 win 32120 (DF)
21:05:21.647785 idda.1033 > nestor.4566: . ack 285 win 33580
21:05:21.697780 nestor.4566 > idda.1033: P 285:297(12) ack 24 win 32120 (DF)
21:05:21.847705 idda.1033 > nestor.4566: . ack 297 win 33580
21:05:21.897680 nestor.4566 > idda.1033: P 297:321(24) ack 24 win 32120 (DF)
21:05:22.047605 idda.1033 > nestor.4566: . ack 321 win 33580
21:05:22.097580 nestor.4566 > idda.1033: P 321:333(12) ack 24 win 32120 (DF)
21:05:22.247505 idda.1033 > nestor.4566: . ack 333 win 33580
21:05:22.297480 nestor.4566 > idda.1033: P 333:345(12) ack 24 win 32120 (DF)
21:05:22.447433 idda.1033 > nestor.4566: . ack 345 win 33580
21:05:22.497433 nestor.4566 > idda.1033: P 345:357(12) ack 24 win 32120 (DF)
21:05:22.647433 idda.1033 > nestor.4566: . ack 357 win 33580
21:05:22.697428 nestor.4566 > idda.1033: P 357:377(20) ack 24 win 32120 (DF)
21:05:22.697428 idda.1033 > nestor.4566: P 24:36(12) ack 377 win 33580


normal server, udp mode, ten updates per second

21:06:17.278105 idda.11337 > nestor.1024: udp 12
21:06:17.378057 nestor.1024 > idda.11337: udp 16
21:06:17.478057 nestor.1024 > idda.11337: udp 16
21:06:17.578057 nestor.1024 > idda.11337: udp 16
21:06:17.678052 nestor.1024 > idda.11337: udp 16
21:06:17.778002 nestor.1024 > idda.11337: udp 16
21:06:17.877952 nestor.1024 > idda.11337: udp 16
21:06:17.977902 nestor.1024 > idda.11337: udp 16
21:06:18.077852 nestor.1024 > idda.11337: udp 16
21:06:18.177802 nestor.1024 > idda.11337: udp 16
21:06:18.277752 nestor.1024 > idda.11337: udp 16
21:06:18.377705 nestor.1024 > idda.11337: udp 16
21:06:18.477705 nestor.1024 > idda.11337: udp 16
21:06:18.577705 nestor.1024 > idda.11337: udp 16
21:06:18.677700 nestor.1024 > idda.11337: udp 16
21:06:18.777650 nestor.1024 > idda.11337: udp 16
21:06:18.877600 nestor.1024 > idda.11337: udp 28
21:06:18.977550 nestor.1024 > idda.11337: udp 16
21:06:19.077500 nestor.1024 > idda.11337: udp 16
21:06:19.177450 nestor.1024 > idda.11337: udp 16
21:06:19.277400 nestor.1024 > idda.11337: udp 24
21:06:19.277400 idda.11337 > nestor.1024: udp 12


patched server, tcp/ip, ten updates per second

21:07:45.247104 idda.1033 > nestor.4566: P 140:152(12) ack 4661 win 33580
21:07:45.267094 nestor.4566 > idda.1033: . ack 152 win 32120 (DF)
21:07:45.347057 nestor.4566 > idda.1033: P 4661:4673(12) ack 152 win 32120 (DF)
21:07:45.447057 nestor.4566 > idda.1033: P 4673:4685(12) ack 152 win 32120 (DF)
21:07:45.487057 idda.1033 > nestor.4566: . ack 4685 win 33580
21:07:45.547057 nestor.4566 > idda.1033: P 4685:4697(12) ack 152 win 32120 (DF)
21:07:45.647052 nestor.4566 > idda.1033: P 4697:4709(12) ack 152 win 32120 (DF)
21:07:45.687032 idda.1033 > nestor.4566: . ack 4709 win 33580
21:07:45.747002 nestor.4566 > idda.1033: P 4709:4721(12) ack 152 win 32120 (DF)
21:07:45.846952 nestor.4566 > idda.1033: P 4721:4733(12) ack 152 win 32120 (DF)
21:07:45.886932 idda.1033 > nestor.4566: . ack 4733 win 33580
21:07:45.946902 nestor.4566 > idda.1033: P 4733:4745(12) ack 152 win 32120 (DF)
21:07:46.046852 nestor.4566 > idda.1033: P 4745:4757(12) ack 152 win 32120 (DF)
21:07:46.086832 idda.1033 > nestor.4566: . ack 4757 win 33580
21:07:46.146802 nestor.4566 > idda.1033: P 4757:4769(12) ack 152 win 32120 (DF)
21:07:46.246752 nestor.4566 > idda.1033: P 4769:4781(12) ack 152 win 32120 (DF)
21:07:46.286732 idda.1033 > nestor.4566: . ack 4781 win 33580
21:07:46.346705 nestor.4566 > idda.1033: P 4781:4793(12) ack 152 win 32120 (DF)
21:07:46.446705 nestor.4566 > idda.1033: P 4793:4805(12) ack 152 win 32120 (DF)
21:07:46.486705 idda.1033 > nestor.4566: . ack 4805 win 33580
21:07:46.546705 nestor.4566 > idda.1033: P 4805:4817(12) ack 152 win 32120 (DF)
21:07:46.646700 nestor.4566 > idda.1033: P 4817:4829(12) ack 152 win 32120 (DF)
21:07:46.686680 idda.1033 > nestor.4566: . ack 4829 win 33580
21:07:46.746650 nestor.4566 > idda.1033: P 4829:4841(12) ack 152 win 32120 (DF)
21:07:46.846600 nestor.4566 > idda.1033: P 4841:4853(12) ack 152 win 32120 (DF)
21:07:46.886580 idda.1033 > nestor.4566: . ack 4853 win 33580
21:07:46.946550 nestor.4566 > idda.1033: P 4853:4865(12) ack 152 win 32120 (DF)
21:07:47.046500 nestor.4566 > idda.1033: P 4865:4877(12) ack 152 win 32120 (DF)
21:07:47.086480 idda.1033 > nestor.4566: . ack 4877 win 33580
21:07:47.146450 nestor.4566 > idda.1033: P 4877:4889(12) ack 152 win 32120 (DF)
21:07:47.246400 nestor.4566 > idda.1033: P 4889:4909(20) ack 152 win 32120 (DF)
21:07:47.246400 idda.1033 > nestor.4566: P 152:164(12) ack 4909 win 33580

-- 
James Cameron                                      (quozl@us.netrek.org)

Linux, Firewalls, OpenVMS, Software Engineering, CGI, HTTP, X, C, FORTH,
COBOL, BASIC, DCL, csh, bash, ksh, sh, Electronics, Microcontrollers,
Disability Engineering, Netrek, Bicycles, Pedant, Farming, Home Control,
Remote Area Power, Greek Scholar, Tenor Vocalist, Church Sound, Husband.

"Specialisation is for insects." -- Robert Heinlein.