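After one end of a TCP connection completes the three-way handshake, it enters the ESTABLISHED state. If the peer later disappears from the network and the local end does not notice, so that it stays in ESTABLISHED, the local connection is called a half-open connection (Half Open).
A peer can disappear from the network in many ways:
For example, the peer host suddenly loses power, and the TCP connection vanishes before it can send anything.
Another case: the aging-time on a NAT device along the path expires and the NAT port is reused. Both ends of the TCP connection are still in ESTABLISHED, but they can no longer communicate, so both ends are half-open connections. (This case is my guess and has not been verified in practice. If the conclusion turns out to be wrong, I will correct it!)
Another case: slow accept calls on a listen socket let the backlog queue fill up, and the client-side connection becomes a half-open connection. This case is the subject of this post.
First, a word about the TCP three-way handshake.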

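During the three-way handshake, the server-side TCP connection moves from the SYN_RECV state to the ESTABLISHED state. Connections in SYN_RECV are kept in the half-connection (SYN) queue of the listen socket's backlog. When a connection changes from SYN_RECV to ESTABLISHED, it is moved from the half-connection queue to the established (accept) queue. The accept system call takes one connection off the listen socket's established queue and binds it to the process.
But what happens to a new client connection when the listen socket's backlog (the half-connection queue and the established queue) is completely full? The answer is that different Linux versions behave differently.
Current test environment: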
zuchunlei@ubuntu14:~$ uname -a
Linux ubuntu14 4.4.0-31-generic #50~14.04.1-Ubuntu SMP Wed Jul 13 01:07:32 UTC 2016 x86_64 x86_64 x86
Server-side code:
In [1]: from socket import *
In [2]: sock = socket(AF_INET,SOCK_STREAM)
In [3]: sock.bind(("",10000))
In [4]: sock.listen(1)
To keep things simple, I set the listen backlog to 1 and never call sock.accept. That way every ESTABLISHED connection stays in the backlog queue and is not bound to the process.
Checking port 10000 with netstat:
Every 1.0s: sudo netstat -tnpoa|sed -n -e 2p -e /10000/p Sat Dec 16 20:23:03 2017
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name Timer
tcp 0 0 0.0.0.0:10000 0.0.0.0:* LISTEN 1578/Python off (0.00/0/0)
Checking port 10000 with ss:
Every 1.0s: ss -tnpoa|sed -n -e 1p -e /10000/p Sat Dec 16 20:25:18 2017
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 1 *:10000 *:* users:(("ipython",1578,6))
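A note on reading this: for rows with State=LISTEN, ss reports the length of the listen socket's backlog queue in Send-Q, and the number of connections that have completed the three-way handshake and are in ESTABLISHED in Recv-Q. Those connections sit in the listen socket's established queue.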
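As a small illustration of that reading rule, the sketch below pulls the two counters out of a LISTEN row. The function name parse_listen_queue and the returned keys are made up for this example, and it assumes the column order shown in the ss output above.

```python
def parse_listen_queue(line):
    """Split one `ss -tn` LISTEN row into its two queue counters.

    Assumes the column order State, Recv-Q, Send-Q shown above.
    """
    fields = line.split()
    if fields[0] != "LISTEN":
        raise ValueError("not a LISTEN row")
    # For a LISTEN socket: Recv-Q is the number of connections waiting
    # for accept(), Send-Q is the backlog passed to listen().
    return {"accept_queue": int(fields[1]), "backlog": int(fields[2])}


# Row copied from the ss output above.
row = 'LISTEN 0 1 *:10000 *:* users:(("ipython",1578,6))'
print(parse_listen_queue(row))
```

When accept_queue grows past backlog, the application is not calling accept fast enough.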
After connecting twice with nc localhost 10000, checking port 10000 with netstat:
Every 1.0s: sudo netstat -tnpoa|sed -n -e 2p -e /10000/p Sat Dec 16 20:32:45 2017
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name Timer
tcp 0 0 0.0.0.0:10000 0.0.0.0:* LISTEN 1578/python off (0.00/0/0)
tcp 0 0 127.0.0.1:59890 127.0.0.1:10000 ESTABLISHED 6301/nc off (0.00/0/0)
tcp 0 0 127.0.0.1:10000 127.0.0.1:59890 ESTABLISHED - off (0.00/0/0)
tcp 0 0 127.0.0.1:10000 127.0.0.1:59892 ESTABLISHED - off (0.00/0/0)
tcp 0 0 127.0.0.1:59892 127.0.0.1:10000 ESTABLISHED 6379/nc off (0.00/0/0)
netstat shows that both nc client connections are fully established and that the two server-side connections are also in ESTABLISHED. Because accept has not been called, the PID column for the two server-side connections shows -, meaning they are not yet bound to a process.
Checking port 10000 with ss:
Every 1.0s: ss -tnpoa|sed -n -e 1p -e /10000/p Sat Dec 16 20:36:10 2017
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 2 1 *:10000 *:* users:(("ipython",1578,6))
ESTAB 0 0 127.0.0.1:59890 127.0.0.1:10000 users:(("nc",6301,3))
ESTAB 0 0 127.0.0.1:10000 127.0.0.1:59890
ESTAB 0 0 127.0.0.1:10000 127.0.0.1:59892
ESTAB 0 0 127.0.0.1:59892 127.0.0.1:10000 users:(("nc",6379,3))
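ss shows that Recv-Q for the LISTEN row is now 2: two ESTABLISHED connections are waiting in the established queue for the application to take them with accept. Note that with a backlog of 1 the queue held 2 connections in this test.
After a third connection with nc localhost 10000, checking port 10000 with netstat: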
Every 1.0s: sudo netstat -tnpoa|sed -n -e 2p -e /10000/p Sat Dec 16 20:41:18 2017
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name Timer
tcp 0 0 0.0.0.0:10000 0.0.0.0:* LISTEN 1578/python off (0.00/0/0)
tcp 0 0 127.0.0.1:59890 127.0.0.1:10000 ESTABLISHED 6301/nc off (0.00/0/0)
tcp 0 0 127.0.0.1:10000 127.0.0.1:59896 SYN_RECV - on (1.06/3/0)
tcp 0 0 127.0.0.1:59896 127.0.0.1:10000 ESTABLISHED 10989/nc off (0.00/0/0)
tcp 0 0 127.0.0.1:10000 127.0.0.1:59890 ESTABLISHED - off (0.00/0/0)
tcp 0 0 127.0.0.1:10000 127.0.0.1:59892 ESTABLISHED - off (0.00/0/0)
tcp 0 0 127.0.0.1:59892 127.0.0.1:10000 ESTABLISHED 6379/nc off (0.00/0/0)
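For the third nc client, the connection state is ESTABLISHED, meaning the three-way handshake completed as far as the client is concerned. On the server side the connection is in SYN_RECV, i.e. still half-connected. The backlog queue is full and has no room for another ESTABLISHED connection, so the connection cannot move from SYN_RECV to ESTABLISHED even though the server correctly receives the third-step ACK from nc.
Capturing packets with tcpdump at this point: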
zuchunlei@ubuntu14:~$ sudo tcpdump -i any tcp port 10000 -nn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
20:50:15.739292 IP 127.0.0.1.10000 > 127.0.0.1.59896: Flags [S.], seq 2458870060, ack 3925261891, win 43690, options [mss 65495,sackOK,TS val 1340001 ecr 1339751,nop,wscale 7], length 0
20:50:15.739301 IP 127.0.0.1.59896 > 127.0.0.1.10000: Flags [.], ack 1, win 342, options [nop,nop,TS val 1340001 ecr 1339751], length 0
20:50:17.738724 IP 127.0.0.1.10000 > 127.0.0.1.59896: Flags [S.], seq 2458870060, ack 3925261891, win 43690, options [mss 65495,sackOK,TS val 1340501 ecr 1340001,nop,wscale 7], length 0
20:50:17.738772 IP 127.0.0.1.59896 > 127.0.0.1.10000: Flags [.], ack 1, win 342, options [nop,nop,TS val 1340501 ecr 1339751], length 0
20:50:21.739110 IP 127.0.0.1.10000 > 127.0.0.1.59896: Flags [S.], seq 2458870060, ack 3925261891, win 43690, options [mss 65495,sackOK,TS val 1341501 ecr 1340501,nop,wscale 7], length 0
20:50:21.739158 IP 127.0.0.1.59896 > 127.0.0.1.10000: Flags [.], ack 1, win 342, options [nop,nop,TS val 1341501 ecr 1339751], length 0
20:50:29.738975 IP 127.0.0.1.10000 > 127.0.0.1.59896: Flags [S.], seq 2458870060, ack 3925261891, win 43690, options [mss 65495,sackOK,TS val 1343501 ecr 1341501,nop,wscale 7], length 0
20:50:29.739022 IP 127.0.0.1.59896 > 127.0.0.1.10000: Flags [.], ack 1, win 342, options [nop,nop,TS val 1343501 ecr 1339751], length 0
20:50:45.739231 IP 127.0.0.1.10000 > 127.0.0.1.59896: Flags [S.], seq 2458870060, ack 3925261891, win 43690, options [mss 65495,sackOK,TS val 1347501 ecr 1343501,nop,wscale 7], length 0
20:50:45.739310 IP 127.0.0.1.59896 > 127.0.0.1.10000: Flags [.], ack 1, win 342, options [nop,nop,TS val 1347501 ecr 1339751], length 0
For a connection in SYN_RECV, Linux starts a timer and retransmits the second segment of the handshake, [S.]. If the listen socket's established queue still has no room after 4 retransmissions, the SYN_RECV connection is dropped.
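To make the retransmission timer visible, the sketch below computes the gaps between the [S.] segments in the capture above. The timestamps are copied from that capture; the helper name intervals is made up for this example.

```python
from datetime import datetime

# Timestamps of the [S.] segments, copied from the tcpdump output above.
synack_times = [
    "20:50:15.739292",
    "20:50:17.738724",
    "20:50:21.739110",
    "20:50:29.738975",
    "20:50:45.739231",
]


def intervals(stamps):
    """Return the gap between consecutive timestamps, rounded to whole seconds."""
    parsed = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [round((b - a).total_seconds()) for a, b in zip(parsed, parsed[1:])]


print(intervals(synack_times))  # [2, 4, 8, 16]
```

The gap doubles each time, which is the usual exponential backoff of the retransmission timer.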
After waiting for the 4 retransmissions, checking port 10000 with netstat:
Every 1.0s: sudo netstat -tnpoa|sed -n -e 2p -e /10000/p Sat Dec 16 20:58:20 2017
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name Timer
tcp 0 0 0.0.0.0:10000 0.0.0.0:* LISTEN 1578/python off (0.00/0/0)
tcp 0 0 127.0.0.1:59890 127.0.0.1:10000 ESTABLISHED 6301/nc off (0.00/0/0)
tcp 0 0 127.0.0.1:10000 127.0.0.1:59890 ESTABLISHED - off (0.00/0/0)
tcp 0 0 127.0.0.1:59896 127.0.0.1:10000 ESTABLISHED 15954/nc off (0.00/0/0)
tcp 0 0 127.0.0.1:10000 127.0.0.1:59892 ESTABLISHED - off (0.00/0/0)
tcp 0 0 127.0.0.1:59892 127.0.0.1:10000 ESTABLISHED 6379/nc off (0.00/0/0)
Once the server has dropped the SYN_RECV connection, the third nc client connection has become a half-open connection.
Effect of send/recv on a half-open connection:
If the third nc client now sends data, the peer has no such connection and replies with an RST segment. When the local end receives the RST, it resets the connection as well.
If the third nc client only receives data, it blocks in the recv call forever and never returns. To deal with this, the client can enable TCP keepalive. Because the default interval before TCP sends keepalive probes is long, the application can shorten it with the socket options TCP_KEEPIDLE/TCP_KEEPINTVL/TCP_KEEPCNT.
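A minimal sketch of setting those options from Python follows. The helper name enable_keepalive and the values 30/5/3 are illustrative assumptions, not recommendations. The three TCP_KEEP* constants are Linux-specific, so the code checks that the socket module exposes them before using them.

```python
import socket


def enable_keepalive(sock, idle=30, interval=5, count=3):
    """Turn on TCP keepalive and shorten its timers (values are examples only)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Seconds of idle time before the first probe is sent.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    # Seconds between unanswered probes.
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    # Number of unanswered probes before the connection is dropped.
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
```

With these example values, a client blocked in recv on a half-open connection should get an error after roughly 30 + 3 * 5 = 45 seconds instead of waiting forever.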
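This morning I tested the latest Ubuntu 16.04 and found that once the listen socket's backlog queue is full, a new client connection no longer reaches ESTABLISHED. Instead it stays in SYN_SENT and retransmits the SYN segment on timeout, while the server returns no TCP segment at all.
Test environment for the newer version: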
zuchunlei@box:~$ uname -a
Linux box 4.10.0-28-generic #32~16.04.2-Ubuntu SMP Thu Jul 20 10:19:48 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
The test scenario is the same as before, and here we only look at the state of the third nc client connection.
Checking port 10000 with netstat:
Every 1.0s: sudo netstat -tnpoa|sed -n -e 2p -e /10000/p Sat Dec 16 21:21:57 2017
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name Timer
tcp 0 0 0.0.0.0:10000 0.0.0.0:* LISTEN 2022/python off (0.00/0/0)
tcp 0 0 127.0.0.1:36516 127.0.0.1:10000 ESTABLISHED 2347/nc off (0.00/0/0)
tcp 0 1 127.0.0.1:36520 127.0.0.1:10000 SYN_SENT 2522/nc on (5.18/3/0)
tcp 0 0 127.0.0.1:10000 127.0.0.1:36518 ESTABLISHED - off (0.00/0/0)
tcp 0 0 127.0.0.1:36518 127.0.0.1:10000 ESTABLISHED 2388/nc off (0.00/0/0)
tcp 0 0 127.0.0.1:10000 127.0.0.1:36516 ESTABLISHED - off (0.00/0/0)
At this point the third nc client connection is in SYN_SENT and is retransmitting the SYN segment on timeout.
Capturing the third nc client's TCP packets with tcpdump:
zuchunlei@box:~$ sudo tcpdump -i any tcp port 10000 -nn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
21:21:47.357226 IP 127.0.0.1.36520 > 127.0.0.1.10000: Flags [S], seq 1445936074, win 43690, options [mss 65495,sackOK,TS val 4214107076 ecr 0,nop,wscale 7], length 0
21:21:48.358267 IP 127.0.0.1.36520 > 127.0.0.1.10000: Flags [S], seq 1445936074, win 43690, options [mss 65495,sackOK,TS val 4214107327 ecr 0,nop,wscale 7], length 0
21:21:50.373837 IP 127.0.0.1.36520 > 127.0.0.1.10000: Flags [S], seq 1445936074, win 43690, options [mss 65495,sackOK,TS val 4214107831 ecr 0,nop,wscale 7], length 0
21:21:54.565832 IP 127.0.0.1.36520 > 127.0.0.1.10000: Flags [S], seq 1445936074, win 43690, options [mss 65495,sackOK,TS val 4214108879 ecr 0,nop,wscale 7], length 0
21:22:02.758111 IP 127.0.0.1.36520 > 127.0.0.1.10000: Flags [S], seq 1445936074, win 43690, options [mss 65495,sackOK,TS val 4214110927 ecr 0,nop,wscale 7], length 0
21:22:18.885934 IP 127.0.0.1.36520 > 127.0.0.1.10000: Flags [S], seq 1445936074, win 43690, options [mss 65495,sackOK,TS val 4214114959 ecr 0,nop,wscale 7], length 0
21:22:51.141643 IP 127.0.0.1.36520 > 127.0.0.1.10000: Flags [S], seq 1445936074, win 43690, options [mss 65495,sackOK,TS val 4214123023 ecr 0,nop,wscale 7], length 0
While the client retransmits the SYN segment, the server does not send a single packet.
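The spacing of the SYN segments in the capture can be compared with a simple doubling schedule. The sketch below assumes an initial retransmission timeout of 1 second and 6 retries; these two values are assumptions chosen to match the seven SYN segments seen above, and the function name syn_send_offsets is made up for this example.

```python
def syn_send_offsets(initial_rto=1.0, retries=6):
    """Return (offsets, give_up): when each SYN is sent, in seconds after
    the first one, and when the connect attempt is abandoned.

    Assumes the timeout simply doubles after every retransmission.
    """
    offsets = [0.0]
    elapsed = 0.0
    rto = initial_rto
    for _ in range(retries):
        elapsed += rto
        offsets.append(elapsed)
        rto *= 2
    # After the last retransmission the sender waits one more timeout.
    return offsets, elapsed + rto


offsets, give_up = syn_send_offsets()
print(offsets)   # [0.0, 1.0, 3.0, 7.0, 15.0, 31.0, 63.0]
print(give_up)   # 127.0
```

The capture shows SYN segments at roughly 0, 1, 3, 7, 15, 32 and 64 seconds after the first one, close to this schedule.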
After the client's SYN_SENT state times out, checking port 10000 with netstat:
Every 1.0s: sudo netstat -tnpoa|sed -n -e 2p -e /10000/p Sat Dec 16 21:27:36 2017
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name Timer
tcp 0 0 0.0.0.0:10000 0.0.0.0:* LISTEN 2022/python off (0.00/0/0)
tcp 0 0 127.0.0.1:36516 127.0.0.1:10000 ESTABLISHED 2347/nc off (0.00/0/0)
tcp 0 0 127.0.0.1:10000 127.0.0.1:36518 ESTABLISHED - off (0.00/0/0)
tcp 0 0 127.0.0.1:36518 127.0.0.1:10000 ESTABLISHED 2388/nc off (0.00/0/0)
tcp 0 0 127.0.0.1:10000 127.0.0.1:36516 ESTABLISHED - off (0.00/0/0)
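The client connection is gone.
In the newer Linux implementation, when the listen socket's backlog queue is full, a new client connection does not become a half-open connection. Instead, the SYN segment is retransmitted during the connect call. Once the retransmission limit for SYN_SENT is reached, the TCP connection disappears and the application's connect call fails with a timeout error!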
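On the application side, a client may not want to wait for the kernel to give up. The sketch below is one way to put an upper bound on connect and to tell the outcomes apart. The function name try_connect, the 3-second default and the returned strings are illustrative assumptions.

```python
import socket


def try_connect(host, port, timeout=3.0):
    """Attempt a TCP connect with an application-level timeout.

    Returns ("connected", sock), ("timeout", None) or ("refused", None).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
    except (socket.timeout, TimeoutError):
        # No SYN-ACK arrived in time, e.g. the server's backlog is full
        # and it is silently dropping our SYN segments.
        sock.close()
        return "timeout", None
    except ConnectionRefusedError:
        # The peer answered with RST: nothing is listening on that port.
        sock.close()
        return "refused", None
    # Back to blocking mode for normal use by the caller.
    sock.settimeout(None)
    return "connected", sock
```

For example, try_connect("127.0.0.1", 10000) against the test server above would be expected to report a timeout once the backlog is full on the newer kernel. On the older kernel the connect succeeds, so keepalive is still needed to detect the half-open connection.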