Network Performance Issue

기억을 지배하는 기록

Andrew's Akashic Records 2018. 4. 17. 16:46

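To perform optimally, a network must satisfy the following three conditions.

1. Data must be transmitted accurately.

2. The network must provide enough bandwidth to meet the demands of its users. If bandwidth is insufficient, transfers between two points take a very long time.

3. Each system on the network must be fast enough to handle the network traffic.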

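bandwidth

Strictly, bandwidth is the difference between the highest and lowest signal frequencies available on a network. In common usage it means the maximum transfer rate available for communication, that is, the capacity to carry information, and its basic unit is bps (bits per second).

A modem speed of 28.8 Kbps means the modem can transmit 28,800 bits per second. Rates of roughly 14.4 to 28.8 Kbps are adequate for sending and receiving text, but for multimedia such as music or video a faster line such as ISDN is preferable. In theory a telephone line can carry data at tens of Mbps, but telephone-exchange equipment such as switches limits the bandwidth to 64 Kbps.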
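
To make the effect of bandwidth on transfer time concrete, here is a minimal Python sketch. It is an illustration only: the function name and the 1 MB file size are hypothetical, and the calculation ignores protocol overhead and latency, so real transfers take longer.

```python
def transfer_seconds(size_bytes: int, link_bps: float) -> float:
    """Ideal time in seconds to move size_bytes over a link of link_bps bits/s.

    Ignores protocol overhead, latency, and compression.
    """
    if link_bps <= 0:
        raise ValueError("link_bps must be positive")
    return size_bytes * 8 / link_bps


if __name__ == "__main__":
    one_megabyte = 1_000_000  # hypothetical file size in bytes
    for label, bps in [("14.4 Kbps", 14_400),
                       ("28.8 Kbps", 28_800),
                       ("64 Kbps", 64_000)]:
        print(f"{label}: {transfer_seconds(one_megabyte, bps):.1f} s")
```

Under these assumptions the 1 MB file needs 125 seconds at 64 Kbps and more than twice as long at 28.8 Kbps.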

DATA Corruption on the network

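- “netstat -i” is a simple tool for a quick look at network problems.

  - It reports, among other things, the number of input and output packets counted since the system booted.

  - Input and output errors should stay at or below 0.025% of packets; when collisions approach 10%, the network is overloaded.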

Linux

Kernel Interface table
Iface    MTU  Met  RX-OK   RX-ERR  RX-DRP  RX-OVR  TX-OK  TX-ERR  TX-DRP  TX-OVR  Flg
eth0    1500  0    349202  0       0       0       84382  0       0       0       BMRU
lo     16436  0    16513   0       0       0       16513  0       0       0       LRU

SUN

Name  Mtu   Net/Dest  Address    Ipkts    Ierrs  Opkts     Oerrs  Collis  Queue
lo0   8232  loopback  localhost  261676   0      261676    0      0       0
hme0  1500  tmaxs1    tmaxs1     2059927  0      34231257  0      0       0

AIX

Name  Mtu    Network    Address          Ipkts       Ierrs  Opkts       Oerrs  Coll
en0   1500   link#2     0.6.29.dc.b2.16  775439936   0      116707019   0      0
en0   1500   192.168.1  tmaxi2           775439936   0      116707019   0      0
lo0   16896  link#1                      -721764089  0      -730012441  0      0
lo0   16896  127        localhost        -721764089  0      -730012441  0      0
lo0   16896  ::1                         -721764089  0      -730012441  0      0

Name  Mtu   Network      Address    Ipkts     Ierrs  Opkts     Oerrs  Coll
lan0  1500  192.168.0.0  tmaxh2     85874293  0      71213981  0      0
lo0   4136  loopback     localhost  14306406  0      14306409  0      0
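
The 0.025% error limit and the 10% collision limit mentioned above can be checked mechanically from the counters in these listings. The following Python sketch is an illustration and not part of the original notes: the function names are hypothetical, and it assumes the collision rate is measured against output packets, which the text does not specify.

```python
def percent(part: int, total: int) -> float:
    """Return part as a percentage of total.

    A non-positive total (for example a wrapped counter) yields 0.0.
    """
    return 100.0 * part / total if total > 0 else 0.0


def check_interface(ipkts, ierrs, opkts, oerrs, collisions,
                    max_error_pct=0.025, collision_pct=10.0):
    """Return a list of warning strings for one interface row."""
    warnings = []
    in_err = percent(ierrs, ipkts)
    out_err = percent(oerrs, opkts)
    # Assumption: collisions are compared with output packets.
    coll = percent(collisions, opkts)
    if in_err > max_error_pct:
        warnings.append(f"input error rate {in_err:.3f}% exceeds {max_error_pct}%")
    if out_err > max_error_pct:
        warnings.append(f"output error rate {out_err:.3f}% exceeds {max_error_pct}%")
    if coll >= collision_pct:
        warnings.append(f"collision rate {coll:.1f}% is at or above {collision_pct}%")
    return warnings


if __name__ == "__main__":
    # Counters from the hme0 row of the SUN listing.
    print(check_interface(2059927, 0, 34231257, 0, 0))
```

For the hme0 row all error and collision counters are zero, so the returned list is empty.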

- To find the source of errors occurring at a gateway, use “netstat -s”, which reports the amount of data transferred and the number of errors for each of ip, icmp, tcp, and udp.

SUN

[inter999:/sapora/user/inter999]netstat -s

UDP

udpInDatagrams = 6475 udpInErrors = 0

udpOutDatagrams =33554694

TCP tcpRtoAlgorithm = 4 tcpRtoMin = 400

tcpRtoMax = 60000 tcpMaxConn = -1

tcpActiveOpens = 37791 tcpPassiveOpens = 11709

tcpAttemptFails = 24940 tcpEstabResets = 94

tcpCurrEstab = 51 tcpOutSegs =723168

tcpOutDataSegs =477019 tcpOutDataBytes =178839753

tcpRetransSegs = 831 tcpRetransBytes = 61751

tcpOutAck =246148 tcpOutAckDelayed = 47625

tcpOutUrg = 1 tcpOutWinUpdate = 27

tcpOutWinProbe = 0 tcpOutControl =104467

tcpOutRsts = 29816 tcpOutFastRetrans = 5

tcpInSegs =1351793

tcpInAckSegs =421535 tcpInAckBytes =178860343

tcpInDupAck = 25863 tcpInAckUnsent = 0

tcpInInorderSegs =1080491 tcpInInorderBytes =1038297986

tcpInUnorderSegs = 950 tcpInUnorderBytes = 67992

tcpInDupSegs = 257 tcpInDupBytes =107879

tcpInPartDupSegs = 4 tcpInPartDupBytes = 1296

tcpInPastWinSegs = 0 tcpInPastWinBytes = 0

tcpInWinProbe = 0 tcpInWinUpdate = 0

tcpInClosed = 0 tcpRttNoUpdate = 66

tcpRttUpdate =397997 tcpTimRetrans = 941

tcpTimRetransDrop = 66 tcpTimKeepalive = 382

tcpTimKeepaliveProbe= 47 tcpTimKeepaliveDrop = 0

tcpListenDrop = 0 tcpListenDropQ0 = 0

tcpHalfOpenDrop = 0 tcpOutSackRetrans = 42

IP ipForwarding = 2 ipDefaultTTL = 255

ipInReceives =1503680 ipInHdrErrors = 0

ipInAddrErrors = 0 ipInCksumErrs = 0

ipForwDatagrams = 0 ipForwProhibits = 0

ipInUnknownProtos = 0 ipInDiscards = 0

ipInDelivers =1401339 ipOutRequests =34234131

ipOutDiscards = 0 ipOutNoRoutes = 0

ipReasmTimeout = 60 ipReasmReqds = 0

ipReasmOKs = 0 ipReasmFails = 0

ipReasmDuplicates = 0 ipReasmPartDups = 0

ipFragOKs = 0 ipFragFails = 0

ipFragCreates = 0 ipRoutingDiscards = 0

tcpInErrs = 0 udpNoPorts =385690

udpInCksumErrs = 0 udpInOverflows = 0

rawipInOverflows = 0

ICMP icmpInMsgs = 5648 icmpInErrors = 0

icmpInCksumErrs = 2 icmpInUnknowns = 0

icmpInDestUnreachs = 263 icmpInTimeExcds = 0

icmpInParmProbs = 0 icmpInSrcQuenchs = 5

icmpInRedirects = 0 icmpInBadRedirects = 0

icmpInEchos = 176 icmpInEchoReps = 5202

icmpInTimestamps = 0 icmpInTimestampReps = 0

icmpInAddrMasks = 0 icmpInAddrMaskReps = 0

icmpInFragNeeded = 0 icmpOutMsgs = 178

icmpOutDrops = 0 icmpOutErrors = 0

icmpOutDestUnreachs = 2 icmpOutTimeExcds = 0

icmpOutParmProbs = 0 icmpOutSrcQuenchs = 0

icmpOutRedirects = 0 icmpOutEchos = 0

icmpOutEchoReps = 176 icmpOutTimestamps = 0

icmpOutTimestampReps= 0 icmpOutAddrMasks = 0

icmpOutAddrMaskReps = 0 icmpOutFragNeeded = 0

icmpInOverflows = 0

IGMP:

0 messages received

0 messages received with too few bytes

0 messages received with bad checksum

0 membership queries received

0 membership queries received with invalid field(s)

0 membership reports received

0 membership reports received with invalid field(s)

0 membership reports received for groups to which we belong

0 membership reports sent

$ netstat -s

tcp:

61780317 packets sent

40026627 data packets (2022562296 bytes)

33851 data packets (20402833 bytes) retransmitted

21750704 ack-only packets (5140731 delayed)

0 URG only packets

28 window probe packets

51 window update packets

10729813 control packets

85413720 packets received

30922858 acks (for 2025048627 bytes)

61302 duplicate acks

0 acks for unsent data

49505640 packets (1931008279 bytes) received in-sequence

64 completely duplicate packets (93112 bytes)

349 packets with some dup, data (483560 bytes duped)

10553 out of order packets (12988300 bytes)

4 packets (1037213427 bytes) of data after window

195 window probes

4458436 window update packets

3219 packets received after close

1 segment discarded for bad checksum

0 bad TCP segments dropped due to state change

723322 connection requests

4471901 connection accepts

5195223 connections established (including accepts)

5872161 connections closed (including 677123 drops)

673863 embryonic connections dropped

26359442 segments updated rtt (of 26359442 attempts)

46875 retransmit timeouts

4317 connections dropped by rexmit timeout

28 persist timeouts

15476 keepalive timeouts

15309 keepalive probes sent

47 connections dropped by keepalive

0 connect requests dropped due to full queue

977309 connect requests dropped due to no listener

udp:

0 incomplete headers

0 bad checksums

0 socket overflows

ip:

85224681 total packets received

0 bad IP headers

0 fragments received

0 fragments dropped (dup or out of space)

0 fragments dropped after timeout

0 packets forwarded

0 packets not forwardable

icmp:

6517276 calls to generate an ICMP error message

5665 ICMP messages dropped

Output histogram:

echo reply: 415

destination unreachable: 6511197

source quench: 0

routing redirect: 0

echo: 0

time exceeded: 0

parameter problem: 0

time stamp: 0

time stamp reply: 0

address mask request: 0

address mask reply: 0

0 bad ICMP messages

Input histogram:

echo reply: 1762

destination unreachable: 6511314

source quench: 7

routing redirect: 0

echo: 415

time exceeded: 12

parameter problem: 0

time stamp request: 0

time stamp reply: 0

address mask request: 0

address mask reply: 0

415 responses sent

igmp:

0 messages received

0 messages received with too few bytes

0 messages received with bad checksum

0 membership queries received

0 membership queries received with incorrect fields(s)

0 membership reports received

0 membership reports received with incorrect field(s)

0 membership reports received for groups to which this host belongs

0 membership reports sent
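
Scanning such long listings by eye is error-prone. As a rough sketch, the Solaris-style `name = value` output shown first can be parsed into a dictionary and filtered for non-zero error counters. The helper names are hypothetical, the regular expression assumes the `name = value` layout, and the sentence-style output of the second listing is not handled.

```python
import re

# Matches pairs such as "udpInErrors = 0" or "udpOutDatagrams =33554694".
PAIR = re.compile(r"(\w+)\s*=\s*(-?\d+)")


def parse_counters(text: str) -> dict:
    """Map each counter name in the text to its integer value."""
    return {name: int(value) for name, value in PAIR.findall(text)}


def nonzero_errors(counters: dict) -> dict:
    """Keep counters whose name contains 'Err' and whose value is not zero."""
    return {k: v for k, v in counters.items() if "Err" in k and v != 0}


# A few lines copied from the SUN listing above.
SAMPLE = """\
UDP
udpInDatagrams = 6475 udpInErrors = 0
udpOutDatagrams =33554694
ICMP icmpInMsgs = 5648 icmpInErrors = 0
icmpInCksumErrs = 2 icmpInUnknowns = 0
"""

if __name__ == "__main__":
    print(nonzero_errors(parse_counters(SAMPLE)))
```

On the sample lines the only non-zero error counter is icmpInCksumErrs, with a value of 2.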

Gathering Network Integrity data from NFS(Network File System)

- “nfsstat -c” reports client-side NFS statistics for a system.

  - The retrans field is the number of RPC requests this host, acting as a client, had to retransmit. Retransmissions occur while reading or writing NFS files. If the count exceeds 5% of the total client NFS calls, there is a serious problem.

  - Compare the badxid field with the retrans field: if they are roughly equal, the NFS server is having trouble servicing client requests.

SUN

[inter999:/sapora/user/inter999]nfsstat -c

Client rpc:
Connection oriented:
calls       badcalls    badxids     timeouts    newcreds    badverfs
1331        0           0           0           0           0
timers      cantconn    nomem       interrupts
0           0           0           0
Connectionless:
calls       badcalls    retrans     badxids     timeouts    newcreds
7           1           0           0           0           0
badverfs    timers      nomem       cantsend
0           4           0           0

Client nfs:
calls       badcalls    clgets      cltoomany
7           1           7           0
Version 2: (6 calls)
null        getattr     setattr     root        lookup      readlink
0 0%        5 83%       0 0%        0 0%        0 0%        0 0%
read        wrcache     write       create      remove      rename
0 0%        0 0%        0 0%        0 0%        0 0%        0 0%
link        symlink     mkdir       rmdir       readdir     statfs
0 0%        0 0%        0 0%        0 0%        0 0%        1 16%
Version 3: (0 calls)
null        getattr     setattr     lookup      access      readlink
0 0%        0 0%        0 0%        0 0%        0 0%        0 0%
read        write       create      mkdir       symlink     mknod
0 0%        0 0%        0 0%        0 0%        0 0%        0 0%
remove      rmdir       rename      link        readdir     readdirplus
0 0%        0 0%        0 0%        0 0%        0 0%        0 0%
fsstat      fsinfo      pathconf    commit
0 0%        0 0%        0 0%        0 0%

Client nfs_acl:
Version 2: (1 calls)
null        getacl      setacl      getattr     access
0 0%        0 0%        0 0%        1 100%      0 0%
Version 3: (0 calls)
null        getacl      setacl
0 0%        0 0%        0 0%

$ nfsstat -c

Client rpc:
Connection oriented:
calls       badcalls    badxids
0           0           0
timeouts    newcreds    badverfs
0           0           0
timers      cantconn    nomem
0           0           0
interrupts
Connectionless oriented:
calls       badcalls    retrans
2883        0           0
badxids     timeouts    waits
0           0           0
newcreds    badverfs    timers
0           0           21
toobig      nomem       cantsend
0           0           0
bufulocks

Client nfs:
calls       badcalls    clgets
2883        0           2883
cltoomany
Version 2: (2883 calls)
null        getattr     setattr
0 0%        2864 99%    0 0%
root        lookup      readlink
0 0%        3 0%        0 0%
read        wrcache     write
0 0%        0 0%        0 0%
create      remove      rename
0 0%        0 0%        0 0%
link        symlink     mkdir
0 0%        0 0%        0 0%
rmdir       readdir     statfs
0 0%        12 0%       4 0%
Version 3: (0 calls)
null        getattr     setattr
0 0%        0 0%        0 0%
lookup      access      readlink
0 0%        0 0%        0 0%
read        write       create
0 0%        0 0%        0 0%
mkdir       symlink     mknod
0 0%        0 0%        0 0%
remove      rmdir       rename
0 0%        0 0%        0 0%
link        readdir     readdir+
0 0%        0 0%        0 0%
fsstat      fsinfo      pathconf
0 0%        0 0%        0 0%
commit
0 0%
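
The two rules of thumb for the retrans and badxid fields can be expressed as a small check. This Python sketch is an illustration under stated assumptions: the function name and the finding labels are hypothetical, and "roughly equal" is taken to mean within 20% of retrans, a tolerance the original text does not define.

```python
def assess_nfs_client(calls: int, retrans: int, badxid: int,
                      limit_pct: float = 5.0, tolerance: float = 0.2) -> list:
    """Return a list of findings for one set of nfsstat -c client counters."""
    findings = []
    if calls <= 0:
        return findings
    # Rule 1: retransmissions above 5% of calls indicate a serious problem.
    if 100.0 * retrans / calls > limit_pct:
        findings.append("retrans_high")
    # Rule 2: badxid roughly equal to retrans points at a struggling server.
    # Assumption: "roughly equal" means within `tolerance` of retrans.
    if retrans > 0 and abs(badxid - retrans) <= tolerance * retrans:
        findings.append("server_slow")
    return findings


if __name__ == "__main__":
    # Connectionless counters from the second listing: nothing to report.
    print(assess_nfs_client(calls=2883, retrans=0, badxid=0))
    # Hypothetical counters that trip both rules.
    print(assess_nfs_client(calls=1000, retrans=80, badxid=78))
```

With the counters from the listings above, no finding is reported because retrans is zero.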

Network and CPU Load

A heavily loaded CPU degrades network performance. The spray utility can be used to check whether a system's CPU keeps up with incoming packets.

[inter999:/sapora/user/inter999]spray localhost

sending 1162 packets of length 86 to localhost ...

164 packets (14.114%) dropped by localhost

66 packets/sec, 5706 bytes/sec

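The key figure is the number of dropped packets. A drop rate of 5% or less is not a problem. A higher rate shows that packets are being generated faster than the receiving host can accept them, which means the host is not fast enough to respond to the network and its CPU is heavily loaded.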
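
The 5% rule can be applied to captured spray output with a few lines of Python. This is a sketch only: the function names are hypothetical, and the regular expressions assume the exact wording of the output shown above, which may differ between spray implementations.

```python
import re

SENT = re.compile(r"sending (\d+) packets of length (\d+) to (\S+)")
DROPPED = re.compile(r"(\d+) packets \(([\d.]+)%\) dropped by (\S+)")

# Output copied from the spray run above.
SAMPLE = """\
sending 1162 packets of length 86 to localhost ...
164 packets (14.114%) dropped by localhost
66 packets/sec, 5706 bytes/sec
"""


def drop_percent(output: str) -> float:
    """Recompute the drop percentage from the sent and dropped packet counts."""
    sent = SENT.search(output)
    dropped = DROPPED.search(output)
    if sent is None or dropped is None:
        raise ValueError("unexpected spray output format")
    return 100.0 * int(dropped.group(1)) / int(sent.group(1))


def receiver_overloaded(output: str, limit_pct: float = 5.0) -> bool:
    """True when more than limit_pct of the packets were dropped."""
    return drop_percent(output) > limit_pct


if __name__ == "__main__":
    print(f"{drop_percent(SAMPLE):.3f}% dropped, "
          f"overloaded: {receiver_overloaded(SAMPLE)}")
```

For the run above, 164 of 1162 packets were dropped, well over the 5% limit.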

Reducing the NFS Workload

To reduce the workload of an NFS server, edit /etc/fstab on the client systems to increase the read and write buffer sizes. For example, if the page size of both systems is 4096 bytes:

server:/remfs/dataspace /space nfs rw,hard,wsize=4096,rsize=4096 0 0

A system's page size can be checked with the "pagesize" command. rsize and wsize apply only to remote filesystems and must not be used for local filesystems.
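
As a small illustration, the fstab entry above can be generated from a page size. The function name is hypothetical; the option string simply mirrors the example entry. The sketch reads only the local page size through os.sysconf, so the server's page size still has to be checked separately.

```python
import os


def nfs_fstab_entry(remote: str, mount_point: str, page_size: int) -> str:
    """Build an fstab line with rsize and wsize set to page_size."""
    options = f"rw,hard,wsize={page_size},rsize={page_size}"
    return f"{remote} {mount_point} nfs {options} 0 0"


if __name__ == "__main__":
    # Page size of the local (client) system only, on Unix-like systems.
    local_page_size = os.sysconf("SC_PAGE_SIZE")
    print(nfs_fstab_entry("server:/remfs/dataspace", "/space", local_page_size))
```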

Timeout

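If an NFS client receives no response to an NFS request within a given period, a timeout occurs. This means the NFS server is so heavily loaded that it cannot service NFS requests quickly enough.

In that case, timeouts can be avoided by increasing the timeout period in /etc/fstab.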

server:/mf /mf nfs noquota,hard,bg,intr,timeo=15 0 0

(This sets the timeout period to 1.5 seconds; timeo is given in tenths of a second.)

The "nfsstat -c" command shows the number of timeouts; if timeouts amount to 5% or more of the number of calls, there is a problem.

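The timeo conversion and the 5% timeout rule from this section can be sketched as follows. The function names are hypothetical and the code is an illustration, not a monitoring tool.

```python
def timeo_seconds(timeo: int) -> float:
    """Convert an NFS timeo mount option (tenths of a second) to seconds."""
    return timeo / 10.0


def timeouts_excessive(calls: int, timeouts: int, limit_pct: float = 5.0) -> bool:
    """True when timeouts are limit_pct or more of the calls."""
    return calls > 0 and 100.0 * timeouts / calls >= limit_pct


if __name__ == "__main__":
    print(timeo_seconds(15))
    # Connection-oriented counters from the SUN nfsstat listing: 1331 calls, 0 timeouts.
    print(timeouts_excessive(calls=1331, timeouts=0))
```

With timeo=15 the conversion gives 1.5 seconds, matching the fstab entry above.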
License: Attribution, NonCommercial, NoDerivatives
