Akashic Records

Network Performance Issue

Andrew's Akashic Records 2018. 4. 17. 16:46

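To perform optimally, a network must satisfy the following three conditions.

1.       Data must be transmitted accurately.

2.       It must provide enough bandwidth to meet the demands of its users. If bandwidth is insufficient, transfers between two points take a very long time.

3.       Each system on the network must be fast enough to handle the network traffic.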

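Bandwidth

In a network, bandwidth is the difference between the highest and lowest frequencies of the signal that can be used. More commonly it means the maximum transmission rate available on a link, that is, its capacity to carry information, and its basic unit is bps (bits per second).

A modem speed of 28.8 Kbps means that 28,800 bits can be transferred per second. Speeds of 14.4 to 28.8 Kbps are usually adequate for sending and receiving text, but for multimedia such as music or video it is better to use a high-speed line such as ISDN. Transmission over a telephone line can theoretically reach tens of Mbps, but equipment such as the switches at the telephone exchange limits the bandwidth to 64 Kbps.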
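To make the effect of bandwidth concrete, the following minimal sketch computes the ideal transfer time of a file at the line speeds mentioned above. The function name transfer_seconds and the 1 MB file size are illustrative assumptions, and protocol overhead is ignored.

```python
def transfer_seconds(size_bytes: int, rate_bps: float) -> float:
    """Ideal time to move size_bytes over a link of rate_bps bits per second."""
    if rate_bps <= 0:
        raise ValueError("rate_bps must be positive")
    # 8 bits per byte; no protocol overhead is modeled (assumption).
    return size_bytes * 8 / rate_bps


size = 1_000_000  # hypothetical 1 MB file
for label, rate in [("14.4 Kbps", 14_400), ("28.8 Kbps", 28_800), ("64 Kbps", 64_000)]:
    print(f"{label}: {transfer_seconds(size, rate):.1f} s")
```

Even at 64 Kbps a file of this size needs about two minutes, which is why the text recommends a faster line for multimedia.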


Data Corruption on the Network

-         "netstat -i" is a simple tool for taking a quick look at network problems.

    -     It reports figures such as the number of all input/output packets since the system booted.

    -     The input-error and output-error rates should be 0.025% or less, and a collision rate approaching 10% means the network is overloaded.

Linux

Kernel Interface table
Iface   MTU  Met  RX-OK   RX-ERR  RX-DRP  RX-OVR  TX-OK  TX-ERR  TX-DRP  TX-OVR  Flg
eth0    1500 0    349202  0       0       0       84382  0       0       0       BMRU
lo     16436 0    16513   0       0       0       16513  0       0       0       LRU


SUN

Name  Mtu   Net/Dest  Address    Ipkts    Ierrs  Opkts     Oerrs  Collis  Queue
lo0   8232  loopback  localhost  261676   0      261676    0      0       0
hme0  1500  tmaxs1    tmaxs1     2059927  0      34231257  0      0       0


AIX

Name  Mtu    Network    Address          Ipkts       Ierrs  Opkts       Oerrs  Coll
en0   1500   link#2     0.6.29.dc.b2.16  775439936   0      116707019   0      0
en0   1500   192.168.1  tmaxi2           775439936   0      116707019   0      0
lo0   16896  link#1                      -721764089  0      -730012441  0      0
lo0   16896  127        localhost        -721764089  0      -730012441  0      0
lo0   16896  ::1                         -721764089  0      -730012441  0      0


HP

Name  Mtu   Network      Address    Ipkts     Ierrs  Opkts     Oerrs  Coll
lan0  1500  192.168.0.0  tmaxh2     85874293  0      71213981  0      0
lo0   4136  loopback     localhost  14306406  0      14306409  0      0
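The thresholds given for netstat -i can be checked mechanically. The sketch below is one possible way to do it: the names IfaceCounters and check_interface are hypothetical, and it assumes that error rates are measured against the packet count in the same direction and that the collision rate is collisions divided by output packets, since the text does not state the denominators.

```python
from dataclasses import dataclass
from typing import List

MAX_ERROR_PCT = 0.025      # error threshold from the text
MAX_COLLISION_PCT = 10.0   # collision threshold from the text


@dataclass
class IfaceCounters:
    name: str
    ipkts: int
    ierrs: int
    opkts: int
    oerrs: int
    collisions: int = 0


def pct(part: int, total: int) -> float:
    """Percentage of part in total; 0.0 when there is no traffic."""
    return 100.0 * part / total if total > 0 else 0.0


def check_interface(c: IfaceCounters) -> List[str]:
    """Return one warning per threshold the interface exceeds."""
    warnings = []
    if pct(c.ierrs, c.ipkts) > MAX_ERROR_PCT:
        warnings.append(f"{c.name}: input error rate {pct(c.ierrs, c.ipkts):.3f}%")
    if pct(c.oerrs, c.opkts) > MAX_ERROR_PCT:
        warnings.append(f"{c.name}: output error rate {pct(c.oerrs, c.opkts):.3f}%")
    # Assumption: collisions are compared with output packets.
    if pct(c.collisions, c.opkts) >= MAX_COLLISION_PCT:
        warnings.append(f"{c.name}: collision rate {pct(c.collisions, c.opkts):.1f}%")
    return warnings


# Counters taken from the hme0 row of the SUN output above.
print(check_interface(IfaceCounters("hme0", 2059927, 0, 34231257, 0, 0)))
```

With the hme0 counters from the SUN output no warning is produced, because the interface reports no errors or collisions.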


-         To find the source of errors occurring at a gateway, "netstat -s" can be used; it reports the amount of data transferred and the number of errors for each of ip, icmp, tcp, and udp.

SUN

[inter999:/sapora/user/inter999]netstat -s

UDP

       udpInDatagrams      = 6475 udpInErrors         = 0

       udpOutDatagrams     =33554694

TCP     tcpRtoAlgorithm     = 4 tcpRtoMin           = 400

       tcpRtoMax           = 60000 tcpMaxConn          = -1

       tcpActiveOpens      = 37791 tcpPassiveOpens     = 11709

       tcpAttemptFails     = 24940 tcpEstabResets      = 94

       tcpCurrEstab        = 51 tcpOutSegs          =723168

       tcpOutDataSegs      =477019 tcpOutDataBytes     =178839753

       tcpRetransSegs      = 831 tcpRetransBytes     = 61751

       tcpOutAck           =246148 tcpOutAckDelayed    = 47625

       tcpOutUrg           = 1 tcpOutWinUpdate     = 27

       tcpOutWinProbe      = 0 tcpOutControl       =104467

       tcpOutRsts          = 29816 tcpOutFastRetrans   = 5

       tcpInSegs           =1351793

       tcpInAckSegs        =421535 tcpInAckBytes       =178860343

       tcpInDupAck         = 25863 tcpInAckUnsent      = 0

       tcpInInorderSegs    =1080491 tcpInInorderBytes   =1038297986

       tcpInUnorderSegs    = 950 tcpInUnorderBytes   = 67992

       tcpInDupSegs        = 257 tcpInDupBytes       =107879

       tcpInPartDupSegs    = 4 tcpInPartDupBytes   = 1296

       tcpInPastWinSegs    = 0 tcpInPastWinBytes   = 0

       tcpInWinProbe       = 0 tcpInWinUpdate      = 0

       tcpInClosed         = 0 tcpRttNoUpdate      = 66

       tcpRttUpdate        =397997 tcpTimRetrans       = 941

       tcpTimRetransDrop   = 66 tcpTimKeepalive     = 382

       tcpTimKeepaliveProbe=    47 tcpTimKeepaliveDrop =     0

       tcpListenDrop       = 0 tcpListenDropQ0     = 0

       tcpHalfOpenDrop     = 0 tcpOutSackRetrans   = 42

IP      ipForwarding        = 2 ipDefaultTTL        = 255

       ipInReceives        =1503680 ipInHdrErrors       = 0

       ipInAddrErrors      = 0 ipInCksumErrs       = 0

       ipForwDatagrams     = 0 ipForwProhibits     = 0

       ipInUnknownProtos   = 0 ipInDiscards        = 0

       ipInDelivers        =1401339 ipOutRequests       =34234131

       ipOutDiscards       = 0 ipOutNoRoutes       = 0

       ipReasmTimeout      = 60 ipReasmReqds        = 0

       ipReasmOKs          = 0 ipReasmFails        = 0

       ipReasmDuplicates   = 0 ipReasmPartDups     = 0

       ipFragOKs           = 0 ipFragFails         = 0

       ipFragCreates       = 0 ipRoutingDiscards   = 0

       tcpInErrs           = 0 udpNoPorts          =385690

       udpInCksumErrs      = 0 udpInOverflows      = 0

       rawipInOverflows    = 0

ICMP    icmpInMsgs          = 5648 icmpInErrors        = 0

       icmpInCksumErrs     = 2 icmpInUnknowns      = 0

       icmpInDestUnreachs  = 263 icmpInTimeExcds     = 0

       icmpInParmProbs     = 0 icmpInSrcQuenchs    = 5

       icmpInRedirects     = 0 icmpInBadRedirects  = 0

       icmpInEchos         = 176 icmpInEchoReps      = 5202

       icmpInTimestamps    = 0 icmpInTimestampReps =     0

       icmpInAddrMasks     = 0 icmpInAddrMaskReps  = 0

       icmpInFragNeeded    = 0 icmpOutMsgs         = 178

       icmpOutDrops        = 0 icmpOutErrors       = 0

       icmpOutDestUnreachs =     2 icmpOutTimeExcds =     0

       icmpOutParmProbs    = 0 icmpOutSrcQuenchs   = 0

       icmpOutRedirects    = 0 icmpOutEchos        = 0

       icmpOutEchoReps     = 176 icmpOutTimestamps   = 0

       icmpOutTimestampReps=     0 icmpOutAddrMasks =     0

       icmpOutAddrMaskReps =     0 icmpOutFragNeeded =     0

       icmpInOverflows     = 0

IGMP:

         0 messages received

         0 messages received with too few bytes

         0 messages received with bad checksum

         0 membership queries received

         0 membership queries received with invalid field(s)

         0 membership reports received

         0 membership reports received with invalid field(s)

         0 membership reports received for groups to which we belong

         0 membership reports sent


HP

$ netstat -s

tcp:

       61780317 packets sent

               40026627 data packets (2022562296 bytes)

               33851 data packets (20402833 bytes) retransmitted

               21750704 ack-only packets (5140731 delayed)

               0 URG only packets

               28 window probe packets

               51 window update packets

               10729813 control packets

       85413720 packets received

               30922858 acks (for 2025048627 bytes)

               61302 duplicate acks

               0 acks for unsent data

               49505640 packets (1931008279 bytes) received in-sequence

               64 completely duplicate packets (93112 bytes)

               349 packets with some dup, data (483560 bytes duped)

               10553 out of order packets (12988300 bytes)

               4 packets (1037213427 bytes) of data after window

               195 window probes

               4458436 window update packets

               3219 packets received after close

               1 segment discarded for bad checksum

               0 bad TCP segments dropped due to state change

       723322 connection requests

       4471901 connection accepts

       5195223 connections established (including accepts)

       5872161 connections closed (including 677123 drops)

       673863 embryonic connections dropped

       26359442 segments updated rtt (of 26359442 attempts)

       46875 retransmit timeouts

               4317 connections dropped by rexmit timeout

       28 persist timeouts

       15476 keepalive timeouts

               15309 keepalive probes sent

               47 connections dropped by keepalive

       0 connect requests dropped due to full queue

       977309 connect requests dropped due to no listener

udp:

       0 incomplete headers

       0 bad checksums

       0 socket overflows

ip:

       85224681 total packets received

       0 bad IP headers

       0 fragments received

       0 fragments dropped (dup or out of space)

       0 fragments dropped after timeout

       0 packets forwarded

       0 packets not forwardable

icmp:

       6517276 calls to generate an ICMP error message

       5665 ICMP messages dropped

       Output histogram:

        echo reply: 415

        destination unreachable: 6511197

        source quench: 0

        routing redirect: 0

        echo: 0

        time exceeded: 0

        parameter problem: 0

        time stamp: 0

        time stamp reply: 0

        address mask request: 0

        address mask reply: 0

       0 bad ICMP messages

       Input histogram:

        echo reply: 1762

        destination unreachable: 6511314

        source quench: 7

        routing redirect: 0

        echo: 415

        time exceeded: 12

        parameter problem: 0

        time stamp request: 0

        time stamp reply: 0

        address mask request: 0

        address mask reply: 0

       415 responses sent

igmp:

       0 messages received

       0 messages received with too few bytes

       0 messages received with bad checksum

       0 membership queries received

       0 membership queries received with incorrect fields(s)

       0 membership reports received

       0 membership reports received with incorrect field(s)

       0 membership reports received for groups to which this host belongs

       0 membership reports sent
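Because netstat -s prints many counters, it can help to extract the few that matter. The sketch below parses the "name = value" pairs of the SUN-style output and derives the share of retransmitted TCP data segments. The helper name parse_counters is hypothetical, the ratio is an illustrative metric rather than a threshold from the text, and the HP-style output would need a different parser.

```python
import re

# Matches pairs such as "tcpRetransSegs      = 831" or "tcpOutDataSegs =477019".
PAIR = re.compile(r"(\w+)\s*=\s*(-?\d+)")


def parse_counters(text: str) -> dict:
    """Collect every name = value pair from SUN-style netstat -s output."""
    return {name: int(value) for name, value in PAIR.findall(text)}


# Two lines copied from the SUN output above.
SAMPLE = """
       tcpOutDataSegs      =477019 tcpOutDataBytes     =178839753
       tcpRetransSegs      = 831 tcpRetransBytes     = 61751
"""

counters = parse_counters(SAMPLE)
retrans_pct = 100.0 * counters["tcpRetransSegs"] / counters["tcpOutDataSegs"]
print(f"retransmitted data segments: {retrans_pct:.3f}%")
```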


Gathering Network Integrity Data from NFS (Network File System)

-         "nfsstat -c" reports the client-side NFS statistics of a system.

    -     The retrans field shows the number of requests this host has had to retransmit as an RPC client. Retransmissions occur while reading or writing NFS files, and if they exceed 5% of the total number of client NFS calls there is a serious problem.

    -     Compare the badxid field with the retrans field: if the two are roughly equal, the NFS server on the network is having trouble answering client requests.

SUN

[inter999:/sapora/user/inter999]nfsstat -c

Client rpc:
Connection oriented:
calls       badcalls    badxids     timeouts    newcreds    badverfs
1331        0           0           0           0           0
timers      cantconn    nomem       interrupts
0           0           0           0
Connectionless:
calls       badcalls    retrans     badxids     timeouts    newcreds
7           1           0           0           0           0
badverfs    timers      nomem       cantsend
0           4           0           0

Client nfs:
calls       badcalls    clgets      cltoomany
7           1           7           0

Version 2: (6 calls)
null        getattr     setattr     root        lookup      readlink
0 0%        5 83%       0 0%        0 0%        0 0%        0 0%
read        wrcache     write       create      remove      rename
0 0%        0 0%        0 0%        0 0%        0 0%        0 0%
link        symlink     mkdir       rmdir       readdir     statfs
0 0%        0 0%        0 0%        0 0%        0 0%        1 16%
Version 3: (0 calls)
null        getattr     setattr     lookup      access      readlink
0 0%        0 0%        0 0%        0 0%        0 0%        0 0%
read        write       create      mkdir       symlink     mknod
0 0%        0 0%        0 0%        0 0%        0 0%        0 0%
remove      rmdir       rename      link        readdir     readdirplus
0 0%        0 0%        0 0%        0 0%        0 0%        0 0%
fsstat      fsinfo      pathconf    commit
0 0%        0 0%        0 0%        0 0%

Client nfs_acl:
Version 2: (1 calls)
null        getacl      setacl      getattr     access
0 0%        0 0%        0 0%        1 100%      0 0%
Version 3: (0 calls)
null        getacl      setacl
0 0%        0 0%        0 0%


HP

$ nfsstat -c

Client rpc:
Connection oriented:
calls       badcalls    badxids
0           0           0
timeouts    newcreds    badverfs
0           0           0
timers      cantconn    nomem
0           0           0
interrupts
0
Connectionless oriented:
calls       badcalls    retrans
2883        0           0
badxids     timeouts    waits
0           0           0
newcreds    badverfs    timers
0           0           21
toobig      nomem       cantsend
0           0           0
bufulocks
0

Client nfs:
calls       badcalls    clgets
2883        0           2883
cltoomany
0

Version 2: (2883 calls)
null        getattr     setattr
0 0%        2864 99%    0 0%
root        lookup      readlink
0 0%        3 0%        0 0%
read        wrcache     write
0 0%        0 0%        0 0%
create      remove      rename
0 0%        0 0%        0 0%
link        symlink     mkdir
0 0%        0 0%        0 0%
rmdir       readdir     statfs
0 0%        12 0%       4 0%
Version 3: (0 calls)
null        getattr     setattr
0 0%        0 0%        0 0%
lookup      access      readlink
0 0%        0 0%        0 0%
read        write       create
0 0%        0 0%        0 0%
mkdir       symlink     mknod
0 0%        0 0%        0 0%
remove      rmdir       rename
0 0%        0 0%        0 0%
link        readdir     readdir+
0 0%        0 0%        0 0%
fsstat      fsinfo      pathconf
0 0%        0 0%        0 0%
commit
0 0%
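The two rules of thumb for reading nfsstat -c (retrans above 5% of calls, and badxid roughly equal to retrans) can be written down as a small check. This is only a sketch: the function name diagnose_nfs_client is hypothetical, and the 20% tolerance used for "roughly equal" is an assumption, since the text gives no exact figure.

```python
from typing import List


def diagnose_nfs_client(calls: int, retrans: int, badxid: int,
                        tolerance: float = 0.2) -> List[str]:
    """Apply the two nfsstat -c rules of thumb and return the findings."""
    findings = []
    if calls <= 0:
        return findings
    retrans_pct = 100.0 * retrans / calls
    if retrans_pct > 5.0:
        findings.append(f"retrans is {retrans_pct:.1f}% of calls (over 5%)")
    # "Roughly equal" is taken as within `tolerance` of retrans (assumption).
    if retrans > 0 and abs(badxid - retrans) <= tolerance * retrans:
        findings.append("badxid is close to retrans: the server is struggling to answer")
    return findings


# Connectionless counters from the HP output above: calls=2883, retrans=0, badxids=0.
print(diagnose_nfs_client(2883, 0, 0))
```

For the HP client shown above the list is empty, which matches the output: there are no retransmissions at all.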


Network and CPU Load

When the CPU is heavily loaded, network performance drops. The spray utility can be used to check whether a system's CPU is keeping up.

[inter999:/sapora/user/inter999]spray localhost

sending 1162 packets of length 86 to localhost ...

       164 packets (14.114%) dropped by localhost

       66 packets/sec, 5706 bytes/sec


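The key figure is the number of dropped packets. A drop rate of 5% or less is not a problem, but a high rate shows that packets are being generated faster than the receiving host can accept them. In other words, the host is not fast enough to keep up with the network, which points to a heavily loaded CPU.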
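As an illustration, the drop rate can be pulled out of the spray output and compared with the 5% guideline. The sketch below assumes the output format shown in the sample run above; the names spray_drop_pct and cpu_suspect are hypothetical, and the parser returns None when it finds no "dropped by" line.

```python
import re
from typing import Optional

# Matches a line such as "164 packets (14.114%) dropped by localhost".
DROP_LINE = re.compile(r"(\d+) packets \(([\d.]+)%\) dropped by (\S+)")


def spray_drop_pct(output: str) -> Optional[float]:
    """Return the reported drop percentage, or None if no such line exists."""
    match = DROP_LINE.search(output)
    return float(match.group(2)) if match else None


def cpu_suspect(drop_pct: float, limit: float = 5.0) -> bool:
    """True when the drop rate exceeds the 5% guideline from the text."""
    return drop_pct > limit


# Output copied from the sample run above.
SAMPLE = """sending 1162 packets of length 86 to localhost ...
       164 packets (14.114%) dropped by localhost
       66 packets/sec, 5706 bytes/sec
"""

drop = spray_drop_pct(SAMPLE)
print(drop, cpu_suspect(drop))
```

For the sample run the reported rate is well above 5%, so the host would be flagged.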

Reducing the NFS Workload

To reduce the workload on an NFS server, edit the /etc/fstab file on the client system to increase the read and write buffer sizes. For example, if the page size of both systems is 4096 bytes:

server:/remfs/dataspace /space nfs rw,hard,wsize=4096,rsize=4096 0 0

The system page size can be checked with the "pagesize" command. rsize and wsize apply only to remote filesystems and must not be used for local filesystems.
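The fstab entry above can also be generated from the page size instead of being typed by hand. The following sketch uses mmap.PAGESIZE, which reports the page size of the machine the script runs on; the function name nfs_fstab_entry is hypothetical, and the script only prints the line, it does not modify /etc/fstab.

```python
import mmap


def nfs_fstab_entry(remote: str, mount_point: str, buffer_size: int) -> str:
    """Build an NFS fstab line with matching read and write buffer sizes."""
    options = f"rw,hard,wsize={buffer_size},rsize={buffer_size}"
    return f"{remote} {mount_point} nfs {options} 0 0"


# mmap.PAGESIZE is the local page size; the server's page size must be
# checked separately on the server (for example with the pagesize command).
print(nfs_fstab_entry("server:/remfs/dataspace", "/space", mmap.PAGESIZE))
```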

Timeout

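If an NFS client does not receive a response to an NFS request within a given period, a timeout occurs. This means the NFS server is so heavily loaded that it cannot process NFS requests quickly enough.

In that case, timeouts can be avoided by increasing the timeout period in the /etc/fstab file.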

server:/mf /mf nfs noquota,hard,bg,intr,timeo=15 0 0

(timeo is given in tenths of a second, so this sets the timeout period to 1.5 seconds.)

The "nfsstat -c" command shows the number of timeouts; if they exceed 5% of the number of calls, there is a problem.



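The timeo setting and the 5% timeout guideline from the Timeout section can be checked with a few lines of code. This is a minimal sketch: the helper names timeo_seconds and timeout_pct are hypothetical, and the timeout and call counts would have to be read from nfsstat -c by hand or by a separate parser.

```python
from typing import Optional


def timeo_seconds(mount_options: str) -> Optional[float]:
    """Convert the timeo option (tenths of a second) into seconds."""
    for option in mount_options.split(","):
        key, _, value = option.strip().partition("=")
        if key == "timeo" and value.isdigit():
            return int(value) / 10
    return None  # timeo not present in the option string


def timeout_pct(timeouts: int, calls: int) -> float:
    """Timeouts as a percentage of calls; 0.0 when there were no calls."""
    return 100.0 * timeouts / calls if calls > 0 else 0.0


# Option string from the fstab entry in the Timeout section.
print(timeo_seconds("noquota,hard,bg,intr,timeo=15"))
# Hypothetical counters: 120 timeouts in 2000 calls is above the 5% guideline.
print(timeout_pct(120, 2000) > 5.0)
```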
