[Review]: Packet Drop vs Packet Loss – Linux


What Are Packet Drop and Packet Loss?


Packet loss occurs when one or more packets fail to reach their destination because of an issue such as link congestion. TCP can detect packet loss and retransmit the packet (during its recovery process), but packet loss still affects users of streaming media applications and, more generally, any application that uses an unreliable protocol such as UDP.

Packet drop typically means that a packet is discarded at some layer after it has been processed. Under some conditions, packet drop is one of the causes of data loss.

What’s the Difference Between Packet Drop and Packet Loss?


When a packet drop occurs, the packet is discarded by the receiver or the sender for reasons such as:

  1. Softnet backlog full  — (Measured from /proc/net/softnet_stat)
  2. Bad / Unintended VLAN tags
  3. Unknown / Unregistered protocols
  4. IPv6 frames when the server is not configured for IPv6
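
As a rough illustration of the first item, the sketch below parses text in the format of /proc/net/softnet_stat. It assumes the commonly documented layout: one row per CPU, hexadecimal columns, with the first column counting processed packets, the second counting packets dropped because the backlog was full, and the third counting "time squeeze" events. The sample rows and the function name parse_softnet_stat are made up for the example.

```python
# Illustrative sample in the /proc/net/softnet_stat format (values are invented).
SOFTNET_SAMPLE = """\
0000272d 00000000 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000
000034d9 00000002 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
"""


def parse_softnet_stat(text):
    """Return one dict per row with the first three counters decoded from hex."""
    rows = []
    for index, line in enumerate(text.splitlines()):
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or truncated lines
        processed, dropped, time_squeeze = (int(f, 16) for f in fields[:3])
        rows.append({
            "row": index,                  # usually corresponds to a CPU
            "processed": processed,        # packets handled by this CPU
            "dropped": dropped,            # dropped because the backlog was full
            "time_squeeze": time_squeeze,  # polling ended with work remaining
        })
    return rows


for row in parse_softnet_stat(SOFTNET_SAMPLE):
    print(row)
```

On a Linux host, the same function can be fed the contents of /proc/net/softnet_stat instead of the sample. A non-zero and growing second column is the "softnet backlog full" case from the list.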

Now, another question:

Is every dropped packet bad?

Not really, because dropped packets are often corrupted, carry a bad address (we’ll review this in an example), or have something else wrong with them.

All packets with an incorrect checksum are dropped, and that is a good thing, not a bad one.
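
To make the checksum case concrete, here is a minimal sketch of the 16-bit Internet checksum described in RFC 1071, which is the kind of checksum carried in IPv4, TCP and UDP headers. It is not the kernel's implementation; the function names and the example payload are chosen only for illustration. A receiver sums the data together with the transmitted checksum, and any result other than zero means the packet is corrupted and should be discarded.

```python
def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit big-endian words, then complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def is_intact(data_with_checksum: bytes) -> bool:
    """A packet that includes a valid checksum sums to zero."""
    return internet_checksum(data_with_checksum) == 0


payload = bytes.fromhex("0001f203f4f5f6f7")
packet = payload + internet_checksum(payload).to_bytes(2, "big")
corrupted = b"\x01" + packet[1:]  # flip one bit in the first byte

print(is_intact(packet))     # intact packet is accepted
print(is_intact(corrupted))  # corrupted packet would be dropped
```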

But when a lot of packets are being dropped, something is wrong! There may be an issue on the physical layer (in most cases), or network performance tuning may be needed.

Read the articles below about network performance tuning:

Red Hat Enterprise Linux Network Performance Tuning Guide

PERFORMANCE TUNING GUIDE

Red Hat Enterprise Linux Network Performance Tuning Guide (PDF)

Linux Network Receive Stack – Red Hat People

Based on my experience, when a few packets are dropped periodically (for example, 1 packet every 2 seconds), there is no cause for concern and the drops can be ignored. This happens when protocols such as Spanning Tree are running on the network switch.

As the first step, you must find out which packets are being dropped by analyzing the traffic.

How many dropped packets can be ignored?

There is a good answer at the link below:

Should I be concerned about a 0.05% packet drop rate?

Packet drops in a network are normal and expected.

One of TCP’s main functions is to preserve a reliable data stream for applications by identifying and retrying any lost packets. It is normal operation for an underlying network to lose packets, for many different reasons, and for TCP to be used to hide this fact from application layers. Applications using unreliable transports (the major one being UDP) are expected to not care about the unreliability of the network. Applications which do care should use a reliable transport, e.g. TCP (or more recently, SCTP), or implement their own reliability mechanisms.

The only impact of packet loss in the underlying network (which includes the device-drivers and lower network layer code which is counting the drops you are seeing) is a reduction in performance from the peak network speed. The reduction is dependent, non-linearly, on the packet drop ratio, and also on the traffic type. For bulk transfers (such as file-transfer) over TCP a drop ratio of under 0.1% results in a very small reduction in throughput, assuming that the Selective Acknowledgement (SACK) TCP option is enabled on both systems. If SACK is disabled on either system there will be a measurable reduction in throughput if the drop rate exceeds 0.001%. On the other hand, thin-stream traffic, where there is often no outstanding data for the transmitter to send, will be affected more, as the need to retry lost packets will only be noticed after a timeout rather than being detectable when data arrives out-of-order. Examples of thin-stream traffic include request-response and interactive connections.
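
To compare your own counters with the percentages quoted above, the following sketch computes a receive drop percentage per interface from text in the format of /proc/net/dev. It assumes the usual layout (two header lines, then one line per interface with received packets in the second numeric column and received drops in the fourth). The sample counters and the function name rx_drop_percent are invented for the example, and whether the packet counter already includes dropped packets can vary by driver, so treat the result as an approximation.

```python
# Illustrative sample in the /proc/net/dev format (counters are invented).
NETDEV_SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:  123456    1000    0    0    0     0          0         0   123456    1000    0    0    0     0       0          0
  eth0: 9876543  200000    0  100    0     0          0         0  5555555  150000    0    0    0     0       0          0
"""


def rx_drop_percent(text):
    """Map each interface name to rx_drop / rx_packets as a percentage."""
    result = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        name, sep, counters = line.partition(":")
        if not sep:
            continue
        fields = counters.split()
        rx_packets, rx_drop = int(fields[1]), int(fields[3])
        result[name.strip()] = 100.0 * rx_drop / rx_packets if rx_packets else 0.0
    return result


for iface, percent in rx_drop_percent(NETDEV_SAMPLE).items():
    print(f"{iface}: {percent:.3f}% of received packets dropped")
```

Because these counters are cumulative since boot, sampling them twice and comparing the differences gives a better picture of the current drop rate than a single reading.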

Now we know that there is a difference between the two concepts.

Example!


Bonding is the Linux equivalent of NIC teaming and helps you achieve high availability, load balancing and link aggregation with two or more NICs.

Bonding has seven modes:

  • Mode 0 (Balance-RR)
    This mode transmits packets in a sequential order from the first available slave through the last. If two real interfaces are slaves in the bond and two packets arrive destined out of the bonded interface the first will be transmitted on the first slave and the second frame will be transmitted on the second slave. The third packet will be sent on the first and so on. This provides load balancing and fault tolerance.
  • Mode 1 (Active-Backup)
    Mode 1 places one of the interfaces into a backup state and will only make it active if the link is lost by the active interface. Only one slave in the bond is active at any given time. A different slave becomes active only when the active slave fails. This mode provides fault tolerance.
  • Mode 2 (Balance-XOR)
    Transmits based on an XOR formula: (source MAC address XOR’d with destination MAC address) modulo slave count. This selects the same slave for each destination MAC address and provides load balancing and fault tolerance.
  • Mode 3 (Broadcast)
    The broadcast mode transmits everything on all slave interfaces. This mode is least used (only for specific purpose) and provides only fault tolerance.
  • Mode 4 (802.3ad)
    The 802.3ad mode is known as Dynamic Link Aggregation mode. It creates aggregation groups that share the same speed and duplex settings. This mode requires a switch that supports IEEE 802.3ad dynamic link aggregation. Slave selection for outgoing traffic is done according to the transmit hash policy, which may be changed from the default simple XOR policy via the xmit_hash_policy option. Note that not all transmit policies may be 802.3ad compliant, particularly in regards to the packet mis-ordering requirements of section 43.2.4 of the 802.3ad standard. Differing peer implementations will have varying tolerances for noncompliance.
  • Mode 5 (Balance-TLB)
    This is called adaptive transmit load balancing. The outgoing traffic is distributed according to the current load and queue on each slave interface. Incoming traffic is received by the current slave.
  • Mode 6 (Balance-ALB)
    This is adaptive load balancing mode. It includes Balance-TLB plus Receive Load Balancing (RLB) for IPv4 traffic. The receive load balancing is achieved by ARP negotiation. The bonding driver intercepts the ARP replies sent by the server on their way out and overwrites the source hardware address with the unique hardware address of one of the slaves in the bond, so that different clients use different hardware addresses for the server.
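
The slave selection described for Mode 0 and Mode 2 comes down to simple index arithmetic. The sketch below follows the wording of the list above rather than the bonding driver's source: it XORs the full 48-bit MAC addresses, whereas the driver's actual hash policies differ in detail. The function names and MAC addresses are hypothetical.

```python
def mac_to_int(mac: str) -> int:
    """Convert 'aa:bb:cc:dd:ee:ff' to a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


def round_robin_slave(packet_number: int, slave_count: int) -> int:
    """Mode 0 idea: packets go to slaves 0, 1, ..., n-1, 0, 1, ..."""
    return packet_number % slave_count


def xor_slave(src_mac: str, dst_mac: str, slave_count: int) -> int:
    """Mode 2 idea: (source MAC XOR destination MAC) modulo slave count."""
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % slave_count


SERVER = "52:54:00:00:00:01"  # hypothetical MAC of the bonded host
for peer in ("52:54:00:00:00:02", "52:54:00:00:00:03"):
    print(peer, "-> slave", xor_slave(SERVER, peer, slave_count=2))

print([round_robin_slave(n, 2) for n in range(4)])
```

Since the XOR result depends only on the two addresses, traffic to a given destination always leaves through the same slave, which is why Mode 2 keeps packets of one conversation in order while Mode 0 may not.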

Mode 1, or Active-Backup mode, is one of the most popular modes. The bond configuration includes at least two NICs: one NIC is active and the other is passive. The MAC address of the bond interface can switch dynamically between the NICs’ MAC addresses, so the network always sees a single MAC address.
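
To see which NIC is currently active and which is passive, the bonding driver exposes a status file per bond under /proc/net/bonding/. The sketch below parses text in that format and relies only on the "Currently Active Slave" and "Slave Interface" lines. The sample is abbreviated and illustrative, and the function name bond_roles is made up for the example.

```python
# Abbreviated, illustrative sample in the /proc/net/bonding/bondN format.
BOND_SAMPLE = """\
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth0
MII Status: up

Slave Interface: eth0
MII Status: up

Slave Interface: eth1
MII Status: up
"""


def bond_roles(text):
    """Return (active_slave, [backup_slaves]) from bonding status text."""
    active = None
    slaves = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or line without a key
        key, value = key.strip(), value.strip()
        if key == "Currently Active Slave":
            active = value
        elif key == "Slave Interface":
            slaves.append(value)
    backups = [name for name in slaves if name != active]
    return active, backups


active, backups = bond_roles(BOND_SAMPLE)
print("active:", active)
print("passive:", ", ".join(backups))
```

The interfaces reported as passive here are the ones whose drop counters are discussed next.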

But “ifconfig” shows that the passive NIC has dropped packets!

These are exactly the kind of dropped packets we talked about earlier. These drops are not bad and have no impact on network performance.

You can follow the instructions below to prevent packets from being dropped when “lldpad” is running:

To prevent it from interfering, you have to disable LLDP messages on the underlying interfaces of the bond devices:

# lldptool -L -i ethX adminStatus=disabled
# lldptool -L -i ethY adminStatus=disabled

If the lldpad daemon is required for operations, enable it on the bond interface only:

# lldptool -L -i bondN adminStatus=rxtx