Abstract: Recording flow statistics for every network packet is resource-intensive, so various sampling techniques are used to estimate flow statistics instead. However, the accuracy of estimates derived from sampled data remains a significant challenge. This paper introduces two algorithms, denoted Integral and Iteration, that infer the number of original flows from sampled flow records. The Integral algorithm uses only the number of sampled flows containing exactly one sampled packet to approximate the number of unsampled flows. The Iteration algorithm estimates the number of unsampled flows iteratively. The number of original flows can then be estimated from the numbers of sampled and unsampled flows. Both algorithms are compared with the EM (expectation maximization) algorithm on multiple traffic traces collected from the CERNET (China Education and Research Network) backbone. The results show that the Iteration algorithm outperforms the EM algorithm and provides highly accurate estimates of the number of original flows.
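To make the quantities in the abstract concrete, the following minimal sketch simulates packet sampling on a synthetic set of flows and counts the sampled flows, the sampled flows with exactly one sampled packet, and the unsampled flows. It is not the Integral or Iteration algorithm. The independent per-packet sampling with probability `p`, the function name `sample_flows`, and the flow-size mix are all assumptions made for illustration.

```python
import random


def sample_flows(flow_sizes, p, seed=0):
    """Sample each packet independently with probability p (an assumed scheme).

    Returns (sampled, single, unsampled):
      sampled   - flows with at least one sampled packet
      single    - sampled flows with exactly one sampled packet
      unsampled - flows with no sampled packet (invisible to the collector)
    """
    rng = random.Random(seed)
    sampled = single = unsampled = 0
    for size in flow_sizes:
        # Count how many of this flow's packets were picked by the sampler.
        hits = sum(1 for _ in range(size) if rng.random() < p)
        if hits == 0:
            unsampled += 1
        else:
            sampled += 1
            if hits == 1:
                single += 1
    return sampled, single, unsampled


# Hypothetical traffic mix: many short flows, few long ones.
flow_sizes = [1] * 600 + [5] * 300 + [50] * 100
sampled, single, unsampled = sample_flows(flow_sizes, p=0.1)

print("original flows :", len(flow_sizes))
print("sampled flows  :", sampled)
print("  with 1 packet:", single)
print("unsampled flows:", unsampled)
```

In the simulation the number of unsampled flows is known directly, and the number of original flows is the sum of the sampled and unsampled counts. In a real measurement only `sampled` and `single` are observable, which is why the unsampled count has to be estimated.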