Abstract
The archiving of Internet traffic is an essential function for retrospective network event analysis and forensic computer communication. The state-of-the-art approach for network monitoring and analysis involves storage and analysis of network flow statistic. However, this approach loses much valuable information within the Internet traffic. With the advancement of commodity hardware, in particular the volume of storage devices and the speed of interconnect technologies used in network adapter cards and multi-core processors, it is now possible to capture 10 Gbps and beyond real-time network traffic using a commodity computer, such as n2disk. Also with the advancement of distributed file system (such as Hadoop, ZFS, etc.) and open cloud computing platform (such as OpenStack, CloudStack, and Eucalyptus, etc.), it is practical to store such large volume of traffic data and fully in-depth analyse the inside communication within an acceptable latency. In this paper, based on well-known TimeMachine, we present TIFAflow, the design and implementation of a novel system for archiving and querying network flows. Firstly, we enhance the traffic archiving system named TImemachine+FAstbit (TIFA) with flow granularity, i.e., supply the system with flow table and flow module. Secondly, based on real network traces, we conduct performance comparison experiments of TIFAflow with other implementations such as common database solution, TimeMachine and TIFA system. Finally, based on comparison results, we demonstrate that TIFAflow has a higher performance improvement in storing and querying performance than TimeMachine and TIFA, both in time and space metrics.