FastClick
=========
This is an extended version of the Click Modular Router featuring
improved Netmap support and new DPDK support. It is the result of
our ANCS paper, available at http://hdl.handle.net/2268/181954 .

Partial DPDK support has since been merged into vanilla Click (without support for
batching, auto-thread assignment, thread vector, ...).

Netmap
------
Be sure to install Netmap on your system, then configure with:
```bash
./configure --with-netmap --enable-netmap-pool --enable-multithread --disable-linuxmodule --enable-intel-cpu --enable-user-multithread --verbose --enable-select=poll CFLAGS="-O3" CXXFLAGS="-std=gnu++11 -O3" --disable-dynamic-linking --enable-poll --enable-bound-port-transfer --enable-local --enable-zerocopy --enable-batch
```
to get the best performance.

An example configuration is:
```
FromNetmapDevice(netmap:eth0) -> CheckIPHeader() -> ToNetmapDevice(netmap:eth1)
```

To run Click, do:
```bash
sudo bin/click -j 4 -a /path/to/config/file.click
```
where 4 is the number of threads to use. The FromNetmapDevice elements will share the assigned cores among themselves; do not pin the threads.

We noted that Netmap performs better without multi-queue (MQ), or at least with a minimal number of queues:
`ethtool -L eth% combined 1`
will set the number of Netmap queues to 1. There is no need to pin the IRQs of the queues, as our FastClick implementation will
take care of it. Just kill irqbalance.

Also, be sure to read the sections of our paper about full push to build the fastest configuration.

The `--enable-netmap-pool` option uses Netmap buffers instead of Click malloc'ed buffers. This improves performance, as there is only one kind of buffer circulating in Click. However, with this option you need to place at least one From/ToNetmapDevice in your configuration and allocate enough Netmap buffers using NetmapInfo.

DPDK
----
Set up your DPDK environment (versions 1.6 to 17.05 are supported), then configure with:
```bash
./configure --enable-multithread --disable-linuxmodule --enable-intel-cpu --enable-user-multithread --verbose CFLAGS="-g -O3" CXXFLAGS="-g -std=gnu++11 -O3" --disable-dynamic-linking --enable-poll --enable-bound-port-transfer --enable-dpdk --enable-batch --with-netmap=no --enable-zerocopy --enable-dpdk-pool --disable-dpdk-packet
```
to get the best performance.

An example configuration is:
```
FromDPDKDevice(0) -> CheckIPHeader(OFFSET 14) -> ToDPDKDevice(1)
```

To run Click with DPDK, you can add the usual EAL parameters:
```bash
sudo bin/click --dpdk -c 0xf -n 4 -- /path/to/config/file.click
```
where 4 is the number of memory channels and 0xf is the core mask.
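
As a small illustration of how the `-c` core mask relates to a core count, the sketch below builds a mask with the lowest N bits set, so 4 cores give `0xf`. The helper name `core_mask` is hypothetical and is not part of Click or DPDK; it only assumes that the mask is a bitmask with one bit per core, starting from core 0.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of Click or DPDK): returns a bitmask with
// bits 0..n_cores-1 set, i.e. a mask selecting the first n_cores cores.
std::uint64_t core_mask(unsigned n_cores) {
    if (n_cores >= 64) {
        // Shifting a 64-bit value by 64 or more is undefined, so saturate.
        return ~std::uint64_t{0};
    }
    return (std::uint64_t{1} << n_cores) - 1;
}
```

Printing `core_mask(4)` in hexadecimal gives the `0xf` used in the command above. A mask that skips some cores has to be written by hand, one bit per core.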

DPDK only supports full push mode.

As with Netmap, the `--enable-dpdk-pool` option uses only DPDK buffers instead of Click malloc'ed buffers.
The `--enable-dpdk-packet` option uses DPDK's packet handling mechanism instead of Click's Packet object. All Packet functions are replaced by wrappers around DPDK's rte\_pktmbuf functions. However, while this feature reduces the memory footprint, it does not improve performance: Packet objects are recycled in LIFO order and stay in cache, while every new access to metadata inside the rte\_mbuf produces a cache miss.

Examples
--------
See conf/fastclick/README.md

How to make an element batch-compatible
---------------------------------------
FastClick is backward compatible with all vanilla elements, and it should work
out of the box with your own library. However, Click may un-batch and re-batch
packets along the path. This is not an issue for most slow-path elements, such
as ICMP error elements, where the development cost is not worth it. However,
you probably want to have only batch-compatible elements in your fast path, as
this will be much faster.

Batch-compatible elements should extend the BatchElement class instead of the
Element class. They also have to implement a version of push called
push\_batch, which receives a PacketBatch\* argument instead of a Packet\*.

The reason why a batch element must still provide a good old push function is that
it may not be worth rebuilding a batch before your element and then
un-batching it because your element sits between two vanilla elements. In this case
the push version of your element will be used.

To let Click compile with `--disable-batch`, always enclose the push\_batch prototype
and implementation in `#if HAVE_BATCH` ... `#endif`.
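
The sketch below illustrates the shape described above: one element offering both a per-packet push and a push\_batch guarded by `HAVE_BATCH`. It is self-contained, so `Packet`, `PacketBatch`, `Element` and `BatchElement` here are simplified stand-ins and not the real FastClick headers. The `(int port, ...)` signatures and the `DecTTL` element are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// In a real build the configure script decides HAVE_BATCH; we default it to 1
// here so that the sketch compiles on its own.
#ifndef HAVE_BATCH
#define HAVE_BATCH 1
#endif

// --- Simplified stand-ins, NOT the real Click/FastClick classes ---
struct Packet { int ttl; };
struct PacketBatch { std::vector<Packet*> packets; };

class Element {
public:
    virtual ~Element() {}
    virtual void push(int port, Packet *p) = 0;
};

class BatchElement : public Element {
public:
#if HAVE_BATCH
    virtual void push_batch(int port, PacketBatch *batch) = 0;
#endif
};

// Hypothetical element: decrements a TTL-like field on every packet.
class DecTTL : public BatchElement {
public:
    std::size_t seen = 0;

    // Per-packet path, used when the element sits between vanilla elements.
    void push(int, Packet *p) override { process(p); }

#if HAVE_BATCH
    // Batch path, compiled only when batching is enabled.
    void push_batch(int, PacketBatch *batch) override {
        for (Packet *p : batch->packets)
            process(p);
    }
#endif

private:
    void process(Packet *p) { --p->ttl; ++seen; }
};
```

A real element would also forward the packet or batch to its output port. That step is left out here because it depends on the actual FastClick API.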

If your element must use batching, that is, if only push\_batch is implemented or
your element always produces batches no matter the input, you
will want to set batch\_mode=BATCH\_MODE\_YES in the constructor. This lets
the backward-compatibility manager know that subsequent elements will receive
batches, and that previous elements must send batches or let the backward-compatibility
manager rebuild a batch before passing it to your element. The default is
BATCH\_MODE\_IFPOSSIBLE, meaning that the element should run in batch mode if it can;
vanilla elements are fixed to BATCH\_MODE\_NO.
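
A minimal sketch of the three modes, assuming the enum and member names given in the text above. The classes are stand-ins rather than the real FastClick headers, and the `mode()` accessor and `AlwaysBatch` element are hypothetical, added only so the result can be inspected.

```cpp
#include <cassert>

// Names taken from the text above; the definitions are stand-ins.
enum BatchMode { BATCH_MODE_NO, BATCH_MODE_IFPOSSIBLE, BATCH_MODE_YES };

// Stand-in for a vanilla element: fixed to BATCH_MODE_NO.
struct Element {
    virtual ~Element() {}
    virtual BatchMode mode() const { return BATCH_MODE_NO; }  // hypothetical accessor
};

// Stand-in for BatchElement: defaults to BATCH_MODE_IFPOSSIBLE.
struct BatchElement : Element {
    BatchMode batch_mode = BATCH_MODE_IFPOSSIBLE;
    BatchMode mode() const override { return batch_mode; }
};

// Hypothetical element that only makes sense with batches, so it requests
// BATCH_MODE_YES in its constructor as described above.
struct AlwaysBatch : BatchElement {
    AlwaysBatch() { batch_mode = BATCH_MODE_YES; }
};
```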

If you provide `--enable-auto-batch`, vanilla Elements will be set to mode
BATCH\_MODE\_IFPOSSIBLE, with a special push\_batch function which simply
calls push() for each packet. However, the push ports of the elements will
rebuild batches instead of letting them go through.

Without auto-batch, batches are un-batched before a vanilla Element and
re-batched when hitting the next BatchElement. This is referred to as the "jump"
mode, as the batch "jumps over" the vanilla Element. This is the behaviour
described in the ANCS paper and is still the default mode.
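
The "jump" can be pictured with the toy model below. It is not FastClick's actual code: `Batch`, `VanillaFilter` and `jump_over` are hypothetical names, and the vanilla element returns the packet to forward (or `nullptr` to drop it) instead of pushing to an output port, which keeps the sketch self-contained.

```cpp
#include <cassert>
#include <vector>

struct Packet { int id; };
using Batch = std::vector<Packet*>;  // stand-in for PacketBatch

// Stand-in for a vanilla element with a per-packet push. It forwards
// packets with an even id and drops the others.
struct VanillaFilter {
    Packet *push(Packet *p) { return (p->id % 2 == 0) ? p : nullptr; }
};

// Toy "jump": un-batch before the vanilla element, push each packet
// individually, then rebuild a batch for the next batch-aware element.
Batch jump_over(VanillaFilter &element, const Batch &in) {
    Batch out;
    for (Packet *p : in) {
        if (Packet *forwarded = element.push(p))
            out.push_back(forwarded);
    }
    return out;
}
```

The cost this models is the per-packet call in the middle of the path, which is why the fast path should contain only batch-compatible elements.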

Continuous integration and `make check`
---------------------------------------
To ensure that people not familiar with batching are warned about configurations that include non-batch-compatible elements, some messages are printed to flag a potentially slower configuration. However, testies (used by `make check`) does not cope well with those messages on stdout. To disable them and allow `make check` to run, you must pass `--disable-verbose-batch` to configure.

This repository uses Travis CI for CI tests, which run `make check` under various combinations of configure options. We also have a GitLab CI for internal tests.

Differences with the ANCS paper
-------------------------------
For simplicity, we refer to all input elements as "FromDevice" and output
elements as "ToDevice". However, in practice our I/O elements are
FromNetmapDevice/ToNetmapDevice and FromDPDKDevice/ToDPDKDevice. Both pairs
inherit from QueueDevice, which is a generic abstract element to implement a
device which supports multiple queues (or, more generically, I/O through
multiple different threads).

Thread vector and bit vector designate the same thing.

The `--enable-dpdk-packet` flag uses the metadata of the DPDK packets
and uses the Click Packet class only as a wrapper; as such, the Click buffer
and the Click pool are completely unused. However, we did not speak of that feature
in the paper, as it does not improve performance. DPDK metadata is written
at the beginning of the packet buffer, and writing the huge Click annotation
space (~164 bytes) leads to more cache misses than with the Click pool, where a
few Click Packet descriptors are re-used to "link" to different DPDK buffers
using the pool recycling mechanism. Even when reducing the annotation to a
minimal size (DPDK metadata + next + prev + transport header + ...), this still
forces us to fetch a new cache line.
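
The recycling effect can be sketched with a LIFO free list: the descriptor released last is the one handed out next, so a handful of descriptors keep being re-linked to different buffers. This is a simplified model under our own assumptions, not the Click pool implementation; `PacketDescriptor` and `LifoPool` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a Click Packet descriptor that "links" to a data buffer.
struct PacketDescriptor { void *buffer = nullptr; };

// Hypothetical LIFO pool: release() pushes on a stack, acquire() pops it.
class LifoPool {
public:
    explicit LifoPool(std::size_t n) : storage_(n) {
        // storage_ is never resized, so these pointers stay valid.
        for (PacketDescriptor &d : storage_)
            free_.push_back(&d);
    }

    // Returns the most recently released descriptor, linked to `buffer`,
    // or nullptr when the pool is exhausted.
    PacketDescriptor *acquire(void *buffer) {
        if (free_.empty())
            return nullptr;
        PacketDescriptor *d = free_.back();
        free_.pop_back();
        d->buffer = buffer;
        return d;
    }

    void release(PacketDescriptor *d) {
        d->buffer = nullptr;
        free_.push_back(d);
    }

private:
    std::vector<PacketDescriptor> storage_;
    std::vector<PacketDescriptor*> free_;
};
```

Acquiring right after a release returns the same descriptor with a new buffer attached, which is the re-use pattern the paragraph above credits for keeping descriptors in cache.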

Getting help
------------
Use the GitHub issue tracker (https://github.com/tbarbette/fastclick/issues) or
contact tom.barbette at ulg.ac.be if you encounter any problem.

Please do not post FastClick-related problems on the vanilla Click mailing list.
If you are sure that your problem is Click related, post it on vanilla Click's
issue tracker (https://github.com/kohler/click/issues).

The original Click readme is available in the README.original file.