and authenticity in the connections among OpenFlow switches and
controllers.
-To compile the datapath kernel module, you will additionally need:
-
- - A supported Linux kernel version. Please refer to README for a
- list of supported versions.
-
- The OpenFlow datapath requires bridging support (CONFIG_BRIDGE)
- to be built as a kernel module. (This is common in kernels
- provided by Linux distributions.) The bridge module must not be
- loaded or in use. If the bridge module is running (check with
- "lsmod | grep bridge"), you must remove it ("rmmod bridge")
- before starting the datapath.
-
- - The correct version of GCC for the kernel that you are building
- the module against:
-
- * To build a kernel module for a Linux 2.6 kernel, you need
- the same version of GCC that was used to build that kernel
- (usually version 4.0 or later).
-
- * To build a kernel module for a Linux 2.4 kernel, you need an
- earlier version of GCC, typically GCC 2.95, 3.3, or 3.4.
-
- - A kernel build directory corresponding to the Linux kernel image
- the module is to run on. Under Debian and Ubuntu, for example,
- each linux-image package containing a kernel binary has a
- corresponding linux-headers package with the required build
- infrastructure.
-
If you are working from a Git tree or snapshot (instead of from a
distribution tarball), or if you modify the OpenFlow build system, you
will also need the following software:
- pkg-config (http://pkg-config.freedesktop.org/wiki/). We test
with version 0.22.
-Building the Code
------------------
+The optional Linux module has additional prerequisites, described
+later in the section "Building and Testing the Linux Kernel-Based
+Switch".
-1. In the top source directory, configure the package by running the
- configure script. To compile without building a kernel module, you
- can usually invoke configure without any arguments:
- % ./configure
+Building Userspace Programs
+---------------------------
- To build a kernel module as well as the rest of the distribution,
- pass the location of the kernel build directory as an argument.
- Use --with-l26 for Linux 2.6, --with-l24 for Linux 2.4. For
- example, to build for a running instance of Linux 2.6:
- % ./configure --with-l26=/lib/modules/`uname -r`/build
+These instructions describe how to build the userspace components of
+the OpenFlow distribution. Refer to "Building and Testing the Linux
+Kernel-Based Switch", below, for additional instructions on how to
+build the optional Linux kernel module.
- To build for a running instance of Linux 2.4:
- % ./configure --with-l24=/lib/modules/`uname -r`/build
+1. In the top source directory, configure the package by running the
+ configure script. You can usually invoke configure without any
+ arguments:
+
+ % ./configure
To use a specific C compiler for compiling OpenFlow user programs,
also specify it on the configure command line, like so:
+
% ./configure CC=gcc-4.2
The configure script accepts a number of other options and honors
configure with the --help option.
2. Run make in the top source directory:
+
% make
The following binaries will be built:
- Datapath kernel module:
- datapath/linux-2.6/openflow_mod.ko (if --with-l26 was specified)
- datapath/linux-2.4/openflow_mod.o (if --with-l24 was specified)
+ - Switch executable: switch/switch. This executable is built
+ only if the configure script detects a supported interface to
+ network devices. Refer to README for a list of OSes whose
+ network device interfaces are supported.
- Secure channel executable:
- secchan/secchan
+ - Secure channel executable: secchan/secchan.
- Controller executable:
- controller/controller
+ - Controller executable: controller/controller.
- Datapath administration utility:
- utilities/dpctl
+ - Datapath administration utility: utilities/dpctl.
- Runtime logging configuration utility:
- utilities/vlogconf
+ - Runtime logging configuration utility: utilities/vlogconf.
3. (Optional) Run "make install" to install the executables and
manpages into the running system, by default under /usr/local.
-Installing the datapath
------------------------
-
-To run the module, simply insmod it:
-
- (Linux 2.6)
- % insmod datapath/linux-2.6/openflow_mod.ko
-
- (Linux 2.4)
- % insmod datapath/linux-2.4/compat24_mod.o
- % insmod datapath/linux-2.4/openflow_mod.o
-
+Testing Userspace Programs
+--------------------------
-Testing the datapath
---------------------
+1. Start the OpenFlow controller running in the background, by running
+ the "controller" program with a command like the following:
-Once the OpenFlow datapath has been installed (you can verify that it is
-running if it appears in lsmod's listing), you can configure it using
-the dpctl command line utility.
+ % controller ptcp: &
-1. Create a datapath instance. The command below creates a datapath with
- ID 0 (see dpctl(8) for more detailed usage information).
- % dpctl adddp 0
+ This command causes the controller to bind to port 975 (the
+ default) awaiting connections from OpenFlow switches. See
+ controller(8) for details.
- (note, while in principle openflow_mod supports multiple datapaths
- within the same host, this is rarely useful in practice)
+2. On the same machine, use the "switch" program to start an OpenFlow
+ switch, specifying network devices to use as switch ports on the -i
+ option as a comma-separated list, like so:
-2. Use dpctl to attach the datapath to physical interfaces on the
- machine. Say, for example, you want to create a trivial 2-port
- switch using interfaces eth1 and eth2, you would issue the following
- commands:
- % dpctl addif 0 eth1
- % dpctl addif 0 eth2
-
- You can verify that the interfaces were successfully added by asking
- dpctl to print the current status of datapath 0:
- % dpctl show 0
-
-3. (Optional) You can manually add flows to the datapath to test using
- dpctl add-flows and view them using dpctl dump-flows. See dpctl(8)
- for more details.
-
-4. The simplest way to test the datapath is to run the provided sample
- controller on the host machine to manage the datapath directly using
- netlink:
- % controller -v nl:0
-
- Once the controller is running, the datapath should operate like a
- learning Ethernet switch. You may monitor the flows in the datapath
- flow table using "dpctl dump-flows" command.
-
-Running the datapath with a remote controller
----------------------------------------------
-
-1. Start the datapath and attach it to two or more physical ports as
- described in the previous section.
-
- Note: The current version of the secure channel and controller
- require at least one interface not be connected to the datapath
- to be functional. This interface will be used for communication
- between the secure channel and the controller. Future releases will
- support in-band control communication.
-
-2. Run the controller in passive tcp mode on the host which will act as
- the controller. In the example below, the controller will bind to
- port 975 (the default) awaiting connections from secure channels.
- % controller -v ptcp:
-
- (See controller(8) for more details)
+ % switch tcp:127.0.0.1 -i eth1,eth2
- Make sure the machine hosting the controller is reachable by the switch.
-
-3. Run secchan on the datapath host to start the secure channel
- connecting the datapath to a remote controller. (See secchan(8)
- for usage details). The channel should be configured to connect to
- the controller's IP address on the port configured in step 2.
-
- If the controller is running on host 192.168.1.2 port 975 (the
- default port) and the datapath ID is 0, the secchan invocation
- would look like:
- % secchan -v nl:0 tcp:192.168.1.2
+ The network devices that you specify should not have configured IP
+ addresses. The switch program must run as root.
+
+3. The controller causes each switch that connects to it to act like a
+ learning Ethernet switch. Thus, devices plugged into the specified
+ network ports should now be able to send packets to each other, as
+ if they were plugged into ports on a conventional Ethernet switch.
+
+Troubleshooting: if the commands above do not work, try using the -v
+or --verbose option on the controller or switch commands, which will
+cause a large amount of debug output from each program.
+
+Remote switches: These instructions assume that the controller and the
+switch are running on the same machine. This is an easy configuration
+for testing, but a more conventional setup would run a controller on
+one machine and one or more switches on different machines. To do so,
+simply specify the IP address of the controller as the first argument
+to the switch program (in place of 127.0.0.1). (Note: The current
+version of the switch and controller requires that they be connected
+through a "control network" that is physically separate from the one
+that they are controlling. Future releases will support in-band
+control communication.)
Secure operation over SSL
-------------------------
To configure the controller to listen for SSL connections on the
default port, invoke it as follows:
+
% controller -v pssl: --private-key=PRIVKEY --certificate=CERT \
--ca-cert=CACERT
+
where PRIVKEY is a file containing the controller's private key, CERT
is a file containing the controller CA's certificate for the
controller's public key, and CACERT is a file containing the root
certificate for the switch CA. If, for example, your PKI was created
with the instructions below, then the invocation would look like:
+
% controller -v pssl: --private-key=ctl-privkey.pem \
--certificate=ctl-cert.pem --ca-cert=pki/switchca/cacert.pem
To configure a switch to connect to a controller running on the
default port on host 192.168.1.2 over SSL, invoke it as follows:
- % secchan -v nl:0 ssl:192.168.1.2 --private-key=PRIVKEY \
+
+ % switch -v ssl:192.168.1.2 -i INTERFACES --private-key=PRIVKEY \
--certificate=CERT --ca-cert=CACERT
-where PRIVKEY is a file containing the switch's private key, CERT is a
-file containing the switch CA's certificate for the switch's public
-key, and CACERT is a file containing the root certificate for the
-controller CA. If, for example, your PKI was created with the
+
+where INTERFACES is the comma-separated list of network device
+interfaces, PRIVKEY is a file containing the switch's private key,
+CERT is a file containing the switch CA's certificate for the switch's
+public key, and CACERT is a file containing the root certificate for
+the controller CA. If, for example, your PKI was created with the
instructions below, then the invocation would look like:
- % secchan -v nl:0 ssl:192.168.1.2 --private-key=sc-privkey.pem \
+
+ % switch -v -i INTERFACES ssl:192.168.1.2 --private-key=sc-privkey.pem \
--certificate=sc-cert.pem --ca-cert=pki/controllerca/cacert.pem
[*] To be specific, OpenFlow uses TLS version 1.0 or later (TLSv1), as
related files, including the following:
- cacert.pem: Root certificate for the controller certificate
- authority. This file must be provided to the secchan
- program with the --ca-cert option to enable it to
- authenticate valid controllers.
+ authority. This file must be provided to the switch or secchan
+ program with the --ca-cert option to enable it to authenticate
+ valid controllers.
- private/cakey.pem: Private signing key for the controller
certificate authority. This file must be kept secret. There is
options of controller, respectively, would point to these files.
Analogously, to create a switch private key and certificate in files
-named sc-privkey.pem and sc-cert.pem, for example, you could run:
+named sc-privkey.pem and sc-cert.pem, for example, you could run:
% ofp-pki req+sign sc switch
sc-privkey.pem and sc-cert.pem would need to be copied to the switch
for its use at runtime (they could then be deleted from their original
-locations). The --private-key and --certificate options of secchan,
-respectively, would point to these files.
+locations). The --private-key and --certificate options,
+respectively, of switch and secchan would point to these files.
+
+Building and Testing the Linux Kernel-Based Switch
+--------------------------------------------------
+
+The OpenFlow distribution also includes a Linux kernel module that can
+be used to achieve higher switching performance at a cost in
+portability and ease of installation. Compiling the kernel module has
+the following prerequisites in addition to those listed in the
+"Prerequisites" section above:
+
+ - A supported Linux kernel version. Please refer to README for a
+ list of supported versions.
+
+ The OpenFlow datapath requires bridging support (CONFIG_BRIDGE)
+ to be built as a kernel module. (This is common in kernels
+ provided by Linux distributions.) The bridge module must not be
+ loaded or in use. If the bridge module is running (check with
+ "lsmod | grep bridge"), you must remove it ("rmmod bridge")
+ before starting the datapath.
+
+ - The correct version of GCC for the kernel that you are building
+ the module against:
+
+ * To build a kernel module for a Linux 2.6 kernel, you need
+ the same version of GCC that was used to build that kernel
+ (usually version 4.0 or later).
+
+ * To build a kernel module for a Linux 2.4 kernel, you need an
+ earlier version of GCC, typically GCC 2.95, 3.3, or 3.4.
+
+ - A kernel build directory corresponding to the Linux kernel image
+ the module is to run on. Under Debian and Ubuntu, for example,
+ each linux-image package containing a kernel binary has a
+ corresponding linux-headers package with the required build
+ infrastructure.
+
+To build the kernel module, follow the build process described under
+"Building Userspace Programs" above, but in step 1 of that section
+pass the location of the kernel build directory as an additional
+argument to the configure script. Use --with-l26 for Linux 2.6 or
+--with-l24 for Linux 2.4. For example, to build for a running
+instance of Linux 2.6:
+
+ % ./configure --with-l26=/lib/modules/`uname -r`/build
+
+To build for a running instance of Linux 2.4:
+
+ % ./configure --with-l24=/lib/modules/`uname -r`/build
+
+In addition to the binaries listed under step 2 in "Building Userspace
+Programs" above, "make" will build the following kernel modules:
+
+ datapath/linux-2.6/openflow_mod.ko (if --with-l26 was specified)
+ datapath/linux-2.4/openflow_mod.o (if --with-l24 was specified)
+
+Once you have built the kernel modules, activating them requires only
+running "insmod", e.g.:
+
+ (Linux 2.6)
+ % insmod datapath/linux-2.6/openflow_mod.ko
+
+ (Linux 2.4)
+ % insmod datapath/linux-2.4/compat24_mod.o
+ % insmod datapath/linux-2.4/openflow_mod.o
+
+The insmod program must be run as root. You may need to specify a
+full path to insmod, which is usually in the /sbin directory. To
+verify that the modules have been loaded, run "lsmod" (also in /sbin)
+and check that openflow_mod appears in the result.
+
+Testing the Kernel-Based Implementation
+---------------------------------------
+
+The OpenFlow kernel module must be loaded, as described in the
+previous section, before it may be tested.
+
+1. Create a datapath instance. The command below creates a datapath with
+ ID 0 (see dpctl(8) for more detailed usage information).
+
+ % dpctl adddp 0
+
+ (In principle, openflow_mod supports multiple datapaths within the
+ same host, but this is rarely useful in practice.)
+
+2. Use dpctl to attach the datapath to physical interfaces on the
+ machine. For example, to create a trivial 2-port switch using
+ interfaces eth1 and eth2, issue the following commands:
+
+ % dpctl addif 0 eth1
+ % dpctl addif 0 eth2
+
+ You can verify that the interfaces were successfully added by asking
+ dpctl to print the current status of datapath 0:
+
+ % dpctl show 0
+
+3. (Optional) You can manually add flows to the datapath to test using
+ dpctl add-flows and view them using dpctl dump-flows. See dpctl(8)
+ for more details.
+
+4. The simplest way to test the datapath is to run the provided sample
+ controller on the host machine to manage the datapath directly using
+ netlink:
+
+ % controller -v nl:0
+
+ Once the controller is running, the datapath should operate like a
+ learning Ethernet switch. You may monitor the flows in the datapath
+ flow table using the "dpctl dump-flows" command.
+
+The preceding instructions assume that the controller and the switch
+are running on the same machine. This is an easy configuration for
+testing, but a more conventional setup would run a controller on one
+machine and one or more switches on different machines. Use the
+following instructions to set up remote switches:
+
+1. Start the datapath and attach it to two or more physical ports as
+ described in the previous section.
+
+ Note: The current version of the switch and controller requires
+ that they be connected through a "control network" that is
+ physically separate from the one that they are controlling. Future
+ releases will support in-band control communication.
+
+2. Run the controller in passive TCP mode on the host that will act as
+ the controller. In the example below, the controller will bind to
+ port 975 (the default) awaiting connections from secure channels.
+
+ % controller -v ptcp:
+
+ (See controller(8) for more details)
+
+ Make sure the machine hosting the controller is reachable by the switch.
+
+3. Run secchan on the datapath host to start the secure channel
+ connecting the datapath to a remote controller. (See secchan(8)
+ for usage details). The channel should be configured to connect to
+ the controller's IP address on the port configured in step 2.
+
+ If the controller is running on host 192.168.1.2 port 975 (the
+ default port) and the datapath ID is 0, the secchan invocation
+ would look like:
+
+ % secchan -v nl:0 tcp:192.168.1.2
Bug Reporting
-------------
AUTOMAKE_OPTIONS=foreign
-SUBDIRS = lib datapath secchan controller utilities man include third-party
+SUBDIRS = lib datapath secchan controller
+if HAVE_IF_PACKET
+SUBDIRS += switch
+endif
+SUBDIRS += utilities man include third-party
What's here?
------------
-This distribution includes a Linux-specific reference implementation
-of an OpenFlow switch, comprising:
+This distribution includes two different reference implementations of
+an OpenFlow switch. The first implementation, which is closely tied
+to Linux because it is partially implemented in the Linux kernel, has
+the following components:
- A Linux kernel module that implements the flow table and
- OpenFlow protocol.
+ OpenFlow protocol, in the datapath directory.
- secchan, a program that implements the secure channel
component of the reference switch.
- dpctl, a tool for configuring the kernel module.
+The second implementation is a single user-space program, named
+"switch", that integrates all three parts of an OpenFlow switch.
+
This distribution includes some additional software as well:
- controller, a simple program that connects to any number of
Platform support
----------------
-Other than the Linux kernel module, the software in the OpenFlow
-distribution should compile under Unix-like environments such as
-Linux, FreeBSD, Mac OS X, and Solaris. Our primary test environment
-is Debian GNU/Linux. Please contact us with portability-related bug
-reports or patches.
+Other than the Linux kernel module and userspace switch
+implementation, the software in the OpenFlow distribution should
+compile under Unix-like environments such as Linux, FreeBSD, Mac OS X,
+and Solaris. Our primary test environment is Debian GNU/Linux.
+Please contact us with portability-related bug reports or patches.
The Linux kernel module is, of course, Linux-specific, and the secchan
and dpctl utilities will not be as useful without the kernel module.
2.6 releases from 2.6.15 onward and Linux 2.4 releases from 2.4.20
onward should also work.
+The userspace switch implementation should be easy to port to
+Unix-like systems. The interface to network devices, in netdev.c, is
+the only code that should need to change. So far, only Linux is
+supported. We welcome ports to other platforms.
+
GCC is the expected compiler.
Bugs/Shortcomings
[Define to 1 if Netlink protocol is available.])
fi
+AC_CHECK_HEADER([net/if_packet.h],
+ [HAVE_IF_PACKET=yes],
+ [HAVE_IF_PACKET=no])
+AM_CONDITIONAL([HAVE_IF_PACKET], [test "$HAVE_IF_PACKET" = yes])
+if test "$HAVE_IF_PACKET" = yes; then
+ AC_DEFINE([HAVE_IF_PACKET], [1],
+ [Define to 1 if net/if_packet.h is available.])
+fi
+
PKG_CHECK_MODULES([SSL], [libssl],
[HAVE_OPENSSL=yes],
[HAVE_OPENSSL=no
controller/Makefile
utilities/Makefile
secchan/Makefile
+switch/Makefile
datapath/tests/Makefile
third-party/Makefile
datapath/linux-2.6/Makefile
--- /dev/null
+.TH switch 8 "March 2008" "OpenFlow" "OpenFlow Manual"
+
+.SH NAME
+switch \- userspace implementation of OpenFlow switch
+
+.SH SYNOPSIS
+.B switch
+[\fIoptions\fR]
+\fB-i\fR \fInetdev\fR[\fB,\fInetdev\fR]...
+\fIcontroller\fR
+
+.SH DESCRIPTION
+The \fBswitch\fR is a userspace implementation of an OpenFlow switch.
+It implements all three parts of the OpenFlow switch specification: a
+``flow table'' in which each flow entry is associated with an action
+telling the switch how to process the flow; a ``secure channel''
+connecting the switch to a remote process (a controller), allowing
+commands and packets to be sent between the controller and the switch;
+and an OpenFlow protocol implementation.
+
+\fBswitch\fR monitors one or more network device interfaces,
+forwarding packets between them according to the entries in the flow
+table. It also maintains a connection to an OpenFlow controller over
+a TCP or SSL connection, relaying packets that do not match a flow
+table entry to the controller and executing commands sent by the
+controller.
+
+For access to network devices, the switch program must normally run as
+root.
+
+The mandatory \fIcontroller\fR argument specifies how to connect to
+the OpenFlow controller. It takes one of the following forms:
+
+.TP
+\fBtcp:\fIhost\fR[\fB:\fIport\fR]
+The specified TCP \fIport\fR (default: 975) on the given remote
+\fIhost\fR.
+
+.TP
+\fBssl:\fIhost\fR[\fB:\fIport\fR]
+The specified SSL \fIport\fR (default: 976) on the given remote
+\fIhost\fR. The \fB--private-key\fR, \fB--certificate\fR, and
+\fB--ca-cert\fR options are mandatory when this form is used.
+
+.SH OPTIONS
+.TP
+\fB-i\fR, \fB--interfaces=\fR\fInetdev\fR[\fB,\fInetdev\fR]...
+Specifies each \fInetdev\fR (e.g., \fBeth0\fR) as a switch port. The
+specified network devices should not have any configured IP addresses.
+This option may be given any number of times to specify additional
+network devices.
+
+.TP
+\fB-d\fR, \fB--datapath-id=\fIdpid\fR
+Specifies the OpenFlow switch ID (a 48-bit number that uniquely
+identifies a switch) as \fIdpid\fR, which consists of exactly 12
+hex digits. Without this option, \fBswitch\fR picks an ID randomly.
+
+.TP
+\fB-p\fR, \fB--private-key=\fIprivkey.pem\fR
+Specifies a PEM file containing the private key used as the switch's
+identity for SSL connections to the controller.
+
+.TP
+\fB-c\fR, \fB--certificate=\fIcert.pem\fR
+Specifies a PEM file containing a certificate, signed by the
+controller's certificate authority (CA), that certifies the switch's
+private key to identify a trustworthy switch.
+
+.TP
+\fB-C\fR, \fB--ca-cert=\fIcacert.pem\fR
+Specifies a PEM file containing the CA certificate used to verify that
+the switch is connected to a trustworthy controller.
+
+.TP
+.BR \-h ", " \-\^\-help
+Prints a brief help message to the console.
+
+.TP
+.BR \-u ", " \-\^\-unreliable
+Do not attempt to reconnect the channel if a connection drops. By
+default, \fBswitch\fR attempts to reconnect.
+
+.TP
+.BR \-v ", " \-\^\-verbose
+Prints debug messages to the console.
+
+.TP
+.BR \-V ", " \-\^\-version
+Prints version information to the console.
+
+.SH "SEE ALSO"
+
+.BR dpctl (8),
+.BR controller (8),
+.BR vlogconf (8)
+
+.SH BUGS
+Currently \fBsecchan\fR does not support SSL
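The -d/--datapath-id option in the man page above takes exactly 12 hex digits that encode a 48-bit switch ID. As a minimal sketch of how such a string could be validated and converted, assuming a hypothetical helper named demo_parse_dpid (this is an illustration, not the parser used by the switch program):

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: parses exactly 12 hex digits into a 48-bit value.
 * Returns 0 and stores the result in '*dpid' on success, -1 otherwise. */
int demo_parse_dpid(const char *s, uint64_t *dpid)
{
    uint64_t value = 0;
    size_t i;

    /* Reject anything that is not exactly 12 characters long. */
    if (strlen(s) != 12)
        return -1;

    for (i = 0; i < 12; i++) {
        unsigned char c = (unsigned char) s[i];
        int digit;

        if (!isxdigit(c))
            return -1;
        digit = isdigit(c) ? c - '0' : tolower(c) - 'a' + 10;

        /* Each hex digit contributes 4 bits; 12 digits give 48 bits. */
        value = (value << 4) | (uint64_t) digit;
    }
    *dpid = value;
    return 0;
}
```

Requiring the full 12 digits, rather than accepting shorter strings, keeps the accepted input in line with the "exactly 12 hex digits" wording of the option description.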
--- /dev/null
+/Makefile
+/Makefile.in
+/switch
--- /dev/null
+include ../Make.vars
+
+bin_PROGRAMS = switch
+
+switch_SOURCES = \
+ chain.c \
+ chain.h \
+ controller.c \
+ controller.h \
+ crc32.c \
+ crc32.h \
+ datapath.c \
+ datapath.h \
+ forward.c \
+ forward.h \
+ netdev.c \
+ netdev.h \
+ switch.c \
+ switch-flow.c \
+ switch-flow.h \
+ table.h \
+ table-hash.c \
+ table-linear.c \
+ table-mac.c
+
+switch_LDADD = ../lib/libopenflow.la
--- /dev/null
+/* Copyright (C) 2008 Board of Trustees, Leland Stanford Jr. University.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "chain.h"
+#include <assert.h>
+#include <errno.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include "switch-flow.h"
+#include "table.h"
+
+#define THIS_MODULE VLM_chain
+#include "vlog.h"
+
+/* Attempts to append 'table' to the set of tables in 'chain'. Returns 0 or
+ * negative error. If 'table' is null it is assumed that table creation failed
+ * due to out-of-memory. */
+static int add_table(struct sw_chain *chain, struct sw_table *table)
+{
+ if (table == NULL)
+ return -ENOMEM;
+ if (chain->n_tables >= CHAIN_MAX_TABLES) {
+ VLOG_ERR("too many tables in chain\n");
+ table->destroy(table);
+ return -ENOBUFS;
+ }
+ chain->tables[chain->n_tables++] = table;
+ return 0;
+}
+
+/* Creates and returns a new chain. Returns NULL if the chain cannot be
+ * created. */
+struct sw_chain *chain_create(void)
+{
+ struct sw_chain *chain = calloc(1, sizeof *chain);
+ if (chain == NULL)
+ return NULL;
+
+ if (add_table(chain, table_mac_create(TABLE_MAC_NUM_BUCKETS,
+ TABLE_MAC_MAX_FLOWS))
+ || add_table(chain, table_hash2_create(0x1EDC6F41, TABLE_HASH_MAX_FLOWS,
+ 0x741B8CD7, TABLE_HASH_MAX_FLOWS))
+ || add_table(chain, table_linear_create(TABLE_LINEAR_MAX_FLOWS))) {
+ chain_destroy(chain);
+ return NULL;
+ }
+
+ return chain;
+}
+
+/* Searches 'chain' for a flow matching 'key', which must not have any wildcard
+ * fields. Returns the flow if successful, otherwise a null pointer. */
+struct sw_flow *
+chain_lookup(struct sw_chain *chain, const struct sw_flow_key *key)
+{
+ int i;
+
+ assert(!key->wildcards);
+ for (i = 0; i < chain->n_tables; i++) {
+ struct sw_table *t = chain->tables[i];
+ struct sw_flow *flow = t->lookup(t, key);
+ if (flow)
+ return flow;
+ }
+ return NULL;
+}
+
+/* Inserts 'flow' into 'chain', replacing any duplicate flow. Returns 0 if
+ * successful or a negative error.
+ *
+ * If successful, 'flow' becomes owned by the chain, otherwise it is retained
+ * by the caller. */
+int
+chain_insert(struct sw_chain *chain, struct sw_flow *flow)
+{
+ int i;
+
+ for (i = 0; i < chain->n_tables; i++) {
+ struct sw_table *t = chain->tables[i];
+ if (t->insert(t, flow))
+ return 0;
+ }
+
+ return -ENOBUFS;
+}
+
+/* Deletes from 'chain' any and all flows that match 'key'. Returns the number
+ * of flows that were deleted.
+ *
+ * Expensive in the general case as currently implemented, since it requires
+ * iterating through the entire contents of each table for keys that contain
+ * wildcards. Relatively cheap for fully specified keys.
+ *
+ * The caller need not hold any locks. */
+int
+chain_delete(struct sw_chain *chain, const struct sw_flow_key *key, int strict)
+{
+ int count = 0;
+ int i;
+
+ for (i = 0; i < chain->n_tables; i++) {
+ struct sw_table *t = chain->tables[i];
+ count += t->delete(t, key, strict);
+ }
+
+ return count;
+}
+
+/* Performs timeout processing on all the tables in 'chain'. Returns the
+ * number of flow entries deleted through expiration.
+ *
+ * Expensive as currently implemented, since it iterates through the entire
+ * contents of each table.
+ *
+ * The caller need not hold any locks. */
+int
+chain_timeout(struct sw_chain *chain, struct datapath *dp)
+{
+ int count = 0;
+ int i;
+
+ for (i = 0; i < chain->n_tables; i++) {
+ struct sw_table *t = chain->tables[i];
+ count += t->timeout(dp, t);
+ }
+ return count;
+}
+
+/* Destroys 'chain', which must not have any users. */
+void
+chain_destroy(struct sw_chain *chain)
+{
+ int i;
+
+ for (i = 0; i < chain->n_tables; i++) {
+ struct sw_table *t = chain->tables[i];
+ t->destroy(t);
+ }
+ free(chain);
+}
+
+/* Prints statistics for each of the tables in 'chain'. */
+void
+chain_print_stats(struct sw_chain *chain)
+{
+ int i;
+
+ printf("\n");
+ for (i = 0; i < chain->n_tables; i++) {
+ struct sw_table *t = chain->tables[i];
+ struct sw_table_stats stats;
+ t->stats(t, &stats);
+ printf("%s: %lu/%lu flows\n",
+ stats.name, stats.n_flows, stats.max_flows);
+ }
+}
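The chain code above tries each table in order and returns the first hit, so cheaper tables placed earlier shadow later ones. A minimal standalone sketch of that dispatch pattern follows. All names here (demo_table, demo_chain, demo_chain_add, demo_chain_lookup) are hypothetical stand-ins, and the toy tables hold a single string key each instead of real flow entries:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for struct sw_table and struct sw_chain. */
#define DEMO_MAX_TABLES 4

struct demo_table {
    const char *name;
    /* Returns a non-null value on a hit, NULL on a miss. */
    const char *(*lookup)(const struct demo_table *, const char *key);
    const char *key;    /* The single key this toy table holds. */
    const char *value;  /* The value stored for that key. */
};

struct demo_chain {
    int n_tables;
    struct demo_table *tables[DEMO_MAX_TABLES];
};

/* Toy lookup: a hit only if 'key' equals the table's one stored key. */
const char *demo_lookup_one(const struct demo_table *t, const char *key)
{
    return strcmp(t->key, key) == 0 ? t->value : NULL;
}

/* Appends 'table' to 'chain'.  Returns 0 on success, -1 if the chain is
 * already full. */
int demo_chain_add(struct demo_chain *chain, struct demo_table *table)
{
    if (chain->n_tables >= DEMO_MAX_TABLES)
        return -1;
    chain->tables[chain->n_tables++] = table;
    return 0;
}

/* Tries each table in insertion order and returns the first hit, or NULL
 * if no table matches. */
const char *demo_chain_lookup(const struct demo_chain *chain, const char *key)
{
    int i;

    for (i = 0; i < chain->n_tables; i++) {
        const struct demo_table *t = chain->tables[i];
        const char *hit = t->lookup(t, key);
        if (hit)
            return hit;
    }
    return NULL;
}
```

Because lookup stops at the first match, the order in which tables are added determines which entry wins when two tables hold the same key.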
--- /dev/null
+/* Copyright (C) 2008 Board of Trustees, Leland Stanford Jr. University.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#ifndef CHAIN_H
+#define CHAIN_H 1
+
+struct sw_flow;
+struct sw_flow_key;
+struct datapath;
+
+#define TABLE_LINEAR_MAX_FLOWS 100
+#define TABLE_HASH_MAX_FLOWS 65536
+#define TABLE_MAC_MAX_FLOWS 1024
+#define TABLE_MAC_NUM_BUCKETS 1024
+
+/* Set of tables chained together in sequence from cheap to expensive. */
+#define CHAIN_MAX_TABLES 4
+struct sw_chain {
+ int n_tables;
+ struct sw_table *tables[CHAIN_MAX_TABLES];
+};
+
+struct sw_chain *chain_create(void);
+struct sw_flow *chain_lookup(struct sw_chain *, const struct sw_flow_key *);
+int chain_insert(struct sw_chain *, struct sw_flow *);
+int chain_delete(struct sw_chain *, const struct sw_flow_key *, int);
+int chain_timeout(struct sw_chain *, struct datapath *);
+void chain_destroy(struct sw_chain *);
+void chain_print_stats(struct sw_chain *);
+
+#endif /* chain.h */
--- /dev/null
+/* Copyright (C) 2008 Board of Trustees, Leland Stanford Jr. University.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "controller.h"
+#include <errno.h>
+#include <string.h>
+#include "buffer.h"
+#include "forward.h"
+#include "poll-loop.h"
+#include "ofp-print.h"
+#include "util.h"
+#include "vconn.h"
+
+#define THIS_MODULE VLM_controller_connection
+#include "vlog.h"
+
+void
+controller_init(struct controller_connection *cc,
+ const char *name, bool reliable)
+{
+ cc->reliable = reliable;
+ cc->name = name;
+ cc->vconn = NULL;
+ cc->connected = false;
+ queue_init(&cc->txq);
+ cc->backoff_deadline = 0;
+ cc->backoff = 0;
+}
+
+static int
+try_send(struct controller_connection *cc)
+{
+ int retval = 0;
+ struct buffer *next = cc->txq.head->next;
+ retval = vconn_send(cc->vconn, cc->txq.head);
+ if (retval) {
+ return retval;
+ }
+ queue_advance_head(&cc->txq, next);
+ return 0;
+}
+
+void
+controller_run(struct controller_connection *cc, struct datapath *dp)
+{
+ if (!cc->vconn) {
+ if (time(0) >= cc->backoff_deadline) {
+ int retval;
+
+ retval = vconn_open(cc->name, &cc->vconn);
+ if (!retval) {
+ cc->backoff_deadline = time(0) + cc->backoff;
+ cc->connected = false;
+ } else {
+ VLOG_WARN("%s: connection failed (%s)",
+ cc->name, strerror(retval));
+ controller_disconnect(cc, 0);
+ }
+ }
+ } else if (!cc->connected) {
+ int error = vconn_connect(cc->vconn);
+ if (!error) {
+ VLOG_WARN("%s: connected", cc->name);
+ if (vconn_is_passive(cc->vconn)) {
+ fatal(0, "%s: passive vconn not supported in switch",
+ cc->name);
+ }
+ cc->connected = true;
+ } else if (error != EAGAIN) {
+ VLOG_WARN("%s: connection failed (%s)",
+ cc->name, strerror(error));
+ controller_disconnect(cc, 0);
+ }
+ } else {
+ int iterations;
+
+ for (iterations = 0; iterations < 50; iterations++) {
+ struct buffer *buffer;
+ int error = vconn_recv(cc->vconn, &buffer);
+ if (!error) {
+ fwd_control_input(dp, buffer->data, buffer->size);
+ buffer_delete(buffer);
+ } else if (error == EAGAIN) {
+ break;
+ } else {
+ controller_disconnect(cc, error);
+ return;
+ }
+ }
+
+ while (cc->txq.n > 0) {
+ int error = try_send(cc);
+ if (error == EAGAIN) {
+ break;
+ } else if (error) {
+ controller_disconnect(cc, error);
+ return;
+ }
+ }
+ }
+}
+
+void
+controller_disconnect(struct controller_connection *cc, int error)
+{
+ time_t now = time(0);
+
+ if (cc->vconn) {
+ if (!cc->reliable) {
+ fatal(0, "%s: connection dropped", cc->name);
+ }
+
+ if (error > 0) {
+ VLOG_WARN("%s: connection dropped (%s)",
+ cc->name, strerror(error));
+ } else if (error == EOF) {
+ VLOG_WARN("%s: connection closed", cc->name);
+ } else {
+ VLOG_WARN("%s: connection dropped", cc->name);
+ }
+ vconn_close(cc->vconn);
+ cc->vconn = NULL;
+ queue_clear(&cc->txq);
+ }
+
+ if (now >= cc->backoff_deadline) {
+ cc->backoff = 1;
+ } else {
+ cc->backoff = MIN(60, MAX(1, 2 * cc->backoff));
+ VLOG_WARN("%s: waiting %d seconds before reconnect",
+ cc->name, cc->backoff);
+ }
+ cc->backoff_deadline = now + cc->backoff;
+}
+
+void
+controller_wait(struct controller_connection *cc)
+{
+ if (cc->vconn) {
+ vconn_wait(cc->vconn, WAIT_RECV);
+ if (cc->txq.n) {
+ vconn_wait(cc->vconn, WAIT_SEND);
+ }
+ } else {
+ poll_timer_wait((cc->backoff_deadline - time(0)) * 1000);
+ }
+}
+
+void
+controller_send(struct controller_connection *cc, struct buffer *b)
+{
+ if (cc->vconn) {
+ if (cc->txq.n < 128) {
+ queue_push_tail(&cc->txq, b);
+ if (cc->txq.n == 1) {
+ try_send(cc);
+ }
+ } else {
+ VLOG_WARN("%s: controller queue overflow", cc->name);
+ buffer_delete(b);
+ }
+ } else {
+ /* Not connected: drop the message rather than leak it. */
+ buffer_delete(b);
+ }
+}
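
The reconnect policy in controller_disconnect() can be hard to read from the diff alone, so here is a minimal standalone sketch of it. The names `struct backoff_state` and `backoff_on_failure` are hypothetical and do not appear in the patch; the sketch assumes only the rule visible above: reset to 1 second once the previous deadline has passed, otherwise double the delay, clamped to the range 1..60.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical stand-in for the two backoff fields of
 * struct controller_connection. */
struct backoff_state {
    time_t deadline;    /* Earliest time at which to retry. */
    int backoff;        /* Current delay, in seconds. */
};

/* Applies the policy sketched from controller_disconnect(): reset to
 * 1 second if the previous deadline has already passed, otherwise double
 * the delay, clamped to [1, 60].  Returns the new delay. */
static int
backoff_on_failure(struct backoff_state *s, time_t now)
{
    assert(s != NULL);
    if (now >= s->deadline) {
        s->backoff = 1;
    } else {
        int doubled = 2 * s->backoff;
        s->backoff = doubled < 1 ? 1 : doubled > 60 ? 60 : doubled;
    }
    s->deadline = now + s->backoff;
    return s->backoff;
}
```

Failures that arrive before the current deadline yield delays of 1, 2, 4, 8, ... seconds up to the 60-second cap; a failure after a quiet period starts again at 1 second.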
--- /dev/null
+/* Copyright (C) 2008 Board of Trustees, Leland Stanford Jr. University.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#ifndef CONTROLLER_H
+#define CONTROLLER_H 1
+
+#include "queue.h"
+#include <stdbool.h>
+#include <time.h>
+
+struct datapath;
+
+struct controller_connection {
+ bool reliable;
+ const char *name;
+ struct vconn *vconn;
+ bool connected;
+ struct queue txq;
+ time_t backoff_deadline;
+ int backoff;
+};
+
+void controller_init(struct controller_connection *,
+ const char *name, bool reliable);
+void controller_run(struct controller_connection *, struct datapath *);
+void controller_connect(struct controller_connection *);
+void controller_disconnect(struct controller_connection *, int error);
+void controller_wait(struct controller_connection *);
+void controller_send(struct controller_connection *, struct buffer *);
+
+#endif /* controller.h */
--- /dev/null
+/* Copyright (C) 2008 Board of Trustees, Leland Stanford Jr. University.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "crc32.h"
+
+void
+crc32_init(struct crc32 *crc, unsigned int polynomial)
+{
+ int i;
+
+ for (i = 0; i < CRC32_TABLE_SIZE; ++i) {
+ unsigned int reg = i << 24;
+ int j;
+ for (j = 0; j < CRC32_TABLE_BITS; j++) {
+ int topBit = (reg & 0x80000000) != 0;
+ reg <<= 1;
+ if (topBit)
+ reg ^= polynomial;
+ }
+ crc->table[i] = reg;
+ }
+}
+
+unsigned int
+crc32_calculate(const struct crc32 *crc, const void *data_, size_t n_bytes)
+{
+ const uint8_t *data = data_;
+ unsigned int result = 0;
+ size_t i;
+
+ for (i = 0; i < n_bytes; i++) {
+ unsigned int top = result >> 24;
+ top ^= data[i];
+ result = (result << 8) ^ crc->table[top];
+ }
+ return result;
+}
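
For readers checking crc32.c, the following is a self-contained sketch of the same table-driven, most-significant-bit-first scheme. The `crc_sketch` names are hypothetical copies made for illustration, not part of the patch. Note that the register starts at zero and no final XOR is applied, so the result is not expected to match the reflected CRC-32 used by zlib, which starts from and finishes with 0xffffffff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TABLE_SIZE 256

/* Hypothetical copy of the lookup-table layout used by struct crc32. */
struct crc_sketch {
    uint32_t table[SKETCH_TABLE_SIZE];
};

/* Precomputes, for every possible top byte, the effect of shifting that
 * byte out of the register eight times with 'polynomial'. */
static void
crc_sketch_init(struct crc_sketch *crc, uint32_t polynomial)
{
    uint32_t i;

    assert(crc != NULL);
    for (i = 0; i < SKETCH_TABLE_SIZE; i++) {
        uint32_t reg = i << 24;
        int j;

        for (j = 0; j < 8; j++) {
            int top_bit = (reg & UINT32_C(0x80000000)) != 0;
            reg <<= 1;
            if (top_bit) {
                reg ^= polynomial;
            }
        }
        crc->table[i] = reg;
    }
}

/* Processes 'n' bytes, one table lookup per byte.  The register starts at
 * zero and the result is returned without a final XOR. */
static uint32_t
crc_sketch_calculate(const struct crc_sketch *crc, const void *data_,
                     size_t n)
{
    const uint8_t *data = data_;
    uint32_t result = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        uint32_t top = (result >> 24) ^ data[i];
        result = (result << 8) ^ crc->table[top];
    }
    return result;
}
```

Because the initial value is zero, the checksum of a single byte `b` is just `table[b]`, and the checksum of the XOR of two equal-length messages equals the XOR of their checksums. Both properties make convenient sanity checks.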
--- /dev/null
+/* Copyright (C) 2008 Board of Trustees, Leland Stanford Jr. University.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#ifndef CRC32_H
+#define CRC32_H 1
+
+#include <stdint.h>
+#include <stddef.h>
+
+#define CRC32_TABLE_BITS 8
+#define CRC32_TABLE_SIZE (1u << CRC32_TABLE_BITS)
+
+struct crc32 {
+ unsigned int table[CRC32_TABLE_SIZE];
+};
+
+void crc32_init(struct crc32 *, unsigned int polynomial);
+unsigned int crc32_calculate(const struct crc32 *, const void *, size_t);
+
+#endif /* crc32.h */
--- /dev/null
+/* Copyright (C) 2008 Board of Trustees, Leland Stanford Jr. University.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "datapath.h"
+#include <arpa/inet.h>
+#include <assert.h>
+#include <errno.h>
+#include <stdlib.h>
+#include <string.h>
+#include "buffer.h"
+#include "chain.h"
+#include "controller.h"
+#include "flow.h"
+#include "forward.h"
+#include "netdev.h"
+#include "packets.h"
+#include "poll-loop.h"
+#include "table.h"
+#include "xtoxll.h"
+
+#define THIS_MODULE VLM_datapath
+#include "vlog.h"
+
+#define BRIDGE_PORT_NO_FLOOD 0x00000001
+
+static void send_port_status(struct sw_port *p, uint8_t status);
+static void del_switch_port(struct sw_port *p);
+static int port_no(struct datapath *dp, struct sw_port *p)
+{
+ assert(p >= dp->ports && p < &dp->ports[ARRAY_SIZE(dp->ports)]);
+ return p - dp->ports;
+}
+
+/* Generates a random datapath id that fits in 48 bits, the width of an
+ * Ethernet address. The id is purely random: it does not incorporate a
+ * datapath index or a hardware address.
+ */
+static uint64_t
+gen_datapath_id(void)
+{
+ /* Choose a random datapath id. */
+ uint64_t id = 0;
+ int i;
+
+ srand(time(0));
+
+ for (i = 0; i < ETH_ADDR_LEN; i++) {
+ id |= (uint64_t)(rand() & 0xff) << (8*(ETH_ADDR_LEN-1 - i));
+ }
+
+ return id;
+}
+
+int
+dp_new(struct datapath **dp_, uint64_t dpid, struct controller_connection *cc)
+{
+ struct datapath *dp;
+
+ dp = calloc(1, sizeof *dp);
+ if (!dp) {
+ return ENOMEM;
+ }
+
+ dp->last_timeout = time(0);
+ dp->cc = cc;
+ dp->id = dpid <= UINT64_C(0xffffffffffff) ? dpid : gen_datapath_id();
+ dp->chain = chain_create();
+ if (!dp->chain) {
+ VLOG_ERR("could not create chain");
+ free(dp);
+ return ENOMEM;
+ }
+
+ list_init(&dp->port_list);
+ dp->miss_send_len = OFP_DEFAULT_MISS_SEND_LEN;
+ *dp_ = dp;
+ return 0;
+}
+
+int
+dp_add_port(struct datapath *dp, const char *name)
+{
+ struct netdev *netdev;
+ struct sw_port *p;
+ int error;
+
+ error = netdev_open(name, &netdev);
+ if (error) {
+ return error;
+ }
+
+ for (p = dp->ports; ; p++) {
+ if (p >= &dp->ports[ARRAY_SIZE(dp->ports)]) {
+ /* No free port slot: release the device opened above. */
+ netdev_close(netdev);
+ return EXFULL;
+ } else if (!p->netdev) {
+ break;
+ }
+ }
+
+ p->dp = dp;
+ p->netdev = netdev;
+ list_push_back(&dp->port_list, &p->node);
+
+ /* Notify the ctlpath that this port has been added */
+ send_port_status(p, OFPPR_ADD);
+
+ return 0;
+}
+
+void
+dp_run(struct datapath *dp)
+{
+ time_t now = time(0);
+ struct sw_port *p, *n;
+ struct buffer *buffer = NULL;
+
+ if (now != dp->last_timeout) {
+ chain_timeout(dp->chain, dp);
+ dp->last_timeout = now;
+ }
+ poll_timer_wait(1000);
+
+ LIST_FOR_EACH_SAFE (p, n, struct sw_port, node, &dp->port_list) {
+ int error;
+
+ if (!buffer) {
+ /* Allocate buffer with some headroom to add headers in forwarding
+ * to the controller or adding a vlan tag, plus an extra 2 bytes to
+ * allow IP headers to be aligned on a 4-byte boundary. */
+ const int headroom = 128 + 2;
+ buffer = buffer_new(ETH_TOTAL_MAX + headroom);
+ buffer->data += headroom;
+ }
+ error = netdev_recv(p->netdev, buffer, false);
+ if (!error) {
+ fwd_port_input(dp, buffer, port_no(dp, p));
+ buffer = NULL;
+ } else if (error != EAGAIN) {
+ VLOG_ERR("Error receiving data from %s: %s",
+ netdev_get_name(p->netdev), strerror(error));
+ del_switch_port(p);
+ }
+ }
+ buffer_delete(buffer);
+}
+
+void
+dp_wait(struct datapath *dp)
+{
+ struct sw_port *p;
+
+ LIST_FOR_EACH (p, struct sw_port, node, &dp->port_list) {
+ poll_fd_wait(netdev_get_fd(p->netdev), POLLIN, NULL);
+ }
+}
+
+/* Delete 'p' from switch. */
+static void
+del_switch_port(struct sw_port *p)
+{
+ send_port_status(p, OFPPR_DELETE);
+ netdev_close(p->netdev);
+ p->netdev = NULL;
+ list_remove(&p->node);
+}
+
+void
+dp_destroy(struct datapath *dp)
+{
+ struct sw_port *p, *n;
+
+ if (!dp) {
+ return;
+ }
+
+ LIST_FOR_EACH_SAFE (p, n, struct sw_port, node, &dp->port_list) {
+ del_switch_port(p);
+ }
+ chain_destroy(dp->chain);
+ free(dp);
+}
+
+static int
+flood(struct datapath *dp, struct buffer *buffer, int in_port)
+{
+ struct sw_port *p;
+ struct sw_port *prev_port;
+
+ prev_port = NULL;
+ LIST_FOR_EACH (p, struct sw_port, node, &dp->port_list) {
+ if (port_no(dp, p) == in_port || p->flags & BRIDGE_PORT_NO_FLOOD) {
+ continue;
+ }
+ if (prev_port) {
+ struct buffer *clone = buffer_clone(buffer);
+ if (!clone) {
+ buffer_delete(buffer);
+ return -ENOMEM;
+ }
+ dp_output_port(dp, clone, in_port, port_no(dp, prev_port));
+ }
+ prev_port = p;
+ }
+ if (prev_port)
+ dp_output_port(dp, buffer, in_port, port_no(dp, prev_port));
+ else
+ buffer_delete(buffer);
+
+ return 0;
+}
+
+void
+output_packet(struct datapath *dp, struct buffer *buffer, int out_port)
+{
+ if (out_port >= 0 && out_port < OFPP_MAX) {
+ struct sw_port *p = &dp->ports[out_port];
+ if (p->netdev != NULL) {
+ /* FIXME: queue packets. */
+ netdev_send(p->netdev, buffer, false);
+ return;
+ }
+ }
+
+ buffer_delete(buffer);
+ /* FIXME: ratelimit */
+ VLOG_DBG("can't forward to bad port %d", out_port);
+}
+
+/* Takes ownership of 'buffer' and transmits it to 'out_port' on 'dp'.
+ */
+void
+dp_output_port(struct datapath *dp, struct buffer *buffer,
+ int in_port, int out_port)
+{
+
+ assert(buffer);
+ if (out_port == OFPP_FLOOD) {
+ flood(dp, buffer, in_port);
+ } else if (out_port == OFPP_CONTROLLER) {
+ dp_output_control(dp, buffer, in_port, fwd_save_buffer(buffer), 0,
+ OFPR_ACTION);
+ } else {
+ output_packet(dp, buffer, out_port);
+ }
+}
+
+/* Takes ownership of 'buffer' and transmits it to 'dp''s controller. If
+ * 'buffer_id' != -1 and 'max_len' is nonzero, then at most the first
+ * 'max_len' bytes of 'buffer' are sent; otherwise, all of 'buffer' is sent.
+ * 'reason' indicates why 'buffer' is being sent. */
+void
+dp_output_control(struct datapath *dp, struct buffer *buffer, int in_port,
+ uint32_t buffer_id, size_t max_len, int reason)
+{
+ struct ofp_packet_in *opi;
+ size_t total_len;
+
+ total_len = buffer->size;
+ if (buffer_id != UINT32_MAX && max_len && buffer->size > max_len) {
+ /* Truncate to 'max_len'; never grow past the received data. */
+ buffer->size = max_len;
+ }
+
+ opi = buffer_push_uninit(buffer, offsetof(struct ofp_packet_in, data));
+ opi->header.version = OFP_VERSION;
+ opi->header.type = OFPT_PACKET_IN;
+ opi->header.length = htons(buffer->size);
+ opi->header.xid = htonl(0);
+ opi->buffer_id = htonl(buffer_id);
+ opi->total_len = htons(total_len);
+ opi->in_port = htons(in_port);
+ opi->reason = reason;
+ opi->pad = 0;
+ controller_send(dp->cc, buffer);
+}
+
+static void fill_port_desc(struct datapath *dp, struct sw_port *p,
+ struct ofp_phy_port *desc)
+{
+ desc->port_no = htons(port_no(dp, p));
+ strncpy((char *) desc->name, netdev_get_name(p->netdev),
+ sizeof desc->name);
+ desc->name[sizeof desc->name - 1] = '\0';
+ memcpy(desc->hw_addr, netdev_get_etheraddr(p->netdev), ETH_ADDR_LEN);
+ desc->flags = htonl(p->flags);
+ desc->features = htonl(netdev_get_features(p->netdev));
+ desc->speed = htonl(netdev_get_speed(p->netdev));
+}
+
+void
+dp_send_hello(struct datapath *dp)
+{
+ struct buffer *buffer;
+ struct ofp_data_hello *odh;
+ struct sw_port *p;
+
+ buffer = buffer_new(sizeof *odh);
+ odh = buffer_put_uninit(buffer, sizeof *odh);
+ memset(odh, 0, sizeof *odh);
+ odh->header.version = OFP_VERSION;
+ odh->header.type = OFPT_DATA_HELLO;
+ odh->header.xid = htonl(0);
+ odh->datapath_id = htonll(dp->id);
+ odh->n_exact = htonl(2 * TABLE_HASH_MAX_FLOWS);
+ odh->n_mac_only = htonl(TABLE_MAC_MAX_FLOWS);
+ odh->n_compression = 0; /* Not supported */
+ odh->n_general = htonl(TABLE_LINEAR_MAX_FLOWS);
+ odh->buffer_mb = htonl(UINT32_MAX);
+ odh->n_buffers = htonl(N_PKT_BUFFERS);
+ odh->capabilities = htonl(OFP_SUPPORTED_CAPABILITIES);
+ odh->actions = htonl(OFP_SUPPORTED_ACTIONS);
+ odh->miss_send_len = htons(dp->miss_send_len);
+ LIST_FOR_EACH (p, struct sw_port, node, &dp->port_list) {
+ struct ofp_phy_port *opp = buffer_put_uninit(buffer, sizeof *opp);
+ memset(opp, 0, sizeof *opp);
+ fill_port_desc(dp, p, opp);
+ }
+ odh = buffer_at_assert(buffer, 0, sizeof *odh);
+ odh->header.length = htons(buffer->size);
+ controller_send(dp->cc, buffer);
+}
+
+void
+dp_update_port_flags(struct datapath *dp, const struct ofp_phy_port *opp)
+{
+ struct sw_port *p;
+ uint16_t port_idx = ntohs(opp->port_no);
+
+ if (port_idx >= ARRAY_SIZE(dp->ports))
+ return;
+ p = &dp->ports[port_idx];
+
+ /* Make sure the port still exists and its address hasn't changed since
+ * this was sent. */
+ if (!p->netdev || memcmp(opp->hw_addr, netdev_get_etheraddr(p->netdev),
+ ETH_ADDR_LEN) != 0)
+ return;
+
+ p->flags = ntohl(opp->flags);
+}
+
+static void
+send_port_status(struct sw_port *p, uint8_t status)
+{
+ struct buffer *buffer;
+ struct ofp_port_status *ops;
+ buffer = buffer_new(sizeof *ops);
+ ops = buffer_put_uninit(buffer, sizeof *ops);
+ ops->header.version = OFP_VERSION;
+ ops->header.type = OFPT_PORT_STATUS;
+ ops->header.length = htons(sizeof(*ops));
+ ops->header.xid = htonl(0);
+ ops->reason = status;
+ fill_port_desc(p->dp, p, &ops->desc);
+ controller_send(p->dp->cc, buffer);
+}
+
+void
+dp_send_flow_expired(struct datapath *dp, struct sw_flow *flow)
+{
+ struct buffer *buffer;
+ struct ofp_flow_expired *ofe;
+ buffer = buffer_new(sizeof *ofe);
+ ofe = buffer_put_uninit(buffer, sizeof *ofe);
+ ofe->header.version = OFP_VERSION;
+ ofe->header.type = OFPT_FLOW_EXPIRED;
+ ofe->header.length = htons(sizeof(*ofe));
+ ofe->header.xid = htonl(0);
+ flow_fill_match(&ofe->match, &flow->key);
+ ofe->duration = htonl(flow->timeout - flow->max_idle - flow->created);
+ ofe->packet_count = htonll(flow->packet_count);
+ ofe->byte_count = htonll(flow->byte_count);
+ controller_send(dp->cc, buffer);
+}
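
Both flood() in datapath.c and execute_actions() in forward.c use the same ownership trick: every destination but the last gets a clone, and the last gets the original buffer, so N destinations cost N - 1 copies. A minimal sketch of that pattern follows. `struct pkt`, `pkt_clone`, `pkt_send`, `flood_sketch`, and the two counters are hypothetical stand-ins for illustration, and port numbers are assumed to be non-negative.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical minimal packet type standing in for struct buffer. */
struct pkt {
    size_t size;
    unsigned char data[64];
};

static int n_clones;    /* Number of copies made, for illustration. */
static int n_sent;      /* Number of packets handed to pkt_send(). */

/* Returns a heap-allocated copy of 'p', or NULL if allocation fails. */
static struct pkt *
pkt_clone(const struct pkt *p)
{
    struct pkt *copy = malloc(sizeof *copy);
    if (copy) {
        memcpy(copy, p, sizeof *copy);
        n_clones++;
    }
    return copy;
}

/* Takes ownership of 'p'.  A real implementation would transmit it. */
static void
pkt_send(struct pkt *p, int port)
{
    (void) port;
    n_sent++;
    free(p);
}

/* Takes ownership of 'p' and sends it to every port in 'ports' except
 * 'in_port'.  Output is deferred by one iteration so that the final
 * destination can receive the original instead of a clone. */
static int
flood_sketch(struct pkt *p, const int *ports, size_t n_ports, int in_port)
{
    int prev_port = -1;
    size_t i;

    assert(p != NULL);
    for (i = 0; i < n_ports; i++) {
        if (ports[i] == in_port) {
            continue;
        }
        if (prev_port != -1) {
            struct pkt *clone = pkt_clone(p);
            if (!clone) {
                free(p);
                return -1;
            }
            pkt_send(clone, prev_port);
        }
        prev_port = ports[i];
    }
    if (prev_port != -1) {
        pkt_send(p, prev_port);
    } else {
        free(p);
    }
    return 0;
}
```

With four ports and one of them the input port, the sketch sends three packets and makes two copies; with a single eligible port it makes none.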
--- /dev/null
+/* Copyright (C) 2008 Board of Trustees, Leland Stanford Jr. University.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+/* Interface exported by OpenFlow module. */
+
+#ifndef DATAPATH_H
+#define DATAPATH_H 1
+
+#include <time.h>
+#include "openflow.h"
+#include "switch-flow.h"
+#include "buffer.h"
+#include "list.h"
+
+#define NL_FLOWS_PER_MESSAGE 100
+
+/* Capabilities supported by this implementation. */
+#define OFP_SUPPORTED_CAPABILITIES (OFPC_MULTI_PHY_TX)
+
+/* Actions supported by this implementation. */
+#define OFP_SUPPORTED_ACTIONS ( (1 << OFPAT_OUTPUT) \
+ | (1 << OFPAT_SET_DL_VLAN) \
+ | (1 << OFPAT_SET_DL_SRC) \
+ | (1 << OFPAT_SET_DL_DST) \
+ | (1 << OFPAT_SET_NW_SRC) \
+ | (1 << OFPAT_SET_NW_DST) \
+ | (1 << OFPAT_SET_TP_SRC) \
+ | (1 << OFPAT_SET_TP_DST) )
+
+struct sw_port {
+ uint32_t flags;
+ struct datapath *dp;
+ struct netdev *netdev;
+ struct list node; /* Element in datapath.port_list. */
+};
+
+struct datapath {
+ struct controller_connection *cc;
+
+ time_t last_timeout;
+
+ /* Unique identifier for this datapath */
+ uint64_t id;
+
+ struct sw_chain *chain; /* Forwarding rules. */
+
+ /* Flags from the control hello message */
+ uint16_t hello_flags;
+
+ /* Maximum number of bytes that should be sent for flow misses */
+ uint16_t miss_send_len;
+
+ /* Switch ports. */
+ struct sw_port ports[OFPP_MAX];
+ struct list port_list; /* List of ports, for flooding. */
+};
+
+int dp_new(struct datapath **, uint64_t dpid, struct controller_connection *);
+int dp_add_port(struct datapath *, const char *netdev);
+void dp_run(struct datapath *);
+void dp_wait(struct datapath *);
+
+void dp_output_port(struct datapath *, struct buffer *,
+ int in_port, int out_port);
+void dp_output_control(struct datapath *, struct buffer *, int in_port,
+ uint32_t buffer_id, size_t max_len, int reason);
+void dp_send_hello(struct datapath *);
+void dp_send_flow_expired(struct datapath *, struct sw_flow *);
+void dp_update_port_flags(struct datapath *dp, const struct ofp_phy_port *opp);
+
+#endif /* datapath.h */
--- /dev/null
+/* Copyright (C) 2008 Board of Trustees, Leland Stanford Jr. University.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "forward.h"
+#include <arpa/inet.h>
+#include <assert.h>
+#include <errno.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include "datapath.h"
+#include "chain.h"
+#include "flow.h"
+#include "packets.h"
+
+static void execute_actions(struct datapath *, struct buffer *,
+ int in_port, const struct sw_flow_key *,
+ const struct ofp_action *, int n_actions);
+
+static struct buffer *retrieve_buffer(uint32_t id);
+static void discard_buffer(uint32_t id);
+
+/* 'buffer' was received on 'in_port', a physical switch port between 0 and
+ * OFPP_MAX. Process it according to 'chain'. */
+void fwd_port_input(struct datapath *dp, struct buffer *buffer, int in_port)
+{
+ struct sw_flow_key key;
+ struct sw_flow *flow;
+
+ key.wildcards = 0;
+ flow_extract(buffer, in_port, &key.flow);
+ flow = chain_lookup(dp->chain, &key);
+ if (flow != NULL) {
+ flow_used(flow, buffer);
+ execute_actions(dp, buffer, in_port, &key,
+ flow->actions, flow->n_actions);
+ } else {
+ dp_output_control(dp, buffer, in_port, fwd_save_buffer(buffer),
+ dp->miss_send_len, OFPR_NO_MATCH);
+ }
+}
+
+static void
+do_output(struct datapath *dp, struct buffer *buffer, int in_port,
+ size_t max_len, int out_port)
+{
+ if (out_port != OFPP_CONTROLLER) {
+ dp_output_port(dp, buffer, in_port, out_port);
+ } else {
+ dp_output_control(dp, buffer, in_port, fwd_save_buffer(buffer),
+ max_len, OFPR_ACTION);
+ }
+}
+
+static void execute_actions(struct datapath *dp, struct buffer *buffer,
+ int in_port, const struct sw_flow_key *key,
+ const struct ofp_action *actions, int n_actions)
+{
+ /* Every output action needs a separate clone of 'buffer', but the common
+ * case is just a single output action, so that doing a clone and then
+ * freeing the original buffer is wasteful. So the following code is
+ * slightly obscure just to avoid that. */
+ int prev_port;
+ size_t max_len = 0; /* Initialize to make compiler happy. */
+ uint16_t eth_proto;
+ int i;
+
+ prev_port = -1;
+ eth_proto = ntohs(key->flow.dl_type);
+
+ for (i = 0; i < n_actions; i++) {
+ const struct ofp_action *a = &actions[i];
+
+ if (prev_port != -1) {
+ do_output(dp, buffer_clone(buffer), in_port, max_len, prev_port);
+ prev_port = -1;
+ }
+
+ if (a->type == htons(OFPAT_OUTPUT)) {
+ prev_port = ntohs(a->arg.output.port);
+ max_len = ntohs(a->arg.output.max_len);
+ } else {
+ buffer = execute_setter(buffer, eth_proto, key, a);
+ }
+ }
+ if (prev_port != -1)
+ do_output(dp, buffer, in_port, max_len, prev_port);
+ else
+ buffer_delete(buffer);
+}
+
+/* Returns the new checksum for a packet in which the checksum field previously
+ * contained 'old_csum' and in which a field that contained 'old_u16' was
+ * changed to contain 'new_u16'. */
+static uint16_t
+recalc_csum16(uint16_t old_csum, uint16_t old_u16, uint16_t new_u16)
+{
+ /* Ones-complement arithmetic is endian-independent, so this code does not
+ * use htons() or ntohs().
+ *
+ * See RFC 1624 for formula and explanation. */
+ uint16_t hc_complement = ~old_csum;
+ uint16_t m_complement = ~old_u16;
+ uint16_t m_prime = new_u16;
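+ uint32_t sum = hc_complement + m_complement + m_prime;
+ uint16_t hc_prime_complement;
+
+ /* Fold the end-around carry twice: the first fold can itself carry out of
+ * 16 bits (e.g. when 'sum' is 0x1ffff). */
+ sum = (sum & 0xffff) + (sum >> 16);
+ hc_prime_complement = sum + (sum >> 16);
+ return ~hc_prime_complement;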
+}
+
+/* Returns the new checksum for a packet in which the checksum field previously
+ * contained 'old_csum' and in which a field that contained 'old_u32' was
+ * changed to contain 'new_u32'. */
+static uint16_t
+recalc_csum32(uint16_t old_csum, uint32_t old_u32, uint32_t new_u32)
+{
+ return recalc_csum16(recalc_csum16(old_csum, old_u32, new_u32),
+ old_u32 >> 16, new_u32 >> 16);
+}
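
As a sanity check on the incremental update from RFC 1624 (HC' = ~(~HC + ~m + m')), the sketch below compares it against a full recomputation. The helper names `fold16`, `csum_words`, and `csum_update16` are hypothetical and exist only for this illustration. The sketch folds the carry twice so that a carry produced by the first fold is not lost.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Folds a 32-bit accumulator to 16 bits with end-around carry.  The second
 * round absorbs any carry produced by the first. */
static uint16_t
fold16(uint32_t sum)
{
    sum = (sum & 0xffff) + (sum >> 16);
    sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t) sum;
}

/* Ones-complement checksum over 'n' 16-bit words, computed from scratch. */
static uint16_t
csum_words(const uint16_t *words, size_t n)
{
    uint32_t sum = 0;
    size_t i;

    assert(words != NULL);
    for (i = 0; i < n; i++) {
        sum += words[i];
    }
    return (uint16_t) ~fold16(sum);
}

/* Incremental update per RFC 1624: the checksum after one covered 16-bit
 * field changes from 'old_u16' to 'new_u16'. */
static uint16_t
csum_update16(uint16_t old_csum, uint16_t old_u16, uint16_t new_u16)
{
    uint32_t sum = (uint16_t) ~old_csum;

    sum += (uint16_t) ~old_u16;
    sum += new_u16;
    return (uint16_t) ~fold16(sum);
}
```

Changing one word of a five-word message and updating the checksum incrementally should give the same value as recomputing it over the modified message, which is what makes rewriting addresses and ports cheap.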
+
+static void modify_nh(struct buffer *buffer, uint16_t eth_proto,
+ uint8_t nw_proto, const struct ofp_action *a)
+{
+ if (eth_proto == ETH_TYPE_IP) {
+ struct ip_header *nh = buffer->l3;
+ uint32_t new, *field;
+
+ new = a->arg.nw_addr;
+ field = a->type == htons(OFPAT_SET_NW_SRC) ? &nh->ip_src : &nh->ip_dst;
+ if (nw_proto == IP_TYPE_TCP) {
+ struct tcp_header *th = buffer->l4;
+ th->tcp_csum = recalc_csum32(th->tcp_csum, *field, new);
+ } else if (nw_proto == IP_TYPE_UDP) {
+ struct udp_header *th = buffer->l4;
+ if (th->udp_csum) {
+ th->udp_csum = recalc_csum32(th->udp_csum, *field, new);
+ if (!th->udp_csum) {
+ th->udp_csum = 0xffff;
+ }
+ }
+ }
+ nh->ip_csum = recalc_csum32(nh->ip_csum, *field, new);
+ *field = new;
+ }
+}
+
+static void modify_th(struct buffer *buffer, uint16_t eth_proto,
+ uint8_t nw_proto, const struct ofp_action *a)
+{
+ if (eth_proto == ETH_TYPE_IP) {
+ uint16_t new, *field;
+
+ new = a->arg.tp;
+
+ if (nw_proto == IP_TYPE_TCP) {
+ struct tcp_header *th = buffer->l4;
+ field = a->type == htons(OFPAT_SET_TP_SRC) ? &th->tcp_src : &th->tcp_dst;
+ th->tcp_csum = recalc_csum16(th->tcp_csum, *field, new);
+ *field = new;
+ } else if (nw_proto == IP_TYPE_UDP) {
+ struct udp_header *th = buffer->l4;
+ field = a->type == htons(OFPAT_SET_TP_SRC) ? &th->udp_src : &th->udp_dst;
+ th->udp_csum = recalc_csum16(th->udp_csum, *field, new);
+ *field = new;
+ }
+ }
+}
+
+static struct buffer *
+modify_vlan(struct buffer *buffer,
+ const struct sw_flow_key *key, const struct ofp_action *a)
+{
+ uint16_t new_id = a->arg.vlan_id;
+ struct vlan_eth_header *veh;
+
+ if (new_id != OFP_VLAN_NONE) {
+ if (key->flow.dl_vlan != htons(OFP_VLAN_NONE)) {
+ /* Modify vlan id, but maintain other TCI values */
+ veh = buffer->l2;
+ veh->veth_tci &= ~htons(VLAN_VID);
+ veh->veth_tci |= htons(new_id);
+ } else {
+ /* Insert new vlan id. */
+ struct eth_header *eh = buffer->l2;
+ struct vlan_eth_header tmp;
+ memcpy(tmp.veth_dst, eh->eth_dst, ETH_ADDR_LEN);
+ memcpy(tmp.veth_src, eh->eth_src, ETH_ADDR_LEN);
+ tmp.veth_type = htons(ETH_TYPE_VLAN);
+ tmp.veth_tci = new_id;
+ tmp.veth_next_type = eh->eth_type;
+
+ veh = buffer_push_uninit(buffer, VLAN_HEADER_LEN);
+ memcpy(veh, &tmp, sizeof tmp);
+ buffer->l2 -= VLAN_HEADER_LEN;
+ }
+ } else {
+ /* Remove an existing vlan header if it exists */
+ veh = buffer->l2;
+ if (veh->veth_type == htons(ETH_TYPE_VLAN)) {
+ struct eth_header tmp;
+
+ memcpy(tmp.eth_dst, veh->veth_dst, ETH_ADDR_LEN);
+ memcpy(tmp.eth_src, veh->veth_src, ETH_ADDR_LEN);
+ tmp.eth_type = veh->veth_next_type;
+
+ buffer->size -= VLAN_HEADER_LEN;
+ buffer->data += VLAN_HEADER_LEN;
+ buffer->l2 += VLAN_HEADER_LEN;
+ memcpy(buffer->data, &tmp, sizeof tmp);
+ }
+ }
+
+ return buffer;
+}
+
+struct buffer *execute_setter(struct buffer *buffer, uint16_t eth_proto,
+ const struct sw_flow_key *key, const struct ofp_action *a)
+{
+ switch (ntohs(a->type)) {
+ case OFPAT_SET_DL_VLAN:
+ buffer = modify_vlan(buffer, key, a);
+ break;
+
+ case OFPAT_SET_DL_SRC: {
+ struct eth_header *eh = buffer->l2;
+ memcpy(eh->eth_src, a->arg.dl_addr, sizeof eh->eth_src);
+ break;
+ }
+ case OFPAT_SET_DL_DST: {
+ struct eth_header *eh = buffer->l2;
+ memcpy(eh->eth_dst, a->arg.dl_addr, sizeof eh->eth_dst);
+ break;
+ }
+
+ case OFPAT_SET_NW_SRC:
+ case OFPAT_SET_NW_DST:
+ modify_nh(buffer, eth_proto, key->flow.nw_proto, a);
+ break;
+
+ case OFPAT_SET_TP_SRC:
+ case OFPAT_SET_TP_DST:
+ modify_th(buffer, eth_proto, key->flow.nw_proto, a);
+ break;
+
+ default:
+ NOT_REACHED();
+ }
+
+ return buffer;
+}
+
+static int
+recv_control_hello(struct datapath *dp, const void *msg)
+{
+ const struct ofp_control_hello *och = msg;
+
+ printf("control_hello(version=%d)\n", ntohl(och->version));
+
+ if (ntohs(och->miss_send_len) != OFP_MISS_SEND_LEN_UNCHANGED) {
+ dp->miss_send_len = ntohs(och->miss_send_len);
+ }
+
+ dp->hello_flags = ntohs(och->flags);
+
+ dp_send_hello(dp);
+
+ return 0;
+}
+
+static int
+recv_packet_out(struct datapath *dp, const void *msg)
+{
+ const struct ofp_packet_out *opo = msg;
+
+ if (ntohl(opo->buffer_id) == (uint32_t) -1) {
+ /* FIXME: can we avoid copying data here? */
+ int data_len = ntohs(opo->header.length) - sizeof *opo;
+ struct buffer *buffer = buffer_new(data_len);
+ buffer_put(buffer, opo->u.data, data_len);
+ dp_output_port(dp, buffer,
+ ntohs(opo->in_port), ntohs(opo->out_port));
+ } else {
+ struct sw_flow_key key;
+ struct buffer *buffer;
+ int n_acts;
+
+ buffer = retrieve_buffer(ntohl(opo->buffer_id));
+ if (!buffer) {
+ return -ESRCH;
+ }
+
+ n_acts = (ntohs(opo->header.length) - sizeof *opo)
+ / sizeof *opo->u.actions;
+ flow_extract(buffer, ntohs(opo->in_port), &key.flow);
+ execute_actions(dp, buffer, ntohs(opo->in_port),
+ &key, opo->u.actions, n_acts);
+ }
+ return 0;
+}
+
+static int
+recv_port_mod(struct datapath *dp, const void *msg)
+{
+ const struct ofp_port_mod *opm = msg;
+
+ dp_update_port_flags(dp, &opm->desc);
+
+ return 0;
+}
+
+static int
+add_flow(struct datapath *dp, const struct ofp_flow_mod *ofm)
+{
+ int error = -ENOMEM;
+ int n_acts;
+ struct sw_flow *flow;
+
+
+ /* Check number of actions. */
+ n_acts = (ntohs(ofm->header.length) - sizeof *ofm) / sizeof *ofm->actions;
+ if (n_acts > MAX_ACTIONS) {
+ error = -E2BIG;
+ goto error;
+ }
+
+ /* Allocate memory. */
+ flow = flow_alloc(n_acts);
+ if (flow == NULL)
+ goto error;
+
+ /* Fill out flow. */
+ flow_extract_match(&flow->key, &ofm->match);
+ flow->group_id = ntohl(ofm->group_id);
+ flow->max_idle = ntohs(ofm->max_idle);
+ flow->timeout = time(0) + flow->max_idle; /* FIXME */
+ flow->n_actions = n_acts;
+ flow->created = time(0); /* FIXME */
+ flow->byte_count = 0;
+ flow->packet_count = 0;
+ memcpy(flow->actions, ofm->actions, n_acts * sizeof *flow->actions);
+
+ /* Act. */
+ error = chain_insert(dp->chain, flow);
+ if (error) {
+ goto error_free_flow;
+ }
+ error = 0;
+ if (ntohl(ofm->buffer_id) != UINT32_MAX) {
+ struct buffer *buffer = retrieve_buffer(ntohl(ofm->buffer_id));
+ if (buffer) {
+ struct sw_flow_key key;
+ uint16_t in_port = ntohs(ofm->match.in_port);
+ flow_used(flow, buffer);
+ flow_extract(buffer, in_port, &key.flow);
+ execute_actions(dp, buffer, in_port,
+ &key, ofm->actions, n_acts);
+ } else {
+ error = -ESRCH;
+ }
+ }
+ return error;
+
+error_free_flow:
+ flow_free(flow);
+error:
+ if (ntohl(ofm->buffer_id) != (uint32_t) -1)
+ discard_buffer(ntohl(ofm->buffer_id));
+ return error;
+}
+
+static int
+recv_flow(struct datapath *dp, const void *msg)
+{
+ const struct ofp_flow_mod *ofm = msg;
+ uint16_t command = ntohs(ofm->command);
+
+ if (command == OFPFC_ADD) {
+ return add_flow(dp, ofm);
+ } else if (command == OFPFC_DELETE) {
+ struct sw_flow_key key;
+ flow_extract_match(&key, &ofm->match);
+ return chain_delete(dp->chain, &key, 0) ? 0 : -ESRCH;
+ } else if (command == OFPFC_DELETE_STRICT) {
+ struct sw_flow_key key;
+ flow_extract_match(&key, &ofm->match);
+ return chain_delete(dp->chain, &key, 1) ? 0 : -ESRCH;
+ } else {
+ return -ENODEV;
+ }
+}
+
+/* 'msg', which is 'length' bytes long, was received from the control path.
+ * Apply it to datapath 'dp'. */
+int
+fwd_control_input(struct datapath *dp, const void *msg, size_t length)
+{
+
+ struct openflow_packet {
+ size_t min_size;
+ int (*handler)(struct datapath *, const void *);
+ };
+
+ static const struct openflow_packet packets[] = {
+ [OFPT_CONTROL_HELLO] = {
+ sizeof (struct ofp_control_hello),
+ recv_control_hello,
+ },
+ [OFPT_PACKET_OUT] = {
+ sizeof (struct ofp_packet_out),
+ recv_packet_out,
+ },
+ [OFPT_FLOW_MOD] = {
+ sizeof (struct ofp_flow_mod),
+ recv_flow,
+ },
+ [OFPT_PORT_MOD] = {
+ sizeof (struct ofp_port_mod),
+ recv_port_mod,
+ },
+ };
+
+ const struct openflow_packet *pkt;
+ struct ofp_header *oh;
+
+ if (length < sizeof(struct ofp_header))
+ return -EINVAL;
+
+ oh = (struct ofp_header *) msg;
+ if (oh->version != 1 || oh->type >= ARRAY_SIZE(packets)
+ || ntohs(oh->length) > length)
+ return -EINVAL;
+
+ pkt = &packets[oh->type];
+ if (!pkt->handler)
+ return -ENOSYS;
+ if (length < pkt->min_size)
+ return -EFAULT;
+
+ return pkt->handler(dp, msg);
+}
+
+/* Packet buffering. */
+
+#define OVERWRITE_SECS 1
+
+struct packet_buffer {
+ struct buffer *buffer;
+ uint32_t cookie;
+ time_t timeout;
+};
+
+static struct packet_buffer buffers[N_PKT_BUFFERS];
+static unsigned int buffer_idx;
+
+uint32_t fwd_save_buffer(struct buffer *buffer)
+{
+ struct packet_buffer *p;
+ uint32_t id;
+
+ buffer_idx = (buffer_idx + 1) & PKT_BUFFER_MASK;
+ p = &buffers[buffer_idx];
+ if (p->buffer) {
+ /* Don't buffer packet if existing entry is less than
+ * OVERWRITE_SECS old. */
+ if (time(0) < p->timeout) { /* FIXME */
+ return -1;
+ } else {
+ buffer_delete(p->buffer);
+ }
+ }
+ /* Don't use maximum cookie value since the all-bits-1 id is
+ * special. */
+ if (++p->cookie >= (1u << PKT_COOKIE_BITS) - 1)
+ p->cookie = 0;
+ p->buffer = buffer_clone(buffer); /* FIXME */
+ p->timeout = time(0) + OVERWRITE_SECS; /* FIXME */
+ id = buffer_idx | (p->cookie << PKT_BUFFER_BITS);
+
+ return id;
+}
+
+static struct buffer *retrieve_buffer(uint32_t id)
+{
+ struct buffer *buffer = NULL;
+ struct packet_buffer *p;
+
+ p = &buffers[id & PKT_BUFFER_MASK];
+ if (p->cookie == id >> PKT_BUFFER_BITS) {
+ buffer = p->buffer;
+ p->buffer = NULL;
+ } else {
+ printf("cookie mismatch: %x != %x\n",
+ id >> PKT_BUFFER_BITS, p->cookie);
+ }
+
+ return buffer;
+}
+
+static void discard_buffer(uint32_t id)
+{
+ struct packet_buffer *p;
+
+ p = &buffers[id & PKT_BUFFER_MASK];
+ if (p->cookie == id >> PKT_BUFFER_BITS) {
+ buffer_delete(p->buffer);
+ p->buffer = NULL;
+ }
+}
+
+void fwd_exit(void)
+{
+ int i;
+
+ for (i = 0; i < N_PKT_BUFFERS; i++)
+ buffer_delete(buffers[i].buffer);
+}
--- /dev/null
+/* Copyright (C) 2008 Board of Trustees, Leland Stanford Jr. University.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#ifndef FORWARD_H
+#define FORWARD_H 1
+
+#include <stddef.h>
+#include <stdint.h>
+
+struct buffer;
+struct datapath;
+struct ofp_action;
+struct sw_flow_key;
+
+/* Buffers are identified to userspace by a 31-bit opaque ID. We divide the ID
+ * into a buffer number (low bits) and a cookie (high bits). The buffer number
+ * is an index into an array of buffers. The cookie distinguishes between
+ * different packets that have occupied a single buffer. Thus, the more
+ * buffers we have, the lower-quality the cookie... */
+#define PKT_BUFFER_BITS 8
+#define N_PKT_BUFFERS (1 << PKT_BUFFER_BITS)
+#define PKT_BUFFER_MASK (N_PKT_BUFFERS - 1)
+
+#define PKT_COOKIE_BITS (32 - PKT_BUFFER_BITS)
+
+
+void fwd_port_input(struct datapath *, struct buffer *, int in_port);
+int fwd_control_input(struct datapath *, const void *, size_t);
+
+uint32_t fwd_save_buffer(struct buffer *);
+
+void fwd_exit(void);
+
+struct buffer *execute_setter(struct buffer *, uint16_t,
+ const struct sw_flow_key *,
+ const struct ofp_action *);
+
+#endif /* forward.h */
--- /dev/null
+/* Copyright (C) 2008 Board of Trustees, Leland Stanford Jr. University.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "netdev.h"
+
+#include <assert.h>
+#include <errno.h>
+#include <arpa/inet.h>
+#include <inttypes.h>
+#include <linux/types.h>
+#include <linux/ethtool.h>
+#include <linux/sockios.h>
+#include <sys/types.h>
+#include <sys/ioctl.h>
+#include <sys/socket.h>
+#include <netpacket/packet.h>
+#include <net/ethernet.h>
+#include <net/if.h>
+#include <net/if_arp.h>
+#include <net/if_packet.h>
+#include <netinet/in.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+#include "list.h"
+#include "fatal-signal.h"
+#include "buffer.h"
+#include "openflow.h"
+#include "packets.h"
+
+#define THIS_MODULE VLM_netdev
+#include "vlog.h"
+
+struct netdev {
+ struct list node;
+ char *name;
+ int fd;
+ uint8_t etheraddr[ETH_ADDR_LEN];
+ int speed;
+ uint32_t features;
+ int save_flags;
+};
+
+static struct list netdev_list = LIST_INITIALIZER(&netdev_list);
+
+static void init_netdev(void);
+static int restore_flags(struct netdev *netdev);
+
+/* Check whether device NAME has an IPv4 address assigned to it and, if so, log
+ * an error. */
+static void
+check_ipv4_address(const char *name)
+{
+ int sock;
+ struct ifreq ifr;
+
+ sock = socket(AF_INET, SOCK_DGRAM, 0);
+ if (sock < 0) {
+ VLOG_WARN("socket(AF_INET): %s", strerror(errno));
+ return;
+ }
+
+ strncpy(ifr.ifr_name, name, sizeof ifr.ifr_name);
+ ifr.ifr_addr.sa_family = AF_INET;
+ if (ioctl(sock, SIOCGIFADDR, &ifr) == 0) {
+ VLOG_ERR("%s device has assigned IP address %s", name,
+ inet_ntoa(((struct sockaddr_in*) &ifr.ifr_addr)->sin_addr));
+ }
+
+ close(sock);
+}
+
+/* Check whether device NAME has an IPv6 address assigned to it and, if so, log
+ * an error. */
+static void
+check_ipv6_address(const char *name)
+{
+ FILE *file;
+ char line[128];
+
+ file = fopen("/proc/net/if_inet6", "r");
+ if (file == NULL) {
+ return;
+ }
+
+ while (fgets(line, sizeof line, file)) {
+ struct in6_addr in6;
+ uint8_t *s6 = in6.s6_addr;
+ char ifname[16 + 1];
+
+#define X8 "%2"SCNx8
+ if (sscanf(line, " "X8 X8 X8 X8 X8 X8 X8 X8 X8 X8 X8 X8 X8 X8 X8 X8
+ "%*x %*x %*x %*x %16s\n",
+ &s6[0], &s6[1], &s6[2], &s6[3],
+ &s6[4], &s6[5], &s6[6], &s6[7],
+ &s6[8], &s6[9], &s6[10], &s6[11],
+ &s6[12], &s6[13], &s6[14], &s6[15],
+ ifname) == 17
+ && !strcmp(name, ifname))
+ {
+ char in6_name[INET6_ADDRSTRLEN + 1];
+ inet_ntop(AF_INET6, &in6, in6_name, sizeof in6_name);
+ VLOG_ERR("%s device has assigned IPv6 address %s",
+ name, in6_name);
+ }
+ }
+
+ fclose(file);
+}
+
+static void
+do_ethtool(struct netdev *netdev)
+{
+ struct ifreq ifr;
+ struct ethtool_cmd ecmd;
+
+ netdev->speed = 0;
+ netdev->features = 0;
+
+ memset(&ifr, 0, sizeof ifr);
+ strncpy(ifr.ifr_name, netdev->name, sizeof ifr.ifr_name);
+ ifr.ifr_data = (caddr_t) &ecmd;
+
+ memset(&ecmd, 0, sizeof ecmd);
+ ecmd.cmd = ETHTOOL_GSET;
+ if (ioctl(netdev->fd, SIOCETHTOOL, &ifr) == 0) {
+ if (ecmd.supported & SUPPORTED_10baseT_Half) {
+ netdev->features |= OFPPF_10MB_HD;
+ }
+ if (ecmd.supported & SUPPORTED_10baseT_Full) {
+ netdev->features |= OFPPF_10MB_FD;
+ }
+ if (ecmd.supported & SUPPORTED_100baseT_Half) {
+ netdev->features |= OFPPF_100MB_HD;
+ }
+ if (ecmd.supported & SUPPORTED_100baseT_Full) {
+ netdev->features |= OFPPF_100MB_FD;
+ }
+ if (ecmd.supported & SUPPORTED_1000baseT_Half) {
+ netdev->features |= OFPPF_1GB_HD;
+ }
+ if (ecmd.supported & SUPPORTED_1000baseT_Full) {
+ netdev->features |= OFPPF_1GB_FD;
+ }
+ /* 10Gbps half-duplex doesn't exist... */
+ if (ecmd.supported & SUPPORTED_10000baseT_Full) {
+ netdev->features |= OFPPF_10GB_FD;
+ }
+
+ switch (ecmd.speed) {
+ case SPEED_10:
+ netdev->speed = 10;
+ break;
+
+ case SPEED_100:
+ netdev->speed = 100;
+ break;
+
+ case SPEED_1000:
+ netdev->speed = 1000;
+ break;
+
+ case SPEED_2500:
+ netdev->speed = 2500;
+ break;
+
+ case SPEED_10000:
+ netdev->speed = 10000;
+ break;
+ }
+ } else {
+ VLOG_DBG("ioctl(SIOCETHTOOL) failed: %s", strerror(errno));
+ }
+}
+
+int
+netdev_open(const char *name, struct netdev **netdev_)
+{
+ int fd;
+ struct sockaddr sa;
+ struct ifreq ifr;
+ unsigned int ifindex;
+ socklen_t rcvbuf_len;
+ int rcvbuf; /* SO_RCVBUF is reported as an int, not a size_t. */
+ uint8_t etheraddr[ETH_ADDR_LEN];
+ int error;
+ struct netdev *netdev;
+
+ *netdev_ = NULL;
+ init_netdev();
+
+ /* Create raw socket.
+ *
+ * We have to use SOCK_PACKET, despite its deprecation, because only
+ * SOCK_PACKET lets us set the hardware source address of outgoing
+ * packets. */
+ fd = socket(PF_PACKET, SOCK_PACKET, htons(ETH_P_ALL));
+ if (fd < 0) {
+ return errno;
+ }
+
+ /* Bind to specific ethernet device. */
+ memset(&sa, 0, sizeof sa);
+ sa.sa_family = AF_UNSPEC;
+ strncpy((char *) sa.sa_data, name, sizeof sa.sa_data);
+ if (bind(fd, &sa, sizeof sa) < 0) {
+ VLOG_ERR("bind to %s failed: %s", name, strerror(errno));
+ goto error;
+ }
+
+ /* Between the socket() and bind() calls above, the socket receives all
+ * packets on all system interfaces. We do not want to receive that
+ * data, but there is no way to avoid it. So we must now drain out the
+ * receive queue. There is no way to know how long the receive queue is,
+ * but we know that the total number of bytes queued does not exceed the
+ * receive buffer size, so we pull packets until none are left or we've
+ * read that many bytes. */
+ rcvbuf_len = sizeof rcvbuf;
+ if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, &rcvbuf_len) < 0) {
+ VLOG_ERR("getsockopt(SO_RCVBUF) on %s device failed: %s",
+ name, strerror(errno));
+ goto error;
+ }
+ while (rcvbuf > 0) {
+ char buffer;
+ ssize_t n_bytes = recv(fd, &buffer, 1, MSG_TRUNC | MSG_DONTWAIT);
+ if (n_bytes <= 0) {
+ break;
+ }
+ rcvbuf -= n_bytes;
+ }
+
+ /* Get ethernet device index and hardware address. */
+ strncpy(ifr.ifr_name, name, sizeof ifr.ifr_name);
+ if (ioctl(fd, SIOCGIFINDEX, &ifr) < 0) {
+ VLOG_ERR("ioctl(SIOCGIFINDEX) on %s device failed: %s",
+ name, strerror(errno));
+ goto error;
+ }
+ ifindex = ifr.ifr_ifindex;
+ if (ioctl(fd, SIOCGIFHWADDR, &ifr) < 0) {
+ VLOG_ERR("ioctl(SIOCGIFHWADDR) on %s device failed: %s",
+ name, strerror(errno));
+ goto error;
+ }
+ if (ifr.ifr_hwaddr.sa_family != AF_UNSPEC
+ && ifr.ifr_hwaddr.sa_family != ARPHRD_ETHER) {
+ VLOG_WARN("%s device has unknown hardware address family %d",
+ name, (int) ifr.ifr_hwaddr.sa_family);
+ }
+ memcpy(etheraddr, ifr.ifr_hwaddr.sa_data, sizeof etheraddr);
+
+ /* Allocate network device. */
+ netdev = xmalloc(sizeof *netdev);
+ netdev->name = xstrdup(name);
+ netdev->fd = fd;
+ memcpy(netdev->etheraddr, etheraddr, sizeof etheraddr);
+
+ /* Get speed, features. */
+ do_ethtool(netdev);
+
+ /* Save flags to restore at close or exit. */
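+ if (ioctl(fd, SIOCGIFFLAGS, &ifr) < 0) {
+ error = errno;
+ VLOG_ERR("ioctl(SIOCGIFFLAGS) on %s device failed: %s",
+ name, strerror(error));
+ /* 'netdev' is not on 'netdev_list' yet, so free it directly. */
+ free(netdev->name);
+ free(netdev);
+ close(fd);
+ return error;
+ }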
+ netdev->save_flags = ifr.ifr_flags;
+ fatal_signal_block();
+ list_push_back(&netdev_list, &netdev->node);
+ fatal_signal_unblock();
+
+ /* Bring up interface and set promiscuous mode. */
+ ifr.ifr_flags |= IFF_PROMISC | IFF_UP;
+ if (ioctl(fd, SIOCSIFFLAGS, &ifr) < 0) {
+ error = errno;
+ VLOG_ERR("failed to set promiscuous mode on %s device: %s",
+ name, strerror(errno));
+ netdev_close(netdev);
+ return error;
+ }
+
+ /* Report IP addresses to administrator. */
+ check_ipv4_address(name);
+ check_ipv6_address(name);
+
+ /* Success! */
+ *netdev_ = netdev;
+ return 0;
+
+error:
+ error = errno;
+ close(fd);
+ return error;
+}
+
+void
+netdev_close(struct netdev *netdev)
+{
+ if (netdev) {
+ /* Bring down interface and drop promiscuous mode, if we brought up
+ * the interface or enabled promiscuous mode. */
+ int error;
+ fatal_signal_block();
+ error = restore_flags(netdev);
+ list_remove(&netdev->node);
+ fatal_signal_unblock();
+ if (error) {
+ VLOG_WARN("failed to restore network device flags on %s: %s",
+ netdev->name, strerror(error));
+ }
+
+ /* Free. */
+ free(netdev->name);
+ close(netdev->fd);
+ free(netdev);
+ }
+}
+
+static void
+pad_to_minimum_length(struct buffer *buffer)
+{
+ if (buffer->size < ETH_TOTAL_MIN) {
+ size_t shortage = ETH_TOTAL_MIN - buffer->size;
+ memset(buffer_put_uninit(buffer, shortage), 0, shortage);
+ }
+}
+
+int
+netdev_recv(struct netdev *netdev, struct buffer *buffer, bool block)
+{
+ ssize_t n_bytes;
+
+ assert(buffer->size == 0);
+ assert(buffer_tailroom(buffer) >= ETH_TOTAL_MIN);
+ do {
+ n_bytes = recv(netdev->fd,
+ buffer_tail(buffer), buffer_tailroom(buffer),
+ block ? 0 : MSG_DONTWAIT);
+ } while (n_bytes < 0 && errno == EINTR);
+ if (n_bytes < 0) {
+ if (errno != EAGAIN) {
+ VLOG_WARN("error receiving Ethernet packet on %s: %s",
+ netdev->name, strerror(errno));
+ }
+ return errno;
+ } else {
+ buffer->size += n_bytes;
+
+ /* When the kernel internally sends out an Ethernet frame on an
+ * interface, it gives us a copy *before* padding the frame to the
+ * minimum length. Thus, when it sends out something like an ARP
+ * request, we see a too-short frame. So pad it out to the minimum
+ * length. */
+ pad_to_minimum_length(buffer);
+ return 0;
+ }
+}
+
+int
+netdev_send(struct netdev *netdev, struct buffer *buffer, bool block)
+{
+ ssize_t n_bytes;
+ const struct eth_header *eh;
+ struct sockaddr_pkt spkt;
+
+ /* Ensure packet is long enough. (Although all incoming packets are at
+ * least ETH_TOTAL_MIN bytes long, we could have trimmed some data off a
+ * minimum-size packet, e.g. by dropping a vlan header.) */
+ pad_to_minimum_length(buffer);
+
+ /* Construct packet sockaddr, which SOCK_PACKET requires. */
+ spkt.spkt_family = AF_PACKET;
+ strncpy((char *) spkt.spkt_device, netdev->name, sizeof spkt.spkt_device);
+ eh = buffer_at_assert(buffer, 0, sizeof *eh);
+ spkt.spkt_protocol = eh->eth_type;
+
+ do {
+ n_bytes = sendto(netdev->fd, buffer->data, buffer->size,
+ block ? 0 : MSG_DONTWAIT,
+ (const struct sockaddr *) &spkt, sizeof spkt);
+ } while (n_bytes < 0 && errno == EINTR);
+ if (n_bytes < 0) {
+ if (errno != EAGAIN) {
+ VLOG_WARN("error sending Ethernet packet on %s: %s",
+ netdev->name, strerror(errno));
+ }
+ return errno;
+ } else if (n_bytes != buffer->size) {
+ VLOG_WARN("sent partial Ethernet packet (%d bytes of %d) on %s",
+ (int) n_bytes, (int) buffer->size, netdev->name);
+ return EMSGSIZE;
+ } else {
+ return 0;
+ }
+}
+
+const uint8_t *
+netdev_get_etheraddr(const struct netdev *netdev)
+{
+ return netdev->etheraddr;
+}
+
+int
+netdev_get_fd(const struct netdev *netdev)
+{
+ return netdev->fd;
+}
+
+const char *
+netdev_get_name(const struct netdev *netdev)
+{
+ return netdev->name;
+}
+
+int
+netdev_get_speed(const struct netdev *netdev)
+{
+ return netdev->speed;
+}
+
+uint32_t
+netdev_get_features(const struct netdev *netdev)
+{
+ return netdev->features;
+}
+\f
+static void restore_all_flags(void *aux);
+
+static void
+init_netdev(void)
+{
+ static bool inited;
+ if (!inited) {
+ inited = true;
+ fatal_signal_add_hook(restore_all_flags, NULL);
+ }
+}
+
+static int
+restore_flags(struct netdev *netdev)
+{
+ struct ifreq ifr;
+
+ /* Get current flags. */
+ strncpy(ifr.ifr_name, netdev->name, sizeof ifr.ifr_name);
+ if (ioctl(netdev->fd, SIOCGIFFLAGS, &ifr) < 0) {
+ return errno;
+ }
+
+ /* Restore flags that we might have changed, if necessary. */
+ if ((ifr.ifr_flags ^ netdev->save_flags) & (IFF_PROMISC | IFF_UP)) {
+ ifr.ifr_flags &= ~(IFF_PROMISC | IFF_UP);
+ ifr.ifr_flags |= netdev->save_flags & (IFF_PROMISC | IFF_UP);
+ if (ioctl(netdev->fd, SIOCSIFFLAGS, &ifr) < 0) {
+ return errno;
+ }
+ }
+
+ return 0;
+}
+
+static void
+restore_all_flags(void *aux UNUSED)
+{
+ struct netdev *netdev;
+ LIST_FOR_EACH (netdev, struct netdev, node, &netdev_list) {
+ restore_flags(netdev);
+ }
+}
--- /dev/null
+/* Copyright (C) 2008 Board of Trustees, Leland Stanford Jr. University.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#ifndef NETDEV_H
+#define NETDEV_H 1
+
+#include <stdbool.h>
+#include <stdint.h>
+
+struct buffer;
+
+struct netdev;
+int netdev_open(const char *name, struct netdev **);
+void netdev_close(struct netdev *);
+int netdev_recv(struct netdev *, struct buffer *, bool block);
+int netdev_send(struct netdev *, struct buffer *, bool block);
+const uint8_t *netdev_get_etheraddr(const struct netdev *);
+int netdev_get_fd(const struct netdev *);
+const char *netdev_get_name(const struct netdev *);
+int netdev_get_speed(const struct netdev *);
+uint32_t netdev_get_features(const struct netdev *);
+
+#endif /* netdev.h */
--- /dev/null
+/* Copyright (C) 2008 Board of Trustees, Leland Stanford Jr. University.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "switch-flow.h"
+#include <arpa/inet.h>
+#include <assert.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include "buffer.h"
+#include "openflow.h"
+#include "packets.h"
+
+/* Internal function used to compare fields in flow. */
+static inline
+int flow_fields_match(const struct flow *a, const struct flow *b, uint16_t w)
+{
+ return ((w & OFPFW_IN_PORT || a->in_port == b->in_port)
+ && (w & OFPFW_DL_VLAN || a->dl_vlan == b->dl_vlan)
+ && (w & OFPFW_DL_SRC || !memcmp(a->dl_src, b->dl_src, ETH_ADDR_LEN))
+ && (w & OFPFW_DL_DST || !memcmp(a->dl_dst, b->dl_dst, ETH_ADDR_LEN))
+ && (w & OFPFW_DL_TYPE || a->dl_type == b->dl_type)
+ && (w & OFPFW_NW_SRC || a->nw_src == b->nw_src)
+ && (w & OFPFW_NW_DST || a->nw_dst == b->nw_dst)
+ && (w & OFPFW_NW_PROTO || a->nw_proto == b->nw_proto)
+ && (w & OFPFW_TP_SRC || a->tp_src == b->tp_src)
+ && (w & OFPFW_TP_DST || a->tp_dst == b->tp_dst));
+}
+
+/* Returns nonzero if 'a' and 'b' match, that is, if their fields are equal
+ * modulo wildcards, zero otherwise. */
+inline
+int flow_matches(const struct sw_flow_key *a, const struct sw_flow_key *b)
+{
+ return flow_fields_match(&a->flow, &b->flow, a->wildcards | b->wildcards);
+}
+
+/* Returns nonzero if 't' (the table entry's key) and 'd' (the key
+ * describing the deletion) match, that is, if their fields are
+ * equal modulo wildcards, zero otherwise. If 'strict' is nonzero, the
+ * wildcards must match in both 't' and 'd'. Note that the
+ * table's wildcards are ignored unless 'strict' is set. */
+inline
+int flow_del_matches(const struct sw_flow_key *t, const struct sw_flow_key *d, int strict)
+{
+ if (strict && t->wildcards != d->wildcards)
+ return 0;
+
+ return flow_fields_match(&t->flow, &d->flow, d->wildcards);
+}
+
+void flow_extract_match(struct sw_flow_key* to, const struct ofp_match* from)
+{
+ to->wildcards = ntohs(from->wildcards) & OFPFW_ALL;
+ to->flow.in_port = from->in_port;
+ to->flow.dl_vlan = from->dl_vlan;
+ memcpy(to->flow.dl_src, from->dl_src, ETH_ADDR_LEN);
+ memcpy(to->flow.dl_dst, from->dl_dst, ETH_ADDR_LEN);
+ to->flow.dl_type = from->dl_type;
+ to->flow.nw_src = from->nw_src;
+ to->flow.nw_dst = from->nw_dst;
+ to->flow.nw_proto = from->nw_proto;
+ to->flow.tp_src = from->tp_src;
+ to->flow.tp_dst = from->tp_dst;
+ to->flow.reserved = 0;
+}
+
+void flow_fill_match(struct ofp_match* to, const struct sw_flow_key* from)
+{
+ to->wildcards = htons(from->wildcards);
+ to->in_port = from->flow.in_port;
+ to->dl_vlan = from->flow.dl_vlan;
+ memcpy(to->dl_src, from->flow.dl_src, ETH_ADDR_LEN);
+ memcpy(to->dl_dst, from->flow.dl_dst, ETH_ADDR_LEN);
+ to->dl_type = from->flow.dl_type;
+ to->nw_src = from->flow.nw_src;
+ to->nw_dst = from->flow.nw_dst;
+ to->nw_proto = from->flow.nw_proto;
+ to->tp_src = from->flow.tp_src;
+ to->tp_dst = from->flow.tp_dst;
+ memset(to->pad, '\0', sizeof(to->pad));
+}
+
+/* Allocates and returns a new flow with room for 'n_actions' actions.
+ * Returns the new flow or a null pointer on failure. */
+struct sw_flow *flow_alloc(int n_actions)
+{
+ struct sw_flow *flow = malloc(sizeof *flow);
+ if (!flow)
+ return NULL;
+
+ flow->n_actions = n_actions;
+ flow->actions = malloc(n_actions * sizeof *flow->actions);
+ if (!flow->actions && n_actions > 0) {
+ free(flow);
+ return NULL;
+ }
+ return flow;
+}
+
+/* Frees 'flow' immediately. */
+void flow_free(struct sw_flow *flow)
+{
+ if (!flow) {
+ return;
+ }
+ free(flow->actions);
+ free(flow);
+}
+
+/* Prints a representation of 'key' to standard output. */
+void print_flow(const struct sw_flow_key *key)
+{
+ const struct flow *f = &key->flow;
+ printf("wild%04x port%04x:vlan%04x mac%02x:%02x:%02x:%02x:%02x:%02x"
+ "->%02x:%02x:%02x:%02x:%02x:%02x "
+ "proto%04x ip%u.%u.%u.%u->%u.%u.%u.%u port%d->%d\n",
+ key->wildcards, ntohs(f->in_port), ntohs(f->dl_vlan),
+ f->dl_src[0], f->dl_src[1], f->dl_src[2],
+ f->dl_src[3], f->dl_src[4], f->dl_src[5],
+ f->dl_dst[0], f->dl_dst[1], f->dl_dst[2],
+ f->dl_dst[3], f->dl_dst[4], f->dl_dst[5],
+ ntohs(f->dl_type),
+ ((unsigned char *)&f->nw_src)[0],
+ ((unsigned char *)&f->nw_src)[1],
+ ((unsigned char *)&f->nw_src)[2],
+ ((unsigned char *)&f->nw_src)[3],
+ ((unsigned char *)&f->nw_dst)[0],
+ ((unsigned char *)&f->nw_dst)[1],
+ ((unsigned char *)&f->nw_dst)[2],
+ ((unsigned char *)&f->nw_dst)[3],
+ ntohs(f->tp_src), ntohs(f->tp_dst));
+}
+
+int flow_timeout(struct sw_flow *flow)
+{
+ if (flow->max_idle == OFP_FLOW_PERMANENT)
+ return 0;
+
+ /* FIXME */
+ return time(0) > flow->timeout;
+}
+
+void flow_used(struct sw_flow *flow, struct buffer *buffer)
+{
+ if (flow->max_idle != OFP_FLOW_PERMANENT)
+ flow->timeout = time(0) + flow->max_idle;
+
+ flow->packet_count++;
+ flow->byte_count += buffer->size;
+}
--- /dev/null
+/* Copyright (C) 2008 Board of Trustees, Leland Stanford Jr. University.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#ifndef SWITCH_FLOW_H
+#define SWITCH_FLOW_H 1
+
+#include <time.h>
+#include "flow.h"
+#include "list.h"
+
+struct ofp_match;
+
+/* Identification data for a flow. */
+struct sw_flow_key {
+ struct flow flow; /* Flow data (in network byte order). */
+ uint32_t wildcards; /* Wildcard fields (in host byte order). */
+};
+
+/* Maximum number of actions in a single flow entry. */
+#define MAX_ACTIONS 16
+
+struct sw_flow {
+ struct sw_flow_key key;
+
+ uint32_t group_id; /* Flow group ID (for QoS). */
+ uint16_t max_idle; /* Idle time before discarding (seconds). */
+ time_t created; /* When the flow was created. */
+ time_t timeout; /* When the flow expires (if idle). */
+ uint64_t packet_count; /* Number of packets seen. */
+ uint64_t byte_count; /* Number of bytes seen. */
+ struct list node;
+
+ /* Actions (XXX probably most flows have only a single action). */
+ unsigned int n_actions;
+ struct ofp_action *actions;
+};
+
+int flow_matches(const struct sw_flow_key *, const struct sw_flow_key *);
+int flow_del_matches(const struct sw_flow_key *, const struct sw_flow_key *,
+ int);
+struct sw_flow *flow_alloc(int n_actions);
+void flow_free(struct sw_flow *);
+void flow_deferred_free(struct sw_flow *);
+void flow_extract_match(struct sw_flow_key* to, const struct ofp_match* from);
+void flow_fill_match(struct ofp_match* to, const struct sw_flow_key* from);
+
+void print_flow(const struct sw_flow_key *);
+int flow_timeout(struct sw_flow *flow);
+void flow_used(struct sw_flow *flow, struct buffer *buffer);
+
+#endif /* switch-flow.h */
--- /dev/null
+/* Copyright (C) 2008 Board of Trustees, Leland Stanford Jr. University.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <errno.h>
+#include <getopt.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "command-line.h"
+#include "controller.h"
+#include "datapath.h"
+#include "fault.h"
+#include "openflow.h"
+#include "poll-loop.h"
+#include "queue.h"
+#include "util.h"
+#include "vconn.h"
+#include "vconn-ssl.h"
+#include "vlog-socket.h"
+
+#define THIS_MODULE VLM_switch
+#include "vlog.h"
+
+static void parse_options(int argc, char *argv[]);
+static void usage(void) NO_RETURN;
+
+static bool reliable = true;
+static struct datapath *dp;
+static uint64_t dpid = UINT64_MAX;
+static char *port_list;
+
+static void add_ports(struct datapath *dp, char *port_list);
+
+int
+main(int argc, char *argv[])
+{
+ struct controller_connection cc;
+ int error;
+
+ set_program_name(argv[0]);
+ register_fault_handlers();
+ vlog_init();
+ parse_options(argc, argv);
+
+ if (argc - optind != 1) {
+        fatal(0, "exactly one controller argument is required; "
+              "use --help for usage");
+ }
+
+ controller_init(&cc, argv[optind], reliable);
+ error = dp_new(&dp, dpid, &cc);
+ if (error) {
+ fatal(error, "could not create datapath");
+ }
+ if (port_list) {
+ add_ports(dp, port_list);
+ }
+
+ error = vlog_server_listen(NULL, NULL);
+ if (error) {
+ fatal(error, "could not listen for vlog connections");
+ }
+
+ for (;;) {
+ controller_run(&cc, dp);
+ dp_run(dp);
+ dp_wait(dp);
+ controller_wait(&cc);
+ poll_block();
+ }
+
+ return 0;
+}
+
+static void
+add_ports(struct datapath *dp, char *port_list)
+{
+ char *port, *save_ptr;
+
+ /* Glibc 2.7 has a bug in strtok_r when compiling with optimization that
+ * can cause segfaults here:
+ * http://sources.redhat.com/bugzilla/show_bug.cgi?id=5614.
+ * Using ",," instead of the obvious "," works around it. */
+ for (port = strtok_r(port_list, ",,", &save_ptr); port;
+ port = strtok_r(NULL, ",,", &save_ptr)) {
+ int error = dp_add_port(dp, port);
+ if (error) {
+ fatal(error, "failed to add port %s", port);
+ }
+ }
+}
+
+static void
+parse_options(int argc, char *argv[])
+{
+ static struct option long_options[] = {
+ {"interfaces", required_argument, 0, 'i'},
+ {"unreliable", no_argument, 0, 'u'},
+ {"datapath-id", required_argument, 0, 'd'},
+ {"verbose", optional_argument, 0, 'v'},
+ {"help", no_argument, 0, 'h'},
+ {"version", no_argument, 0, 'V'},
+#ifdef HAVE_OPENSSL
+ {"private-key", required_argument, 0, 'p'},
+ {"certificate", required_argument, 0, 'c'},
+ {"ca-cert", required_argument, 0, 'C'},
+#endif
+ {0, 0, 0, 0},
+ };
+ char *short_options = long_options_to_short_options(long_options);
+
+ for (;;) {
+ int indexptr;
+ int c;
+
+ c = getopt_long(argc, argv, short_options, long_options, &indexptr);
+ if (c == -1) {
+ break;
+ }
+
+ switch (c) {
+ case 'u':
+ reliable = false;
+ break;
+
+ case 'd':
+ if (strlen(optarg) != 12
+ || strspn(optarg, "0123456789abcdefABCDEF") != 12) {
+ fatal(0, "argument to -d or --datapath-id must be "
+ "exactly 12 hex digits");
+ }
+ dpid = strtoll(optarg, NULL, 16);
+ if (!dpid) {
+ fatal(0, "argument to -d or --datapath-id must be nonzero");
+ }
+ break;
+
+ case 'h':
+ usage();
+
+ case 'V':
+ printf("%s "VERSION" compiled "__DATE__" "__TIME__"\n", argv[0]);
+ exit(EXIT_SUCCESS);
+
+ case 'v':
+ vlog_set_verbosity(optarg);
+ break;
+
+ case 'i':
+ if (!port_list) {
+ port_list = optarg;
+ } else {
+ port_list = xasprintf("%s,%s", port_list, optarg);
+ }
+ break;
+
+#ifdef HAVE_OPENSSL
+ case 'p':
+ vconn_ssl_set_private_key_file(optarg);
+ break;
+
+ case 'c':
+ vconn_ssl_set_certificate_file(optarg);
+ break;
+
+ case 'C':
+ vconn_ssl_set_ca_cert_file(optarg);
+ break;
+#endif
+
+ case '?':
+ exit(EXIT_FAILURE);
+
+ default:
+ abort();
+ }
+ }
+ free(short_options);
+}
+
+static void
+usage(void)
+{
+ printf("%s: userspace OpenFlow switch\n"
+ "usage: %s [OPTIONS] CONTROLLER\n"
+ "CONTROLLER must be one of the following:\n"
+ " tcp:HOST[:PORT] PORT (default: %d) on remote TCP HOST\n",
+ program_name, program_name, OFP_TCP_PORT);
+#ifdef HAVE_OPENSSL
+ printf(" ssl:HOST[:PORT] SSL PORT (default: %d) on remote HOST\n"
+ "\nPKI configuration (required to use SSL):\n"
+ " -p, --private-key=FILE file with private key\n"
+ " -c, --certificate=FILE file with certificate for private key\n"
+ " -C, --ca-cert=FILE file with peer CA certificate\n",
+ OFP_SSL_PORT);
+#endif
+ printf("Options:\n"
+ " -i, --interfaces=NETDEV[,NETDEV]...\n"
+ " add specified initial switch ports\n"
+ " -d, --datapath-id=ID Use ID as the OpenFlow switch ID\n"
+ " (ID must consist of 12 hex digits)\n"
+ " -u, --unreliable do not reconnect to controller\n"
+ " -v, --verbose set maximum verbosity level\n"
+ " -h, --help display this help message\n"
+ " -V, --version display version information\n");
+ exit(EXIT_SUCCESS);
+}
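The -d/--datapath-id handling above accepts exactly 12 hex digits and rejects a zero ID. A minimal sketch of the same check as a standalone function follows; `parse_dpid` is a hypothetical helper introduced only for illustration and is not part of this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not in the patch): parses a datapath ID written as
 * exactly 12 hex digits.  Returns 1 and stores the value in '*dpid' on
 * success, 0 if 's' is malformed or the ID is zero. */
static int
parse_dpid(const char *s, uint64_t *dpid)
{
    uint64_t value;

    /* Same validation as the option parser: length and character set. */
    if (strlen(s) != 12 || strspn(s, "0123456789abcdefABCDEF") != 12) {
        return 0;
    }

    /* 12 hex digits fit in 48 bits, so the conversion cannot overflow. */
    value = strtoull(s, NULL, 16);
    if (!value) {
        return 0;
    }
    *dpid = value;
    return 1;
}
```

Returning a status instead of calling fatal() keeps the check testable; the option parser could still call fatal() when the helper returns 0.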
--- /dev/null
+/* Copyright (C) 2008 Board of Trustees, Leland Stanford Jr. University.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "table.h"
+#include <assert.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include "crc32.h"
+#include "flow.h"
+#include "datapath.h"
+
+struct sw_table_hash {
+ struct sw_table swt;
+ struct crc32 crc32;
+ unsigned int n_flows;
+ unsigned int bucket_mask; /* Number of buckets minus 1. */
+ struct sw_flow **buckets;
+};
+
+static struct sw_flow **find_bucket(struct sw_table *swt,
+ const struct sw_flow_key *key)
+{
+ struct sw_table_hash *th = (struct sw_table_hash *) swt;
+ unsigned int crc = crc32_calculate(&th->crc32, key, sizeof *key);
+ return &th->buckets[crc & th->bucket_mask];
+}
+
+static struct sw_flow *table_hash_lookup(struct sw_table *swt,
+ const struct sw_flow_key *key)
+{
+ struct sw_flow *flow = *find_bucket(swt, key);
+ return flow && !memcmp(&flow->key, key, sizeof *key) ? flow : NULL;
+}
+
+static int table_hash_insert(struct sw_table *swt, struct sw_flow *flow)
+{
+ struct sw_table_hash *th = (struct sw_table_hash *) swt;
+ struct sw_flow **bucket;
+ int retval;
+
+ if (flow->key.wildcards != 0)
+ return 0;
+
+ bucket = find_bucket(swt, &flow->key);
+ if (*bucket == NULL) {
+ th->n_flows++;
+ *bucket = flow;
+ retval = 1;
+ } else {
+ struct sw_flow *old_flow = *bucket;
+ if (!memcmp(&old_flow->key, &flow->key, sizeof flow->key)) {
+ *bucket = flow;
+ flow_free(old_flow);
+ retval = 1;
+ } else {
+ retval = 0;
+ }
+ }
+ return retval;
+}
+
+/* Caller must update n_flows. */
+static void
+do_delete(struct sw_flow **bucket)
+{
+ flow_free(*bucket);
+ *bucket = NULL;
+}
+
+/* Returns number of deleted flows. */
+static int table_hash_delete(struct sw_table *swt,
+ const struct sw_flow_key *key, int strict)
+{
+ struct sw_table_hash *th = (struct sw_table_hash *) swt;
+ unsigned int count = 0;
+
+ if (key->wildcards == 0) {
+ struct sw_flow **bucket = find_bucket(swt, key);
+ struct sw_flow *flow = *bucket;
+ if (flow && !memcmp(&flow->key, key, sizeof *key)) {
+ do_delete(bucket);
+ count = 1;
+ }
+ } else {
+ unsigned int i;
+
+ for (i = 0; i <= th->bucket_mask; i++) {
+ struct sw_flow **bucket = &th->buckets[i];
+ struct sw_flow *flow = *bucket;
+ if (flow && flow_del_matches(&flow->key, key, strict)) {
+ do_delete(bucket);
+ count++;
+ }
+ }
+ }
+ th->n_flows -= count;
+ return count;
+}
+
+static int table_hash_timeout(struct datapath *dp, struct sw_table *swt)
+{
+ struct sw_table_hash *th = (struct sw_table_hash *) swt;
+ unsigned int i;
+ int count = 0;
+
+ for (i = 0; i <= th->bucket_mask; i++) {
+ struct sw_flow **bucket = &th->buckets[i];
+ struct sw_flow *flow = *bucket;
+ if (flow && flow_timeout(flow)) {
+ dp_send_flow_expired(dp, flow);
+ do_delete(bucket);
+ count++;
+ }
+ }
+ th->n_flows -= count;
+ return count;
+}
+
+static void table_hash_destroy(struct sw_table *swt)
+{
+ struct sw_table_hash *th = (struct sw_table_hash *) swt;
+ unsigned int i;
+ for (i = 0; i <= th->bucket_mask; i++) {
+ if (th->buckets[i]) {
+ flow_free(th->buckets[i]);
+ }
+ }
+ free(th->buckets);
+ free(th);
+}
+
+struct swt_iterator_hash {
+ struct sw_table_hash *th;
+ unsigned int bucket_i;
+};
+
+static struct sw_flow *next_flow(struct swt_iterator_hash *ih)
+{
+ for (;ih->bucket_i <= ih->th->bucket_mask; ih->bucket_i++) {
+ struct sw_flow *f = ih->th->buckets[ih->bucket_i];
+ if (f != NULL)
+ return f;
+ }
+
+ return NULL;
+}
+
+static int table_hash_iterator(struct sw_table *swt,
+ struct swt_iterator *swt_iter)
+{
+ struct swt_iterator_hash *ih;
+
+ swt_iter->private = ih = malloc(sizeof *ih);
+
+ if (ih == NULL)
+ return 0;
+
+ ih->th = (struct sw_table_hash *) swt;
+
+ ih->bucket_i = 0;
+ swt_iter->flow = next_flow(ih);
+
+ return 1;
+}
+
+static void table_hash_next(struct swt_iterator *swt_iter)
+{
+ struct swt_iterator_hash *ih;
+
+ if (swt_iter->flow == NULL)
+ return;
+
+ ih = (struct swt_iterator_hash *) swt_iter->private;
+
+ ih->bucket_i++;
+ swt_iter->flow = next_flow(ih);
+}
+
+static void table_hash_iterator_destroy(struct swt_iterator *swt_iter)
+{
+ free(swt_iter->private);
+}
+
+static void table_hash_stats(struct sw_table *swt,
+ struct sw_table_stats *stats)
+{
+ struct sw_table_hash *th = (struct sw_table_hash *) swt;
+ stats->name = "hash";
+ stats->n_flows = th->n_flows;
+ stats->max_flows = th->bucket_mask + 1;
+}
+
+struct sw_table *table_hash_create(unsigned int polynomial,
+ unsigned int n_buckets)
+{
+ struct sw_table_hash *th;
+ struct sw_table *swt;
+
+ th = malloc(sizeof *th);
+ if (th == NULL)
+ return NULL;
+
+ assert(!(n_buckets & (n_buckets - 1)));
+ th->buckets = calloc(n_buckets, sizeof *th->buckets);
+ if (th->buckets == NULL) {
+ printf("failed to allocate %u buckets\n", n_buckets);
+ free(th);
+ return NULL;
+ }
+    th->bucket_mask = n_buckets - 1;
+    th->n_flows = 0;            /* 'th' came from malloc(), so set it here. */
+
+ swt = &th->swt;
+ swt->lookup = table_hash_lookup;
+ swt->insert = table_hash_insert;
+ swt->delete = table_hash_delete;
+ swt->timeout = table_hash_timeout;
+ swt->destroy = table_hash_destroy;
+ swt->iterator = table_hash_iterator;
+ swt->iterator_next = table_hash_next;
+ swt->iterator_destroy = table_hash_iterator_destroy;
+ swt->stats = table_hash_stats;
+
+ crc32_init(&th->crc32, polynomial);
+
+ return swt;
+}
+
+/* Double-hashing table. */
+
+struct sw_table_hash2 {
+ struct sw_table swt;
+ struct sw_table *subtable[2];
+};
+
+static struct sw_flow *table_hash2_lookup(struct sw_table *swt,
+ const struct sw_flow_key *key)
+{
+ struct sw_table_hash2 *t2 = (struct sw_table_hash2 *) swt;
+ int i;
+
+ for (i = 0; i < 2; i++) {
+ struct sw_flow *flow = *find_bucket(t2->subtable[i], key);
+ if (flow && !memcmp(&flow->key, key, sizeof *key))
+ return flow;
+ }
+ return NULL;
+}
+
+static int table_hash2_insert(struct sw_table *swt, struct sw_flow *flow)
+{
+ struct sw_table_hash2 *t2 = (struct sw_table_hash2 *) swt;
+
+ if (table_hash_insert(t2->subtable[0], flow))
+ return 1;
+ return table_hash_insert(t2->subtable[1], flow);
+}
+
+static int table_hash2_delete(struct sw_table *swt,
+ const struct sw_flow_key *key, int strict)
+{
+ struct sw_table_hash2 *t2 = (struct sw_table_hash2 *) swt;
+ return (table_hash_delete(t2->subtable[0], key, strict)
+ + table_hash_delete(t2->subtable[1], key, strict));
+}
+
+static int table_hash2_timeout(struct datapath *dp, struct sw_table *swt)
+{
+ struct sw_table_hash2 *t2 = (struct sw_table_hash2 *) swt;
+ return (table_hash_timeout(dp, t2->subtable[0])
+ + table_hash_timeout(dp, t2->subtable[1]));
+}
+
+static void table_hash2_destroy(struct sw_table *swt)
+{
+ struct sw_table_hash2 *t2 = (struct sw_table_hash2 *) swt;
+ table_hash_destroy(t2->subtable[0]);
+ table_hash_destroy(t2->subtable[1]);
+ free(t2);
+}
+
+struct swt_iterator_hash2 {
+ struct sw_table_hash2 *th2;
+ struct swt_iterator ih;
+ uint8_t table_i;
+};
+
+static int table_hash2_iterator(struct sw_table *swt,
+ struct swt_iterator *swt_iter)
+{
+ struct swt_iterator_hash2 *ih2;
+
+ swt_iter->private = ih2 = malloc(sizeof *ih2);
+ if (ih2 == NULL)
+ return 0;
+
+ ih2->th2 = (struct sw_table_hash2 *) swt;
+ if (!table_hash_iterator(ih2->th2->subtable[0], &ih2->ih)) {
+ free(ih2);
+ return 0;
+ }
+
+ if (ih2->ih.flow != NULL) {
+ swt_iter->flow = ih2->ih.flow;
+ ih2->table_i = 0;
+ } else {
+ table_hash_iterator_destroy(&ih2->ih);
+ ih2->table_i = 1;
+ if (!table_hash_iterator(ih2->th2->subtable[1], &ih2->ih)) {
+ free(ih2);
+ return 0;
+ }
+ swt_iter->flow = ih2->ih.flow;
+ }
+
+ return 1;
+}
+
+static void table_hash2_next(struct swt_iterator *swt_iter)
+{
+ struct swt_iterator_hash2 *ih2;
+
+ if (swt_iter->flow == NULL)
+ return;
+
+ ih2 = (struct swt_iterator_hash2 *) swt_iter->private;
+ table_hash_next(&ih2->ih);
+
+ if (ih2->ih.flow != NULL) {
+ swt_iter->flow = ih2->ih.flow;
+ } else {
+ if (ih2->table_i == 0) {
+ table_hash_iterator_destroy(&ih2->ih);
+ ih2->table_i = 1;
+ if (!table_hash_iterator(ih2->th2->subtable[1], &ih2->ih)) {
+ ih2->ih.private = NULL;
+ swt_iter->flow = NULL;
+ } else {
+ swt_iter->flow = ih2->ih.flow;
+ }
+ } else {
+ swt_iter->flow = NULL;
+ }
+ }
+}
+
+static void table_hash2_iterator_destroy(struct swt_iterator *swt_iter)
+{
+ struct swt_iterator_hash2 *ih2;
+
+ ih2 = (struct swt_iterator_hash2 *) swt_iter->private;
+ if (ih2->ih.private != NULL)
+ table_hash_iterator_destroy(&ih2->ih);
+ free(ih2);
+}
+
+static void table_hash2_stats(struct sw_table *swt,
+ struct sw_table_stats *stats)
+{
+ struct sw_table_hash2 *t2 = (struct sw_table_hash2 *) swt;
+ struct sw_table_stats substats[2];
+ int i;
+
+ for (i = 0; i < 2; i++)
+ table_hash_stats(t2->subtable[i], &substats[i]);
+ stats->name = "hash2";
+ stats->n_flows = substats[0].n_flows + substats[1].n_flows;
+ stats->max_flows = substats[0].max_flows + substats[1].max_flows;
+}
+
+struct sw_table *table_hash2_create(unsigned int poly0, unsigned int buckets0,
+ unsigned int poly1, unsigned int buckets1)
+
+{
+ struct sw_table_hash2 *t2;
+ struct sw_table *swt;
+
+ t2 = malloc(sizeof *t2);
+ if (t2 == NULL)
+ return NULL;
+
+ t2->subtable[0] = table_hash_create(poly0, buckets0);
+ if (t2->subtable[0] == NULL)
+ goto out_free_t2;
+
+ t2->subtable[1] = table_hash_create(poly1, buckets1);
+ if (t2->subtable[1] == NULL)
+ goto out_free_subtable0;
+
+ swt = &t2->swt;
+ swt->lookup = table_hash2_lookup;
+ swt->insert = table_hash2_insert;
+ swt->delete = table_hash2_delete;
+ swt->timeout = table_hash2_timeout;
+ swt->destroy = table_hash2_destroy;
+ swt->stats = table_hash2_stats;
+
+ swt->iterator = table_hash2_iterator;
+ swt->iterator_next = table_hash2_next;
+ swt->iterator_destroy = table_hash2_iterator_destroy;
+
+ return swt;
+
+out_free_subtable0:
+ table_hash_destroy(t2->subtable[0]);
+out_free_t2:
+ free(t2);
+ return NULL;
+}
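The double-hashing table above keeps one flow per bucket and falls back to a second subtable when the first bucket is taken. The sketch below shows that insert and lookup policy in isolation. It is an illustration under simplifying assumptions: keys are plain `uint32_t` values, a multiplier stands in for the CRC32 polynomial, and every `toy_` name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_BUCKETS 8u          /* Must be a power of two. */

/* Hypothetical single-slot-per-bucket table.  'mult' stands in for the
 * CRC32 polynomial that distinguishes the two real subtables. */
struct toy_table {
    uint32_t mult;
    int used[TOY_BUCKETS];
    uint32_t key[TOY_BUCKETS];
};

struct toy_table2 {
    struct toy_table sub[2];
};

static unsigned int
toy_bucket(const struct toy_table *t, uint32_t key)
{
    return (key * t->mult) & (TOY_BUCKETS - 1);
}

/* Returns 1 if 'key' was stored (or was already present), 0 if its bucket
 * is occupied by a different key. */
static int
toy_insert(struct toy_table *t, uint32_t key)
{
    unsigned int i = toy_bucket(t, key);

    if (t->used[i] && t->key[i] != key) {
        return 0;
    }
    t->used[i] = 1;
    t->key[i] = key;
    return 1;
}

static int
toy_lookup(const struct toy_table *t, uint32_t key)
{
    unsigned int i = toy_bucket(t, key);
    return t->used[i] && t->key[i] == key;
}

static void
toy2_init(struct toy_table2 *t2, uint32_t mult0, uint32_t mult1)
{
    memset(t2, 0, sizeof *t2);
    t2->sub[0].mult = mult0;
    t2->sub[1].mult = mult1;
}

/* Tries the first subtable, then falls back to the second. */
static int
toy2_insert(struct toy_table2 *t2, uint32_t key)
{
    return toy_insert(&t2->sub[0], key) || toy_insert(&t2->sub[1], key);
}

/* A key may live in either subtable, so both are probed. */
static int
toy2_lookup(const struct toy_table2 *t2, uint32_t key)
{
    return toy_lookup(&t2->sub[0], key) || toy_lookup(&t2->sub[1], key);
}
```

With multipliers 1 and 3, keys 1 and 9 collide in the first subtable, so 9 lands in the second; key 17 collides in both and is rejected, which corresponds to an insert returning 0 in the real table.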
--- /dev/null
+/* Copyright (C) 2008 Board of Trustees, Leland Stanford Jr. University.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "table.h"
+#include <stdlib.h>
+#include "flow.h"
+#include "list.h"
+#include "switch-flow.h"
+#include "datapath.h"
+
+struct sw_table_linear {
+ struct sw_table swt;
+
+ unsigned int max_flows;
+ unsigned int n_flows;
+ struct list flows;
+};
+
+static struct sw_flow *table_linear_lookup(struct sw_table *swt,
+ const struct sw_flow_key *key)
+{
+ struct sw_table_linear *tl = (struct sw_table_linear *) swt;
+ struct sw_flow *flow;
+ LIST_FOR_EACH (flow, struct sw_flow, node, &tl->flows) {
+ if (flow_matches(&flow->key, key))
+ return flow;
+ }
+ return NULL;
+}
+
+static int table_linear_insert(struct sw_table *swt, struct sw_flow *flow)
+{
+ struct sw_table_linear *tl = (struct sw_table_linear *) swt;
+ struct sw_flow *f;
+
+ /* Replace flows that match exactly. */
+ LIST_FOR_EACH (f, struct sw_flow, node, &tl->flows) {
+ if (f->key.wildcards == flow->key.wildcards
+ && flow_matches(&f->key, &flow->key)) {
+ list_replace(&flow->node, &f->node);
+ flow_free(f);
+ return 1;
+ }
+ }
+
+ /* Table overflow? */
+ if (tl->n_flows >= tl->max_flows) {
+ return 0;
+ }
+ tl->n_flows++;
+
+ /* FIXME: need to order rules from most to least specific. */
+ list_push_back(&tl->flows, &flow->node);
+ return 1;
+}
+
+static void
+do_delete(struct sw_flow *flow)
+{
+ list_remove(&flow->node);
+ flow_free(flow);
+}
+
+static int table_linear_delete(struct sw_table *swt,
+ const struct sw_flow_key *key, int strict)
+{
+ struct sw_table_linear *tl = (struct sw_table_linear *) swt;
+ struct sw_flow *flow, *n;
+ unsigned int count = 0;
+
+ LIST_FOR_EACH_SAFE (flow, n, struct sw_flow, node, &tl->flows) {
+ if (flow_del_matches(&flow->key, key, strict)) {
+ do_delete(flow);
+ count++;
+ }
+ }
+ tl->n_flows -= count;
+ return count;
+}
+
+static int table_linear_timeout(struct datapath *dp, struct sw_table *swt)
+{
+ struct sw_table_linear *tl = (struct sw_table_linear *) swt;
+ struct sw_flow *flow, *n;
+ int count = 0;
+
+ LIST_FOR_EACH_SAFE (flow, n, struct sw_flow, node, &tl->flows) {
+ if (flow_timeout(flow)) {
+ dp_send_flow_expired(dp, flow);
+ do_delete(flow);
+ count++;
+ }
+ }
+ tl->n_flows -= count;
+ return count;
+}
+
+static void table_linear_destroy(struct sw_table *swt)
+{
+ struct sw_table_linear *tl = (struct sw_table_linear *) swt;
+
+ while (!list_is_empty(&tl->flows)) {
+ struct sw_flow *flow = CONTAINER_OF(list_front(&tl->flows),
+ struct sw_flow, node);
+ list_remove(&flow->node);
+ flow_free(flow);
+ }
+ free(tl);
+}
+
+/* The linear table's iterator private data is just a pointer to the table. */
+
+static int table_linear_iterator(struct sw_table *swt,
+ struct swt_iterator *swt_iter)
+{
+ struct sw_table_linear *tl = (struct sw_table_linear *) swt;
+
+ swt_iter->private = tl;
+
+ if (!tl->n_flows)
+ swt_iter->flow = NULL;
+ else
+ swt_iter->flow = CONTAINER_OF(list_front(&tl->flows), struct sw_flow, node);
+
+ return 1;
+}
+
+static void table_linear_next(struct swt_iterator *swt_iter)
+{
+ struct sw_table_linear *tl;
+ struct list *next;
+
+ if (swt_iter->flow == NULL)
+ return;
+
+ tl = (struct sw_table_linear *) swt_iter->private;
+
+ next = swt_iter->flow->node.next;
+ if (next == &tl->flows)
+ swt_iter->flow = NULL;
+ else
+ swt_iter->flow = CONTAINER_OF(next, struct sw_flow, node);
+}
+
+static void table_linear_iterator_destroy(struct swt_iterator *swt_iter)
+{}
+
+static void table_linear_stats(struct sw_table *swt,
+ struct sw_table_stats *stats)
+{
+ struct sw_table_linear *tl = (struct sw_table_linear *) swt;
+ stats->name = "linear";
+ stats->n_flows = tl->n_flows;
+ stats->max_flows = tl->max_flows;
+}
+
+
+struct sw_table *table_linear_create(unsigned int max_flows)
+{
+ struct sw_table_linear *tl;
+ struct sw_table *swt;
+
+ tl = calloc(1, sizeof *tl);
+ if (tl == NULL)
+ return NULL;
+
+ swt = &tl->swt;
+ swt->lookup = table_linear_lookup;
+ swt->insert = table_linear_insert;
+ swt->delete = table_linear_delete;
+ swt->timeout = table_linear_timeout;
+ swt->destroy = table_linear_destroy;
+ swt->stats = table_linear_stats;
+
+ swt->iterator = table_linear_iterator;
+ swt->iterator_next = table_linear_next;
+ swt->iterator_destroy = table_linear_iterator_destroy;
+
+ tl->max_flows = max_flows;
+ tl->n_flows = 0;
+ list_init(&tl->flows);
+
+ return swt;
+}
--- /dev/null
+/* Copyright (C) 2008 Board of Trustees, Leland Stanford Jr. University.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "table.h"
+#include <assert.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include "crc32.h"
+#include "switch-flow.h"
+#include "openflow.h"
+#include "datapath.h"
+
+struct sw_table_mac {
+ struct sw_table swt;
+ struct crc32 crc32;
+ unsigned int n_flows;
+ unsigned int max_flows;
+ unsigned int bucket_mask; /* Number of buckets minus 1. */
+ struct list *buckets;
+};
+
+static struct list *find_bucket(struct sw_table *swt,
+ const struct sw_flow_key *key)
+{
+ struct sw_table_mac *tm = (struct sw_table_mac *) swt;
+ unsigned int crc = crc32_calculate(&tm->crc32, key, sizeof *key);
+ return &tm->buckets[crc & tm->bucket_mask];
+}
+
+static struct sw_flow *table_mac_lookup(struct sw_table *swt,
+ const struct sw_flow_key *key)
+{
+ struct list *bucket = find_bucket(swt, key);
+ struct sw_flow *flow;
+ LIST_FOR_EACH (flow, struct sw_flow, node, bucket) {
+ if (!memcmp(key->flow.dl_src, flow->key.flow.dl_src, 6)) {
+ return flow;
+ }
+ }
+ return NULL;
+}
+
+static int table_mac_insert(struct sw_table *swt, struct sw_flow *flow)
+{
+ struct sw_table_mac *tm = (struct sw_table_mac *) swt;
+ struct list *bucket;
+ struct sw_flow *f;
+
+ /* MAC table only handles flows that match on Ethernet
+ source address and wildcard everything else. */
+ if (flow->key.wildcards != (OFPFW_ALL & ~OFPFW_DL_SRC))
+ return 0;
+ bucket = find_bucket(swt, &flow->key);
+
+ LIST_FOR_EACH (f, struct sw_flow, node, bucket) {
+ if (!memcmp(f->key.flow.dl_src, flow->key.flow.dl_src, 6)) {
+ list_replace(&flow->node, &f->node);
+ flow_free(f);
+ return 1;
+ }
+ }
+
+ /* Table overflow? */
+ if (tm->n_flows >= tm->max_flows) {
+ return 0;
+ }
+ tm->n_flows++;
+
+ list_push_front(bucket, &flow->node);
+ return 1;
+}
+
+static void
+do_delete(struct sw_flow *flow)
+{
+ list_remove(&flow->node);
+ flow_free(flow);
+}
+
+/* Returns number of deleted flows. */
+static int table_mac_delete(struct sw_table *swt,
+                            const struct sw_flow_key *key, int strict)
+{
+    struct sw_table_mac *tm = (struct sw_table_mac *) swt;
+
+    if (key->wildcards == (OFPFW_ALL & ~OFPFW_DL_SRC)) {
+        struct sw_flow *flow = table_mac_lookup(swt, key);
+        if (flow) {
+            do_delete(flow);
+            tm->n_flows--;
+            return 1;
+        }
+        return 0;
+    } else {
+        unsigned int i;
+        int count = 0;
+        for (i = 0; i <= tm->bucket_mask; i++) {
+            struct list *bucket = &tm->buckets[i];
+            struct sw_flow *flow, *n;
+
+            /* do_delete() frees 'flow', so use the removal-safe iterator. */
+            LIST_FOR_EACH_SAFE (flow, n, struct sw_flow, node, bucket) {
+                if (flow_del_matches(&flow->key, key, strict)) {
+                    do_delete(flow);
+                    count++;
+                }
+            }
+        }
+        tm->n_flows -= count;
+        return count;
+    }
+}
+
+static int table_mac_timeout(struct datapath *dp, struct sw_table *swt)
+{
+    struct sw_table_mac *tm = (struct sw_table_mac *) swt;
+    unsigned int i;
+    int count = 0;
+
+    for (i = 0; i <= tm->bucket_mask; i++) {
+        struct list *bucket = &tm->buckets[i];
+        struct sw_flow *flow, *n;
+
+        /* do_delete() frees 'flow', so use the removal-safe iterator. */
+        LIST_FOR_EACH_SAFE (flow, n, struct sw_flow, node, bucket) {
+            if (flow_timeout(flow)) {
+                dp_send_flow_expired(dp, flow);
+                do_delete(flow);
+                count++;
+            }
+        }
+    }
+    tm->n_flows -= count;
+    return count;
+}
+
+static void table_mac_destroy(struct sw_table *swt)
+{
+ struct sw_table_mac *tm = (struct sw_table_mac *) swt;
+ unsigned int i;
+ for (i = 0; i <= tm->bucket_mask; i++) {
+ struct list *list = &tm->buckets[i];
+ while (!list_is_empty(list)) {
+ struct sw_flow *flow = CONTAINER_OF(list_front(list),
+ struct sw_flow, node);
+ list_remove(&flow->node);
+ flow_free(flow);
+ }
+ }
+ free(tm->buckets);
+ free(tm);
+}
+
+struct swt_iterator_mac {
+ struct sw_table_mac *tm;
+ unsigned int bucket_i;
+};
+
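+/* Returns the first flow in the first nonempty bucket at or after
+ * 'im->bucket_i', or a null pointer if all remaining buckets are empty. */
+static struct sw_flow *next_head_flow(struct swt_iterator_mac *im)
+{
+    for (; im->bucket_i <= im->tm->bucket_mask; im->bucket_i++) {
+        struct list *bucket = &im->tm->buckets[im->bucket_i];
+        if (!list_is_empty(bucket)) {
+            /* 'bucket' is the list head, not a flow: take its first node. */
+            return CONTAINER_OF(list_front(bucket), struct sw_flow, node);
+        }
+    }
+    return NULL;
+}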
+
+static int table_mac_iterator(struct sw_table *swt,
+ struct swt_iterator *swt_iter)
+{
+ struct swt_iterator_mac *im;
+
+ swt_iter->private = im = malloc(sizeof *im);
+ if (im == NULL)
+ return 0;
+
+ im->tm = (struct sw_table_mac *) swt;
+
+ if (!im->tm->n_flows)
+ swt_iter->flow = NULL;
+ else {
+ im->bucket_i = 0;
+ swt_iter->flow = next_head_flow(im);
+ }
+
+ return 1;
+}
+
+static void table_mac_next(struct swt_iterator *swt_iter)
+{
+    struct swt_iterator_mac *im;
+    struct list *next;
+
+    if (swt_iter->flow == NULL)
+        return;
+
+    im = (struct swt_iterator_mac *) swt_iter->private;
+
+    /* Bucket lists are circular, so the end of a bucket is reached when
+     * 'next' is the bucket head, not a null pointer. */
+    next = swt_iter->flow->node.next;
+    if (next != &im->tm->buckets[im->bucket_i]) {
+        swt_iter->flow = CONTAINER_OF(next, struct sw_flow, node);
+    } else {
+        im->bucket_i++;
+        swt_iter->flow = next_head_flow(im);
+    }
+}
+
+static void table_mac_iterator_destroy(struct swt_iterator *swt_iter)
+{
+ free(swt_iter->private);
+}
+
+static void table_mac_stats(struct sw_table *swt, struct sw_table_stats *stats)
+{
+ struct sw_table_mac *tm = (struct sw_table_mac *) swt;
+ stats->name = "mac";
+ stats->n_flows = tm->n_flows;
+ stats->max_flows = tm->max_flows;
+}
+
+struct sw_table *table_mac_create(unsigned int n_buckets,
+ unsigned int max_flows)
+{
+ struct sw_table_mac *tm;
+ struct sw_table *swt;
+ unsigned int i;
+
+ tm = calloc(1, sizeof *tm);
+ if (tm == NULL)
+ return NULL;
+
+ assert(!(n_buckets & (n_buckets - 1)));
+
+ tm->buckets = malloc(n_buckets * sizeof *tm->buckets);
+ if (tm->buckets == NULL) {
+ printf("failed to allocate %u buckets\n", n_buckets);
+ free(tm);
+ return NULL;
+ }
+ for (i = 0; i < n_buckets; i++) {
+ list_init(&tm->buckets[i]);
+ }
+ tm->bucket_mask = n_buckets - 1;
+
+ swt = &tm->swt;
+ swt->lookup = table_mac_lookup;
+ swt->insert = table_mac_insert;
+ swt->delete = table_mac_delete;
+ swt->timeout = table_mac_timeout;
+ swt->destroy = table_mac_destroy;
+ swt->stats = table_mac_stats;
+
+ swt->iterator = table_mac_iterator;
+ swt->iterator_next = table_mac_next;
+ swt->iterator_destroy = table_mac_iterator_destroy;
+
+ crc32_init(&tm->crc32, 0x04C11DB7); /* Ethernet CRC. */
+ tm->n_flows = 0;
+ tm->max_flows = max_flows;
+
+ return swt;
+}
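The MAC table's delete and timeout paths unlink and free flows while walking a bucket list. The sketch below shows removal-safe traversal of a circular list with a sentinel head, which is the pattern such loops need. It is an illustration only: the `toy_` list is a hypothetical stand-in for the project's list.h, whose exact macros are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical circular doubly linked list with a sentinel head. */
struct toy_node {
    struct toy_node *prev, *next;
};

struct toy_entry {
    int expired;
    struct toy_node node;
};

#define TOY_ENTRY_OF(NODE) \
    ((struct toy_entry *) ((char *) (NODE) - offsetof(struct toy_entry, node)))

static void
toy_list_init(struct toy_node *head)
{
    head->prev = head->next = head;
}

static void
toy_list_push_back(struct toy_node *head, struct toy_node *n)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static void
toy_list_remove(struct toy_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

static struct toy_entry *
toy_entry_add(struct toy_node *head, int expired)
{
    struct toy_entry *e = malloc(sizeof *e);
    if (e != NULL) {
        e->expired = expired;
        toy_list_push_back(head, &e->node);
    }
    return e;
}

/* Frees every expired entry and returns how many were freed.  'next' is
 * saved before the current entry can be freed, so the loop never reads
 * freed memory. */
static int
toy_expire(struct toy_node *head)
{
    struct toy_node *cur, *next;
    int count = 0;

    for (cur = head->next; cur != head; cur = next) {
        struct toy_entry *e = TOY_ENTRY_OF(cur);
        next = cur->next;
        if (e->expired) {
            toy_list_remove(cur);
            free(e);
            count++;
        }
    }
    return count;
}

static size_t
toy_list_size(const struct toy_node *head)
{
    const struct toy_node *n;
    size_t size = 0;

    for (n = head->next; n != head; n = n->next) {
        size++;
    }
    return size;
}
```

Note that the traversal ends when it returns to the head node; in a circular list a node's `next` pointer is never null.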
--- /dev/null
+/* Copyright (C) 2008 Board of Trustees, Leland Stanford Jr. University.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+/* Individual switching tables. Generally grouped together in a chain (see
+ * chain.h). */
+
+#ifndef TABLE_H
+#define TABLE_H 1
+
+struct sw_flow;
+struct sw_flow_key;
+struct datapath;
+
+/* Iterator through the flows stored in a table. */
+struct swt_iterator {
+ struct sw_flow *flow; /* Current flow, for use by client. */
+ void *private;
+};
+
+/* Table statistics. */
+struct sw_table_stats {
+ const char *name; /* Human-readable name. */
+ unsigned long int n_flows; /* Number of active flows. */
+ unsigned long int max_flows; /* Flow capacity. */
+};
+
+/* A single table of flows. */
+struct sw_table {
+ /* Searches 'table' for a flow matching 'key', which must not have any
+ * wildcard fields. Returns the flow if successful, a null pointer
+ * otherwise. */
+ struct sw_flow *(*lookup)(struct sw_table *table,
+ const struct sw_flow_key *key);
+
+    /* Inserts 'flow' into 'table', replacing any duplicate flow.  Returns
+     * 1 if successful, 0 otherwise.  Failure can be due to an
+     * over-capacity table or because the flow is not one of the kind that
+     * the table accepts.
+ *
+ * If successful, 'flow' becomes owned by 'table', otherwise it is
+ * retained by the caller. */
+ int (*insert)(struct sw_table *table, struct sw_flow *flow);
+
+    /* Deletes from 'table' any and all flows that match 'key'.  If
+     * 'strict' is set, wildcards must match as well.  Returns the
+     * number of flows that were deleted. */
+ int (*delete)(struct sw_table *table, const struct sw_flow_key *key,
+ int strict);
+
+ /* Performs timeout processing on all the flow entries in 'table'.
+ * Returns the number of flow entries deleted through expiration. */
+ int (*timeout)(struct datapath *dp, struct sw_table *table);
+
+ /* Destroys 'table', which must not have any users. */
+ void (*destroy)(struct sw_table *table);
+
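+    /* Flow iteration.  'iterator' starts an iteration and returns nonzero
+     * if successful, 0 on failure; on success the iterator's 'flow' member
+     * is the first flow, or a null pointer if the table is empty.
+     * 'iterator_next' advances to the next flow and sets 'flow' to a null
+     * pointer past the last one.  'iterator_destroy' releases any
+     * resources held by the iterator. */
+    int (*iterator)(struct sw_table *, struct swt_iterator *);
+    void (*iterator_next)(struct swt_iterator *);
+    void (*iterator_destroy)(struct swt_iterator *);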
+
+ /* Dumps statistics for 'table' into 'stats'. */
+ void (*stats)(struct sw_table *table, struct sw_table_stats *stats);
+};
+
+struct sw_table *table_mac_create(unsigned int n_buckets,
+ unsigned int max_flows);
+struct sw_table *table_hash_create(unsigned int polynomial,
+ unsigned int n_buckets);
+struct sw_table *table_hash2_create(unsigned int poly0, unsigned int buckets0,
+ unsigned int poly1, unsigned int buckets1);
+struct sw_table *table_linear_create(unsigned int max_flows);
+
+#endif /* table.h */
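A client of the iterator callbacks declared in table.h would typically start an iteration, visit flows until the current flow is null, and then destroy the iterator. The sketch below shows that calling pattern against a hypothetical array-backed table. All `toy_` names are assumptions made for illustration, and the member is called `priv` rather than `private` only to keep the sketch neutral.

```c
#include <assert.h>
#include <stddef.h>

struct toy_flow {
    int id;
};

struct toy_iterator {
    struct toy_flow *flow;      /* Current flow, null when exhausted. */
    void *priv;                 /* Implementation-specific state. */
};

/* Hypothetical table exposing the same three iterator callbacks. */
struct toy_table {
    int (*iterator)(struct toy_table *, struct toy_iterator *);
    void (*iterator_next)(struct toy_iterator *);
    void (*iterator_destroy)(struct toy_iterator *);
    struct toy_flow *flows;
    size_t n_flows;
};

static int
toy_iterator(struct toy_table *t, struct toy_iterator *it)
{
    it->priv = t;
    it->flow = t->n_flows ? &t->flows[0] : NULL;
    return 1;
}

static void
toy_iterator_next(struct toy_iterator *it)
{
    struct toy_table *t = it->priv;

    if (it->flow == NULL) {
        return;
    }
    it->flow++;
    if (it->flow == t->flows + t->n_flows) {
        it->flow = NULL;
    }
}

static void
toy_iterator_destroy(struct toy_iterator *it)
{
    (void) it;                  /* Nothing was allocated. */
}

static void
toy_table_init(struct toy_table *t, struct toy_flow *flows, size_t n_flows)
{
    t->iterator = toy_iterator;
    t->iterator_next = toy_iterator_next;
    t->iterator_destroy = toy_iterator_destroy;
    t->flows = flows;
    t->n_flows = n_flows;
}

/* Client-side loop: returns the sum of all flow IDs, or -1 if the
 * iterator could not be started. */
static int
toy_sum_ids(struct toy_table *t)
{
    struct toy_iterator it;
    int sum = 0;

    if (!t->iterator(t, &it)) {
        return -1;
    }
    for (; it.flow != NULL; t->iterator_next(&it)) {
        sum += it.flow->id;
    }
    t->iterator_destroy(&it);
    return sum;
}
```

The client only touches the function pointers and the `flow` member, so the same loop would work unchanged for any table implementation that honors the contract.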