Ben Pfaff [Sat, 3 Jan 2009 00:39:59 +0000 (16:39 -0800)]
vswitchd: Don't make and pass socketpair to secchan if we won't use it.
If there's a remote controller then secchan will talk to it, not to
vswitchd, so there's no need to make a socketpair for communication in
that case.
Ben Pfaff [Sat, 3 Jan 2009 00:10:19 +0000 (16:10 -0800)]
Improve compatibility fallbacks for allocating multicast groups.
The linux-2.6 compatibility code for allocating multicast groups only
allocated a single multicast group per Generic Netlink family. However,
OpenFlow performance is going to be better if we allocate one per OpenFlow
datapath (which are all in the same Generic Netlink family). This commit
implements that.
Thanks to Justin for pointing out the issue.
Ben Pfaff [Sat, 3 Jan 2009 00:13:07 +0000 (16:13 -0800)]
Use separate Netlink multicast groups for different datapaths.
Otherwise every socket that listens for OpenFlow multicast messages will
get them for every single datapath. Not only is that a waste of memory
and time, it also allows one congested bridge to effectively DoS other
bridges that have little traffic.
Ben Pfaff [Sat, 3 Jan 2009 00:24:35 +0000 (16:24 -0800)]
Merge master branch into vswitchd.
Justin Pettit [Fri, 2 Jan 2009 23:40:54 +0000 (15:40 -0800)]
Make the openflow and brcompat modules use different netlink mc groups.
Ben Pfaff [Fri, 2 Jan 2009 21:58:08 +0000 (13:58 -0800)]
Discard OpenFlow messages not for us in dpif_recv_openflow().
Currently there is only a single Netlink multicast group that is shared
by all OpenFlow datapaths. This is undesirable and should be fixed, but
a solution in every case is impossible due to Linux kernel limitations.
This commit works around the problem by dropping OpenFlow messages that
are related to a datapath that we are not interested in.
This should fix vswitchd behavior when more than one bridge is configured.
Thanks to Keith and Justin for diagnosing the problem.
Justin Pettit [Fri, 2 Jan 2009 19:04:21 +0000 (11:04 -0800)]
Correct VERIFY_NUL_STRING back-port that required a "0" instead of a null.
The VERIFY_NUL_STRING back-port to older kernels was looking for an ASCII
zero instead of the null-string terminator.
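The difference between the two checks can be sketched as below. This is only an illustration: ends_with_nul() and ends_with_ascii_zero() are hypothetical helpers, not the actual VERIFY_NUL_STRING back-port.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not the real back-port): returns true if the
 * 'len'-byte buffer 'data' ends in the null-string terminator. */
bool
ends_with_nul(const char *data, size_t len)
{
    assert(data != NULL);
    /* Correct check: '\0' has the value 0. */
    return len > 0 && data[len - 1] == '\0';
}

/* The buggy variant, for comparison: '0' is the ASCII digit zero
 * (value 48), so this accepts "eth0" without its terminator and
 * rejects a properly terminated string. */
bool
ends_with_ascii_zero(const char *data, size_t len)
{
    assert(data != NULL);
    return len > 0 && data[len - 1] == '0';
}
```

With the 5-byte buffer "eth0" (terminator included), only ends_with_nul() returns true; with the first 4 bytes alone, only the buggy variant does.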
Ben Pfaff [Thu, 1 Jan 2009 22:42:10 +0000 (14:42 -0800)]
vswitchd: Add support for untagged VLAN ports in VLAN 0.
Ben Pfaff [Thu, 1 Jan 2009 22:33:32 +0000 (14:33 -0800)]
vswitchd: Change port mirroring to match semantics expected by users.
Ben Pfaff [Thu, 1 Jan 2009 01:23:21 +0000 (17:23 -0800)]
vswitchd: Support limiting the number of VLANs carried by a trunk port.
Ben Pfaff [Thu, 1 Jan 2009 00:43:28 +0000 (16:43 -0800)]
vswitchd: Minor code simplification.
The 'vlan' argument is exactly what we want here.
Ben Pfaff [Thu, 1 Jan 2009 18:09:32 +0000 (10:09 -0800)]
brcompat: Remove unneeded RCU locking.
The only memory accessed here is in 'dev', which can't
disappear because we maintain a reference count on it.
net_devices aren't RCU-locked anyway.
Ben Pfaff [Thu, 1 Jan 2009 18:08:28 +0000 (10:08 -0800)]
brcompat: Use ENOMEM to indicate out-of-memory (not EINVAL).
Ben Pfaff [Thu, 1 Jan 2009 18:08:09 +0000 (10:08 -0800)]
brcompat: Mark variables "static".
Ben Pfaff [Thu, 1 Jan 2009 03:52:01 +0000 (19:52 -0800)]
Minor style fixes.
Thanks to Reid for pointing these out.
Ben Pfaff [Thu, 1 Jan 2009 00:13:33 +0000 (16:13 -0800)]
vswitchd: Prevent a single interface from being added to two different ports.
Ben Pfaff [Wed, 31 Dec 2008 23:59:39 +0000 (15:59 -0800)]
secchan: Fix cut-and-paste errors in port speed determination.
Ben Pfaff [Wed, 31 Dec 2008 23:38:56 +0000 (15:38 -0800)]
vswitchd: Drop debug output that was accidentally included.
Ben Pfaff [Wed, 31 Dec 2008 22:38:43 +0000 (14:38 -0800)]
vswitchd: Implement port mirroring.
Ben Pfaff [Wed, 31 Dec 2008 22:34:13 +0000 (14:34 -0800)]
New functions for parsing integers.
Ben Pfaff [Wed, 31 Dec 2008 22:33:47 +0000 (14:33 -0800)]
New function svec_equal().
Justin Pettit [Wed, 31 Dec 2008 18:55:13 +0000 (10:55 -0800)]
Return meaningful errors for brctl modification commands.
When datapaths and interfaces are modified, we now do more thorough
checks to see whether they will succeed or not. When datapaths are
added, we now block until they are created, so that follow-on ioctl
calls to attach interfaces will immediately work.
Ben Pfaff [Wed, 31 Dec 2008 18:35:42 +0000 (10:35 -0800)]
Make the datapath tolerate kernels that lack NLA_NUL_STRING.
NLA_NUL_STRING was introduced in 2.6.19.
Ben Pfaff [Wed, 31 Dec 2008 17:56:20 +0000 (09:56 -0800)]
vswitchd: Be careful to sort all the svecs that are passed to svec_contains().
Justin Pettit [Wed, 31 Dec 2008 06:31:38 +0000 (22:31 -0800)]
Add support for "brctl show".
This makes datapaths and their interfaces show up as you'd expect when
"brctl show" is run. To get this functionality, you must insmod the
brcompat kernel module.
Ben Pfaff [Wed, 31 Dec 2008 00:06:06 +0000 (16:06 -0800)]
vswitchd: Fix fd leaks by closing files that we read in read_file().
Fixes bug #697.
Thanks to Martin for reporting this bug.
Ben Pfaff [Wed, 31 Dec 2008 00:02:17 +0000 (16:02 -0800)]
vswitchd: Add support for remote controller.
Ben Pfaff [Tue, 30 Dec 2008 23:26:33 +0000 (15:26 -0800)]
Fix bug in make_add_simple_flow() that busts secchan's in-band control.
Ben Pfaff [Tue, 30 Dec 2008 21:53:18 +0000 (13:53 -0800)]
vswitchd: Don't try to delete local port from datapath.
The local port (OFPP_LOCAL) is fixed in place and can't be deleted, so
don't try.
Ben Pfaff [Tue, 30 Dec 2008 21:40:10 +0000 (13:40 -0800)]
brcompat: Remove line-length limitations from brc_modify_config().
Ben Pfaff [Tue, 30 Dec 2008 21:25:35 +0000 (13:25 -0800)]
brcompat: Write temporary file to same directory as config file.
Otherwise, we will write it in the current working directory, which will
be / if we're running as a daemon (see daemonize()). We shouldn't assume
that we can write to that directory, and it might not be in the same
file system as the output file anyhow.
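A minimal sketch of the idea, assuming the temporary name is derived by appending a suffix to the config file's own path (the ".tmp" suffix and both function names are assumptions for illustration, not the names used in brcompat):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: stores in 'tmp' (of 'tmp_size' bytes) a temporary
 * file name in the same directory as 'config_file'.  Returns 0 on
 * success, -1 if the name does not fit. */
int
make_tmp_name(const char *config_file, char *tmp, size_t tmp_size)
{
    int n;

    assert(config_file != NULL && tmp != NULL);
    /* Appending to the full path keeps the directory (and therefore the
     * file system) the same as the config file's. */
    n = snprintf(tmp, tmp_size, "%s.tmp", config_file);
    return n >= 0 && (size_t) n < tmp_size ? 0 : -1;
}

/* Hypothetical helper: writes 'contents' to the temporary file, then
 * renames it over 'config_file'.  Returns 0 on success, -1 on error. */
int
replace_config(const char *config_file, const char *contents)
{
    char tmp[4096];
    FILE *f;

    if (make_tmp_name(config_file, tmp, sizeof tmp)) {
        return -1;
    }
    f = fopen(tmp, "w");
    if (!f) {
        return -1;
    }
    if (fputs(contents, f) == EOF) {
        fclose(f);
        remove(tmp);
        return -1;
    }
    if (fclose(f) == EOF || rename(tmp, config_file)) {
        remove(tmp);
        return -1;
    }
    return 0;
}
```

Because the temporary file sits next to the config file, rename() never has to cross a file system boundary, and the process's working directory no longer matters.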
Ben Pfaff [Tue, 30 Dec 2008 21:12:48 +0000 (13:12 -0800)]
New function nl_sock_wait(), to improve netlink socket abstraction.
Ben Pfaff [Tue, 30 Dec 2008 21:03:48 +0000 (13:03 -0800)]
Properly lock dp_mutex around changes to the datapath.
We weren't locking dp_mutex here, but it really is necessary. See the
comment on dp_mutex itself for details.
This actually restores some of the locking removed by commit 47b8652d,
"Simplify use of dp_mutex." That commit is correct that we can take
dp_mutex at a high level in dp_genl_openflow(), but it removes locking
from functions that are not called through dp_genl_openflow(): in
particular any Netlink command other than DP_GENL_C_OPENFLOW does not
go through that function, so those commands need to acquire the mutex
themselves.
Ben Pfaff [Tue, 30 Dec 2008 20:50:02 +0000 (12:50 -0800)]
brcompat: Fix usage message.
--brcompat isn't required and doesn't substitute for --config, so put it
in a different section.
Ben Pfaff [Tue, 30 Dec 2008 20:49:23 +0000 (12:49 -0800)]
brcompat: Add note about required kernel module to vswitchd manpage.
Ben Pfaff [Tue, 30 Dec 2008 20:40:11 +0000 (12:40 -0800)]
Simplify lookup_dp() now that we can assume that dp_name is null-terminated.
Ben Pfaff [Tue, 30 Dec 2008 20:34:39 +0000 (12:34 -0800)]
Force DP_GENL_A_DP_NAME and DP_GENL_A_PORTNAME to be null-terminated.
The kernel doesn't check for a null terminator on strings in Netlink
attributes unless you force it to do so.
Ben Pfaff [Tue, 30 Dec 2008 20:33:11 +0000 (12:33 -0800)]
Factor datapath common code into new function lookup_dp().
Ben Pfaff [Tue, 30 Dec 2008 19:40:22 +0000 (11:40 -0800)]
Fix GNU make warning about overriding commands for a target.
We would add a target to link a C file into datapath directories each
time that file was mentioned in a list of sources, so when we put a source
file into two different lists of sources it got two such targets. Fixed
by using the GNU make $(sort) function to eliminate duplicates.
Ben Pfaff [Tue, 30 Dec 2008 19:32:23 +0000 (11:32 -0800)]
Update dpif comments and prototypes.
Some of the comments weren't up-to-date, and the prototypes are easier
to understand if parameter names are included.
Ben Pfaff [Tue, 30 Dec 2008 19:31:25 +0000 (11:31 -0800)]
Restore openflow-netlink.h ABI.
Inserting DP_GENL_A_DP_NAME above DP_GENL_A_PORTNAME in commit
660f6596ba31, "First cut at bridge compatibility for vswitchd" forces users
not to use any existing builds of userspace utilities, because the
numbering of all the netlink attributes for OpenFlow has changed. This
change restores the numbering and should make older dpctl, etc. still able
to work.
Ben Pfaff [Tue, 30 Dec 2008 19:23:36 +0000 (11:23 -0800)]
Use rcu_dereference() before we dereference an RCU-protected pointer.
The access to dps[i]->netdev->name is a dereference that should be
protected by rcu_dereference().
Ben Pfaff [Tue, 30 Dec 2008 18:48:59 +0000 (10:48 -0800)]
Fix off-by-one error in looking up datapaths by index.
Justin Pettit [Tue, 30 Dec 2008 19:23:34 +0000 (11:23 -0800)]
Fix missing symbol in brcompat kernel module on older kernels.
In kernels older than 2.6.23, the genl_register_mc_group function is not
defined, so we fake it. The original checkin didn't build the C file
that contains the function's definition.
Justin Pettit [Tue, 30 Dec 2008 18:55:33 +0000 (10:55 -0800)]
Increase max datapaths to 256 (Bug #561).
Increase the maximum number of datapaths from 32 to 256. This ought to
be enough for anyone.
Ben Pfaff [Tue, 30 Dec 2008 18:26:03 +0000 (10:26 -0800)]
brcompat: Build brcompat module only under Linux 2.6.
brcompat fails to compile under Linux 2.4 due to the lack of brioctl_set()
and other symbols, but there's no intention of supporting Linux 2.4 for it
anyhow, so don't build it under Linux 2.4.
Ben Pfaff [Tue, 30 Dec 2008 18:25:50 +0000 (10:25 -0800)]
brcompat: Remove policy from Netlink code.
Policies are only useful for data that is received by a Netlink socket.
They do not apply to data that is sent out. Since this code does not
parse the messages that it receives at all, it does not need any policy.
Ben Pfaff [Tue, 30 Dec 2008 18:25:38 +0000 (10:25 -0800)]
brcompat: Fix typo in user message.
Ben Pfaff [Tue, 30 Dec 2008 18:25:29 +0000 (10:25 -0800)]
brcompat: Indentation fixups.
In a few places four spaces were used in place of one tab. Elsewhere,
function arguments weren't lined up well.
Ben Pfaff [Tue, 30 Dec 2008 18:01:06 +0000 (10:01 -0800)]
Make datapath compile with Xen kernel.
The Xen kernel is based on 2.6.18 but backports many features from later
kernels. It is not always possible, therefore, to detect whether we need
to use compatibility code based on LINUX_VERSION_CODE. This commit fixes
the problem by using configure-time tests to check for the need for the
compatibility code.
Build-tested on Linux 2.6.15 through 2.6.28 with the default configuration
(except that some kernels needed preemption turned off) and with Xen
kernel 2.6.18-92.1.10.el5.xs5.0.0.394.644.
Fixes bug #548.
Justin Pettit [Tue, 30 Dec 2008 06:53:08 +0000 (22:53 -0800)]
First cut at bridge compatibility for vswitchd.
This set of changes allows the bridge ioctls to be used for adding and
removing datapaths and interfaces. To enable, one must insmod the
new "brcompat_mod.ko" kernel module. Then, vswitchd is run with the
"--brcompat" flag. See the man page for vswitchd for more details.
Ben Pfaff [Tue, 30 Dec 2008 00:17:09 +0000 (16:17 -0800)]
vswitchd: Fix SIGHUP behavior for bonded ports.
Ben Pfaff [Tue, 30 Dec 2008 00:01:46 +0000 (16:01 -0800)]
vswitchd: Properly renumber port_ifidx values on iface destruction.
Ben Pfaff [Tue, 30 Dec 2008 00:01:16 +0000 (16:01 -0800)]
vswitchd: Revalidate all flows upon bridge configuration change.
Otherwise, now-invalid flows can linger, causing trouble.
Ben Pfaff [Mon, 29 Dec 2008 23:55:54 +0000 (15:55 -0800)]
New function mac_learning_flush().
Ben Pfaff [Mon, 29 Dec 2008 22:29:26 +0000 (14:29 -0800)]
vswitchd: Fix svec_diff().
The logic bugs here were causing bridge.c to do too much work adding and
deleting interfaces unnecessarily and perhaps in some circumstances getting
the set of interfaces wrong entirely.
Ben Pfaff [Mon, 29 Dec 2008 23:59:48 +0000 (15:59 -0800)]
vswitchd: Comment out annoying bonding-related logging, for now.
Ben Pfaff [Mon, 29 Dec 2008 21:30:48 +0000 (13:30 -0800)]
vswitchd: Fix svec memory leaks.
Pointed out by Justin.
Ben Pfaff [Mon, 29 Dec 2008 21:26:19 +0000 (13:26 -0800)]
Make ds_cstr() always null-terminate the string.
Most of the time the string in "struct ds" is
null-terminated, but there seem to be a few corner cases
where it is not. Make ds_cstr() always put in the null
terminator, for safety.
Thanks to Justin for pointing out the problem.
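The approach can be sketched as follows. The layout of "struct ds" here is an assumption made for illustration (a heap buffer with one spare byte reserved for the terminator); the real structure may differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout, for illustration only. */
struct ds {
    char *string;       /* Heap buffer of 'allocated' + 1 bytes, or NULL. */
    size_t length;      /* Bytes in use, not counting the terminator. */
    size_t allocated;   /* Usable bytes, not counting the terminator. */
};

/* Returns the contents of 'ds' as a C string, writing the terminator on
 * every call instead of trusting earlier operations to have done so. */
char *
ds_cstr(struct ds *ds)
{
    assert(ds != NULL);
    if (!ds->string) {
        /* Empty string that was never allocated: make room for "". */
        ds->string = malloc(1);
        if (!ds->string) {
            abort();
        }
        ds->length = 0;
        ds->allocated = 0;
    }
    ds->string[ds->length] = '\0';
    return ds->string;
}
```

Writing one byte per call is cheap, and it covers both the never-allocated case and any code path that changed 'length' without touching the terminator.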
Ben Pfaff [Mon, 29 Dec 2008 21:07:09 +0000 (13:07 -0800)]
vswitchd: Implement bonding link failure detection & failover.
Ben Pfaff [Mon, 29 Dec 2008 21:06:56 +0000 (13:06 -0800)]
New functions port_array_destroy(), port_array_clear().
Ben Pfaff [Sun, 28 Dec 2008 06:45:25 +0000 (22:45 -0800)]
Document vswitchd.
Ben Pfaff [Sat, 27 Dec 2008 23:36:49 +0000 (15:36 -0800)]
Factor out common parts of manpages.
There was a lot of duplication in the sources for the
manpages, because many of the programs have common options.
This factors out some of the duplication into include
files, using the man ".so" directive. It also uses the
".ds" directive to define strings that should be
customized for each program's manpage.
Ben Pfaff [Sat, 27 Dec 2008 05:26:46 +0000 (21:26 -0800)]
vswitchd: Actually tag flows that go out bonded devices.
The change that introduced rebalancing for bonded devices
set up the infrastructure for revalidating flows that go
out bonded devices, but neglected to actually tag those
flows. This fixes the problem.
Ben Pfaff [Sat, 27 Dec 2008 00:47:35 +0000 (16:47 -0800)]
vswitchd: Basic bonding rebalancing works.
So far only tested with hping3. At the least, we need to make sure that
existing flows get redirected through the new interface as well.
Ben Pfaff [Sat, 27 Dec 2008 00:48:27 +0000 (16:48 -0800)]
vswitchd: Work on flow statistics gathering.
Ben Pfaff [Wed, 24 Dec 2008 23:09:41 +0000 (15:09 -0800)]
vswitchd: Implement stats request manager.
Ben Pfaff [Wed, 24 Dec 2008 23:11:17 +0000 (15:11 -0800)]
Make tag_set_add() avoid adding tags that are already present.
Ben Pfaff [Wed, 24 Dec 2008 23:10:48 +0000 (15:10 -0800)]
New functions for iterating through flow stats replies.
Ben Pfaff [Wed, 24 Dec 2008 23:09:57 +0000 (15:09 -0800)]
New function ofpbuf_clone_data().
Ben Pfaff [Fri, 26 Dec 2008 19:06:09 +0000 (11:06 -0800)]
vswitchd: Fix treatment of unbuffered packets.
Before, buggy code caused unbuffered packets to be dropped. This fixes
the problem.
Ben Pfaff [Fri, 26 Dec 2008 19:04:15 +0000 (11:04 -0800)]
Drop message about short Ethernet frames entirely.
It's just not useful.
Ben Pfaff [Fri, 26 Dec 2008 18:28:17 +0000 (10:28 -0800)]
Fix learning-switch STP breakage from "out_port" in flow stats request.
ofp_flow_stats_request recently added a new member, "out_port", to select
only flows that output to a particular port. Unfortunately this code
in learning-switch.c was not updated to set that member to OFPP_NONE,
with the result that it would only get flows that output to port 0.
This bug was found while looking at this code for another reason, so this
fix is no guarantee that the STP code in learning-switch actually works.
Ben Pfaff [Wed, 24 Dec 2008 19:01:37 +0000 (11:01 -0800)]
vswitchd: Automatically restart secchan if it dies.
Ben Pfaff [Wed, 24 Dec 2008 01:06:17 +0000 (17:06 -0800)]
Implement revalidation.
Ben Pfaff [Wed, 24 Dec 2008 00:57:23 +0000 (16:57 -0800)]
Add support for tags to mac-learning library, and update client code.
Ben Pfaff [Wed, 24 Dec 2008 00:57:58 +0000 (16:57 -0800)]
Implement generic tag library.
Ben Pfaff [Wed, 24 Dec 2008 01:02:46 +0000 (17:02 -0800)]
New functions random_uint8(), random_uint16().
Also, reimplement random_uint32() to make fewer calls to rand().
Ben Pfaff [Wed, 24 Dec 2008 01:02:54 +0000 (17:02 -0800)]
New macro IS_POW2().
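One common way to write such a macro is sketched below. This definition is an assumption, not necessarily the one committed, and bucket_index() is a hypothetical example of where a power-of-2 test is useful.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed definition: true if X is a nonzero power of 2.  A power of 2
 * has exactly one bit set, so clearing the lowest set bit with
 * X & (X - 1) leaves zero.  Note that X is evaluated more than once. */
#define IS_POW2(X) ((X) != 0 && ((X) & ((X) - 1)) == 0)

/* Hypothetical use: when the number of buckets is a power of 2, a hash
 * can be reduced to a bucket index with a mask instead of a division. */
size_t
bucket_index(uint32_t hash, size_t n_buckets)
{
    assert(IS_POW2(n_buckets));
    return hash & (n_buckets - 1);
}
```

For example, IS_POW2(64) is true and IS_POW2(6) is false, and bucket_index(37, 8) keeps the low three bits of 37.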
Ben Pfaff [Tue, 23 Dec 2008 22:59:48 +0000 (14:59 -0800)]
Implement generic hash table.
Ben Pfaff [Tue, 23 Dec 2008 23:05:45 +0000 (15:05 -0800)]
New function flow_equal().
Ben Pfaff [Tue, 23 Dec 2008 23:04:54 +0000 (15:04 -0800)]
Inline flow_compare() and flow_hash(), for performance.
Ben Pfaff [Tue, 23 Dec 2008 23:03:37 +0000 (15:03 -0800)]
Make flow_hash() use hash_lookup3(), for speed and hash quality.
Ben Pfaff [Tue, 23 Dec 2008 23:01:25 +0000 (15:01 -0800)]
Add faster and better-quality hash function hash_lookup3().
Justin Pettit [Tue, 23 Dec 2008 08:30:38 +0000 (00:30 -0800)]
Fix setting "of" device name based on uninitialized dp_idx.
The name of the "of" device is of the form "of<dp_idx>". The device
driver assumes the "dp_idx" field has been set in the datapath struct
before it is called. This was not the case.
Ben Pfaff [Mon, 22 Dec 2008 06:19:17 +0000 (22:19 -0800)]
Remove misplaced comment.
Ben Pfaff [Sat, 20 Dec 2008 00:33:31 +0000 (16:33 -0800)]
vswitch: Implement basic bonding.
Rebalancing and link failure detection are missing, but the basics are
there (and work OK in simple testing).
Ben Pfaff [Fri, 19 Dec 2008 23:10:18 +0000 (15:10 -0800)]
vswitch: Pass --monitor to secchan processes, to allow monitoring them.
Justin Pettit [Sat, 20 Dec 2008 00:06:11 +0000 (16:06 -0800)]
Add #include <limits.h> to fix build problem with undefined "_POSIX_PIPE_BUF".
Justin Pettit [Fri, 19 Dec 2008 20:51:42 +0000 (12:51 -0800)]
Fix flag to indicate whether Flow End messages should be sent.
The secchan code set whether Flow End messages should be sent based on the
last configuration request. This means that if NetFlow messages need to be
generated, but the controller doesn't want Flow Expiration messages, the
Flow End meta-message was disabled.
Justin Pettit [Fri, 19 Dec 2008 20:49:09 +0000 (12:49 -0800)]
Fix null pointer dereference when a delete flow command is executed.
A set of missing parentheses was causing an attempt to send a Flow End
message even if no flow existed. The code to send the Flow End message
would try to access data in the flow and cause a kernel panic.
Ben Pfaff [Thu, 18 Dec 2008 22:26:29 +0000 (14:26 -0800)]
vswitchd: Fix stupid thinko.
Ben Pfaff [Thu, 18 Dec 2008 22:00:59 +0000 (14:00 -0800)]
vswitchd: Basic working VLAN support.
Ben Pfaff [Thu, 18 Dec 2008 22:01:35 +0000 (14:01 -0800)]
New functions put_openflow() and put_openflow_xid().
Ben Pfaff [Thu, 18 Dec 2008 22:00:23 +0000 (14:00 -0800)]
Add support for VLAN tags to the MAC learning library.
vswitchd needs to keep separate per-VLAN MAC learning tables, so this adds
a VLAN tag to each MAC learning table entry. The existing users of the
MAC learning table don't care about VLANs, so they always pass in a VLAN
of 0.
There is a very good chance that vswitchd will need additional features in
its MAC learning table that don't fit well into the existing library. In
that case this commit will probably be reverted and a separate MAC learning
implementation added in the vswitch directory.
Ben Pfaff [Thu, 18 Dec 2008 20:43:57 +0000 (12:43 -0800)]
cfg: Fix functions for retrieving keys.
They didn't work. At all.
Ben Pfaff [Thu, 18 Dec 2008 19:17:36 +0000 (11:17 -0800)]
secchan: Switch in-band control traffic by hand only on OpenFlow TCP ports.
To run services, other than the controller itself, on the same IP and MAC
as the controller, sophisticated controllers such as NOX need to have some
insight into the controller's location, etc. Before this commit, this
was not possible, because any traffic to or from the controller's MAC
address was switched "by hand" by secchan, without involving the controller
at all.
After this commit, only traffic to or from the controller's MAC *and on
the OpenFlow TCP or SSL port* is switched by hand, which should fix the
problem.
Ben Pfaff [Wed, 17 Dec 2008 22:39:06 +0000 (14:39 -0800)]
Don't use separate asynchronous event connection for user datapath.
Commit 14439fa80c, "Maintain separate async and sync connections to nl:0
in secchan," modified secchan to use two separate connections to the
datapath, one for asynchronous events, one for requests and replies. This
technique doesn't work for the user datapath, which always sends
asynchronous events on all its connections. Fortunately, it isn't
necessary for the user datapath, either, because the user datapath is
smart enough not to drop message replies.
Tested by Justin.
Justin Pettit [Wed, 17 Dec 2008 22:24:22 +0000 (14:24 -0800)]
Add support for exporting flow information in NetFlow v5 format.
This is implemented by having the datapath send a new meta-Flow End message
that contains all the information needed by NetFlow v5 and the OpenFlow
Flow Expiration messages. secchan grabs these Flow End messages and
generates any requested Flow End and NetFlow messages. The Flow End
message is implemented as a Nicira vendor extension, but it is only used
internally between the datapath and secchan, so the switch is still fully
compatible with OpenFlow v0.8.9.
NOTE: This change has not been ported to "switch", which means that it is
not able to generate NetFlow messages. "switch" is no longer maintained
and will be removed from the repository on January 1, 2009.
Ben Pfaff [Wed, 17 Dec 2008 01:19:09 +0000 (17:19 -0800)]
Initial, skeletal implementation of vswitchd.