Ben Pfaff [Wed, 21 Jan 2009 00:27:27 +0000 (16:27 -0800)]
process: New function process_escape_args().
Ben Pfaff [Wed, 21 Jan 2009 00:24:00 +0000 (16:24 -0800)]
Debian packaging: Remove IP addresses from netdevs within a switch.
Ben Pfaff [Tue, 20 Jan 2009 21:34:13 +0000 (13:34 -0800)]
New function netdev_enumerate().
Ben Pfaff [Tue, 20 Jan 2009 21:34:02 +0000 (13:34 -0800)]
New function svec_join().
Ben Pfaff [Tue, 20 Jan 2009 21:33:44 +0000 (13:33 -0800)]
Debian packaging: Add several new settings to /etc/default/openflow-switch.
Ben Pfaff [Wed, 21 Jan 2009 00:06:59 +0000 (16:06 -0800)]
process: Avoid stealing pclose()'s exit status.
When we use popen() and pclose(), pclose() wants to return the process's
exit status, but it can't if the SIGCHLD handler gets it first. So,
instead of asking for any child process exit status in sigchld_handler(),
only ask for the exit status of registered PIDs.
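The idea in this commit can be sketched roughly as follows. This is not the project's code: the registry, its size, and the names register_child() and reap_registered() are hypothetical, and a real handler would also need to block signals around registry updates. The point is only that waitpid() is called with specific PIDs rather than -1, so a child created by popen() is left for pclose() to collect.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical fixed-size registry of children that we started ourselves. */
#define MAX_REGISTERED 16

struct registered_child {
    pid_t pid;      /* Child process ID. */
    bool exited;    /* Has its exit status been collected? */
    int status;     /* Status from waitpid(), valid if 'exited'. */
};

static struct registered_child children[MAX_REGISTERED];
static int n_children;

/* Adds 'pid' to the registry.  Returns false if the registry is full. */
static bool
register_child(pid_t pid)
{
    if (n_children >= MAX_REGISTERED) {
        return false;
    }
    children[n_children].pid = pid;
    children[n_children].exited = false;
    children[n_children].status = 0;
    n_children++;
    return true;
}

/* Intended as the body of a SIGCHLD handler.  Polls only registered PIDs,
 * never waitpid(-1, ...), so unrelated children (such as one created by
 * popen()) keep their exit status for whoever waits on them. */
static void
reap_registered(void)
{
    int saved_errno = errno;
    int i;

    for (i = 0; i < n_children; i++) {
        struct registered_child *c = &children[i];
        if (!c->exited) {
            int status;
            if (waitpid(c->pid, &status, WNOHANG) == c->pid) {
                c->exited = true;
                c->status = status;
            }
        }
    }
    errno = saved_errno;
}
```

With this shape, an unregistered child can still be waited for by its owner after reap_registered() has run any number of times.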
Ben Pfaff [Wed, 21 Jan 2009 00:34:11 +0000 (16:34 -0800)]
daemon: Fix behavior in read_pidfile() when pid file is not locked.
Ben Pfaff [Wed, 21 Jan 2009 00:33:52 +0000 (16:33 -0800)]
daemon: Fix bogus error message in read_pidfile() when pidfile is empty.
Ben Pfaff [Wed, 21 Jan 2009 00:33:32 +0000 (16:33 -0800)]
daemon: Fix segfault in read_pidfile() when pidfile does not exist.
Ben Pfaff [Mon, 19 Jan 2009 23:54:22 +0000 (15:54 -0800)]
debian: Avoid aborting on switch startup when $COMMANDS is empty.
Ben Pfaff [Mon, 19 Jan 2009 19:23:43 +0000 (11:23 -0800)]
Fix typo in comment.
Keith Amidon [Tue, 13 Jan 2009 23:30:40 +0000 (15:30 -0800)]
Reopen log file in addition to reading conf file when vswitchd receives SIGHUP
This only reopens the vswitchd log file. The child secchan processes
for each bridge are not asked to do the same thing. Since secchan
generally logs very little data, those files are not being rotated
right now, so this is probably okay. At some point we should
probably correct it, however.
Ben Pfaff [Fri, 16 Jan 2009 18:37:13 +0000 (10:37 -0800)]
vswitchd: Reduce flow idle time when flow table grows large.
This change halves the number of steady-state flows in the flow table
for hping3 --faster --quiet, from over 5000 to less than 2500. It does
cause some oscillation in flow table size, because there is a harsh step
function in idle time when the flow table goes from 1000 to 1001 flows,
from 2001 to 2002 flows, and from 4003 to 4004 flows, but I doubt that is
a problem. (If it is, we can introduce some randomness.)
Ben Pfaff [Fri, 16 Jan 2009 18:23:51 +0000 (10:23 -0800)]
vswitchd: Don't reset idle timer when updating flows.
When a flow was revalidated, we would use an OFPFC_ADD flow_mod message to
change the flow's actions. However, this resets the idle-timer countdown.
In extreme circumstances, such as when VMs are being continuously migrated,
this meant that completely idle flows would never expire, because their
idle timers would keep getting reset more often than every 5 seconds, and
so the flow table would keep growing, never shrinking.
Now, when we revalidate an existing flow and update its actions, we use
an OFPFC_MODIFY_STRICT flow_mod message, which also updates actions but
does not reset the idle-timer countdown.
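A toy model of the behavior described above, for illustration only. Nothing here is taken from vswitchd or the OpenFlow headers: the toy_* names, the enum values, and the 5-second timeout constant are stand-ins. It models just the one property this commit relies on, namely that re-adding a flow restarts its idle countdown while a strict modify leaves it alone.

```c
#include <stdbool.h>

/* Stand-ins for the two flow_mod commands named in the commit message.
 * The numeric values are arbitrary, not the protocol's. */
enum toy_command { TOY_ADD, TOY_MODIFY_STRICT };

#define TOY_IDLE_TIMEOUT 5      /* Assumed idle timeout, in seconds. */

struct toy_flow {
    int actions;        /* Opaque stand-in for the flow's action list. */
    int idle_left;      /* Seconds left before the flow idles out. */
};

/* Applies a flow_mod to 'flow'.  Both commands replace the actions, but
 * only TOY_ADD restarts the idle countdown. */
static void
toy_flow_mod(struct toy_flow *flow, enum toy_command cmd, int new_actions)
{
    flow->actions = new_actions;
    if (cmd == TOY_ADD) {
        flow->idle_left = TOY_IDLE_TIMEOUT;
    }
}

/* Advances time by one second with no traffic on 'flow'.  Returns true
 * once the flow has been idle long enough to expire. */
static bool
toy_tick(struct toy_flow *flow)
{
    return --flow->idle_left <= 0;
}
```

If revalidation happens more often than the timeout and always uses TOY_ADD, toy_tick() never returns true, which is the never-shrinking flow table the commit describes.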
Ben Pfaff [Fri, 16 Jan 2009 18:11:04 +0000 (10:11 -0800)]
Revert "brcompat: Don't re-read configuration file from inside bridge code."
This reverts commit 2b34f542b1015b69c589bc4fa324d236cd35dd5f, because
not re-reading the config file from phy_port_changed() meant that, later,
the next time we modified the config file we would do it based on the
older version, not the version that we just wrote out.
Ben Pfaff [Fri, 16 Jan 2009 18:08:51 +0000 (10:08 -0800)]
vswitchd: Delete flows on a deleted interface when revalidating.
When an interface is deleted from a datapath by an entity other than
vswitchd (e.g. by a vif being deleted), we would revalidate all the
flows and change them to drop packets. But that's a waste of flow
table space. This commit changes the behavior in this case to delete
those flows entirely.
This commit is complicated by the need to deal gracefully with flows
on datapath interfaces that we don't know about, e.g. from the local port
if the local port is not part of the bridge or from interfaces added to
a datapath by an external mechanism (e.g. added with "dpctl addif"
manually). We don't want to delete those flows, even though they resemble
the ones that we do want to delete, because they potentially save us from
processing a lot of packet-in messages that we don't care about. So we
mark those flows with a new "need_drop" flag.
Ben Pfaff [Fri, 16 Jan 2009 17:45:43 +0000 (09:45 -0800)]
New functions make_flow_mod(), make_del_flow().
Ben Pfaff [Fri, 16 Jan 2009 00:33:51 +0000 (16:33 -0800)]
vswitchd: Avoid mishandling duplicate object names.
If a port was named twice in bridge.BRNAME.port, we would add two different
ports with the same name to bridge BRNAME. Fix the problem.
Also, be more vigilant about duplicate names for other kinds of objects,
even though it should be difficult or impossible to end up with them.
Ben Pfaff [Fri, 16 Jan 2009 00:08:31 +0000 (16:08 -0800)]
New functions svec_sort_unique(), svec_is_unique(), svec_get_duplicate().
Ben Pfaff [Fri, 16 Jan 2009 00:08:00 +0000 (16:08 -0800)]
brcompat: Don't re-read configuration file from inside bridge code.
brc_modify_config() re-reads the configuration files by calling cfg_read(),
but we don't want to do that when we're deep inside the bridge code, in
the call to brc_modify_config() from phy_port_changed(). So only call
cfg_read() from the callers of brc_modify_config() that are in brcompat.c.
Also, each cfg_read() call was followed by a call to bridge_reconfigure(),
which is what reconfigure() in vswitchd.c does, so just use that function
instead of open-coding the pair of calls.
This should not have caused a real problem, because no pointers into
configuration data are retained by bridge code, but it still seems like
the "correct" way to do things.
Ben Pfaff [Thu, 15 Jan 2009 23:34:13 +0000 (15:34 -0800)]
brcompat: Drop write-only variable.
Ben Pfaff [Thu, 15 Jan 2009 22:57:59 +0000 (14:57 -0800)]
vlog: Add INFO level and apply it to messages for "normal" behavior.
Fixes bug #246.
Ben Pfaff [Thu, 15 Jan 2009 22:31:55 +0000 (14:31 -0800)]
vconn: Ignore async messages before version negotiation completes.
The kernel can send a packet_in or other asynchronous message to
the secchan before the version negotiation step is finished, which
causes the secchan to drop the connection and try again. This commit
fixes the problem.
Fixes bug #368.
Ben Pfaff [Thu, 15 Jan 2009 21:55:52 +0000 (13:55 -0800)]
secchan: Document --rate-limit and --burst-limit options in manpage.
Fixes bug #674.
Ben Pfaff [Thu, 15 Jan 2009 21:44:04 +0000 (13:44 -0800)]
secchan: Divide options in manpage into labeled subsections.
Ben Pfaff [Thu, 15 Jan 2009 17:57:19 +0000 (09:57 -0800)]
debian: Move ofp-switch-setup and manpage into correct package.
These files were accidentally included in the openflow-switch package,
but they were supposed to be in openflow-switch-config.
Ben Pfaff [Wed, 14 Jan 2009 22:08:40 +0000 (14:08 -0800)]
dpctl: Fix "add-flow" and "add-flows" when actions are specified.
Thanks to Justin for noticing the problem.
Justin Pettit [Wed, 14 Jan 2009 22:53:10 +0000 (14:53 -0800)]
Merge branch 'master' of nicira.dyndns.org:/srv/git/openflow/
Justin Pettit [Wed, 14 Jan 2009 22:52:59 +0000 (14:52 -0800)]
Check wildcards for in_port != out_port output validation.
OpenFlow requires that traffic that is to be sent out the interface it
came in on use the OFPP_IN_PORT virtual port. The action validation
code that enforces this ignored the wildcards field, which meant it was
using the garbage 'in_port' value for this check.
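A minimal sketch of the kind of check this commit describes. The struct, the SK_FW_IN_PORT bit, and the function name are hypothetical stand-ins rather than the real OpenFlow definitions, and treating a wildcarded in_port as "nothing to compare against" is an assumption about the fix, not something the commit message spells out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical wildcard bit meaning "in_port is not part of the match". */
#define SK_FW_IN_PORT 0x1

/* Cut-down stand-in for a flow match. */
struct sk_match {
    uint32_t wildcards;     /* Bitmap of wildcarded fields. */
    uint16_t in_port;       /* Meaningful only if not wildcarded. */
};

/* Returns true if an output action to 'out_port' passes the
 * "in_port != out_port" validation for match 'm'. */
static bool
sk_output_ok(const struct sk_match *m, uint16_t out_port)
{
    if (m->wildcards & SK_FW_IN_PORT) {
        /* in_port holds garbage when wildcarded, so do not compare. */
        return true;
    }
    return out_port != m->in_port;
}
```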
Ben Pfaff [Wed, 14 Jan 2009 21:39:20 +0000 (13:39 -0800)]
Add missing #includes.
root [Wed, 14 Jan 2009 01:30:08 +0000 (17:30 -0800)]
Allow controller to set MAC address to use in ARP responses for SNAT IPs.
This allows the controller to set a MAC address to use in response to
an ARP request for the NAT IP address on a non-NAT interface. This is
useful if a NAT'd device needs to communicate with a non-NAT'd device,
when they are on the same interface on the OpenFlow switch. When the
non-NAT'd device requests the MAC address of the NAT IP address, the
switch responds with the supplied MAC address (often the L3 router
behind it). This allows communication in both directions to bounce off
the L3 router and not confuse the controller.
Ben Pfaff [Wed, 14 Jan 2009 01:03:01 +0000 (17:03 -0800)]
vswitchd: Fix more memory leaks.
Ben Pfaff [Wed, 14 Jan 2009 00:22:07 +0000 (16:22 -0800)]
Fix typo.
Ben Pfaff [Wed, 14 Jan 2009 00:21:55 +0000 (16:21 -0800)]
vswitchd: Fix typo in comment.
Ben Pfaff [Wed, 14 Jan 2009 00:21:37 +0000 (16:21 -0800)]
brcompat: Don't try to write the config file if it isn't configured.
Ben Pfaff [Wed, 14 Jan 2009 00:21:18 +0000 (16:21 -0800)]
leak-checker: Make output file unbuffered.
This way, we get an up-to-date record when the process is killed.
Ben Pfaff [Tue, 13 Jan 2009 23:29:58 +0000 (15:29 -0800)]
vswitchd: Fix memory leak.
Ben Pfaff [Tue, 13 Jan 2009 23:10:11 +0000 (15:10 -0800)]
Fix bugs in leak checker.
Oops.
Ben Pfaff [Tue, 13 Jan 2009 22:03:24 +0000 (14:03 -0800)]
vswitchd: Fix memory leaks.
Ben Pfaff [Tue, 13 Jan 2009 21:54:46 +0000 (13:54 -0800)]
Fix memory leak in nl_sock_transact().
Ben Pfaff [Tue, 13 Jan 2009 21:46:15 +0000 (13:46 -0800)]
Implement simple memory leak detector.
Initially, make it available only in secchan and vswitchd. But it's easy
to add it elsewhere too.
Ben Pfaff [Tue, 13 Jan 2009 21:15:55 +0000 (13:15 -0800)]
Add ability to open null fds to process_start().
Ben Pfaff [Tue, 13 Jan 2009 00:50:20 +0000 (16:50 -0800)]
Add libpcre3-dev to build-dependencies.
Ben Pfaff [Tue, 13 Jan 2009 00:49:55 +0000 (16:49 -0800)]
New function process_run().
Ben Pfaff [Tue, 13 Jan 2009 01:21:33 +0000 (17:21 -0800)]
Merge master and vswitchd branches
Ben Pfaff [Fri, 9 Jan 2009 21:28:12 +0000 (13:28 -0800)]
Make dpctl accept an arbitrary number of actions.
This cleanup has been wanted for a while.
As a side effect, this change deletes some dead code pointed out by Chris
Eagle via Fortify: many of the deleted comparisons against act_len were
never true because of the sizes of the objects involved.
Ben Pfaff [Fri, 9 Jan 2009 01:21:46 +0000 (17:21 -0800)]
learning-switch: Remove unused variable.
Found by Chris Eagle via Fortify.
Ben Pfaff [Fri, 9 Jan 2009 01:20:34 +0000 (17:20 -0800)]
fatal-signal: Fix bug in call_hooks() recursion detection.
Found by Chris Eagle via Fortify.
Ben Pfaff [Fri, 9 Jan 2009 01:16:41 +0000 (17:16 -0800)]
Use xstrdup() instead of xasprintf() for duplicating constant string.
Ben Pfaff [Fri, 9 Jan 2009 01:13:30 +0000 (17:13 -0800)]
dpctl: Fix use-after-free in "probe" command.
Found by Chris Eagle via Fortify.
Ben Pfaff [Fri, 9 Jan 2009 01:10:34 +0000 (17:10 -0800)]
netdev: Fix file descriptor leak.
This could be important since it leaks a file descriptor on every
netdev_open(), but only if an IPv6 address is configured on the network
device (which is rare and indicates an error condition for OpenFlow).
Found by Chris Eagle via Fortify.
Ben Pfaff [Fri, 9 Jan 2009 01:06:54 +0000 (17:06 -0800)]
datapath: Check DMI strings for NULL.
dmi_get_system_info() can return NULL, so check for it.
Found by Chris Eagle via Fortify.
Ben Pfaff [Fri, 9 Jan 2009 01:06:19 +0000 (17:06 -0800)]
datapath: Avoid pointer arithmetic on possibly-NULL pointer.
Pointer arithmetic on a null pointer yields undefined behavior, even
though it doesn't really matter in the real world (normally).
Found by Chris Eagle via Fortify.
Ben Pfaff [Fri, 9 Jan 2009 01:00:06 +0000 (17:00 -0800)]
daemon: report error if daemon child process fails to start properly
Found by Chris Eagle via Fortify.
Ben Pfaff [Fri, 9 Jan 2009 00:56:50 +0000 (16:56 -0800)]
dpctl: Exit unsuccessfully if a write to stdout or stderr failed.
A program should exit with an error if its output failed, so check for
this before termination.
Found by Chris Eagle via Fortify.
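One common way to implement this kind of check, sketched with a hypothetical helper name (exit_status_for_stream() is not from dpctl): flush the stream just before exiting and turn any pending or earlier write error into a failing exit status.

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns the exit status the program should use: 'status' if all output
 * on 'stream' succeeded, otherwise EXIT_FAILURE.  Flushing first matters
 * because buffered output may fail only when it is actually written. */
static int
exit_status_for_stream(FILE *stream, int status)
{
    if (fflush(stream) == EOF || ferror(stream)) {
        return EXIT_FAILURE;
    }
    return status;
}
```

A program would typically end with something like "return exit_status_for_stream(stdout, EXIT_SUCCESS);", so that a full disk or closed pipe on stdout is visible to the caller.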
Ben Pfaff [Fri, 9 Jan 2009 00:49:31 +0000 (16:49 -0800)]
Use strtok_r() instead of strtok().
Not a bug but a style issue, since this code doesn't call and isn't called
by other code that uses strtok().
Found by Chris Eagle via Fortify.
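For reference, the strtok_r() pattern looks roughly like this. The helper name split_tokens() is made up for the example. Because the parse position lives in the caller-supplied 'save_ptr' rather than in hidden static state, this loop stays correct even if code called from inside it tokenizes another string.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <string.h>

/* Splits 's' in place at any character in 'delim', storing pointers to at
 * most 'max' tokens in 'tokens'.  Returns the number of tokens stored. */
static size_t
split_tokens(char *s, const char *delim, char **tokens, size_t max)
{
    char *save_ptr = NULL;
    char *tok;
    size_t n = 0;

    for (tok = strtok_r(s, delim, &save_ptr); tok != NULL && n < max;
         tok = strtok_r(NULL, delim, &save_ptr)) {
        tokens[n++] = tok;
    }
    return n;
}
```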
Ben Pfaff [Fri, 9 Jan 2009 00:47:01 +0000 (16:47 -0800)]
dhcp-client: Don't report long time to expiration after lease expires.
There is a race between time advancing past the lease expiration time
and actually transitioning to the expired state. Fix this race.
Found by Chris Eagle via Fortify.
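The commit message does not say how the long time was computed. One plausible way such a symptom arises is an unsigned subtraction that wraps once the current time passes the expiration time. Under that assumption, a clamped calculation could look like this (secs_until_expiry() is a hypothetical name, not the dhcp-client function):

```c
#include <time.h>

/* Returns the number of seconds from 'now' until 'expires', or 0 if the
 * lease has already expired.  Comparing before subtracting avoids the
 * huge value that "expires - now" yields in unsigned arithmetic once
 * 'now' is later than 'expires'. */
static unsigned int
secs_until_expiry(time_t now, time_t expires)
{
    return now >= expires ? 0 : (unsigned int) (expires - now);
}
```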
Ben Pfaff [Fri, 9 Jan 2009 00:45:34 +0000 (16:45 -0800)]
dhcp-client: Add comment about time going backward.
Issue raised by Chris Eagle via Fortify.
Ben Pfaff [Fri, 9 Jan 2009 00:40:16 +0000 (16:40 -0800)]
datapath: Make 'length' local variable unsigned, for consistency.
This is a style issue, not a bug, if you chase down what the function
and the caller are doing.
Found by Chris Eagle via Fortify.
Ben Pfaff [Fri, 9 Jan 2009 00:35:47 +0000 (16:35 -0800)]
Use a uint16_t variable to store a 16-bit value, not an int.
This is a style issue, not a bug, because the int only ever held
values in the range 0...UINT16_MAX.
Found by Chris Eagle via Fortify.
Ben Pfaff [Fri, 9 Jan 2009 00:32:21 +0000 (16:32 -0800)]
Mark memory allocation functions with __attribute__((malloc)).
This may improve optimization, and it may make it easier for tools such
as Fortify to see what is going on.
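A small sketch of what such an annotation looks like. The macro and function names (SK_MALLOC_LIKE, sk_xmalloc) are invented for the example and are not the project's. The attribute tells GCC-compatible compilers that the returned pointer does not alias any other live object.

```c
#include <stdio.h>
#include <stdlib.h>

/* Expands to the GCC attribute where available, and to nothing elsewhere. */
#if defined(__GNUC__)
#define SK_MALLOC_LIKE __attribute__((malloc))
#else
#define SK_MALLOC_LIKE
#endif

static void *sk_xmalloc(size_t size) SK_MALLOC_LIKE;

/* Allocates 'size' bytes and never returns NULL: aborts on failure.
 * A request for 0 bytes is rounded up so the result is always usable
 * as a unique pointer. */
static void *
sk_xmalloc(size_t size)
{
    void *p = malloc(size ? size : 1);
    if (p == NULL) {
        fputs("out of memory\n", stderr);
        abort();
    }
    return p;
}
```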
Ben Pfaff [Sat, 10 Jan 2009 00:45:54 +0000 (16:45 -0800)]
datapath: Fix tracking of number of flows in hash table.
Fixes bug #684.
Thanks to Reid for noticing the problem.
Ben Pfaff [Sat, 10 Jan 2009 00:24:56 +0000 (16:24 -0800)]
datapath: Add log level annotations to printk messages.
General approach is:
- KERN_EMERG: Conditions that prevent the modules from loading.
- KERN_ERR: Conditions that indicate an OpenFlow kernel code bug.
- KERN_WARNING: Conditions that might indicate a bug in OpenFlow kernel
code or other kernel code.
- KERN_NOTICE: Conditions that might indicate a bug in secchan or the
OpenFlow controller, or minor conditions that are typically transient.
Justin Pettit [Fri, 9 Jan 2009 23:41:42 +0000 (15:41 -0800)]
Add datapath device name to printk's.
To aid debugging, this prints the datapath device name to printk
messages. Not doing this wasn't a big deal when only a single datapath
was running, but it's very confusing when there are multiple.
Ben Pfaff [Fri, 9 Jan 2009 23:18:36 +0000 (15:18 -0800)]
vswitchd: Delete 'ifaces' pointer to interface when deleting interface.
Otherwise we dereference a dangling pointer to the interface when we
look up the interface by datapath port index, causing a segfault.
Introduced in commit 150ac45, "vswitchd: Eliminate "can't forward to bad
port" when interfaces disappear," which deletes an interface that is known
to be in the datapath port index table.
Ben Pfaff [Fri, 9 Jan 2009 22:30:25 +0000 (14:30 -0800)]
rconn: Fix segfault when the idle timeout races with connection failure.
Noticed in Xen VM migration torture test (thanks Henrik!)
Justin Pettit [Fri, 9 Jan 2009 22:12:24 +0000 (14:12 -0800)]
Delete externally removed interfaces from bridge compatibility config.
The bridge compatibility code was not notified when interfaces were
removed from datapaths. This fixes that.
Ben Pfaff [Fri, 9 Jan 2009 20:51:19 +0000 (12:51 -0800)]
vswitchd: Eliminate "can't forward to bad port" when interfaces disappear.
When an interface was deleted from a datapath by a process other than
vswitchd (which is not supposed to happen), vswitchd would not realize it
and would continue to set up flows for that interface (and leave in place
existing flows). This caused the kernel to complain "can't forward to bad
port" for each packet on these flows.
Xen triggered this by destroying vifs that were on vswitchd-controlled
datapaths (which removes them from any datapath that they are on).
This fixes the problem, by making vswitchd notice when interfaces
disappear and fixing up the flow table.
Ben Pfaff [Fri, 9 Jan 2009 19:46:06 +0000 (11:46 -0800)]
datapath: Don't drop oversize GSO frames, since GSO will break them up.
Fixes TCP performance problems on Xen.
All credit to Justin for diagnosis.
Justin Pettit [Fri, 9 Jan 2009 01:16:33 +0000 (17:16 -0800)]
Add support for sysfs and ethtool.
Add support for sysfs when the bridge compatibility module is running.
Currently, this only works for 2.6.18 kernels. Support for other kernels
should follow soon. Also, add ethtool support to the datapath device.
Ben Pfaff [Fri, 9 Jan 2009 00:24:26 +0000 (16:24 -0800)]
Only enable warning options that the compiler actually understands.
Ben Pfaff [Fri, 9 Jan 2009 00:10:15 +0000 (16:10 -0800)]
Fix memory leak in make_pidfile().
Found by Chris Eagle via Fortify.
Martin Casado [Thu, 8 Jan 2009 23:56:02 +0000 (15:56 -0800)]
Add Nicira copyright and GPLv3 to vswitchd
Also changed COPYING to reflect that copyrights are assigned
per file, and to include the GPLv3 text.
Ben Pfaff [Thu, 8 Jan 2009 23:49:03 +0000 (15:49 -0800)]
Enable many additional GCC warnings by default.
Ben Pfaff [Thu, 8 Jan 2009 23:20:57 +0000 (15:20 -0800)]
Mark stubbed-out function parameter as UNUSED.
Ben Pfaff [Thu, 8 Jan 2009 23:20:32 +0000 (15:20 -0800)]
Add missing switch case.
Found with -Wswitch-enum.
Ben Pfaff [Thu, 8 Jan 2009 23:19:50 +0000 (15:19 -0800)]
Add missing trailing initializers.
Found with -Wmissing-field-initializers.
Ben Pfaff [Thu, 8 Jan 2009 23:18:38 +0000 (15:18 -0800)]
Mark out_of_memory() as never returning.
Found with -Wmissing-noreturn.
Ben Pfaff [Thu, 8 Jan 2009 23:18:00 +0000 (15:18 -0800)]
Add missing header file to brcompat.h.
Found with -Wmissing-prototypes.
Ben Pfaff [Thu, 8 Jan 2009 23:27:52 +0000 (15:27 -0800)]
vlog: Fix initializer in VLOG_RATE_LIMIT macro.
Found by -Wmissing-field-initializers.
Ben Pfaff [Thu, 8 Jan 2009 23:27:35 +0000 (15:27 -0800)]
vlog: Mark format_log_message() as taking a printf format string.
Found by -Wmissing-format-attribute.
Ben Pfaff [Thu, 8 Jan 2009 23:27:23 +0000 (15:27 -0800)]
Fix bug that could have caused infinite loop in ofp_print_actions().
Found by -Wextra noticing that len < 0 is always false.
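The general hazard can be illustrated with a sketch of walking variable-length records. This is not ofp_print_actions(): the struct and function names are hypothetical, and the length field is read in host byte order for simplicity. With an unsigned remaining length, a test like "len < 0" can never stop the loop, so each record's length has to be validated explicitly before it is subtracted.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header at the start of each variable-length action. */
struct sk_action_header {
    uint16_t type;
    uint16_t len;       /* Total length of this action, header included. */
};

/* Counts the actions in the 'len' bytes at 'buf'.  Returns -1 if an
 * action is truncated or claims a length shorter than its own header.
 * A zero length must be rejected, or the loop would never advance. */
static int
sk_count_actions(const uint8_t *buf, size_t len)
{
    int n = 0;

    while (len > 0) {
        struct sk_action_header h;

        if (len < sizeof h) {
            return -1;
        }
        memcpy(&h, buf, sizeof h);
        if (h.len < sizeof h || h.len > len) {
            return -1;
        }
        buf += h.len;
        len -= h.len;
        n++;
    }
    return n;
}
```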
Ben Pfaff [Thu, 8 Jan 2009 23:27:03 +0000 (15:27 -0800)]
Remove unused functions.
Found by -Wmissing-prototypes.
Ben Pfaff [Thu, 8 Jan 2009 23:12:29 +0000 (15:12 -0800)]
Add missing function prototypes to header files.
Found with -Wmissing-prototypes.
Ben Pfaff [Thu, 8 Jan 2009 23:25:24 +0000 (15:25 -0800)]
Remove comparison of unsigned value < 0.
Found by -Wextra.
Ben Pfaff [Thu, 8 Jan 2009 23:25:15 +0000 (15:25 -0800)]
Mark unused callback function parameters as UNUSED.
Found by -Wunused-parameter.
Ben Pfaff [Thu, 8 Jan 2009 23:25:05 +0000 (15:25 -0800)]
Change external functions to static functions, where possible.
Found by -Wmissing-prototypes.
Ben Pfaff [Thu, 8 Jan 2009 23:24:48 +0000 (15:24 -0800)]
Make function declarations into prototypes.
Found by -Wmissing-prototypes.
Ben Pfaff [Thu, 8 Jan 2009 23:23:50 +0000 (15:23 -0800)]
Remove unused function parameter from stp_start().
Found by -Wunused-parameter.
Ben Pfaff [Thu, 8 Jan 2009 23:23:29 +0000 (15:23 -0800)]
Remove unused function parameter from stp_received_tcn_bpdu().
Found with -Wunused-parameter.
Ben Pfaff [Thu, 8 Jan 2009 23:00:13 +0000 (15:00 -0800)]
netdev: Remove unused 'fd' parameter from set_flags().
Found by -Wunused-parameter.
Ben Pfaff [Thu, 8 Jan 2009 22:59:42 +0000 (14:59 -0800)]
netdev: Remove unused 'netdev' parameter from dpif_add_router().
Found by -Wunused-parameter.
Ben Pfaff [Thu, 8 Jan 2009 22:58:35 +0000 (14:58 -0800)]
Remove 'wait' parameter from dpif_send_openflow().
The 'wait' parameter had no effect (it was not implemented) and it was
only ever set to false. It's easier to simply remove it than to
implement an unneeded feature.
Found by -Wunused-parameter.
Ben Pfaff [Thu, 8 Jan 2009 21:54:41 +0000 (13:54 -0800)]
vswitchd: Fix port mirroring example.
Thanks to Keith for pointing this out.
Ben Pfaff [Thu, 8 Jan 2009 21:39:56 +0000 (13:39 -0800)]
vswitchd: Fix inaccurate comment.
Thanks to Pete for pointing this out.
Ben Pfaff [Thu, 8 Jan 2009 21:38:10 +0000 (13:38 -0800)]
vswitchd: Make secchan subprocesses listen for management connections.
Ben Pfaff [Thu, 8 Jan 2009 21:37:44 +0000 (13:37 -0800)]
secchan: Fix inverted logic (down vs. up).
Ben Pfaff [Thu, 8 Jan 2009 20:25:18 +0000 (12:25 -0800)]
datapath: Fix deadlock in switch port removal.
dp_del_switch_port() cancels the port_task and waits for
it to finish, but the port_task requires the RTNL lock
to complete, which dp_del_switch_port() holds, thus a
deadlock.
This commit fixes the problem, by deleting the port_task
entirely and moving its work into secchan.
Ben Pfaff [Thu, 8 Jan 2009 00:57:05 +0000 (16:57 -0800)]
datapath: Avoid deadlock on dp_mutex versus kthread_stop().
When a datapath is deleted, del_dp() acquires dp_mutex and uses
kthread_stop() to wait for the dp_task to die. Meanwhile, dp_task may
be waiting to acquire dp_mutex and won't die until it does so.
This commit fixes the problem by sending SIGKILL to dp_task before
calling kthread_stop() and making dp_task give up if it receives a signal.
Ben Pfaff [Wed, 7 Jan 2009 23:19:09 +0000 (15:19 -0800)]
datapath: Fix deadlock in network device notifier.
When a network device that is part of a datapath is destroyed, the
notifier handler in dp_device_event() calls dp_del_switch_port() to
delete the switch port. dp_del_switch_port() then calls rtnl_lock()
to drop the device's promiscuous count. But the RTNL lock has already
been taken at this point, so taking it recursively deadlocks.
The obvious fix is for dp_del_switch_port()'s caller to take the RTNL
lock for it, if it is not already taken. This, however, causes a
different deadlock, as dp_del_switch_port() needs dp_mutex and that
means that we are nesting dp_mutex and the RTNL both ways.
The fix used here is to always nest dp_mutex inside the RTNL lock and
never vice versa.
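The ordering rule can be illustrated in userspace with POSIX mutexes. This is only an analogy: the sk_* names are invented, sk_rtnl stands in for the RTNL lock and sk_dp_mutex for dp_mutex, and the kernel code differs in detail. The inner function never takes the outer lock itself, so a caller that already holds it (like the notifier path) cannot deadlock on it, and the inner lock is always acquired second.

```c
#include <pthread.h>

static pthread_mutex_t sk_rtnl = PTHREAD_MUTEX_INITIALIZER;      /* Outer. */
static pthread_mutex_t sk_dp_mutex = PTHREAD_MUTEX_INITIALIZER;  /* Inner. */
static int sk_n_ports = 3;

/* Deletes one port.  The caller must already hold sk_rtnl; this function
 * takes only the inner lock, preserving the outer-then-inner order. */
static void
sk_del_port_locked(void)
{
    pthread_mutex_lock(&sk_dp_mutex);
    sk_n_ports--;
    pthread_mutex_unlock(&sk_dp_mutex);
}

/* Entry point for callers that hold no locks yet. */
static void
sk_del_port_from_request(void)
{
    pthread_mutex_lock(&sk_rtnl);
    sk_del_port_locked();
    pthread_mutex_unlock(&sk_rtnl);
}

/* Entry point for callers that are invoked with sk_rtnl already held,
 * as a device notifier would be.  It must not lock sk_rtnl again. */
static void
sk_del_port_from_notifier(void)
{
    sk_del_port_locked();
}
```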
=============================================
[ INFO: possible recursive locking detected ]
---------------------------------------------
tunctl/2934 is trying to acquire lock:
(rtnl_mutex){--..}, at: [<c029667d>] mutex_lock+0x1c/0x1f
but task is already holding lock:
(rtnl_mutex){--..}, at: [<c029667d>] mutex_lock+0x1c/0x1f
other info that might help us debug this:
1 lock held by tunctl/2934:
#0: (rtnl_mutex){--..}, at: [<c029667d>] mutex_lock+0x1c/0x1f
stack backtrace:
[<c01040cf>] show_trace+0xd/0x10
[<c0104655>] dump_stack+0x19/0x1b
[<c012f897>] __lock_acquire+0x747/0x9a6
[<c0130058>] lock_acquire+0x4b/0x6b
[<c029651b>] __mutex_lock_slowpath+0xb0/0x1f6
[<c029667d>] mutex_lock+0x1c/0x1f
[<c023f045>] rtnl_lock+0xd/0xf
[<c89703eb>] dp_del_switch_port+0x2d/0xb9 [openflow_mod]
[<c8972554>] dp_device_event+0x84/0xb0 [openflow_mod]
[<c0125033>] notifier_call_chain+0x20/0x31
[<c012506d>] raw_notifier_call_chain+0x8/0xa
[<c0237e75>] unregister_netdevice+0x129/0x1ce
[<c020ed97>] tun_chr_close+0x5e/0x69
[<c01573a5>] __fput+0xb3/0x15e
[<c0157467>] fput+0x17/0x19
[<c0154de1>] filp_close+0x51/0x5b
[<c011aeca>] put_files_struct+0x6d/0xa9
[<c011becf>] do_exit+0x206/0x790
[<c011c4d2>] sys_exit_group+0x0/0x11
[<c011c4e1>] sys_exit_group+0xf/0x11
[<c0102ab7>] syscall_call+0x7/0xb
Problem identified with this script (in conjunction with the brcompat
module):
#! /bin/sh
brctl delbr a
brctl delbr b
brctl delbr c
set -x
stress () {
    bridge=$1
    shift
    while true; do
        brctl addbr $bridge; brctl show
        for iface
        do
            case $iface in
                tap*) tunctl -t $iface ;;
            esac
            brctl addif $bridge $iface
        done
        for iface
        do
            brctl delif $bridge $iface
            case $iface in
                tap*) tunctl -d $iface ;;
            esac
        done
        brctl delbr $bridge; brctl show
    done
}
stress a tap0 tap1 tap2 &
stress b tap3 tap4 tap5 &
trap 'killall stress-brctl' 0 SIGINT
wait