Ben Pfaff [Wed, 20 May 2009 16:17:04 +0000 (09:17 -0700)]
ovs-pki: Fix bashism.
Thanks to Nicolas Perrenoud <nicolape@ee.ethz.ch> for reporting the
problem.
Justin Pettit [Wed, 20 May 2009 01:01:49 +0000 (18:01 -0700)]
xenserver: Fix setting pool-wide controller setting
The latest set of changes left out pushing the controller configuration
changes out to members of the pool. This patch adds back the call to do
that.
Justin Pettit [Tue, 19 May 2009 00:08:47 +0000 (17:08 -0700)]
xen: Fix a missed xapi plugin from the Nicira brand stripping
Missed a xapi plugin during the rebranding. This adds that and fixes a
couple of case-sensitive variable issues. Ugh, need to figure out how to
make local Xen RPM builds...
Justin Pettit [Mon, 18 May 2009 23:31:27 +0000 (16:31 -0700)]
xen: Add missing change to Nicira brand removal commit
Also need to update the makefile!
Justin Pettit [Mon, 18 May 2009 23:22:40 +0000 (16:22 -0700)]
xen: Remove Nicira branding from vSwitch xsconsole plugin
Since the plan is to eventually integrate this code into the Xen
distribution, the Nicira tags have been removed. The copyright is still
held by Nicira until a better one is found.
Jesse Gross [Mon, 18 May 2009 22:41:11 +0000 (15:41 -0700)]
brcompatd: Fix typo in brcompatd man page.
Ben Pfaff [Mon, 18 May 2009 20:43:45 +0000 (13:43 -0700)]
xenserver: Drop unused variable.
Ben Pfaff [Mon, 18 May 2009 20:31:00 +0000 (13:31 -0700)]
xenserver: Fix missing default route problem.
Taking down the vlan_slave's network device deletes any routes associated
with it, including the default route, and those routes are not restored
when the network device is brought up again later.
So don't take that device down at all. There's no reason to do so.
We still want to bring that device up after configuring the pif, just in
case it was down, so this commit only deletes the down_netdev, not the
corresponding up_netdev later on.
Second half of bug #1327.
Ben Pfaff [Mon, 18 May 2009 20:15:28 +0000 (13:15 -0700)]
xenserver: Choose correct management PIF even with --force.
The --force option was failing to choose the management PIF based on
the "management" PIF field. This commit fixes the problem.
This is a partial fix for bug #1327.
Ben Pfaff [Mon, 18 May 2009 19:23:39 +0000 (12:23 -0700)]
xenserver: Delete code unneeded for openvswitch.
openvswitch doesn't care about the ordering here.
Ben Pfaff [Mon, 18 May 2009 19:23:03 +0000 (12:23 -0700)]
xenserver: Make primary management interface on VLAN possible.
Before xapi puts the primary management interface on a new PIF, it verifies
that the PIF's network device exists and has an IP address. For physical
PIFs and for bonds, this works fine. For VLANs, the openvswitch does not
put the IP address on the network device that xapi expects, but on a
different network device (e.g. openvswitch puts it on eth0.X instead of on
xapiY), so this check fails with the message "The specified interface
cannot be used because it has no IP address".
This commit makes interface-reconfigure "fake out" xapi by putting a
dummy IP address on the xapi# interface.
Bug #1325.
Justin Pettit [Mon, 18 May 2009 15:49:45 +0000 (08:49 -0700)]
xen: Fix permission on xsconsole plugin
Not sure why the file permissions on the xsconsole plugin changed, but
this reverts them back.
Justin Pettit [Mon, 18 May 2009 15:45:56 +0000 (08:45 -0700)]
xen: xsconsole plugin cleanup based on Citrix feedback
Various cleanups to the vSwitch plugin for xsconsole based on feedback
from Citrix. The changes include a cleaner way to pop-up temporary
messages, leaving a success or failure box after performing an action,
and safer subprocess calling.
Ben Pfaff [Fri, 15 May 2009 23:56:33 +0000 (16:56 -0700)]
ovs-dpctl: Rename commands for consistency.
Ben Pfaff [Fri, 15 May 2009 22:57:53 +0000 (15:57 -0700)]
ovs-dpctl: Remove get-idx and get-name commands.
These commands made some sense when dpctl only accepted numerical datapath
identifiers, but now it accepts names as well as numbers. These commands
were never really used much, so delete them.
Ben Pfaff [Fri, 15 May 2009 22:48:58 +0000 (15:48 -0700)]
Break dpctl into two programs: ovs-ofctl and ovs-dpctl.
The datapath and OpenFlow are fairly different and it seems wrong
conceptually to work with both in a single program. So this commit breaks
them up into two programs.
Ben Pfaff [Fri, 15 May 2009 20:18:46 +0000 (13:18 -0700)]
xenserver: Fix package build.
Ben Pfaff [Fri, 15 May 2009 19:59:35 +0000 (12:59 -0700)]
ovs-cfg-mod: Accept -v option before any targets are specified.
Setting the log levels should be allowed before specifying targets, but it
wasn't. This fixes it.
This fixes a failure to bring up network interfaces at boot (and at any
other time too).
Ben Pfaff [Fri, 15 May 2009 19:40:05 +0000 (12:40 -0700)]
Rename vlogconf to ovs-appctl, for consistency and as a better name.
The vlogconf program at one time only affected log levels. Now, it can do
more, and probably will be expanded in the future, so appctl is a better
name.
Ben Pfaff [Fri, 15 May 2009 18:16:52 +0000 (11:16 -0700)]
Rename cfg-mod to ovs-cfg-mod, for consistency.
Ben Pfaff [Fri, 15 May 2009 18:14:39 +0000 (11:14 -0700)]
xenserver: Fix path to invoke cfg-mod utility when removing a vif.
It's possible that this fixes a real bug.
Ben Pfaff [Fri, 15 May 2009 18:08:03 +0000 (11:08 -0700)]
Drop stray references to udatapath, which doesn't exist any longer.
Ben Pfaff [Fri, 15 May 2009 18:02:08 +0000 (11:02 -0700)]
Rename "controller" to "ovs-controller" and move to utilities.
The controller is unimportant and we don't want people thinking that it is
important. (They should be using NOX, not the OpenVSwitch controller.)
So kill off its top-level directory.
Justin Pettit [Fri, 15 May 2009 01:11:57 +0000 (18:11 -0700)]
netflow: Set Engine and Add Alternate Port ID Flag
Allow the Engine Type and Engine ID to be configured in NetFlow
messages. Previously, they were always set to zero. Now, by default,
they are both set to the datapath index. These can be overridden with
the "netflow.bridge.engine-type" and "netflow.bridge.engine-id" keys,
respectively.
Add the ability to allow collectors to distinguish between virtual
switches sending messages from the host. When the
"netflow.bridge.add-id-to-iface" flag is enabled, the least significant
7 bits of the engine id are placed into the most significant bits of
the ingress and egress interface fields of flow records. This
mimics the behavior of VMware ESX when the "-p" option is given to the
NetFlow configuration program.
(Addresses Bug #1222 and Bug #1223)
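The bit layout described above can be sketched as follows. This is only an illustration, not the code from the commit: the helper name is hypothetical, and the 7/9 bit split assumes the 16-bit interface fields of NetFlow v5 records.

```c
#include <stdint.h>

/* Hypothetical helper: place the low 7 bits of the engine ID into the top
 * 7 bits of a 16-bit interface field and keep the low 9 bits of the port
 * number.  The 9-bit remainder is an assumption that follows from the
 * field being 16 bits wide. */
uint16_t
nf_iface_with_engine_id(uint8_t engine_id, uint16_t port)
{
    uint16_t high = (uint16_t) ((engine_id & 0x7f) << 9);  /* Bits 15..9. */
    uint16_t low = (uint16_t) (port & 0x1ff);              /* Bits 8..0. */

    return (uint16_t) (high | low);
}
```

With this layout, engine ID 1 and port 3 yield 0x0203, so a collector can recover both values from one field.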
Justin Pettit [Thu, 14 May 2009 23:21:03 +0000 (16:21 -0700)]
netflow: Document 1400-byte packet length limit.
The NetFlow code accumulates records until the packet is 1400 bytes or
some amount of time has passed. This just adds a comment that NetFlow
messages are limited to 30 records, which places a ceiling on how large
the message can be (1400 is below that ceiling).
Justin Pettit [Thu, 14 May 2009 22:07:03 +0000 (15:07 -0700)]
vswitch: reduce passes through config loop
When vswitchd configures itself, it loops through the bridges to pull
relevant configuration information from vswitchd.conf. Configuration
for determining the dpid, controller, and NetFlow was inside a loop for
configuring a bridge's interfaces. This meant that these configuration
parameters were being re-set for each interface in the bridge as opposed
to just once for the bridge. This change pushes this configuration
outside of that interface loop. The LIST_FOR_EACH_SAFE macro to loop
through the bridges is switched to LIST_FOR_EACH, since bridges are not
deleted in this loop.
Ben Pfaff [Thu, 14 May 2009 23:25:12 +0000 (16:25 -0700)]
netflow: Report largest possible value when counters exceed 32 bits.
secchan tracks packets and bytes using 64-bit counters, but the
corresponding NetFlow v5 fields are only 32 bits wide. Until now we would
just report the lower 32 bits on overflow; this commit changes the
behavior to reporting 0xffffffff instead.
Bug #1316.
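A minimal sketch of the difference between the old and new behavior, assuming hypothetical helper names (the actual functions in secchan are not shown in this log):

```c
#include <stdint.h>

/* Old behavior: silently keep only the lower 32 bits of the counter. */
uint32_t
nf_truncate_counter(uint64_t count)
{
    return (uint32_t) count;
}

/* New behavior: report the largest 32-bit value once the 64-bit counter
 * no longer fits in the NetFlow v5 field. */
uint32_t
nf_clamp_counter(uint64_t count)
{
    return count > UINT32_MAX ? UINT32_MAX : (uint32_t) count;
}
```

For a counter of 0x100000005, truncation would report 5, whereas clamping reports 0xffffffff, which at least tells the collector that the flow was large.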
Ben Pfaff [Thu, 14 May 2009 23:13:30 +0000 (16:13 -0700)]
xenserver: Avoid printing log messages to the console from scripts.
Log messages from cfg-mod tended to overwrite random bits of the xsconsole
screen.
Bug #1315.
Ben Pfaff [Thu, 14 May 2009 22:35:47 +0000 (15:35 -0700)]
brcompatd: Fix typo in previous commit.
Ben Pfaff [Thu, 14 May 2009 22:31:15 +0000 (15:31 -0700)]
brcompatd: Don't remove nonexistent ports if vswitchd will create them.
There is a race between brcompatd and vswitchd for internal ports, e.g.
in this scenario:
1. cfg-mod adds "bridge.xenbr0.port=xenbr0".
2. vswitchd creates xenbr0.
we can have brcompatd slip in between them:
2.5. brcompatd notices that there is no network device xenbr0
and deletes the new line from the config file.
For the local port and other internal ports we don't want brcompatd
interfering, so this commit makes it ignore them.
Bug #1314.
Ben Pfaff [Thu, 14 May 2009 21:06:08 +0000 (14:06 -0700)]
Get rid of vswitchext entirely.
Ben Pfaff [Thu, 14 May 2009 20:39:12 +0000 (13:39 -0700)]
Move watchdog timer utility from vswitchext to openvswitch.
Ben Pfaff [Thu, 14 May 2009 20:19:58 +0000 (13:19 -0700)]
Move ovs-monitor script from vswitchext to openvswitch.
Ben Pfaff [Thu, 14 May 2009 19:53:10 +0000 (12:53 -0700)]
Add configure option --disable-userspace for building kernel modules only.
Ben Pfaff [Thu, 14 May 2009 19:33:34 +0000 (12:33 -0700)]
vswitch: Give up hope that the config file delimiter will be changed.
This code originally assumed that it could iterate over all the subsections
of "port" in the configuration file to obtain the names of network devices,
but this didn't work because "." is both the configuration file section
delimiter and a valid (and fairly common) character in network device
names. So it was changed to use a different technique with the hope that
the original code could be restored when the configuration file syntax was
changed.
Now we've agreed that the configuration file syntax is not going to change
before we change to a different configuration model entirely, so this
commit deletes the original code (which was #if'd out, not deleted).
Another reason to do this is to kill off some warnings due to unused
functions and variables.
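The delimiter ambiguity can be illustrated with a small sketch. The key "port.eth0.5.tag" and the helper below are hypothetical and are not taken from the vswitchd source; they only show why splitting on "." cannot recover a device name that itself contains a ".".

```c
#include <string.h>

/* Hypothetical helper: return the length of the first subsection that
 * follows 'prefix' in 'key', splitting on '.', or -1 if 'key' does not
 * start with 'prefix'. */
int
first_subsection_len(const char *key, const char *prefix)
{
    size_t prefix_len = strlen(prefix);
    const char *start;
    const char *end;

    if (strncmp(key, prefix, prefix_len) != 0) {
        return -1;
    }
    start = key + prefix_len;
    end = strchr(start, '.');
    return end ? (int) (end - start) : (int) strlen(start);
}
```

For the key "port.eth0.5.tag" this yields 4, that is "eth0", although the intended device name would be the 6-character "eth0.5".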
Ben Pfaff [Thu, 14 May 2009 19:25:26 +0000 (12:25 -0700)]
Delete OpenFlow management spec.
Although this spec was canonical, it was inaccurate and we plan to replace
it by netconf anyhow.
Ben Pfaff [Thu, 14 May 2009 19:24:43 +0000 (12:24 -0700)]
Delete OpenFlow spec.
The openvswitch tree is not the canonical source of the OpenFlow spec.
Get it from openflowswitch.org instead.
Ben Pfaff [Thu, 14 May 2009 18:22:01 +0000 (11:22 -0700)]
Revert "Apply temporary band-aid to VLAN-related OOPS on XenServer."
This reverts commit
ef4a656e73b3bf157519a8f6e239c0e5f54600a3,
since the need for it was eliminated (I hope).
Ben Pfaff [Thu, 14 May 2009 18:20:10 +0000 (11:20 -0700)]
datapath: Fix VLAN-related kernel OOPS on XenServer.
In deleting internal ports (other than the local port) we were failing to
call dp_del_if_hook even though we had called dp_add_if_hook when we added
it. This prevented the sysfs kobject from being released and caused the
wrong address to be passed to kfree. The former could cause random
memory corruption; the latter may be benign since the address was still in
the same slab object.
Ben Pfaff [Thu, 14 May 2009 16:08:20 +0000 (09:08 -0700)]
Apply temporary band-aid to VLAN-related OOPS on XenServer.
Now, VLAN devices will be disabled by default. To enable them, create a
file named /etc/vswitchd.enable-vlans.
This commit will be reverted when a real fix is available.
Ben Pfaff [Wed, 13 May 2009 22:11:37 +0000 (15:11 -0700)]
Move EZIO utilities from vswitchext into openvswitch.
Keith Amidon [Wed, 13 May 2009 21:48:42 +0000 (14:48 -0700)]
Fix for typo in warning message.
Ben Pfaff [Wed, 13 May 2009 21:17:09 +0000 (14:17 -0700)]
xenserver: Add comments describing open issues for interface-reconfigure.
Ben Pfaff [Wed, 13 May 2009 21:15:58 +0000 (14:15 -0700)]
xenserver: Fix --force up/down behavior in a resource pool.
The PIFs key of a network lists one PIF for each member of the pool, not
one PIF per bond or whatever I had in mind. So we need to iterate over
all the PIFs in the network and find the one for our current host.
Ben Pfaff [Wed, 13 May 2009 21:01:32 +0000 (14:01 -0700)]
Add support for Citrix XenServer.
This was previously in openflowext. Now we are adding it to openvswitch.
Ben Pfaff [Wed, 13 May 2009 19:14:46 +0000 (12:14 -0700)]
datapath: Fix build warnings and errors on Linux 2.6.15, 2.6.16, 2.6.17.
Ben Pfaff [Mon, 11 May 2009 21:26:44 +0000 (14:26 -0700)]
datapath: Add support for "internal" ports similar to the local port.
The datapath has supported a simulated "local port" for a long time, but it
has never been possible to create additional ports with the same
characteristics. One way to do this is using the veth driver, but this is
somewhat awkward, since there is no desire to create a pair of devices;
one suffices.
The immediate purpose for this feature is to allow an IP address to be put
on both a physical interface and a tagged VLAN attached to that interface
on Xen.
Justin Pettit [Wed, 13 May 2009 06:35:45 +0000 (23:35 -0700)]
Don't print warning about removing policy on startup.
The policing code attempts to delete any traffic control configuration on
startup, so that interfaces come up in a known state. If the interface
didn't have any traffic control configuration, this would cause it to
print a couple of scary sounding warning messages. This commit makes it
so those no longer print.
Justin Pettit [Wed, 13 May 2009 05:58:25 +0000 (22:58 -0700)]
Fix return value call on send() when sending NetFlow messages.
When sending NetFlow messages, we use the send() call, but were checking
the wrong return value. It would report an error when any non-zero
value was returned. The send() call returns the number of bytes sent or
-1 on error. Thus, whenever a NetFlow message was sent, it would
generate an error message. Now, we only log a message when a value of
-1 is returned. (Bug #1166)
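The two checks can be contrasted in a short sketch. The function names are hypothetical; only the return-value convention of send() (bytes sent, or -1 on error) comes from the description above.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Old check: treats every nonzero return value as an error, so every
 * successful send (which returns a positive byte count) is misreported. */
bool
send_failed_buggy(ssize_t retval)
{
    return retval != 0;
}

/* Fixed check: only a negative return value indicates failure. */
bool
send_failed(ssize_t retval)
{
    return retval < 0;
}
```

A caller would pass the result of send() directly, for example send_failed(send(fd, buf, len, 0)), and log an error only when it returns true.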
Ben Pfaff [Thu, 7 May 2009 00:35:40 +0000 (17:35 -0700)]
datapath: Call rcu_barrier() before unloading module.
According to the article "RCU and Unloadable Modules" available at lwn.net,
a module that uses RCU callbacks should call rcu_barrier() before
unloading, because synchronize_rcu() does not ensure that all RCU callbacks
have actually completed, only that a grace period has elapsed.
Ben Pfaff [Thu, 7 May 2009 00:14:37 +0000 (17:14 -0700)]
datapath: Always call dp_process_received_packet() with BHs disabled.
dp_process_received_packet() was assuming that bottom-halves were disabled,
but this was not true where it was called from dp_dev_do_xmit().
Also, add comments documenting synchronization.
Ben Pfaff [Fri, 8 May 2009 18:24:58 +0000 (11:24 -0700)]
datapath: Omit sysfs-specific data when sysfs is not enabled or not supported.
This saves a few bytes of memory but it also makes it clear to the reader
what data is used for what.
Ben Pfaff [Tue, 5 May 2009 21:23:32 +0000 (14:23 -0700)]
datapath: Omit SNAT-specific data when SNAT is not enabled.
This saves a few bytes of memory but it also makes it clear to the reader
what data is used for what.
Ben Pfaff [Fri, 8 May 2009 17:46:19 +0000 (10:46 -0700)]
brcompatd: Log high-level actions and their results.
brcompatd did not log the addbr, delbr, addif, and delif actions that it
was taking. This commit adds that logging.
Ben Pfaff [Tue, 12 May 2009 20:48:26 +0000 (13:48 -0700)]
cfg-mod: Add --changes option for logging configuration changes.
This makes it a lot easier to see what actually changed.
Ben Pfaff [Tue, 12 May 2009 20:41:02 +0000 (13:41 -0700)]
cfg-mod: Make --query print all values, not just those that are valid keys.
A "key" has a strict syntax, so calling cfg_get_all_keys() will discard
all the values that don't have that syntax.
Ben Pfaff [Fri, 8 May 2009 17:39:17 +0000 (10:39 -0700)]
cfg: Log changes to config, not whole config, in cfg_read().
The configuration file is re-read on a regular basis by brcompatd and
vswitchd in practice. When debug-level logging is enabled on the cfg
module, it was logging the entire config file each time. Not only is this
a waste of log-file space, it's difficult for humans to see what actually
changed, if anything.
So this commit changes cfg_read() to log a diff instead of the whole config
file.
Ben Pfaff [Thu, 7 May 2009 17:35:17 +0000 (10:35 -0700)]
cfg: Improve comment.
Justin Pettit [Tue, 12 May 2009 17:44:36 +0000 (10:44 -0700)]
Only send NetFlow notifications for IP traffic.
NetFlow only supports exporting information about IP. We were sending a
notification for any flow that expired, which included non-IP packets.
This would generate NetFlow messages with nearly all fields set to zero.
Now, we only send NetFlow for packets that are IP. (Bug #1256)
Ben Pfaff [Tue, 12 May 2009 17:25:37 +0000 (10:25 -0700)]
Remove the ChangeLog since it is no longer relevant for OpenVSwitch.
It might make perfect sense to start a new file here for OpenVSwitch, but
the first item should be something like "1 June 2009: Initial public
release".
Ben Pfaff [Tue, 12 May 2009 17:21:43 +0000 (10:21 -0700)]
Remove spanning tree documentation, since STP doesn't work right now.
Justin Pettit [Mon, 11 May 2009 23:33:08 +0000 (16:33 -0700)]
Update OpenFlow tcpdump patch to work with latest code.
With some of the recent (and not so recent) changes to the source code,
the OpenFlow patch to tcpdump came out of sync. This brings it back so
it works again.
Justin Pettit [Mon, 11 May 2009 23:01:14 +0000 (16:01 -0700)]
Rename strlcpy to ovs_strlcpy.
If strlcpy is not defined on the build system, we build our own and add
it to the OpenVSwitch library. Unfortunately, programs that link
against the library may do the same thing, and there will be a name
conflict. This renames our implementation to prevent these linking errors.
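A prefixed replacement might look like the sketch below. This is not the project's actual implementation; the return convention (length of the source string, as in BSD strlcpy) is an assumption. The point is only that the symbol carries a project prefix, so it cannot collide with a strlcpy defined by a program that links against the library.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of a bounded string copy with a project-specific name.  Copies at
 * most 'size - 1' bytes of 'src' into 'dst' and always null-terminates
 * when 'size' is nonzero.  Returns strlen(src), following the BSD
 * convention (an assumption here). */
size_t
ovs_strlcpy(char *dst, const char *src, size_t size)
{
    size_t src_len = strlen(src);

    if (size > 0) {
        size_t n = src_len < size - 1 ? src_len : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}
```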
Ben Pfaff [Mon, 11 May 2009 17:36:32 +0000 (10:36 -0700)]
Rename the project to OpenVSwitch and change version number to 0.90.0.
The Debian packages have not been renamed yet, since they need plenty of
other work at the moment too.
Ben Pfaff [Tue, 5 May 2009 20:59:17 +0000 (13:59 -0700)]
datapath: Remove hardware table support.
This support was broken anyhow. We have no immediate plans to fix it, so
it's better not to claim to support it.
Ben Pfaff [Wed, 6 May 2009 22:40:21 +0000 (15:40 -0700)]
datapath: Compare entire flow during lookup, not just first 4 or 8 bytes.
The size of a pointer is not the size of the referent.
Only God knows how much havoc this was wreaking.
(The change in dp_table_lookup is for conformance with kernel style only.)
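The class of bug fixed here can be reproduced in a few lines. The 32-byte structure below is hypothetical (the real flow key layout is not shown in this log); what matters is that sizeof applied to a pointer gives 4 or 8 bytes, not the size of the structure it points to.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flow key, for illustration only. */
struct flow_key {
    uint8_t bytes[32];
};

/* Buggy comparison: 'sizeof a' is the size of the pointer, so only the
 * first 4 or 8 bytes of the keys are compared. */
bool
flow_equal_buggy(const struct flow_key *a, const struct flow_key *b)
{
    size_t n = sizeof a;

    return memcmp(a, b, n) == 0;
}

/* Fixed comparison: 'sizeof *a' is the size of the whole structure. */
bool
flow_equal(const struct flow_key *a, const struct flow_key *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}
```

Two keys that differ only in their last byte compare equal with the buggy version and unequal with the fixed one.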
Ben Pfaff [Wed, 6 May 2009 22:35:25 +0000 (15:35 -0700)]
datapath: Make sure that the "reserved" byte in user-provided flow is zero.
Otherwise we could return a "false negative" lookup result to the user.
(This is not known to fix any real bug; for it to do so, there would have
to be userspace code that doesn't initialize the "reserved" byte, but I
don't know of any.)
Ben Pfaff [Tue, 5 May 2009 20:26:08 +0000 (13:26 -0700)]
Fix complaint from "make distcheck" about failing to clean cfg-mod.8.
Ben Pfaff [Tue, 5 May 2009 20:22:49 +0000 (13:22 -0700)]
datapath: Remove support for Linux 2.4.
Ben Pfaff [Tue, 5 May 2009 18:47:36 +0000 (11:47 -0700)]
vswitch: Restore MAC learning for broadcast ARP replies on bonds.
Bonding has a special exception for MAC learning: don't learn from packets
on bonded ports if we already have learned it on another port. This is
because packets sent out one port can be received on the other, which would
cause us to learn incorrect locations.
But we need to make an exception for broadcast ARP replies, which indicate
that the MAC in question has moved to another switch. Before commit
76fdb7e57 "Implement OFPP_NORMAL action in secchan and hook into vswitchd"
we did so, and this commit restores that behavior.
Ben Pfaff [Tue, 5 May 2009 17:45:16 +0000 (10:45 -0700)]
vswitch: Eliminate dead code.
The bridge had a flow_idle_time member that was set to a constant value
and never modified. This commit removes it.
(secchan is now responsible for configuring the flow idle time, so it is
not desirable to revive this member.)
Ben Pfaff [Tue, 5 May 2009 17:23:02 +0000 (10:23 -0700)]
Add support for coverage counters.
This commit implements a simple form of coverage instrumentation. Points
in source code that are of interest must be explicitly annotated with
COVERAGE_INC. The coverage counters may be logged at any time with
coverage_log().
This form of coverage instrumentation is intended to be so lightweight that
it can be enabled in production builds. It is obviously not a substitute
for traditional coverage instrumentation with e.g. "gcov", but it is still
a useful debugging tool.
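The idea can be sketched as below. Only the names COVERAGE_INC and coverage_log() come from the commit message; the counter storage, the fixed set of points, the coverage_log() signature, and add_flow() are simplifying assumptions, not the actual implementation.

```c
#include <stdio.h>

/* Hypothetical fixed set of instrumented points. */
enum coverage_point { COV_FLOW_ADD, COV_FLOW_DEL, COV_N_POINTS };

static const char *const coverage_names[COV_N_POINTS] = {
    "flow_add", "flow_del"
};
unsigned long long coverage_counts[COV_N_POINTS];

/* Annotating a point costs a single increment, which is why this kind of
 * instrumentation is cheap enough for production builds. */
#define COVERAGE_INC(POINT) (coverage_counts[POINT]++)

/* Write every counter to 'stream' (signature assumed for this sketch). */
void
coverage_log(FILE *stream)
{
    int i;

    for (i = 0; i < COV_N_POINTS; i++) {
        fprintf(stream, "%s: %llu\n", coverage_names[i], coverage_counts[i]);
    }
}

/* Example of an annotated function. */
void
add_flow(void)
{
    COVERAGE_INC(COV_FLOW_ADD);
    /* ...the real work would go here... */
}
```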
Ben Pfaff [Mon, 4 May 2009 23:18:19 +0000 (16:18 -0700)]
secchan: When listing flows, uninstall rules that shouldn't be installed.
To implement flow expiration, secchan periodically queries all the flows
in the datapath flow table. Until now, it has then uninstalled flows that
do not have corresponding rules at all. It has not uninstalled flows that
do have rules that are not supposed to be installed. This commit makes it
also uninstall the latter.
(This is not known to fix any real problem. It is only for completeness.)
Ben Pfaff [Mon, 4 May 2009 23:14:35 +0000 (16:14 -0700)]
secchan: Reinstall flows deleted externally.
If something external to secchan deletes flows from the datapath (e.g.
the administrator runs "dpctl dp-del-flows") then until now secchan would
switch all of those packets manually, using dpif_execute(). Better
behavior is to reinstall the flow. This commit implements that.
Ben Pfaff [Mon, 4 May 2009 23:10:28 +0000 (16:10 -0700)]
datapath: Generalize flow creation and modification.
The ODP_FLOW_ADD and ODP_FLOW_SET_ACTS datapath commands can be usefully
generalized based on whether they should be allowed to create or modify
flows, or both, and whether they reset flow statistics when they modify
an existing flow. This commit does so by merging them into a single
ODP_FLOW_PUT command and adding a set of flags.
In particular this is needed to allow flows to be reinstalled if it is
uncertain whether they have been externally deleted (e.g. with "dpctl
dp-del-flows") without requiring first reading the flow table (as
handle_odp_msg() wants to do), and to replace a flow's actions and reset
its statistics without first deleting it (as rule_update_actions() wants
to do).
Also renames some other datapath commands, for naming consistency.
Also adapts userspace to these changes.
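The flag semantics described above can be modeled in userspace as follows. The flag names, the error codes, and the one-slot "table" are assumptions made for this sketch; the log names only the ODP_FLOW_PUT command itself.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flags controlling a combined "put" operation. */
enum {
    PUT_CREATE = 1 << 0,        /* May create a flow that does not exist. */
    PUT_MODIFY = 1 << 1,        /* May modify a flow that already exists. */
    PUT_ZERO_STATS = 1 << 2     /* Reset statistics when modifying. */
};

/* Hypothetical one-entry flow table. */
struct flow_slot {
    bool exists;
    int actions;
    uint64_t n_packets;
};

/* Returns 0 on success or a positive errno value. */
int
flow_put(struct flow_slot *slot, int actions, unsigned int flags)
{
    if (!slot->exists) {
        if (!(flags & PUT_CREATE)) {
            return ENOENT;
        }
        slot->exists = true;
        slot->actions = actions;
        slot->n_packets = 0;
        return 0;
    }
    if (!(flags & PUT_MODIFY)) {
        return EEXIST;
    }
    slot->actions = actions;
    if (flags & PUT_ZERO_STATS) {
        slot->n_packets = 0;
    }
    return 0;
}
```

Passing both the create and modify flags reinstalls a flow whether or not it was deleted externally, without reading the flow table first; adding the zero-stats flag replaces the actions and resets the statistics in one step.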
Ben Pfaff [Fri, 1 May 2009 20:35:36 +0000 (13:35 -0700)]
secchan: Honor OFPPC_NO_RECV, OFPPC_NO_RECV_STP, OFPPC_NO_FWD.
The refactoring of secchan and the kernel module dropped support for these
(required) OpenFlow port flags. This commit reimplements them.
Ben Pfaff [Fri, 1 May 2009 20:20:07 +0000 (13:20 -0700)]
secchan: Don't let queued packets exhaust memory.
The ofproto code was queuing OpenFlow messages to connections without
limiting the maximum number that could be queued at a time. Thus, the
backlog could grow without bound and exhaust all system memory.
This commit introduces a cap on the maximum number of queued messages
in two different categories: packet-in messages and replies to OpenFlow
requests.
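A per-category cap could be sketched like this. All names are hypothetical, and what happens once a cap is reached (here the caller is told not to queue the message) is an assumption; the log states only that the two categories are capped.

```c
#include <stdbool.h>
#include <stddef.h>

/* The two categories named in the commit message. */
enum queue_kind { Q_PACKET_IN, Q_REPLY, Q_N_KINDS };

/* Hypothetical per-connection accounting. */
struct conn_queues {
    size_t n_queued[Q_N_KINDS];
    size_t max_queued[Q_N_KINDS];
};

/* Returns true and counts the message if there is room in its category;
 * returns false if the cap for that category has been reached. */
bool
queue_try_reserve(struct conn_queues *q, enum queue_kind kind)
{
    if (q->n_queued[kind] >= q->max_queued[kind]) {
        return false;
    }
    q->n_queued[kind]++;
    return true;
}

/* Called once a queued message has actually been sent. */
void
queue_release(struct conn_queues *q, enum queue_kind kind)
{
    if (q->n_queued[kind] > 0) {
        q->n_queued[kind]--;
    }
}
```

Keeping separate counts means a flood of packet-in messages cannot starve replies to OpenFlow requests, and neither backlog can grow without bound.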
Ben Pfaff [Thu, 30 Apr 2009 00:19:08 +0000 (17:19 -0700)]
secchan: Fix TCP flags and IP TOS tracking for packets sent from userspace.
Ben Pfaff [Wed, 29 Apr 2009 22:45:20 +0000 (15:45 -0700)]
secchan: Fix flow statistics tracking.
Updates of flow statistics have been ad hoc and somewhat broken for some
time now. This commit makes them much more systematic and more likely
to be correct.
Ben Pfaff [Wed, 29 Apr 2009 22:43:54 +0000 (15:43 -0700)]
secchan: Clean up and simplify handle_odp_msg().
Ben Pfaff [Wed, 29 Apr 2009 22:43:24 +0000 (15:43 -0700)]
secchan: Factor common code into new function rule_update_actions().
rule_update() was only called in two places, and each time it was done
in the same way, so factor this out into a single new function
rule_update_actions().
Ben Pfaff [Wed, 29 Apr 2009 22:42:59 +0000 (15:42 -0700)]
secchan: Optimize no-change case in modify_flow().
Ben Pfaff [Wed, 29 Apr 2009 22:42:16 +0000 (15:42 -0700)]
secchan: Factor common code into rule_remove().
Several pieces of code were calling rule_uninstall(), classifier_remove(),
then rule_destroy() in sequence. Factor this out into a helper function.
Ben Pfaff [Tue, 28 Apr 2009 00:07:16 +0000 (17:07 -0700)]
secchan: Factor common code into new function rule_insert().
This is primarily a code cleanup. It also fixes a corner case for
statistics that formerly was properly handled in add_flow() but not in
ofproto_add_flow().
Ben Pfaff [Mon, 27 Apr 2009 23:56:44 +0000 (16:56 -0700)]
secchan: Remove unused parameter from ofproto_add_flow().
The 'packet != NULL' case was effectively dead, since every caller passed
a constant NULL here, so delete the parameter and the code to handle the
non-NULL case.
Ben Pfaff [Wed, 29 Apr 2009 22:36:44 +0000 (15:36 -0700)]
secchan: Eliminate UNKNOWN_SUPER.
When a super-rule is destroyed, secchan must reassess each of its subrules.
Each subrule might now have no super-rule (which we suspect is the common
case) or it might have a new super-rule.
Until now, secchan has "optimized" this reassessment by initially assigning
each of the deleted super-rule's subrules a super-rule of UNKNOWN_SUPER,
which is not a valid rule at all. It did this in the hope that the
subrule would get deleted before we need to know what its super-rule is.
However, this has repeatedly led to bugs, since it's not always obvious
what code will need to find a rule's super-rule.
This commit fixes the problem by removing the "optimization" (in quotes
because there is no evidence that it was a useful optimization in
practice).
Ben Pfaff [Tue, 28 Apr 2009 17:28:03 +0000 (10:28 -0700)]
classifier: Make classifier_for_each() easier to use.
classifier_for_each() and classifier_for_each_match() previously had the
restriction that the callback could not delete any rule that would be
visited in the same call, even if it was in a different table (except for
the rule actually passed to the callback). But a number of callers do
want to delete rules in other tables, and it is easy to eliminate that
restriction, so this commit does so.
Ben Pfaff [Wed, 29 Apr 2009 22:28:43 +0000 (15:28 -0700)]
secchan: Reduce redundancy in handle_odp_msg().
Code cleanup.
Ben Pfaff [Mon, 27 Apr 2009 21:28:54 +0000 (14:28 -0700)]
secchan: Fix OpenFlow matching on output port with OFPFC_DELETE.
Matching on out_port was only half-implemented for
OFPFC_DELETE. It was probably just overlooked. This commit fixes it, by
supplying the other half.
Ben Pfaff [Wed, 29 Apr 2009 22:27:46 +0000 (15:27 -0700)]
secchan: Update byte, packet counts for packets switched by hand.
Sometimes packets can get passed down to userspace, in which case secchan
has to send them using dpif_execute(). When this happened we weren't
updating the packet or byte counters. Fix this.
Ben Pfaff [Fri, 1 May 2009 17:20:21 +0000 (10:20 -0700)]
datapath: Eliminate synchronize_rcu() in table swap.
We found out some time ago that synchronize_rcu() can block for multiple
seconds in some cases, so it's a good idea to eliminate as many of them
as we can.
This commit eliminates a call to synchronize_rcu() from functions that
expand or flush flow tables. To avoid adding a member to dp_table that
specifies the "free_flows" argument to dp_table_destroy(), the commit
uses two different callback functions and manually inlines dp_table_swap()
into its callers.
Bug #1233.
Ben Pfaff [Fri, 1 May 2009 17:20:35 +0000 (10:20 -0700)]
datapath: Eliminate synchronize_rcu() in port group update.
We found out some time ago that synchronize_rcu() can block for multiple
seconds in some cases, so it's a good idea to eliminate as many of them
as we can.
This removes such a call in set_port_group(). This requires adding an
rcu_head to the data structure for port groups; since until now we've just
used the same "struct odp_port_group" exported to userspace, this means
that we need to introduce a new "struct dp_port_group" for internal use,
which in turn causes a fair bit of code motion.
Bug #1233.
Ben Pfaff [Thu, 30 Apr 2009 19:57:20 +0000 (12:57 -0700)]
datapath: Fix memory leak in port group.
When we destroy a datapath, we need to free its port groups also.
This is a fairly small memory leak: a few hundred bytes, at most, and
it only occurred each time a datapath was destroyed.
Ben Pfaff [Thu, 30 Apr 2009 21:42:38 +0000 (14:42 -0700)]
leak-checker: Stop logging after fstat() fails.
When fstat() fails, we should either stop logging or re-set the leak
checker hooks. The previous code didn't do either.
Ben Pfaff [Thu, 30 Apr 2009 21:45:30 +0000 (14:45 -0700)]
leak-checker: Stop logging after an output error.
In particular it's not polite to continue trying to write to the output
file after ENOSPC, since it will immediately fill up the disk again should
anyone free up any space, and keeping the log file open prevents usefully
deleting it.
Ben Pfaff [Thu, 30 Apr 2009 21:37:19 +0000 (14:37 -0700)]
leak-checker: Make output line-buffered.
Unbuffered output was ideal from the viewpoint of getting the maximum
amount of output when the process was killed, but it causes a dozen or
more system calls per log entry. Line buffering should be a reasonable
compromise.
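In standard C, switching a stream to line buffering is a single setvbuf() call. The wrapper name below is hypothetical, and this is a sketch of the technique rather than the leak checker's own code.

```c
#include <stdio.h>

/* Make 'stream' line-buffered, so that output is flushed at each newline
 * instead of on every write.  Must be called before any other operation
 * on the stream.  Returns 0 on success, nonzero on failure. */
int
log_set_line_buffered(FILE *stream)
{
    return setvbuf(stream, NULL, _IOLBF, BUFSIZ);
}
```

With line buffering, each complete log line typically still reaches the file as soon as it is written, but the many small writes that make up one entry can be coalesced into a single system call.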
Ben Pfaff [Thu, 30 Apr 2009 21:00:06 +0000 (14:00 -0700)]
brcompatd: Fix formatting of /proc/net/vlan files.
The C source file that I copped this from originally didn't spell out the
tab as \t, thus it was very difficult to see.
Thanks to Justin for pointing this out.
Ben Pfaff [Thu, 30 Apr 2009 20:59:08 +0000 (13:59 -0700)]
brcompat: Use named macro in place of literal constants.
Thanks to Justin for the suggestion.
Ben Pfaff [Thu, 30 Apr 2009 20:46:44 +0000 (13:46 -0700)]
datapath: Break up GSO packets before sending to userspace.
On a Xen host, over-MTU GSO packets from virtual machines can end up sent
down by the virtual switch to userspace. This happens, for example, if a
TCP flow has a long enough "pause" that the datapath flow times out. When
this happens, the packet is not marked as GSO when secchan sends it back up
via dpif_execute(), and the packet is then discarded in dp_xmit_skb() as
too large.
This commit solves the problem by breaking GSO packets into MTU-size pieces
before passing them along to userspace.
Tested by running "netperf" between two VMs on different boxes and running
"dpctl dp-del-flows" on the appropriate datapath a few times in the middle
and seeing that the total bandwidth didn't change much. Verified that
packets were actually being broken up by adding a printk call inside the
"if (skb_is_gso())" block.
Thanks to Justin and Keith for review.
Bug #1133.
Justin Pettit [Wed, 29 Apr 2009 22:43:57 +0000 (15:43 -0700)]
Fix policing performance issues with VIFs.
Policing is configured with the "tc" command. By default, it picks up
the MTU from the interface having policy applied. When a guest operating
systems is configured for segmentation offloading, the packets handed to
DOM0 may be substantially larger than the MTU. The policing code was
dropping these packets, which caused performance to dive. We now
configure policing with an MTU of 64K, which solves the problem.
Thanks to Ben for diagnosing the problem.