Ben Pfaff [Fri, 15 Apr 2011 16:31:36 +0000 (09:31 -0700)]
Fix calls to ctype functions.
The ctype functions often need casts to be fully C standards compliant.
Here's the full explanation that I used to post to comp.lang.c from time
to time when the issue came up:
With the to*() and is*() functions, you should be careful to cast
`char' arguments to `unsigned char' before calling them. Type `char'
may be signed or unsigned, depending on your compiler or its
configuration. If `char' is signed, then some characters have
negative values; however, the arguments to is*() and to*() functions
must be nonnegative (or EOF). Casting to `unsigned char' fixes this
problem by forcing the character to the corresponding positive value.
This fixes the following warnings from some version of GCC:
lib/ofp-parse.c:828: warning: array subscript has type 'char'
lib/ofp-print.c:617: warning: array subscript has type 'char'
Reported-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
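As a minimal sketch of the cast described in the ctype commit above (the helper names str_to_upper() and count_leading_space() are hypothetical, not taken from the tree), each `char' is converted to `unsigned char' before it reaches a ctype function:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Uppercases 's' in place.  Each character is cast to 'unsigned char'
 * before the call to toupper(), so the argument is never negative even
 * on platforms where plain 'char' is signed. */
void
str_to_upper(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        *s = (char) toupper((unsigned char) *s);
    }
}

/* Returns the number of leading whitespace characters in 's'. */
size_t
count_leading_space(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (isspace((unsigned char) s[n])) {
        n++;
    }
    return n;
}
```

Without the casts, a byte such as 0xe9 in a signed `char' would be passed as a negative int, which is outside the domain of the ctype functions.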
Ethan Jackson [Fri, 15 Apr 2011 18:06:02 +0000 (11:06 -0700)]
bridge: Destroy bond when port is destroyed.
Ethan Jackson [Thu, 14 Apr 2011 23:50:26 +0000 (16:50 -0700)]
bond: Completely pull LACP module out of bond.
The bonding code only needs to know whether a given slave may be
enabled, and whether LACP has been negotiated on the bond. Instead
of passing in the LACP handle and letting the bond query this
information, this patch passes in the information directly.
Ethan Jackson [Fri, 15 Apr 2011 00:37:29 +0000 (17:37 -0700)]
bond: Create new 'stable_id' parameter.
For BM_STABLE bonds, instead of choosing the sort key in the
qsort() comparator, this patch makes it a configuration setting of
each slave. This will help wrest LACP out of the bonding code
further in future patches.
Ethan Jackson [Thu, 14 Apr 2011 00:58:26 +0000 (17:58 -0700)]
bond: Give bridge control over LACP module.
Before this patch, the bonding code had taken over responsibility
for running the LACP module. However, the bonding code only needs
the LACP module for some basic status queries. LACP and bonding
are actually logically parallel modules and do not really have a
parent-child relationship. Furthermore, we need to be able to run
LACP on non-bonded interfaces, which the existing approach
prevented. This patch gives control of the LACP module back to the
bridge.
Ethan Jackson [Thu, 14 Apr 2011 22:24:18 +0000 (15:24 -0700)]
lacp: Remove enabled flag.
The enabled flag in the LACP module was only used to set the
Collecting and Distributing flags in the LACP protocol. It was
intended to be set by the bonding code to mimic its enabled flag.
The spec is relatively vague on the precise meaning of these flags,
and most implementations do something completely different with
them. For these reasons, it seems acceptable to remove the enabled
flag for the sake of simplicity. A slave is now Collecting and
Distributing if it is attached, or LACP couldn't be negotiated.
Ben Pfaff [Thu, 14 Apr 2011 17:22:21 +0000 (10:22 -0700)]
vswitchd: Document how to disable inactivity probes.
This has always been implemented but it was not documented until now.
Reported-by: Alex Yip <alex@nicira.com>
Ethan Jackson [Tue, 12 Apr 2011 21:15:46 +0000 (14:15 -0700)]
bond: New bonding mode "stable".
Stable bonds attempt to assign a given flow to the same slave
consistently.
Ethan Jackson [Tue, 12 Apr 2011 20:39:32 +0000 (13:39 -0700)]
bond: New function bond_is_balanced().
As new bond modes are added, it will be nice to have the logic
indicating whether or not a given bond mode requires rebalancing in
one place.
Ethan Jackson [Wed, 13 Apr 2011 21:55:19 +0000 (14:55 -0700)]
lacp: New function lacp_slave_get_port_id().
Will be used in future commits.
Ethan Jackson [Tue, 12 Apr 2011 22:15:32 +0000 (15:15 -0700)]
bond: Use bond_enable_slave at slave registration.
Slave registration should go through the normal slave enabling
facilities instead of doing it by hand. Before this patch, newly
created slaves would have no tag associated with them.
Furthermore, any further changes to how slaves are enabled would
not be picked up by the registration code.
Ethan Jackson [Wed, 13 Apr 2011 20:56:37 +0000 (13:56 -0700)]
bond: Reset bond_entry's during massive flow revalidations.
When all flows in a bond are revalidated, stale bond_entry's can
cause incorrect load balancing. These issues will naturally
resolve themselves over time. However, it's better to deal with
them immediately.
Ethan Jackson [Wed, 13 Apr 2011 01:28:04 +0000 (18:28 -0700)]
bond: Revalidate flows when bond_is_tcp_hash() changes.
If LACP causes the return of bond_is_tcp_hash to change for
whatever reason, all flows should be revalidated because they will
have a different hash result.
Ethan Jackson [Wed, 13 Apr 2011 00:53:24 +0000 (17:53 -0700)]
bond: Reconfigure flows when bond mode changes.
Changes in the bonding mode can cause drastic changes in flow
assignments to slaves. This commit causes all flows in a bridge
to be revalidated when bond_reconfigure() changes its bonding mode.
This approach is a bit aggressive, but bond reconfiguration
shouldn't happen often.
Ben Pfaff [Tue, 12 Apr 2011 18:43:11 +0000 (11:43 -0700)]
configure: Add option --enable-Werror to add -Werror to CFLAGS.
-Werror is useful for development, but it screws up configure because it's
impossible to guess what new warnings compilers will add in the future.
This commit adds a new configure option to add CFLAGS after the configure
checks are done.
The use of AC_CONFIG_COMMANDS_PRE is based on Eric Blake's suggestion on
the autoconf mailing list: "AC_CONFIG_COMMANDS_PRE probably fits the bill
as the ideal macro to use for guaranteeing that you inject your shell code
at the last possible moment."
Requested-by: Andrew Evans <aevans@nicira.com>
Ethan Jackson [Tue, 12 Apr 2011 20:20:13 +0000 (13:20 -0700)]
xenserver: Fix typo in RPM install message.
Ethan Jackson [Tue, 12 Apr 2011 01:33:06 +0000 (18:33 -0700)]
xenserver: Don't openvswitch-xapi-update in bridge mode.
This commit causes the init scripts not to call the
openvswitch-cfg-update plugin when in bridge mode.
Ethan Jackson [Tue, 12 Apr 2011 01:29:02 +0000 (18:29 -0700)]
xenserver: Warn when upgrading OVS on a bridged system.
Ben Pfaff [Tue, 12 Apr 2011 18:31:58 +0000 (11:31 -0700)]
ovsdb-idl: Suppress "delete" operations for garbage-collected tables.
Deciding what delete operations to issue on garbage-collected tables has
been a bit of a difficult issue for ovs-vsctl. When garbage collection was
introduced in commit c5f341a "ovsdb: Implement garbage collection",
ovs-vsctl did not issue any deletions for these tables at all. As a side
effect, ovs-vsctl did not notice that records were going to be deleted.
That meant that when multiple commands were issued in one ovs-vsctl run,
ovs-vsctl could get confused by apparent duplicate records that did not
in fact exist. Commit 28a14bf "ovs-vsctl: Back out garbage collection
changes" fixed the problem by putting all of the explicit deletions back
into ovs-vsctl.
However, adding these explicit deletions had the price that it then became
(again) impossible to use ovs-vsctl commands to delete duplicates, for
example to use "ovs-vsctl del-br" to delete a bridge that points to the
same Port records that some other Bridge record also does. This commit
makes that possible again, by implementing a compromise:
* Internally, ovs-vsctl deletes the records that it believes should be
deleted.
* ovsdb-idl suppresses the deletions when it makes the RPC call into
the database server.
Bug #5358.
Reported-by: Henrik Amren <henrik@nicira.com>
Ethan Jackson [Mon, 11 Apr 2011 23:17:39 +0000 (16:17 -0700)]
tests: Unit test autopath via ovs-ofctl.
This patch adds a test designed to verify the correctness of the
parsing function introduced with the autopath action.
Andrew Evans [Tue, 12 Apr 2011 17:40:15 +0000 (10:40 -0700)]
pcap: Silence warnings about fwrite(3) return value being ignored.
Ben Pfaff [Tue, 12 Apr 2011 17:02:40 +0000 (10:02 -0700)]
debian: Do not call obsolete command "ovs-ofctl status" in ovs-bugtool.
This command was removed in commit 9b45d7f5d (ofproto: Get rid of archaic
"switch status" OpenFlow extension) but I didn't notice that ovs-bugtool
uses that command and forgot to remove it at the time.
Bug #5360.
Reported-by: Michael Mao <mmao@nicira.com>
Reported-by: Keith Amidon <keith@nicira.com>
Justin Pettit [Wed, 6 Apr 2011 05:17:03 +0000 (22:17 -0700)]
Release Open vSwitch 1.1.0
Ethan Jackson [Tue, 5 Apr 2011 19:37:52 +0000 (12:37 -0700)]
autopath: Create the autopath action.
The newly created autopath action will be the way OpenFlow
interacts with the existing bonding infrastructure.
Ben Pfaff [Fri, 8 Apr 2011 23:38:42 +0000 (16:38 -0700)]
dpif-linux: Avoid logging error on ENOENT in dpif_linux_is_internal_device().
ENOENT can be returned if the kernel module isn't loaded. If that's the
case then we've already logged that and there's no point in logging it
again.
Ben Pfaff [Fri, 8 Apr 2011 23:37:22 +0000 (16:37 -0700)]
dpif-linux: Avoid segfault on netdev_get_stats() without kernel module.
netdev_linux_get_stats() calls into netdev_vport_get_stats(), which in
turn attempts a transaction on genl_sock. If the kernel module isn't
loaded, then genl_sock won't be there, and in any case there's nothing that
guarantees that it's been initialized yet.
This fixes the problem by ensuring that dpif_linux was initialized properly
before attempting a transaction on genl_sock.
Reported-by: Aaron Rosen <arosen@clemson.edu>
Ben Pfaff [Fri, 8 Apr 2011 23:34:17 +0000 (16:34 -0700)]
netdev-linux: Fix netdev_send() to tap device.
Commit 76c308b50d3 "netdev-linux: Support 'send' for netdevs opened with
NETDEV_ETH_TYPE_NONE" broke sending packets to tap devices. Sending a
packet to a tap device with an AF_PACKET socket causes that packet to be
looped back to be received on the tap device again, which obviously isn't
useful.
Ben Pfaff [Fri, 8 Apr 2011 21:23:13 +0000 (14:23 -0700)]
netdev-linux: Fix blocking while sending packets.
The AF_PACKET socket needs to be in nonblocking mode or trying to send
a packet can take a long time.
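A minimal sketch of putting a descriptor into nonblocking mode with fcntl(), as the commit above describes for the AF_PACKET socket. The helper name set_nonblocking() is an assumption for illustration and is not claimed to be the function the commit uses:

```c
#include <assert.h>
#include <fcntl.h>

/* Puts 'fd' into nonblocking mode while preserving its other status
 * flags.  Returns 0 on success, -1 on failure with errno set by
 * fcntl().  (Hypothetical helper for illustration.) */
int
set_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}
```

Once O_NONBLOCK is set, a send that cannot complete immediately fails with EAGAIN instead of stalling the caller.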
Ben Pfaff [Fri, 8 Apr 2011 23:44:31 +0000 (16:44 -0700)]
netdev-linux: Avoid "cleverness" in swap_uint64().
Obviously correct code is easier on everyone. As the C FAQ says:
20.15c: How can I swap two values without using a temporary?
A: The standard hoary old assembly language programmer's trick is:
a ^= b;
b ^= a;
a ^= b;
But this sort of code has little place in modern, HLL
programming. Temporary variables are essentially free,
and the idiomatic code using three assignments, namely
int t = a;
a = b;
b = t;
is not only clearer to the human reader, it is more likely to be
recognized by the compiler and turned into the most-efficient
code (e.g. using a swap instruction, if available). The latter
code is obviously also amenable to use with pointers and
floating-point values, unlike the XOR trick. See also questions
3.3b and 10.3.
Ben Pfaff [Wed, 30 Mar 2011 00:05:52 +0000 (17:05 -0700)]
bridge: Monitor fewer OVSDB columns.
By omitting columns that ovs-vswitchd does not use at all, and omitting
alerts for columns that ovs-vswitchd writes to but does not read, we can
save CPU time and bandwidth.
Ben Pfaff [Fri, 1 Apr 2011 17:50:52 +0000 (10:50 -0700)]
ovsdb-idl: Fix atomicity of writes that don't change a column's value.
The existing ovsdb_idl_txn_commit() drops any writes that don't change a
column's value from what was last reported by the database. But this isn't
a valid optimization, because it breaks the atomicity of transactions.
Suppose columns A and B initially have values 1 and 2. Client 1 writes
value 1 to both columns in one transaction. Client 2 writes value 2 to
both columns in another transaction. The only possible valid results for
any serial ordering of transactions are 1,1 or 2,2. But if both clients
drop writes to columns that they have not modified, then 2,1 also becomes
possible (because client 1 just writes to B and client 2 just writes to A).
However, for write-only columns we can optimize this out because the IDL
can assume it is the only client writing to a column.
Found by inspection.
Andrew Evans [Thu, 7 Apr 2011 19:43:18 +0000 (19:43 +0000)]
datapath: Update netdev_frame_hook() for 2.6.39 rx handler API change.
netdev_rx_handler_register() changed the type of the skb argument to the
callback function as well as the return type. Special-case
netdev_frame_hook() to do the right thing on 2.6.39 and later.
Signed-off-by: Andrew Evans <aevans@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Ethan Jackson [Thu, 7 Apr 2011 00:23:40 +0000 (17:23 -0700)]
cfm: Fix broken fault logic.
If the last receive time for a remote MP was before the last fault
check, the CFM code would not declare a fault. This is, of course,
exactly the wrong response.
Bug #5303.
Ethan Jackson [Wed, 6 Apr 2011 23:59:22 +0000 (16:59 -0700)]
bridge: Run once before configuring CFM.
CFM configuration requires the ofproto_run function to have been
executed at least once in order to guarantee that the relevant
ports exist.
Bug #5303.
Ben Pfaff [Wed, 6 Apr 2011 16:15:58 +0000 (09:15 -0700)]
Update top-level documentation to bring it up to date with latest features.
Ethan Jackson [Mon, 4 Apr 2011 23:55:34 +0000 (16:55 -0700)]
dpif-linux: Choose port numbers more prudently.
Before this patch the kernel chose the lowest available number for
newly created datapath ports. This patch moves the port number
choosing responsibility to user space, and implements a least
recently used port number queue in an attempt to avoid reuse.
Bug #2140.
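One way to picture a least-recently-used port number queue is a FIFO ring of free numbers. The sketch below is an assumption about the general idea, not the code from this commit; N_PORTS and the port_pool_*() names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define N_PORTS 8               /* Small pool size, for illustration only. */

/* FIFO ring of free port numbers.  Allocation takes the number at the
 * head and release appends at the tail, so the most recently freed
 * number is the last one to be handed out again. */
struct port_pool {
    int free[N_PORTS];
    size_t head;                /* Index of the oldest free entry. */
    size_t n_free;              /* Number of free entries. */
};

void
port_pool_init(struct port_pool *pool)
{
    size_t i;

    assert(pool != NULL);
    for (i = 0; i < N_PORTS; i++) {
        pool->free[i] = (int) i;
    }
    pool->head = 0;
    pool->n_free = N_PORTS;
}

/* Returns the least recently used free port number, or -1 if none. */
int
port_pool_alloc(struct port_pool *pool)
{
    int port;

    assert(pool != NULL);
    if (pool->n_free == 0) {
        return -1;
    }
    port = pool->free[pool->head];
    pool->head = (pool->head + 1) % N_PORTS;
    pool->n_free--;
    return port;
}

/* Returns 'port' to the pool, behind every number that is already free. */
void
port_pool_release(struct port_pool *pool, int port)
{
    assert(pool != NULL && pool->n_free < N_PORTS);
    pool->free[(pool->head + pool->n_free) % N_PORTS] = port;
    pool->n_free++;
}
```

With this ordering, a port that is destroyed and quickly re-created gets a different number unless the pool is nearly exhausted.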
Ethan Jackson [Sat, 2 Apr 2011 00:37:56 +0000 (17:37 -0700)]
bond: Choose slaves randomly.
When the bonding library encounters a flow it hasn't seen before,
it assigns it to the active slave and waits for load balancing to
move it to a more appropriate place. This commit causes it to
first attempt a random slave.
Ben Pfaff [Mon, 4 Apr 2011 17:59:19 +0000 (10:59 -0700)]
daemon: Avoid races on pidfile creation.
Until now, if two copies of one OVS daemon started up at the same time,
then due to races in pidfile creation it was possible for both of them to
start successfully, instead of just one. This was made worse when a
previous copy of the daemon had died abruptly, leaving a stale pidfile.
This commit implements a new pidfile creation and removal protocol that I
believe closes these races. Now, a pidfile is asserted with "link" instead
of "rename", which prevents the race on creation, and a stale pidfile may
only be deleted by a process after it has taken a lock on it.
This may solve mysterious problems seen occasionally on vswitch restart.
I'm still puzzled by these problems, however, because I don't see anything
in our test cases that would actually cause two copies of a daemon to
start at the same time, which as far as I can see is a necessary
precondition for the problem.
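The creation half of the protocol above can be sketched as follows. This is an assumption-laden illustration: claim_pidfile() and its scratch-file argument are hypothetical, and the locking step for stale pidfiles is not shown.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Atomically creates 'pidfile' containing this process's PID.
 * 'tmp_name' is a scratch file name in the same directory.  Returns 0
 * on success, otherwise a positive errno value; EEXIST means that
 * some other process already holds the pidfile.
 *
 * link() fails if the destination exists, whereas rename() would
 * silently replace it, so only one of two racing processes can win. */
int
claim_pidfile(const char *pidfile, const char *tmp_name)
{
    FILE *stream;
    int error = 0;

    assert(pidfile != NULL && tmp_name != NULL);
    stream = fopen(tmp_name, "w");
    if (!stream) {
        return errno;
    }
    fprintf(stream, "%ld\n", (long int) getpid());
    if (fclose(stream) == EOF) {
        error = errno;
    } else if (link(tmp_name, pidfile) == -1) {
        error = errno;
    }
    unlink(tmp_name);
    return error;
}
```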
Ben Pfaff [Thu, 31 Mar 2011 16:44:30 +0000 (09:44 -0700)]
daemon: Integrate checking for an existing pidfile into daemonize_start().
Until now, it has been the responsibility of an individual daemon to call
die_if_already_running() at an appropriate time. A long time ago, this
had to happen *before* daemonizing, because once the process daemonized
itself there was no way to report failure to the process that originally
started the daemon. With the introduction of daemonize_start(), this is
now possible, but we haven't been taking advantage of it.
Therefore, this commit integrates the die_if_already_running() call into
daemonize_start() and deletes the calls to it from individual daemons.
Ben Pfaff [Thu, 31 Mar 2011 16:36:10 +0000 (09:36 -0700)]
daemon: Tolerate EINTR in fork_and_wait_for_startup().
It seems possible that a signal coming in at the wrong time could confuse
this code. It's always best to loop on EINTR.
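The usual shape of an EINTR loop, shown here around read() as a generic sketch. The wrapper name read_retry_eintr() is hypothetical and is not claimed to be what fork_and_wait_for_startup() does internally:

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Like read(), but retries if the call is interrupted by a signal
 * before any data is transferred. */
ssize_t
read_retry_eintr(int fd, void *buf, size_t size)
{
    ssize_t n;

    assert(buf != NULL);
    do {
        n = read(fd, buf, size);
    } while (n == -1 && errno == EINTR);
    return n;
}
```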
Ben Pfaff [Thu, 31 Mar 2011 23:23:50 +0000 (16:23 -0700)]
Log anything that could prevent a daemon from starting.
If a daemon doesn't start, we need to know why. Being able to
consistently consult the log to find out is helpful.
Ben Pfaff [Thu, 31 Mar 2011 21:50:58 +0000 (14:50 -0700)]
util: New function ovs_fatal_valist().
This commit adds a few initial users but more are coming up.
Ben Pfaff [Fri, 1 Apr 2011 17:22:51 +0000 (10:22 -0700)]
signals: New function signal_name().
This will acquire a new user in an upcoming commit.
Ben Pfaff [Fri, 1 Apr 2011 17:20:17 +0000 (10:20 -0700)]
type-props: New macro for estimating length of a decimal integer.
Ben Pfaff [Thu, 31 Mar 2011 21:50:20 +0000 (14:50 -0700)]
Add a few more users for ovs_retval_to_string().
Ben Pfaff [Thu, 31 Mar 2011 21:58:37 +0000 (14:58 -0700)]
vlog: Use PRINTF_FORMAT macro from compiler.h.
PRINTF_FORMAT is more portable than raw __attribute__. We use it
elsewhere; I don't know why it wasn't used here.
Ben Pfaff [Thu, 31 Mar 2011 21:52:36 +0000 (14:52 -0700)]
stream-ssl: Use out_of_memory() to abort due to lack of memory.
This matches what xmalloc() does. It will be handled better by a monitor
process (created with --monitor), which will restart the child instead of
exiting.
Ben Pfaff [Fri, 1 Apr 2011 20:47:51 +0000 (13:47 -0700)]
xenserver: Fix up iface-id after it changes or disappears too.
ovs-xapi-sync is supposed to always keep external-ids:iface-id up to date,
but in fact it would only set it when an interface initially appeared. If
the interface quickly disappeared and reappeared, then it failed to notice
that iface-id had changed or disappeared. This happens in practice on
Citrix XenServer, where VM "tap" devices often disappear and then reappear
almost immediately during VM boot. This commit fixes the problem.
This also fixes the similar problem for external-ids:bridge-id in Bridge
records. Bridges aren't ordinarily destroyed and re-created quickly, so
this problem might never have manifested in practice for bridges.
Many thanks to Reid Price <reid@nicira.com> for identifying the problem
and supplying an initial fix.
Bug #5239.
Reported-by: Henrik Amren <henrik@nicira.com>
Ben Pfaff [Thu, 24 Mar 2011 19:46:08 +0000 (12:46 -0700)]
bridge: Avoid partitioning the dst set.
Scanning the dsts twice may be a little more efficient than
partitioning it, and it now seems more straightforward to me.
Ben Pfaff [Thu, 24 Mar 2011 19:51:14 +0000 (12:51 -0700)]
bridge: Separate mirroring logic from forwarding logic.
In my opinion this is easier to understand than the way that these
two logically separate steps were previously entangled.
Ben Pfaff [Thu, 24 Mar 2011 19:30:51 +0000 (12:30 -0700)]
bridge: Change "struct dst" from containing a dp_ifidx to a struct iface *.
The following commit will need to iterate over a set of "struct
dst"s, obtaining the iface for each. It could look them up using
the hash table that indexes over dp_ifidx, but it's easier if we
simply store the iface pointer directly.
Ben Pfaff [Thu, 24 Mar 2011 17:29:36 +0000 (10:29 -0700)]
bridge: Get rid of 'n_ifaces' member of struct port.
If it doesn't exist then it can't have the wrong value.
Ben Pfaff [Wed, 30 Mar 2011 18:03:16 +0000 (11:03 -0700)]
bridge: Break bonding implementation out into library.
This removes over 1000 lines of code from bridge.c and will make it
easier to move the bonding implementation into ofproto as part of
future development.
Ben Pfaff [Wed, 23 Mar 2011 17:47:15 +0000 (10:47 -0700)]
bridge: Simplify and clean up bond slave enable/disable.
The code that enables and disables bond slaves was a bit of a mess:
* Disabling a slave could recursively enable a different slave.
* Processing a flow could enable a slave.
This commit gets rid of both of those properties, which made it difficult
to reason about the code paths along which slaves would be enabled and
disabled.
Bug #5121.
Ben Pfaff [Thu, 24 Mar 2011 19:46:26 +0000 (12:46 -0700)]
bridge: Drop obsolete comment.
It's quite clear that we don't support double tagging now.
Ben Pfaff [Thu, 24 Mar 2011 18:11:32 +0000 (11:11 -0700)]
bridge: Improve comment.
Ben Pfaff [Tue, 29 Mar 2011 18:33:10 +0000 (11:33 -0700)]
bridge: Change Ethernet address array from 8 bytes to ETH_ADDR_LEN bytes.
I don't know why this was declared as 8 bytes long but I only see 6
actually in use, as one would expect of an Ethernet address.
Ben Pfaff [Tue, 29 Mar 2011 17:16:43 +0000 (10:16 -0700)]
bridge: Avoid redundant dpif_flow_flush().
ofproto_create() also calls dpif_flow_flush() very soon afterward. This
seems more clearly in ofproto's domain anyhow.
Ben Pfaff [Mon, 21 Mar 2011 21:30:33 +0000 (14:30 -0700)]
lacp: Encapsulate configuration into new structs.
This makes it easier to pass configuration between modules.
Ben Pfaff [Mon, 21 Mar 2011 19:49:44 +0000 (12:49 -0700)]
bridge: Drop LACP configuration members from struct iface and struct port.
There's no reason that I can see to maintain this information in struct
port and struct iface. It's redundant, since the lacp implementation
maintains the same information.
Ben Pfaff [Mon, 21 Mar 2011 20:15:53 +0000 (13:15 -0700)]
lacp: Remove unneeded forward references from header file.
Ben Pfaff [Mon, 21 Mar 2011 20:15:31 +0000 (13:15 -0700)]
lacp: Fix misleading prototype for lacp_configure().
Only the first 6 bytes (ETH_ADDR_LEN) of the 'sys_id' argument are used,
but the prototype declared it as an array of 8 bytes. This has no effect
on the generated code--the declared size of an array parameter is
irrelevant--but it is misleading.
Also, add 'const' since the array is not modified.
Ben Pfaff [Tue, 29 Mar 2011 16:30:04 +0000 (09:30 -0700)]
vswitch: Improve schema documentation.
Ben Pfaff [Thu, 24 Mar 2011 18:00:39 +0000 (11:00 -0700)]
tag: New function tag_set_union().
Ben Pfaff [Thu, 24 Mar 2011 16:40:07 +0000 (09:40 -0700)]
list: New functions list_is_singleton(), list_is_short().
Ben Pfaff [Tue, 29 Mar 2011 16:28:49 +0000 (09:28 -0700)]
packets: Reserve headroom for VLAN header in eth_compose(), snap_compose().
This allows callers to add a VLAN header to the composed packet and send
it out on a VLAN without copying the whole payload.
Ben Pfaff [Tue, 29 Mar 2011 16:27:47 +0000 (09:27 -0700)]
packets: New function eth_set_vlan_tci(), from dpif-netdev.
This will soon be used in the upcoming bond library.
Ben Pfaff [Thu, 24 Mar 2011 20:35:15 +0000 (13:35 -0700)]
packets: Fix potential use-after-free in compose_benign_packet().
The second call to ofpbuf_put_zeros() could cause the 'eth' pointer to
be invalidated.
It appears that this does not fix a real bug because the existing callers
all preallocate 128 bytes of tailroom, but the interface doesn't document
that requirement.
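The hazard described above is the general one of holding a pointer into a buffer that may be reallocated. A minimal sketch with a hypothetical growable buffer (struct buf and buf_put_zeros() stand in for ofpbuf and ofpbuf_put_zeros(); they are not the real API) keeps an offset and derives the pointer only after the last append:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer. */
struct buf {
    char *data;
    size_t size;
};

/* Appends 'n' zero bytes to 'b' and returns the offset of the first
 * new byte.  The append may move 'b->data', so any pointer obtained
 * earlier must be treated as invalid.  Aborts on allocation failure. */
size_t
buf_put_zeros(struct buf *b, size_t n)
{
    size_t ofs;
    char *p;

    assert(b != NULL && n > 0);
    ofs = b->size;
    p = realloc(b->data, ofs + n);
    if (!p) {
        abort();
    }
    memset(p + ofs, 0, n);
    b->data = p;
    b->size = ofs + n;
    return ofs;
}

/* Appends a 14-byte header and 'payload_len' bytes of payload, then
 * fills in the first header byte.  The header is addressed through
 * its offset, looked up after both appends have completed. */
void
compose(struct buf *b, size_t payload_len)
{
    size_t hdr_ofs = buf_put_zeros(b, 14);

    buf_put_zeros(b, payload_len);
    b->data[hdr_ofs] = 0x01;
}
```

Saving `b->data + hdr_ofs' before the second append instead would leave a dangling pointer whenever the reallocation moves the block.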
Ben Pfaff [Thu, 24 Mar 2011 20:34:05 +0000 (13:34 -0700)]
packets: New function snap_compose(); rename compose_packet() for consistency.
The following commit will introduce the first use of snap_compose().
Ben Pfaff [Fri, 1 Apr 2011 17:17:52 +0000 (10:17 -0700)]
netdev-vport: Implement 'send' function.
The new implementation of the bonding code expects to be able to send
packets on netdevs using netdev_send(). This implements it.
Ben Pfaff [Fri, 1 Apr 2011 16:22:39 +0000 (09:22 -0700)]
netdev-linux: Support 'send' for netdevs opened with NETDEV_ETH_TYPE_NONE.
The new implementation of the bonding code expects to be able to
send packets using netdev_send(). This makes it possible for
Linux netdevs.
Ben Pfaff [Fri, 1 Apr 2011 22:46:22 +0000 (15:46 -0700)]
ovsdb-server: Avoid intermittent test failures due to lockfile log message.
Sometimes lockfile will emit a message saying that it took a little while
to get the lock, which caused spurious test failures. This commit
suppresses the message. With this change, I was able to run these tests
continuously for some time without failures.
This was a bug in the testsuite, not in the code under test.
Ethan Jackson [Fri, 1 Apr 2011 20:10:49 +0000 (13:10 -0700)]
cfm: Allow time for CCM reception after cfm_configure().
Before this (and the previous) patch, whenever cfm_configure was
called it would set the fault_timer to expired. Thus, the next
call to cfm_run would notice a lack of CCM reception and trigger a
faulted status. This is a bug in and of itself, but normally would
not be a big deal because cfm_configure should only be called
infrequently (when the database changes). However, due to an
unrelated bug, cfm_configure() was getting called approximately once
per second. This resulted in all monitors showing faults all of
the time.
This patch fixes the problem by not expiring the timer at
cfm_configure(). Instead it gives it the appropriate
fault_interval amount of time to miss heartbeats.
Bug #5244.
Ethan Jackson [Fri, 1 Apr 2011 20:22:44 +0000 (13:22 -0700)]
cfm: cfm_configure() only update when necessary.
Calling cfm_configure often could cause timers to be reset
resulting in unexpected behavior. This commit only updates when
cfm configuration actually changed.
Bug #5244.
Ben Pfaff [Mon, 28 Mar 2011 20:05:40 +0000 (13:05 -0700)]
ovsdb: Truncate bad transactions from database log.
When ovsdb-server reads a database file that is corrupted at the
transaction level (that is, the transaction is valid JSON and has the
correct SHA-1 hash, but it does not describe a valid database transaction),
then ovsdb-server should truncate it and overwrite it by valid
transactions. However, until now, it didn't. Instead, it would keep the
invalid transaction and possibly every transaction in the database file
(depending on in what way the transaction was invalid), which would just
cause the same trouble again the next time the database was read.
This fixes the problem. An invalid transaction will be deleted from the
database file at the first write to the database.
Bug #5144.
Bug #5149.
Ben Pfaff [Mon, 28 Mar 2011 19:57:20 +0000 (12:57 -0700)]
ovsdb: Check that ovsdb-server truncates corrupted database logs.
When ovsdb-server reads a database that is corrupted at the log level
(that is, when ovsdb_log detects the corruption by checking the SHA-1 hash
of the record or JSON parser error reporting), then writing to the database
should discard the corrupted data and thereby fix the problem for future
ovsdb-server runs.
This already worked OK. This just adds an extra test.
Ben Pfaff [Mon, 28 Mar 2011 19:59:18 +0000 (12:59 -0700)]
ovsdb: Raise database corruption log level from warning to error.
If there's database corruption then it indicates that something went wrong,
e.g. the machine was powered off by a power failure. It's definitely
something that the admin should know about. This sounds like an error to
me, so use that log level.
Ben Pfaff [Thu, 31 Mar 2011 23:43:43 +0000 (16:43 -0700)]
ovsdb: Force strong references to non-root tables to be persistent.
When a strong reference to a non-root table is ephemeral, the database log
can contain inconsistencies. In particular, if the column in question is
the only reference to a row, then the row will be created in one logged
transaction but the reference to it will not be logged (because it is
ephemeral). Thus, any later occurrence of the row in the log (to
modify it, to delete it, or just to reference it) will yield a transaction
error and reading the database will abort at that point.
This commit fixes the problem by forcing any column with a strong reference
to a non-root table to be persistent.
The change to ovsdb_schema_from_json() looks bigger than it really is: it
just swaps the order of two operations on the schema and updates their
comments. Similarly for the update to ovs.db.DbSchema.__init__().
Bug #5144.
Reported-by: Sujatha Sumanth <ssumanth@nicira.com>
Bug #5149.
Reported-by: Ram Jothikumar <rjothikumar@nicira.com>
Ben Pfaff [Mon, 28 Mar 2011 17:48:36 +0000 (10:48 -0700)]
ovsdb-types: Fix bug in ovsdb_base_type_is_ref().
This function only worked properly inside OVSDB itself, because that is
the only place where the 'refTable' member of ovsdb_base_type is set.
Both inside and outside OVSDB, 'refTableName' is set for reference types,
so it's better to check for that.
This doesn't fix any existing bug because this function was only used
inside OVSDB until now.
Ben Pfaff [Fri, 25 Mar 2011 22:21:18 +0000 (15:21 -0700)]
ovs-brcompatd: Convert svecs to ssets.
Ben Pfaff [Fri, 25 Mar 2011 22:15:33 +0000 (15:15 -0700)]
bridge: Convert svecs to ssets.
Ben Pfaff [Fri, 25 Mar 2011 22:11:05 +0000 (15:11 -0700)]
ovs-openflowd: Use sset in place of svec.
Also deletes svec_split() since this was the only user.
Ben Pfaff [Fri, 25 Mar 2011 22:04:12 +0000 (15:04 -0700)]
ofproto: Change string sets in interface from svec to sset.
Ben Pfaff [Fri, 25 Mar 2011 20:20:35 +0000 (13:20 -0700)]
ovsdb-parser: Use sset instead of svec for detecting unused members.
Should be slightly cheaper than sorting a list (O(n) vs. O(n lg n)).
Ben Pfaff [Fri, 25 Mar 2011 20:04:47 +0000 (13:04 -0700)]
netdev: Use sset instead of svec in netdev interface.
Ben Pfaff [Fri, 25 Mar 2011 20:00:13 +0000 (13:00 -0700)]
dpif: Use sset instead of svec in dpif interface.
Ben Pfaff [Fri, 25 Mar 2011 22:26:30 +0000 (15:26 -0700)]
Convert shash users that don't use the 'data' value to sset instead.
In each of the cases converted here, an shash was used simply to maintain
a set of strings, with the shash_nodes' 'data' values set to NULL. This
commit converts them to use sset instead.
Ben Pfaff [Wed, 30 Mar 2011 20:44:10 +0000 (13:44 -0700)]
sset: New data type for a set of strings.
Many uses of "shash" or "svec" data structures really call for a "set of
strings" data type. This commit introduces such a data structure. Later
commits convert inappropriate uses of shash and svec to use sset instead.
Ethan Jackson [Tue, 29 Mar 2011 19:37:49 +0000 (12:37 -0700)]
learning-switch: Remove dead assignment.
Ethan Jackson [Thu, 31 Mar 2011 23:38:56 +0000 (16:38 -0700)]
ovs-ofctl: Remove dead assignment.
Ethan Jackson [Tue, 29 Mar 2011 19:25:17 +0000 (12:25 -0700)]
netdev-linux: Remove dead assignments.
Ethan Jackson [Thu, 31 Mar 2011 20:55:23 +0000 (13:55 -0700)]
ofproto: Use new timer library.
Ethan Jackson [Thu, 31 Mar 2011 20:54:44 +0000 (13:54 -0700)]
cfm: Use new timer library.
Ethan Jackson [Thu, 31 Mar 2011 20:54:15 +0000 (13:54 -0700)]
lacp: Use new timer library.
Ethan Jackson [Thu, 31 Mar 2011 20:46:04 +0000 (13:46 -0700)]
lib: Create new timer library.
Scattered throughout the code base we use long integers to
implement timers. When the result of timer_msec() is greater than
the time stored, we perform some action.
This commit creates a new timer library intended to replace these
manually managed timers. Code using the timer library will be more
obviously correct, and more consistent with other code using the
library.
Ethan Jackson [Thu, 31 Mar 2011 23:12:01 +0000 (16:12 -0700)]
cfm: Fix appctl negative report.
When the cfm module has never received a bad CCM message, it would
report a negative time.
Ben Pfaff [Thu, 31 Mar 2011 21:12:57 +0000 (14:12 -0700)]
bridge: Destroy ofproto before deleting dpif.
Otherwise the ofproto's attempt to flush flows from the dpif will fail with
an error, causing a spurious log message.
Ben Pfaff [Thu, 31 Mar 2011 21:13:36 +0000 (14:13 -0700)]
connmgr: Fix wild pointer dereference in connmgr_broadcast().
Fixes a segfault when fail-open goes into effect.
Ben Pfaff [Thu, 31 Mar 2011 21:11:57 +0000 (14:11 -0700)]
ofproto: Fix order of destruction in ofproto_destroy().
ofproto_flush_flows() calls into the connmgr (via connmgr_flushed()) so
it must be called before destroying the connmgr to avoid a use-after-free
error.
Bug #5231.
Reported-by: Krishna Miriyala <krishna@nicira.com>
Simon Horman [Thu, 31 Mar 2011 07:32:07 +0000 (16:32 +0900)]
datapath: Update for changes in 2.6.39-rc1
Update for flowi4 and ip_route_output_flow() changes
in 2.6.39-rc1.
Signed-off-by: Simon Horman <horms@verge.net.au>
[Jesse: drop redundant unlikely() from IS_ERR()]
Signed-off-by: Jesse Gross <jesse@nicira.com>