Ben Pfaff [Fri, 16 Jul 2010 16:22:23 +0000 (09:22 -0700)]
xenserver: Kill bond slaves' dhclients when bringing up bond
This fixes the converse of the problem addressed by commit
fe19e820
"xenserver: Kill bond master's dhclient when bringing up bond slave". In
that commit's log message, I claimed that the converse was not a problem,
but I was wrong. I must have screwed up in testing, because it really is
a problem. This commit fixes it.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
CC: Dominic Curran <dominic.curran@citrix.com>
Reported-by: Michael Mao <mmao@nicira.com>
Bug #2668.
Jesse Gross [Thu, 15 Jul 2010 20:47:59 +0000 (13:47 -0700)]
doc: Make explicit that ovs-vswitchd is the preferred switch.
Many people who are looking for an OpenFlow switch assume that
the only way to do it is using ovs-openflowd. Sometimes they also
run ovs-vswitchd at the same time. Try to clarify that neither
of these are necessary and ovs-vswitchd can handle OpenFlow by
itself and is the preferred method of doing so.
Jesse Gross [Thu, 15 Jul 2010 03:15:53 +0000 (20:15 -0700)]
datapath: Don't update flow key when applying actions.
Currently the flow key is updated to match an action that is applied
to a packet but these field are never looked at again. Not only is
this a waste of time it also makes optimizations involving caching
the flow key more difficult.
Jesse Gross [Thu, 15 Jul 2010 02:42:57 +0000 (19:42 -0700)]
datapath: Don't set tunnel_id in a function.
We don't need a function to set a variable. In practice it will
almost certainly get inlined but this makes it easier to read.
Jesse Gross [Thu, 15 Jul 2010 01:45:45 +0000 (18:45 -0700)]
gre: Use struct for parsing GRE header.
GRE is a somewhat annoying protocol because the header is variable
length. However, it does have a few fields that are always present
so we can make the parsing seem less magical by using a struct for
those fields instead of building it up field by field.
Jesse Gross [Thu, 15 Jul 2010 01:35:58 +0000 (18:35 -0700)]
gre: Wait for an RCU grace period before freeing port.
We currently remove ports from the GRE hash table and then immediately
free the ports. Since received packets could be using that port this
can lead to a crash (the port has already been detached from the
datapath so this can't happen for transmitted packets). As a result
we need to wait for an RCU grace period to elapse before actually
freeing the port.
In an ideal world we would actually remove the port from the hash
table in a hypothetical gre_detach() function since this is one of
the purposes of detaching. However, we also use the hash table to
look for collisions in the lookup criteria and don't want to allow
two identical ports to exist. It doesn't matter though because we
aren't blocking on the freeing of resources.
Jesse Gross [Thu, 15 Jul 2010 02:59:41 +0000 (19:59 -0700)]
vport: Use DEFINE_PER_CPU instead of dynamically allocating loop counter.
DEFINE_PER_CPU is simpler and faster than alloc_percpu() so use it
for the loop counter, which is already statically defined.
Jesse Gross [Thu, 15 Jul 2010 02:27:18 +0000 (19:27 -0700)]
datapath: Put return type on same line as arguments for functions.
In some places we would put the return type on the same line as
the rest of the function definition and other places we wouldn't.
Reformat everything to match kernel style.
Jesse Gross [Thu, 15 Jul 2010 02:02:10 +0000 (19:02 -0700)]
vport: Use EBUSY to represent already attached device.
We currently use EEXIST to represent both a device that is already
attached and for GRE devices that are the same as another one.
Instead use EBUSY for already attached devices to disambiguate the
two situations.
Jesse Gross [Thu, 15 Jul 2010 02:46:23 +0000 (19:46 -0700)]
datapath: Make checksum offsets unsigned.
The offsets for checksum offsets should always be positive so make
that explicit by using unsigned ints. This helps bug checks that
test if the offsets are greater than their upper limits.
Ben Pfaff [Wed, 14 Jul 2010 23:36:42 +0000 (16:36 -0700)]
ovs-vsctl: Make --help capitalization and spelling more consistent.
Reported-by: Reid Price <reid@nicira.com>
Bug #3175.
Ben Pfaff [Wed, 14 Jul 2010 23:29:33 +0000 (16:29 -0700)]
Tests: Fix nonportable \" escape in printf(1) invocation.
The \" escape is not part of POSIX, but it is a common extension. The
dash shell's built-in "printf" implementation does not include this
extension, which caused the testsuite to be generated incorrectly if it
is used as the default shell (as it is on newer versions of Debian and
Ubuntu).
However, there's no reason to use a backslash here at all, so just omit
it.
Ben Pfaff [Wed, 14 Jul 2010 17:05:53 +0000 (10:05 -0700)]
ofpbuf: Implement unsupported feature in ofpbuf_trim().
Until now, ofpbuf_trim() has not handled the case where the ofpbuf has
nonzero headroom. This causes an assertion failure when pinsched_send()
queues a packet to be sent later, because such packets have been
guaranteed to have nonzero headroom since commit
43253595 "ofproto: Avoid
buffer copy in OFPT_PACKET_IN path." This commit fixes the problem by
implementing the until-now unsupported case.
This commit factors code out of ofpbuf_prealloc_tailroom() into two new
functions, ofpbuf_rebase__() and ofpbuf_resize_tailroom__(), and uses the
latter to implement both ofpbuf_prealloc_tailroom() and ofpbuf_trim().
ofpbuf_rebase__() isn't used on its own at all, but it seems potentially
useful so I made it an independent function.
Reported-by: Tom Everman <teverman@google.com>
Jesse Gross [Mon, 12 Jul 2010 23:02:12 +0000 (16:02 -0700)]
datapath: Make our checksumming more closely match skb_checksum_help().
Our code that handles checksumming does essentially the same thing
as skb_checksum_help() except it folds the process into copying to
userspace. This makes the two functions more closely resemble
each other in structure, including adding a couple of BUG() checks.
This should have no functional change but makes comparision easier
when debugging.
Jesse Gross [Tue, 13 Jul 2010 00:40:38 +0000 (17:40 -0700)]
gre: Fix headroom calculation.
If we need to make a copy we add a little bit extra to the headroom
that we think we need in order to avoid future copies. Previously we
would always just allocate max(headroom, 64) which is generally
enough space. However, if we are using IPsec our expected headroom
is large and there is no extra. Instead, it is better to just tack on
a few extra bytes so we'll always have a cushion.
Jesse Gross [Mon, 12 Jul 2010 20:27:57 +0000 (13:27 -0700)]
datapath: Move over-MTU checking into vport_send().
We currently check for packets that are over the MTU in do_output().
There is a one-to-one correlation between this function and
vport_send() so move the MTU check there for consistency with
other error checking.
Jesse Gross [Mon, 12 Jul 2010 20:13:17 +0000 (13:13 -0700)]
datapath: Add loop checking.
New types of vports such as patch and GRE make it possible to
connect multiple (or the same) datapaths together, which introduces
the possibility of loops on a single machine. Local loops are
even worse than network loops because they will quickly crash the
machine once the kernel stack is exhausted. This adds loop
checking within the OVS datapath that will drop packets that have
been seen more than 5 times.
CC: Paul Ingram <paul@nicira.com>
Ben Pfaff [Fri, 2 Jul 2010 17:10:04 +0000 (10:10 -0700)]
configure: Add --with-l26-source option to specify source dir explicitly.
OVS can usually find the kernel source itself, but when it can't it's
handy if the user can just specify it on the "configure" command line.
Signed-off-by: Ian Campbell <Ian.Campbell@citrix.com>.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Reported-by: Jad Naous <jnaous@gmail.com>
Ben Pfaff [Thu, 17 Jun 2010 22:31:56 +0000 (15:31 -0700)]
vswitch: Distinguish mirrors by UUID, not by name.
A "feature" that ovs-vswitchd inherited from its previous form of
configuration is that every mirror has a name. Names are not necessarily
meaningful, and there is no reason that they should be unique. But the
existing implementation depends on them being unique within a given
bridge, and if they are not drops one of the duplicates.
This commit drops the uniqueness requirement. Instead, it distinguishes
mirrors based on UUID alone.
This commit does not drop the concept of names for mirrors. There is no
technical reason to retain them, but it is possible that users find them
useful for management reasons. The names appear in log messages related
to mirrors, which may make the messages easier to understand.
Bug #2416.
Ben Pfaff [Thu, 17 Jun 2010 22:32:09 +0000 (15:32 -0700)]
vswitch: Fix memory leak in mirror code.
Ben Pfaff [Thu, 17 Jun 2010 22:43:27 +0000 (15:43 -0700)]
vswitch: Use ovsrec_qos_get_queues() to avoid open-coding binary search.
Surely this is a better approach.
Ben Pfaff [Wed, 16 Jun 2010 22:56:15 +0000 (15:56 -0700)]
ovsdb-idlc: Add "get" accessor functions to generated code.
It can be useful to get direct access to the "struct ovsdb_datum"s that
underlie the "cooked" data structures maintained by the IDL code. This
commit provides a convenient interface through the IDL.
Ben Pfaff [Wed, 16 Jun 2010 22:57:22 +0000 (15:57 -0700)]
vswitch: Use ovsdb_idl_get() to avoid O(n) in get_ovsrec_key_value().
Ben Pfaff [Wed, 16 Jun 2010 21:35:48 +0000 (14:35 -0700)]
ovsdb-idl: Transition to better interfaces for reading table columns.
The existing ovsdb_idl_txn_read() was somewhat difficult and expensive to
use, because it always made a copy of the data in the column. This was
necessary at the time it was introduced, because there was no way for it
to return a "default" value for columns that had not yet been populated
without allocating data and hence requiring the caller to free it.
Now that ovsdb_datum_default() exists, this is no longer required. This
commit introduces a pair of new functions, ovsdb_idl_read() and
ovsdb_idl_get(), that return a pointer to existing data and do not do any
copying. It also transitions all of ovsdb_idl_txn_read()'s callers to
the new interfaces.
Ben Pfaff [Wed, 16 Jun 2010 21:08:49 +0000 (14:08 -0700)]
ovsdb-idlc: Check and restore ovsdb_datum invariants when setting data.
ovsdb_datum has an important invariant (documented in the large comment
on the declaration of struct ovsdb_datum) that a lot of code relies upon:
keys must be unique and in sorted order. Most code that creates or
modifies ovsdb_datum structures maintains this invariant. However, the
"set" functions generated by ovsdb-idlc.in do not check or restore it.
This commit adds that checking.
This might fix an actual bug or two--none have been reported--but mostly it
is just to add some additional checking to avoid future subtle bugs.
Ben Pfaff [Wed, 16 Jun 2010 20:44:08 +0000 (13:44 -0700)]
ovsdb: New functions ovsdb_datum_sort_unique(), ovsdb_datum_from_json_unique().
These new functions are more forgiving than the corresponding functions
without "_unique". The goal is to be more tolerant of data provided by
IDL clients, which will happen in a followup patch.
Ben Pfaff [Mon, 12 Jul 2010 17:05:16 +0000 (10:05 -0700)]
ovsdb: New functions ovsdb_atom_default(), ovsdb_datum_default().
Having access to const copies of default atoms and data will allow OVSDB
code to avoid memory allocations and reduce copying in upcoming commits.
Ben Pfaff [Mon, 12 Jul 2010 17:03:33 +0000 (10:03 -0700)]
ovsdb: Document some ovsdb-data.[ch] functions.
Ben Pfaff [Thu, 1 Jul 2010 16:47:46 +0000 (09:47 -0700)]
ovsdb: Extend "monitor" to select different operations in a single table.
Until now, "monitor" has only allowed the client to choose the kinds of
changes that will be monitored on a per-table basis. However, it makes
sense to be able to choose operations on a per-column basis. The
immediate need for this is to make sure that the final statistics of
deleted Interface records are known at time of deletion, even though the
intermediate values of the statistics are not important.
CC: Jeremy Stribling <strib@nicira.com>
Ben Pfaff [Wed, 30 Jun 2010 19:44:17 +0000 (12:44 -0700)]
tests: Add OVSDB tests for monitors that select only certain operations.
This has been supported as long as table monitoring has been supported, but
until now there were no tests, so this commit adds some.
Ben Pfaff [Wed, 30 Jun 2010 19:43:24 +0000 (12:43 -0700)]
ovsdb-client: Fix "selects" argument to "monitor" command.
This code assumed that the types of operations that were selected were
default-off, so it only added JSON to the query to turn on the ones that
were wanted, but in fact they are default-on, so this commit changes it
to add JSON for each possible operation type.
Ben Pfaff [Wed, 30 Jun 2010 21:35:02 +0000 (14:35 -0700)]
ovsdb: Document in SPECS how OVS uses JSON.
CC: Jeremy Stribling <strib@nicira.com>
Ben Pfaff [Wed, 30 Jun 2010 20:57:24 +0000 (13:57 -0700)]
json: Better handle JSON objects with duplicate names.
RFC 4627 (which defines JSON) says:
The names within an object SHOULD be unique.
In my view, this means that the treatment of duplicate names within a
JSON object is more or less up to the implementation. Until now, the OVS
JSON parser has dealt with duplicates fairly badly: they all get shoved
into the hash table and you get one or the other value semi-randomly
(typically the one added later). This commit makes the behavior
predictable: old values are deleted and replaced by newer values.
Ben Pfaff [Wed, 30 Jun 2010 20:26:42 +0000 (13:26 -0700)]
shash: New function shash_replace().
To be used in an upcoming commit.
Ben Pfaff [Wed, 30 Jun 2010 20:22:28 +0000 (13:22 -0700)]
shash: Refactor shash_add_nocopy(), shash_find().
By breaking out the hash calculation we can enable operations that need
to do both to avoid duplicating the hash calculations. A following commit
will add such an operation.
Ben Pfaff [Wed, 30 Jun 2010 20:28:22 +0000 (13:28 -0700)]
Simplify shash_find() followed by shash_add() into shash_add_once().
This is just a cleanup.
Ben Pfaff [Tue, 29 Jun 2010 21:58:05 +0000 (14:58 -0700)]
ovs-pki: Allow generating certificates with duplicate subjects.
Without this setting, the certificate authorities that ovs-pki creates will
not allow two switches or two controllers to have the same name. This
causes problem in testing, since it's often convenient to test with short,
common names like "tmp".
(If you need to fix a PKI that you already created, in addition to
modifying ca.cnf you will need to make the same change to index.txt.attr.)
CC: Pierre Ettori <pettori@nicira.com>
Ben Pfaff [Tue, 29 Jun 2010 21:29:40 +0000 (14:29 -0700)]
doc: Change "-" to "\-" in appropriate places.
The newer manpages tend to get this right more often than the old ones,
but there were lots of places that needed to be corrected.
Ben Pfaff [Tue, 29 Jun 2010 19:32:16 +0000 (12:32 -0700)]
daemon: Clarify documentation.
CC: Yu Zhiguo <yuzg@cn.fujitsu.com>
Ben Pfaff [Tue, 29 Jun 2010 16:54:10 +0000 (09:54 -0700)]
ovs-vsctl: Add "wait-until" command.
The "wait-until" command causes ovs-vsctl to wait until a specified
condition becomes true.
Requested-by: Sajjad Lateef <slateef@nicira.com>
Ben Pfaff [Thu, 24 Jun 2010 18:16:03 +0000 (11:16 -0700)]
ovs-vsctl: Make partial matches on record names optional.
The wait-until command to be added to ovs-vsctl in an upcoming commit
doesn't really want to wait for partial matches: if I'm waiting for br1
to be created I really don't want to be fooled by br10. So this commit
adds infrastructure to avoid such partial matches.
Ben Pfaff [Tue, 29 Jun 2010 00:20:15 +0000 (17:20 -0700)]
ovs-vsctl: Prepare for more flexible database argument parsing.
The wait-until command to be added in an upcoming commit needs to support
!=, <, >, <=, and >= operators in addition to =, so this commit adds that
infrastructure.
Ben Pfaff [Thu, 24 Jun 2010 18:01:53 +0000 (11:01 -0700)]
ovs-vsctl: Allow commands to tell ovs-vsctl to try again later.
The "wait-until" command to be introduced in an upcoming commit needs to
be able to tell the ovs-vsctl main loop to try again later, since the
condition that it is looking for has not yet been satisfied. This commit
adds the infrastructure for this. (It's being broken out into a separate
commit because it modifies scattered code in ovs-vsctl.c and thus might
be easier to review this way.)
Justin Pettit [Mon, 28 Jun 2010 20:43:10 +0000 (13:43 -0700)]
debian: Correct naming in init scripts
A number of the init scripts assumed that the package name was the same
as the binary, which is not always true. This fixes those issues as
well as some incorrect names in usage messages.
Reported-by: Ram Jothikumar <rjothikumar@nicira.com>
Ben Pfaff [Thu, 24 Jun 2010 22:02:46 +0000 (15:02 -0700)]
ovs-ofctl: Warn about flows not in normal form.
Lots of people get this wrong.
Bug #185.
Ben Pfaff [Thu, 24 Jun 2010 21:20:45 +0000 (14:20 -0700)]
ofproto: Log changes made by flow normalization.
Open vSwitch has always "normalized" flows, that is, zeroed out fields that
are wildcarded or that otherwise cannot affect whether a packet actually
matches the flow. But until now it has done so silently, which prevents
the authors of controllers from learning what is happening and makes it
less likely that they will update code on their end. This commit makes
OVS log when normalization changes a flow.
Suggested by partner.
Ben Pfaff [Fri, 25 Jun 2010 20:57:22 +0000 (13:57 -0700)]
xenserver: Avoid errors from ovs-vsctl at system shutdown.
Commit
823c5699 "interface-reconfigure: callout to datapath backend class
method on rewrite" changed "interface-reconfigure rewrite" to update
bridge external-ids in the vswitch database. But this had the side effect
of causing errors at system shutdown, since ovsdb-server gets shut down
before the rewrite action is called. This commit fixes the problem by
skipping the update if the database socket does not exist. (It's just
fine to skip the update, since the external-ids will be re-set the next
time the system boots anyhow.)
This commit fixed the problem on 5.6.810-34773p for me. I don't see the
problem at all on 5.5.0. Presumably system shutdown order has changed.
NIC-136.
CC: Ian Campbell <Ian.Campbell@citrix.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Mon, 28 Jun 2010 17:34:10 +0000 (10:34 -0700)]
xenserver: Kill bond master's dhclient when bringing up bond slave.
interface-reconfigure is never explicitly called to down a bond master.
However, when it is called to up a slave it is implicit that we are
destroying the master. The "bridge" version of interface-reconfigure
always "ifdown"s the bond master in such a case, but until now the
"vswitch" version has not done so. Usually, it doesn't matter, because
the bond master network device disappears when the slave is brought up,
but one case was missed: for a bond master with an IP address obtained
via DHCP, the dhclient process needs to be killed, and we were not doing
it. This commit starts doing it (by invoking ifdown on the bond master).
The dhclient process that hangs around doesn't cause problems until the
bond master is brought back up, at which point "ifup" fails because it
refuses to start another dhclient for the same interface.
The converse behavior is also important; that is, when a bond PIF is
brought up, interface-reconfigure is expected to implicitly take down its
slave PIFs. My testing (on 5.5.0) shows that this behavior is already
correct. At any rate, this commit does not change that behavior.
Bug #2668.
Bug #2734.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Ben Pfaff [Thu, 24 Jun 2010 18:35:27 +0000 (11:35 -0700)]
timeval: Hide "memory leak" from Valgrind.
In glibc, "timer_t" is a "void *" that appears to point into malloc()'d
memory. By throwing it away entirely, we leak it, which makes valgrind
complain. We really don't ever care to use the timer object again, but
we can't destroy it without stopping the periodic timer. So make it
static to avoid a warning from Valgrind.
Ben Pfaff [Fri, 25 Jun 2010 16:39:33 +0000 (09:39 -0700)]
Add AUTHORS file.
Suggested-by: Martin Casado <casado@nicira.com>
Yu Zhiguo [Fri, 25 Jun 2010 09:33:07 +0000 (17:33 +0800)]
datapath: fix header file include
linux/highmem.h should be included rather than asm/highmem.h,
otherwise openvswitch_mod cannot resolve kmap and kunmap on some arch.
Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com>
Ben Pfaff [Thu, 24 Jun 2010 23:06:07 +0000 (16:06 -0700)]
ovs-vsctl: Make "ovs-vsctl get <table> <record> _uuid" work.
Requested-by: Sean Brady <sbrady@gtfservices.com>
Jesse Gross [Thu, 24 Jun 2010 22:45:04 +0000 (15:45 -0700)]
docs: Correct DB init instructions in INSTALL.Linux.
We tell people to run "ovs-vsctl init" before starting
ovs-vswitchd but this causes it to hang until it times
out so add "--no-wait" as well.
Jesse Gross [Thu, 24 Jun 2010 22:31:18 +0000 (15:31 -0700)]
ovsdb-idl: Check if row->written is valid.
Commit cde3f1 "ovsdb-idl: Drop unnecessary allocation from
ovsdb_idl_txn_insert()." does lazy allocation of row->written
on the assumption that ovsdb_idl_txn_write() will handle it.
However, this isn't the case for empty rows created by something
like "ovs-vsctl init" so add a check before reading the bitfield.
Ben Pfaff [Wed, 23 Jun 2010 16:41:09 +0000 (09:41 -0700)]
vswitchd: Add entity-relationship diagram to ovs-vswitchd.conf.db.5.
I've updated http://openvswitch.org/ovs-vswitchd.conf.db.5.pdf with
example output.
Ben Pfaff [Wed, 23 Jun 2010 19:29:45 +0000 (12:29 -0700)]
tests: Tolerate ignored SIGPIPE in daemon tests.
I noticed that when I run "make check" inside an Emacs compile-mode buffer,
the "daemon --detach closes standard fds" and "daemon --detach --monitor closes
standard fds" tests failed. Investigation showed that Emacs ignores
SIGPIPE in the compile subprocess, which caused the "yes" process in these
tests to emit the message "yes: Broken pipe" and exit with status 1 instead
of dying from SIGPIPE.
This commit changes these tests to allow either behavior.
Ben Pfaff [Wed, 23 Jun 2010 16:59:49 +0000 (09:59 -0700)]
ovs-vsctl: Add details to documentation for "emer-reset" command.
Someone asked me what emer-reset really does and I had to look in the
source code to find out. It's kinder just to document it.
Ben Pfaff [Thu, 24 Jun 2010 19:56:30 +0000 (12:56 -0700)]
ovsdb-server: Implement unixctl command to reconnect JSON-RPC connections.
This feature may be useful for debugging.
Feature #2222.
Ben Pfaff [Fri, 18 Jun 2010 20:58:42 +0000 (13:58 -0700)]
vswitch: Implement unixctl command to reconnect OpenFlow connections.
This feature may be useful for debugging.
Feature #2222.
Ben Pfaff [Thu, 24 Jun 2010 19:44:47 +0000 (12:44 -0700)]
rconn: Suppress failed connection messages after the backoff maxes out.
Logging a few messages about a failed connection every few seconds for
every bridge is too much. This just logs failed connection messages for a
few attempts, then shuts them off until something changes.
Bug #2610.
Ben Pfaff [Fri, 11 Jun 2010 22:58:14 +0000 (15:58 -0700)]
vconn: Fix tracking of "connected" state.
While I was looking at the rconn code for connection backoff and retry, I
noticed that ovs-vswitchd was logging the following on each connection
attempt:
Jun 11 15:17:41|00020|vconn_stream|ERR|send: Connection refused
The "send:" part didn't make much sense. The configured controller was not
actually running, so the vconn code should not have been able to connect
at all, so the message should have been about a connection failing, not
about sending on a completed connection failing.
Investigation showed that different parts of the library have different
ideas about return value semantics. vconn_open() and stream_open() both
return 0 if a connection succeeded or if one is in progress, but some of
its callers thought that it returned 0 if the connection succeeded and
EAGAIN if the connection was in progress. This commit fixes up the callers
that had the wrong idea, by making them instead all vconn_connect() or
stream_connect() to determine whether the connection is complete.
Ben Pfaff [Fri, 11 Jun 2010 21:59:43 +0000 (14:59 -0700)]
rconn: Remove superfluous \n from log message.
The vlog library suppresses a trailing \n but that's no reason to supply
one.
Ben Pfaff [Wed, 23 Jun 2010 18:02:46 +0000 (11:02 -0700)]
bridge: Implement basic periodic update of interface statistics.
Ben Pfaff [Thu, 10 Jun 2010 21:32:32 +0000 (14:32 -0700)]
bridge: Make configuration database records valid all the time.
Before, it was possible for records in the configuration database to
disappear, so all of the ovsrec pointers inside bridge structures had
comments cautioning against their use except during reconfiguration. But
now that the bridge has direct control over when ovsdb_idl_run() is called,
it can ensure that bridge_reconfigure() is always called immediately
whenever the IDL data structures change. That means that we can use the
ovsrec configuration at any time after the reconfiguration process
initializes them, not just during reconfiguration.
Ben Pfaff [Thu, 10 Jun 2010 21:17:41 +0000 (14:17 -0700)]
ovs-vswitchd: Allow bridge code to manage the database connection itself.
Until now, the ovs-vswitchd main loop has managed the connection to the
database. This worked adequately until now, but upcoming patches will tie
the bridge code more tightly to the database, which means that the bridge
needs more control over interaction with the database connection and thus
that it is better for the bridge to handle that connection itself. This
commit makes the latter change, moving the database interaction from the
ovs-vswitchd main loop into bridge.c.
Ben Pfaff [Thu, 10 Jun 2010 22:31:55 +0000 (15:31 -0700)]
ovsdb-idlc: Fix sizeof calculation in generated code.
Generated <prefix>_<struct>_parse_<column> functions did not allocate
enough memory for the "value" array, because code that should have said,
e.g.:
row->value_options = xmalloc(datum->n * sizeof *row->value_options);
actually said:
row->value_options = xmalloc(datum->n * sizeof row->value_options);
This fixes the problem. I also checked that the same problem didn't occur
elsewhere in the generated code.
(This would be a fairly serious bug fix, because without it reads and
writes beyond the end of allocated memory would be almost inevitable,
except that every existing map has string values, and sizeof(char*)
== sizeof(char**) on any sane system.)
Ben Pfaff [Thu, 10 Jun 2010 21:09:51 +0000 (14:09 -0700)]
ovsdb-idl: Drop unnecessary allocation from ovsdb_idl_txn_insert().
There's no need to allocate row->written ahead of time because the code
that can use it allocates it on demand if row->written is NULL.
Ben Pfaff [Wed, 23 Jun 2010 17:13:39 +0000 (10:13 -0700)]
ovsdb-idl: Start documenting the public interface.
Long overdue.
Ben Pfaff [Wed, 9 Jun 2010 22:18:17 +0000 (15:18 -0700)]
ovsdb-idl: Simplify usage of ovsdb_idl_run().
It makes client code simpler if ovsdb_idl_run() simply lets the caller
know whether anything changed.
Ben Pfaff [Thu, 10 Jun 2010 21:10:51 +0000 (14:10 -0700)]
bridge: Drop unused enum definition.
Ben Pfaff [Thu, 10 Jun 2010 23:31:59 +0000 (16:31 -0700)]
bridge: Remove unused functions.
Ben Pfaff [Thu, 10 Jun 2010 21:40:13 +0000 (14:40 -0700)]
Use shash_destroy_free_data() to simplify a few scattered pieces of code.
Ian Campbell [Wed, 23 Jun 2010 08:22:03 +0000 (09:22 +0100)]
xenserver: Largely untested patch for 5.5 and 5.6 CHIN compatibility
Ian Campbell [Wed, 23 Jun 2010 08:18:57 +0000 (09:18 +0100)]
interface-reconfigure: Handle CHIN Tunnel objects
Only the vswitch backend is able to implement CHIN.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Ben Pfaff [Fri, 4 Jun 2010 23:47:55 +0000 (16:47 -0700)]
stream-ssl: Log protocol details at DBG level.
Sometimes seeing a little bit of SSL protocol information can be valuable
in debugging connection problems. With this commit, setting the stream_ssl
logging module to DBG level will cause basic SSL handshake information to
be logged for new connections.
Ben Pfaff [Fri, 18 Jun 2010 22:58:13 +0000 (15:58 -0700)]
Better document how ovsdb-server is meant to be used.
DarkBls <darkbls@yahoo.com> had the idea that a single ovsdb-server could
be used to serve configuration details across the network to multiple
remote ovs-vswitchd instances. This doesn't work, but the documentation
didn't spell it out. This commit should help.
Ben Pfaff [Wed, 23 Jun 2010 16:50:47 +0000 (09:50 -0700)]
tests: Fix occasional failure of "self-linking idl, inconsistent ops" test.
Commit
97f7803 made this test nondeterministic: depending on the ordering
of the random row UUIDs, the error message might mention one of two
different nonexistent rows that are referenced. This commit fixes the
problem by accepting either one of the two correct rows.
Previous related fixes took a different approach, of instead making the
test deterministic. But now I think that that is not as good an idea,
because we need to be able to test that OVSDB works whether the query is
deterministic or not.
(Of course, every OVSDB test is nondeterministic, since the UUIDs are
random. This test was just more nondeterministic than most.)
Jun Nakajima [Wed, 23 Jun 2010 05:53:15 +0000 (22:53 -0700)]
Update INSTALL.XenServer to indicate that 5.6.0 is supported
Open vSwitch is compatible with XenServer 5.6.0, so indicate that in
INSTALL.XenServer. Remove the reference to 5.5.900, since that was just
a pre-release version of 5.6.0.
Ben Pfaff [Tue, 22 Jun 2010 20:50:55 +0000 (13:50 -0700)]
INSTALL.Linux: Add some troubleshooting instructions for module loading.
Suggested-by: kk yap <yapkke@stanford.edu>
Ben Pfaff [Tue, 22 Jun 2010 19:47:03 +0000 (12:47 -0700)]
ofp-util: Also normalize nw_tos in normalize_match().
The OpenFlow reference implementation was sensibly normalizing these
fields but OVS did not. This change should make OVS resemble the OpenFlow
reference implementation at least in this respect.
For more information:
https://mailman.stanford.edu/pipermail/openflow-spec/2010-June/001040.html
Reported-by: Takayuki HAMA <t-hama@cb.jp.nec.com>
Ben Pfaff [Fri, 11 Jun 2010 21:45:24 +0000 (14:45 -0700)]
Suppress ovsdb-server log messages about connections from ovs-vsctl.
In the ovsdb-server log there are fairly continuous messages like these:
Apr 26 11:27:55|15254|reconnect|INFO|unix:/tmp/stream-unix.31734.0: connected
Apr 26 11:27:55|15255|reconnect|INFO|unix:/tmp/stream-unix.31734.0: connection dropped
Apr 26 11:28:00|15256|reconnect|INFO|unix:/tmp/stream-unix.31810.0: connecting...
Apr 26 11:28:00|15257|reconnect|INFO|unix:/tmp/stream-unix.31810.0: connected
Apr 26 11:28:00|15258|reconnect|INFO|unix:/tmp/stream-unix.31810.0: connection dropped
These just indicate that ovs-vsctl is connecting to ovsdb-server from,
for example, the "vif" script. But there's no need to log all that detail;
it's simply not useful. This commit suppresses it.
Bug #2715.
Ben Pfaff [Fri, 11 Jun 2010 21:37:33 +0000 (14:37 -0700)]
jsonrpc: Suppress duplicate logging.
Both jsonrpc and reconnect were logging ordinary connection closure.
There's no need for both to do it.
Ben Pfaff [Fri, 11 Jun 2010 21:39:13 +0000 (14:39 -0700)]
jsonrpc: Propagate error code to reconnect_disconnected().
Always passing 0 to reconnect_disconnected() means that it uses a generic
log message ("connection dropped"). By passing the error code, as done by
this commit, reconnect_disconnected() can log a more specific message.
Ben Pfaff [Fri, 11 Jun 2010 20:56:36 +0000 (13:56 -0700)]
ovsdb-server: Improve logging for referential integrity violations.
This may help with bug #2727 "ovs-vsctl transaction error while changing
VM power state", if it is reproducible.
Jesse Gross [Sat, 19 Jun 2010 00:11:44 +0000 (17:11 -0700)]
datapath: Don't compute checksums on partial packets.
If we are only copying part of a packet to userspace don't bother
computing the checksum on that part since it is meaningless.
Instead, fall back to the old method of checksumming before
copying to ensure the correct result.
This was supposed to be part of the previous commit but was left
off.
Jesse Gross [Thu, 17 Jun 2010 22:20:16 +0000 (15:20 -0700)]
datapath: Compute checksum while sending packets to userspace().
Currently we compute the checksums on packets being sent to
userspace first and then copy them to a userspace buffer. However,
these two operations can be combined for a significant savings
because the packet data only has to be loaded once. This also
allows GSO packets to save an extra copy.
This will likely have an impact on NIC-121 because it eliminates
the code path that triggers the issue. However, it is not a fix
for the root cause.
Jesse Gross [Thu, 17 Jun 2010 22:17:53 +0000 (15:17 -0700)]
datapath: Check vswitch_skb_checksum_setup() return code.
If vswitch_skb_checksum_setup() returns an error, the checksum
pointers probably haven't been set correctly which could cause
a crash later. We should give up immediately on error.
Jesse Gross [Thu, 17 Jun 2010 22:15:11 +0000 (15:15 -0700)]
datapath: Call vswitch_skb_checksum_setup() before doing GSO.
Since GSO computes checksums as it does segmentation, we need to
setup the checksum pointers before calling skb_gso_segment(). Failing
to do so can potentially result in warnings, incorrect checksums,
crashes, or redundant checksum computation. In general we don't
hit this case because the code path is run during the first packet
in a flow, which is generally not a large GSO packet.
This was found during the investigation of NIC-121 but has no impact
on it because vswitch_skb_checksum_setup() is a no-op on 2.6.27-based
XenServer kernels.
Jesse Gross [Fri, 18 Jun 2010 17:24:28 +0000 (10:24 -0700)]
ofproto: Consistently make queue_id uint32_t.
In some places queue_id was a uint16_t which caused issues when
comparing to 32-bit constants and doing byte swaps.
Ben Pfaff [Thu, 17 Jun 2010 22:24:54 +0000 (15:24 -0700)]
ovsdb-idlc: Fix warning in generated code.
Without this fix, ovsdb-idlc generates the following line of code:
c->type.key.u.integer.max =
4294967295;
which causes GCC to issue this warning:
this decimal constant is unsigned only in ISO C90
This commit changes the generated code to:
c->type.key.u.integer.max = INT64_C(
4294967295);
which eliminates the warning.
Ben Pfaff [Tue, 15 Jun 2010 19:14:43 +0000 (12:14 -0700)]
vswitchd: Attempt to further clarify Port "trunk" and "tag" documentation.
CC: Hao Zheng <hzheng@nicira.com>
Ben Pfaff [Thu, 17 Jun 2010 22:04:12 +0000 (15:04 -0700)]
Implement QoS framework.
ovs-vswitchd doesn't declare its QoS capabilities in the database yet,
so the controller has to know what they are. We can add that later.
The linux-htb QoS class has been tested to the extent that I can see that
it sets up the queues I expect when I run "tc qdisc show" and "tc class
show". I haven't tested that the effects on flows are what we expect them
to be. I am sure that there will be problems in that area that we will
have to fix.
Ben Pfaff [Mon, 7 Jun 2010 23:58:57 +0000 (16:58 -0700)]
shash: New functions shash_destroy_free_data() and shash_clear_free_data().
Ben Pfaff [Thu, 3 Jun 2010 17:17:51 +0000 (10:17 -0700)]
port-array: Add port_array_delete() function.
port_array_delete(pa, idx) is equivalent to port_array_set(pa, idx, NULL),
but it never allocates memory and avoids conditionals.
Ben Pfaff [Fri, 28 May 2010 23:34:55 +0000 (16:34 -0700)]
netdev-linux: Create rtnetlink socket up front instead of on demand.
This simplifies a bit of existing code since it is known that an rtnetlink
socket will always be available. It will simplify additional code in
upcoming commits.
Ben Pfaff [Fri, 28 May 2010 23:16:23 +0000 (16:16 -0700)]
netlink: Improve support for nested Netlink attributes.
Fairly often it happens that nested Netlink attributes must themselves
contain Netlink attributes. In such a case, nlmsg_put_nested() is not so
convenient, because it requires the contents to be pre-assembled and then
copied into place. This commit introduces a new interface that instead
allows the nested attributes to be assembled in-place. As a demonstration,
it updates nl_msg_put_nested() to use this new interface.
Ben Pfaff [Fri, 21 May 2010 00:14:22 +0000 (17:14 -0700)]
netlink: Add support for Netlink table dumping.
Ben Pfaff [Thu, 20 May 2010 22:57:56 +0000 (15:57 -0700)]
netlink: Add functions for handling nested attributes.
Ben Pfaff [Thu, 20 May 2010 22:57:36 +0000 (15:57 -0700)]
netlink: Make nl_sock_transact() always return a reply on success.
Until now, if nl_sock_transact() received a reply that merely acknowledged
success, without providing any other payload, it would return success but
not provide the reply to its caller. This is inconsistent and could easily
cause a segfault in a caller that expects to see the reply on success, if
kernel behavior changed, for whatever reason, so that a request that
previously returned data now just returns an acknowledgment. In practice
this kind of change should never happen, but it is still better to handle
it properly.
Ben Pfaff [Thu, 20 May 2010 22:53:17 +0000 (15:53 -0700)]
netlink: Fix bad assumption about nested Netlink attributes.
I had assumed that nested Netlink attributes contained an entire Netlink
message, including header. This is wrong: they contain only a series of
attributes.
Nothing in the tree actually used nested attributes until now, so this
doesn't fix any existing bugs.