From: Ben Pfaff Date: Mon, 14 Nov 2011 23:56:43 +0000 (-0800) Subject: datapath: Describe policy for extending flow key, implement needed changes. X-Git-Url: https://pintos-os.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=fea393b1d6b2729a784b898dbdd48d30d42e3ff7;p=openvswitch datapath: Describe policy for extending flow key, implement needed changes. When the datapath was converted to use Netlink attributes for describing flow keys, I had a vague idea of how it could be smoothly extensible, but I didn't actually implement extensibility or carefully think it through. This commit adds a document that describes how flow keys can be extended in a compatible fashion and adapts the existing interface to match what it says. This commit doesn't actually implement extensibility. I already have a separate patch series out for that. This patch series borrows from that one heavily, but the extensibility series will need to be reworked somewhat once this one is in. This commit is only lightly tested because I don't have a good test setup for VLANs. Signed-off-by: Ben Pfaff Acked-by: Jesse Gross --- diff --git a/datapath/Modules.mk b/datapath/Modules.mk index 22b328c6..e015a9dd 100644 --- a/datapath/Modules.mk +++ b/datapath/Modules.mk @@ -41,6 +41,7 @@ openvswitch_headers = \ vport-netdev.h openvswitch_extras = \ + README \ CAPWAP.txt dist_sources = $(foreach module,$(dist_modules),$($(module)_sources)) diff --git a/datapath/README b/datapath/README new file mode 100644 index 00000000..2d8a141b --- /dev/null +++ b/datapath/README @@ -0,0 +1,160 @@ +Open vSwitch datapath developer documentation +============================================= + +The Open vSwitch kernel module allows flexible userspace control over +flow-level packet processing on selected network devices. It can be +used to implement a plain Ethernet switch, network device bonding, +VLAN processing, network access control, flow-based network control, +and so on. + +The kernel module implements multiple "datapaths" (analogous to +bridges), each of which can have multiple "vports" (analogous to ports +within a bridge). Each datapath also has associated with it a "flow +table" that userspace populates with "flows" that map from keys based +on packet headers and metadata to sets of actions. The most common +action forwards the packet to another vport; other actions are also +implemented. + +When a packet arrives on a vport, the kernel module processes it by +extracting its flow key and looking it up in the flow table. If there +is a matching flow, it executes the associated actions. If there is +no match, it queues the packet to userspace for processing (as part of +its processing, userspace will likely set up a flow to handle further +packets of the same type entirely in-kernel). + + +Flow key compatibility +---------------------- + +Network protocols evolve over time. New protocols become important +and existing protocols lose their prominence. For the Open vSwitch +kernel module to remain relevant, it must be possible for newer +versions to parse additional protocols as part of the flow key. It +might even be desirable, someday, to drop support for parsing +protocols that have become obsolete. Therefore, the Netlink interface +to Open vSwitch is designed to allow carefully written userspace +applications to work with any version of the flow key, past or future. + +To support this forward and backward compatibility, whenever the +kernel module passes a packet to userspace, it also passes along the +flow key that it parsed from the packet. Userspace then extracts its +own notion of a flow key from the packet and compares it against the +kernel-provided version: + + - If userspace's notion of the flow key for the packet matches the + kernel's, then nothing special is necessary. + + - If the kernel's flow key includes more fields than the userspace + version of the flow key, for example if the kernel decoded IPv6 + headers but userspace stopped at the Ethernet type (because it + does not understand IPv6), then again nothing special is + necessary. Userspace can still set up a flow in the usual way, + as long as it uses the kernel-provided flow key to do it. + + - If the userspace flow key includes more fields than the + kernel's, for example if userspace decoded an IPv6 header but + the kernel stopped at the Ethernet type, then userspace can + forward the packet manually, without setting up a flow in the + kernel. This case is bad for performance because every packet + that the kernel considers part of the flow must go to userspace, + but the forwarding behavior is correct. (If userspace can + determine that the values of the extra fields would not affect + forwarding behavior, then it could set up a flow anyway.) + +How flow keys evolve over time is important to making this work, so +the following sections go into detail. + + +Flow key format +--------------- + +A flow key is passed over a Netlink socket as a sequence of Netlink +attributes. Some attributes represent packet metadata, defined as any +information about a packet that cannot be extracted from the packet +itself, e.g. the vport on which the packet was received. Most +attributes, however, are extracted from headers within the packet, +e.g. source and destination addresses from Ethernet, IP, or TCP +headers. + +The header file defines the exact format of the +flow key attributes. For informal explanatory purposes here, we write +them as comma-separated strings, with parentheses indicating arguments +and nesting. For example, the following could represent a flow key +corresponding to a TCP packet that arrived on vport 1: + + in_port(1), eth(src=e0:91:f5:21:d0:b2, dst=00:02:e3:0f:80:a4), + eth_type(0x0800), ipv4(src=172.16.0.20, dst=172.18.0.52, proto=17, tos=0, + frag=no), tcp(src=49163, dst=80) + +Often we ellipsize arguments not important to the discussion, e.g.: + + in_port(1), eth(...), eth_type(0x0800), ipv4(...), tcp(...) + + +Basic rule for evolving flow keys +--------------------------------- + +Some care is needed to really maintain forward and backward +compatibility for applications that follow the rules listed under +"Flow key compatibility" above. + +The basic rule is obvious: + + ------------------------------------------------------------------ + New network protocol support must only supplement existing flow + key attributes. It must not change the meaning of already defined + flow key attributes. + ------------------------------------------------------------------ + +This rule does have less-obvious consequences so it is worth working +through a few examples. Suppose, for example, that the kernel module +did not already implement VLAN parsing. Instead, it just interpreted +the 802.1Q TPID (0x8100) as the Ethertype then stopped parsing the +packet. The flow key for any packet with an 802.1Q header would look +essentially like this, ignoring metadata: + + eth(...), eth_type(0x8100) + +Naively, to add VLAN support, it makes sense to add a new "vlan" flow +key attribute to contain the VLAN tag, then continue to decode the +encapsulated headers beyond the VLAN tag using the existing field +definitions. With this change, an TCP packet in VLAN 10 would have a +flow key much like this: + + eth(...), vlan(vid=10, pcp=0), eth_type(0x0800), ip(proto=6, ...), tcp(...) + +But this change would negatively affect a userspace application that +has not been updated to understand the new "vlan" flow key attribute. +The application could, following the flow compatibility rules above, +ignore the "vlan" attribute that it does not understand and therefore +assume that the flow contained IP packets. This is a bad assumption +(the flow only contains IP packets if one parses and skips over the +802.1Q header) and it could cause the application's behavior to change +across kernel versions even though it follows the compatibility rules. + +The solution is to use a set of nested attributes. This is, for +example, why 802.1Q support uses nested attributes. A TCP packet in +VLAN 10 is actually expressed as: + + eth(...), eth_type(0x8100), vlan(vid=10, pcp=0), encap(eth_type(0x0800), + ip(proto=6, ...), tcp(...))) + +Notice how the "eth_type", "ip", and "tcp" flow key attributes are +nested inside the "encap" attribute. Thus, an application that does +not understand the "vlan" key will not see either of those attributes +and therefore will not misinterpret them. (Also, the outer eth_type +is still 0x8100, not changed to 0x0800.) + +Other rules +----------- + +The other rules for flow keys are much less subtle: + + - Duplicate attributes are not allowed at a given nesting level. + + - Ordering of attributes is not significant. + + - When the kernel sends a given flow key to userspace, it always + composes it the same way. This allows userspace to hash and + compare entire flow keys that it may not be able to fully + interpret. diff --git a/datapath/actions.c b/datapath/actions.c index ac7187bc..88eca6cd 100644 --- a/datapath/actions.c +++ b/datapath/actions.c @@ -96,7 +96,7 @@ static int pop_vlan(struct sk_buff *skb) return 0; } -static int push_vlan(struct sk_buff *skb, const struct ovs_key_8021q *q_key) +static int push_vlan(struct sk_buff *skb, const struct ovs_action_push_vlan *vlan) { if (unlikely(vlan_tx_tag_present(skb))) { u16 current_tag; @@ -112,7 +112,7 @@ static int push_vlan(struct sk_buff *skb, const struct ovs_key_8021q *q_key) + ETH_HLEN, VLAN_HLEN, 0)); } - __vlan_hwaccel_put_tag(skb, ntohs(q_key->q_tci)); + __vlan_hwaccel_put_tag(skb, ntohs(vlan->vlan_tci)); return 0; } @@ -368,15 +368,13 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb, output_userspace(dp, skb, a); break; - case OVS_ACTION_ATTR_PUSH: - /* Only supported push action is on vlan tag. */ - err = push_vlan(skb, nla_data(nla_data(a))); + case OVS_ACTION_ATTR_PUSH_VLAN: + err = push_vlan(skb, nla_data(a)); if (unlikely(err)) /* skb already freed. */ return err; break; - case OVS_ACTION_ATTR_POP: - /* Only supported pop action is on vlan tag. */ + case OVS_ACTION_ATTR_POP_VLAN: err = pop_vlan(skb); break; diff --git a/datapath/datapath.c b/datapath/datapath.c index 49d93aac..2d67657d 100644 --- a/datapath/datapath.c +++ b/datapath/datapath.c @@ -524,10 +524,9 @@ static int validate_sample(const struct nlattr *attr, return validate_actions(actions, key, depth + 1); } -static int validate_action_key(const struct nlattr *a, - const struct sw_flow_key *flow_key) +static int validate_set(const struct nlattr *a, + const struct sw_flow_key *flow_key) { - int act_type = nla_type(a); const struct nlattr *ovs_key = nla_data(a); int key_type = nla_type(ovs_key); @@ -539,27 +538,15 @@ static int validate_action_key(const struct nlattr *a, nla_len(ovs_key) != ovs_key_lens[key_type]) return -EINVAL; -#define ACTION(act, key) (((act) << 8) | (key)) - - switch (ACTION(act_type, key_type)) { + switch (key_type) { const struct ovs_key_ipv4 *ipv4_key; - const struct ovs_key_8021q *q_key; - - case ACTION(OVS_ACTION_ATTR_SET, OVS_KEY_ATTR_PRIORITY): - case ACTION(OVS_ACTION_ATTR_SET, OVS_KEY_ATTR_TUN_ID): - case ACTION(OVS_ACTION_ATTR_SET, OVS_KEY_ATTR_ETHERNET): - break; - - case ACTION(OVS_ACTION_ATTR_PUSH, OVS_KEY_ATTR_8021Q): - q_key = nla_data(ovs_key); - if (q_key->q_tpid != htons(ETH_P_8021Q)) - return -EINVAL; - if (q_key->q_tci & htons(VLAN_TAG_PRESENT)) - return -EINVAL; + case OVS_KEY_ATTR_PRIORITY: + case OVS_KEY_ATTR_TUN_ID: + case OVS_KEY_ATTR_ETHERNET: break; - case ACTION(OVS_ACTION_ATTR_SET, OVS_KEY_ATTR_IPV4): + case OVS_KEY_ATTR_IPV4: if (flow_key->eth.type != htons(ETH_P_IP)) return -EINVAL; @@ -575,7 +562,7 @@ static int validate_action_key(const struct nlattr *a, break; - case ACTION(OVS_ACTION_ATTR_SET, OVS_KEY_ATTR_TCP): + case OVS_KEY_ATTR_TCP: if (flow_key->ip.proto != IPPROTO_TCP) return -EINVAL; @@ -584,7 +571,7 @@ static int validate_action_key(const struct nlattr *a, break; - case ACTION(OVS_ACTION_ATTR_SET, OVS_KEY_ATTR_UDP): + case OVS_KEY_ATTR_UDP: if (flow_key->ip.proto != IPPROTO_UDP) return -EINVAL; @@ -595,7 +582,7 @@ static int validate_action_key(const struct nlattr *a, default: return -EINVAL; } -#undef ACTION + return 0; } @@ -632,13 +619,14 @@ static int validate_actions(const struct nlattr *attr, nla_for_each_nested(a, attr, rem) { /* Expected argument lengths, (u32)-1 for variable length. */ static const u32 action_lens[OVS_ACTION_ATTR_MAX + 1] = { - [OVS_ACTION_ATTR_OUTPUT] = 4, + [OVS_ACTION_ATTR_OUTPUT] = sizeof(u32), [OVS_ACTION_ATTR_USERSPACE] = (u32)-1, - [OVS_ACTION_ATTR_PUSH] = (u32)-1, - [OVS_ACTION_ATTR_POP] = 2, + [OVS_ACTION_ATTR_PUSH_VLAN] = sizeof(struct ovs_action_push_vlan), + [OVS_ACTION_ATTR_POP_VLAN] = 0, [OVS_ACTION_ATTR_SET] = (u32)-1, [OVS_ACTION_ATTR_SAMPLE] = (u32)-1 }; + const struct ovs_action_push_vlan *vlan; int type = nla_type(a); if (type > OVS_ACTION_ATTR_MAX || @@ -662,14 +650,19 @@ static int validate_actions(const struct nlattr *attr, break; - case OVS_ACTION_ATTR_POP: - if (nla_get_u16(a) != OVS_KEY_ATTR_8021Q) + case OVS_ACTION_ATTR_POP_VLAN: + break; + + case OVS_ACTION_ATTR_PUSH_VLAN: + vlan = nla_data(a); + if (vlan->vlan_tpid != htons(ETH_P_8021Q)) + return -EINVAL; + if (vlan->vlan_tci & htons(VLAN_TAG_PRESENT)) return -EINVAL; break; case OVS_ACTION_ATTR_SET: - case OVS_ACTION_ATTR_PUSH: - err = validate_action_key(a, key); + err = validate_set(a, key); if (err) return err; break; diff --git a/datapath/flow.c b/datapath/flow.c index b0252b83..fded9375 100644 --- a/datapath/flow.c +++ b/datapath/flow.c @@ -844,10 +844,11 @@ void flow_tbl_remove(struct flow_table *table, struct sw_flow *flow) /* The size of the argument for each %OVS_KEY_ATTR_* Netlink attribute. */ const u32 ovs_key_lens[OVS_KEY_ATTR_MAX + 1] = { + [OVS_KEY_ATTR_ENCAP] = 0, [OVS_KEY_ATTR_PRIORITY] = sizeof(u32), [OVS_KEY_ATTR_IN_PORT] = sizeof(u32), [OVS_KEY_ATTR_ETHERNET] = sizeof(struct ovs_key_ethernet), - [OVS_KEY_ATTR_8021Q] = sizeof(struct ovs_key_8021q), + [OVS_KEY_ATTR_VLAN] = sizeof(__be16), [OVS_KEY_ATTR_ETHERTYPE] = sizeof(__be16), [OVS_KEY_ATTR_IPV4] = sizeof(struct ovs_key_ipv4), [OVS_KEY_ATTR_IPV6] = sizeof(struct ovs_key_ipv6), @@ -968,6 +969,30 @@ static int ipv6_flow_from_nlattrs(struct sw_flow_key *swkey, int *key_len, return 0; } +static int parse_flow_nlattrs(const struct nlattr *attr, + const struct nlattr *a[], u64 *attrsp) +{ + const struct nlattr *nla; + u64 attrs; + int rem; + + attrs = 0; + nla_for_each_nested(nla, attr, rem) { + u16 type = nla_type(nla); + + if (type > OVS_KEY_ATTR_MAX || attrs & (1ULL << type) || + nla_len(nla) != ovs_key_lens[type]) + return -EINVAL; + attrs |= 1ULL << type; + a[type] = nla; + } + if (rem) + return -EINVAL; + + *attrsp = attrs; + return 0; +} + /** * flow_from_nlattrs - parses Netlink attributes into a flow key. * @swkey: receives the extracted flow key. @@ -980,26 +1005,16 @@ int flow_from_nlattrs(struct sw_flow_key *swkey, int *key_lenp, { const struct nlattr *a[OVS_KEY_ATTR_MAX + 1]; const struct ovs_key_ethernet *eth_key; - const struct nlattr *nla; int key_len; u64 attrs; - int rem; + int err; memset(swkey, 0, sizeof(struct sw_flow_key)); key_len = SW_FLOW_KEY_OFFSET(eth); - attrs = 0; - nla_for_each_nested(nla, attr, rem) { - u16 type = nla_type(nla); - - if (type > OVS_KEY_ATTR_MAX || attrs & (1ULL << type) || - nla_len(nla) != ovs_key_lens[type]) - return -EINVAL; - attrs |= 1ULL << type; - a[type] = nla; - } - if (rem) - return -EINVAL; + err = parse_flow_nlattrs(attr, a, &attrs); + if (err) + return err; /* Metadata attributes. */ if (attrs & (1 << OVS_KEY_ATTR_PRIORITY)) { @@ -1030,18 +1045,18 @@ int flow_from_nlattrs(struct sw_flow_key *swkey, int *key_lenp, memcpy(swkey->eth.src, eth_key->eth_src, ETH_ALEN); memcpy(swkey->eth.dst, eth_key->eth_dst, ETH_ALEN); - if (attrs & (1 << OVS_KEY_ATTR_8021Q)) { - const struct ovs_key_8021q *q_key; - - q_key = nla_data(a[OVS_KEY_ATTR_8021Q]); - /* Only standard 0x8100 VLANs currently supported. */ - if (q_key->q_tpid != htons(ETH_P_8021Q)) - return -EINVAL; - if (q_key->q_tci & htons(VLAN_TAG_PRESENT)) + if (attrs == ((1 << OVS_KEY_ATTR_VLAN) | + (1 << OVS_KEY_ATTR_ETHERTYPE) | + (1 << OVS_KEY_ATTR_ENCAP)) && + nla_get_be16(a[OVS_KEY_ATTR_ETHERTYPE]) == htons(ETH_P_8021Q)) { + swkey->eth.tci = nla_get_be16(a[OVS_KEY_ATTR_VLAN]); + if (swkey->eth.tci & htons(VLAN_TAG_PRESENT)) return -EINVAL; - swkey->eth.tci = q_key->q_tci | htons(VLAN_TAG_PRESENT); + swkey->eth.tci |= htons(VLAN_TAG_PRESENT); - attrs &= ~(1 << OVS_KEY_ATTR_8021Q); + err = parse_flow_nlattrs(a[OVS_KEY_ATTR_ENCAP], a, &attrs); + if (err) + return err; } if (attrs & (1 << OVS_KEY_ATTR_ETHERTYPE)) { @@ -1072,7 +1087,7 @@ int flow_from_nlattrs(struct sw_flow_key *swkey, int *key_lenp, swkey->ipv4.addr.dst = ipv4_key->ipv4_dst; if (swkey->ip.frag != OVS_FRAG_TYPE_LATER) { - int err = ipv4_flow_from_nlattrs(swkey, &key_len, a, &attrs); + err = ipv4_flow_from_nlattrs(swkey, &key_len, a, &attrs); if (err) return err; } @@ -1098,7 +1113,7 @@ int flow_from_nlattrs(struct sw_flow_key *swkey, int *key_lenp, sizeof(swkey->ipv6.addr.dst)); if (swkey->ip.frag != OVS_FRAG_TYPE_LATER) { - int err = ipv6_flow_from_nlattrs(swkey, &key_len, a, &attrs); + err = ipv6_flow_from_nlattrs(swkey, &key_len, a, &attrs); if (err) return err; } @@ -1181,7 +1196,7 @@ int flow_metadata_from_nlattrs(u32 *priority, u16 *in_port, __be64 *tun_id, int flow_to_nlattrs(const struct sw_flow_key *swkey, struct sk_buff *skb) { struct ovs_key_ethernet *eth_key; - struct nlattr *nla; + struct nlattr *nla, *encap; if (swkey->phy.priority) NLA_PUT_U32(skb, OVS_KEY_ATTR_PRIORITY, swkey->phy.priority); @@ -1200,15 +1215,16 @@ int flow_to_nlattrs(const struct sw_flow_key *swkey, struct sk_buff *skb) memcpy(eth_key->eth_dst, swkey->eth.dst, ETH_ALEN); if (swkey->eth.tci != htons(0)) { - struct ovs_key_8021q q_key; - - q_key.q_tpid = htons(ETH_P_8021Q); - q_key.q_tci = swkey->eth.tci & ~htons(VLAN_TAG_PRESENT); - NLA_PUT(skb, OVS_KEY_ATTR_8021Q, sizeof(q_key), &q_key); + NLA_PUT_BE16(skb, OVS_KEY_ATTR_ETHERTYPE, htons(ETH_P_8021Q)); + NLA_PUT_BE16(skb, OVS_KEY_ATTR_VLAN, + swkey->eth.tci & ~htons(VLAN_TAG_PRESENT)); + encap = nla_nest_start(skb, OVS_KEY_ATTR_ENCAP); + } else { + encap = NULL; } if (swkey->eth.type == htons(ETH_P_802_2)) - return 0; + goto unencap; NLA_PUT_BE16(skb, OVS_KEY_ATTR_ETHERTYPE, swkey->eth.type); @@ -1326,6 +1342,10 @@ int flow_to_nlattrs(const struct sw_flow_key *swkey, struct sk_buff *skb) } } +unencap: + if (encap) + nla_nest_end(skb, encap); + return 0; nla_put_failure: diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h index 966ef7a0..2d141e0d 100644 --- a/include/linux/openvswitch.h +++ b/include/linux/openvswitch.h @@ -264,10 +264,11 @@ struct ovs_flow_stats { enum ovs_key_attr { OVS_KEY_ATTR_UNSPEC, + OVS_KEY_ATTR_ENCAP, /* Nested set of encapsulated attributes. */ OVS_KEY_ATTR_PRIORITY, /* u32 skb->priority */ OVS_KEY_ATTR_IN_PORT, /* u32 OVS dp port number */ OVS_KEY_ATTR_ETHERNET, /* struct ovs_key_ethernet */ - OVS_KEY_ATTR_8021Q, /* struct ovs_key_8021q */ + OVS_KEY_ATTR_VLAN, /* be16 VLAN TCI */ OVS_KEY_ATTR_ETHERTYPE, /* be16 Ethernet type */ OVS_KEY_ATTR_IPV4, /* struct ovs_key_ipv4 */ OVS_KEY_ATTR_IPV6, /* struct ovs_key_ipv6 */ @@ -306,11 +307,6 @@ struct ovs_key_ethernet { __u8 eth_dst[6]; }; -struct ovs_key_8021q { - __be16 q_tpid; - __be16 q_tci; -}; - struct ovs_key_ipv4 { __be32 ipv4_src; __be32 ipv4_dst; @@ -440,35 +436,49 @@ enum ovs_userspace_attr { #define OVS_USERSPACE_ATTR_MAX (__OVS_USERSPACE_ATTR_MAX - 1) +/** + * struct ovs_action_push_vlan - %OVS_ACTION_ATTR_PUSH_VLAN action argument. + * @vlan_tpid: Tag protocol identifier (TPID) to push. + * @vlan_tci: Tag control identifier (TCI) to push. The CFI bit must not be + * set. + * + * The @vlan_tpid value is typically %ETH_P_8021Q. The only acceptable TPID + * values are those that the kernel module also parses as 802.1Q headers, to + * prevent %OVS_ACTION_ATTR_PUSH_VLAN followed by %OVS_ACTION_ATTR_POP_VLAN + * from having surprising results. + */ +struct ovs_action_push_vlan { + __be16 vlan_tpid; /* 802.1Q TPID. */ + __be16 vlan_tci; /* 802.1Q TCI (VLAN ID and priority). */ +}; + /** * enum ovs_action_attr - Action types. * * @OVS_ACTION_ATTR_OUTPUT: Output packet to port. * @OVS_ACTION_ATTR_USERSPACE: Send packet to userspace according to nested * %OVS_USERSPACE_ATTR_* attributes. - * @OVS_ACTION_ATTR_PUSH: Push header onto head of packet. The single nested - * %OVS_KEY_ATTR_* attribute specifies a header to push and its value, e.g. a - * nested attribute of type %OVS_KEY_ATTR_8021Q, with struct ovs_key_8021q as - * argument, would push a VLAN header on the front of the packet. - * @OVS_ACTION_ATTR_POP: Pop header according to %OVS_KEY_ATTR_ sent as - * attribute data, e.g. %OVS_KEY_ATTR_8021Q as argument pops an outer VLAN - * header. - * @OVS_ACTION_ATTR_SET: Replaces the contents of an existing header. - * The argument takes the same form as %OVS_ACTION_ATTR_PUSH. + * @OVS_ACTION_ATTR_SET: Replaces the contents of an existing header. The + * single nested %OVS_KEY_ATTR_* attribute specifies a header to modify and its + * value. + * @OVS_ACTION_ATTR_PUSH_VLAN: Push a new outermost 802.1Q header onto the + * packet. + * @OVS_ACTION_ATTR_POP_VLAN: Pop the outermost 802.1Q header off the packet. * @OVS_ACTION_ATTR_SAMPLE: Probabilitically executes actions, as specified in * the nested %OVS_SAMPLE_ATTR_* attributes. * - * Only a single field can be set with a single %OVS_ACTION_ATTR_{SET,PUSH}. - * Not all fields are modifiable. + * Only a single header can be set with a single %OVS_ACTION_ATTR_SET. Not all + * fields within a header are modifiable, e.g. the IPv4 protocol and fragment + * type may not be changed. */ enum ovs_action_attr { OVS_ACTION_ATTR_UNSPEC, OVS_ACTION_ATTR_OUTPUT, /* u32 port number. */ OVS_ACTION_ATTR_USERSPACE, /* Nested OVS_USERSPACE_ATTR_*. */ - OVS_ACTION_ATTR_PUSH, /* One nested OVS_KEY_ATTR_*. */ - OVS_ACTION_ATTR_POP, /* u16 OVS_KEY_ATTR_*. */ OVS_ACTION_ATTR_SET, /* One nested OVS_KEY_ATTR_*. */ + OVS_ACTION_ATTR_PUSH_VLAN, /* struct ovs_action_push_vlan. */ + OVS_ACTION_ATTR_POP_VLAN, /* No argument. */ OVS_ACTION_ATTR_SAMPLE, /* Nested OVS_SAMPLE_ATTR_*. */ __OVS_ACTION_ATTR_MAX }; diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 27125d31..eb8d66d9 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -1274,10 +1274,11 @@ execute_set_action(struct ofpbuf *packet, const struct nlattr *a) break; case OVS_KEY_ATTR_UNSPEC: + case OVS_KEY_ATTR_ENCAP: case OVS_KEY_ATTR_ETHERTYPE: case OVS_KEY_ATTR_IPV6: case OVS_KEY_ATTR_IN_PORT: - case OVS_KEY_ATTR_8021Q: + case OVS_KEY_ATTR_VLAN: case OVS_KEY_ATTR_ICMP: case OVS_KEY_ATTR_ICMPV6: case OVS_KEY_ATTR_ARP: @@ -1298,8 +1299,7 @@ dp_netdev_execute_actions(struct dp_netdev *dp, unsigned int left; NL_ATTR_FOR_EACH_UNSAFE (a, left, actions, actions_len) { - const struct nlattr *nested; - const struct ovs_key_8021q *q_key; + const struct ovs_action_push_vlan *vlan; int type = nl_attr_type(a); switch ((enum ovs_action_attr) type) { @@ -1311,15 +1311,12 @@ dp_netdev_execute_actions(struct dp_netdev *dp, dp_netdev_action_userspace(dp, packet, key, a); break; - case OVS_ACTION_ATTR_PUSH: - nested = nl_attr_get(a); - assert(nl_attr_type(nested) == OVS_KEY_ATTR_8021Q); - q_key = nl_attr_get_unspec(nested, sizeof(*q_key)); - eth_push_vlan(packet, q_key->q_tci); + case OVS_ACTION_ATTR_PUSH_VLAN: + vlan = nl_attr_get(a); + eth_push_vlan(packet, vlan->vlan_tci); break; - case OVS_ACTION_ATTR_POP: - assert(nl_attr_get_u16(a) == OVS_KEY_ATTR_8021Q); + case OVS_ACTION_ATTR_POP_VLAN: dp_netdev_pop_vlan(packet); break; diff --git a/lib/odp-util.c b/lib/odp-util.c index c70ab119..b974dcc4 100644 --- a/lib/odp-util.c +++ b/lib/odp-util.c @@ -62,10 +62,10 @@ odp_action_len(uint16_t type) } switch ((enum ovs_action_attr) type) { - case OVS_ACTION_ATTR_OUTPUT: return 4; + case OVS_ACTION_ATTR_OUTPUT: return sizeof(uint32_t); case OVS_ACTION_ATTR_USERSPACE: return -2; - case OVS_ACTION_ATTR_PUSH: return -2; - case OVS_ACTION_ATTR_POP: return 2; + case OVS_ACTION_ATTR_PUSH_VLAN: return sizeof(struct ovs_action_push_vlan); + case OVS_ACTION_ATTR_POP_VLAN: return 0; case OVS_ACTION_ATTR_SET: return -2; case OVS_ACTION_ATTR_SAMPLE: return -2; @@ -84,11 +84,12 @@ ovs_key_attr_to_string(enum ovs_key_attr attr) switch (attr) { case OVS_KEY_ATTR_UNSPEC: return "unspec"; + case OVS_KEY_ATTR_ENCAP: return "encap"; case OVS_KEY_ATTR_PRIORITY: return "priority"; case OVS_KEY_ATTR_TUN_ID: return "tun_id"; case OVS_KEY_ATTR_IN_PORT: return "in_port"; case OVS_KEY_ATTR_ETHERNET: return "eth"; - case OVS_KEY_ATTR_8021Q: return "vlan"; + case OVS_KEY_ATTR_VLAN: return "vlan"; case OVS_KEY_ATTR_ETHERTYPE: return "eth_type"; case OVS_KEY_ATTR_IPV4: return "ipv4"; case OVS_KEY_ATTR_IPV6: return "ipv6"; @@ -201,6 +202,7 @@ format_odp_action(struct ds *ds, const struct nlattr *a) { int expected_len; enum ovs_action_attr type = nl_attr_type(a); + const struct ovs_action_push_vlan *vlan; expected_len = odp_action_len(nl_attr_type(a)); if (expected_len != -2 && nl_attr_get_size(a) != expected_len) { @@ -211,7 +213,6 @@ format_odp_action(struct ds *ds, const struct nlattr *a) } switch (type) { - case OVS_ACTION_ATTR_OUTPUT: ds_put_format(ds, "%"PRIu16, nl_attr_get_u32(a)); break; @@ -223,14 +224,18 @@ format_odp_action(struct ds *ds, const struct nlattr *a) format_odp_key_attr(nl_attr_get(a), ds); ds_put_cstr(ds, ")"); break; - case OVS_ACTION_ATTR_PUSH: - ds_put_cstr(ds, "push("); - format_odp_key_attr(nl_attr_get(a), ds); - ds_put_cstr(ds, ")"); + case OVS_ACTION_ATTR_PUSH_VLAN: + vlan = nl_attr_get(a); + ds_put_cstr(ds, "push_vlan("); + if (vlan->vlan_tpid != htons(ETH_TYPE_VLAN)) { + ds_put_format(ds, "tpid=0x%04"PRIx16",", ntohs(vlan->vlan_tpid)); + } + ds_put_format(ds, "vid=%"PRIu16",pcp=%d)", + vlan_tci_to_vid(vlan->vlan_tci), + vlan_tci_to_pcp(vlan->vlan_tci)); break; - case OVS_ACTION_ATTR_POP: - ds_put_format(ds, "pop(%s)", - ovs_key_attr_to_string(nl_attr_get_u16(a))); + case OVS_ACTION_ATTR_POP_VLAN: + ds_put_cstr(ds, "pop_vlan"); break; case OVS_ACTION_ATTR_SAMPLE: format_odp_sample_action(ds, a); @@ -269,7 +274,8 @@ format_odp_actions(struct ds *ds, const struct nlattr *actions, } /* Returns the correct length of the payload for a flow key attribute of the - * specified 'type', or -1 if 'type' is unknown. */ + * specified 'type', -1 if 'type' is unknown, or -2 if the attribute's payload + * is variable length. */ static int odp_flow_key_attr_len(uint16_t type) { @@ -278,11 +284,12 @@ odp_flow_key_attr_len(uint16_t type) } switch ((enum ovs_key_attr) type) { + case OVS_KEY_ATTR_ENCAP: return -2; case OVS_KEY_ATTR_PRIORITY: return 4; case OVS_KEY_ATTR_TUN_ID: return 8; case OVS_KEY_ATTR_IN_PORT: return 4; case OVS_KEY_ATTR_ETHERNET: return sizeof(struct ovs_key_ethernet); - case OVS_KEY_ATTR_8021Q: return sizeof(struct ovs_key_8021q); + case OVS_KEY_ATTR_VLAN: return sizeof(ovs_be16); case OVS_KEY_ATTR_ETHERTYPE: return 2; case OVS_KEY_ATTR_IPV4: return sizeof(struct ovs_key_ipv4); case OVS_KEY_ATTR_IPV6: return sizeof(struct ovs_key_ipv6); @@ -338,7 +345,6 @@ static void format_odp_key_attr(const struct nlattr *a, struct ds *ds) { const struct ovs_key_ethernet *eth_key; - const struct ovs_key_8021q *q_key; const struct ovs_key_ipv4 *ipv4_key; const struct ovs_key_ipv6 *ipv6_key; const struct ovs_key_tcp *tcp_key; @@ -348,9 +354,11 @@ format_odp_key_attr(const struct nlattr *a, struct ds *ds) const struct ovs_key_arp *arp_key; const struct ovs_key_nd *nd_key; enum ovs_key_attr attr = nl_attr_type(a); + int expected_len; ds_put_cstr(ds, ovs_key_attr_to_string(attr)); - if (nl_attr_get_size(a) != odp_flow_key_attr_len(nl_attr_type(a))) { + expected_len = odp_flow_key_attr_len(nl_attr_type(a)); + if (expected_len != -2 && nl_attr_get_size(a) != expected_len) { ds_put_format(ds, "(bad length %zu, expected %d)", nl_attr_get_size(a), odp_flow_key_attr_len(nl_attr_type(a))); @@ -359,6 +367,14 @@ format_odp_key_attr(const struct nlattr *a, struct ds *ds) } switch (attr) { + case OVS_KEY_ATTR_ENCAP: + ds_put_cstr(ds, "("); + if (nl_attr_get_size(a)) { + odp_flow_key_format(nl_attr_get(a), nl_attr_get_size(a), ds); + } + ds_put_char(ds, ')'); + break; + case OVS_KEY_ATTR_PRIORITY: ds_put_format(ds, "(%"PRIu32")", nl_attr_get_u32(a)); break; @@ -378,15 +394,10 @@ format_odp_key_attr(const struct nlattr *a, struct ds *ds) ETH_ADDR_ARGS(eth_key->eth_dst)); break; - case OVS_KEY_ATTR_8021Q: - q_key = nl_attr_get(a); - ds_put_cstr(ds, "("); - if (q_key->q_tpid != htons(ETH_TYPE_VLAN)) { - ds_put_format(ds, "tpid=0x%04"PRIx16",", ntohs(q_key->q_tpid)); - } - ds_put_format(ds, "vid=%"PRIu16",pcp=%d)", - vlan_tci_to_vid(q_key->q_tci), - vlan_tci_to_pcp(q_key->q_tci)); + case OVS_KEY_ATTR_VLAN: + ds_put_format(ds, "(vid=%"PRIu16",pcp=%d)", + vlan_tci_to_vid(nl_attr_get_be16(a)), + vlan_tci_to_pcp(nl_attr_get_be16(a))); break; case OVS_KEY_ATTR_ETHERTYPE: @@ -603,21 +614,15 @@ parse_odp_key_attr(const char *s, struct ofpbuf *key) } { - uint16_t tpid = ETH_TYPE_VLAN; uint16_t vid; int pcp; int n = -1; - if ((sscanf(s, "vlan(vid=%"SCNi16",pcp=%i)%n", - &vid, &pcp, &n) > 0 && n > 0) || - (sscanf(s, "vlan(tpid=%"SCNi16",vid=%"SCNi16",pcp=%i)%n", - &tpid, &vid, &pcp, &n) > 0 && n > 0)) { - struct ovs_key_8021q q_key; - - q_key.q_tpid = htons(tpid); - q_key.q_tci = htons((vid << VLAN_VID_SHIFT) | - (pcp << VLAN_PCP_SHIFT)); - nl_msg_put_unspec(key, OVS_KEY_ATTR_8021Q, &q_key, sizeof q_key); + if ((sscanf(s, "vlan(vid=%"SCNi16",pcp=%i)%n", &vid, &pcp, &n) > 0 + && n > 0)) { + nl_msg_put_be16(key, OVS_KEY_ATTR_VLAN, + htons((vid << VLAN_VID_SHIFT) | + (pcp << VLAN_PCP_SHIFT))); return n; } } @@ -816,6 +821,36 @@ parse_odp_key_attr(const char *s, struct ofpbuf *key) } } + if (!strncmp(s, "encap(", 6)) { + const char *start = s; + size_t encap; + + encap = nl_msg_start_nested(key, OVS_KEY_ATTR_ENCAP); + + s += 6; + for (;;) { + int retval; + + s += strspn(s, ", \t\r\n"); + if (!*s) { + return -EINVAL; + } else if (*s == ')') { + break; + } + + retval = parse_odp_key_attr(s, key); + if (retval < 0) { + return retval; + } + s += retval; + } + s++; + + nl_msg_end_nested(key, encap); + + return s - start; + } + return -EINVAL; } @@ -865,6 +900,7 @@ void odp_flow_key_from_flow(struct ofpbuf *buf, const struct flow *flow) { struct ovs_key_ethernet *eth_key; + size_t encap; if (flow->priority) { nl_msg_put_u32(buf, OVS_KEY_ATTR_PRIORITY, flow->priority); @@ -885,16 +921,16 @@ odp_flow_key_from_flow(struct ofpbuf *buf, const struct flow *flow) memcpy(eth_key->eth_dst, flow->dl_dst, ETH_ADDR_LEN); if (flow->vlan_tci != htons(0)) { - struct ovs_key_8021q *q_key; - - q_key = nl_msg_put_unspec_uninit(buf, OVS_KEY_ATTR_8021Q, - sizeof *q_key); - q_key->q_tpid = htons(ETH_TYPE_VLAN); - q_key->q_tci = flow->vlan_tci & ~htons(VLAN_CFI); + nl_msg_put_be16(buf, OVS_KEY_ATTR_ETHERTYPE, htons(ETH_TYPE_VLAN)); + nl_msg_put_be16(buf, OVS_KEY_ATTR_VLAN, + flow->vlan_tci & ~htons(VLAN_CFI)); + encap = nl_msg_start_nested(buf, OVS_KEY_ATTR_ENCAP); + } else { + encap = 0; } if (ntohs(flow->dl_type) < ETH_TYPE_MIN) { - return; + goto unencap; } nl_msg_put_be16(buf, OVS_KEY_ATTR_ETHERTYPE, flow->dl_type); @@ -983,6 +1019,11 @@ odp_flow_key_from_flow(struct ofpbuf *buf, const struct flow *flow) } } } + +unencap: + if (encap) { + nl_msg_end_nested(buf, encap); + } } static void @@ -1032,33 +1073,22 @@ odp_to_ovs_frag(uint8_t odp_frag, struct flow *flow) return true; } -/* Converts the 'key_len' bytes of OVS_KEY_ATTR_* attributes in 'key' to a flow - * structure in 'flow'. Returns 0 if successful, otherwise EINVAL. */ -int -odp_flow_key_to_flow(const struct nlattr *key, size_t key_len, - struct flow *flow) +static int +parse_flow_nlattrs(const struct nlattr *key, size_t key_len, + const struct nlattr *attrs[], uint64_t *present_attrsp) { static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5); - const struct nlattr *attrs[OVS_KEY_ATTR_MAX + 1]; const struct nlattr *nla; - uint64_t expected_attrs; uint64_t present_attrs; - uint64_t missing_attrs; - uint64_t extra_attrs; size_t left; - memset(flow, 0, sizeof *flow); - - memset(attrs, 0, sizeof attrs); - present_attrs = 0; - expected_attrs = 0; NL_ATTR_FOR_EACH (nla, left, key, key_len) { uint16_t type = nl_attr_type(nla); size_t len = nl_attr_get_size(nla); int expected_len = odp_flow_key_attr_len(type); - if (len != expected_len) { + if (len != expected_len && expected_len != -2) { if (expected_len == -1) { VLOG_ERR_RL(&rl, "unknown attribute %"PRIu16" in flow key", type); @@ -1082,17 +1112,68 @@ odp_flow_key_to_flow(const struct nlattr *key, size_t key_len, return EINVAL; } - if (attrs[OVS_KEY_ATTR_PRIORITY]) { + *present_attrsp = present_attrs; + return 0; +} + +static int +check_expectations(uint64_t present_attrs, uint64_t expected_attrs, + const struct nlattr *key, size_t key_len) +{ + uint64_t missing_attrs; + uint64_t extra_attrs; + + missing_attrs = expected_attrs & ~present_attrs; + if (missing_attrs) { + static struct vlog_rate_limit miss_rl = VLOG_RATE_LIMIT_INIT(10, 10); + log_odp_key_attributes(&miss_rl, "expected but not present", + missing_attrs, key, key_len); + return EINVAL; + } + + extra_attrs = present_attrs & ~expected_attrs; + if (extra_attrs) { + static struct vlog_rate_limit extra_rl = VLOG_RATE_LIMIT_INIT(10, 10); + log_odp_key_attributes(&extra_rl, "present but not expected", + extra_attrs, key, key_len); + return EINVAL; + } + + return 0; +} + +/* Converts the 'key_len' bytes of OVS_KEY_ATTR_* attributes in 'key' to a flow + * structure in 'flow'. Returns 0 if successful, otherwise EINVAL. */ +int +odp_flow_key_to_flow(const struct nlattr *key, size_t key_len, + struct flow *flow) +{ + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5); + const struct nlattr *attrs[OVS_KEY_ATTR_MAX + 1]; + uint64_t expected_attrs; + uint64_t present_attrs; + int error; + + memset(flow, 0, sizeof *flow); + + error = parse_flow_nlattrs(key, key_len, attrs, &present_attrs); + if (error) { + return error; + } + + expected_attrs = 0; + + if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_PRIORITY)) { flow->priority = nl_attr_get_u32(attrs[OVS_KEY_ATTR_PRIORITY]); expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_PRIORITY; } - if (attrs[OVS_KEY_ATTR_TUN_ID]) { + if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_TUN_ID)) { flow->tun_id = nl_attr_get_be64(attrs[OVS_KEY_ATTR_TUN_ID]); expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_TUN_ID; } - if (attrs[OVS_KEY_ATTR_IN_PORT]) { + if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_IN_PORT)) { uint32_t in_port = nl_attr_get_u32(attrs[OVS_KEY_ATTR_IN_PORT]); if (in_port >= UINT16_MAX || in_port >= OFPP_MAX) { VLOG_ERR_RL(&rl, "in_port %"PRIu32" out of supported range", @@ -1105,7 +1186,7 @@ odp_flow_key_to_flow(const struct nlattr *key, size_t key_len, flow->in_port = OFPP_NONE; } - if (attrs[OVS_KEY_ATTR_ETHERNET]) { + if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_ETHERNET)) { const struct ovs_key_ethernet *eth_key; eth_key = nl_attr_get(attrs[OVS_KEY_ATTR_ETHERNET]); @@ -1117,25 +1198,30 @@ odp_flow_key_to_flow(const struct nlattr *key, size_t key_len, } expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_ETHERNET; - if (attrs[OVS_KEY_ATTR_8021Q]) { - const struct ovs_key_8021q *q_key; - - q_key = nl_attr_get(attrs[OVS_KEY_ATTR_8021Q]); - if (q_key->q_tpid != htons(ETH_TYPE_VLAN)) { - VLOG_ERR_RL(&rl, "unsupported 802.1Q TPID %"PRIu16" in flow key", - ntohs(q_key->q_tpid)); + if ((present_attrs & ~expected_attrs) + == ((UINT64_C(1) << OVS_KEY_ATTR_ETHERTYPE) | + (UINT64_C(1) << OVS_KEY_ATTR_VLAN) | + (UINT64_C(1) << OVS_KEY_ATTR_ENCAP)) + && (nl_attr_get_be16(attrs[OVS_KEY_ATTR_ETHERTYPE]) + == htons(ETH_TYPE_VLAN))) { + const struct nlattr *encap = attrs[OVS_KEY_ATTR_ENCAP]; + const struct nlattr *vlan = attrs[OVS_KEY_ATTR_VLAN]; + + flow->vlan_tci = nl_attr_get_be16(vlan); + if (flow->vlan_tci & htons(VLAN_CFI)) { return EINVAL; } - if (q_key->q_tci & htons(VLAN_CFI)) { - VLOG_ERR_RL(&rl, "invalid 802.1Q TCI %"PRIu16" in flow key", - ntohs(q_key->q_tci)); - return EINVAL; + flow->vlan_tci |= htons(VLAN_CFI); + + error = parse_flow_nlattrs(nl_attr_get(encap), nl_attr_get_size(encap), + attrs, &present_attrs); + if (error) { + return error; } - flow->vlan_tci = q_key->q_tci | htons(VLAN_CFI); - expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_8021Q; + expected_attrs = 0; } - if (attrs[OVS_KEY_ATTR_ETHERTYPE]) { + if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_ETHERTYPE)) { flow->dl_type = nl_attr_get_be16(attrs[OVS_KEY_ATTR_ETHERTYPE]); if (ntohs(flow->dl_type) < 1536) { VLOG_ERR_RL(&rl, "invalid Ethertype %"PRIu16" in flow key", @@ -1149,7 +1235,7 @@ odp_flow_key_to_flow(const struct nlattr *key, size_t key_len, if (flow->dl_type == htons(ETH_TYPE_IP)) { expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_IPV4; - if (attrs[OVS_KEY_ATTR_IPV4]) { + if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_IPV4)) { const struct ovs_key_ipv4 *ipv4_key; ipv4_key = nl_attr_get(attrs[OVS_KEY_ATTR_IPV4]); @@ -1164,7 +1250,7 @@ odp_flow_key_to_flow(const struct nlattr *key, size_t key_len, } } else if (flow->dl_type == htons(ETH_TYPE_IPV6)) { expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_IPV6; - if (attrs[OVS_KEY_ATTR_IPV6]) { + if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_IPV6)) { const struct ovs_key_ipv6 *ipv6_key; ipv6_key = nl_attr_get(attrs[OVS_KEY_ATTR_IPV6]); @@ -1180,7 +1266,7 @@ odp_flow_key_to_flow(const struct nlattr *key, size_t key_len, } } else if (flow->dl_type == htons(ETH_TYPE_ARP)) { expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_ARP; - if (attrs[OVS_KEY_ATTR_ARP]) { + if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_ARP)) { const struct ovs_key_arp *arp_key; arp_key = nl_attr_get(attrs[OVS_KEY_ATTR_ARP]); @@ -1202,7 +1288,7 @@ odp_flow_key_to_flow(const struct nlattr *key, size_t key_len, flow->dl_type == htons(ETH_TYPE_IPV6)) && !(flow->nw_frag & FLOW_NW_FRAG_LATER)) { expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_TCP; - if (attrs[OVS_KEY_ATTR_TCP]) { + if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_TCP)) { const struct ovs_key_tcp *tcp_key; tcp_key = nl_attr_get(attrs[OVS_KEY_ATTR_TCP]); @@ -1214,7 +1300,7 @@ odp_flow_key_to_flow(const struct nlattr *key, size_t key_len, flow->dl_type == htons(ETH_TYPE_IPV6)) && !(flow->nw_frag & FLOW_NW_FRAG_LATER)) { expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_UDP; - if (attrs[OVS_KEY_ATTR_UDP]) { + if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_UDP)) { const struct ovs_key_udp *udp_key; udp_key = nl_attr_get(attrs[OVS_KEY_ATTR_UDP]); @@ -1225,7 +1311,7 @@ odp_flow_key_to_flow(const struct nlattr *key, size_t key_len, && flow->dl_type == htons(ETH_TYPE_IP) && !(flow->nw_frag & FLOW_NW_FRAG_LATER)) { expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_ICMP; - if (attrs[OVS_KEY_ATTR_ICMP]) { + if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_ICMP)) { const struct ovs_key_icmp *icmp_key; icmp_key = nl_attr_get(attrs[OVS_KEY_ATTR_ICMP]); @@ -1236,7 +1322,7 @@ odp_flow_key_to_flow(const struct nlattr *key, size_t key_len, && flow->dl_type == htons(ETH_TYPE_IPV6) && !(flow->nw_frag & FLOW_NW_FRAG_LATER)) { expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_ICMPV6; - if (attrs[OVS_KEY_ATTR_ICMPV6]) { + if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_ICMPV6)) { const struct ovs_key_icmpv6 *icmpv6_key; icmpv6_key = nl_attr_get(attrs[OVS_KEY_ATTR_ICMPV6]); @@ -1246,7 +1332,7 @@ odp_flow_key_to_flow(const struct nlattr *key, size_t key_len, if (flow->tp_src == htons(ND_NEIGHBOR_SOLICIT) || flow->tp_src == htons(ND_NEIGHBOR_ADVERT)) { expected_attrs |= UINT64_C(1) << OVS_KEY_ATTR_ND; - if (attrs[OVS_KEY_ATTR_ND]) { + if (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_ND)) { const struct ovs_key_nd *nd_key; nd_key = nl_attr_get(attrs[OVS_KEY_ATTR_ND]); @@ -1259,21 +1345,5 @@ odp_flow_key_to_flow(const struct nlattr *key, size_t key_len, } } - missing_attrs = expected_attrs & ~present_attrs; - if (missing_attrs) { - static struct vlog_rate_limit miss_rl = VLOG_RATE_LIMIT_INIT(10, 10); - log_odp_key_attributes(&miss_rl, "expected but not present", - missing_attrs, key, key_len); - return EINVAL; - } - - extra_attrs = present_attrs & ~expected_attrs; - if (extra_attrs) { - static struct vlog_rate_limit extra_rl = VLOG_RATE_LIMIT_INIT(10, 10); - log_odp_key_attributes(&extra_rl, "present but not expected", - extra_attrs, key, key_len); - return EINVAL; - } - - return 0; + return check_expectations(present_attrs, expected_attrs, key, key_len); } diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c index b2961546..9356800c 100644 --- a/ofproto/ofproto-dpif.c +++ b/ofproto/ofproto-dpif.c @@ -2885,10 +2885,10 @@ facet_account(struct ofproto_dpif *ofproto, struct facet *facet) * hash bucket.) */ vlan_tci = facet->flow.vlan_tci; NL_ATTR_FOR_EACH_UNSAFE (a, left, facet->actions, facet->actions_len) { + const struct ovs_action_push_vlan *vlan; struct ofport_dpif *port; switch (nl_attr_type(a)) { - const struct nlattr *nested; case OVS_ACTION_ATTR_OUTPUT: port = get_odp_port(ofproto, nl_attr_get_u32(a)); if (port && port->bundle && port->bundle->bond) { @@ -2897,20 +2897,13 @@ facet_account(struct ofproto_dpif *ofproto, struct facet *facet) } break; - case OVS_ACTION_ATTR_POP: - if (nl_attr_get_u16(a) == OVS_KEY_ATTR_8021Q) { - vlan_tci = htons(0); - } + case OVS_ACTION_ATTR_POP_VLAN: + vlan_tci = htons(0); break; - case OVS_ACTION_ATTR_PUSH: - nested = nl_attr_get(a); - if (nl_attr_type(nested) == OVS_KEY_ATTR_8021Q) { - const struct ovs_key_8021q *q_key; - - q_key = nl_attr_get_unspec(nested, sizeof(*q_key)); - vlan_tci = q_key->q_tci; - } + case OVS_ACTION_ATTR_PUSH_VLAN: + vlan = nl_attr_get(a); + vlan_tci = vlan->vlan_tci; break; } } @@ -3586,13 +3579,10 @@ fix_sflow_action(struct action_xlate_ctx *ctx) } static void -commit_action__(struct ofpbuf *odp_actions, - enum ovs_action_attr act_type, - enum ovs_key_attr key_type, - const void *key, size_t key_size) +commit_set_action(struct ofpbuf *odp_actions, enum ovs_key_attr key_type, + const void *key, size_t key_size) { - size_t offset = nl_msg_start_nested(odp_actions, act_type); - + size_t offset = nl_msg_start_nested(odp_actions, OVS_ACTION_ATTR_SET); nl_msg_put_unspec(odp_actions, key_type, key, key_size); nl_msg_end_nested(odp_actions, offset); } @@ -3606,8 +3596,8 @@ commit_set_tun_id_action(const struct flow *flow, struct flow *base, } base->tun_id = flow->tun_id; - commit_action__(odp_actions, OVS_ACTION_ATTR_SET, - OVS_KEY_ATTR_TUN_ID, &base->tun_id, sizeof(base->tun_id)); + commit_set_action(odp_actions, OVS_KEY_ATTR_TUN_ID, + &base->tun_id, sizeof(base->tun_id)); } static void @@ -3627,8 +3617,8 @@ commit_set_ether_addr_action(const struct flow *flow, struct flow *base, memcpy(eth_key.eth_src, base->dl_src, ETH_ADDR_LEN); memcpy(eth_key.eth_dst, base->dl_dst, ETH_ADDR_LEN); - commit_action__(odp_actions, OVS_ACTION_ATTR_SET, - OVS_KEY_ATTR_ETHERNET, ð_key, sizeof(eth_key)); + commit_set_action(odp_actions, OVS_KEY_ATTR_ETHERNET, + ð_key, sizeof(eth_key)); } static void @@ -3641,18 +3631,16 @@ commit_vlan_action(struct action_xlate_ctx *ctx, ovs_be16 new_tci) } if (base->vlan_tci & htons(VLAN_CFI)) { - nl_msg_put_u16(ctx->odp_actions, OVS_ACTION_ATTR_POP, - OVS_KEY_ATTR_8021Q); + nl_msg_put_flag(ctx->odp_actions, OVS_ACTION_ATTR_POP_VLAN); } if (new_tci & htons(VLAN_CFI)) { - struct ovs_key_8021q q_key; - - q_key.q_tpid = htons(ETH_TYPE_VLAN); - q_key.q_tci = new_tci & ~htons(VLAN_CFI); + struct ovs_action_push_vlan vlan; - commit_action__(ctx->odp_actions, OVS_ACTION_ATTR_PUSH, - OVS_KEY_ATTR_8021Q, &q_key, sizeof(q_key)); + vlan.vlan_tpid = htons(ETH_TYPE_VLAN); + vlan.vlan_tci = new_tci & ~htons(VLAN_CFI); + nl_msg_put_unspec(ctx->odp_actions, OVS_ACTION_ATTR_PUSH_VLAN, + &vlan, sizeof vlan); } base->vlan_tci = new_tci; } @@ -3685,8 +3673,8 @@ commit_set_nw_action(const struct flow *flow, struct flow *base, : base->nw_frag == FLOW_NW_FRAG_ANY ? OVS_FRAG_TYPE_FIRST : OVS_FRAG_TYPE_LATER); - commit_action__(odp_actions, OVS_ACTION_ATTR_SET, - OVS_KEY_ATTR_IPV4, &ipv4_key, sizeof(ipv4_key)); + commit_set_action(odp_actions, OVS_KEY_ATTR_IPV4, + &ipv4_key, sizeof(ipv4_key)); } static void @@ -3708,8 +3696,8 @@ commit_set_port_action(const struct flow *flow, struct flow *base, port_key.tcp_src = base->tp_src = flow->tp_src; port_key.tcp_dst = base->tp_dst = flow->tp_dst; - commit_action__(odp_actions, OVS_ACTION_ATTR_SET, - OVS_KEY_ATTR_TCP, &port_key, sizeof(port_key)); + commit_set_action(odp_actions, OVS_KEY_ATTR_TCP, + &port_key, sizeof(port_key)); } else if (flow->nw_proto == IPPROTO_UDP) { struct ovs_key_udp port_key; @@ -3717,8 +3705,8 @@ commit_set_port_action(const struct flow *flow, struct flow *base, port_key.udp_src = base->tp_src = flow->tp_src; port_key.udp_dst = base->tp_dst = flow->tp_dst; - commit_action__(odp_actions, OVS_ACTION_ATTR_SET, - OVS_KEY_ATTR_UDP, &port_key, sizeof(port_key)); + commit_set_action(odp_actions, OVS_KEY_ATTR_UDP, + &port_key, sizeof(port_key)); } } @@ -3731,9 +3719,8 @@ commit_set_priority_action(const struct flow *flow, struct flow *base, } base->priority = flow->priority; - commit_action__(odp_actions, OVS_ACTION_ATTR_SET, - OVS_KEY_ATTR_PRIORITY, &base->priority, - sizeof(base->priority)); + commit_set_action(odp_actions, OVS_KEY_ATTR_PRIORITY, + &base->priority, sizeof(base->priority)); } static void diff --git a/tests/odp.at b/tests/odp.at index e4c7425e..90a11920 100644 --- a/tests/odp.at +++ b/tests/odp.at @@ -34,7 +34,8 @@ in_port(1),eth(src=00:01:02:03:04:05,dst=10:11:12:13:14:15),eth_type(0x0806),arp echo echo '# Valid forms with VLAN header.' - sed 's/eth([[^)]]*)/&,vlan(vid=99,pcp=7)/' odp-base.txt + sed 's/\(eth([[^)]]*)\),*/\1,eth_type(0x8100),vlan(vid=99,pcp=7),encap(/ +s/$/)/' odp-base.txt echo echo '# Valid forms with QoS priority.' @@ -43,7 +44,8 @@ in_port(1),eth(src=00:01:02:03:04:05,dst=10:11:12:13:14:15),eth_type(0x0806),arp echo echo '# Valid forms with tun_id and VLAN headers.' sed 's/^/tun_id(0xfedcba9876543210),/ -s/eth([[^)]]*)/&,vlan(vid=99,pcp=7)/' odp-base.txt +s/\(eth([[^)]]*)\),*/\1,eth_type(0x8100),vlan(vid=99,pcp=7),encap(/ +s/$/)/' odp-base.txt echo echo '# Valid forms with IP first fragment.' diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at index e6c2f922..f3f1738f 100644 --- a/tests/ofproto-dpif.at +++ b/tests/ofproto-dpif.at @@ -125,84 +125,84 @@ for tuple in \ "br0 none 0 drop" \ "br0 0 0 drop" \ "br0 0 1 drop" \ - "br0 10 0 p1,p5,p6,p7,p8,pop(vlan),p2" \ - "br0 10 1 p1,p5,p6,p7,p8,pop(vlan),push(vlan(vid=0,pcp=1)),p2" \ + "br0 10 0 p1,p5,p6,p7,p8,pop_vlan,p2" \ + "br0 10 1 p1,p5,p6,p7,p8,pop_vlan,push_vlan(vid=0,pcp=1),p2" \ "br0 11 0 p5,p7" \ "br0 11 1 p5,p7" \ - "br0 12 0 p1,p5,p6,pop(vlan),p3,p4,p7,p8" \ - "br0 12 1 p1,p5,p6,pop(vlan),push(vlan(vid=0,pcp=1)),p3,p4,p7,p8" \ + "br0 12 0 p1,p5,p6,pop_vlan,p3,p4,p7,p8" \ + "br0 12 1 p1,p5,p6,pop_vlan,push_vlan(vid=0,pcp=1),p3,p4,p7,p8" \ "p1 none 0 drop" \ "p1 0 0 drop" \ "p1 0 1 drop" \ - "p1 10 0 br0,p5,p6,p7,p8,pop(vlan),p2" \ - "p1 10 1 br0,p5,p6,p7,p8,pop(vlan),push(vlan(vid=0,pcp=1)),p2" \ + "p1 10 0 br0,p5,p6,p7,p8,pop_vlan,p2" \ + "p1 10 1 br0,p5,p6,p7,p8,pop_vlan,push_vlan(vid=0,pcp=1),p2" \ "p1 11 0 drop" \ "p1 11 1 drop" \ - "p1 12 0 br0,p5,p6,pop(vlan),p3,p4,p7,p8" \ - "p1 12 1 br0,p5,p6,pop(vlan),push(vlan(vid=0,pcp=1)),p3,p4,p7,p8" \ - "p2 none 0 push(vlan(vid=10,pcp=0)),br0,p1,p5,p6,p7,p8" \ - "p2 0 0 pop(vlan),push(vlan(vid=10,pcp=0)),br0,p1,p5,p6,p7,p8" \ - "p2 0 1 pop(vlan),push(vlan(vid=10,pcp=1)),br0,p1,p5,p6,p7,p8" \ + "p1 12 0 br0,p5,p6,pop_vlan,p3,p4,p7,p8" \ + "p1 12 1 br0,p5,p6,pop_vlan,push_vlan(vid=0,pcp=1),p3,p4,p7,p8" \ + "p2 none 0 push_vlan(vid=10,pcp=0),br0,p1,p5,p6,p7,p8" \ + "p2 0 0 pop_vlan,push_vlan(vid=10,pcp=0),br0,p1,p5,p6,p7,p8" \ + "p2 0 1 pop_vlan,push_vlan(vid=10,pcp=1),br0,p1,p5,p6,p7,p8" \ "p2 10 0 drop" \ "p2 10 1 drop" \ "p2 11 0 drop" \ "p2 11 1 drop" \ "p2 12 0 drop" \ "p2 12 1 drop" \ - "p3 none 0 p4,p7,p8,push(vlan(vid=12,pcp=0)),br0,p1,p5,p6" \ - "p3 0 0 p4,p7,p8,pop(vlan),push(vlan(vid=12,pcp=0)),br0,p1,p5,p6" \ - "p3 0 1 p4,p7,p8,pop(vlan),push(vlan(vid=12,pcp=1)),br0,p1,p5,p6" \ + "p3 none 0 p4,p7,p8,push_vlan(vid=12,pcp=0),br0,p1,p5,p6" \ + "p3 0 0 p4,p7,p8,pop_vlan,push_vlan(vid=12,pcp=0),br0,p1,p5,p6" \ + "p3 0 1 p4,p7,p8,pop_vlan,push_vlan(vid=12,pcp=1),br0,p1,p5,p6" \ "p3 10 0 drop" \ "p3 10 1 drop" \ "p3 11 0 drop" \ "p3 11 1 drop" \ "p3 12 0 drop" \ "p3 12 1 drop" \ - "p4 none 0 p3,p7,p8,push(vlan(vid=12,pcp=0)),br0,p1,p5,p6" \ - "p4 0 0 p3,p7,p8,pop(vlan),push(vlan(vid=12,pcp=0)),br0,p1,p5,p6" \ - "p4 0 1 p3,p7,p8,pop(vlan),push(vlan(vid=12,pcp=1)),br0,p1,p5,p6" \ + "p4 none 0 p3,p7,p8,push_vlan(vid=12,pcp=0),br0,p1,p5,p6" \ + "p4 0 0 p3,p7,p8,pop_vlan,push_vlan(vid=12,pcp=0),br0,p1,p5,p6" \ + "p4 0 1 p3,p7,p8,pop_vlan,push_vlan(vid=12,pcp=1),br0,p1,p5,p6" \ "p4 10 0 drop" \ "p4 10 1 drop" \ "p4 11 0 drop" \ "p4 11 1 drop" \ "p4 12 0 drop" \ "p4 12 1 drop" \ - "p5 none 0 p2,push(vlan(vid=10,pcp=0)),br0,p1,p6,p7,p8" \ - "p5 0 0 p2,pop(vlan),push(vlan(vid=10,pcp=0)),br0,p1,p6,p7,p8" \ - "p5 0 1 p2,pop(vlan),push(vlan(vid=10,pcp=1)),br0,p1,p6,p7,p8" \ - "p5 10 0 br0,p1,p6,p7,p8,pop(vlan),p2" \ - "p5 10 1 br0,p1,p6,p7,p8,pop(vlan),push(vlan(vid=0,pcp=1)),p2" \ + "p5 none 0 p2,push_vlan(vid=10,pcp=0),br0,p1,p6,p7,p8" \ + "p5 0 0 p2,pop_vlan,push_vlan(vid=10,pcp=0),br0,p1,p6,p7,p8" \ + "p5 0 1 p2,pop_vlan,push_vlan(vid=10,pcp=1),br0,p1,p6,p7,p8" \ + "p5 10 0 br0,p1,p6,p7,p8,pop_vlan,p2" \ + "p5 10 1 br0,p1,p6,p7,p8,pop_vlan,push_vlan(vid=0,pcp=1),p2" \ "p5 11 0 br0,p7" \ "p5 11 1 br0,p7" \ - "p5 12 0 br0,p1,p6,pop(vlan),p3,p4,p7,p8" \ - "p5 12 1 br0,p1,p6,pop(vlan),push(vlan(vid=0,pcp=1)),p3,p4,p7,p8" \ - "p6 none 0 p2,push(vlan(vid=10,pcp=0)),br0,p1,p5,p7,p8" \ - "p6 0 0 p2,pop(vlan),push(vlan(vid=10,pcp=0)),br0,p1,p5,p7,p8" \ - "p6 0 1 p2,pop(vlan),push(vlan(vid=10,pcp=1)),br0,p1,p5,p7,p8" \ - "p6 10 0 br0,p1,p5,p7,p8,pop(vlan),p2" \ - "p6 10 1 br0,p1,p5,p7,p8,pop(vlan),push(vlan(vid=0,pcp=1)),p2" \ + "p5 12 0 br0,p1,p6,pop_vlan,p3,p4,p7,p8" \ + "p5 12 1 br0,p1,p6,pop_vlan,push_vlan(vid=0,pcp=1),p3,p4,p7,p8" \ + "p6 none 0 p2,push_vlan(vid=10,pcp=0),br0,p1,p5,p7,p8" \ + "p6 0 0 p2,pop_vlan,push_vlan(vid=10,pcp=0),br0,p1,p5,p7,p8" \ + "p6 0 1 p2,pop_vlan,push_vlan(vid=10,pcp=1),br0,p1,p5,p7,p8" \ + "p6 10 0 br0,p1,p5,p7,p8,pop_vlan,p2" \ + "p6 10 1 br0,p1,p5,p7,p8,pop_vlan,push_vlan(vid=0,pcp=1),p2" \ "p6 11 0 drop" \ "p6 11 1 drop" \ - "p6 12 0 br0,p1,p5,pop(vlan),p3,p4,p7,p8" \ - "p6 12 1 br0,p1,p5,pop(vlan),push(vlan(vid=0,pcp=1)),p3,p4,p7,p8" \ - "p7 none 0 p3,p4,p8,push(vlan(vid=12,pcp=0)),br0,p1,p5,p6" \ - "p7 0 0 p3,p4,p8,pop(vlan),push(vlan(vid=12,pcp=0)),br0,p1,p5,p6" \ - "p7 0 1 p3,p4,p8,pop(vlan),push(vlan(vid=12,pcp=1)),br0,p1,p5,p6" \ - "p7 10 0 br0,p1,p5,p6,p8,pop(vlan),p2" \ - "p7 10 1 br0,p1,p5,p6,p8,pop(vlan),push(vlan(vid=0,pcp=1)),p2" \ + "p6 12 0 br0,p1,p5,pop_vlan,p3,p4,p7,p8" \ + "p6 12 1 br0,p1,p5,pop_vlan,push_vlan(vid=0,pcp=1),p3,p4,p7,p8" \ + "p7 none 0 p3,p4,p8,push_vlan(vid=12,pcp=0),br0,p1,p5,p6" \ + "p7 0 0 p3,p4,p8,pop_vlan,push_vlan(vid=12,pcp=0),br0,p1,p5,p6" \ + "p7 0 1 p3,p4,p8,pop_vlan,push_vlan(vid=12,pcp=1),br0,p1,p5,p6" \ + "p7 10 0 br0,p1,p5,p6,p8,pop_vlan,p2" \ + "p7 10 1 br0,p1,p5,p6,p8,pop_vlan,push_vlan(vid=0,pcp=1),p2" \ "p7 11 0 br0,p5" \ "p7 11 1 br0,p5" \ - "p7 12 0 br0,p1,p5,p6,pop(vlan),p3,p4,p8" \ - "p7 12 1 br0,p1,p5,p6,pop(vlan),push(vlan(vid=0,pcp=1)),p3,p4,p8" \ - "p8 none 0 p3,p4,p7,push(vlan(vid=12,pcp=0)),br0,p1,p5,p6" \ - "p8 0 0 p3,p4,p7,pop(vlan),push(vlan(vid=12,pcp=0)),br0,p1,p5,p6" \ - "p8 0 1 p3,p4,p7,pop(vlan),push(vlan(vid=12,pcp=1)),br0,p1,p5,p6" \ - "p8 10 0 br0,p1,p5,p6,p7,pop(vlan),p2" \ - "p8 10 1 br0,p1,p5,p6,p7,pop(vlan),push(vlan(vid=0,pcp=1)),p2" \ + "p7 12 0 br0,p1,p5,p6,pop_vlan,p3,p4,p8" \ + "p7 12 1 br0,p1,p5,p6,pop_vlan,push_vlan(vid=0,pcp=1),p3,p4,p8" \ + "p8 none 0 p3,p4,p7,push_vlan(vid=12,pcp=0),br0,p1,p5,p6" \ + "p8 0 0 p3,p4,p7,pop_vlan,push_vlan(vid=12,pcp=0),br0,p1,p5,p6" \ + "p8 0 1 p3,p4,p7,pop_vlan,push_vlan(vid=12,pcp=1),br0,p1,p5,p6" \ + "p8 10 0 br0,p1,p5,p6,p7,pop_vlan,p2" \ + "p8 10 1 br0,p1,p5,p6,p7,pop_vlan,push_vlan(vid=0,pcp=1),p2" \ "p8 11 0 drop" \ "p8 11 1 drop" \ - "p8 12 0 br0,p1,p5,p6,pop(vlan),p3,p4,p7" \ - "p8 12 1 br0,p1,p5,p6,pop(vlan),push(vlan(vid=0,pcp=1)),p3,p4,p7" + "p8 12 0 br0,p1,p5,p6,pop_vlan,p3,p4,p7" \ + "p8 12 1 br0,p1,p5,p6,pop_vlan,push_vlan(vid=0,pcp=1),p3,p4,p7" do set $tuple in_port=$1 @@ -214,7 +214,7 @@ do if test $vlan = none; then flow="in_port($n_in_port),eth(src=50:54:00:00:00:01,dst=ff:ff:ff:ff:ff:ff),eth_type(0xabcd)" else - flow="in_port($n_in_port),eth(src=50:54:00:00:00:01,dst=ff:ff:ff:ff:ff:ff),vlan(vid=$vlan,pcp=$pcp),eth_type(0xabcd)" + flow="in_port($n_in_port),eth(src=50:54:00:00:00:01,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8100),vlan(vid=$vlan,pcp=$pcp),encap(eth_type(0xabcd))" fi AT_CHECK([ovs-appctl ofproto/trace br0 "$flow"], [0], [stdout])