
NAME

       ovn-architecture - Open Virtual Network architecture

DESCRIPTION

       OVN,  the  Open Virtual Network, is a system to support virtual network abstraction.  OVN complements the
       existing capabilities of OVS to add native support for virtual network abstractions, such as  virtual  L2
       and  L3 overlays and security groups.  Services such as DHCP are also desirable features.  Just like OVS,
       OVN’s design goal is to have a production-quality implementation that can operate at significant scale.

       An OVN deployment consists of several components:

              •      A Cloud Management System (CMS),  which  is  OVN’s  ultimate  client  (via  its  users  and
                     administrators).   OVN  integration  requires  installing a CMS-specific plugin and related
                     software (see below).  OVN initially targets OpenStack as CMS.

                     We generally speak of ``the’’ CMS, but one can imagine scenarios in  which  multiple  CMSes
                     manage different parts of an OVN deployment.

              •      An  OVN  Database physical or virtual node (or, eventually, cluster) installed in a central
                     location.

              •      One or more (usually many) hypervisors.  Hypervisors must run Open  vSwitch  and  implement
                     the  interface  described  in  IntegrationGuide.md  in the OVS source tree.  Any hypervisor
                     platform supported by Open vSwitch is acceptable.

              •      Zero or more gateways.  A gateway extends a tunnel-based logical network  into  a  physical
                     network by bidirectionally forwarding packets between tunnels and a physical Ethernet port.
                     This  allows non-virtualized machines to participate in logical networks.  A gateway may be
                     a physical host, a virtual machine, or an ASIC-based  hardware  switch  that  supports  the
                      vtep(5) schema.  (Support for the latter will come later in the OVN implementation.)

                      Hypervisors and gateways are together called transport nodes or chassis.

       The  diagram  below shows how the major components of OVN and related software interact.  Starting at the
       top of the diagram, we have:

              •      The Cloud Management System, as defined above.

              •      The OVN/CMS Plugin is the component of the CMS that interfaces to OVN.  In OpenStack,  this
                     is a Neutron plugin.  The plugin’s main purpose is to translate the CMS’s notion of logical
                     network configuration, stored in the CMS’s configuration database in a CMS-specific format,
                     into an intermediate representation understood by OVN.

                     This  component is necessarily CMS-specific, so a new plugin needs to be developed for each
                     CMS that is integrated with OVN.  All of the components below this one in the  diagram  are
                     CMS-independent.

              •      The  OVN  Northbound  Database  receives the intermediate representation of logical network
                     configuration passed down by the OVN/CMS Plugin.   The  database  schema  is  meant  to  be
                     ``impedance matched’’ with the concepts used in a CMS, so that it directly supports notions
                     of logical switches, routers, ACLs, and so on.  See ovn-nb(5) for details.

                     The  OVN  Northbound  Database  has  only  two  clients:  the  OVN/CMS  Plugin above it and
                     ovn-northd below it.

              •      ovn-northd(8) connects to the OVN Northbound Database  above  it  and  the  OVN  Southbound
                      Database below it.  It translates the logical network configuration, expressed in terms of
                      conventional network concepts and taken from the OVN Northbound Database, into logical
                      datapath flows in the OVN Southbound Database below it.

              •      The  OVN  Southbound  Database  is the center of the system.  Its clients are ovn-northd(8)
                     above it and ovn-controller(8) on every transport node below it.

                     The OVN Southbound Database contains three kinds of data: Physical Network (PN) tables that
                     specify how to reach hypervisor and other nodes, Logical Network (LN) tables that  describe
                     the  logical  network  in terms of ``logical datapath flows,’’ and Binding tables that link
                     logical network components’ locations to the physical network.   The  hypervisors  populate
                     the PN and Port_Binding tables, whereas ovn-northd(8) populates the LN tables.

                     OVN  Southbound  Database  performance must scale with the number of transport nodes.  This
                     will likely require some work on ovsdb-server(1) as we encounter  bottlenecks.   Clustering
                     for availability may be needed.

       The remaining components are replicated onto each hypervisor:

               •      ovn-controller(8) is OVN’s agent on each hypervisor and software gateway.  Northbound, it
                      connects to the OVN Southbound Database to learn about OVN configuration and status and to
                      populate the PN table and the chassis column in the Binding table with the hypervisor’s
                      status.  Southbound, it connects to ovs-vswitchd(8) as an OpenFlow controller, for control
                      over network traffic, and to the local ovsdb-server(1) to allow it to monitor and control
                      Open vSwitch configuration.  (An example of the configuration that points ovn-controller
                      at the OVN Southbound Database follows the diagram below.)

              •      ovs-vswitchd(8) and ovsdb-server(1) are conventional components of Open vSwitch.

                                         CMS
                                          |
                                          |
                              +-----------|-----------+
                              |           |           |
                              |     OVN/CMS Plugin    |
                              |           |           |
                              |           |           |
                              |   OVN Northbound DB   |
                              |           |           |
                              |           |           |
                              |       ovn-northd      |
                              |           |           |
                              +-----------|-----------+
                                          |
                                          |
                                +-------------------+
                                | OVN Southbound DB |
                                +-------------------+
                                          |
                                          |
                       +------------------+------------------+
                       |                  |                  |
         HV 1          |                  |    HV n          |
       +---------------|---------------+  .  +---------------|---------------+
       |               |               |  .  |               |               |
       |        ovn-controller         |  .  |        ovn-controller         |
       |         |          |          |  .  |         |          |          |
       |         |          |          |     |         |          |          |
       |  ovs-vswitchd   ovsdb-server  |     |  ovs-vswitchd   ovsdb-server  |
       |                               |     |                               |
       +-------------------------------+     +-------------------------------+
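
        On each chassis, ovn-controller learns how to reach the OVN Southbound Database, and which encapsulation
        to use for tunnels, from external_ids keys in the local Open vSwitch database, as described in
        ovn-controller(8).  A minimal sketch follows; the database address, port, and tunnel endpoint IP are
        placeholders:

               # Point ovn-controller at the Southbound database and choose the
               # encapsulation to use for tunnels to other chassis.
               ovs-vsctl set Open_vSwitch . \
                   external_ids:ovn-remote=tcp:192.0.2.10:6642 \
                   external_ids:ovn-encap-type=geneve \
                   external_ids:ovn-encap-ip=192.0.2.21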

   Chassis Setup
       Each chassis in an OVN deployment must be configured with an Open vSwitch bridge dedicated for OVN’s use,
       called  the  integration  bridge.   System  startup  scripts  may  create  this  bridge prior to starting
       ovn-controller if desired.  If this bridge does not exist when ovn-controller starts, it will be  created
       automatically  with  the  default  configuration  suggested  below.   The ports on the integration bridge
       include:

              •      On any chassis, tunnel ports that  OVN  uses  to  maintain  logical  network  connectivity.
                     ovn-controller adds, updates, and removes these tunnel ports.

              •      On  a  hypervisor,  any  VIFs  that are to be attached to logical networks.  The hypervisor
                     itself,  or  the  integration  between  Open  vSwitch  and  the  hypervisor  (described  in
                     IntegrationGuide.md)  takes  care of this.  (This is not part of OVN or new to OVN; this is
                     pre-existing integration work that has already been done on hypervisors that support OVS.)

              •      On a gateway, the physical port used for  logical  network  connectivity.   System  startup
                     scripts  add this port to the bridge prior to starting ovn-controller.  This can be a patch
                     port to another bridge, instead of a physical port, in more sophisticated setups.

       Other ports should not be attached to the integration bridge.  In particular, physical ports attached  to
       the underlay network (as opposed to gateway ports, which are physical ports attached to logical networks)
       must  not be attached to the integration bridge.  Underlay physical ports should instead be attached to a
       separate Open vSwitch bridge (they need not be attached to any bridge at all, in fact).

       The integration bridge should be configured as described below.  The effect of each of these settings  is
       documented in ovs-vswitchd.conf.db(5):

              fail-mode=secure
                     Avoids switching packets between isolated logical networks before ovn-controller starts up.
                     See Controller Failure Settings in ovs-vsctl(8) for more information.

              other-config:disable-in-band=true
                     Suppresses  in-band control flows for the integration bridge.  It would be unusual for such
                     flows to show up anyway, because OVN uses a local controller (over a  Unix  domain  socket)
                     instead  of a remote controller.  It’s possible, however, for some other bridge in the same
                     system to have an in-band remote controller, and in that case  this  suppresses  the  flows
                     that  in-band  control  would ordinarily set up.  See In-Band Control in DESIGN.md for more
                     information.

       The customary name for the integration bridge is br-int, but another name may be used.
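
        For example, the following ovs-vsctl(8) invocation creates an integration bridge with the settings
        suggested above; if the bridge is absent when ovn-controller starts, it creates one with an equivalent
        default configuration:

               ovs-vsctl --may-exist add-br br-int -- \
                   set-fail-mode br-int secure -- \
                   set Bridge br-int other_config:disable-in-band=true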

   Logical Networks
        Logical networks implement the same concepts as physical networks, but they are insulated from the
        physical network with tunnels or other encapsulations.  This allows logical networks to have separate IP
        and other address spaces that overlap with those used for physical networks without conflicting with
        them.  Logical network topologies can be arranged without regard for the topologies of the physical
        networks on which they run.

       Logical network concepts in OVN include:

              •      Logical switches, the logical version of Ethernet switches.

              •      Logical routers, the logical version of IP routers.  Logical switches and  routers  can  be
                     connected into sophisticated topologies.

               •      Logical datapaths, the logical version of an OpenFlow switch.  Logical switches and
                      routers are both implemented as logical datapaths.
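
        For example, an administrator (or a CMS plugin, through equivalent database operations) could create a
        minimal logical switch with two logical ports using ovn-nbctl(8).  The names and MAC addresses below are
        placeholders, and the command names are those documented in ovn-nbctl(8) for this release:

               ovn-nbctl lswitch-add sw0
               ovn-nbctl lport-add sw0 sw0-port1
               ovn-nbctl lport-set-addresses sw0-port1 00:00:00:00:00:01
               ovn-nbctl lport-add sw0 sw0-port2
               ovn-nbctl lport-set-addresses sw0-port2 00:00:00:00:00:02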

   Life Cycle of a VIF
       Tables and their schemas presented in isolation are difficult to understand.  Here’s an example.

        A VIF on a hypervisor is a virtual network interface attached either to a VM or to a container running
        directly on that hypervisor (this is different from the interface of a container running inside a VM).

        The steps in this example often refer to details of the OVN Southbound and OVN Northbound database
        schemas.  Please see ovn-sb(5) and ovn-nb(5), respectively, for the full story on these databases.

              1.
                 A VIF’s life cycle begins when a CMS administrator creates a new VIF using the CMS user
                 interface or API and adds it to a switch (one implemented by OVN as a logical switch).  The CMS
                 updates its own configuration.  This includes associating a unique, persistent identifier
                 vif-id and an Ethernet address mac with the VIF.

              2.
                 The CMS plugin updates the OVN Northbound database to include the new VIF, by adding a row to
                 the Logical_Port table.  In the new row, name is vif-id, mac is mac, switch points to the OVN
                 logical switch’s Logical_Switch record, and other columns are initialized appropriately.  (A
                 rough command-line sketch of this and later steps appears after this list.)

              3.
                ovn-northd  receives  the  OVN  Northbound database update.  In turn, it makes the corresponding
                updates to the  OVN  Southbound  database,  by  adding  rows  to  the  OVN  Southbound  database
                Logical_Flow  table  to reflect the new port, e.g. add a flow to recognize that packets destined
                to the new port’s MAC address should be delivered to it,  and  update  the  flow  that  delivers
                broadcast  and  multicast  packets  to  include  the  new port.  It also creates a record in the
                Binding table and populates all its columns except the column that identifies the chassis.

              4.
                On every hypervisor, ovn-controller receives the Logical_Flow table updates that ovn-northd made
                in the previous step.  As long as the VM that owns the VIF is powered off, ovn-controller cannot
                do much; it cannot, for example, arrange to send packets to or receive  packets  from  the  VIF,
                because the VIF does not actually exist anywhere.

              5.
                Eventually,  a  user  powers  on  the  VM  that owns the VIF.  On the hypervisor where the VM is
                powered  on,  the  integration  between  the  hypervisor  and   Open   vSwitch   (described   in
                IntegrationGuide.md)  adds  the  VIF  to  the  OVN  integration  bridge  and  stores  vif-id  in
                external-ids:iface-id to indicate that the interface is an instantiation of the new VIF.   (None
                of  this code is new in OVN; this is pre-existing integration work that has already been done on
                hypervisors that support OVS.)

              6.
                On the hypervisor where the VM is powered on, ovn-controller  notices  external-ids:iface-id  in
                the  new  Interface.   In  response,  it  updates the local hypervisor’s OpenFlow tables so that
                packets to and from the VIF are properly handled.  Afterward,  in  the  OVN  Southbound  DB,  it
                updates  the  Binding  table’s  chassis  column  for  the  row  that links the logical port from
                external-ids:iface-id to the hypervisor.

              7.
                Some CMS systems, including OpenStack, fully start a VM only when its networking is  ready.   To
                 support this, ovn-northd notices that the chassis column has been updated for the row in the
                 Binding table and pushes this upward by updating the up column in the OVN Northbound database’s
                 Logical_Port table
                to indicate that the VIF is now up.  The CMS, if  it  uses  this  feature,  can  then  react  by
                allowing the VM’s execution to proceed.

              8.
                On  every  hypervisor  but  the one where the VIF resides, ovn-controller notices the completely
                populated row in the Binding table.  This provides ovn-controller the physical location  of  the
                logical  port,  so  each  instance  updates  the OpenFlow tables of its switch (based on logical
                datapath flows in the OVN DB Logical_Flow table) so that packets to and  from  the  VIF  can  be
                properly handled via tunnels.

              9.
                Eventually,  a  user  powers  off  the VM that owns the VIF.  On the hypervisor where the VM was
                powered off, the VIF is deleted from the OVN integration bridge.

              10.
                On the hypervisor where the VM was powered off, ovn-controller notices that the VIF was deleted.
                 In response, it clears the content of the chassis column in the Binding table row for the logical port.

              11.
                 On every hypervisor, ovn-controller notices the empty chassis column in the Binding table’s row
                for  the  logical port.  This means that ovn-controller no longer knows the physical location of
                the logical port, so each instance updates its OpenFlow table to reflect that.

              12.
                Eventually, when the VIF (or its entire VM) is no longer  needed  by  anyone,  an  administrator
                deletes the VIF using the CMS user interface or API.  The CMS updates its own configuration.

              13.
                The  CMS  plugin  removes  the  VIF from the OVN Northbound database, by deleting its row in the
                Logical_Port table.

              14.
                ovn-northd receives the OVN Northbound update and in turn updates the  OVN  Southbound  database
                accordingly,  by  removing  or  updating  the rows from the OVN Southbound database Logical_Flow
                table and Binding table that were related to the now-destroyed VIF.

              15.
                On every hypervisor, ovn-controller receives the Logical_Flow table updates that ovn-northd made
                in the previous step.  ovn-controller updates OpenFlow tables to reflect  the  update,  although
                there  may  not  be much to do, since the VIF had already become unreachable when it was removed
                from the Binding table in a previous step.
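
        A rough command-line equivalent of steps 2, 5, and 7 above, using placeholder names (logical switch sw0,
        vif-id vif-uuid-1, hypervisor interface vif1, MAC 00:11:22:33:44:55); the ovn-nbctl(8) command names
        shown are those documented for this release and may differ in others:

               # Step 2: the CMS plugin adds the VIF to the Northbound database.
               ovn-nbctl lport-add sw0 vif-uuid-1
               ovn-nbctl lport-set-addresses vif-uuid-1 00:11:22:33:44:55

               # Step 5: the hypervisor integration attaches the instantiated VIF
               # to the integration bridge and records which logical port it
               # implements.
               ovs-vsctl add-port br-int vif1 -- \
                   set Interface vif1 external_ids:iface-id=vif-uuid-1

               # Step 7: the CMS can poll whether the VIF is up.
               ovn-nbctl lport-get-up vif-uuid-1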

   Life Cycle of a Container Interface Inside a VM
        OVN provides virtual network abstractions by converting information written in the OVN_NB database into
        OpenFlow flows on each hypervisor.  Secure virtual networking for multiple tenants can only be provided
        if ovn-controller is the only entity that can modify flows in Open vSwitch.  When the Open vSwitch
        integration bridge resides in the hypervisor, it is a fair assumption that tenant workloads running
        inside VMs cannot make any changes to Open vSwitch flows.

        If the infrastructure provider trusts the applications inside the containers not to break out and modify
        the Open vSwitch flows, then containers can be run directly on hypervisors.  This is also the case when
        containers are run inside VMs and the Open vSwitch integration bridge, with flows added by
        ovn-controller, resides in the same VM.  In both of these cases, the workflow is the same as explained
        with an example in the previous section ("Life Cycle of a VIF").

       This section talks about the life cycle of a container interface (CIF) when containers are created in the
       VMs  and  the  Open  vSwitch  integration  bridge resides inside the hypervisor.  In this case, even if a
       container application breaks out, other tenants are not affected because the  containers  running  inside
       the VMs cannot modify the flows in the Open vSwitch integration bridge.

        When multiple containers are created inside a VM, there are multiple CIFs associated with them.  The
        network traffic associated with these CIFs needs to reach the Open vSwitch integration bridge running in
        the hypervisor for OVN to support virtual network abstractions.  OVN should also be able to distinguish
        network traffic coming from different CIFs.  There are two ways to distinguish the network traffic of
        CIFs.

       One way is to provide one VIF for every CIF (1:1 model).  This means that there could be a lot of network
       devices in the hypervisor.  This would slow down OVS because of all the additional CPU cycles needed  for
       the  management  of  all  the  VIFs.   It would also mean that the entity creating the containers in a VM
       should also be able to create the corresponding VIFs in the hypervisor.

        The second way is to provide a single VIF for all the CIFs (1:many model).  OVN could then distinguish
        network traffic coming from different CIFs via a tag written in every packet.  OVN uses this model, with
        VLAN as the tagging mechanism.
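
        For example, a CIF can be added to the same logical switch as its parent VIF with ovn-nbctl(8), naming
        the parent logical port and an unused VLAN tag.  The switch name, port names, and tag below are
        placeholders:

               # Add a container interface whose traffic arrives through the
               # parent VIF "vif-uuid-1", tagged with VLAN 42.
               ovn-nbctl lport-add sw0 cif-uuid-1 vif-uuid-1 42

               # The parent and tag can be read back:
               ovn-nbctl lport-get-parent cif-uuid-1
               ovn-nbctl lport-get-tag cif-uuid-1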

              1.
                 A CIF’s life cycle begins when a container is spawned inside a VM by either the same CMS that
                 created the VM, a tenant that owns that VM, or even a container orchestration system different
                 from the CMS that initially created the VM.  Whoever the entity is, it will need to know the
                 vif-id associated with the network interface of the VM through which the container interface’s
                 network traffic is expected to go.  The entity that creates the container interface will also
                 need to choose an unused VLAN inside that VM.

              2.
                 The container spawning entity (either directly or through the CMS that manages the underlying
                 infrastructure) updates the OVN Northbound database to include the new CIF, by adding a row to
                 the Logical_Port table.  In the new row, name is any unique identifier, parent_name is the
                 vif-id of the VM through which the CIF’s network traffic is expected to go, and tag is the VLAN
                 tag that identifies the network traffic of that CIF.

              3.
                ovn-northd receives the OVN Northbound database update.  In turn,  it  makes  the  corresponding
                updates  to  the  OVN  Southbound  database,  by  adding  rows  to the OVN Southbound database’s
                Logical_Flow table to reflect the new port and also by creating a new row in the  Binding  table
                and populating all its columns except the column that identifies the chassis.

              4.
                 On every hypervisor, ovn-controller subscribes to the changes in the Binding table.  When a new
                 row created by ovn-northd includes a value in the parent_port column of the Binding table, the
                 ovn-controller on the hypervisor whose OVN integration bridge has that same vif-id value in
                 external-ids:iface-id updates the local hypervisor’s OpenFlow tables so that packets to and
                 from the VIF with the particular VLAN tag are properly handled.  Afterward it updates the
                 chassis column of the Binding table row to reflect the physical location.

              5.
                 One can only start the application inside the container after the underlying network is ready.
                 To support this, ovn-northd notices the updated chassis column in the Binding table and updates
                 the up column in the OVN Northbound database’s Logical_Port table to indicate that the CIF is
                 now up.  The entity responsible for starting the container application queries this value and
                 starts the application.

              6.
                 Eventually, the entity that created and started the container stops it.  The entity, through
                 the CMS (or directly), deletes its row in the Logical_Port table.

              7.
                ovn-northd receives the OVN Northbound update and in turn updates the  OVN  Southbound  database
                accordingly,  by  removing  or  updating  the rows from the OVN Southbound database Logical_Flow
                table that were related to the now-destroyed CIF.  It also deletes the row in the Binding  table
                for that CIF.

              8.
                On every hypervisor, ovn-controller receives the Logical_Flow table updates that ovn-northd made
                in the previous step.  ovn-controller updates OpenFlow tables to reflect the update.

   Architectural Physical Life Cycle of a Packet
       This section describes how a packet travels from one virtual machine or container to another through OVN.
       This  description  focuses  on  the physical treatment of a packet; for a description of the logical life
       cycle of a packet, please refer to the Logical_Flow table in ovn-sb(5).

        This section mentions several data and metadata fields, summarized here for clarity:

              tunnel key
                     When OVN encapsulates a packet in Geneve or another tunnel, it attaches extra data to it to
                     allow the receiving OVN instance to process  it  correctly.   This  takes  different  forms
                     depending  on  the  particular  encapsulation,  but in each case we refer to it here as the
                     ``tunnel key.’’  See Tunnel Encapsulations, below, for details.

              logical datapath field
                     A field that denotes the logical datapath through which a packet is being  processed.   OVN
                     uses  the field that OpenFlow 1.1+ simply (and confusingly) calls ``metadata’’ to store the
                     logical datapath.  (This field is passed across tunnels as part of the tunnel key.)

              logical input port field
                     A field that denotes the logical port from which the packet entered the  logical  datapath.
                     OVN stores this in Nicira extension register number 6.

                     Geneve  and  STT tunnels pass this field as part of the tunnel key.  Although VXLAN tunnels
                     do not explicitly carry a logical input port, OVN  only  uses  VXLAN  to  communicate  with
                     gateways that from OVN’s perspective consist of only a single logical port, so that OVN can
                     set the logical input port field to this one on ingress to the OVN logical pipeline.

              logical output port field
                     A  field  that  denotes  the  logical  port  from  which  the packet will leave the logical
                     datapath.  This is initialized to 0 at the beginning of the logical ingress pipeline.   OVN
                     stores this in Nicira extension register number 7.

                     Geneve  and  STT  tunnels  pass this field as part of the tunnel key.  VXLAN tunnels do not
                     transmit the logical output port field.

              conntrack zone field
                     A field that denotes the connection tracking zone.  The value only has  local  significance
                     and  is  not  meaningful between chassis.  This is initialized to 0 at the beginning of the
                     logical ingress pipeline.  OVN stores this in Nicira extension register number 5.

              VLAN ID
                      The VLAN ID is used as an interface between OVN and containers nested inside a VM (see
                      Life Cycle of a Container Interface Inside a VM, above, for more information).

       Initially, a VM or container on the ingress hypervisor sends a packet on  a  port  attached  to  the  OVN
       integration bridge.  Then:

              1.
                OpenFlow  table  0  performs  physical-to-logical  translation.  It matches the packet’s ingress
                port.  Its actions annotate the packet with logical metadata, by setting  the  logical  datapath
                field  to identify the logical datapath that the packet is traversing and the logical input port
                field to identify the ingress port.  Then it resubmits to table 16 to enter the logical  ingress
                pipeline.

                It’s  possible that a single ingress physical port maps to multiple logical ports with a type of
                localnet. The logical datapath and logical input port fields will be reset and the  packet  will
                be resubmitted to table 16 multiple times.

                Packets  that  originate from a container nested within a VM are treated in a slightly different
                way.  The originating container can be distinguished based on the VIF-specific VLAN ID,  so  the
                physical-to-logical  translation  flows  additionally match on VLAN ID and the actions strip the
                VLAN header.  Following this step, OVN treats  packets  from  containers  just  like  any  other
                packets.

                Table 0 also processes packets that arrive from other chassis.  It distinguishes them from other
                packets by ingress port, which is a tunnel.  As with packets just entering the OVN pipeline, the
                actions  annotate  these  packets  with  logical datapath and logical ingress port metadata.  In
                addition, the actions set the logical output port field,  which  is  available  because  in  OVN
                tunneling  occurs after the logical output port is known.  These three pieces of information are
                obtained from  the  tunnel  encapsulation  metadata  (see  Tunnel  Encapsulations  for  encoding
                details).  Then the actions resubmit to table 33 to enter the logical egress pipeline.

              2.
                OpenFlow  tables  16 through 31 execute the logical ingress pipeline from the Logical_Flow table
                in the OVN Southbound database.  These  tables  are  expressed  entirely  in  terms  of  logical
                concepts  like  logical  ports  and logical datapaths.  A big part of ovn-controller’s job is to
                translate them into  equivalent  OpenFlow  (in  particular  it  translates  the  table  numbers:
                Logical_Flow tables 0 through 15 become OpenFlow tables 16 through 31).  For a given packet, the
                logical ingress pipeline eventually executes zero or more output actions:

                •      If the pipeline executes no output actions at all, the packet is effectively dropped.

                •      Most  commonly,  the pipeline executes one output action, which ovn-controller implements
                       by resubmitting the packet to table 32.

                •      If the pipeline can execute more than one output action,  then  each  one  is  separately
                       resubmitted  to  table  32.   This  can  be used to send multiple copies of the packet to
                       multiple ports.  (If the packet was not modified between the output actions, and some  of
                       the  copies  are  destined  to the same hypervisor, then using a logical multicast output
                       port would save bandwidth between hypervisors.)

              3.
                OpenFlow tables 32 through 47 implement the output  action  in  the  logical  ingress  pipeline.
                Specifically,  table  32  handles packets to remote hypervisors, table 33 handles packets to the
                local hypervisor, and table 34 discards packets whose logical ingress and egress  port  are  the
                same.

                Logical patch ports are a special case.  Logical patch ports do not have a physical location and
                effectively  reside  on every hypervisor.  Thus, flow table 33, for output to ports on the local
                hypervisor, naturally implements output to unicast logical patch ports too.   However,  applying
                the  same  logic to a logical patch port that is part of a logical multicast group yields packet
                duplication, because each hypervisor that contains a logical port in the  multicast  group  will
                also  output  the  packet to the logical patch port.  Thus, multicast groups implement output to
                logical patch ports in table 32.

                Each flow in table 32 matches on a logical output port for unicast or  multicast  logical  ports
                that  include  a  logical  port on a remote hypervisor.  Each flow’s actions implement sending a
                packet to the port it matches.  For unicast logical output  ports  on  remote  hypervisors,  the
                actions  set the tunnel key to the correct value, then send the packet on the tunnel port to the
                correct hypervisor.  (When the remote  hypervisor  receives  the  packet,  table  0  there  will
                recognize  it as a tunneled packet and pass it along to table 33.)  For multicast logical output
                ports, the actions send one copy of the packet to each remote hypervisor, in the same way as for
                unicast destinations.  If a multicast group includes a  logical  port  or  ports  on  the  local
                hypervisor,  then its actions also resubmit to table 33.  Table 32 also includes a fallback flow
                that resubmits to table 33 if there is no other match.

                Flows in table 33 resemble those in table 32 but for logical ports that  reside  locally  rather
                than  remotely.   For  unicast  logical  output  ports on the local hypervisor, the actions just
                resubmit to table 34.  For multicast output ports that include one or more logical ports on  the
                local hypervisor, for each such logical port P, the actions change the logical output port to P,
                then resubmit to table 34.

                Table  34  matches  and drops packets for which the logical input and output ports are the same.
                It resubmits other packets to table 48.

              4.
                OpenFlow tables 48 through 63 execute the logical egress pipeline from the Logical_Flow table in
                the OVN Southbound database.  The egress pipeline can perform a final stage of validation before
                packet delivery.  Eventually, it may execute an output action, which  ovn-controller  implements
                by  resubmitting  to  table  64.   A  packet  for  which  the  pipeline never executes output is
                effectively dropped (although it may have been transmitted through a tunnel  across  a  physical
                network).

                The egress pipeline cannot change the logical output port or cause further tunneling.

              5.
                 OpenFlow table 64 performs logical-to-physical translation, the opposite of table 0.  It
                 matches the packet’s logical egress port.  Its actions output the packet to the port attached
                 to the OVN integration bridge that represents that logical port.  If the logical egress port is
                 a container nested within a VM, then before sending the packet the actions push on a VLAN
                 header with an appropriate VLAN ID.

                If the logical egress port is a logical patch port, then table 64 outputs to an OVS  patch  port
                that  represents  the logical patch port.  The packet re-enters the OpenFlow flow table from the
                OVS patch port’s peer in table 0, which identifies the logical datapath and logical  input  port
                based on the OVS patch port’s OpenFlow port number.
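
        The OpenFlow tables described above can be examined directly on a chassis with ovs-ofctl(8), assuming
        the integration bridge is named br-int; depending on the installed version, ovn-sbctl(8) can also list
        the logical flows from which the logical pipeline tables are generated:

               # Physical-to-logical translation (table 0) and its
               # logical-to-physical counterpart (table 64).
               ovs-ofctl -O OpenFlow13 dump-flows br-int table=0
               ovs-ofctl -O OpenFlow13 dump-flows br-int table=64

               # First stage of the logical ingress pipeline (table 16) and
               # the output-processing tables (32 through 34).
               ovs-ofctl -O OpenFlow13 dump-flows br-int table=16
               ovs-ofctl -O OpenFlow13 dump-flows br-int table=32

               # The logical flows from which tables 16-31 and 48-63 are
               # generated, if this ovn-sbctl supports lflow-list.
               ovn-sbctl lflow-list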

   Life Cycle of a VTEP gateway
       A  gateway  is  a  chassis  that forwards traffic between the OVN-managed part of a logical network and a
       physical VLAN,  extending a tunnel-based logical network into a physical network.

        The steps below often refer to details of the OVN Southbound, OVN Northbound, and VTEP database schemas.
        Please see ovn-sb(5), ovn-nb(5), and vtep(5), respectively, for the full story on these databases.

              1.
                 A VTEP gateway’s life cycle begins with the administrator registering the VTEP gateway as a
                 Physical_Switch table entry in the VTEP database.  The ovn-controller-vtep connected to this
                 VTEP database will recognize the new VTEP gateway and create a new Chassis table entry for it
                 in the OVN_Southbound database.

              2.
                 The administrator can then create a new Logical_Switch table entry, and bind a particular VLAN
                 on a VTEP gateway’s port to any VTEP logical switch.  Once a VTEP logical switch is bound to a
                 VTEP gateway, the ovn-controller-vtep will detect it and add its name to the
                 vtep_logical_switches column of the Chassis table in the OVN_Southbound database.  Note that
                 the tunnel_key column of the VTEP logical switch is not filled in at creation.  The
                 ovn-controller-vtep will set the column when the corresponding VTEP logical switch is bound to
                 an OVN logical network.  (A command-line sketch of this and the next step appears after this
                 list.)

              3.
                Now, the administrator can use the CMS to add a VTEP logical switch to the OVN logical  network.
                To  do  that,  the  CMS  must  first create a new Logical_Port table entry in the OVN_Northbound
                database.  Then, the type column of this entry must be set to "vtep".  Next,  the  vtep-logical-
                switch  and  vtep-physical-switch  keys  in  the  options  column  must also be specified, since
                multiple VTEP gateways can attach to the same VTEP logical switch.

              4.
                The newly created logical port in the OVN_Northbound database  and  its  configuration  will  be
                passed   down   to  the  OVN_Southbound  database  as  a  new  Port_Binding  table  entry.   The
                ovn-controller-vtep will recognize the change and bind the logical  port  to  the  corresponding
                 VTEP gateway chassis.  Binding the same VTEP logical switch to different OVN logical networks
                 is not allowed, and a warning will be generated in the log.

              5.
                 Besides binding to the VTEP gateway chassis, the ovn-controller-vtep will update the tunnel_key
                column of the VTEP logical switch to the corresponding Datapath_Binding table entry’s tunnel_key
                for the bound OVN logical network.

              6.
                 Next, the ovn-controller-vtep will keep reacting to configuration changes in the Port_Binding
                 table in the OVN_Southbound database, updating the Ucast_Macs_Remote table in the VTEP database
                 accordingly.  This allows the VTEP gateway to understand where to forward the unicast traffic
                 coming from the extended external network.

              7.
                Eventually,  the  VTEP  gateway’s  life  cycle  ends when the administrator unregisters the VTEP
                gateway from the VTEP database.  The ovn-controller-vtep will recognize the event and remove all
                related configurations (Chassis table entry and port bindings) in the OVN_Southbound database.

              8.
                 When the ovn-controller-vtep is terminated, all related configurations in the OVN_Southbound
                 database and the VTEP database will be cleaned up, including Chassis table entries for all
                 registered VTEP gateways and their port bindings, and all Ucast_Macs_Remote table entries and
                 the Logical_Switch tunnel keys.
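
        A rough command-line sketch of steps 2 and 3 above, using vtep-ctl(8) against the VTEP database and
        ovn-nbctl(8) against the OVN Northbound database.  The switch, port, and VLAN names below are
        placeholders, and the ovn-nbctl command names are those of this release:

               # Step 2: bind VLAN 100 on physical port "p0" of the VTEP
               # gateway "br-vtep" to the VTEP logical switch "ls0".
               vtep-ctl add-ls ls0
               vtep-ctl bind-ls br-vtep p0 100 ls0

               # Step 3: attach the VTEP logical switch to an OVN logical
               # switch through a logical port of type "vtep".
               ovn-nbctl lport-add sw0 sw0-vtep
               ovn-nbctl lport-set-type sw0-vtep vtep
               ovn-nbctl lport-set-options sw0-vtep \
                   vtep-physical-switch=br-vtep vtep-logical-switch=ls0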

DESIGN DECISIONS

   Tunnel Encapsulations
       OVN  annotates  logical  network  packets that it sends from one hypervisor to another with the following
       three pieces of metadata, which are encoded in an encapsulation-specific fashion:

              •      24-bit logical datapath identifier, from  the  tunnel_key  column  in  the  OVN  Southbound
                     Datapath_Binding table.

              •      15-bit logical ingress port identifier.  ID 0 is reserved for internal use within OVN.  IDs
                     1  through 32767, inclusive, may be assigned to logical ports (see the tunnel_key column in
                     the OVN Southbound Port_Binding table).

              •      16-bit logical egress port identifier.  IDs 0 through 32767 have the same  meaning  as  for
                     logical  ingress  ports.   IDs  32768  through 65535, inclusive, may be assigned to logical
                     multicast groups (see the tunnel_key column in the OVN Southbound Multicast_Group table).

       For hypervisor-to-hypervisor traffic, OVN supports only Geneve and STT encapsulations, for the  following
       reasons:

              •      Only  STT  and  Geneve support the large amounts of metadata (over 32 bits per packet) that
                     OVN uses (as described above).

               •      STT and Geneve use randomized UDP or TCP source ports, allowing efficient distribution
                      among multiple paths in environments that use ECMP in their underlay.

              •      NICs are available to offload STT and Geneve encapsulation and decapsulation.

       Due  to  its  flexibility,  the  preferred  encapsulation  between  hypervisors  is  Geneve.   For Geneve
       encapsulation, OVN transmits the logical datapath identifier  in  the  Geneve  VNI.   OVN  transmits  the
       logical  ingress  and logical egress ports in a TLV with class 0x0102, type 0, and a 32-bit value encoded
       as follows, from MSB to LSB:

               •      1 bit: reserved (0)

              •      15 bits: ingress port

              •      16 bits: egress port

       Environments whose NICs lack Geneve offload may prefer STT encapsulation for  performance  reasons.   For
       STT  encapsulation,  OVN  encodes  all  three  pieces  of logical metadata in the STT 64-bit tunnel ID as
       follows, from MSB to LSB:

              •      9 bits: reserved (0)

              •      15 bits: ingress port

              •      16 bits: egress port

              •      24 bits: datapath
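
        As a worked example of these encodings, take a (hypothetical) logical datapath identifier of 5, logical
        ingress port 2, and logical egress port 3.  The Geneve option value and the STT tunnel ID can then be
        computed with shell arithmetic:

               # Geneve: the VNI carries the datapath; the 32-bit option value
               # is (ingress << 16) | egress.
               printf 'VNI 0x%06x  option 0x%08x\n' 5 $(( (2 << 16) | 3 ))
               # prints: VNI 0x000005  option 0x00020003

               # STT: the 64-bit tunnel ID is
               # (ingress << 40) | (egress << 24) | datapath.
               printf 'tunnel ID 0x%016x\n' $(( (2 << 40) | (3 << 24) | 5 ))
               # prints: tunnel ID 0x0000020003000005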

       For connecting to gateways, in addition to Geneve and STT, OVN supports VXLAN, because only VXLAN support
       is common on top-of-rack (ToR) switches.  Currently,  gateways  have  a  feature  set  that  matches  the
       capabilities  as  defined  by  the  VTEP schema, so fewer bits of metadata are necessary.  In the future,
       gateways that do not support encapsulations with large amounts of metadata may continue to have a reduced
       feature set.

Open vSwitch 2.5.9                              OVN Architecture                             ovn-architecture(7)