Provided by: freebsd-manpages_10.1~RC1-1_all

NAME

       tcp — Internet Transmission Control Protocol

SYNOPSIS

       #include <sys/types.h>
       #include <sys/socket.h>
       #include <netinet/in.h>
       #include <netinet/tcp.h>

       int
       socket(AF_INET, SOCK_STREAM, 0);

DESCRIPTION

       The  TCP  protocol provides reliable, flow-controlled, two-way transmission of data.  It is a byte-stream
       protocol used to support the SOCK_STREAM abstraction.  TCP uses the standard Internet address format and,
       in addition, provides a per-host collection of “port addresses”.  Thus, each address is  composed  of  an
       Internet  address  specifying  the host and network, with a specific TCP port on the host identifying the
       peer entity.

       Sockets utilizing the TCP protocol are either “active” or “passive”.  Active sockets initiate connections
       to passive sockets.  By default, TCP sockets  are  created  active;  to  create  a  passive  socket,  the
       listen(2)  system  call must be used after binding the socket with the bind(2) system call.  Only passive
       sockets may use the accept(2) call to accept incoming connections.   Only  active  sockets  may  use  the
       connect(2) call to initiate connections.
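
       The distinction between active and passive sockets can be illustrated with a minimal sketch that
       connects the two over the loopback interface.  The function name loopback_handshake and the use of a
       system-assigned loopback port are assumptions made for this example only.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/*
 * Connect an active socket to a passive one over loopback.
 * Returns 0 on success and -1 on failure.  The function name is
 * illustrative and not part of any system API.
 */
int
loopback_handshake(void)
{
    struct sockaddr_in sin;
    socklen_t len = sizeof(sin);
    int passive = -1, active = -1, accepted = -1, rc = -1;

    /* Passive side: socket(2), bind(2), then listen(2). */
    passive = socket(AF_INET, SOCK_STREAM, 0);
    if (passive == -1)
        goto out;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sin.sin_port = htons(0);    /* let the system pick a port */
    if (bind(passive, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
        listen(passive, 1) == -1 ||
        getsockname(passive, (struct sockaddr *)&sin, &len) == -1)
        goto out;

    /* Active side: a freshly created socket may call connect(2). */
    active = socket(AF_INET, SOCK_STREAM, 0);
    if (active == -1 ||
        connect(active, (struct sockaddr *)&sin, sizeof(sin)) == -1)
        goto out;

    /* Only the passive socket may accept(2) the new connection. */
    accepted = accept(passive, NULL, NULL);
    if (accepted != -1)
        rc = 0;
out:
    if (accepted != -1)
        close(accepted);
    if (active != -1)
        close(active);
    if (passive != -1)
        close(passive);
    return rc;
}
```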

       Passive  sockets  may  “underspecify”  their location to match incoming connection requests from multiple
       networks.  This technique, termed “wildcard addressing”, allows a single server  to  provide  service  to
       clients  on  multiple  networks.   To create a socket which listens on all networks, the Internet address
       INADDR_ANY must be bound.  The TCP port may still  be  specified  at  this  time;  if  the  port  is  not
       specified,  the  system will assign one.  Once a connection has been established, the socket's address is
       fixed by the peer entity's location.  The address assigned to the socket is the address  associated  with
       the  network  interface through which packets are being transmitted and received.  Normally, this address
       corresponds to the peer entity's network.
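
       As a hedged illustration of wildcard addressing, the sketch below binds a passive socket to INADDR_ANY
       and reports the port that was bound.  The helper name wildcard_listener and the backlog of 5 are
       assumptions chosen for the example; passing a port of 0 asks the system to assign one.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/*
 * Create a passive socket listening on all networks.
 * On success, returns the descriptor and stores the bound port
 * (host byte order) in *bound_port; returns -1 on failure.
 * The helper name is illustrative only.
 */
int
wildcard_listener(in_port_t port, in_port_t *bound_port)
{
    struct sockaddr_in sin;
    socklen_t len = sizeof(sin);
    int s;

    s = socket(AF_INET, SOCK_STREAM, 0);
    if (s == -1)
        return -1;

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_ANY);  /* underspecified address */
    sin.sin_port = htons(port);               /* 0: system assigns one */

    if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
        listen(s, 5) == -1 ||
        getsockname(s, (struct sockaddr *)&sin, &len) == -1) {
        close(s);
        return -1;
    }
    *bound_port = ntohs(sin.sin_port);
    return s;
}
```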

       TCP supports  a  number  of  socket  options  which  can  be  set  with  setsockopt(2)  and  tested  with
       getsockopt(2):

       TCP_INFO        Information about a socket's underlying TCP session may be retrieved by passing the read-
                       only  option  TCP_INFO  to  getsockopt(2).  It accepts a single argument: a pointer to an
                       instance of struct tcp_info.

                       This API is subject to change; consult the source to determine which fields are currently
                       filled out by this option.  FreeBSD-specific additions include send window size, receive
                       window size, and bandwidth-controlled window space.

       TCP_CONGESTION  Select  or  query  the congestion control algorithm that TCP will use for the connection.
                       See mod_cc(4) for details.

       TCP_KEEPINIT    This setsockopt(2) option accepts a per-socket timeout argument of u_int in seconds,  for
                       new,  non-established  TCP  connections.   For  the  global  default  in milliseconds see
                       keepinit in the “MIB Variables” section further down.

       TCP_KEEPIDLE    This setsockopt(2) option accepts an argument  of  u_int  for  the  amount  of  time,  in
                       seconds,  that  the connection must be idle before keepalive probes (if enabled) are sent
                       for the connection of this socket.  If set on a listening socket, the value is  inherited
                       by  the  newly created socket upon accept(2).  For the global default in milliseconds see
                       keepidle in the “MIB Variables” section further down.

       TCP_KEEPINTVL   This setsockopt(2) option accepts an argument of u_int to set the per-socket interval, in
                       seconds, between keepalive probes sent to a peer.  If set  on  a  listening  socket,  the
                       value is inherited by the newly created socket upon accept(2).  For the global default in
                       milliseconds see keepintvl in the “MIB Variables” section further down.

       TCP_KEEPCNT     This  setsockopt(2) option accepts an argument of u_int and allows a per-socket tuning of
                       the number of probes sent, with no response, before the connection will be  dropped.   If
                       set  on  a  listening  socket,  the  value  is inherited by the newly created socket upon
                       accept(2).  For the global default see keepcnt in the “MIB Variables” section further
                       down.

       TCP_NODELAY     Under most circumstances, TCP sends data when it is presented; when outstanding data  has
                       not  yet  been  acknowledged,  it  gathers small amounts of output to be sent in a single
                       packet once an acknowledgement is received.  For a  small  number  of  clients,  such  as
                       window  systems  that  send  a  stream  of  mouse  events  which receive no replies, this
                       packetization may cause significant delays.  The boolean option TCP_NODELAY defeats  this
                       algorithm.

       TCP_MAXSEG      By default, the sender- and receiver-TCP negotiate between themselves to determine the
                       maximum segment size to be used for each connection.  The TCP_MAXSEG option allows the
                       user to determine the result of this negotiation, and to reduce it if desired.

       TCP_NOOPT       TCP  usually  sends  a  number  of  options  in each packet, corresponding to various TCP
                       extensions which are provided in this implementation.  The boolean  option  TCP_NOOPT  is
                       provided to disable TCP option use on a per-connection basis.

       TCP_NOPUSH      By convention, the sender-TCP will set the “push” bit, and begin transmission immediately
                       (if  permitted) at the end of every user call to write(2) or writev(2).  When this option
                       is set to a non-zero value, TCP will delay sending any  data  at  all  until  either  the
                       socket is closed, or the internal send buffer is filled.

       TCP_MD5SIG      This  option  enables  the  use  of  MD5 digests (also known as TCP-MD5) on writes to the
                       specified socket.  Outgoing traffic is digested; digests on incoming traffic are verified
                       if the  net.inet.tcp.signature_verify_input  sysctl  is  nonzero.   The  current  default
                       behavior  for  the system is to respond to a system advertising this option with TCP-MD5;
                       this may change.

                       One common use for this in a FreeBSD router deployment is to enable FreeBSD-based routers
                       to interwork with Cisco equipment at peering points.  Support for this feature conforms
                       to RFC 2385.  Only IPv4 (AF_INET) sessions are supported.

                       In order for this option to function correctly, it is necessary for the administrator  to
                       add  a  tcp-md5 key entry to the system's security associations database (SADB) using the
                       setkey(8) utility.  This entry must have an SPI of  0x1000  and  can  therefore  only  be
                       specified on a per-host basis at this time.

                       If  an  SADB entry cannot be found for the destination, the outgoing traffic will have an
                       invalid digest option prepended, and the following error message will be visible  on  the
                       system console: tcp_signature_compute: SADB lookup failed for %d.%d.%d.%d.
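
       The per-socket keepalive options described above (TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT) might be
       applied as in the following sketch.  The helper name set_keepalive is an assumption made for this
       example, and SO_KEEPALIVE is enabled at the socket level so that probes are sent at all.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <unistd.h>

/*
 * Enable keepalive probes on socket s and tune them per socket.
 * idle and intvl are in seconds, cnt is a number of probes.
 * Returns 0 on success and -1 on failure (errno is set).
 * The helper name is illustrative only.
 */
int
set_keepalive(int s, unsigned int idle, unsigned int intvl, unsigned int cnt)
{
    int on = 1;

    if (setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1)
        return -1;
    if (setsockopt(s, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) == -1)
        return -1;
    if (setsockopt(s, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)) == -1)
        return -1;
    if (setsockopt(s, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) == -1)
        return -1;
    return 0;
}
```

       When called on a listening socket, the values are inherited by sockets returned from accept(2), as
       noted in the option descriptions.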

       The   option  level  for  the  setsockopt(2)  call  is  the  protocol  number  for  TCP,  available  from
       getprotobyname(3), or IPPROTO_TCP.  All options are declared in <netinet/tcp.h>.
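
       A minimal sketch of using the IPPROTO_TCP option level follows; it sets and then tests the boolean
       TCP_NODELAY option.  The helper names set_nodelay and get_nodelay are assumptions made for this example.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <unistd.h>

/* Set TCP_NODELAY on socket s; returns 0 on success, -1 on failure. */
int
set_nodelay(int s, int enable)
{
    return setsockopt(s, IPPROTO_TCP, TCP_NODELAY, &enable, sizeof(enable));
}

/* Test TCP_NODELAY on socket s; returns 1 if set, 0 if clear, -1 on error. */
int
get_nodelay(int s)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(s, IPPROTO_TCP, TCP_NODELAY, &value, &len) == -1)
        return -1;
    return value != 0;
}
```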

       Options at the IP transport level may be used with TCP; see ip(4).  Incoming connection requests that are
       source-routed are noted, and the reverse source route is used in responding.

       The default congestion control algorithm for TCP is cc_newreno(4).  Other congestion  control  algorithms
       can be made available using the mod_cc(4) framework.
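
       The algorithm in use on a given socket may be queried with the TCP_CONGESTION option, as in the sketch
       below.  The helper name get_cc_algo and the caller-supplied buffer size are assumptions made for this
       example; the name returned depends on the system and on which algorithms are loaded.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <unistd.h>

/*
 * Copy the name of the congestion control algorithm used by socket s
 * into name, which holds size bytes.  Returns 0 on success and -1 on
 * failure.  The helper name is illustrative only.
 */
int
get_cc_algo(int s, char *name, size_t size)
{
    socklen_t len;

    if (size == 0)
        return -1;
    memset(name, 0, size);
    len = (socklen_t)(size - 1);    /* keep the final byte as NUL */
    return getsockopt(s, IPPROTO_TCP, TCP_CONGESTION, name, &len);
}
```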

   MIB Variables
       The TCP protocol implements a number of variables in the net.inet.tcp branch of the sysctl(3) MIB.

       TCPCTL_DO_RFC1323  (rfc1323)  Implement  the window scaling and timestamp options of RFC 1323 (default is
                          true).

       TCPCTL_MSSDFLT     (mssdflt) The default value used for the maximum segment size (“MSS”) when  no  advice
                          to the contrary is received from MSS negotiation.

       TCPCTL_SENDSPACE   (sendspace) Maximum TCP send window.

       TCPCTL_RECVSPACE   (recvspace) Maximum TCP receive window.

       log_in_vain        Log  any  connection  attempts  to  ports  where  there  is  not  a  socket  accepting
                          connections.  The value of 1 limits the  logging  to  SYN  (connection  establishment)
                          packets only.  That of 2 results in any TCP packets to closed ports being logged.  Any
                          value  unlisted  above  disables  the  logging  (default  is  0,  i.e., the logging is
                          disabled).

       msl                The Maximum Segment Lifetime, in milliseconds, for a packet.

       keepinit           Timeout, in milliseconds, for new, non-established TCP connections.   The  default  is
                          75000 msec.

       keepidle           Amount  of  time,  in  milliseconds, that the connection must be idle before keepalive
                          probes (if enabled) are sent.  The default is 7200000 msec (2 hours).

       keepintvl          The interval, in milliseconds, between keepalive probes sent to remote machines,  when
                          no response is received on a keepidle probe.  The default is 75000 msec.

       keepcnt            Number  of probes sent, with no response, before a connection is dropped.  The default
                          is 8 packets.

       always_keepalive   Assume that SO_KEEPALIVE is set on all TCP connections, so that the kernel periodically
                          sends a packet to the remote host to verify that the connection is still up.

       icmp_may_rst       Certain ICMP unreachable messages may abort connections in SYN-SENT state.

       do_tcpdrain        Flush packets in the TCP reassembly queue if the system is low on mbufs.

       blackhole          If  enabled,  disable  sending  of  RST when a connection is attempted to a port where
                          there is not a socket accepting connections.  See blackhole(4).

       delayed_ack        Delay ACK to try and piggyback it onto a data packet.

       delacktime         Maximum amount of time, in milliseconds, before a delayed ACK is sent.

       path_mtu_discovery
                          Enable Path MTU Discovery.

       tcbhashsize        Size of the TCP control-block hash table (read-only).  This may  be  tuned  using  the
                          kernel option TCBHASHSIZE or by setting net.inet.tcp.tcbhashsize in the loader(8).

       pcbcount           Number of active protocol control blocks (read-only).

       syncookies         Determines  whether  or  not  SYN  cookies  should  be  generated for outbound SYN-ACK
                          packets.  SYN cookies are a great help during SYN flood attacks, and  are  enabled  by
                          default.  (See syncookies(4).)

       isn_reseed_interval
                          The  interval  (in  seconds)  specifying  how  often  the secret data used in RFC 1948
                          initial sequence number calculations should be reseeded.  By default, this variable is
                          set to zero, indicating that  no  reseeding  will  occur.   Reseeding  should  not  be
                          necessary, and will break TIME_WAIT recycling for a few minutes.

       rexmit_min, rexmit_slop
                          Adjust  the  retransmit timer calculation for TCP.  The slop is typically added to the
                          raw calculation to take into account occasional  variances  that  the  SRTT  (smoothed
                          round-trip  time)  is  unable  to accommodate, while the minimum specifies an absolute
                          minimum.  While a number of TCP RFCs suggest a 1 second minimum, these  RFCs  tend  to
                          focus  on  streaming  behavior, and fail to deal with the fact that a 1 second minimum
                          has severe detrimental effects over lossy interactive connections, such as an 802.11b
                          wireless link, and over very fast but lossy connections for those cases not covered by
                          the fast retransmit code.  For this reason, we use 200ms of slop and a near-0 minimum,
                          which gives us an effective minimum of 200ms (similar to Linux).

       rfc3042            Enable  the  Limited  Transmit  algorithm  as  described  in RFC 3042.  It helps avoid
                          timeouts on lossy links and also when the congestion window is small,  as  happens  on
                          short transfers.

       rfc3390            Enable  support  for  RFC  3390, which allows for a variable-sized starting congestion
                          window on new  connections,  depending  on  the  maximum  segment  size.   This  helps
                          throughput  in  general,  but  particularly affects short transfers and high-bandwidth
                          large propagation-delay connections.

       sack.enable        Enable support for RFC 2018, TCP Selective Acknowledgment  option,  which  allows  the
                          receiver  to  inform  the sender about all successfully arrived segments, allowing the
                          sender to retransmit the missing segments only.

       sack.maxholes      Maximum number of SACK holes per connection.  Defaults to 128.

       sack.globalmaxholes
                          Maximum number of SACK holes per system, across all connections.  Defaults to 65536.

       maxtcptw           When a TCP connection enters the TIME_WAIT state, its associated socket  structure  is
                          freed,  since  it  is  of negligible size and use, and a new structure is allocated to
                          contain a minimal amount of information necessary for sustaining a connection in  this
                          state,  called  the  compressed  TCP TIME_WAIT state.  Since this structure is smaller
                          than a socket structure, it can save a  significant  amount  of  system  memory.   The
                          net.inet.tcp.maxtcptw  MIB  variable  controls  the maximum number of these structures
                          allocated.  By default, it is initialized to kern.ipc.maxsockets / 5.

       nolocaltimewait    Suppress creation of compressed TCP TIME_WAIT states for connections in which both
                          endpoints are local.

       fast_finwait2_recycle
                          Recycle TCP FIN_WAIT_2 connections faster when the socket is marked as SBS_CANTRCVMORE
                          (no  user  process  has  the socket open, data received on the socket cannot be read).
                          The timeout used here is finwait2_timeout.

       finwait2_timeout   Timeout to use for fast recycling of  TCP  FIN_WAIT_2  connections.   Defaults  to  60
                          seconds.

       ecn.enable         Enable  support  for  TCP  Explicit  Congestion  Notification (ECN).  ECN allows a TCP
                          sender to reduce the transmission rate in order to avoid packet drops.

       ecn.maxretries     Number of retries (SYN or SYN/ACK retransmits) before  disabling  ECN  on  a  specific
                          connection.  This  is  needed  to  help  with  connection  establishment when a broken
                          firewall is in the network path.
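
       The keepalive variables in the list above (keepidle, keepintvl, and keepcnt) are expressed in
       milliseconds, while the corresponding socket options take seconds.  The sketch below shows the unit
       conversion and estimates how long an idle, unresponsive peer is kept.  It assumes that the connection
       is dropped once keepcnt probe intervals have elapsed after the keepidle period; that timing model and
       both function names are assumptions made for illustration and should be checked against the kernel
       source.

```c
#include <stdint.h>

/*
 * Estimated idle time, in milliseconds, after which an unresponsive
 * peer is dropped: the idle period plus keepcnt probe intervals.
 * This is an assumed model, not a statement of the kernel's timers.
 */
uint64_t
keepalive_drop_ms(uint64_t keepidle_ms, uint64_t keepintvl_ms, uint64_t keepcnt)
{
    return keepidle_ms + keepcnt * keepintvl_ms;
}

/*
 * Convert a millisecond MIB value to the whole seconds expected by
 * the TCP_KEEPIDLE and TCP_KEEPINTVL socket options (rounds down).
 */
unsigned int
ms_to_sockopt_seconds(uint64_t ms)
{
    return (unsigned int)(ms / 1000);
}
```

       With the defaults listed above (7200000 msec, 75000 msec, and 8 probes), the estimate comes to
       7800000 msec, or 130 minutes.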

ERRORS

       A socket operation may fail with one of the following errors returned:

       [EISCONN]          when trying to establish a connection on a socket which already has one;

       [ENOBUFS]          when the system runs out of memory for an internal data structure;

       [ETIMEDOUT]        when a connection was dropped due to excessive retransmissions;

       [ECONNRESET]       when the remote peer forces the connection to be closed;

       [ECONNREFUSED]     when the remote peer actively refuses connection establishment (usually because no
                          process is listening on the port);

       [EADDRINUSE]       when  an  attempt  is  made  to  create  a  socket  with a port which has already been
                          allocated;

       [EADDRNOTAVAIL]    when an attempt is made to create a socket with a network address for which no network
                          interface exists;

       [EAFNOSUPPORT]     when an attempt is made to bind or connect a socket to a multicast address.
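
       As one example of how these errors surface, the sketch below binds two sockets to the same loopback
       address and port and returns the errno value from the second bind(2), which is expected to be
       EADDRINUSE.  The function name second_bind_errno is an assumption made for this example, and neither
       socket sets any address-reuse option.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/*
 * Bind two sockets to the same address and port.  Returns the errno
 * value of the second bind(2), 0 if it unexpectedly succeeded, or -1
 * if the setup itself failed.  The function name is illustrative only.
 */
int
second_bind_errno(void)
{
    struct sockaddr_in sin;
    socklen_t len = sizeof(sin);
    int a, b, result = -1;

    a = socket(AF_INET, SOCK_STREAM, 0);
    b = socket(AF_INET, SOCK_STREAM, 0);
    if (a == -1 || b == -1)
        goto out;

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sin.sin_port = htons(0);    /* system-assigned port for the first bind */
    if (bind(a, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
        getsockname(a, (struct sockaddr *)&sin, &len) == -1)
        goto out;

    /* sin now holds the allocated port; try to claim it again. */
    if (bind(b, (struct sockaddr *)&sin, sizeof(sin)) == -1)
        result = errno;
    else
        result = 0;
out:
    if (a != -1)
        close(a);
    if (b != -1)
        close(b);
    return result;
}
```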

SEE ALSO

       getsockopt(2),  socket(2),  sysctl(3),  blackhole(4),  inet(4),  intro(4),  ip(4),  mod_cc(4),  siftr(4),
       syncache(4), setkey(8)

       V. Jacobson, R. Braden, and D. Borman, TCP Extensions for High Performance, RFC 1323.

       A. Heffernan, Protection of BGP Sessions via the TCP MD5 Signature Option, RFC 2385.

       K.  Ramakrishnan,  S.  Floyd, and D. Black, The Addition of Explicit Congestion Notification (ECN) to IP,
       RFC 3168.

HISTORY

       The TCP protocol appeared in 4.2BSD.  The RFC 1323 extensions for  window  scaling  and  timestamps  were
       added in 4.4BSD.  The TCP_INFO option was introduced in Linux 2.6 and is subject to change.

Debian                                          November 8, 2013                                          TCP(4)