Provided by: freebsd-manpages_9.2+1-1_all

NAME

       tcp — Internet Transmission Control Protocol

SYNOPSIS

       #include <sys/types.h>
       #include <sys/socket.h>
       #include <netinet/in.h>

       int
       socket(AF_INET, SOCK_STREAM, 0);

DESCRIPTION

       The  TCP  protocol provides reliable, flow-controlled, two-way transmission of data.  It is a byte-stream
       protocol used to support the SOCK_STREAM abstraction.  TCP uses the standard Internet address format and,
       in addition, provides a per-host collection of “port addresses”.  Thus, each address is  composed  of  an
       Internet  address  specifying  the host and network, with a specific TCP port on the host identifying the
       peer entity.

       Sockets utilizing the TCP protocol are either “active” or “passive”.  Active sockets initiate connections
       to passive sockets.  By default, TCP sockets  are  created  active;  to  create  a  passive  socket,  the
       listen(2)  system  call must be used after binding the socket with the bind(2) system call.  Only passive
       sockets may use the accept(2) call to accept incoming connections.   Only  active  sockets  may  use  the
       connect(2) call to initiate connections.

       Passive  sockets  may  “underspecify”  their location to match incoming connection requests from multiple
       networks.  This technique, termed “wildcard addressing”, allows a single server  to  provide  service  to
       clients  on  multiple  networks.   To create a socket which listens on all networks, the Internet address
       INADDR_ANY must be bound.  The TCP port may still  be  specified  at  this  time;  if  the  port  is  not
       specified,  the  system will assign one.  Once a connection has been established, the socket's address is
       fixed by the peer entity's location.  The address assigned to the socket is the address  associated  with
       the  network  interface through which packets are being transmitted and received.  Normally, this address
       corresponds to the peer entity's network.
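       The wildcard binding described above can be sketched as follows.  This is a minimal illustration,
       not a definitive implementation: the helper name open_wildcard_listener and the backlog of 5 are
       assumptions made for the example.  Passing a port of 0 leaves the choice of port to the system.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/*
 * Create a passive TCP socket bound to INADDR_ANY.  A port of 0 asks the
 * system to assign one.  On success, returns the socket and stores the
 * bound port (host byte order) in *port_out; returns -1 on failure.
 * open_wildcard_listener is a hypothetical helper name.
 */
int
open_wildcard_listener(unsigned short port, unsigned short *port_out)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);
	int s;

	s = socket(AF_INET, SOCK_STREAM, 0);
	if (s == -1)
		return (-1);

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_ANY);	/* wildcard address */
	sin.sin_port = htons(port);

	/* bind(2) first, then listen(2) makes the socket passive. */
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
	    listen(s, 5) == -1 ||
	    getsockname(s, (struct sockaddr *)&sin, &len) == -1) {
		close(s);
		return (-1);
	}
	*port_out = ntohs(sin.sin_port);
	return (s);
}
```

       A caller would pass the returned descriptor to accept(2); the port stored in *port_out is the one
       clients should connect to.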

       TCP supports  a  number  of  socket  options  which  can  be  set  with  setsockopt(2)  and  tested  with
       getsockopt(2):

       TCP_INFO        Information about a socket's underlying TCP session may be retrieved by passing the read-
                       only  option  TCP_INFO  to  getsockopt(2).  It accepts a single argument: a pointer to an
                       instance of struct tcp_info.

                       This API is subject to change; consult the source to determine which fields are currently
                       filled out by this option.  FreeBSD specific additions include send window size,  receive
                       window size, and bandwidth-controlled window space.

       TCP_CONGESTION  Select  or  query  the congestion control algorithm that TCP will use for the connection.
                       See mod_cc(4) for details.

       TCP_KEEPINIT    This write-only setsockopt(2) option accepts a per-socket timeout argument  of  u_int  in
                       seconds,   for   new,  non-established  TCP  connections.   For  the  global  default  in
                       milliseconds see keepinit in the “MIB Variables” section further down.

       TCP_KEEPIDLE    This write-only setsockopt(2) option accepts an argument of u_int for the amount of time,
                       in seconds, that the connection must be idle before keepalive  probes  (if  enabled)  are
                       sent  for  the  connection  of  this  socket.  If set on a listening socket, the value is
                       inherited by the newly  created  socket  upon  accept(2).   For  the  global  default  in
                       milliseconds see keepidle in the “MIB Variables” section further down.

       TCP_KEEPINTVL   This  write-only  setsockopt(2) option accepts an argument of u_int to set the per-socket
                       interval, in seconds, between keepalive probes sent to a peer.  If  set  on  a  listening
                       socket,  the  value  is  inherited  by  the newly created socket upon accept(2).  For the
                       global default in milliseconds see keepintvl in the “MIB Variables” section further down.

       TCP_KEEPCNT     This write-only setsockopt(2) option accepts an argument of u_int and allows per-socket
                       tuning of the number of probes sent, with no response, before the connection is
                       dropped.  If set on a listening socket, the value is inherited by the newly created
                       socket upon accept(2).  For the global default see keepcnt in the “MIB Variables”
                       section further down.

       TCP_NODELAY     Under  most circumstances, TCP sends data when it is presented; when outstanding data has
                       not yet been acknowledged, it gathers small amounts of output to  be  sent  in  a  single
                       packet  once  an  acknowledgement  is  received.   For a small number of clients, such as
                       window systems that send a  stream  of  mouse  events  which  receive  no  replies,  this
                       packetization  may cause significant delays.  The boolean option TCP_NODELAY defeats this
                       algorithm.

       TCP_MAXSEG      By default, a sender- and receiver-TCP will negotiate among themselves to  determine  the
                       maximum  segment  size  to be used for each connection.  The TCP_MAXSEG option allows the
                       user to determine the result of this negotiation, and to reduce it if desired.

       TCP_NOOPT       TCP usually sends a number of options  in  each  packet,  corresponding  to  various  TCP
                       extensions  which  are  provided in this implementation.  The boolean option TCP_NOOPT is
                       provided to disable TCP option use on a per-connection basis.

       TCP_NOPUSH      By convention, the sender-TCP will set the “push” bit, and begin transmission immediately
                       (if permitted) at the end of every user call to write(2) or writev(2).  When this  option
                       is  set  to  a  non-zero  value,  TCP will delay sending any data at all until either the
                       socket is closed, or the internal send buffer is filled.

       TCP_MD5SIG      This option enables the use of MD5 digests (also known  as  TCP-MD5)  on  writes  to  the
                       specified socket.  Outgoing traffic is digested; digests on incoming traffic are verified
                       if  the  net.inet.tcp.signature_verify_input  sysctl  is  nonzero.   The  current default
                       behavior for the system is to respond to a system advertising this option  with  TCP-MD5;
                       this may change.

                       One common use for this in a FreeBSD router deployment is to enable FreeBSD-based
                       routers to interwork with Cisco equipment at peering points.  Support for this feature
                       conforms to RFC 2385.  Only IPv4 (AF_INET) sessions are supported.

                       In  order for this option to function correctly, it is necessary for the administrator to
                       add a tcp-md5 key entry to the system's security associations database (SADB)  using  the
                       setkey(8)  utility.   This  entry  must  have  an SPI of 0x1000 and can therefore only be
                       specified on a per-host basis at this time.

                       If an SADB entry cannot be found for the destination, the outgoing traffic will  have  an
                       invalid  digest  option prepended, and the following error message will be visible on the
                       system console: tcp_signature_compute: SADB lookup failed for %d.%d.%d.%d.

       The  option  level  for  the  setsockopt(2)  call  is  the  protocol  number  for  TCP,  available   from
       getprotobyname(3), or IPPROTO_TCP.  All options are declared in <netinet/tcp.h>.
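       As a sketch of how the option level and option names fit together, the fragment below sets
       TCP_NODELAY and the per-socket keepalive options using IPPROTO_TCP as the level.  The helper names
       tune_tcp_socket and get_nodelay are hypothetical.  Enabling SO_KEEPALIVE at the SOL_SOCKET level is
       an assumption of the example, made so that the keepalive settings have an effect.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/*
 * Disable small-packet gathering (TCP_NODELAY) and set per-socket
 * keepalive timing.  idle and intvl are in seconds, as documented for
 * TCP_KEEPIDLE and TCP_KEEPINTVL.  Returns 0 on success, -1 on failure.
 * tune_tcp_socket is a hypothetical helper name.
 */
int
tune_tcp_socket(int s, unsigned int idle, unsigned int intvl, unsigned int cnt)
{
	int on = 1;

	if (setsockopt(s, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)) == -1)
		return (-1);
	/* Keepalive probes are only sent when SO_KEEPALIVE is enabled. */
	if (setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1)
		return (-1);
	if (setsockopt(s, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) == -1)
		return (-1);
	if (setsockopt(s, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)) == -1)
		return (-1);
	if (setsockopt(s, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) == -1)
		return (-1);
	return (0);
}

/* Read back TCP_NODELAY: 1 if set, 0 if clear, -1 on error. */
int
get_nodelay(int s)
{
	int val = 0;
	socklen_t len = sizeof(val);

	if (getsockopt(s, IPPROTO_TCP, TCP_NODELAY, &val, &len) == -1)
		return (-1);
	return (val != 0);
}
```

       For instance, tune_tcp_socket(s, 60, 10, 5) would ask for probes after 60 seconds of idle time,
       spaced 10 seconds apart, with the connection dropped after 5 unanswered probes.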

       Options at the IP transport level may be used with TCP; see ip(4).  Incoming connection requests that are
       source-routed are noted, and the reverse source route is used in responding.

       The  default  congestion control algorithm for TCP is cc_newreno(4).  Other congestion control algorithms
       can be made available using the mod_cc(4) framework.
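       The algorithm in use on a socket can be queried with the TCP_CONGESTION option.  The sketch below
       assumes the option returns the algorithm name as a character string; the helper name
       get_congestion_algo and the caller-supplied buffer size are choices made for the example.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>

/*
 * Copy the name of the congestion control algorithm used by socket s
 * into buf, which always ends up NUL-terminated.  Returns 0 on success,
 * -1 on failure.  get_congestion_algo is a hypothetical helper name.
 */
int
get_congestion_algo(int s, char *buf, size_t buflen)
{
	socklen_t len;

	if (buflen == 0)
		return (-1);
	memset(buf, 0, buflen);
	len = (socklen_t)(buflen - 1);	/* reserve room for a trailing NUL */
	if (getsockopt(s, IPPROTO_TCP, TCP_CONGESTION, buf, &len) == -1)
		return (-1);
	return (0);
}
```

       The name reported depends on the system configuration and on which mod_cc(4) modules are loaded.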

   MIB Variables
       The TCP protocol implements a number of variables in the net.inet.tcp branch of the sysctl(3) MIB.

       TCPCTL_DO_RFC1323  (rfc1323) Implement the window scaling and timestamp options of RFC 1323  (default  is
                          true).

       TCPCTL_MSSDFLT     (mssdflt)  The  default value used for the maximum segment size (“MSS”) when no advice
                          to the contrary is received from MSS negotiation.

       TCPCTL_SENDSPACE   (sendspace) Maximum TCP send window.

       TCPCTL_RECVSPACE   (recvspace) Maximum TCP receive window.

       log_in_vain        Log any connection attempts to ports where there is not a socket accepting
                          connections.  A value of 1 limits logging to SYN (connection establishment)
                          packets.  A value of 2 logs any TCP packets sent to closed ports.  Any other
                          value disables logging (the default is 0, i.e., logging is disabled).

       slowstart_flightsize
                          The  number  of  packets  allowed to be in-flight during the TCP slow-start phase on a
                          non-local network.

       local_slowstart_flightsize
                          The number of packets allowed to be in-flight during the TCP slow-start phase to local
                          machines in the same subnet.

       msl                The Maximum Segment Lifetime, in milliseconds, for a packet.

       keepinit           Timeout, in milliseconds, for new, non-established TCP connections.   The  default  is
                          75000 msec.

       keepidle           Amount  of  time,  in  milliseconds, that the connection must be idle before keepalive
                          probes (if enabled) are sent.  The default is 7200000 msec (2 hours).

       keepintvl          The interval, in milliseconds, between keepalive probes sent to remote machines,  when
                          no response is received on a keepidle probe.  The default is 75000 msec.

       keepcnt            Number  of probes sent, with no response, before a connection is dropped.  The default
                          is 8 packets.

       always_keepalive   Assume that SO_KEEPALIVE is set on all TCP connections; the kernel will periodically
                          send a packet to the remote host to verify that the connection is still up.

       icmp_may_rst       Certain ICMP unreachable messages may abort connections in SYN-SENT state.

       do_tcpdrain        Flush packets in the TCP reassembly queue if the system is low on mbufs.

       blackhole          If  enabled,  disable  sending  of  RST when a connection is attempted to a port where
                          there is not a socket accepting connections.  See blackhole(4).

       delayed_ack        Delay ACK to try and piggyback it onto a data packet.

       delacktime         Maximum amount of time, in milliseconds, before a delayed ACK is sent.

       path_mtu_discovery
                          Enable Path MTU Discovery.

       tcbhashsize        Size of the TCP control-block hash table (read-only).  This may  be  tuned  using  the
                          kernel option TCBHASHSIZE or by setting net.inet.tcp.tcbhashsize in the loader(8).

       pcbcount           Number of active protocol control blocks (read-only).

       syncookies         Determines  whether  or  not  SYN  cookies  should  be  generated for outbound SYN-ACK
                          packets.  SYN cookies are a great help during SYN flood attacks, and  are  enabled  by
                          default.  (See syncookies(4).)

       isn_reseed_interval
                          The  interval  (in  seconds)  specifying  how  often  the secret data used in RFC 1948
                          initial sequence number calculations should be reseeded.  By default, this variable is
                          set to zero, indicating that  no  reseeding  will  occur.   Reseeding  should  not  be
                          necessary, and will break TIME_WAIT recycling for a few minutes.

       rexmit_min, rexmit_slop
                          Adjust  the  retransmit timer calculation for TCP.  The slop is typically added to the
                          raw calculation to take into account occasional  variances  that  the  SRTT  (smoothed
                          round-trip  time)  is  unable  to accommodate, while the minimum specifies an absolute
                          minimum.  While a number of TCP RFCs suggest a 1 second minimum, these  RFCs  tend  to
                          focus  on  streaming  behavior, and fail to deal with the fact that a 1 second minimum
                          has severe detrimental effects over lossy interactive connections, such as an 802.11b
                          wireless link, and over very fast but lossy connections for those cases not covered by
                          the fast retransmit code.  For this reason, we use 200ms of slop and a near-0 minimum,
                          which gives us an effective minimum of 200ms (similar to Linux).

       rfc3042            Enable  the  Limited  Transmit  algorithm  as  described  in RFC 3042.  It helps avoid
                          timeouts on lossy links and also when the congestion window is small,  as  happens  on
                          short transfers.

       rfc3390            Enable  support  for  RFC  3390, which allows for a variable-sized starting congestion
                          window on new  connections,  depending  on  the  maximum  segment  size.   This  helps
                          throughput  in  general,  but  particularly affects short transfers and high-bandwidth
                          large propagation-delay connections.

                          When this feature is enabled, the slowstart_flightsize and  local_slowstart_flightsize
                          settings  are not observed for new connection slow starts, but they are still used for
                          slow starts that occur when the connection has been idle and starts sending again.

       sack.enable        Enable support for RFC 2018, TCP Selective Acknowledgment  option,  which  allows  the
                          receiver  to  inform  the sender about all successfully arrived segments, allowing the
                          sender to retransmit the missing segments only.

       sack.maxholes      Maximum number of SACK holes per connection.  Defaults to 128.

       sack.globalmaxholes
                          Maximum number of SACK holes per system, across all connections.  Defaults to 65536.

       maxtcptw           When a TCP connection enters the TIME_WAIT state, its associated socket  structure  is
                          freed,  since  it  is  of negligible size and use, and a new structure is allocated to
                          contain a minimal amount of information necessary for sustaining a connection in  this
                          state,  called  the  compressed  TCP TIME_WAIT state.  Since this structure is smaller
                          than a socket structure, it can save a  significant  amount  of  system  memory.   The
                          net.inet.tcp.maxtcptw  MIB  variable  controls  the maximum number of these structures
                          allocated.  By default, it is initialized to kern.ipc.maxsockets / 5.

       nolocaltimewait    Suppress creation of compressed TCP TIME_WAIT states for connections in which both
                          endpoints are local.

       fast_finwait2_recycle
                          Recycle TCP FIN_WAIT_2 connections faster when the socket is marked as SBS_CANTRCVMORE
                          (no  user  process  has  the socket open, data received on the socket cannot be read).
                          The timeout used here is finwait2_timeout.

       finwait2_timeout   Timeout to use for fast recycling of  TCP  FIN_WAIT_2  connections.   Defaults  to  60
                          seconds.

       ecn.enable         Enable  support  for  TCP  Explicit  Congestion  Notification (ECN).  ECN allows a TCP
                          sender to reduce the transmission rate in order to avoid packet drops.

       ecn.maxretries     Number of retries (SYN or SYN/ACK retransmits) before  disabling  ECN  on  a  specific
                          connection.  This  is  needed  to  help  with  connection  establishment when a broken
                          firewall is in the network path.
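       As an illustration of how the keepalive variables combine, the sketch below reads integer MIB
       variables with sysctlbyname(3) and estimates how long an idle connection to an unresponsive peer
       survives: keepidle plus keepcnt probes spaced keepintvl apart.  The helper names are hypothetical,
       the estimate is approximate, and the fallback to the documented defaults on systems other than
       FreeBSD is an assumption of the example.

```c
#include <sys/types.h>
#include <stddef.h>
#ifdef __FreeBSD__
#include <sys/sysctl.h>
#endif

/* Defaults quoted in this manual page. */
#define KEEPIDLE_DEFAULT_MS	7200000
#define KEEPINTVL_DEFAULT_MS	75000
#define KEEPCNT_DEFAULT		8

/*
 * Read an integer variable such as "net.inet.tcp.keepidle".  Where
 * sysctlbyname(3) is unavailable or fails, return the supplied fallback.
 * tcp_mib_int is a hypothetical helper name.
 */
long
tcp_mib_int(const char *name, long fallback)
{
#ifdef __FreeBSD__
	int val;
	size_t len = sizeof(val);

	if (sysctlbyname(name, &val, &len, NULL, 0) == 0 && len == sizeof(val))
		return (val);
#else
	(void)name;	/* assumption: no sysctlbyname(3) on this system */
#endif
	return (fallback);
}

/*
 * Approximate time, in milliseconds, before an idle connection to a
 * silent peer is dropped: the idle period plus cnt unanswered probes.
 */
long
keepalive_drop_ms(long idle_ms, long intvl_ms, long cnt)
{
	return (idle_ms + intvl_ms * cnt);
}
```

       With the defaults listed above this gives 7200000 + 8 * 75000 = 7800000 msec, or 130 minutes.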

ERRORS

       A socket operation may fail with one of the following errors returned:

       [EISCONN]          when trying to establish a connection on a socket which already has one;

       [ENOBUFS]          when the system runs out of memory for an internal data structure;

       [ETIMEDOUT]        when a connection was dropped due to excessive retransmissions;

       [ECONNRESET]       when the remote peer forces the connection to be closed;

       [ECONNREFUSED]     when the remote peer actively refuses connection  establishment  (usually  because  no
                          process is listening to the port);

       [EADDRINUSE]       when  an  attempt  is  made  to  create  a  socket  with a port which has already been
                          allocated;

       [EADDRNOTAVAIL]    when an attempt is made to create a socket with a network address for which no network
                          interface exists;

       [EAFNOSUPPORT]     when an attempt is made to bind or connect a socket to a multicast address.
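       One of these errors can be provoked deliberately.  The sketch below binds a socket to a loopback
       port chosen by the system and then binds a second socket to the same address and port, which is
       expected to fail with EADDRINUSE.  The helper name second_bind_errno is hypothetical, and the
       example assumes that no address-reuse socket options are set.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/*
 * Returns the errno produced by binding a second socket to an address
 * and port already in use, 0 if that bind unexpectedly succeeded, or
 * -1 if the setup itself failed.
 * second_bind_errno is a hypothetical helper name.
 */
int
second_bind_errno(void)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);
	int s1, s2, err = -1;

	s1 = socket(AF_INET, SOCK_STREAM, 0);
	s2 = socket(AF_INET, SOCK_STREAM, 0);
	if (s1 == -1 || s2 == -1)
		goto out;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = 0;	/* let the system pick a port */

	/* After getsockname(2), sin holds the port actually assigned. */
	if (bind(s1, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
	    getsockname(s1, (struct sockaddr *)&sin, &len) == -1)
		goto out;

	if (bind(s2, (struct sockaddr *)&sin, sizeof(sin)) == -1)
		err = errno;	/* save before close(2) can change it */
	else
		err = 0;
out:
	if (s1 != -1)
		close(s1);
	if (s2 != -1)
		close(s2);
	return (err);
}
```

       A program would normally compare errno against these values directly after the failing call, or
       report it with strerror(3).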

SEE ALSO

       getsockopt(2), socket(2), sysctl(3), blackhole(4),  inet(4),  intro(4),  ip(4),  mod_cc(4),  syncache(4),
       setkey(8)

       V. Jacobson, R. Braden, and D. Borman, TCP Extensions for High Performance, RFC 1323.

       A. Heffernan, Protection of BGP Sessions via the TCP MD5 Signature Option, RFC 2385.

       K.  Ramakrishnan,  S.  Floyd, and D. Black, The Addition of Explicit Congestion Notification (ECN) to IP,
       RFC 3168.

HISTORY

       The TCP protocol appeared in 4.2BSD.  The RFC 1323 extensions for  window  scaling  and  timestamps  were
       added in 4.4BSD.  The TCP_INFO option was introduced in Linux 2.6 and is subject to change.

Debian                                          November 19, 2012                                         TCP(4)