Network Working Group                                          M. Horton
Request for Comments:  1036                      AT&T Bell Laboratories
Obsoletes: RFC-850                                              R. Adams
                                              Center for Seismic Studies
                                                           December 1987
 
 
 
 
              Standard for Interchange of USENET Messages
  
 
STATUS OF THIS MEMO

    This document defines the standard format for the interchange of
    network News messages among USENET hosts.  It updates and replaces
    RFC-850, reflecting version B2.11 of the News program.  This memo is
    distributed as an RFC to make this information easily accessible to
    the Internet community.  It does not specify an Internet standard.
    Distribution of this memo is unlimited.
 
 
1.  Introduction

    This document defines the standard format for the interchange of
    network News messages among USENET hosts.  It describes the format
    for messages themselves and gives partial standards for transmission
    of news.  The news transmission is not entirely standardized, in
    order to give a good deal of flexibility to the hosts to choose
    transmission hardware and software, to batch news, and so on.

    There are five sections to this document.  Section two defines the
    format.  Section three defines the valid control messages.  Section
    four specifies some valid transmission methods.  Section five
    describes the overall news propagation algorithm.
 
 
 
2.  Message Format

    The primary consideration in choosing a message format is that it
    fit in with existing tools as well as possible.  Existing tools
    include implementations of both mail and news.  (The notesfiles
    system from the University of Illinois is considered a news
    implementation.)  A standard format for mail messages has existed
    for many years on the Internet, and this format meets most of the
    needs of USENET.  Since the Internet format is extensible,
    extensions to meet the additional needs of USENET are easily made
    within the Internet standard.  Therefore, the rule is adopted that
    all USENET news messages must be formatted as valid Internet mail
    messages, according to the Internet standard RFC-822.  The USENET
    News standard is more restrictive than the Internet standard,
 
 
 
 
 
 
 
Horton & Adams                                                  [Page 1]
 
 
 
RFC 1036              Standard for USENET Messages        December 1987
 
 
 
 
 
    placing additional requirements on each message and forbidding use
    of certain Internet features.  However, it should always be possible
    to use a tool expecting an Internet message to process a news
    message.  In any situation where this standard conflicts with the
    Internet standard, RFC-822 should be considered correct and this
    standard in error.
 
 
 
    Here is an example USENET message to illustrate the fields.

              From: jerry@eagle.ATT.COM (Jerry Schwarz)
              Path: cbosgd!mhuxj!mhuxt!eagle!jerry
              Newsgroups: news.announce
              Subject: Usenet Etiquette -- Please Read
              Message-ID: <642@eagle.ATT.COM>
              Date: Fri, 19 Nov 82 16:14:55 GMT
              Followup-To: news.misc
              Expires: Sat, 1 Jan 83 00:00:00 -0500
              Organization: AT&T Bell Laboratories, Murray Hill

              The body of the message comes here, after a blank line.
 
 
 
    Here is an example of a message in the old format (before the
    existence of this standard).  It is recommended that
    implementations also accept messages in this format to ease upward
    conversion.

              From: cbosgd!mhuxj!mhuxt!eagle!jerry (Jerry Schwarz)
              Newsgroups: news.misc
              Title: Usenet Etiquette -- Please Read
              Article-I.D.: eagle.642
              Posted: Fri Nov 19 16:14:55 1982
              Received: Fri Nov 19 16:59:30 1982
              Expires: Mon Jan 1 00:00:00 1990

              The body of the message comes here, after a blank line.
 
 
 
    Some news systems transmit news in the A format, which looks like
    this:

              Aeagle.642
              news.misc
              cbosgd!mhuxj!mhuxt!eagle!jerry
              Fri Nov 19 16:14:55 1982
              Usenet Etiquette - Please Read
              The body of the message comes here, with no blank line.
 
 
 
    A standard USENET message consists of several header lines, followed
    by a blank line, followed by the body of the message.  Each header
    line consists of a keyword, a colon, a blank, and some additional
    information.  This is a subset of the Internet standard, simplified
    to allow simpler software to handle it.  The "From" line may
    optionally include a full name, in the format above, or use the
    Internet angle bracket syntax.  To keep the implementations simple,
    other formats (for example, with part of the machine address after
    the close parenthesis) are not allowed.  The Internet convention of
    continuation header lines (beginning with a blank or tab) is
    allowed.
 
 
 
    Certain headers are required, and certain other headers are
    optional.  Any unrecognized headers are allowed, and will be passed
    through unchanged.  The required header lines are "From", "Date",
    "Newsgroups", "Subject", "Message-ID", and "Path".  The optional
    header lines are "Followup-To", "Expires", "Reply-To", "Sender",
    "References", "Control", "Distribution", "Keywords", "Summary",
    "Approved", "Lines", "Xref", and "Organization".  Each of these
    header lines will be described below.
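
    The layout just described (keyword, colon, blank, value, optional
    continuation lines, then a blank line before the body) can be
    sketched in a few lines of Python.  This is only an illustration,
    not part of the standard: the names parse_message and
    missing_required are hypothetical, header keywords are compared
    without regard to case (an assumption borrowed from RFC-822), and a
    repeated header simply overwrites the earlier one.

```python
REQUIRED_HEADERS = ("From", "Date", "Newsgroups", "Subject", "Message-ID", "Path")


def parse_message(text):
    """Split a message into (headers, body).

    headers maps each keyword to its value.  Continuation lines
    (starting with a blank or tab) are folded into the previous header.
    """
    head, _, body = text.partition("\n\n")
    headers = {}
    last = None
    for line in head.split("\n"):
        if line[:1] in (" ", "\t") and last is not None:
            # Continuation of the previous header line.
            headers[last] += " " + line.strip()
            continue
        keyword, sep, value = line.partition(":")
        if not sep:
            raise ValueError("malformed header line: %r" % line)
        last = keyword.strip()
        headers[last] = value.strip()
    return headers, body


def missing_required(headers):
    """Return the required header names that are absent from headers."""
    present = {name.lower() for name in headers}
    return [name for name in REQUIRED_HEADERS if name.lower() not in present]
```

    Unrecognized headers end up in the same dictionary as the known
    ones, so a forwarding program built on this sketch can write them
    back out unchanged.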
 
 
 
2.1.  Required Header lines

2.1.1.  From

    The "From" line contains the electronic mailing address of the
    person who sent the message, in the Internet syntax.  It may
    optionally also contain the full name of the person, in parentheses,
    after the electronic address.  The electronic address is the same as
    the entity responsible for originating the message, unless the
    "Sender" header is present, in which case the "From" header might
    not be verified.  Note that in all host and domain names, upper and
    lower case are considered the same, thus "mark@cbosgd.ATT.COM",
    "mark@cbosgd.att.com", and "mark@cbosgD.ATT.COM" are all equivalent.
    User names may or may not be case sensitive, for example,
    "Billy@cbosgd.ATT.COM" might be different from
    "BillY@cbosgd.ATT.COM".  Programs should avoid changing the case of
    electronic addresses when forwarding news or mail.
 
 
 
    RFC-822 specifies that all text in parentheses is to be interpreted
    as a comment.  It is common in Internet mail to place the full name
    of the user in a comment at the end of the "From" line.  This
    standard specifies a more rigid syntax.  The full name is not
    considered a comment, but an optional part of the header line.
    Either the full name is omitted, or it appears in parentheses after
    the electronic address of the person posting the message, or it
    appears before an electronic address which is enclosed in angle
    brackets.  Thus, the three permissible forms are:

              From: mark@cbosgd.ATT.COM
              From: mark@cbosgd.ATT.COM (Mark Horton)
              From: Mark Horton <mark@cbosgd.ATT.COM>
 
 
 
    Full names may contain any printing ASCII characters from space
    through tilde, except that they may not contain "(" (left
    parenthesis), ")" (right parenthesis), "<" (left angle bracket), or
    ">" (right angle bracket).  Additional restrictions may be placed on
    full names by the mail standard, in particular, the characters ","
    (comma), ":" (colon), "@" (at), "!" (bang), "/" (slash), "="
    (equal), and ";" (semicolon) are inadvisable in full names.
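
    As a rough illustration of the three permissible "From" forms, the
    following Python sketch accepts exactly those layouts and rejects
    anything else.  The function names parse_from and same_mailbox are
    hypothetical, the address pattern is deliberately loose (it only
    excludes white space, parentheses and angle brackets), and the
    printing-ASCII restriction on full names is not checked here.

```python
import re

_NAME = r"[^()<>]+"      # full names may not contain ( ) < >
_ADDR = r"[^\s()<>]+"    # loose stand-in for an Internet address

_FORMS = (
    re.compile(r"^(?P<addr>%s)$" % _ADDR),                            # addr
    re.compile(r"^(?P<addr>%s) \((?P<name>%s)\)$" % (_ADDR, _NAME)),  # addr (Full Name)
    re.compile(r"^(?P<name>%s) <(?P<addr>%s)>$" % (_NAME, _ADDR)),    # Full Name <addr>
)


def parse_from(value):
    """Return (address, full_name_or_None) for a "From" header value."""
    value = value.strip()
    for pattern in _FORMS:
        match = pattern.match(value)
        if match:
            return match.group("addr"), match.groupdict().get("name")
    raise ValueError("not one of the three permitted From forms: %r" % value)


def same_mailbox(a, b):
    """Compare two addresses: host part ignoring case, user part exactly."""
    user_a, _, host_a = a.rpartition("@")
    user_b, _, host_b = b.rpartition("@")
    return user_a == user_b and host_a.lower() == host_b.lower()
```

    Comparing the user part exactly is the conservative choice, since
    the text notes that user names may or may not be case sensitive.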
 
 
 
2.1.2.  Date

    The "Date" line (formerly "Posted") is the date that the message was
    originally posted to the network.  Its format must be acceptable
    both in RFC-822 and to the getdate(3) routine that is provided with
    the Usenet software.  This date remains unchanged as the message is
    propagated throughout the network.  One format that is acceptable to
    both is:

                      Wdy, DD Mon YY HH:MM:SS TIMEZONE

    Several examples of valid dates appear in the sample message above.
    Note in particular that ctime(3) format:

                          Wdy Mon DD HH:MM:SS YYYY

    is not acceptable because it is not a valid RFC-822 date.  However,
    since older software still generates this format, news
    implementations are encouraged to accept this format and translate
    it into an acceptable format.
 
 
 
    There is no hope of having a complete list of timezones.  Universal
    Time (GMT), the North American timezones (PST, PDT, MST, MDT, CST,
    CDT, EST, EDT) and the +/-hhmm offset specified in RFC-822 should be
    supported.  It is recommended that times in message headers be
    transmitted in GMT and displayed in the local time zone.
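
    One possible way to accept both date forms and re-emit the date in
    GMT is sketched below, using Python's standard email.utils module.
    The names parse_news_date and to_gmt_header are hypothetical.  Two
    assumptions are made that the text does not state: a ctime(3) date,
    which carries no zone, is taken to be GMT, and the day and month
    abbreviations are parsed in an English (C) locale.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

CTIME_FORMAT = "%a %b %d %H:%M:%S %Y"   # Wdy Mon DD HH:MM:SS YYYY


def parse_news_date(value, assume_tz=timezone.utc):
    """Parse an RFC-822 style date or a legacy ctime(3) date."""
    value = value.strip()
    try:
        # Legacy form: no zone information, so attach the assumed zone.
        return datetime.strptime(value, CTIME_FORMAT).replace(tzinfo=assume_tz)
    except ValueError:
        pass
    parsed = parsedate_to_datetime(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=assume_tz)
    return parsed


def to_gmt_header(moment):
    """Format an aware datetime as a GMT date suitable for a header."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)
```

    Note that format_datetime writes a four-digit year, which differs
    from the two-digit YY shown in the format above; an implementation
    that must emit YY exactly would need its own formatting step.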
 
 
 
2.1.3.  Newsgroups

    The "Newsgroups" line specifies the newsgroup or newsgroups in which
    the message belongs.  Multiple newsgroups may be specified,
    separated by a comma.  Newsgroups specified must all be the names of
    existing newsgroups, as no new newsgroups will be created by simply
    posting to them.

    Wildcards (e.g., the word "all") are never allowed in a
    "Newsgroups" line.  For example, a newsgroup comp.all is illegal,
    although a newsgroup rec.sport.football is permitted.

    If a message is received with a "Newsgroups" line listing some valid
    newsgroups and some invalid newsgroups, a host should not remove
    invalid newsgroups from the list.  Instead, the invalid newsgroups
    should be ignored.  For example, suppose host A subscribes to the
    classes btl.all and comp.all, and exchanges news messages with host
    B, which subscribes to comp.all but not btl.all.  Suppose A receives
    a message with Newsgroups: comp.unix,btl.general.

    This message is passed on to B because B receives comp.unix, but B
    does not receive btl.general.  A must leave the "Newsgroups" line
    unchanged.  If it were to remove btl.general, the edited header
    could eventually re-enter the btl.all class, resulting in a message
    that is not shown to users subscribing to btl.general.  Also,
    follow-ups from outside btl.all would not be shown to such users.
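
    The forwarding decision in the example above can be sketched as
    follows.  The function names are hypothetical, and the matching
    rule is an assumption made for illustration: "all" in a
    subscription class stands for any single name component, and a
    class matches any newsgroup that it is a prefix of.  The sketch
    only decides whether to forward; it never edits the "Newsgroups"
    value.

```python
def split_newsgroups(value):
    """Split a "Newsgroups" value on commas into individual names."""
    return [group.strip() for group in value.split(",") if group.strip()]


def has_wildcard(group):
    """True if the name uses the word "all" as a component."""
    return "all" in group.split(".")


def class_matches(pattern, group):
    """True if a subscription class such as comp.all covers the group."""
    p_parts = pattern.split(".")
    g_parts = group.split(".")
    if len(p_parts) > len(g_parts):
        return False
    return all(p == "all" or p == g for p, g in zip(p_parts, g_parts))


def should_forward(newsgroups_value, neighbor_classes):
    """True if the neighbor subscribes to at least one listed group."""
    groups = split_newsgroups(newsgroups_value)
    return any(class_matches(p, g) for g in groups for p in neighbor_classes)
```

    With host B subscribing to comp.all only, a message for
    comp.unix,btl.general is forwarded to B, while a message for
    btl.general alone is not.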
 
  
2.1.4.  Subject

    The "Subject" line (formerly "Title") tells what the message is
    about.  It should be suggestive enough of the contents of the
    message to enable a reader to make a decision whether to read the
    message based on the subject alone.  If the message is submitted in
    response to another message (e.g., is a follow-up) the default
    subject should begin with the four characters "Re: ", and the
    "References" line is required.  For follow-ups, the use of the
    "Summary" line is encouraged.
 
  
2.1.5.  Message-ID

    The "Message-ID" line gives the message a unique identifier.  The
    Message-ID may not be reused during the lifetime of any previous
    message with the same Message-ID.  (It is recommended that no
    Message-ID be reused for at least two years.)  Message-ID's have the
    syntax:

                    <string not containing blank or ">">

    In order to conform to RFC-822, the Message-ID must have the format:

                          <unique@full_domain_name>

    where full_domain_name is the full name of the host at which the
    message entered the network, including a domain that host is in, and
    unique is any string of printing ASCII characters, not including "<"
    (left angle bracket), ">" (right angle bracket), or "@" (at sign).

    For example, the unique part could be an integer representing a
    sequence number for messages submitted to the network, or a short
    string derived from the date and time the message was created.  For
    example, a valid Message-ID for a message submitted from host ucbvax
    in domain "Berkeley.EDU" would be "<4123@ucbvax.Berkeley.EDU>".
    Programmers are urged not to make assumptions about the content of
    Message-ID fields from other hosts, but to treat them as unknown
    character strings.  It is not safe, for example, to assume that a
    Message-ID will be under 14 characters, that it is unique in the
    first 14 characters, nor that it does not contain a "/".

    The angle brackets are considered part of the Message-ID.  Thus, in
    references to the Message-ID, such as the ihave/sendme and cancel
    control messages, the angle brackets are included.  White space
    characters (e.g., blank and tab) are not allowed in a Message-ID.
    Slashes ("/") are strongly discouraged.  All characters between the
    angle brackets must be printing ASCII characters.
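
    A minimal check of the <unique@full_domain_name> shape might look
    like the Python sketch below.  The names is_valid_message_id and
    is_duplicate are hypothetical.  The sketch applies the same
    character rule to both halves (printing ASCII, no blank, no "<",
    ">" or "@"), which is an assumption for the domain half, and it
    keeps the angle brackets as part of the identifier as the text
    requires.

```python
def is_valid_message_id(value):
    """Check the <unique@full_domain_name> shape of a Message-ID."""
    if len(value) < 5 or value[0] != "<" or value[-1] != ">":
        return False
    unique, sep, domain = value[1:-1].partition("@")
    if not sep or not unique or not domain:
        return False
    for ch in unique + domain:
        # Printing ASCII without the blank is "!" through "~".
        if not ("!" <= ch <= "~") or ch in "<>@":
            return False
    return True


def is_duplicate(message_id, seen):
    """Record message_id in seen; report whether it was already there.

    The identifier is treated as an opaque string, brackets included.
    """
    if message_id in seen:
        return True
    seen.add(message_id)
    return False
```

    Slashes are accepted by this check because the text discourages
    them without forbidding them.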
 
  
2.1.6.  Path

    This line shows the path the message took to reach the current
    system.  When a system forwards the message, it should add its own
    name to the list of systems in the "Path" line.  The names may be
    separated by any punctuation character or characters (except "."
    which is considered part of the hostname).  Thus, the following are
    valid entries:

                  cbosgd!mhuxj!mhuxt
                  cbosgd, mhuxj, mhuxt
                  @cbosgd.ATT.COM,@mhuxj.ATT.COM,@mhuxt.ATT.COM
                  teklabs, zehntel, sri-unix@cca!decvax

    (The latter path indicates a message that passed through decvax,
    cca, sri-unix, zehntel, and teklabs, in that order.)  Additional
    names should be added from the left.  For example, the most recently
    added name in the fourth example was teklabs.  Letters, digits,
    periods and hyphens are considered part of host names; other
    punctuation, including blanks, are considered separators.

    Normally, the rightmost name will be the name of the originating
    system.  However, it is also permissible to include an extra entry
    on the right, which is the name of the sender.  This is for upward
    compatibility with older systems.

    The "Path" line is not used for replies, and should not be taken as
    a mailing address.  It is intended to show the route the message
    traveled to reach the local host.  There are several uses for this
    information.  One is to monitor USENET routing for performance
    reasons.  Another is to establish a path to reach new hosts.
    Perhaps the most important use is to cut down on redundant USENET
    traffic by not forwarding a message to a host that is known to
    have already received it.  In particular, when host A sends a
    message to host B, the "Path" line includes A, so that host B will
    not immediately send the message back to host A.  The name each host
    uses to identify itself should be the same as the name by which its
    neighbors know it, in order to make this optimization possible.

    A host adds its own name to the front of a path when it receives a
    message from another host.  Thus, if a message with path "A!X!Y!Z"
    is passed from host A to host B, B will add its own name to the path
    when it receives the message from A, e.g., "B!A!X!Y!Z".  If B then
    passes the message on to C, the message sent to C will contain the
    path "B!A!X!Y!Z", and when C receives it, C will change it to
    "C!B!A!X!Y!Z".

    Special upward compatibility note: Since the "From", "Sender", and
    "Reply-To" lines are in Internet format, and since many USENET hosts
    do not yet have mailers capable of understanding Internet format, it
    would break the reply capability to completely sever the connection
    between the "Path" header and the reply function.  It is recognized
    that the path is not always a valid reply string in older
    implementations, and no requirement to fix this problem is placed on
    implementations.  However, the existing convention of placing the
    host name and an "!" at the front of the path, and of starting the
    path with the host name, an "!", and the user name, should be
    maintained when possible.
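
    The "Path" handling described in this section (split on
    separators, add the local name on the left, skip neighbors that
    already appear) can be sketched in Python as below.  The function
    names are hypothetical, and comparing host names without regard to
    case is an assumption carried over from the "From" section.

```python
import re

# Letters, digits, periods and hyphens belong to host names;
# any other character, including a blank, is a separator.
_SEPARATORS = re.compile(r"[^A-Za-z0-9.-]+")


def path_entries(path):
    """Return the names in a "Path" value, most recently added first."""
    return [name for name in _SEPARATORS.split(path) if name]


def prepend_host(path, host):
    """Add the receiving host's name to the front of the path."""
    return host + "!" + path


def already_seen(path, neighbor):
    """True if the neighbor's name already appears in the path."""
    wanted = neighbor.lower()
    return any(name.lower() == wanted for name in path_entries(path))
```

    Because the rightmost entry may be a user name rather than a host,
    already_seen can occasionally report a match for a host that never
    handled the message; a careful implementation might ignore the
    final entry when that matters.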
 
  
2.2.  Optional Headers

2.2.1.  Reply-To

    This line has the same format as "From".  If present, mailed replies
    to the author should be sent to the name given here.  Otherwise,
    replies are mailed to the name on the "From" line.  (This does not
    prevent additional copies from being sent to recipients named by the
    replier, or on "To" or "Cc" lines.)  The full name may be optionally
    given, in parentheses, as in the "From" line.
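
    The choice of reply address can be stated in a few lines of
    Python.  This is a sketch only: reply_target is a hypothetical
    name, headers is assumed to be a dictionary of header keyword to
    value, and keywords are compared without regard to case.

```python
def reply_target(headers):
    """Return the header value a mailed reply should be addressed to."""
    lowered = {name.lower(): value for name, value in headers.items()}
    # "Reply-To" wins when present; otherwise fall back to "From".
    target = lowered.get("reply-to") or lowered.get("from")
    if not target:
        raise ValueError("message has neither Reply-To nor From")
    return target
```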
 
  
2.2.2Sender
+
The "Subject" line (formerly "Title") tells what the message is
 +
about. It should be suggestive enough of the contents of the
 +
message to enable a reader to make a decision whether to read the
 +
message based on the subject alone. If the message is submitted in
 +
response to another message (e.g., is a follow-up) the default
 +
subject should begin with the four characters "Re:", and the
 +
"References" line is requiredFor follow-ups, the use of the
 +
"Summary" line is encouraged.
  
    This field is present only if the submitter manually enters a "From"
+
==== Message-ID ====
    line.  It is intended to record the entity responsible for
 
    submitting the message to the network.  It should be verified by the
 
    software at the submitting host.
 
  
 +
The "Message-ID" line gives the message a unique identifier.  The
 +
Message-ID may not be reused during the lifetime of any previous
 +
message with the same Message-ID.  (It is recommended that no
 +
Message-ID be reused for at least two years.)  Message-ID's have the
 +
syntax:
  
 +
                  <string not containing blank or ">">
  
 +
In order to conform to RFC-822, the Message-ID must have the format:
  
 +
                      <unique@full_domain_name>
  
Horton & Adams                                                  [Page 7]
+
where full_domain_name is the full name of the host at which the
 +
message entered the network, including a domain that host is in, and
 +
unique is any string of printing ASCII characters, not including "<"
 +
(left angle bracket), ">" (right angle bracket), or "@" (at sign).
  
RFC 1036              Standard for USENET Messages        December 1987
+
For example, the unique part could be an integer representing a
 +
sequence number for messages submitted to the network, or a short
 +
string derived from the date and time the message was created.  For
 +
example, a valid Message-ID for a message submitted from host ucbvax
 +
in domain "Berkeley.EDU" would be "<[email protected]>".
 +
Programmers are urged not to make assumptions about the content of
 +
Message-ID fields from other hosts, but to treat them as unknown
 +
character strings.  It is not safe, for example, to assume that a
 +
Message-ID will be under 14 characters, that it is unique in the
 +
first 14 characters, nor that is does not contain a "/".
  
 +
The angle brackets are considered part of the Message-ID.  Thus, in
 +
references to the Message-ID, such as the ihave/sendme and cancel
 +
control messages, the angle brackets are included.  White space
 +
characters (e.g., blank and tab) are not allowed in a Message-ID.
 +
Slashes ("/") are strongly discouraged.  All characters between the
 +
angle brackets must be printing ASCII characters.
  
    For example, if John Smith is visiting CCA and wishes to post a
+
==== Path ====
    message to the network, using friend Sarah Jones' account, the
 
    message might read:
 
  
              From: smith@ucbvax.Berkeley.EDU (John Smith)
+
This line shows the path the message took to reach the current
              Sender: [email protected] (Sarah Jones)
+
system. When a system forwards the message, it should add its own
 +
name to the list of systems in the "Path" line. The names may be
 +
separated by any punctuation character or characters (except "."
 +
which is considered part of the hostname).  Thus, the following are
 +
valid entries:
  
    If a gateway program enters a mail message into the network at host
+
                cbosgd!mhuxj!mhuxt
    unix.SRI.COM, the lines might read:
+
                cbosgd, mhuxj, mhuxt
 +
                @cbosgd.ATT.COM,@mhuxj.ATT.COM,@mhuxt.ATT.COM
 +
                teklabs, zehntel, sri-unix@cca!decvax
  
              From: John.Doe@A.CS.CMU.EDU
+
(The latter path indicates a message that passed through decvax,
              Sender: [email protected].COM
+
cca, sri-unix, zehntel, and teklabs, in that order.) Additional
 +
names should be added from the left. For example, the most recently
 +
added name in the fourth example was teklabs. Letters, digits,
 +
periods and hyphens are considered part of host names; other
 +
punctuation, including blanks, are considered separators.
  
    The primary purpose of this field is to be able to track down
+
Normally, the rightmost name will be the name of the originating
    messages to determine how they were entered into the network.  The
+
system.  However, it is also permissible to include an extra entry
    full name may be optionally given, in parentheses, as in the "From"
+
on the right, which is the name of the sender.  This is for upward
    line.
+
compatibility with older systems.
  
2.2.3.  Followup-To
+
The "Path" line is not used for replies, and should not be taken as
 +
a mailing address. It is intended to show the route the message
 +
traveled to reach the local host. There are several uses for this
 +
information.  One is to monitor USENET routing for performance

    This line has the same format as "Newsgroups".  If present, follow-
+
reasons.  Another is to establish a path to reach new hosts.
    up messages are to be posted to the newsgroup or newsgroups listed
+
Perhaps the most important use is to cut down on redundant USENET
    here.  If this line is not present, follow-ups are posted to the
+
traffic by failing to forward a message to a host that is known to
    newsgroup or newsgroups listed in the "Newsgroups" line.
+
have already received it.  In particular, when host A sends a
 +
message to host B, the "Path" line includes A, so that host B will
 +
not immediately send the message back to host A.  The name each host
 +
uses to identify itself should be the same as the name by which its
 +
neighbors know it, in order to make this optimization possible.
  
    If the keyword poster is present, follow-up messages are not
+
A host adds its own name to the front of a path when it receives a
    permitted.  The message should be mailed to the submitter of the
+
message from another host.  Thus, if a message with path "A!X!Y!Z"
    message via mail.
+
is passed from host A to host B, B will add its own name to the path
 +
when it receives the message from A, e.g., "B!A!X!Y!Z".  If B then
 +
  passes the message on to C, the message sent to C will contain the
 +
path "B!A!X!Y!Z", and when C receives it, C will change it to
 +
"C!B!A!X!Y!Z".
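
The separator rule and the prepending rule described above can be sketched as follows.  This is a minimal illustration, not a reference implementation; the names path_hosts, prepend_host and already_seen are hypothetical.  The sketch treats letters, digits, periods and hyphens as part of a name and everything else as a separator, and it always uses "!" when adding the local name.

```python
import re

# Anything other than a letter, digit, period or hyphen separates names.
_SEPARATORS = re.compile(r"[^A-Za-z0-9.\-]+")


def path_hosts(path):
    """Split a "Path" value into names, most recently added first."""
    return [name for name in _SEPARATORS.split(path) if name]


def prepend_host(path, local_host):
    """Add the local host name to the front (left) of the path."""
    return local_host + "!" + path if path else local_host


def already_seen(path, neighbor):
    """True if the neighbor's name appears in the path.

    Note that the rightmost entry may be a user name rather than a host,
    so a match on that entry is not conclusive.
    """
    return neighbor in path_hosts(path)
```

For instance, a host named B receiving a message with path "A!X!Y!Z" would store prepend_host("A!X!Y!Z", "B"), and would then skip any neighbor for which already_seen returns true.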
  
2.2.4. Expires
+
Special upward compatibility note:  Since the "From", "Sender", and
 +
"Reply-To" lines are in Internet format, and since many USENET hosts
 +
do not yet have mailers capable of understanding Internet format, it
 +
would break the reply capability to completely sever the connection
 +
between the "Path" header and the reply function. It is recognized
 +
that the path is not always a valid reply string in older
 +
implementations, and no requirement to fix this problem is placed on
 +
implementations. However, the existing convention of placing the
 +
host name and an "!"  at the front of the path, and of starting the
 +
path with the host name, an "!", and the user name, should be
 +
maintained when possible.
  
    This line, if present, is in a legal USENET date format.  It
+
=== Optional Headers ===
    specifies a suggested expiration date for the message.  If not
 
    present, the local default expiration date is used.  This field is
 
    intended to be used to clean up messages with a limited usefulness,
 
    or to keep important messages around for longer than usual.  For
 
    example, a message announcing an upcoming seminar could have an
 
    expiration date the day after the seminar, since the message is not
 
    useful after the seminar is over.  Since local hosts have local
 
    policies for expiration of news (depending on available disk space,
 
    for instance), users are discouraged from providing expiration dates
 
    for messages unless there is a natural expiration date associated
 
    with the topic.  System software should almost never provide a
 
    default "Expires" line.  Leave it out and allow local policies to be
 
    used unless there is a good reason not to.
 
  
 +
==== Reply-To ====
  
 +
This line has the same format as "From".  If present, mailed replies
 +
to the author should be sent to the name given here.  Otherwise,
 +
replies are mailed to the name on the "From" line. (This does not
 +
prevent additional copies from being sent to recipients named by the
 +
replier, or on "To" or "Cc" lines.)  The full name may be optionally
 +
given, in parentheses, as in the "From" line.
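
The choice of reply address described above can be sketched in a few lines.  The function name reply_address is hypothetical, the headers are assumed to be held in a dictionary keyed by header name, and only the "address (Full Name)" form mentioned in this section is handled.

```python
def reply_address(headers):
    """Return the address to which a mailed reply should go."""
    # "Reply-To" wins when present; otherwise fall back to "From".
    value = headers.get("Reply-To") or headers["From"]
    # Drop an optional full name given in parentheses after the address.
    return value.split("(", 1)[0].strip()
```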
  
 +
==== Sender ====
  
 +
This field is present only if the submitter manually enters a "From"
 +
line.  It is intended to record the entity responsible for
 +
submitting the message to the network.  It should be verified by the
 +
software at the submitting host.
  
 +
For example, if John Smith is visiting CCA and wishes to post a
 +
message to the network, using friend Sarah Jones' account, the
 +
message might read:
  
+
          From: smith@ucbvax.Berkeley.EDU (John Smith)
 +
          Sender: [email protected] (Sarah Jones)
  
+
If a gateway program enters a mail message into the network at host
 +
unix.SRI.COM, the lines might read:
  
 +
          From: John.Doe@A.CS.CMU.EDU
 +
          Sender: [email protected]
  
2.2.5.  References
+
The primary purpose of this field is to be able to track down
 +
messages to determine how they were entered into the network. The
 +
full name may be optionally given, in parentheses, as in the "From"
 +
line.
  
    This field lists the Message-ID's of any messages prompting the
+
==== Followup-To ====
    submission of this message.  It is required for all follow-up
 
    messages, and forbidden when a new subject is raised.
 
    Implementations should provide a follow-up command, which allows a
 
    user to post a follow-up message.  This command should generate a
 
    "Subject" line which is the same as the original message, except
 
    that if the original subject does not begin with "Re:" or "re:", the
 
    four characters "Re:" are inserted before the subject.  If there is
 
    no "References" line on the original header, the "References" line
 
    should contain the Message-ID of the original message (including the
 
    angle brackets).  If the original message does have a "References"
 
    line, the follow-up message should have a "References" line
 
    containing the text of the original "References" line, a blank, and
 
    the Message-ID of the original message.
 
  
    The purpose of the "References" header is to allow messages to be
+
This line has the same format as "Newsgroups".  If present, follow-
    grouped into conversations by the user interface program.  This
+
up messages are to be posted to the newsgroup or newsgroups listed
    allows conversations within a newsgroup to be kept together, and
+
here.  If this line is not present, follow-ups are posted to the
    potentially users might shut off entire conversations without
+
newsgroup or newsgroups listed in the "Newsgroups" line.
    unsubscribing to a newsgroup.  User interfaces need not make use of
 
    this header, but all automatically generated follow-ups should
 
    generate the "References" line for the benefit of systems that do
 
    use it, and manually generated follow-ups (e.g., typed in well after
 
    the original message has been printed by the machine) should be
 
    encouraged to include them as well.
 
  
    It is permissible to not include the entire previous "References"
+
If the keyword poster is present, follow-up messages are not
    line if it is too long.  An attempt should be made to include a
+
permitted.  The message should be mailed to the submitter of the
    reasonable number of backwards references.
+
message via mail.
  
2.2.6.  Control
+
==== Expires ====
  
    If a message contains a "Control" line, the message is a control
+
This line, if present, is in a legal USENET date format.  It
    message.  Control messages are used for communication among USENET
+
specifies a suggested expiration date for the message.  If not
    host machines, not to be read by users. Control messages are
+
present, the local default expiration date is used.  This field is
    distributed by the same newsgroup mechanism as ordinary messages.
+
intended to be used to clean up messages with a limited usefulness,
    The body of the "Control" header line is the message to the host.
+
or to keep important messages around for longer than usual.  For
 +
example, a message announcing an upcoming seminar could have an
 +
expiration date the day after the seminar, since the message is not
 +
useful after the seminar is over.  Since local hosts have local
 +
policies for expiration of news (depending on available disk space,
 +
for instance), users are discouraged from providing expiration dates
 +
  for messages unless there is a natural expiration date associated
 +
with the topic. System software should almost never provide a
 +
default "Expires" line.  Leave it out and allow local policies to be
 +
used unless there is a good reason not to.
  
    For upward compatibility, messages that match the newsgroup pattern
+
==== References ====
    "all.all.ctl" should also be interpreted as control messages.  If no
 
    "Control" header is present on such messages, the subject is used as
 
    the control message.  However, messages on newsgroups matching this
 
    pattern do not conform to this standard.
 
  
 +
This field lists the Message-ID's of any messages prompting the
 +
submission of this message.  It is required for all follow-up
 +
messages, and forbidden when a new subject is raised.
 +
Implementations should provide a follow-up command, which allows a
 +
user to post a follow-up message.  This command should generate a
 +
"Subject" line which is the same as the original message, except
 +
that if the original subject does not begin with "Re:" or "re:", the
 +
four characters "Re: " are inserted before the subject.  If there is
 +
no "References" line on the original header, the "References" line
 +
should contain the Message-ID of the original message (including the
 +
angle brackets).  If the original message does have a "References"
 +
line, the follow-up message should have a "References" line
 +
containing the text of the original "References" line, a blank, and
 +
the Message-ID of the original message.
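
The behaviour of a follow-up command, as described above, might be sketched like this.  The function name followup_headers is hypothetical and the headers are assumed to be held in a dictionary keyed by header name.  The prefix is taken to be "Re:" followed by one blank, which makes up the four characters.

```python
def followup_headers(original):
    """Build the "Subject" and "References" lines for a follow-up."""
    subject = original["Subject"]
    # Insert the prefix only if the subject does not already carry one.
    if not subject.startswith(("Re:", "re:")):
        subject = "Re: " + subject
    message_id = original["Message-ID"]  # angle brackets included
    previous = original.get("References")
    # Append to an existing "References" line, separated by a blank.
    references = previous + " " + message_id if previous else message_id
    return {"Subject": subject, "References": references}
```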
  
 +
The purpose of the "References" header is to allow messages to be
 +
grouped into conversations by the user interface program.  This
 +
allows conversations within a newsgroup to be kept together, and
 +
potentially users might shut off entire conversations without
 +
unsubscribing to a newsgroup.  User interfaces need not make use of
 +
this header, but all automatically generated follow-ups should
 +
generate the "References" line for the benefit of systems that do
 +
use it, and manually generated follow-ups (e.g., typed in well after
 +
the original message has been printed by the machine) should be
 +
encouraged to include them as well.
  
 +
It is permissible to not include the entire previous "References"
 +
line if it is too long.  An attempt should be made to include a
 +
reasonable number of backwards references.
  
 +
==== Control ====
  
 +
If a message contains a "Control" line, the message is a control
 +
message.  Control messages are used for communication among USENET
 +
host machines, not to be read by users.  Control messages are
 +
distributed by the same newsgroup mechanism as ordinary messages.
 +
The body of the "Control" header line is the message to the host.
  
+
For upward compatibility, messages that match the newsgroup pattern
 +
"all.all.ctl" should also be interpreted as control messages.  If no
 +
"Control" header is present on such messages, the subject is used as
 +
the control message.  However, messages on newsgroups matching this
 +
pattern do not conform to this standard.
  
+
Also for upward compatibility, if the first 4 characters of the
 +
"Subject:" line are "cmsg", the rest of the "Subject:" line should
 +
be interpreted as a control message.
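
The three ways of recognizing a control message described above can be combined in one routine.  This is a sketch under stated assumptions: the names control_command and _is_ctl_group are hypothetical, the headers are held in a dictionary, and the pattern "all.all.ctl" is assumed to mean a newsgroup name of at least three components whose last component is "ctl".  The order in which the two compatibility rules are tried is also an assumption.

```python
def _is_ctl_group(group):
    # Assumed reading of the pattern "all.all.ctl".
    parts = group.split(".")
    return len(parts) >= 3 and parts[-1] == "ctl"


def control_command(headers):
    """Return the control message text, or None for an ordinary message."""
    control = headers.get("Control")
    if control is not None:
        return control.strip()
    subject = headers.get("Subject", "")
    # Compatibility rule: a subject beginning with "cmsg".
    if subject.startswith("cmsg"):
        return subject[4:].strip()
    # Compatibility rule: posted to a newsgroup matching "all.all.ctl".
    groups = [g.strip() for g in headers.get("Newsgroups", "").split(",")]
    if any(_is_ctl_group(g) for g in groups):
        return subject.strip()
    return None
```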
  
 +
==== Distribution ====
  
    Also for upward compatibility, if the first 4 characters of the
+
This line is used to alter the distribution scope of the message.
    "Subject:" line are "cmsg", the rest of the "Subject:" line should
+
It is a comma separated list similar to the "Newsgroups" line.  User
    be interpreted as a control message.
+
subscriptions are still controlled by "Newsgroups", but the message
 +
is sent to all systems subscribing to the newsgroups on the
 +
"Distribution" line in addition to the "Newsgroups" line.  For the
 +
message to be transmitted, the receiving site must normally receive
 +
one of the specified newsgroups AND must receive one of the
 +
specified distributions.  Thus, a message concerning a car for sale
 +
in New Jersey might have headers including:
  
2.2.7.  Distribution
+
                Newsgroups: rec.auto,misc.forsale
 +
                Distribution: nj,ny
  
    This line is used to alter the distribution scope of the message.
+
so that it would only go to persons subscribing to rec.auto or
    It is a comma separated list similar to the "Newsgroups" line.  User
+
misc.forsale within New Jersey or New York.  The intent of this header
    subscriptions are still controlled by "Newsgroups", but the message
+
is to restrict the distribution of a newsgroup further, not to
    is sent to all systems subscribing to the newsgroups on the
+
increase it.  A local newsgroup, such as nj.crazy-eddie, will
    "Distribution" line in addition to the "Newsgroups" line.  For the
+
probably not be propagated by hosts outside New Jersey that do not
    message to be transmitted, the receiving site must normally receive
+
show such a newsgroup as valid.  A follow-up message should default
    one of the specified newsgroups AND must receive one of the
+
to the same "Distribution" line as the original message, but the
    specified distributions. Thus, a message concerning a car for sale
+
user can change it to a more limited one, or escalate the
    in New Jersey might have headers including:
+
  distribution if it was originally restricted and a more widely
 +
distributed reply is appropriate.
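
The transmission test described in this section (one of the newsgroups AND one of the distributions) can be sketched as below.  The name should_transmit is hypothetical, and a neighbor's subscription is simplified to two sets of exact names, whereas real systems match against patterns.

```python
def should_transmit(headers, site_groups, site_distributions):
    """Decide whether a message goes to a neighboring site."""
    groups = {g.strip() for g in headers["Newsgroups"].split(",")}
    # The site must receive at least one of the listed newsgroups.
    if not groups & site_groups:
        return False
    distribution = headers.get("Distribution")
    if distribution is None:
        return True  # no further restriction was requested
    wanted = {d.strip() for d in distribution.split(",")}
    # ... and at least one of the listed distributions.
    return bool(wanted & site_distributions)
```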
  
                  Newsgroups: rec.auto,misc.forsale
+
==== Organization ====
                  Distribution: nj,ny
 
  
    so that it would only go to persons subscribing to rec.auto or
+
  The text of this line is a short phrase describing the organization
    misc.forsale within New Jersey or New York. The intent of this header
+
to which the sender belongs, or to which the machine belongs.  The
    is to restrict the distribution of a newsgroup further, not to
+
  intent of this line is to help identify the person posting the
    increase it.  A local newsgroup, such as nj.crazy-eddie, will
+
message, since host names are often cryptic enough to make it hard
    probably not be propagated by hosts outside New Jersey that do not
+
to recognize the organization by the electronic address.
    show such a newsgroup as valid. A follow-up message should default
 
    to the same "Distribution" line as the original message, but the
 
    user can change it to a more limited one, or escalate the
 
    distribution if it was originally restricted and a more widely
 
    distributed reply is appropriate.
 
  
2.2.8.  Organization
+
==== Keywords ====
  
    The text of this line is a short phrase describing the organization
+
  A few well-selected keywords identifying the message should be on
    to which the sender belongs, or to which the machine belongs.  The
+
this line.  This is used as an aid in determining if this message is
    intent of this line is to help identify the person posting the
+
interesting to the reader.
    message, since host names are often cryptic enough to make it hard
 
    to recognize the organization by the electronic address.
 
 
 
2.2.9. Keywords
 
 
 
    A few well-selected keywords identifying the message should be on
 
    this line.  This is used as an aid in determining if this message is
 
    interesting to the reader.
 
  
 
2.2.10.  Summary
 
2.2.10.  Summary
  
    This line should contain a brief summary of the message.  It is
+
This line should contain a brief summary of the message.  It is
    usually used as part of a follow-up to another message.  Again, it
+
usually used as part of a follow-up to another message.  Again, it
  
 
+
is very useful to the reader in determining whether to read the
 
+
message.
 
 
 
 
 
 
 
 
    is very useful to the reader in determining whether to read the
 
    message.
 
  
 
2.2.11.  Approved
 
2.2.11.  Approved
  
    This line is required for any message posted to a moderated
+
This line is required for any message posted to a moderated
    newsgroup.  It should be added by the moderator and consist of his
+
newsgroup.  It should be added by the moderator and consist of his
    mail address.  It is also required with certain control messages.
+
mail address.  It is also required with certain control messages.
  
 
2.2.12.  Lines
 
2.2.12.  Lines
  
    This contains a count of the number of lines in the body of the
+
This contains a count of the number of lines in the body of the
    message.
+
message.
  
 
2.2.13.  Xref
 
2.2.13.  Xref
  
    This line contains the name of the host (with domains omitted) and a
+
This line contains the name of the host (with domains omitted) and a
    white space separated list of colon-separated pairs of newsgroup
+
white space separated list of colon-separated pairs of newsgroup
    names and message numbers.  These are the newsgroups listed in the
+
names and message numbers.  These are the newsgroups listed in the
    "Newsgroups" line and the corresponding message numbers from the
+
"Newsgroups" line and the corresponding message numbers from the
    spool directory.
+
spool directory.
  
    This is only of value to the local system, so it should not be
+
This is only of value to the local system, so it should not be
    transmitted.  For example, in:
+
transmitted.  For example, in:
  
              Path: seismo!lll-crg!lll-lcc!pyramid!decwrl!reid
+
            Path: seismo!lll-crg!lll-lcc!pyramid!decwrl!reid
              From: [email protected] (Brian Reid)
+
            From: [email protected] (Brian Reid)
              Newsgroups: news.lists,news.groups
+
            Newsgroups: news.lists,news.groups
              Subject: USENET READERSHIP SUMMARY REPORT FOR SEP 86
+
            Subject: USENET READERSHIP SUMMARY REPORT FOR SEP 86
              Message-ID: <[email protected]>
+
            Message-ID: <[email protected]>
              Date: 1 Oct 86 11:26:15 GMT
+
            Date: 1 Oct 86 11:26:15 GMT
              Organization: DEC Western Research Laboratory
+
            Organization: DEC Western Research Laboratory
              Lines: 441
+
            Lines: 441
              Approved: [email protected]
+
            Approved: [email protected]
              Xref: seismo news.lists:461 news.groups:6378
+
            Xref: seismo news.lists:461 news.groups:6378
  
    the "Xref" line shows that the message is message number 461 in the
+
the "Xref" line shows that the message is message number 461 in the
    newsgroup news.lists, and message number 6378 in the newsgroup
+
newsgroup news.lists, and message number 6378 in the newsgroup
    news.groups, on host seismo.  This information may be used by
+
news.groups, on host seismo.  This information may be used by
    certain user interfaces.
+
certain user interfaces.
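
A user interface that wants to use this information could split the line as sketched here.  The function name parse_xref is hypothetical; the input is the text after "Xref:".

```python
def parse_xref(value):
    """Split an "Xref" value into the host and a group-to-number mapping."""
    host, *pairs = value.split()
    numbers = {}
    for pair in pairs:
        # Each pair is "newsgroup:number"; split on the last colon.
        group, _, number = pair.rpartition(":")
        numbers[group] = int(number)
    return host, numbers
```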
  
3.  Control Messages
+
== Control Messages ==
  
    This section lists the control messages currently defined.  The body
+
This section lists the control messages currently defined.  The body
    of the "Control" header line is the control message.  Messages are a
+
of the "Control" header line is the control message.  Messages are a
    sequence of zero or more words, separated by white space (blanks or
+
sequence of zero or more words, separated by white space (blanks or
    tabs).  The first word is the name of the control message, remaining
+
tabs).  The first word is the name of the control message; remaining
    words are parameters to the message.  The remainder of the header
+
words are parameters to the message.  The remainder of the header
  
 +
and the body of the message are also potential parameters; for
 +
example, the "From" line might suggest an address to which a
 +
response is to be mailed.
  
 +
Implementors and administrators may choose to allow control messages
 +
to be carried out automatically, or to queue them for manual
 +
processing.  However, manually processed messages should be dealt
 +
with promptly.
  
+
Failed control messages should NOT be mailed to the originator of
 +
the message, but to the local "usenet" account.
  
+
=== Cancel ===
  
 +
                  cancel <Message-ID>
  
    and the body of the message are also potential parameters; for
+
If a message with the given Message-ID is present on the local
    example, the "From" line might suggest an address to which a
+
system, the message is cancelled.  This mechanism allows a user to
    response is to be mailed.
+
cancel a message after the message has been distributed over the
 +
network.
  
    Implementors and administrators may choose to allow control messages
+
If the system is unable to cancel the message as requested, it
    to be carried out automatically, or to queue them for manual
+
  should not forward the cancellation request to its neighbor systems.
    processing. However, manually processed messages should be dealt
 
    with promptly.
 
  
    Failed control messages should NOT be mailed to the originator of
+
Only the author of the message or the local news administrator is
    the message, but to the local "usenet" account.
+
allowed to send this message.  The verified sender of a message is
 +
the "Sender" line, or if no "Sender" line is present, the "From"
 +
line.  The verified sender of the cancel message must be the same as
 +
either the "Sender" or "From" field of the original message.  A
 +
verified sender in the cancel message is allowed to match an
 +
unverified "From" in the original message.
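
The authorization rule for cancel can be sketched as follows.  The names verified_sender and may_cancel are hypothetical, headers are held in dictionaries, and the is_local_admin flag stands in for whatever local mechanism identifies the news administrator, which this standard does not define.  Addresses are compared exactly after any parenthesized full name is dropped.

```python
def _address(value):
    # Drop an optional full name in parentheses.
    return value.split("(", 1)[0].strip()


def verified_sender(headers):
    """The "Sender" line if present, otherwise the "From" line."""
    return headers.get("Sender") or headers["From"]


def may_cancel(cancel_headers, original_headers, is_local_admin=False):
    """True if the cancel request may be honored on this system."""
    if is_local_admin:
        return True
    who = _address(verified_sender(cancel_headers))
    # Either "Sender" or "From" of the original message may match,
    # including an unverified "From".
    allowed = {
        _address(original_headers[name])
        for name in ("Sender", "From")
        if name in original_headers
    }
    return who in allowed
```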
  
3.1.  Cancel
+
=== Ihave/Sendme ===
  
                    cancel <Message-ID>
+
                ihave <Message-ID list> [<remotesys>]
 +
                sendme <Message-ID list> [<remotesys>]
  
 +
This message is part of the ihave/sendme protocol, which allows one
 +
host (say A) to tell another host (B) that a particular message has
 +
been received on A.  Suppose that host A receives message
 +
"<1234@ucbvax.Berkeley.edu>", and wishes to transmit the message to
 +
host B.
  
    If a message with the given Message-ID is present on the local
+
A sends the control message "ihave <1234@ucbvax.Berkeley.edu> A" to
    system, the message is cancelled.  This mechanism allows a user to
+
host B (by posting it to newsgroup to.B).  B responds with the
    cancel a message after the message has been distributed over the
+
control message "sendme <1234@ucbvax.Berkeley.edu> B" (on newsgroup
    network.
+
  to.A), if it has not already received the message. Upon receiving
  
    If the system is unable to cancel the message as requested, it
+
the sendme message, A sends the message to B.
    should not forward the cancellation request to its neighbor systems.
 
  
    Only the author of the message or the local news administrator is
+
This protocol can be used to cut down on redundant traffic between
    allowed to send this message.  The verified sender of a message is
+
hosts.  It is optional and should be used only if the particular
    the "Sender" line, or if no "Sender" line is present, the "From"
+
situation makes it worthwhile.  Frequently, the outcome is that,
    line. The verified sender of the cancel message must be the same as
+
since most original messages are short, and since there is a high
    either the "Sender" or "From" field of the original message. A
+
  overhead to start sending a new message with UUCP, it costs as much
    verified sender in the cancel message is allowed to match an
+
  to send the ihave as it would cost to send the message itself.
    unverified "From" in the original message.
 
  
3.2.  Ihave/Sendme
+
One possible solution to this overhead problem is to batch requests.
 +
Several Message-ID's may be announced or requested in one message.
 +
If no Message-ID's are listed in the control message, the body of
 +
  the message should be scanned for Message-ID's, one per line.
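
The receiving side of the exchange, including the batched form, might look like the sketch below.  The names announced_ids and sendme_reply are hypothetical, and the set of Message-ID's already held locally is passed in as a plain set for illustration.

```python
import re

# A Message-ID: angle brackets around text with no white space.
_MESSAGE_ID = re.compile(r"<[^<>\s]+>")


def announced_ids(control, body):
    """Message-ID's named in an ihave or sendme control message."""
    ids = _MESSAGE_ID.findall(control)
    if ids:
        return ids
    # None on the control line: scan the body, one Message-ID per line.
    lines = (line.strip() for line in body.splitlines())
    return [line for line in lines if _MESSAGE_ID.fullmatch(line)]


def sendme_reply(control, body, have, local_host):
    """Build the sendme text for messages not yet received, or None."""
    wanted = [m for m in announced_ids(control, body) if m not in have]
    if not wanted:
        return None
    return "sendme " + " ".join(wanted) + " " + local_host
```

With the example from the text, host B would answer "ihave <1234@ucbvax.Berkeley.edu> A" with the value of sendme_reply(..., local_host="B") if the message is not already present.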
  
                  ihave <Message-ID list> [<remotesys>]
+
=== Newgroup ===
                  sendme <Message-ID list> [<remotesys>]
 
  
    This message is part of the ihave/sendme protocol, which allows one
+
                  newgroup <groupname> [moderated]
    host (say A) to tell another host (B) that a particular message has
 
    been received on A.  Suppose that host A receives message

    "<1234@ucbvax.Berkeley.edu>", and wishes to transmit the message to
 
    host B.
 
  
    A sends the control message "ihave <1234@ucbvax.Berkeley.edu> A" to
+
This control message creates a new newsgroup with the given name.
    host B (by posting it to newsgroup to.B).  B responds with the
+
Since no messages may be posted or forwarded until a newsgroup is
    control message "sendme <1234@ucbvax.Berkeley.edu> B" (on newsgroup
+
created, this message is required before a newsgroup can be used.
    to.A), if it has not already received the message. Upon receiving
+
  The body of the message is expected to be a short paragraph
 +
describing the intended use of the newsgroup.
  
 +
If the second argument is present and it is the keyword moderated,
 +
the group should be created moderated instead of the default of
 +
unmoderated.  The newgroup message should be ignored unless there is
 +
an "Approved" line in the same message header.
  
 +
=== Rmgroup ===
  
+
                        rmgroup <groupname>
  
+
This message removes a newsgroup with the given name.  Since the
 +
newsgroup is removed from every host on the network, this command
 +
should be used carefully by a responsible administrator.  The
 +
rmgroup message should be ignored unless there is an "Approved:"
 +
line in the same message header.
  
 +
=== Sendsys ===
 +
                        sendsys (no arguments)
  
    the sendme message, A sends the message to B.
+
The sys file, listing all neighbors and the newsgroups to be sent to
 +
each neighbor, will be mailed to the author of the control message
 +
("Reply-To", if present, otherwise "From").  This information is
 +
considered public information, and it is a requirement of membership
 +
in USENET that this information be provided on request, either
 +
automatically in response to this control message, or manually, by
 +
mailing the requested information to the author of the message.
 +
This information is used to keep the map of USENET up to date, and
 +
to determine where netnews is sent.
  
    This protocol can be used to cut down on redundant traffic between
+
The format of the file mailed back to the author should be the same
    hosts.  It is optional and should be used only if the particular
+
as that of the sys file.  This format has one line per neighboring
    situation makes it worthwhile.  Frequently, the outcome is that,
+
host (plus one line for the local host), containing four colon
    since most original messages are short, and since there is a high
+
separated fields.  The first field has the host name of the
    overhead to start sending a new message with UUCP, it costs as much
+
neighbor, the second field has a newsgroup pattern describing the
    to send the ihave as it would cost to send the message itself.
+
newsgroups sent to the neighbor.  The third and fourth fields are
 +
not defined by this standard.  The sys file is not the same as the
 +
UUCP L.sys file. A sample response is:
  
    One possible solution to this overhead problem is to batch requests.
+
  From: cbosgd!mark  (Mark Horton)
    Several Message-ID's may be announced or requested in one message.
+
  Date: Sun, 27 Mar 83 20:39:37 -0500
    If no Message-ID's are listed in the control message, the body of
+
  Subject: response to your sendsys request
    the message should be scanned for Message-ID's, one per line.
+
  
3.3. Newgroup
+
  Responding-System: cbosgd.ATT.COM
 +
  cbosgd:osg,cb,btl,bell,world,comp,sci,rec,talk,misc,news,soc,to,
 +
        test
 +
  ucbvax:world,comp,to.ucbvax:L:
 +
  cbosg:world,comp,bell,btl,cb,osg,to.cbosg:F:/usr/spool/outnews
 +
        /cbosg
 +
  cbosgb:osg,to.cbosgb:F:/usr/spool/outnews/cbosgb
 +
  sescent:world,comp,bell,btl,cb,to.sescent:F:/usr/spool/outnews
 +
        /sescent
 +
  npois:world,comp,bell,btl,ug,to.npois:F:/usr/spool/outnews/npois
 +
  mhuxi:world,comp,bell,btl,ug,to.mhuxi:F:/usr/spool/outnews/mhuxi
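
A program that collects sendsys responses could read each line of the sys file as sketched below.  The name parse_sys_line and the dictionary keys are hypothetical.  The sketch expects one complete line per call, so lines that were folded for display (as in the sample above) would need to be joined first.  The third and fourth fields are kept as opaque strings, since this standard does not define them.

```python
def parse_sys_line(line):
    """Split one sys file line into its four colon-separated fields."""
    fields = line.strip().split(":", 3)
    # Pad short lines so that four fields are always available.
    fields += [""] * (4 - len(fields))
    host, patterns, third, fourth = fields
    return {
        "host": host,
        "patterns": [p for p in patterns.split(",") if p],
        "field3": third,
        "field4": fourth,
    }
```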
  
                      newgroup <groupname> [moderated]
+
=== Version ===
  
    This control message creates a new newsgroup with the given name.
+
                        version (no arguments)
    Since no messages may be posted or forwarded until a newsgroup is
 
    created, this message is required before a newsgroup can be used.
 
    The body of the message is expected to be a short paragraph
 
    describing the intended use of the newsgroup.
 
  
    If the second argument is present and it is the keyword moderated,
+
The name and version of the software running on the local system are
    the group should be created moderated instead of the default of
+
to be mailed back to the author of the message ("Reply-to" if
    unmoderated. The newgroup message should be ignored unless there is
+
  present, otherwise "From").
    an "Approved" line in the same message header.
 
  
3.4.  Rmgroup
+
=== Checkgroups ===
  
                            rmgroup <groupname>
+
The message body is a list of "official" newsgroups and their
 +
descriptions, one group per line.  They are compared against the list
 +
of active newsgroups on the current host.  The names of any obsolete
 +
or new newsgroups are mailed to the user "usenet" and descriptions
 +
of the new newsgroups are added to the help file used when posting
 +
news.
  
    This message removes a newsgroup with the given name.  Since the
+
== Transmission Methods ==
    newsgroup is removed from every host on the network, this command
 
    should be used carefully by a responsible administrator.  The
 
    rmgroup message should be ignored unless there is an "Approved:"
 
    line in the same message header.
 
  
 +
USENET is not a physical network, but rather a logical network
 +
resting on top of several existing physical networks.  These
 +
networks include, but are not limited to, UUCP, the Internet, an
 +
Ethernet, the BLICN network, an NSC Hyperchannel, and a BERKNET.
 +
What is important is that two neighboring systems on USENET have
 +
some method to get a new message, in the format listed here, from
 +
one system to the other, and once on the receiving system, processed
 +
by the netnews software on that system.  (On UNIX systems, this
 +
usually means the rnews program being run with the message on the
 +
standard input. <1>)
  
 +
It is not a requirement that USENET hosts have mail systems capable
 +
of understanding the Internet mail syntax, but it is strongly
 +
recommended.  Since "From", "Reply-To", and "Sender" lines use the
 +
Internet syntax, replies will be difficult or impossible without an
 +
Internet mailer.  A host without an Internet mailer can attempt to
 +
use the "Path" header line for replies, but this field is not
 +
guaranteed to be a working path for replies.  In any event, any host
 +
generating or forwarding news messages must have an Internet address
 +
that allows them to receive mail from hosts with Internet mailers,
 +
and they must include their Internet address on their From line.
  
 +
=== Remote Execution ===
  
 +
Some networks permit direct remote command execution.  On these
 +
networks, news may be forwarded by spooling the rnews command with
 +
the message on the standard input.  For example, if the remote
 +
system is called remote, news would be sent over a UUCP link
 +
with the command:
  
 +
                          uux - remote!rnews
  
 +
and on a Berknet:
  
 +
                          net -mremote rnews
  
 +
It is important that the message be sent via a reliable mechanism,
 +
normally involving the possibility of spooling, rather than direct
 +
real-time remote execution.  This is because, if the remote system
 +
is down, a direct execution command will fail, and the message will
 +
never be delivered.  If the message is spooled, it will eventually
 +
be delivered when both systems are up.
  
 +
=== Transfer by Mail ===
  
 +
On some systems, direct remote spooled execution is not possible.
 +
However, most systems support electronic mail, and a news message
 +
can be sent as mail.  One approach is to send a mail message which
 +
is identical to the news message: the mail headers are the news
 +
headers, and the mail body is the news body.  By convention, this
 +
mail is sent to the user newsmail on the remote machine.
  
 +
One problem with this method is that it may not be possible to
 +
convince the mail system that the "From" line of the message is
 +
valid, since the mail message was generated by a program on a
 +
system different from the source of the news message.  Another
 +
problem is that error messages caused by the mail transmission
 +
would be sent to the originator of the news message, who has no
 +
control over news transmission between two cooperating hosts
 +
and does not know whom to contact.  Transmission error messages
 +
should be directed to a responsible contact person on the
 +
sending machine.
  
 +
A solution to this problem is to encapsulate the news message into a
 +
mail message, such that the entire message (headers and body) is
 +
part of the body of the mail message.  The convention here is that
 +
such mail is sent to user rnews on the remote system.  A mail
 +
message body is generated by prepending the letter N to each line of
 +
the news message, and then attaching whatever mail headers are
 +
convenient to generate.  The N's are attached to prevent any special
 +
lines in the news message from interfering with mail transmission,
 +
and to prevent any extra lines inserted by the mailer (headers,
 +
blank lines, etc.) from becoming part of the news message.  A
 +
program on the receiving machine receives mail to rnews, extracting
 +
the message itself and invoking the rnews program.  An example in
 +
this format might look like this:
  
+
            Date: Mon, 3 Jan 83 08:33:47 MST
 +
            From: [email protected]
 +
            Subject: network news message
 +
            To: [email protected]
  
+
             NPath: cbosgd!mhuxj!harpo!utah-cs!sask!derek
 +
            NFrom: [email protected] (Derek Andrew)
 +
            NNewsgroups: misc.test
 +
            NSubject: necessary test
 +
            NMessage-ID: <[email protected]>
 +
            NDate: Mon, 3 Jan 83 00:59:15 MST
 +
            N
 +
            NThis really is a test.  If anyone out there more than 6
 +
            Nhops away would kindly confirm this note I would
 +
            Nappreciate it.  We suspect that our news postings
 +
            Nare not getting out into the world.
 +
            N
  
 +
Using mail solves the spooling problem, since mail must always be
 +
spooled if the destination host is down.  However, it adds more
 +
overhead to the transmission process (to encapsulate and extract the
 +
message) and makes it harder for software to give different
 +
priorities to news and mail.
  
3.5.  Sendsys
+
=== Batching ===
                          sendsys (no arguments)
 
  
    The sys file, listing all neighbors and the newsgroups to be sent to
+
Since news messages are usually short, and since a large number of
    each neighbor, will be mailed to the author of the control message
+
  messages are often sent between two hosts in a day, it may make
    ("Reply-To", if present, otherwise "From"). This information is
+
sense to batch news messages.  Several messages can be combined into
    considered public information, and it is a requirement of membership
+
one large message, using conventions agreed upon in advance by the
    in USENET that this information be provided on request, either
+
two hosts. One such batching scheme is described here; its use is
    automatically in response to this control message, or manually, by
+
highly recommended.
    mailing the requested information to the author of the message.
 
    This information is used to keep the map of USENET up to date, and
 
    to determine where netnews is sent.
 
  
    The format of the file mailed back to the author should be the same
+
  News messages are combined into a script, separated by a header of
    as that of the sys file. This format has one line per neighboring
+
  the form:
    host (plus one line for the local host), containing four colon
 
    separated fields.  The first field has the host name of the
 
    neighbor, the second field has a newsgroup pattern describing the
 
    newsgroups sent to the neighbor. The third and fourth fields are
 
    not defined by this standard.  The sys file is not the same as the
 
    UUCP L.sys file.  A sample response is:
 
  
      From: cbosgd!mark  (Mark Horton)
+
                #! rnews 1234
      Date: Sun, 27 Mar 83 20:39:37 -0500
 
      Subject: response to your sendsys request
 
      To: [email protected]
 
  
      Responding-System: cbosgd.ATT.COM
+
where 1234 is the length of the message in bytes. Each such line is
      cbosgd:osg,cb,btl,bell,world,comp,sci,rec,talk,misc,news,soc,to,
+
followed by a message containing the given number of bytes. (The
            test
+
newline at the end of each line of the message is counted as one
      ucbvax:world,comp,to.ucbvax:L:
+
byte, for purposes of this count, even if it is stored as <CARRIAGE
      cbosg:world,comp,bell,btl,cb,osg,to.cbosg:F:/usr/spool/outnews
+
RETURN><LINE FEED>.)  For example, a batch of messages might look
            /cbosg
+
like this:
      cbosgb:osg,to.cbosgb:F:/usr/spool/outnews/cbosgb
 
      sescent:world,comp,bell,btl,cb,to.sescent:F:/usr/spool/outnews
 
            /sescent
 
      npois:world,comp,bell,btl,ug,to.npois:F:/usr/spool/outnews/npois
 
      mhuxi:world,comp,bell,btl,ug,to.mhuxi:F:/usr/spool/outnews/mhuxi
 
  
3.6. Version
+
            #! rnews 239
 +
            From: jerry@eagle.ATT.COM (Jerry Schwarz)
 +
            Path: cbosgd!mhuxj!mhuxt!eagle!jerry
 +
            Newsgroups: news.announce
 +
            Subject: Usenet Etiquette -- Please Read
 +
            Message-ID: <[email protected]>
 +
            Date: Fri, 19 Nov 82 16:14:55 EST
 +
            Approved: [email protected]
  
                          version (no arguments)
+
            Here is an important message about USENET Etiquette.
 +
            #! rnews 234
 +
            From: jerry@eagle.ATT.COM (Jerry Schwarz)
 +
            Path: cbosgd!mhuxj!mhuxt!eagle!jerry
 +
            Newsgroups: news.announce
 +
            Subject: Notes on Etiquette message
 +
            Message-ID: <[email protected]>
 +
            Date: Fri, 19 Nov 82 17:24:12 EST
 +
            Approved: [email protected]
  
    The name and version of the software running on the local system is
+
            There was something I forgot to mention in the last
    to be mailed back to the author of the message ("Reply-To" if
+
            message.
    present, otherwise "From").
 
  
3.7. Checkgroups
+
Batched news is recognized because the first character in the
 +
message is #. The message is then passed to the unbatcher for
 +
interpretation.
  
 +
The second argument (in this example rnews) determines which
 +
batching scheme is being used.  Cooperating hosts may use whatever
 +
scheme is appropriate for them.
  
 +
== The News Propagation Algorithm ==
  
+
This section describes the overall scheme of USENET and the
 +
algorithm followed by hosts in propagating news to the entire
 +
logical network.  Since all hosts are affected by incorrectly
 +
formatted messages and by propagation errors, it is important
 +
for the method to be standardized.
  
+
USENET is a directed graph.  Each node in the graph is a host
 +
computer, and each arc in the graph is a transmission path from
 +
one host to another host.  Each arc is labeled with a newsgroup
 +
pattern, specifying which newsgroup classes are forwarded along
 +
that link.  Most arcs are bidirectional, that is, if host A
 +
sends a class of newsgroups to host B, then host B usually sends
 +
the same class of newsgroups to host A.  This bidirectionality
 +
is not, however, required.
  
 +
USENET is made up of many subnetworks.  Each subnet has a name, such
  
    The message body is a list of "official" newsgroups and their
+
as comp or btl.  Each subnet is a connected graph, that is, a path
    description, one group per line.  They are compared against the list
+
exists from every node to every other node in the subnet.  In
    of active newsgroups on the current host.  The names of any obsolete
+
addition, the entire graph is (theoretically) connected.  (In
    or new newsgroups are mailed to the user "usenet" and descriptions
+
practice, some political considerations have caused some hosts to be
    of the new newsgroups are added to the help file used when posting
+
unable to post messages reaching the rest of the network.)
    news.
 
  
4.  Transmission Methods
+
A message is posted on one machine to a list of newsgroups. That
 +
machine accepts it locally, then forwards it to all its neighbors
 +
that are interested in at least one of the newsgroups of the
 +
message.  (Site A deems host B to be "interested" in a newsgroup if
 +
the newsgroup matches the pattern on the arc from A to B.  This
 +
pattern is stored in a file on the A machine.) The hosts receiving
 +
the incoming message examine it to make sure they really want the
 +
message, accept it locally, and then in turn forward the message to
 +
all their interested neighbors.  This process continues until the
 +
entire network has seen the message.
  
    USENET is not a physical network, but rather a logical network
+
An important part of the algorithm is the prevention of loops.  The
    resting on top of several existing physical networks. These
+
above process would cause a message to loop along a cycle forever.
    networks include, but are not limited to, UUCP, the Internet, an
+
  In particular, when host A sends a message to host B, host B will
    Ethernet, the BLICN network, an NSC Hyperchannel, and a BERKNET.
+
send it back to host A, which will send it to host B, and so on.
    What is important is that two neighboring systems on USENET have
+
One solution to this is the history mechanism.  Each host keeps
    some method to get a new message, in the format listed here, from
+
track of all messages it has seen (by their Message-ID) and
    one system to the other, and once on the receiving system, processed
+
whenever a message comes in that it has already seen, the incoming
    by the netnews software on that system. (On UNIX systems, this
+
message is discarded immediately.  This solution is sufficient to
    usually means the rnews program being run with the message on the
+
  prevent loops, but additional optimizations can be made to avoid
    standard input. <1>)
+
sending messages to hosts that will simply throw them away.
  
    It is not a requirement that USENET hosts have mail systems capable
+
  One optimization is that a message should never be sent to a machine
    of understanding the Internet mail syntax, but it is strongly
+
listed in the "Path" line of the header.  When a machine name is
    recommended.  Since "From", "Reply-To", and "Sender" lines use the
+
in the "Path" line, the message is known to have passed through the
    Internet syntax, replies will be difficult or impossible without an
+
machine.  Another optimization is that, if the message originated
    Internet mailer.  A host without an Internet mailer can attempt to
+
on host A, then host A has already seen the message.  Thus, if a
    use the "Path" header line for replies, but this field is not
+
message is posted to newsgroup misc.misc, it will match the pattern
    guaranteed to be a working path for replies.  In any event, any host
+
misc.all (where all is a metasymbol that matches any string), and
    generating or forwarding news messages must have an Internet address
+
will be forwarded to all hosts that subscribe to misc.all (as
    that allows them to receive mail from hosts with Internet mailers,
+
determined by what their neighbors send them).  These hosts make up
    and they must include their Internet address on their From line.
+
the misc subnetwork.  A message posted to btl.general will reach all
 
+
hosts receiving btl.all, but will not reach hosts that do not get
4.1.  Remote Execution
+
btl.all.  In effect, the message reaches the btl subnetwork.  A
 
+
message posted to newsgroups misc.misc,btl.general will reach all
    Some networks permit direct remote command execution.  On these
+
hosts subscribing to either of the two classes.
    networks, news may be forwarded by spooling the rnews command with
 
    the message on the standard input.  For example, if the remote
 
    system is called remote, news would be sent over a UUCP link
 
    with the command:
 
 
 
                              uux - remote!rnews
 
 
 
    and on a Berknet:
 
 
 
                              net -mremote rnews
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
    It is important that the message be sent via a reliable mechanism,
 
    normally involving the possibility of spooling, rather than direct
 
    real-time remote execution.  This is because, if the remote system
 
    is down, a direct execution command will fail, and the message will
 
    never be delivered.  If the message is spooled, it will eventually
 
    be delivered when both systems are up.
 
 
 
4.2.  Transfer by Mail
 
 
 
    On some systems, direct remote spooled execution is not possible.
 
    However, most systems support electronic mail, and a news message
 
    can be sent as mail.  One approach is to send a mail message which
 
    is identical to the news message: the mail headers are the news
 
    headers, and the mail body is the news body.  By convention, this
 
    mail is sent to the user newsmail on the remote machine.
 
 
 
    One problem with this method is that it may not be possible to
 
    convince the mail system that the "From" line of the message is
 
    valid, since the mail message was generated by a program on a
 
    system different from the source of the news message.  Another
 
    problem is that error messages caused by the mail transmission
 
    would be sent to the originator of the news message, who has no
 
    control over news transmission between two cooperating hosts
 
    and does not know whom to contact.  Transmission error messages
 
    should be directed to a responsible contact person on the
 
    sending machine.
 
 
 
    A solution to this problem is to encapsulate the news message into a
 
    mail message, such that the entire message (headers and body) is
 
    part of the body of the mail message.  The convention here is that
 
    such mail is sent to user rnews on the remote system.  A mail
 
    message body is generated by prepending the letter N to each line of
 
    the news message, and then attaching whatever mail headers are
 
    convenient to generate.  The N's are attached to prevent any special
 
    lines in the news message from interfering with mail transmission,
 
    and to prevent any extra lines inserted by the mailer (headers,
 
    blank lines, etc.) from becoming part of the news message.  A
 
    program on the receiving machine receives mail to rnews, extracting
 
    the message itself and invoking the rnews program.  An example in
 
    this format might look like this:
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
                Date: Mon, 3 Jan 83 08:33:47 MST
 
                From: [email protected]
 
                Subject: network news message
 
                To: [email protected]
 
 
 
                NPath: cbosgd!mhuxj!harpo!utah-cs!sask!derek
 
                NFrom: [email protected] (Derek Andrew)
 
                NNewsgroups: misc.test
 
                NSubject: necessary test
 
                NMessage-ID: <[email protected]>
 
                NDate: Mon, 3 Jan 83 00:59:15 MST
 
                N
 
                NThis really is a test.  If anyone out there more than 6
 
                Nhops away would kindly confirm this note I would
 
                Nappreciate it.  We suspect that our news postings
 
                Nare not getting out into the world.
 
                N
 
 
 
    Using mail solves the spooling problem, since mail must always be
 
    spooled if the destination host is down.  However, it adds more
 
    overhead to the transmission process (to encapsulate and extract the
 
    message) and makes it harder for software to give different
 
    priorities to news and mail.
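
    The encapsulation described above can be sketched as follows.  This
    is an illustrative sketch only, not the code used by any news
    implementation; the function names encapsulate and extract are
    hypothetical, and the sketch assumes that extract is given only the
    body of the mail message (mail headers already removed).

```python
def encapsulate(news_message):
    """Prefix every line of a news message with the letter N."""
    lines = news_message.splitlines(keepends=True)
    return "".join("N" + line for line in lines)


def extract(mail_body):
    """Recover the news message from an encapsulated mail body.

    Lines that do not begin with N are assumed to have been inserted
    by the mailer and are dropped.
    """
    kept = []
    for line in mail_body.splitlines(keepends=True):
        if line.startswith("N"):
            kept.append(line[1:])  # strip the leading N
    return "".join(kept)
```

    The result of extract would then be handed to rnews on its
    standard input.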
 
 
 
4.3.  Batching
 
 
 
    Since news messages are usually short, and since a large number of
 
    messages are often sent between two hosts in a day, it may make
 
    sense to batch news messages.  Several messages can be combined into
 
    one large message, using conventions agreed upon in advance by the
 
    two hosts.  One such batching scheme is described here; its use is
 
    highly recommended.
 
 
 
    News messages are combined into a script, separated by a header of
 
    the form:
 
 
 
 
 
                  #! rnews 1234
 
 
 
    where 1234 is the length of the message in bytes. Each such line is
 
    followed by a message containing the given number of bytes.  (The
 
    newline at the end of each line of the message is counted as one
 
    byte, for purposes of this count, even if it is stored as <CARRIAGE
 
    RETURN><LINE FEED>.)  For example, a batch of messages might look
 
    like this:
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
                #! rnews 239
 
                From: jerry@eagle.ATT.COM (Jerry Schwarz)
 
                Path: cbosgd!mhuxj!mhuxt!eagle!jerry
 
                Newsgroups: news.announce
 
                Subject: Usenet Etiquette -- Please Read
 
                Message-ID: <[email protected]>
 
                Date: Fri, 19 Nov 82 16:14:55 EST
 
                Approved: [email protected]
 
 
 
                Here is an important message about USENET Etiquette.
 
                #! rnews 234
 
                From: jerry@eagle.ATT.COM (Jerry Schwarz)
 
                Path: cbosgd!mhuxj!mhuxt!eagle!jerry
 
                Newsgroups: news.announce
 
                Subject: Notes on Etiquette message
 
                Message-ID: <[email protected]>
 
                Date: Fri, 19 Nov 82 17:24:12 EST
 
                Approved: [email protected]
 
 
 
                There was something I forgot to mention in the last
 
                message.
 
 
 
    Batched news is recognized because the first character in the
 
    message is #.  The message is then passed to the unbatcher for
 
    interpretation.
 
 
 
    The second argument (in this example rnews) determines which
 
    batching scheme is being used.  Cooperating hosts may use whatever
 
    scheme is appropriate for them.
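
    A minimal sketch of the "#! rnews" batching scheme follows.  It is
    not the unbatcher shipped with any news software; the function
    names make_batch and unbatch are hypothetical, and the sketch
    assumes each newline is stored as a single byte, so that the stored
    length equals the count defined above.

```python
def make_batch(messages):
    """Combine messages (bytes) into one batch with '#! rnews N' headers."""
    return b"".join(b"#! rnews %d\n" % len(m) + m for m in messages)


def unbatch(batch):
    """Split a batch (bytes) back into its individual messages."""
    messages = []
    pos = 0
    while pos < len(batch):
        end = batch.index(b"\n", pos)        # end of the separator line
        fields = batch[pos:end].split()
        if len(fields) != 3 or fields[:2] != [b"#!", b"rnews"]:
            raise ValueError("bad batch header: %r" % batch[pos:end])
        length = int(fields[2])              # message length in bytes
        start = end + 1
        if start + length > len(batch):
            raise ValueError("truncated message in batch")
        messages.append(batch[start:start + length])
        pos = start + length
    return messages
```

    Reading by byte count rather than scanning for the next "#!" line
    means that a message body containing such a line does not confuse
    the unbatcher.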
 
 
 
5.  The News Propagation Algorithm
 
 
 
    This section describes the overall scheme of USENET and the
 
    algorithm followed by hosts in propagating news to the entire
 
    logical network.  Since all hosts are affected by incorrectly
 
    formatted messages and by propagation errors, it is important
 
    for the method to be standardized.
 
 
 
    USENET is a directed graph.  Each node in the graph is a host
 
    computer, and each arc in the graph is a transmission path from
 
    one host to another host.  Each arc is labeled with a newsgroup
 
    pattern, specifying which newsgroup classes are forwarded along
 
    that link.  Most arcs are bidirectional, that is, if host A
 
    sends a class of newsgroups to host B, then host B usually sends
 
    the same class of newsgroups to host A.  This bidirectionality
 
    is not, however, required.
 
 
 
    USENET is made up of many subnetworks.  Each subnet has a name, such
 
 
 
 
 
 
 
 
 
 
 
 
    as comp or btl.  Each subnet is a connected graph, that is, a path
 
    exists from every node to every other node in the subnet.  In
 
    addition, the entire graph is (theoretically) connected.  (In
 
    practice, some political considerations have caused some hosts to be
 
    unable to post messages reaching the rest of the network.)
 
 
 
    A message is posted on one machine to a list of newsgroups. That
 
    machine accepts it locally, then forwards it to all its neighbors
 
    that are interested in at least one of the newsgroups of the
 
    message.  (Site A deems host B to be "interested" in a newsgroup if
 
    the newsgroup matches the pattern on the arc from A to B.  This
 
    pattern is stored in a file on the A machine.)  The hosts receiving
 
    the incoming message examine it to make sure they really want the
 
    message, accept it locally, and then in turn forward the message to
 
    all their interested neighbors.  This process continues until the
 
    entire network has seen the message.
 
 
 
    An important part of the algorithm is the prevention of loops.  The
 
    above process would cause a message to loop along a cycle forever.
 
    In particular, when host A sends a message to host B, host B will
 
    send it back to host A, which will send it to host B, and so on.
 
    One solution to this is the history mechanism.  Each host keeps
 
    track of all messages it has seen (by their Message-ID) and
 
    whenever a message comes in that it has already seen, the incoming
 
    message is discarded immediately.  This solution is sufficient to
 
    prevent loops, but additional optimizations can be made to avoid
 
    sending messages to hosts that will simply throw them away.
 
 
 
    One optimization is that a message should never be sent to a machine
 
    listed in the "Path" line of the header.  When a machine name is
 
    in the "Path" line, the message is known to have passed through the
 
    machine.  Another optimization is that, if the message originated
 
    on host A, then host A has already seen the message.  Thus, if a
 
    message is posted to newsgroup misc.misc, it will match the pattern
 
    misc.all (where all is a metasymbol that matches any string), and
 
    will be forwarded to all hosts that subscribe to misc.all (as
 
    determined by what their neighbors send them).  These hosts make up
 
    the misc subnetwork.  A message posted to btl.general will reach all
 
    hosts receiving btl.all, but will not reach hosts that do not get
 
    btl.all.  In effect, the message reaches the btl subnetwork.  A
 
    message posted to newsgroups misc.misc,btl.general will reach all
 
    hosts subscribing to either of the two classes.
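
    The propagation algorithm, the history mechanism, and the "Path"
    optimization can be sketched as a small simulation.  This is a
    simplified model under stated assumptions, not the B News
    implementation: the Host class and the matches function are
    hypothetical, a direct method call stands in for transmission, and
    pattern matching handles only the "all" metasymbol.

```python
import re


def matches(group, pattern):
    """True if a newsgroup name matches a pattern such as 'misc.all'."""
    parts = [".*" if p == "all" else re.escape(p)
             for p in pattern.split(".")]
    return re.fullmatch(r"\.".join(parts), group) is not None


class Host:
    def __init__(self, name):
        self.name = name
        self.history = set()   # Message-IDs this host has already seen
        self.neighbors = []    # outgoing arcs: (Host, list of patterns)

    def link(self, other, patterns):
        """Add an arc from this host to other, labeled with patterns."""
        self.neighbors.append((other, patterns))

    def receive(self, msg_id, newsgroups, path=()):
        if msg_id in self.history:
            return             # history mechanism: discard duplicates
        self.history.add(msg_id)
        path = (self.name,) + tuple(path)
        for neighbor, patterns in self.neighbors:
            if neighbor.name in path:
                continue       # never send to a host listed in "Path"
            if any(matches(g, p) for g in newsgroups for p in patterns):
                neighbor.receive(msg_id, newsgroups, path)
```

    In this model the history check alone guarantees termination; the
    "Path" check only avoids transmissions that would be discarded.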
 
  
 
Notes
 
Notes
  
    <1>  UNIX is a registered trademark of AT&T.
+
<1>  UNIX is a registered trademark of AT&T.
 
 
 
 
 
 
 
 
 
 
 

Latest revision as of 00:05, 28 September 2020

Network Working Group                                          M. Horton
Request for Comments:  1036                      AT&T Bell Laboratories
Obsoletes: RFC-850                                              R. Adams
                                          Center for Seismic Studies
                                                        December 1987

          Standard for Interchange of USENET Messages

STATUS OF THIS MEMO

This document defines the standard format for the interchange of
network News messages among USENET hosts.  It updates and replaces
RFC-850, reflecting version B2.11 of the News program.  This memo is
distributed as an RFC to make this information easily accessible to
the Internet community.  It does not specify an Internet standard.
Distribution of this memo is unlimited.

Introduction

This document defines the standard format for the interchange of
network News messages among USENET hosts.  It describes the format
for messages themselves and gives partial standards for transmission
of news.  News transmission is not entirely specified, in order to
give a good deal of flexibility to the hosts to choose transmission
hardware and software, to batch news, and so on.

There are five sections to this document.  Section two defines the
format.  Section three defines the valid control messages.  Section
four specifies some valid transmission methods.  Section five
describes the overall news propagation algorithm.

Message Format

The primary consideration in choosing a message format is that it
fit in with existing tools as well as possible.  Existing tools
include implementations of both mail and news.  (The notesfiles
system from the University of Illinois is considered a news
implementation.)  A standard format for mail messages has existed
for many years on the Internet, and this format meets most of the
needs of USENET.  Since the Internet format is extensible,
extensions to meet the additional needs of USENET are easily made
within the Internet standard.  Therefore, the rule is adopted that
all USENET news messages must be formatted as valid Internet mail
messages, according to the Internet standard RFC-822.  The USENET
News standard is more restrictive than the Internet standard,
placing additional requirements on each message and forbidding use
of certain Internet features.  However, it should always be possible
to use a tool expecting an Internet message to process a news
message.  In any situation where this standard conflicts with the
Internet standard, RFC-822 should be considered correct and this
standard in error.
Here is an example USENET message to illustrate the fields.

          From: jerry@eagle.ATT.COM (Jerry Schwarz)
          Path: cbosgd!mhuxj!mhuxt!eagle!jerry
          Newsgroups: news.announce
          Subject: Usenet Etiquette -- Please Read
          Message-ID: <[email protected]>
          Date: Fri, 19 Nov 82 16:14:55 GMT
          Followup-To: news.misc
          Expires: Sat, 1 Jan 83 00:00:00 -0500
          Organization: AT&T Bell Laboratories, Murray Hill

          The body of the message comes here, after a blank line.

Here is an example of a message in the old format (before the
existence of this standard).  It is recommended that
implementations also accept messages in this format to ease upward
conversion.

          From: cbosgd!mhuxj!mhuxt!eagle!jerry (Jerry Schwarz)
          Newsgroups: news.misc
          Title: Usenet Etiquette -- Please Read
          Article-I.D.: eagle.642
          Posted: Fri Nov 19 16:14:55 1982
          Received: Fri Nov 19 16:59:30 1982
          Expires: Mon Jan 1 00:00:00 1990

          The body of the message comes here, after a blank line.

Some news systems transmit news in the A format, which looks like
this:

          Aeagle.642
          news.misc
          cbosgd!mhuxj!mhuxt!eagle!jerry
          Fri Nov 19 16:14:55 1982
          Usenet Etiquette - Please Read
          The body of the message comes here, with no blank line.

A standard USENET message consists of several header lines, followed
by a blank line, followed by the body of the message.  Each header
line consists of a keyword, a colon, a blank, and some additional
information.  This is a subset of the Internet standard, simplified
to allow simpler software to handle it.  The "From" line may
optionally include a full name, in the format above, or use the
Internet angle bracket syntax.  To keep the implementations simple,
other formats (for example, with part of the machine address after
the close parenthesis) are not allowed.  The Internet convention of
continuation header lines (beginning with a blank or tab) is
allowed.
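
The structure just described (header lines, a blank line, then the
body, with continuation lines beginning with a blank or tab) can be
sketched as follows.  This is a minimal illustration, not the parser
of any news implementation; the function name parse_message is
hypothetical, and repeated headers are not handled.

```python
def parse_message(text):
    """Split a news message into (headers, body).

    headers maps each keyword to its value.  A line beginning with a
    blank or tab is treated as a continuation of the previous header.
    """
    head, _, body = text.partition("\n\n")   # first blank line ends headers
    headers = {}
    last = None
    for line in head.split("\n"):
        if line[:1] in (" ", "\t") and last is not None:
            headers[last] += " " + line.strip()
            continue
        keyword, sep, value = line.partition(":")
        if not sep:
            raise ValueError("malformed header line: %r" % line)
        last = keyword
        headers[last] = value.strip()
    return headers, body
```
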
Certain headers are required, and certain other headers are
optional.  Any unrecognized headers are allowed, and will be passed
through unchanged.  The required header lines are "From", "Date",
"Newsgroups", "Subject", "Message-ID", and "Path".  The optional
header lines are "Followup-To", "Expires", "Reply-To", "Sender",
"References", "Control", "Distribution", "Keywords", "Summary",
"Approved", "Lines", "Xref", and "Organization".  Each of these
header lines will be described below.
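
The required and optional header names listed above can be checked
mechanically.  The following is a sketch only; the function name
missing_required is hypothetical, and header names are compared
without regard to case, as RFC-822 does for field names.

```python
REQUIRED = ("From", "Date", "Newsgroups", "Subject", "Message-ID", "Path")
OPTIONAL = ("Followup-To", "Expires", "Reply-To", "Sender", "References",
            "Control", "Distribution", "Keywords", "Summary", "Approved",
            "Lines", "Xref", "Organization")


def missing_required(headers):
    """Return the required header names absent from a header mapping.

    Unrecognized headers are allowed and are simply ignored here.
    """
    present = {name.lower() for name in headers}
    return [name for name in REQUIRED if name.lower() not in present]
```
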

Required Header lines

From

The "From" line contains the electronic mailing address of the
person who sent the message, in the Internet syntax.  It may
optionally also contain the full name of the person, in parentheses,
after the electronic address.  The electronic address is the same as
the entity responsible for originating the message, unless the
"Sender" header is present, in which case the "From" header might
not be verified.  Note that in all host and domain names, upper and
lower case are considered the same, thus "[email protected]",
"[email protected]", and "[email protected]" are all equivalent.
User names may or may not be case sensitive, for example,
"[email protected]" might be different from
"[email protected]".  Programs should avoid changing the case of
electronic addresses when forwarding news or mail.
RFC-822 specifies that all text in parentheses is to be interpreted
as a comment.  It is common in Internet mail to place the full name
of the user in a comment at the end of the "From" line.  This
standard specifies a more rigid syntax.  The full name is not
considered a comment, but an optional part of the header line.
Either the full name is omitted, or it appears in parentheses after
the electronic address of the person posting the message, or it
appears before an electronic address which is enclosed in angle
brackets.  Thus, the three permissible forms are:

          From: [email protected]
          From: [email protected] (Mark Horton)
          From: Mark Horton <[email protected]>

Full names may contain any printing ASCII characters from space
through tilde, except that they may not contain "(" (left
parenthesis), ")" (right parenthesis), "<" (left angle bracket), or
">" (right angle bracket).  Additional restrictions may be placed on
full names by the mail standard, in particular, the characters ","
(comma), ":" (colon), "@" (at), "!" (bang), "/" (slash), "="
(equal), and ";" (semicolon) are inadvisable in full names.
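The three permissible forms can be told apart mechanically.  The
following Python sketch is illustrative only and is not part of this
standard; the function name parse_from and the addresses used with it
are hypothetical, and the address itself is not validated beyond
excluding blanks, parentheses and angle brackets.

```python
import re

# One printing ASCII character (space through tilde) other than ( ) < >.
_NAME_CHAR = r"[ -'*-;=?-~]"
# A deliberately loose stand-in for an Internet address (an assumption).
_ADDRESS = r"[^<>()\s]+"


def parse_from(value):
    """Split a "From" value into (address, full_name); full_name may be None."""
    value = value.strip()
    # Form: Full Name <address>
    m = re.fullmatch(rf"({_NAME_CHAR}*?)\s*<({_ADDRESS})>", value)
    if m and m.group(1):
        return m.group(2), m.group(1)
    # Form: address (Full Name)
    m = re.fullmatch(rf"({_ADDRESS})\s*\(({_NAME_CHAR}*)\)", value)
    if m:
        return m.group(1), m.group(2).strip()
    # Form: bare address
    if re.fullmatch(_ADDRESS, value):
        return value, None
    raise ValueError("not one of the three permitted forms: %r" % value)
```

For instance, parse_from("Jane Doe <jdoe@example.com>") yields the
address "jdoe@example.com" and the full name "Jane Doe".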

Date

The "Date" line (formerly "Posted") is the date that the message was
originally posted to the network.  Its format must be acceptable
both in RFC-822 and to the getdate(3) routine that is provided with
the Usenet software.  This date remains unchanged as the message is
propagated throughout the network.  One format that is acceptable to
both is:
                  Wdy, DD Mon YY HH:MM:SS TIMEZONE
Several examples of valid dates appear in the sample message above.
Note in particular that ctime(3) format:
                      Wdy Mon DD HH:MM:SS YYYY
is not acceptable because it is not a valid RFC-822 date.  However,
since older software still generates this format, news
implementations are encouraged to accept this format and translate
it into an acceptable format.
There is no hope of having a complete list of timezones.  Universal
Time (GMT), the North American timezones (PST, PDT, MST, MDT, CST,
CDT, EST, EDT) and the +/-hhmm offset specified in RFC-822 should be
supported.  It is recommended that times in message headers be
transmitted in GMT and displayed in the local time zone.
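As an illustration of the translation encouraged above, the following
Python sketch rewrites a ctime(3)-style date in the acceptable format.
It is a sketch under one loud assumption: ctime(3) output carries no
time zone, so the value is taken to be in GMT already.  The function
name is hypothetical.

```python
from datetime import datetime

CTIME_FORMAT = "%a %b %d %H:%M:%S %Y"       # Wdy Mon DD HH:MM:SS YYYY
NEWS_FORMAT = "%a, %d %b %y %H:%M:%S GMT"   # Wdy, DD Mon YY HH:MM:SS TIMEZONE


def ctime_to_news_date(text):
    """Translate a ctime(3)-style date into the acceptable format.

    Assumption: the input is already in GMT, because the ctime(3)
    format has no field that says otherwise.
    """
    parsed = datetime.strptime(text.strip(), CTIME_FORMAT)
    return parsed.strftime(NEWS_FORMAT)
```

A real implementation would need to know the time zone in which the
old software generated the date before labelling the result.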

Newsgroups

The "Newsgroups" line specifies the newsgroup or newsgroups in which
the message belongs.  Multiple newsgroups may be specified,
separated by a comma.  Newsgroups specified must all be the names of
existing newsgroups, as no new newsgroups will be created by simply
posting to them.
Wildcards (e.g., the word "all") are never allowed in a "Newsgroups"
line.  For example, a newsgroup comp.all is illegal,
although a newsgroup rec.sport.football is permitted.
If a message is received with a "Newsgroups" line listing some valid
newsgroups and some invalid newsgroups, a host should not remove
invalid newsgroups from the list.  Instead, the invalid newsgroups
should be ignored.  For example, suppose host A subscribes to the
classes btl.all and comp.all, and exchanges news messages with host
B, which subscribes to comp.all but not btl.all.  Suppose A receives
a message with Newsgroups: comp.unix,btl.general.
This message is passed on to B because B receives comp.unix, but B
does not receive btl.general.  A must leave the "Newsgroups" line
unchanged.  If it were to remove btl.general, the edited header
could eventually re-enter the btl.all class, resulting in a message
that is not shown to users subscribing to btl.general.  Also,
follow-ups from outside btl.all would not be shown to such users.
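The rule that invalid newsgroups are ignored rather than removed can
be sketched in Python as follows.  The names used here are
hypothetical, and active_groups stands for whatever list of valid
newsgroups the local host keeps.

```python
def split_newsgroups(header_value):
    """Split a "Newsgroups" value on commas into a list of names."""
    return [name.strip() for name in header_value.split(",") if name.strip()]


def groups_to_file_under(header_value, active_groups):
    """Return the listed newsgroups that are valid on this host.

    The header value is only read.  The caller forwards the message
    with its "Newsgroups" line exactly as it was received.
    """
    return [g for g in split_newsgroups(header_value) if g in active_groups]
```

In the example above, host B would file the message under comp.unix
only, yet pass on "comp.unix,btl.general" unchanged.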

Subject

The "Subject" line (formerly "Title") tells what the message is
about.  It should be suggestive enough of the contents of the
message to enable a reader to make a decision whether to read the
message based on the subject alone.  If the message is submitted in
response to another message (e.g., is a follow-up) the default
subject should begin with the four characters "Re: " (ending in a
blank), and the "References" line is required.  For follow-ups, the
use of the "Summary" line is encouraged.

Message-ID

The "Message-ID" line gives the message a unique identifier.  The
Message-ID may not be reused during the lifetime of any previous
message with the same Message-ID.  (It is recommended that no
Message-ID be reused for at least two years.)  Message-ID's have the
syntax:
                 <string not containing blank or ">">
In order to conform to RFC-822, the Message-ID must have the format:
                      <unique@full_domain_name>
where full_domain_name is the full name of the host at which the
message entered the network, including a domain that host is in, and
unique is any string of printing ASCII characters, not including "<"
(left angle bracket), ">" (right angle bracket), or "@" (at sign).
For example, the unique part could be an integer representing a
sequence number for messages submitted to the network, or a short
string derived from the date and time the message was created.  For
example, a valid Message-ID for a message submitted from host ucbvax
in domain "Berkeley.EDU" would be "<[email protected]>".
Programmers are urged not to make assumptions about the content of
Message-ID fields from other hosts, but to treat them as unknown
character strings.  It is not safe, for example, to assume that a
Message-ID will be under 14 characters, that it is unique in the
first 14 characters, nor that it does not contain a "/".
The angle brackets are considered part of the Message-ID.  Thus, in
references to the Message-ID, such as the ihave/sendme and cancel
control messages, the angle brackets are included.  White space
characters (e.g., blank and tab) are not allowed in a Message-ID.
Slashes ("/") are strongly discouraged.  All characters between the
angle brackets must be printing ASCII characters.
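The syntax rules above can be collected into a single check.  The
Python sketch below is illustrative only: it tests the form
<unique@full_domain_name> and the character restrictions, but makes no
attempt to confirm that the domain part names a real host.  The
function name is hypothetical.

```python
import re

# Printing ASCII characters other than blank, "<", ">" and "@":
# "!" through ";", then "=", then "?", then "A" through "~".
_PART = r"[!-;=?A-~]+"
_MESSAGE_ID = re.compile("<" + _PART + "@" + _PART + ">")


def is_valid_message_id(text):
    """True if text, angle brackets included, has the required form."""
    return _MESSAGE_ID.fullmatch(text) is not None
```

Note that a "/" is accepted by this check, since slashes are
discouraged but not forbidden.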

Path

This line shows the path the message took to reach the current
system.  When a system forwards the message, it should add its own
name to the list of systems in the "Path" line.  The names may be
separated by any punctuation character or characters (except "."
which is considered part of the hostname).  Thus, the following are
valid entries:
               cbosgd!mhuxj!mhuxt
               cbosgd, mhuxj, mhuxt
               @cbosgd.ATT.COM,@mhuxj.ATT.COM,@mhuxt.ATT.COM
               teklabs, zehntel, sri-unix@cca!decvax
(The last path indicates a message that passed through decvax,
cca, sri-unix, zehntel, and teklabs, in that order.) Additional
names should be added from the left.  For example, the most recently
added name in the fourth example was teklabs.  Letters, digits,
periods and hyphens are considered part of host names; other
punctuation, including blanks, is considered a separator.
Normally, the rightmost name will be the name of the originating
system.  However, it is also permissible to include an extra entry
on the right, which is the name of the sender.  This is for upward
compatibility with older systems.
The "Path" line is not used for replies, and should not be taken as
a mailing address.  It is intended to show the route the message
traveled to reach the local host.  There are several uses for this
information.  One is to monitor USENET routing for performance
reasons.  Another is to establish a path to reach new hosts.
Perhaps the most important use is to cut down on redundant USENET
traffic by failing to forward a message to a host that is known to
have already received it.  In particular, when host A sends a
message to host B, the "Path" line includes A, so that host B will
not immediately send the message back to host A.  The name each host
uses to identify itself should be the same as the name by which its
neighbors know it, in order to make this optimization possible.
A host adds its own name to the front of a path when it receives a
message from another host.  Thus, if a message with path "A!X!Y!Z"
is passed from host A to host B, B will add its own name to the path
when it receives the message from A, e.g., "B!A!X!Y!Z".  If B then
passes the message on to C, the message sent to C will contain the
path "B!A!X!Y!Z", and when C receives it, C will change it to
"C!B!A!X!Y!Z".
Special upward compatibility note:  Since the "From", "Sender", and
"Reply-To" lines are in Internet format, and since many USENET hosts
do not yet have mailers capable of understanding Internet format, it
would break the reply capability to completely sever the connection
between the "Path" header and the reply function.  It is recognized
that the path is not always a valid reply string in older
implementations, and no requirement to fix this problem is placed on
implementations.  However, the existing convention of placing the
host name and an "!"  at the front of the path, and of starting the
path with the host name, an "!", and the user name, should be
maintained when possible.
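The handling of the "Path" line described in this section can be
sketched in Python as follows.  This is an illustration under stated
assumptions, not a definitive implementation: the function names are
hypothetical, "!" is chosen as the separator when a name is added, and
names are compared without regard to case, following the statement
earlier in this document that case is not significant in host names.

```python
import re


def path_hosts(path):
    """Split a "Path" value into names, most recently added first."""
    # Letters, digits, periods and hyphens are part of a name;
    # every other character acts as a separator.
    return re.findall(r"[A-Za-z0-9.-]+", path)


def add_self(path, my_name):
    """Add the local host name at the left when a message is received."""
    return my_name + "!" + path


def already_visited(path, neighbor):
    """True if neighbor appears in the path, so forwarding is redundant."""
    return neighbor.lower() in (name.lower() for name in path_hosts(path))
```

With these definitions, add_self("A!X!Y!Z", "B") gives "B!A!X!Y!Z",
and already_visited reports that the message need not be returned
to host A.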

Optional Headers

Reply-To

This line has the same format as "From".  If present, mailed replies
to the author should be sent to the name given here.  Otherwise,
replies are mailed to the name on the "From" line. (This does not
prevent additional copies from being sent to recipients named by the
replier, or on "To" or "Cc" lines.)  The full name may be optionally
given, in parentheses, as in the "From" line.

Sender

This field is present only if the submitter manually enters a "From"
line.  It is intended to record the entity responsible for
submitting the message to the network.  It should be verified by the
software at the submitting host.
For example, if John Smith is visiting CCA and wishes to post a
message to the network, using friend Sarah Jones' account, the
message might read:
          From: [email protected] (John Smith)
          Sender: [email protected] (Sarah Jones)
If a gateway program enters a mail message into the network at host
unix.SRI.COM, the lines might read:
          From: [email protected]
          Sender: [email protected]
The primary purpose of this field is to be able to track down
messages to determine how they were entered into the network.  The
full name may be optionally given, in parentheses, as in the "From"
line.

Followup-To

This line has the same format as "Newsgroups".  If present, follow-
up messages are to be posted to the newsgroup or newsgroups listed
here.  If this line is not present, follow-ups are posted to the
newsgroup or newsgroups listed in the "Newsgroups" line.
If the keyword poster is given instead of a newsgroup list, follow-up
messages are not permitted; a response should instead be mailed to
the submitter of the message.
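One way to decide where a response goes is sketched below in Python.
The function name is hypothetical, headers is assumed to be a simple
mapping from header names to values, and the use of "Reply-To" in
preference to "From" for mailed responses follows the "Reply-To"
section of this document.

```python
def followup_destination(headers):
    """Return ("post", [newsgroups]) or ("mail", address) for a response."""
    target = headers.get("Followup-To", headers["Newsgroups"])
    if target.strip() == "poster":
        # Follow-ups are not permitted: reply to the submitter by mail.
        return "mail", headers.get("Reply-To", headers["From"])
    return "post", [name.strip() for name in target.split(",")]
```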

Expires

This line, if present, is in a legal USENET date format.  It
specifies a suggested expiration date for the message.  If not
present, the local default expiration date is used.  This field is
intended to be used to clean up messages with a limited usefulness,
or to keep important messages around for longer than usual.  For
example, a message announcing an upcoming seminar could have an
expiration date the day after the seminar, since the message is not
useful after the seminar is over.  Since local hosts have local
policies for expiration of news (depending on available disk space,
for instance), users are discouraged from providing expiration dates
for messages unless there is a natural expiration date associated
with the topic.  System software should almost never provide a
default "Expires" line.  Leave it out and allow local policies to be
used unless there is a good reason not to.

References

This field lists the Message-ID's of any messages prompting the
submission of this message.  It is required for all follow-up
messages, and forbidden when a new subject is raised.
Implementations should provide a follow-up command, which allows a
user to post a follow-up message.  This command should generate a
"Subject" line which is the same as the original message, except
that if the original subject does not begin with "Re: " or "re: ",
the four characters "Re: " (ending in a blank) are inserted before
the subject.  If there is
no "References" line on the original header, the "References" line
should contain the Message-ID of the original message (including the
angle brackets).  If the original message does have a "References"
line, the follow-up message should have a "References" line
containing the text of the original "References" line, a blank, and
the Message-ID of the original message.
The purpose of the "References" header is to allow messages to be
grouped into conversations by the user interface program.  This
allows conversations within a newsgroup to be kept together, and
potentially users might shut off entire conversations without
unsubscribing to a newsgroup.  User interfaces need not make use of
this header, but all automatically generated follow-ups should
generate the "References" line for the benefit of systems that do
use it, and manually generated follow-ups (e.g., typed in well after
the original message has been printed by the machine) should be
encouraged to include them as well.
It is permissible to not include the entire previous "References"
line if it is too long.  An attempt should be made to include a
reasonable number of backwards references.
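The header generation expected of a follow-up command can be sketched
in Python as follows.  The function names are hypothetical, and the
sketch assumes that the four-character prefix is "Re:" followed by one
blank.  Shortening an overlong "References" line is left out.

```python
def followup_subject(original_subject):
    """Subject for a follow-up: add the prefix unless already present."""
    if original_subject.startswith(("Re: ", "re: ")):
        return original_subject
    return "Re: " + original_subject


def followup_references(original_message_id, original_references=None):
    """References for a follow-up, angle brackets included."""
    if original_references:
        # Original "References" text, a blank, then the original Message-ID.
        return original_references + " " + original_message_id
    return original_message_id
```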

Control

If a message contains a "Control" line, the message is a control
message.  Control messages are used for communication among USENET
host machines, not to be read by users.  Control messages are
distributed by the same newsgroup mechanism as ordinary messages.
The body of the "Control" header line is the message to the host.
For upward compatibility, messages that match the newsgroup pattern
"all.all.ctl" should also be interpreted as control messages.  If no
"Control" header is present on such messages, the subject is used as
the control message.  However, messages on newsgroups matching this
pattern do not conform to this standard.
Also for upward compatibility, if the first 4 characters of the
"Subject:" line are "cmsg", the rest of the "Subject:" line should
be interpreted as a control message.

Distribution

This line is used to alter the distribution scope of the message.
It is a comma separated list similar to the "Newsgroups" line.  User
subscriptions are still controlled by "Newsgroups", but the message
is sent to all systems subscribing to the newsgroups on the
"Distribution" line in addition to the "Newsgroups" line.  For the
message to be transmitted, the receiving site must normally receive
one of the specified newsgroups AND must receive one of the
specified distributions.  Thus, a message concerning a car for sale
in New Jersey might have headers including:
               Newsgroups: rec.auto,misc.forsale
               Distribution: nj,ny
so that it would only go to persons subscribing to rec.auto or
misc.forsale within New Jersey or New York.  The intent of this header
is to restrict the distribution of a newsgroup further, not to
increase it.  A local newsgroup, such as nj.crazy-eddie, will
probably not be propagated by hosts outside New Jersey that do not
show such a newsgroup as valid.  A follow-up message should default
to the same "Distribution" line as the original message, but the
user can change it to a more limited one, or escalate the
distribution if it was originally restricted and a more widely
distributed reply is appropriate.
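The AND condition described above can be sketched in Python.  This is
a simplification under stated assumptions: the receiving site is
described by two hypothetical sets, site_groups and
site_distributions, holding exact names rather than patterns, and the
message is assumed to carry a "Distribution" line.

```python
def should_transmit(newsgroups, distribution, site_groups, site_distributions):
    """True if the site receives one listed newsgroup AND one distribution."""
    groups = [g.strip() for g in newsgroups.split(",")]
    dists = [d.strip() for d in distribution.split(",")]
    wants_group = any(g in site_groups for g in groups)
    wants_distribution = any(d in site_distributions for d in dists)
    return wants_group and wants_distribution
```

Under this sketch, a site that receives rec.auto but only the
distribution ca would not be sent the New Jersey car advertisement.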

Organization

The text of this line is a short phrase describing the organization
to which the sender belongs, or to which the machine belongs.  The
intent of this line is to help identify the person posting the
message, since host names are often cryptic enough to make it hard
to recognize the organization by the electronic address.

Keywords

A few well-selected keywords identifying the message should be on
this line.  This is used as an aid in determining if this message is
interesting to the reader.

Summary

This line should contain a brief summary of the message.  It is
usually used as part of a follow-up to another message.  Again, it
is very useful to the reader in determining whether to read the
message.

Approved

This line is required for any message posted to a moderated
newsgroup.  It should be added by the moderator and consist of his
mail address.  It is also required with certain control messages.

Lines

This contains a count of the number of lines in the body of the
message.

Xref

This line contains the name of the host (with domains omitted) and a
white space separated list of colon-separated pairs of newsgroup
names and message numbers.  These are the newsgroups listed in the
"Newsgroups" line and the corresponding message numbers from the
spool directory.
This is only of value to the local system, so it should not be
transmitted.  For example, in:
           Path: seismo!lll-crg!lll-lcc!pyramid!decwrl!reid
           From: [email protected] (Brian Reid)
           Newsgroups: news.lists,news.groups
           Subject: USENET READERSHIP SUMMARY REPORT FOR SEP 86
           Message-ID: <[email protected]>
           Date: 1 Oct 86 11:26:15 GMT
           Organization: DEC Western Research Laboratory
           Lines: 441
           Approved: [email protected]
           Xref: seismo news.lists:461 news.groups:6378
the "Xref" line shows that the message is message number 461 in the
newsgroup news.lists, and message number 6378 in the newsgroup
news.groups, on host seismo.  This information may be used by
certain user interfaces.
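A user interface might read the "Xref" value as in the following
Python sketch.  The function name is hypothetical.

```python
def parse_xref(value):
    """Split an "Xref" value into (host, {newsgroup: message_number})."""
    host, *pairs = value.split()
    numbers = {}
    for pair in pairs:
        # Each pair is "newsgroup:number"; split at the last colon.
        group, _, number = pair.rpartition(":")
        numbers[group] = int(number)
    return host, numbers
```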

Control Messages

This section lists the control messages currently defined.  The body
of the "Control" header line is the control message.  Messages are a
sequence of zero or more words, separated by white space (blanks or
tabs).  The first word is the name of the control message; the remaining
words are parameters to the message.  The remainder of the header
and the body of the message are also potential parameters; for
example, the "From" line might suggest an address to which a
response is to be mailed.
Implementors and administrators may choose to allow control messages
to be carried out automatically, or to queue them for manual
processing.  However, manually processed messages should be dealt
with promptly.
Failed control messages should NOT be mailed to the originator of
the message, but to the local "usenet" account.
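Recognizing a control message and splitting it into words can be
sketched in Python as follows.  The function name is hypothetical and
headers is assumed to be a simple mapping from header names to values.
The sketch honors the "cmsg" subject convention described earlier but
omits the older "all.all.ctl" newsgroup convention, and it treats a
control message of zero words as nothing to do, which is a choice made
here rather than a requirement.

```python
def control_command(headers):
    """Return (name, parameters) for a control message, else None."""
    text = headers.get("Control")
    if text is None:
        subject = headers.get("Subject", "")
        if subject.startswith("cmsg"):
            # Upward compatibility: the rest of the subject is the message.
            text = subject[4:]
    if text is None:
        return None
    words = text.split()          # words are separated by blanks or tabs
    if not words:
        return None
    return words[0], words[1:]
```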

Cancel

                 cancel <Message-ID>
If a message with the given Message-ID is present on the local
system, the message is cancelled.  This mechanism allows a user to
cancel a message after the message has been distributed over the
network.
If the system is unable to cancel the message as requested, it
should not forward the cancellation request to its neighbor systems.
Only the author of the message or the local news administrator is
allowed to send this message.  The verified sender of a message is
the "Sender" line, or if no "Sender" line is present, the "From"
line.  The verified sender of the cancel message must be the same as
either the "Sender" or "From" field of the original message.  A
verified sender in the cancel message is allowed to match an
unverified "From" in the original message.
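The sender check for a cancel message can be sketched in Python.  The
sketch is a simplification under stated assumptions: the function
names are hypothetical, header values are compared as whole strings
(so any full name must match as well), and the exemption for the local
news administrator is not modelled.

```python
def verified_sender(headers):
    """The "Sender" value if present, otherwise the "From" value."""
    return headers.get("Sender") or headers.get("From")


def may_cancel(cancel_headers, original_headers):
    """True if the cancel's verified sender matches the original message."""
    who = verified_sender(cancel_headers)
    if who is None:
        return False
    return who in (original_headers.get("Sender"), original_headers.get("From"))
```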

Ihave/Sendme

               ihave <Message-ID list> [<remotesys>]
               sendme <Message-ID list> [<remotesys>]
This message is part of the ihave/sendme protocol, which allows one
host (say A) to tell another host (B) that a particular message has
been received on A.  Suppose that host A receives message
"<[email protected]>", and wishes to transmit the message to
host B.
A sends the control message "ihave <[email protected]> A" to
host B (by posting it to newsgroup to.B).  B responds with the
control message "sendme <[email protected]> B" (on newsgroup
to.A), if it has not already received the message.  Upon receiving
the sendme message, A sends the message to B.
This protocol can be used to cut down on redundant traffic between
hosts.  It is optional and should be used only if the particular
situation makes it worthwhile.  Frequently, the outcome is that,
since most original messages are short, and since there is a high
overhead to start sending a new message with UUCP, it costs as much
to send the ihave as it would cost to send the message itself.
One possible solution to this overhead problem is to batch requests.
Several Message-ID's may be announced or requested in one message.
If no Message-ID's are listed in the control message, the body of
the message should be scanned for Message-ID's, one per line.
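Host B's side of the exchange, including batched requests, can be
sketched in Python as follows.  The function names are hypothetical,
and history stands for whatever record of received Message-ID's the
host keeps.

```python
def ids_from_control(params, body):
    """Message-ID's named by an ihave or sendme control message."""
    ids = [p for p in params if p.startswith("<") and p.endswith(">")]
    if not ids:
        # None on the control line: scan the body, one per line.
        ids = [line.strip() for line in body.splitlines() if line.strip()]
    return ids


def answer_ihave(params, body, history, my_name):
    """Build the sendme text for unseen messages, or None if all are known."""
    wanted = [i for i in ids_from_control(params, body) if i not in history]
    if not wanted:
        return None
    return "sendme " + " ".join(wanted) + " " + my_name
```

Several Message-ID's are requested in a single sendme here, which is
the batching suggested above.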

Newgroup

                  newgroup <groupname> [moderated]
This control message creates a new newsgroup with the given name.
Since no messages may be posted or forwarded until a newsgroup is
created, this message is required before a newsgroup can be used.
The body of the message is expected to be a short paragraph
describing the intended use of the newsgroup.
If the second argument is present and it is the keyword moderated,
the group should be created moderated instead of the default of
unmoderated.  The newgroup message should be ignored unless there is
an "Approved" line in the same message header.

Rmgroup

                        rmgroup <groupname>
This message removes a newsgroup with the given name.  Since the
newsgroup is removed from every host on the network, this command
should be used carefully by a responsible administrator.  The
rmgroup message should be ignored unless there is an "Approved:"
line in the same message header.

Sendsys

                       sendsys (no arguments)
The sys file, listing all neighbors and the newsgroups to be sent to
each neighbor, will be mailed to the author of the control message
("Reply-To", if present, otherwise "From").  This information is
considered public information, and it is a requirement of membership
in USENET that this information be provided on request, either
automatically in response to this control message, or manually, by
mailing the requested information to the author of the message.
This information is used to keep the map of USENET up to date, and
to determine where netnews is sent.
The format of the file mailed back to the author should be the same
as that of the sys file.  This format has one line per neighboring
host (plus one line for the local host), containing four colon
separated fields.  The first field has the host name of the
neighbor, the second field has a newsgroup pattern describing the
newsgroups sent to the neighbor.  The third and fourth fields are
not defined by this standard.  The sys file is not the same as the
UUCP L.sys file.  A sample response is:
  From: cbosgd!mark  (Mark Horton)
  Date: Sun, 27 Mar 83 20:39:37 -0500
  Subject: response to your sendsys request
  To: [email protected]
  Responding-System: cbosgd.ATT.COM
  cbosgd:osg,cb,btl,bell,world,comp,sci,rec,talk,misc,news,soc,to,
        test
  ucbvax:world,comp,to.ucbvax:L:
  cbosg:world,comp,bell,btl,cb,osg,to.cbosg:F:/usr/spool/outnews
        /cbosg
  cbosgb:osg,to.cbosgb:F:/usr/spool/outnews/cbosgb
  sescent:world,comp,bell,btl,cb,to.sescent:F:/usr/spool/outnews
        /sescent
  npois:world,comp,bell,btl,ug,to.npois:F:/usr/spool/outnews/npois
  mhuxi:world,comp,bell,btl,ug,to.mhuxi:F:/usr/spool/outnews/mhuxi
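A recipient of such a response might read each entry as in the
following Python sketch.  The function name is hypothetical.  The
sketch assumes that an entry folded across lines for display, as in
the sample above, has first been rejoined into one line, and it pads
missing trailing fields with empty strings.

```python
def parse_sys_line(line):
    """Split one sys file entry into (host, patterns, third, fourth)."""
    fields = line.strip().split(":", 3)
    fields += [""] * (4 - len(fields))       # trailing fields may be absent
    host, patterns, third, fourth = fields
    return host, patterns.split(","), third, fourth
```

The third and fourth fields are returned untouched, since their
meaning is not defined by this standard.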

Version

                       version (no arguments)
The name and version of the software running on the local system is
to be mailed back to the author of the message ("Reply-To" if
present, otherwise "From").

Checkgroups

The message body is a list of "official" newsgroups and their
description, one group per line.  They are compared against the list
of active newsgroups on the current host.  The names of any obsolete
or new newsgroups are mailed to the user "usenet" and descriptions
of the new newsgroups are added to the help file used when posting
news.
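The comparison can be sketched in Python as follows.  The function
name is hypothetical, and the sketch assumes that the newsgroup name
is the first word on each line of the body, with the description
after it; the exact layout of the body is not spelled out above.

```python
def compare_groups(body, active):
    """Return (new, obsolete) newsgroup names as two sorted lists."""
    official = {
        line.split(None, 1)[0]               # first word: the newsgroup name
        for line in body.splitlines()
        if line.strip()
    }
    new = sorted(official - active)          # official but not active here
    obsolete = sorted(active - official)     # active here but not official
    return new, obsolete
```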

Transmission Methods

USENET is not a physical network, but rather a logical network
resting on top of several existing physical networks.  These
networks include, but are not limited to, UUCP, the Internet, an
Ethernet, the BLICN network, an NSC Hyperchannel, and a BERKNET.
What is important is that two neighboring systems on USENET have
some method to get a new message, in the format listed here, from
one system to the other, and once on the receiving system, processed
by the netnews software on that system.  (On UNIX systems, this
usually means the rnews program being run with the message on the
standard input. <1>)
It is not a requirement that USENET hosts have mail systems capable
of understanding the Internet mail syntax, but it is strongly
recommended.  Since "From", "Reply-To", and "Sender" lines use the
Internet syntax, replies will be difficult or impossible without an
Internet mailer.  A host without an Internet mailer can attempt to
use the "Path" header line for replies, but this field is not
guaranteed to be a working path for replies.  In any event, any host
generating or forwarding news messages must have an Internet address
that allows them to receive mail from hosts with Internet mailers,
and they must include their Internet address on their From line.

Remote Execution

Some networks permit direct remote command execution.  On these
networks, news may be forwarded by spooling the rnews command with
the message on the standard input.  For example, if the remote
system is called remote, news would be sent over a UUCP link
with the command:
                          uux - remote!rnews
and on a Berknet:
                          net -mremote rnews
It is important that the message be sent via a reliable mechanism,
normally involving the possibility of spooling, rather than direct
real-time remote execution.  This is because, if the remote system
is down, a direct execution command will fail, and the message will
never be delivered.  If the message is spooled, it will eventually
be delivered when both systems are up.

Transfer by Mail

On some systems, direct remote spooled execution is not possible.
However, most systems support electronic mail, and a news message
can be sent as mail.  One approach is to send a mail message which
is identical to the news message: the mail headers are the news
headers, and the mail body is the news body.  By convention, this
mail is sent to the user newsmail on the remote machine.
One problem with this method is that it may not be possible to
convince the mail system that the "From" line of the message is
valid, since the mail message was generated by a program on a
system different from the source of the news message.  Another
problem is that error messages caused by the mail transmission
would be sent to the originator of the news message, who has no
control over news transmission between two cooperating hosts
and does not know whom to contact.  Transmission error messages
should be directed to a responsible contact person on the
sending machine.
A solution to this problem is to encapsulate the news message into a
mail message, such that the entire message (headers and body) are
part of the body of the mail message.  The convention here is that
such mail is sent to user rnews on the remote system.  A mail
message body is generated by prepending the letter N to each line of
the news message, and then attaching whatever mail headers are
convenient to generate.  The N's are attached to prevent any special
lines in the news message from interfering with mail transmission,
and to prevent any extra lines inserted by the mailer (headers,
blank lines, etc.) from becoming part of the news message.  A
program on the receiving machine receives mail to rnews, extracting
the message itself and invoking the rnews program.  An example in
this format might look like this:
            Date: Mon, 3 Jan 83 08:33:47 MST
            From: [email protected]
            Subject: network news message
            To: [email protected]
            NPath: cbosgd!mhuxj!harpo!utah-cs!sask!derek
            NFrom: [email protected] (Derek Andrew)
            NNewsgroups: misc.test
            NSubject: necessary test
            NMessage-ID: <[email protected]>
            NDate: Mon, 3 Jan 83 00:59:15 MST
            N
            NThis really is a test.  If anyone out there more than 6
            Nhops away would kindly confirm this note I would
            Nappreciate it.  We suspect that our news postings
            Nare not getting out into the world.
            N
Using mail solves the spooling problem, since mail must always be
spooled if the destination host is down.  However, it adds more
overhead to the transmission process (to encapsulate and extract the
message) and makes it harder for software to give different
priorities to news and mail.
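The encapsulation with the letter N, and its reversal on the
receiving machine, can be sketched in Python as follows.  The function
names are hypothetical.

```python
def encapsulate(news_text):
    """Prepend the letter N to every line of a news message."""
    return "".join("N" + line for line in news_text.splitlines(keepends=True))


def extract(mail_body):
    """Recover the news message: keep only N lines, dropping the N."""
    return "".join(
        line[1:]
        for line in mail_body.splitlines(keepends=True)
        if line.startswith("N")
    )
```

Lines that do not begin with N, such as blank lines added by a mailer,
are discarded by extract.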

Batching

Since news messages are usually short, and since a large number of
messages are often sent between two hosts in a day, it may make
sense to batch news messages.  Several messages can be combined into
one large message, using conventions agreed upon in advance by the
two hosts.  One such batching scheme is described here; its use is
highly recommended.
News messages are combined into a script, separated by a header of
the form:
               #! rnews 1234
where 1234 is the length of the message in bytes.  Each such line is
followed by a message containing the given number of bytes.  (The
newline at the end of each line of the message is counted as one
byte, for purposes of this count, even if it is stored as <CARRIAGE
RETURN><LINE FEED>.)  For example, a batch of messages might look
like this:
            #! rnews 239
            From: [email protected] (Jerry Schwarz)
            Path: cbosgd!mhuxj!mhuxt!eagle!jerry
            Newsgroups: news.announce
            Subject: Usenet Etiquette -- Please Read
            Message-ID: <[email protected]>
            Date: Fri, 19 Nov 82 16:14:55 EST
            Approved: [email protected]
            Here is an important message about USENET Etiquette.
            #! rnews 234
            From: [email protected] (Jerry Schwarz)
            Path: cbosgd!mhuxj!mhuxt!eagle!jerry
            Newsgroups: news.announce
            Subject: Notes on Etiquette message
            Message-ID: <[email protected]>
            Date: Fri, 19 Nov 82 17:24:12 EST
            Approved: [email protected]
            There was something I forgot to mention in the last
            message.
Batched news is recognized because the first character in the
message is #.  The message is then passed to the unbatcher for
interpretation.
The second argument (in this example rnews) determines which
batching scheme is being used.  Cooperating hosts may use whatever
scheme is appropriate for them.
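The rnews batching scheme can be sketched in Python as follows.  The
function names are hypothetical, and the sketch assumes that each
newline is stored as a single byte, so that stored length and counted
length agree.

```python
def batch(messages):
    """Combine messages (bytes) into one batch with "#! rnews" headers."""
    return b"".join(b"#! rnews %d\n" % len(m) + m for m in messages)


def unbatch(data):
    """Split a batch (bytes) back into a list of messages (bytes)."""
    messages = []
    pos = 0
    while pos < len(data):
        end_of_line = data.index(b"\n", pos)
        words = data[pos:end_of_line].split()
        if len(words) != 3 or words[0] != b"#!" or words[1] != b"rnews":
            raise ValueError("expected '#! rnews <count>' at offset %d" % pos)
        count = int(words[2])
        start = end_of_line + 1
        if start + count > len(data):
            raise ValueError("batch ends before %d bytes were read" % count)
        messages.append(data[start:start + count])
        pos = start + count
    return messages
```

Because the unbatcher relies on the byte count alone, a message body
may itself contain a line beginning with "#!" without confusing it.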

The News Propagation Algorithm

This section describes the overall scheme of USENET and the
algorithm followed by hosts in propagating news to the entire
logical network.  Since all hosts are affected by incorrectly
formatted messages and by propagation errors, it is important
for the method to be standardized.
USENET is a directed graph.  Each node in the graph is a host
computer, and each arc in the graph is a transmission path from
one host to another host.  Each arc is labeled with a newsgroup
pattern, specifying which newsgroup classes are forwarded along
that link.  Most arcs are bidirectional, that is, if host A
sends a class of newsgroups to host B, then host B usually sends
the same class of newsgroups to host A.  This bidirectionality
is not, however, required.
USENET is made up of many subnetworks.  Each subnet has a name, such
as comp or btl.  Each subnet is a connected graph, that is, a path
exists from every node to every other node in the subnet.  In
addition, the entire graph is (theoretically) connected.  (In
practice, some political considerations have caused some hosts to be
unable to post messages reaching the rest of the network.)
A message is posted on one machine to a list of newsgroups. That
machine accepts it locally, then forwards it to all its neighbors
that are interested in at least one of the newsgroups of the
message.  (Site A deems host B to be "interested" in a newsgroup if
the newsgroup matches the pattern on the arc from A to B.  This
pattern is stored in a file on the A machine.)  The hosts receiving
the incoming message examine it to make sure they really want the
message, accept it locally, and then in turn forward the message to
all their interested neighbors.  This process continues until the
entire network has seen the message.
An important part of the algorithm is the prevention of loops.  The
above process would cause a message to loop along a cycle forever.
In particular, when host A sends a message to host B, host B will
send it back to host A, which will send it to host B, and so on.
One solution to this is the history mechanism.  Each host keeps
track of all messages it has seen (by their Message-ID) and
whenever a message comes in that it has already seen, the incoming
message is discarded immediately.  This solution is sufficient to
prevent loops, but additional optimizations can be made to avoid
sending messages to hosts that will simply throw them away.
One optimization is that a message should never be sent to a machine
listed in the "Path" line of the header.  When a machine name is
in the "Path" line, the message is known to have passed through the
machine.  Another optimization is that, if the message originated
on host A, then host A has already seen the message.  Thus, if a
message is posted to newsgroup misc.misc, it will match the pattern
misc.all (where all is a metasymbol that matches any string), and
will be forwarded to all hosts that subscribe to misc.all (as
determined by what their neighbors send them).  These hosts make up
the misc subnetwork.  A message posted to btl.general will reach all
hosts receiving btl.all, but will not reach hosts that do not get
btl.all.  In effect, the message reaches the btl subnetwork.  A
message posted to newsgroups misc.misc,btl.general will reach all
hosts subscribing to either of the two classes.
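The flooding algorithm, together with the history mechanism and the
"Path" optimization, can be simulated for a single message with the
following Python sketch.  It is an illustration under stated
assumptions: the function name is hypothetical, links is a mapping
from each host to the neighbors it feeds, and every arc is assumed to
carry the newsgroups of the message, so pattern matching is left out.

```python
from collections import deque


def propagate(links, origin):
    """Flood one message; return ({host: path}, duplicates_discarded)."""
    accepted = {origin: origin}    # doubles as each host's history entry
    duplicates = 0
    queue = deque([origin])
    while queue:
        sender = queue.popleft()
        path = accepted[sender]
        for neighbor in links.get(sender, ()):
            if neighbor in path.split("!"):
                continue           # "Path" optimization: never sent at all
            if neighbor in accepted:
                duplicates += 1    # history mechanism: sent, then discarded
                continue
            # The receiving host adds its own name at the left.
            accepted[neighbor] = neighbor + "!" + path
            queue.append(neighbor)
    return accepted, duplicates
```

On a triangle of three hosts the message is accepted once by each
host, and the two copies exchanged between the non-originating hosts
are discarded by the history check, since neither appears in the
other's copy of the path.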

Notes

<1>  UNIX is a registered trademark of AT&T.