RFC7763

From RFC-Wiki

Internet Engineering Task Force (IETF) S. Leonard Request for Comments: 7763 Penango, Inc. Category: Informational March 2016 ISSN: 2070-1721

                  The text/markdown Media Type

Abstract

This document registers the text/markdown media type for use with Markdown, a family of plain-text formatting syntaxes that optionally can be converted to formal markup languages such as HTML.

Status of This Memo

This document is not an Internet Standards Track specification; it is published for informational purposes.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc7763.

Copyright Notice

Copyright (c) 2016 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Introduction

This Is Markdown! Or: Markup and Its Discontents

In computer systems, textual data is stored and processed using a continuum of techniques. On the one end is plain text: computer- encoded text that consists only of a sequence of code points from a given standard, with no other formatting or structural information [UNICODE]. (On the other end is binary data, which computer systems store and process with bit-for-bit accuracy.) Many of these standards include control characters that are used as in-band signaling to cause effects other than the addition of a symbol (or grapheme) to the text.

Markup offers an alternative means to encode this signaling information by overloading certain graphic characters (see, e.g., [ISO646]) with additional meanings. Therefore, markup languages allow for annotating a document in a syntactically distinguishable way from the text, while keeping the annotations printable. Markup languages are (reasonably) well-specified and tend to follow (mostly) standardized syntax rules. Examples of formal markup languages include Standard Generalized Markup Language (SGML), HTML, XML, and LaTeX. Standardized rules lead to interoperability between markup processors, but they impose skill requirements on new users that lead to markup languages becoming less accessible to beginners. These rules also reify "validity": content that does not conform to the rules is treated differently (i.e., is rejected) than content that conforms.

In contrast to formal markup languages, lightweight markup languages use simple syntaxes; they are designed to be easy for humans to enter and understand with basic text editors. Markdown, the subject of this document, began as an /informal/ plain-text formatting syntax [MDSYNTAX] and Perl script HTML/XHTML processor [MARKDOWN] targeted at non-technical users using unspecialized tools, such as plain-text email clients. [MDSYNTAX] explicitly rejects the notion of validity: there is no such thing as "invalid" Markdown. If the Markdown content does not result in the "right" output (defined as output that the author wants, not output that adheres to some dictated system of rules), the expectation is that the author should continue experimenting by changing the content or the processor to achieve the desired output.

Since its development in 2004 [MARKDOWN], a number of web- and Internet-facing applications have incorporated Markdown into their text-entry systems, frequently with custom extensions. Markdown has thus evolved into a kind of Internet meme [INETMEME] as different communities encounter it and adapt the syntax for their specific use cases. Markdown now represents a family of related plain-text formatting syntaxes and implementations that, while broadly compatible with humans [HUMANE], are intended to produce different kinds of outputs that push the boundaries of mutual intelligibility between software systems.

To support identifying and conveying Markdown, this document defines a media type and parameters that indicate the Markdown author's intent on how to interpret the content. This registration draws particular inspiration from text/troff RFC4263, which is a plain- text formatting syntax for typesetting based on tools from the 1960s ("RUNOFF") and 1970s ("nroff", et al.). In that sense, Markdown is a kind of troff for modern computing. A companion document RFC7764 provides additional Markdown background, philosophy, local storage strategies, and variant registrations (including examples).

Markdown Is About Writing and Editing

 "HTML is a *publishing* format; Markdown is a *writing* format.
  Thus, Markdown's formatting syntax only addresses issues that can
  be conveyed in plain text." [MDSYNTAX]

The paradigmatic use case for text/markdown is the Markdown editor: an application that presents Markdown content (which looks like an email or other piece of plain-text writing) alongside a published format, so that an author can see results instantaneously and can tweak his or her input in real time. A significant number of Markdown editors have adopted "split-screen view" (or "live preview") technology that looks like Figure 1.

+----------------------------------------------------------------------+ | File Edit (Cloud Stuff) (Fork Me on GitHub) Help | +----------------------------------------------------------------------+ | [ such-and-such identifier ] [ useful statistics] | +----------------------------------++----------------------------------+ | (plain text, with || (text/html, likely | | syntax highlighting) || rendered to screen) | | || |

|# Introduction ||

Introduction

|

| || |

|## Markdown Is About Writing and /|

Markdown Is About Writing and | / Editing ||Editing

|

| || |

|> HTML is a *publishing* format; ||

HTML is a |

|> Markdown is a *writing* format. || publishing format; | |> Thus, Markdown's formatting || Markdown is a writing | |> syntax only addresses issues || format. Thus, Markdown's | |> that can be conveyed in plain <> formatting syntax only addresses | |> text. [MDSYNTAX][] || issues that can be conveyed in | | || plain text. <a href="http://darin/ |The paradigmatic use case for |/gfireball.net/projects/markdown/sy/ |`text/markdown` is the Markdown |/ntax#html" title="Markdown: Syntax/ |editor: an application that |/: HTML">MDSYNTAX</a> |

|presents Markdown content ||

|

|... || |

| ||

The paradigmatic use case for | |[MDSYNTAX]: http://daringfireball./%7C text/markdown is the| /net/projects/markdown/syntax#html || Markdown editor: an application | |"Markdown: Syntax: HTML" || that presents Markdown content | | || ...

|

+----------------------------------++----------------------------------+

LEGEND: "/" embedded in a vertical line represents a line-continuation
 marker, since a line break is not supposed to occur in that content.
      Figure 1: Markdown Split-Screen / Live Preview Editor

To get the best results, implementations ought to produce and consume mutually intelligible and identifiable bits of Markdown. That way, users on diverse platforms can collaborate with their tools of choice. Those tools can be desktop-based (MarkdownPad, MultiMarkdown Composer); browser-based (Dillinger, Markable); integrated widgets (Discourse, GitHub); general-purpose editors (emacs, vi); or plain old "Notepad". Additionally, implementations ought to have common ways to identify particular areas of Markdown content when the Markdown becomes appreciably large (e.g., book chapters and Internet- Drafts -- not just blog posts). So that users have the option to use Markdown in MIME-capable systems to convey their works in progress,

not just their finished products (for which full-blown markups ranging from text/html to application/pdf are appropriate), implementations ought to label such Markdown content with a common media type: text/markdown. This registration facilitates interoperability between these Markdown editors by conveying the syntax of the particular Markdown variant and the desired output format.

Definitions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC2119.

Since Markdown signifies a family of related formats with varying degrees of formal documentation and implementation, this specification uses the term "variant" to identify such formats.

Markdown Media Type Registration Application

This section provides the media type registration application for the text/markdown media type (see Section 5.6 of RFC6838).

Type name: text

Subtype name: markdown

Required parameters:

  charset: Per Section 4.2.1 of RFC6838, charset is REQUIRED.
     There is no default value because neither [MDSYNTAX] nor many
     popular implementations at the time of this registration do
     either.  [MDSYNTAX] clearly describes Markdown as a "writing
     format"; its syntax rules operate on characters (specifically,
     on punctuation) rather than code points.  Many Markdown
     processors will get along just fine by operating on characters
     in the US-ASCII repertoire (specifically punctuation),
     blissfully oblivious to other characters or codes.

Optional parameters:

  variant: An optional identifier of the specific Markdown variant
     that the author intended.  The value serves as a "hint" to the
     recipient, meaning that the recipient MAY interpret the
     Markdown as that variant, but is under no obligation to do so.
     When omitted, there is no hint; the interpretation is entirely
     up to the receiver and context.  This identifier is plain US-
     ASCII and case-insensitive.  To promote interoperability,
     identifiers can be registered in the registry defined in
     Section 6.  If a receiver does not recognize the variant
     identifier, the receiver MAY present the identifier to a user
     to inform him or her of it.
     Other parameters MAY be included with the media type.  The
     variant SHOULD define the semantics of such other parameters.
     Additionally, the variant MAY be registered under another media
     type; this text/markdown registration does not preclude other
     registrations.

Encoding considerations:

  Markdown content is plain-text content; any octet sequence is
  valid as long as it conforms to the character codes of the charset
  parameter.  See RFC2046.  Markup characters in [MDSYNTAX] are
  limited to printable US-ASCII; however, other variants can define
  markup characters outside this range (including control characters
  such as NUL and characters encoded in multiple octets).

Security considerations:

  Markdown interpreted as plain text is relatively harmless.  A text
  editor need only display the text.  The editor SHOULD take care to
  handle control characters appropriately and to limit the effect of
  the Markdown to the text-editing area itself; malicious Unicode-
  based Markdown could, for example, surreptitiously change the
  directionality of the text.  An editor for normal text would
  already take these control characters into consideration, however.
  Markdown interpreted as a precursor to other formats, such as
  HTML, carries all of the security considerations as the target
  formats.  For example, HTML can contain instructions to execute
  scripts, redirect the user to other web pages, download remote
  content, and upload personally identifiable information.  Markdown
  also can contain islands of formal markup, such as HTML.  These
  islands of formal markup may be passed as they are, transformed,
  or ignored (perhaps because the islands are conditional or
  incompatible) when the Markdown is processed.  Since Markdown may
  have different interpretations depending on the tool and the
  environment, a better approach is to analyze (and sanitize or
  block) the output markup, rather than attempting to analyze the
  Markdown.

Interoperability considerations:

  Markdown variations (some might say "innovations") are designed to
  be broadly compatible with humans ("humane"), but not necessarily
  with each other.  Therefore, syntax in one Markdown derivative may
  be ignored or treated differently in another derivative.  The
  overall effect is a general degradation of the output that
  increases with the quantity of variant-specific Markdown used in
  the text.  When it is desirable to reflect the author's intent in
  the output, stick with the variant identified in the variant
  parameter, i.e., receivers SHOULD only accept Markdown variants
  that they explicitly know about, and senders SHOULD avoid use of
  variants that intended recipients are not known to understand.

Published specification: This specification; [MDSYNTAX].

Applications that use this media type:

  Markdown conversion tools, Markdown WYSIWYG (What You See is What
  You Get) editors, and plain-text editors and viewers; markup
  processor targets indirectly use Markdown (e.g., web browsers for
  Markdown converted to HTML).

Fragment identifier considerations:

  See Section 3.

Additional information:

  Magic number(s): None
  File extension(s): .md, .markdown
  Macintosh file type code(s):
     TEXT.  A uniform type identifier (UTI) of
     "net.daringfireball.markdown", which conforms to
     "public.plain-text", is RECOMMENDED [MDUTI].  See RFC7764 for
     other considerations.

Person & email address to contact for further information:

  Sean Leonard <[email protected]>

Restrictions on usage: None.

Author/Change controller: Sean Leonard <[email protected]>

Intended usage: COMMON

Provisional registration? No

Implementations SHOULD record the value of the variant parameter (and other parameters if defined by the variant) along with the Markdown content when the content leaves the domain of formats that are Internet media type capable. Strategies for doing so are discussed in RFC7764.

The Content-Disposition header (particularly the preview-type parameter) can be used with Markdown content. See Section 4.

Fragment Identifiers

[MARKDOWN] does not define any fragment identifiers, but some variants do, and many types of Markdown processor output (e.g., HTML or PDF) will have well-defined fragment identifiers. Which fragment identifiers are available for a given document are variant-defined.

When encoded in a URI, characters that are outside of the fragment production of RFC3986 are percent-encoded. The default encoding (character set) of percent-encoded octets in URIs is the same as the Markdown content, which is identified by the charset parameter or by other contextual means. Fragment identifiers SHOULD be considered case-sensitive, which maintains consistency with HTML. Variants MAY override the guidance in this paragraph.

At least the first equals sign "=" SHOULD be percent-encoded to prevent ambiguity as described in the following section.

Parameters

Similar to application/pdf RFC3778 and text/plain RFC5147, this registration permits a parameter syntax for fragment identifiers. The syntax is a parameter name, the equals sign "=" (which MUST NOT be percent-encoded), and a parameter value. To the extent that multiple parameters can appear in a fragment production, the parameters SHALL be separated by the ampersand "&" (which MUST NOT be percent-encoded).

The only parameter defined in this registration is "line", which has the same meaning as in RFC5147, i.e., counting is zero-based. For example: "#line=10" identifies the eleventh line of Markdown input. Implementers should take heed that different environments and character sets may have a wide range of code sequences to divide lines.

Markdown variants are free to define additional parameters.

Content Disposition and preview-type

The Content-Disposition header RFC2183 conveys presentational information about a MIME entity, using a type and set of parameters. The parameter preview-type is defined here for Markdown content.

When present, preview-type indicates the Internet media type (and parameters) of the preview output desired from the processor by the author. With reference to the "paradigmatic use case" (i.e., collaborative Markdown editing) in Section 1.3, the preview-type parameter primarily affects the "right-hand" side of a Markdown editor. There is no default value: when absent, a Markdown user agent can render or display whatever it wants.

The value of this parameter is an Internet media type with optional parameters. The syntax (including case-sensitivity considerations) is the same as specified in RFC2045 for the Content-Type header (with updates over time, e.g., RFC2231 and RFC6532).

Implementations SHOULD anticipate and support HTML (text/html) and XHTML (application/xhtml+xml) output, to the extent that a syntax targets those markup languages. These types ought to be suitable for the majority of current purposes. However, Markdown is increasingly becoming integral to workflows where HTML is not the target output; examples range from TeX, to PDF, to Outline Processor Markup Language (OPML), and even to entire e-books (e.g., [PANDOC]).

The reflexive media type text/markdown in this parameter value means that the author does not want to invoke Markdown processing at all: the receiver SHOULD present the Markdown source as is.

The preview-type parameter can be used for other types of content, but the precise semantics are not defined here.

Example

The following is an example of Markdown as an email attachment:

MIME-Version: 1.0
Content-Type: text/markdown; charset=UTF-8; variant=Original
Content-Disposition: attachment; filename=readme.md;
 preview-type="application/xhtml+xml"
Sample HTML 4 Markdown
=============
This is some sample Markdown. [Hooray!][foo]
(Remember that link identifiers are not case-sensitive.)
Bulleted Lists
-------
Here are some bulleted lists...
* One Potato
* Two Potato
* Three Potato
- One Tomato
- Two Tomato
- Three Tomato
More Information
-----------
[.markdown, .md](http://daringfireball.net/projects/markdown/)
has more information.
[fOo]: http://example.com/loc 'Will Not Work with Markdown.pl-1.0.1'

IANA Considerations

IANA has registered the media type text/markdown using the application provided in Section 2 of this document.

IANA has registered preview-type in the "Content Disposition Parameters" subregistry of the "Content Disposition Values and Parameters" registry, with the following description: "Internet media type (and parameters) of the preview output desired from a processor by the author of the MIME content".

Markdown Variants

IANA has established a registry called "Markdown Variants". While the registry has been created in the context of the text/markdown media type, the registry is intended for broad community use, so protocols and systems that do not rely on Internet media types can still tag Markdown content with a common variant identifier. Each entry in this registry shall consist of basic information about the variant:

Identifier: unique identifier for the variant Name: the commonly known name of the variant Description: a prose description of the variant, including

                       how it differs from other variants such as
                       Original

Additional Parameters*: additional Content-Type parameters Fragment Identifiers*: additional fragment identifier syntaxes and

                       semantics

References: URIs or other references to documentation Contact Information: whom to contact (email, URI, phone, address,

                       etc.)

Expiration Date^: when this provisional registration expires

  • (optional)

^ (if provisional)

While the variant parameter is "plain US-ASCII" (see registration template), the Identifier field (and by implication, all registered identifiers) SHALL conform to the ABNF RFC5234:

  ALPHA [*VCHAR (ALPHA / DIGIT)]

For style and compatibility reasons, the Identifier field SHOULD conform to the ABNF:

  ALPHA *( ["-" / "." / "_" / "~"] 1*(ALPHA / DIGIT) )

That is, the identifier MUST start with a letter and MAY contain punctuation in the middle, but not at the end: the last character MUST be alphanumeric. The second production uses the same characters as the "unreserved" rule of RFC3986 and is designed to be compatible with characters in other identification systems, e.g., filenames. Since the identifier MAY be displayed to a user -- particularly in cases where the receiver does not recognize the identifier -- the identifier SHOULD be rationally related to the vernacular name of the variant.

The Name, Description, Additional Parameters, Fragment Identifiers, References, and Contact Information fields SHALL be in a Unicode character set (e.g., UTF-8).

The registry includes the registration in Section 6.1.4 (Original Markdown). RFC7764 includes additional registrations.

Reserved Identifiers

The registry has the following identifiers RESERVED, as they have engendered some controversy in the Markdown community. No one is allowed to register them (or any case variations of them). These identifiers are not and cannot be registered:

  Standard
  Common
  Markdown

The registry includes the following text in the note field: The variant names Standard, Common, and Markdown are reserved and cannot be registered.

Standard of Review

Registrations are made on a First Come, First Served RFC5226 basis by anyone with a need to interoperate. While documentation is required, any level of documentation is sufficient; thus, neither Specification Required nor Expert Review are warranted. The checks prescribed by this section can be performed automatically.

All references (including contact information) MUST be verified as functional at the time of the registration.

As a special "escape valve", registrations can be updated with IETF Review RFC5226. All fields may be updated except the variant identifier, which is permanent: not even case may be changed.

Provisional Registration

Any registrant may make a provisional registration to reserve a variant identifier. Only the variant identifier and contact information fields are required; the rest are optional. Provisional registrations expire after three months, after which time the variant identifier may be reused. To make a registration permanent, a registrant simply needs to complete a permanent registration with the same identifier as the provisional registration.

Original Markdown

The registry includes this initial variant. A conforming implementation that processes the variant parameter MUST recognize this variant (although the processing behavior is not defined here).

Identifier: Original

Name: Markdown

Description:

  Gruber's original Markdown syntax.

References:

  [MARKDOWN]
  [MDSYNTAX]

Contact Information:

  (individual) John Gruber <http://daringfireball.net/>
                           <[email protected]>

Security Considerations

See the Security considerations entry in Section 2.

References

Normative References

[MARKDOWN] Gruber, J., "Daring Fireball: Markdown", December 2004,

          <http://daringfireball.net/projects/markdown/>.

[MDSYNTAX] Gruber, J., "Daring Fireball: Markdown Syntax

          Documentation", December 2004,
          <http://daringfireball.net/projects/markdown/syntax>.

[MDUTI] Gruber, J., "Daring Fireball: Uniform Type Identifier for

          Markdown", August 2011,
          <http://daringfireball.net/linked/2011/08/05/
          markdown-uti>.

RFC2045 Freed, N. and N. Borenstein, "Multipurpose Internet Mail

          Extensions (MIME) Part One: Format of Internet Message
          Bodies", RFC 2045, DOI 10.17487/RFC2045, November 1996,
          <http://www.rfc-editor.org/info/rfc2045>.

RFC2119 Bradner, S., "Key words for use in RFCs to Indicate

          Requirement Levels", BCP 14, RFC 2119,
          DOI 10.17487/RFC2119, March 1997,
          <http://www.rfc-editor.org/info/rfc2119>.

RFC2183 Troost, R., Dorner, S., and K. Moore, Ed., "Communicating

          Presentation Information in Internet Messages: The
          Content-Disposition Header Field", RFC 2183,
          DOI 10.17487/RFC2183, August 1997,
          <http://www.rfc-editor.org/info/rfc2183>.

RFC2231 Freed, N. and K. Moore, "MIME Parameter Value and Encoded

          Word Extensions: Character Sets, Languages, and
          Continuations", RFC 2231, DOI 10.17487/RFC2231, November
          1997, <http://www.rfc-editor.org/info/rfc2231>.

RFC3778 Taft, E., Pravetz, J., Zilles, S., and L. Masinter, "The

          application/pdf Media Type", RFC 3778,
          DOI 10.17487/RFC3778, May 2004,
          <http://www.rfc-editor.org/info/rfc3778>.

RFC3986 Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform

          Resource Identifier (URI): Generic Syntax", STD 66,
          RFC 3986, DOI 10.17487/RFC3986, January 2005,
          <http://www.rfc-editor.org/info/rfc3986>.

RFC5147 Wilde, E. and M. Duerst, "URI Fragment Identifiers for the

          text/plain Media Type", RFC 5147, DOI 10.17487/RFC5147,
          April 2008, <http://www.rfc-editor.org/info/rfc5147>.

RFC5226 Narten, T. and H. Alvestrand, "Guidelines for Writing an

          IANA Considerations Section in RFCs", BCP 26, RFC 5226,
          DOI 10.17487/RFC5226, May 2008,
          <http://www.rfc-editor.org/info/rfc5226>.

RFC5234 Crocker, D., Ed., and P. Overell, "Augmented BNF for

          Syntax Specifications: ABNF", STD 68, RFC 5234,
          DOI 10.17487/RFC5234, January 2008,
          <http://www.rfc-editor.org/info/rfc5234>.

RFC6532 Yang, A., Steele, S., and N. Freed, "Internationalized

          Email Headers", RFC 6532, DOI 10.17487/RFC6532, February
          2012, <http://www.rfc-editor.org/info/rfc6532>.

RFC6838 Freed, N., Klensin, J., and T. Hansen, "Media Type

          Specifications and Registration Procedures", BCP 13,
          RFC 6838, DOI 10.17487/RFC6838, January 2013,
          <http://www.rfc-editor.org/info/rfc6838>.

Informative References

[HUMANE] Atwood, J., "Is HTML a Humane Markup Language?", May 2008,

          <http://blog.codinghorror.com/
          is-html-a-humane-markup-language/>.

[INETMEME] Solon, O., "Richard Dawkins on the internet's hijacking of

          the word 'meme'", June 2013,
          <http://www.wired.co.uk/news/archive/2013-06/20/
          richard-dawkins-memes>,
          <http://www.webcitation.org/6HzDGE9Go>.

[ISO646] International Organization for Standardization,

          "Information technology - ISO 7-bit coded character set
          for information interchange", ISO Standard 646, 1991.

[PANDOC] MacFarlane, J., "Pandoc", 2014,

          <http://johnmacfarlane.net/pandoc/>.

RFC2046 Freed, N. and N. Borenstein, "Multipurpose Internet Mail

          Extensions (MIME) Part Two: Media Types", RFC 2046,
          DOI 10.17487/RFC2046, November 1996,
          <http://www.rfc-editor.org/info/rfc2046>.

RFC4263 Lilly, B., "Media Subtype Registration for Media Type

          text/troff", RFC 4263, DOI 10.17487/RFC4263, January 2006,
          <http://www.rfc-editor.org/info/rfc4263>.

RFC7764 Leonard, S., "Guidance on Markdown: Design Philosophies,

          Stability Strategies, and Select Registrations", RFC 7764,
          DOI 10.17487/RFC7764, March 2016,
          <http://www.rfc-editor.org/info/rfc7764>.

[UNICODE] The Unicode Consortium, "The Unicode Standard, Version

          8.0", (Mountain View, CA: The Unicode Consortium, 2015.
          ISBN 978-1-936213-10-8),
          <http://www.unicode.org/versions/Unicode8.0.0/>.

Author's Address

Sean Leonard Penango, Inc. 5900 Wilshire Boulevard 21st Floor Los Angeles, CA 90036 United States

Email: [email protected] URI: http://www.penango.com/