Email address

From Wikipedia, the free encyclopedia

An email address identifies a mailbox to which email messages are delivered. The universal standard for the format and meaning of an email address today is the model developed for Internet electronic mail systems since the 1980s, but some earlier systems and many proprietary commercial email systems used different address formats.

An email address such as John.Smith@example.com is made up of a local part, an @ symbol, then a domain part. The domain part is not case-sensitive, but local-parts may be. In practice, the mail system at example.com may choose to treat John.Smith as equivalent to john.smith or even johnsmith.[1] Mail systems often limit their users' choice of name to a subset of the technically valid characters, and may in some cases also limit the addresses to which mail can be sent.

With the introduction of internationalized domain names, efforts are progressing to permit non-ASCII characters in email addresses.

Overview

The transmission of electronic mail within the Internet uses the Simple Mail Transfer Protocol (SMTP), defined in RFC 5321, together with the message format defined in RFC 5322 and extensions such as RFC 6531. Users access and manage their mailboxes with the Post Office Protocol (POP) or the Internet Message Access Protocol (IMAP), using email client software that runs on a personal computer or mobile device, or through webmail systems that render the messages on screen or for printing.

The general format of an email address is localpart@domain, and a specific example is jsmith@example.org. An address consists of two parts. The part before the @ sign (localpart) identifies the name of a mailbox. This is often the username of the recipient, e.g., jsmith. The part after the @ sign is a domain name that represents the administrative realm for the mailbox, e.g., a company's domain name, example.org.

A mail server uses the Domain Name System (DNS) to locate the destination mail server for the domain of the recipient by querying for mail exchanger records (MX records). The organization holding the delegation for a given domain, the mailbox provider, can define the target hosts for all email destined to its domain. The mail exchanger does not need to be located in the domain of the destination mailbox; however, it must accept mail for the domain. The target hosts are configured with a mechanism to deliver mail to all destination mailboxes. If no mail exchangers are configured, a mail sender directly queries the address record (A record or AAAA record) for the domain name in the email address.
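
The host selection described above can be sketched in Python as follows. This is an illustration only: the function name delivery_targets and the (preference, host) tuple layout are assumptions of this sketch, and the DNS query itself is left out, so the MX records are passed in as already-resolved data. Trying hosts with lower preference values first follows the usual MX convention.

```python
from typing import List, Tuple


def delivery_targets(domain: str, mx_records: List[Tuple[int, str]]) -> List[str]:
    """Return the hosts to try, in order, for mail addressed to `domain`.

    `mx_records` holds (preference, host) pairs that a DNS lookup has
    already returned; performing the lookup is outside this sketch.
    """
    if not mx_records:
        # No mail exchangers configured: fall back to the address record
        # (A or AAAA) of the domain name itself.
        return [domain]
    # Hosts with lower preference values are tried first.
    return [host for _, host in sorted(mx_records)]
```

For example, a domain with records (20, mx2.example.net) and (10, mx1.example.net) would be tried in the order mx1.example.net, then mx2.example.net.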

The local-part of an email address has no significance for intermediate mail relay systems other than the final mailbox host. Email senders and intermediate mail relay systems must not assume it to be case-insensitive, since the final mailbox host may or may not treat it as such. A single mailbox may receive mail for multiple email addresses, if configured by the administrator. Conversely, a single email address may be an alias for a distribution list that delivers to many mailboxes. Email aliases, electronic mailing lists, sub-addressing, and catch-all addresses, the latter being mailboxes that receive messages regardless of the local part, are common patterns for achieving a variety of delivery goals.

The addresses found in the header fields of an email message are not directly used by mail exchangers to deliver the message. An email message also contains a message envelope that contains the information for mail routing. While envelope and header addresses may be equal, forged email addresses are often seen in spam, phishing, and many other Internet-based scams. This has led to several initiatives which aim to make such forgeries easier to spot.

To indicate the message recipient, an email address may also have an associated display name for the recipient, which is followed by the address specification surrounded by angle brackets, for example: John Smith <john.smith@example.org>.
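
As a brief illustration, Python's standard library can split and rebuild this form. The variable names are chosen only for this sketch, and the example covers only this simple case rather than the full header grammar.

```python
from email.utils import formataddr, parseaddr

# Split a "display name <address>" string into its two components.
name, addr = parseaddr("John Smith <john.smith@example.org>")

# Rebuild the combined form from a (display name, address) pair.
header_value = formataddr(("John Smith", "john.smith@example.org"))
```

Here name holds the display name, addr holds the bare address, and header_value is the recombined string.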

Earlier forms of email addresses on networks other than the Internet included other notations, such as that required by X.400, and the UUCP bang path notation, in which the address was given in the form of a sequence of computers through which the message should be relayed. This was widely used for several years, but was superseded by the Internet standards promulgated by the Internet Engineering Task Force (IETF).

Syntax

The format of email addresses is local-part@domain, where the local-part may be up to 64 characters long and the domain name may have a maximum of 253 characters. However, the 256-character limit on a forward or reverse path restricts the entire email address to no more than 254 characters.[2] The formal definitions are in RFC 5322 (sections 3.2.3 and 3.4.1) and RFC 5321, with a more readable form given in the informational RFC 3696[3] and the associated errata.
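
A minimal sketch of these length limits in Python is given below. The function name within_length_limits is hypothetical. The sketch also assumes that lengths are counted in octets of the UTF-8 encoding (identical to the character count for ASCII addresses) and that the address is split at the last @ sign, since a quoted local-part may itself contain one.

```python
def within_length_limits(address: str) -> bool:
    """Check only the length limits; the syntax itself is not validated."""
    # Split at the last "@", because a quoted local-part may contain "@".
    local, sep, domain = address.rpartition("@")
    if not sep:
        return False  # no "@" at all
    return (
        1 <= len(local.encode("utf-8")) <= 64
        and 1 <= len(domain.encode("utf-8")) <= 253
        and len(address.encode("utf-8")) <= 254
    )
```

Passing this check says nothing about whether the characters used are valid; it covers only the three size limits stated above.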

Local part

The local-part of the email address may use any of these ASCII characters.[4] RFC 6531 permits Unicode characters beyond the ASCII range:

  • Uppercase and lowercase English letters (A–Z, a–z) (ASCII: 65–90, 97–122)
  • Digits 0 to 9 (ASCII: 48–57)
  • These special characters: ! # $ % & ' * + - / = ? ^ _ ` { | } ~
  • Character . (dot, period, full stop) (ASCII: 46) provided that it is not the first or last character, and provided also that it does not appear consecutively (e.g. John..Doe@example.com is not allowed).
  • The following special characters are allowed only with restrictions:
    • Space and "(),:;<>@[\] (ASCII: 32, 34, 40, 41, 44, 58, 59, 60, 62, 64, 91–93)
    • These characters may be used only inside a quoted string (between quotation marks), and two of them, the backslash \ and the quotation mark " (ASCII: 92, 34), must additionally be preceded by a backslash \ (e.g. "\\\"").[citation needed]
  • Comments are allowed with parentheses at either end of the local part; e.g. "john.smith(comment)@example.com" and "(comment)john.smith@example.com" are both equivalent to "john.smith@example.com".
  • International characters above U+007F, encoded as UTF-8, are permitted by RFC 6531, though mail systems may restrict which characters to use when assigning local parts.
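
The rules for the unquoted form in the list above can be approximated with a regular expression. The sketch below is deliberately partial: it accepts only ASCII local-parts without quoted strings or comments, and the names _ATEXT, _DOT_ATOM and is_unquoted_local_part are hypothetical.

```python
import re

# Characters usable without quoting: letters, digits and the listed specials.
_ATEXT = r"[A-Za-z0-9!#$%&'*+\-/=?^_`{|}~]"
# Runs of those characters separated by single dots, so a dot can be
# neither the first nor the last character, nor appear twice in a row.
_DOT_ATOM = re.compile(rf"{_ATEXT}+(?:\.{_ATEXT}+)*")


def is_unquoted_local_part(local: str) -> bool:
    """True if `local` is a valid unquoted ASCII local-part of <= 64 chars."""
    return len(local) <= 64 and _DOT_ATOM.fullmatch(local) is not None
```

Under these assumptions john.smith is accepted, while John..Doe and any local-part containing a space or an unquoted quotation mark are rejected. A quoted local-part would also be rejected, even though it may be valid.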

A quoted string may exist as a dot-separated entity within the local-part, or it may exist when the outermost quotes are the outermost characters of the local-part (e.g. abc."defghi".xyz@example.com and "abcdefghixyz"@example.com are allowed; conversely, abc"defghi"xyz@example.com is not, and neither is abc\"def\"ghi@example.com). Quoted strings and quoted characters are, however, not commonly used. RFC 5321 also warns that "a host that expects to receive mail SHOULD avoid defining mailboxes where the Local-part requires (or uses) the Quoted-string form".

The local-part postmaster is treated specially: it is case-insensitive, and mail addressed to it should be forwarded to the domain's email administrator. Technically all other local-parts are case-sensitive, so jsmith@example.com and JSmith@example.com specify different mailboxes; however, many organizations treat uppercase and lowercase letters as equivalent.

Most organizations do not allow use of many of the technically valid special characters. Organizations are free to restrict the forms of their own email addresses as desired. Windows Live Hotmail, for example, only allows creation of email addresses using alphanumerics, dot (.), underscore (_) and hyphen (-).[5]

Systems that send mail must be capable of handling outgoing mail for all valid addresses. Contrary to the relevant standards, some defective systems treat certain legitimate addresses as invalid and fail to handle mail to these addresses. Hotmail, for example, refuses to send mail to any address containing any of the following standards-permissible characters: !#$%*/?^`{|}~.[citation needed]

Domain part

The domain name part of an email address has to conform to strict guidelines: it must match the requirements for a hostname, consisting of letters, digits, hyphens and dots. In addition, the domain part may be an IP address literal, surrounded by square brackets, such as jsmith@[192.168.2.1] or jsmith@[IPv6:2001:db8::1], although this is rarely seen except in email spam. Internationalized domain names (which are encoded to comply with the requirements for a hostname) allow for presentation of non-ASCII domain parts. In mail systems compliant with RFC 6531 and RFC 6532, an email address may be encoded as UTF-8 in both the local part and the domain name.
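
The ASCII cases described above (hostname or bracketed IP address literal) can be sketched as follows. The function name is_valid_domain_part is hypothetical. The per-label rules used here (1 to 63 characters, no leading or trailing hyphen) are the usual hostname conventions and go beyond what this article states. Internationalized names are assumed to have already been converted to their ASCII form.

```python
import ipaddress
import re

# One hostname label: letters, digits and hyphens, 1 to 63 characters,
# neither starting nor ending with a hyphen.
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_domain_part(domain: str) -> bool:
    """Check an ASCII domain part: a hostname or a bracketed IP literal."""
    if domain.startswith("[") and domain.endswith("]"):
        literal = domain[1:-1]
        try:
            if literal.startswith("IPv6:"):
                ipaddress.IPv6Address(literal[len("IPv6:"):])
            else:
                ipaddress.IPv4Address(literal)
        except ValueError:
            return False
        return True
    if len(domain) > 253:
        return False
    # An empty label (leading, trailing or doubled dot) fails the match.
    return all(_LABEL.fullmatch(label) for label in domain.split("."))
```

A single-label name such as localserver passes this check, whereas example..com fails because of the empty label between the two dots.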

Comments are allowed in the domain part as well as in the local part. For example, "john.smith@(comment)example.com" and "john.smith@example.com(comment)" are equivalent to "john.smith@example.com".

Examples

Valid email addresses

  • niceandsimple@example.com
  • very.common@example.com
  • a.little.lengthy.but.fine@dept.example.com
  • disposable.style.email.with+symbol@example.com
  • other.email-with-dash@example.com
  • user@localserver

Invalid email addresses

  • Abc.example.com (an @ character must separate the local and domain parts)
  • A@b@c@example.com (only one @ is allowed outside quotation marks)
  • a"b(c)d,e:f;g<h>i[j\k]l@example.com (none of the special characters in this local part is allowed outside quotation marks)
  • just"not"right@example.com (quoted strings must be dot separated or the only element making up the local-part)
  • this is"not\allowed@example.com (spaces, quotes, and backslashes may only exist when within quoted strings and preceded by a backslash)
  • this\ still\"not\\allowed@example.com (even if escaped (preceded by a backslash), spaces, quotes, and backslashes must still be contained by quotes)
  • john..doe@example.com (double dot before @)
  • john.doe@example..com (double dot after @)

Common local-part semantics

According to RFC 5321, section 2.3.11 (Mailbox and Address), "...the local-part MUST be interpreted and assigned semantics only by the host specified in the domain part of the address." This means that no assumptions can be made about the meaning of the local-part of an address handled by another mail server. It is entirely up to the configuration of that mail server.

Local-part normalization

Interpretation of the local-part of an email address is dependent on the conventions and policies implemented in the mail server. For example, case sensitivity may distinguish mailboxes differing only in capitalization of characters of the local-part, although this is not very common.[6] Gmail ignores all dots in the local-part for the purposes of determining account identity.[7] This prevents the creation of user accounts your.user.name or yourusername when the account your.username already exists.
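
A mailbox host that applies such a policy might canonicalize addresses along the following lines. This is a hypothetical sketch: the function name canonical_mailbox and the dot_insensitive_domains parameter are invented for illustration, and only the host responsible for a domain can know which rules actually apply to it. The local-part is otherwise left untouched, including its capitalization.

```python
def canonical_mailbox(address, dot_insensitive_domains=frozenset({"gmail.com"})):
    """Return a canonical form of `address` under an assumed dot policy."""
    local, _, domain = address.rpartition("@")
    domain = domain.lower()  # the domain part is not case-sensitive
    if domain in dot_insensitive_domains:
        # Policy of the kind described above: dots do not affect identity.
        local = local.replace(".", "")
    return f"{local}@{domain}"
```

With this policy, your.user.name@gmail.com and yourusername@gmail.com map to the same canonical form, while an address at any other domain keeps its dots.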

Address tags

Some mail services allow a user to append a tag to their email address; for example, where joeuser@example.com is the main address, the service would also accept mail for joeuser+work@example.com or joeuser-family@example.com. The text of the tag may be used to apply filtering and to create single-use addresses.[8] Some IETF standards-track documents, such as RFC 5233, refer to this convention as "sub-addressing". However, many websites' automatic form validation scripts or software will reject + as an invalid character in the email address. (In some cases, Facebook for example, service providers will incongruously use address tags for legitimate purposes in their own outbound email, but disallow address tags for their subscribers or users.)

Disposable email addresses of this form, using various separators between the base name and the tag, are supported by several email services, including Runbox (plus and hyphen), Gmail (plus),[9] Yahoo! Mail Plus (hyphen),[10] Apple's iCloud (plus), Outlook.com (plus),[11] FastMail (plus and Subdomain Addressing),[12] and MMDF (equals).

Most installations[which?] of the qmail and Courier Mail Server products support the use of a hyphen '-' as a separator within the local-part, such as joeuser-tag@example.com or joeuser-tag-sub-anything-else@example.com. This allows qmail through .qmail-default or .qmail-tag-sub-anything-else files to sort, filter, forward, or run an application based on the tagging system established.[13][14]

Postfix allows an arbitrary separator from the legal character set to be configured. The separator and tag remain part of the address (the address is not rewritten to remove them), so they can be used for internal mail routing, filtering, and forwarding via any of the mechanisms existing in Postfix.[15]
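
The tagging conventions described in this section amount to splitting the local-part at a separator. A minimal Python sketch follows; the function name split_tag is hypothetical, and splitting at the first occurrence of the separator is an assumption of this sketch rather than a documented rule of any particular mail server.

```python
from typing import Optional, Tuple


def split_tag(local_part: str, delimiter: str = "+") -> Tuple[str, Optional[str]]:
    """Split a local-part into (base name, tag); tag is None if absent."""
    base, sep, tag = local_part.partition(delimiter)
    if not sep or not base:
        # No separator, or nothing before it: treat the whole as the base.
        return local_part, None
    return base, tag
```

For joeuser+work this yields the base joeuser and the tag work; with a hyphen as the separator, joeuser-tag-sub-anything-else yields joeuser and tag-sub-anything-else.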

Validation and verification

Websites often request an email address as user identification, and the input is then subject to data validation.

An email address is generally recognized as having two parts joined with an at-sign (@). However, the technical specification detailed in RFC 822 and subsequent RFCs is more extensive, imposing complex and strict restrictions.[16]

No single technique can check all of these restrictions. Regular expressions that attempt to do so are long and still give incomplete results.[17]

A syntactically correct email address does not guarantee that the mailbox exists. Many mail servers therefore use other techniques to check existence, such as querying the Domain Name System for the domain part or using callback verification to check whether the mailbox exists. Such mailbox checks are, however, often disabled to prevent directory harvest attacks.

Ensuring that an email address is of good quality requires a combination of validation techniques. Large websites, bulk mailers, and spammers require fast algorithms that predict the validity of an email address. Such methods depend heavily on heuristic algorithms and statistical models.[18]

Conversely, many websites evaluate the validity of email addresses differently from the standard specification, rejecting addresses containing valid characters, such as + or / signs, or setting arbitrary length limitations (e.g., 30 characters). RFC 3696 provides specific advice for validating Internet identifiers, including email addresses.

The HTML5 forms implemented in many browsers allow email address validation to be handled by the browser, using the email state of the input element.[19]

Email address internationalization provides for a much larger range of characters than many current validation algorithms allow, such as all Unicode characters above U+007F, encoded as UTF-8.

Identity validation

Despite the growth of the World Wide Web as a primary interface for communication, email addresses remain the primary means of identity validation for website account activation, alongside alternatives such as mobile phone number, postal mail, and fax validation. This is usually accomplished by the website sending a temporary hyperlink to the user-provided email address; opening the link activates the account immediately. Email addresses are also useful as a means of forwarding messages from the website, such as user messages and notifications of user actions, to the user's inbox.
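
The activation step can be illustrated with a short Python sketch. Everything here is hypothetical: the function names, the in-memory _pending store, and the one-hour default lifetime are assumptions made for illustration, and a real site would persist the tokens and send the link by email.

```python
import secrets
import time

# Hypothetical in-memory store: token -> (email, expiry timestamp).
_pending = {}


def issue_activation_token(email, ttl_seconds=3600, now=None):
    """Create an unguessable single-use token to embed in the emailed link."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    _pending[token] = (email, now + ttl_seconds)
    return token


def redeem_activation_token(token, now=None):
    """Return the confirmed email, or None if unknown, reused or expired."""
    now = time.time() if now is None else now
    entry = _pending.pop(token, None)  # pop makes the token single-use
    if entry is None:
        return None
    email, expires_at = entry
    return email if now <= expires_at else None
```

A token that is redeemed in time returns the address it was issued for, and a second attempt with the same token returns None.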

Internationalization

The IETF conducts a technical and standards working group devoted to internationalization issues of email addresses, entitled Email Address Internationalization (EAI, also known as IMA – Internationalized Mail Address).[20] This group produced RFC 6530, RFC 6531, RFC 6532, and RFC 6533, and continues to work on additional EAI-related RFCs.

The IETF's EAI Working group published RFC 6530 "Overview and Framework for Internationalized Email", which enabled non-ASCII characters to be used in both the local and domain parts of an email address. RFC 6530 provides for email based on the UTF-8 encoding, which permits the full repertoire of Unicode. RFC 6531 provides a mechanism for SMTP servers to negotiate transmission of the SMTPUTF8 content.

The basic EAI concepts involve exchanging mail in UTF-8. Though the original proposal included a downgrading mechanism for legacy systems, this has now been dropped.[21] The local servers are responsible for the "local" part of the address, whereas the domain portion would be restricted by the rules of internationalized domain names, though still transmitted in UTF-8. The mail server is also responsible for any mapping mechanism between the IMA form and any ASCII alias.

EAI enables users to have a localized address in a native language script or character set, as well as an ASCII form for communicating with legacy systems or for script-independent use. Applications that recognize internationalized domain names and mail addresses must have facilities to convert these representations.
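
For the domain part, such a conversion can be sketched with Python's built-in idna codec, which implements the older IDNA 2003 rules and may therefore differ from current registries for some names. The three function names below are hypothetical. The last one assumes that a non-ASCII domain can always be sent in its ASCII form, so only a non-ASCII local-part forces the use of the SMTPUTF8 extension.

```python
def to_ascii_domain(domain: str) -> str:
    """Convert an internationalized domain name to its ASCII form."""
    return domain.encode("idna").decode("ascii")


def to_unicode_domain(domain: str) -> str:
    """Convert an ASCII-encoded domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")


def needs_smtputf8(address: str) -> bool:
    """True if the local-part contains non-ASCII characters."""
    local, _, _domain = address.rpartition("@")
    return not local.isascii()
```

Plain ASCII domain names pass through both conversions unchanged, so the functions can be applied to every address uniformly.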

Significant demand for such addresses is expected in China, Japan, Russia, and other markets that have large user bases in a non-Latin-based writing system.

Internationalization examples

The example addresses below would not be handled by servers based on RFC 5322, but are permitted by RFC 6530. Servers compliant with RFC 6530 can handle them:

  • Latin Alphabet (with diacritics): Pelé@example.com
  • Greek Alphabet: δοκιμή@παράδειγμα.δοκιμή
  • Traditional Chinese Characters: 我買@屋企.香港
  • Japanese Characters: 甲斐@黒川.日本
  • Cyrillic Characters: чебурашка@ящик-с-апельсинами.рф

Internationalization support

The Postfix mailer has supported[22] internationalized email since development version 20140715 (stable release 2.12). Modified versions of sendmail exist that support the EAI rules.[citation needed] Google supports sending email to and from internationalised domains, but does not allow the registration of non-ASCII email addresses.[23]

Standards documents

  • RFC 821 – Simple Mail Transfer Protocol (Obsoleted by RFC 2821)
  • RFC 822 – Standard for the Format of ARPA Internet Text Messages (Obsoleted by RFC 2822)
  • RFC 1035 – Domain names – implementation and specification
  • RFC 1123 – Requirements for Internet Hosts – Application and Support (Updated by RFC 2821, RFC 5321)
  • RFC 2142 – Mailbox Names for Common Services, Roles and Functions
  • RFC 2821 – Simple Mail Transfer Protocol (Obsoletes RFC 821, Updates RFC 1123, Obsoleted by RFC 5321)
  • RFC 2822 – Internet Message Format (Obsoletes RFC 822, Obsoleted by RFC 5322)
  • RFC 3696 – Application Techniques for Checking and Transformation of Names
  • RFC 4291 – IP Version 6 Addressing Architecture (Updated by RFC 5952)
  • RFC 5321 – Simple Mail Transfer Protocol (Obsoletes RFC 2821, Updates RFC 1123)
  • RFC 5322 – Internet Message Format (Obsoletes RFC 2822)
  • RFC 5952 – A Recommendation for IPv6 Address Text Representation (Updates RFC 4291)

References

  1. ^ "...you can add or remove the dots from a Gmail address without changing the actual destination address; they'll all go to your inbox...", Google.com
  2. ^ RFC 5321, section 4.5.3.1. Size Limits and Minimums explicitly details protocol limits.
  3. ^ Written by J. Klensin, the author of RFC 5321
  4. ^ RFC 5322 Section 3.2.3
  5. ^ The character limitation is written in plain English in the subscription page "Sign up for Windows Live". Retrieved 2008-07-26. However, the phrase is hidden, thus one has to either check the availability of an invalid ID, e.g. me#1, or resort to alternative displaying, e.g. no-style or source view, in order to read it.
  6. ^ Are Email Addresses Case Sensitive? by Heinz Tschabitscher
  7. ^ Google Mail – help center article
  8. ^ "Instant disposable Gmail addresses" by Gina Trapani 2005
  9. ^ Using an address alias
  10. ^ help.yahoo.com
  11. ^ Outlook.com supports simpler "+" email aliases too
  12. ^ FastMail's subdomain addressing
  13. ^ "dot-qmail – control the delivery of mail messages". Retrieved 27 January 2012. 
  14. ^ Sill, Dave. "4.1.5. extension addresses". Life with qmail. Retrieved 27 January 2012. 
  15. ^ Postfix configuration parameters: recipient_delimiter
  16. ^ I Knew How To Validate An Email Address Until I Read The RFC
  17. ^ Mail::RFC822::Address
  18. ^ Verification & Validation Techniques for Email Address Quality Assurance
  19. ^ [1]
  20. ^ "Eai Status Pages". Email Address Internationalization (Active WG). IETF. March 17, 2006-March 18, 2013. Retrieved July 26, 2008.
  21. ^ "Email Address Internationalization (eai)". IETF. Retrieved November 30, 2010. 
  22. ^ Postfix SMTPUTF8 support (unicode email addresses)
  23. ^ "A first step toward more global email". Google Official Blog. Google. Retrieved 6 August 2014. 
