HTTP

The HyperText Transfer Protocol (HTTP)

HTTP is the heart of the web. It is a well-designed protocol and consequently has many uses beyond communication between web browsers and web servers.

This presentation describes HTTP using information and terminology from Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing and Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content, the first two parts of the most recent Internet Engineering Task Force (IETF) specification for HTTP/1.1. The "References" section of this presentation contains links to the complete specification.

The abstract of the specification gives a good summary of the objectives of HTTP:

The Hypertext Transfer Protocol (HTTP) is a stateless application-level protocol for distributed, collaborative, hypertext information systems.

The HTTP protocol

integrates both data and control over a single bidirectional communication channel,
supports the transfer of data with any type of encoding, and
uses an extensible control information structure.

Like many communication protocols, HTTP has two kinds of messages:

A request is sent by a client to request a service from a server.
A response is sent by a server to respond to a client request.

The HTTP Message Structure

An HTTP message has two major parts:

a mandatory header that primarily carries control information, and
an optional message body that primarily carries data such as the response to a request.

Any kind of data coding can be used in the message body. The coding of the header is simple and tightly controlled, but flexible. The header must at least provide enough information so that the receiver of the message knows how to decode the message body.

The flexibility of the HTTP protocol is a result of this message structure. A similar message structure pattern is used in many internet protocols.
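The two-part structure can be illustrated with a minimal Python sketch that splits a raw message at the blank line separating the header from the body. The function name split_message and the sample message are illustrative assumptions, and the sketch ignores details such as repeated field names and chunked bodies.

```python
def split_message(raw: bytes):
    """Split a raw HTTP/1.1 message into start line, header fields, and body."""
    # The header ends at the first empty line (CRLF CRLF); the rest is the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line, field_lines = lines[0], lines[1:]
    headers = {}
    for line in field_lines:
        name, _, value = line.partition(":")
        # Field names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# Hypothetical sample response used only for illustration.
raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
start_line, headers, body = split_message(raw)
print(start_line)
print(headers)
print(body)
```

The header is decoded as text because its coding is tightly controlled, while the body is left as raw bytes until the header fields say how to interpret it.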

The HTTP Message Body

Since HTTP is designed for dealing with arbitrary types of data, the message body can use any kind of coding. To help the receiver use it, the header must contain header lines that describe the coding. For example, a "Content-Type" header line can specify the MIME type of the message body, and a "Content-Encoding" header line can specify encryption or compression applied to it.
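As a rough illustration of how a receiver might use these two header lines, the Python sketch below undoes a gzip "Content-Encoding" and then decodes the text using the charset named in "Content-Type". The function name decode_body, the plain-dictionary header representation, and the UTF-8 fallback are assumptions made for this example; only gzip is handled.

```python
import gzip
from email.message import Message


def decode_body(headers: dict, body: bytes) -> str:
    """Decode a text message body using its Content-* header lines."""
    # Step 1: undo the content coding (only gzip is handled in this sketch).
    if headers.get("Content-Encoding", "").lower() == "gzip":
        body = gzip.decompress(body)
    # Step 2: read the charset parameter from the Content-Type value.
    msg = Message()
    msg["Content-Type"] = headers.get("Content-Type", "text/plain")
    charset = msg.get_content_charset() or "utf-8"  # fallback is an assumption
    return body.decode(charset)


# Hypothetical header lines and body for illustration.
headers = {
    "Content-Type": "text/plain; charset=utf-8",
    "Content-Encoding": "gzip",
}
body = gzip.compress("héllo".encode("utf-8"))
text = decode_body(headers, body)
print(text)
```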

Request Methods

GET
Transfer a current representation of the target resource.
HEAD
Same as GET, but only transfer the status line and header section.
POST
Perform resource-specific processing on the request payload.
PUT
Replace all current representations of the target resource with the request payload.
DELETE
Remove all current representations of the target resource.
CONNECT
Establish a tunnel to the server identified by the target resource.
OPTIONS
Describe the communication options for the target resource.
TRACE
Perform a message loop-back test along the path to the target resource.

General-purpose web servers are required to support only the GET and HEAD methods, and in practice most also support POST. However, HTTP is used for many purposes where the other methods are needed.

Web pages frequently contain forms that the user fills in, with a "Submit" button to send data to the server. When this button is clicked, a request is sent to the server. For a small amount of data the GET method is typically used, and the form data is encoded into the request URI since GET requests normally do not have message bodies. The POST method does allow a message body, so it is used for larger amounts of data.
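The difference between the two form submission styles can be sketched in Python. The form fields, the /search path, and the host www.example.com are made-up values for illustration. The same encoded form data ends up in the request URI for GET and in the message body for POST.

```python
from urllib.parse import urlencode

# Hypothetical form fields.
form = {"q": "http protocol", "lang": "en"}
encoded = urlencode(form)

# GET: the form data travels in the request URI; there is no message body.
get_request = (
    f"GET /search?{encoded} HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "\r\n"
)

# POST: the same data travels in the message body, described by header lines.
body = encoded.encode("ascii")
post_request = (
    "POST /search HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(body)}\r\n"
    "\r\n"
).encode("ascii") + body

print(get_request)
print(post_request.decode("ascii"))
```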

Request Header Fields

The field names below are a few of the many standardized field names that can be used in an HTTP request.

Host
the host that holds the requested resource
Accept
a list of the MIME types the requester is willing to accept
Accept-Charset
a list of the character sets the requester is willing to accept
Accept-Encoding
a list of the compression or encryption encodings the requester is willing to accept
Accept-Language
a list of the languages the requester is willing to accept
If-Modified-Since
used for conditional requests: the server need not send a message body if the requested resource has not been modified since the given date
If-Unmodified-Since
used for conditional requests: the server performs the request only if the requested resource has not been modified since the given date
User-Agent
the software that is making the request
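To make the two conditional fields concrete, here is a simplified Python sketch of how a server might evaluate them against a resource's last modification time. The function name evaluate_conditions and the dates are assumptions for illustration. The status codes used are 304 (Not Modified) and 412 (Precondition Failed), and other preconditions defined by the specification are ignored.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def evaluate_conditions(headers: dict, last_modified: datetime) -> int:
    """Return the status code a server might choose for a conditional request."""
    unmodified_since = headers.get("If-Unmodified-Since")
    if unmodified_since is not None:
        # Refuse the request if the resource changed after the given date.
        if last_modified > parsedate_to_datetime(unmodified_since):
            return 412
    modified_since = headers.get("If-Modified-Since")
    if modified_since is not None:
        # Skip the message body if the resource did not change after the given date.
        if last_modified <= parsedate_to_datetime(modified_since):
            return 304
    return 200


# Hypothetical resource timestamp and request dates.
last_modified = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
later = format_datetime(datetime(2024, 1, 2, tzinfo=timezone.utc), usegmt=True)
earlier = format_datetime(datetime(2023, 12, 31, tzinfo=timezone.utc), usegmt=True)

print(evaluate_conditions({"If-Modified-Since": later}, last_modified))
print(evaluate_conditions({"If-Modified-Since": earlier}, last_modified))
print(evaluate_conditions({"If-Unmodified-Since": earlier}, last_modified))
```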

Status and Reason Phrases

The HTTP response status code is a 3-digit code indicating the status of the response. The codes each have a standard reason phrase.

The status codes fall into five general categories:

1xx (Informational)
The request was received, continuing process
2xx (Successful)
The request was successfully received, understood, and accepted
3xx (Redirection)
Further action needs to be taken in order to complete the request
4xx (Client Error)
The request contains bad syntax or cannot be fulfilled
5xx (Server Error)
The server failed to fulfill an apparently valid request
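Because the category is determined by the first digit, a status code can be classified with integer division. The Python sketch below does this and looks up the standard reason phrase with the standard library's http.HTTPStatus. The names CATEGORIES and describe are assumptions made for this example.

```python
from http import HTTPStatus

# The first digit of the status code selects the category.
CATEGORIES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code: int):
    """Return (category, reason phrase) for a 3-digit status code."""
    category = CATEGORIES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Not a code known to Python's standard table; no phrase available.
        phrase = ""
    return category, phrase


for code in (200, 304, 404, 500):
    print(code, *describe(code))
```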

Header Fields

The field names below are a few of the many standardized field names that can be used in an HTTP response.

Content-Type
the MIME type of the message body
Content-Encoding
a list of encodings (compression, encryption) applied to the message body
Content-Length
the length of the message body
Date
the timestamp of the response
Last-Modified
when the response body was last modified
Server
the server that provided the response
Allow
the methods that the server supports for the target resource
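A few of these fields can be seen together in a minimal Python sketch that assembles a response message. The function name build_response, the server name ExampleServer/0.1, and the sample body are illustrative assumptions, and the status line is fixed at 200 OK for simplicity.

```python
from email.utils import formatdate


def build_response(body: bytes, content_type: str) -> bytes:
    """Assemble a simple 200 OK response with common header fields."""
    head = (
        "HTTP/1.1 200 OK\r\n"
        f"Date: {formatdate(usegmt=True)}\r\n"   # timestamp of the response
        "Server: ExampleServer/0.1\r\n"          # hypothetical server name
        f"Content-Type: {content_type}\r\n"      # MIME type of the body
        f"Content-Length: {len(body)}\r\n"       # length of the body in bytes
        "\r\n"
    )
    return head.encode("ascii") + body


page = b"<p>Hello</p>\n"
response = build_response(page, "text/html; charset=utf-8")
print(response.decode("ascii"))
```

Content-Length is computed from the encoded bytes rather than from a character count, since the field describes the body as it is transmitted.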
