Speeding up HTTP with minimal protocol changes
As SPDY works its way through IETF ratification I began wondering whether it was really necessary to add a complex, binary protocol to the HTTP suite to improve HTTP performance. One of the main things that SPDY sets out to fix is defined in the opening paragraph of the SPDY proposal:
There are probably reasons (that I've overlooked) why my proposal is a bad idea; what are they?
One of the bottlenecks of HTTP implementations is that HTTP relies on multiple connections for concurrency. This causes several problems, including additional round trips for connection setup, slow-start delays, and connection rationing by the client, where it tries to avoid opening too many connections to any single server. HTTP pipelining helps some, but only achieves partial multiplexing. In addition, pipelining has proven non-deployable in existing browsers due to intermediary interference.The solution to this problem (as currently proposed) is SPDY. But I couldn't help thinking that solving the multiplexing problem could be done in a simpler manner within HTTP itself. And so here is a partial proposal that involves adding two new headers to existing HTTP and nothing more.
1. Overview HMURR (pronounced 'hammer') introduces a new pipelining mechanism with explicit identifiers used to match requests and responses sent on the same TCP connection so that out-of-order responses are possible. The current HTTP 1.1 pipelining mechanism requires that responses be returned in the same order as requests are made (FIFO) which itself introduces a head-of-line blocking problem. In addition, HTTP 1.1 pipelining does not allow responses to be interleaved. When a response is transmitted the entire response must be sent before a later response can be transmitted. HMURR introduces a chunking mechanism that allows partial responses to be sent. This enables multiple responses to be interleaved on a single connection preventing a long response from starving out shorter ones. HMURR attempts to preserve the existing semantics of HTTP. All features such as cookies, ETags, Vary headers, Content-Encoding negotiations, etc. work as they do with HTTP; HMURR simply introduces an explicit multiplexing mechanism. HMURR introduces two new HTTP headers: one header that is used for requests and responses and one that is only present in responses. No changes are made to other HTTP headers or HTTP responses. 2. HTTP Version It is intended that HMURR be a modification to the existing HTTP standard RFC 2616 and requires a higher HTTP version number. Either HTTP 1.2 or HTTP 2.0 would be suitable. 3. HMURR Operation 3.1. Pipelining A client that supports persistent connections MAY "pipeline" its requests (i.e., send multiple requests without waiting for each response). Each request must contain a Request-ID header specifying a unique identifier used by the client to identify the request. When responding to a request the server will each the Request-ID header with the same value so that the client can match requests and responses. This mechanism allows HTTP responses to be returned in any order. Clients which assume persistent connections and pipeline immediately after connection establishment SHOULD be prepared to retry their connection if the first pipelined attempt fails. If a client does such a retry, it MUST NOT pipeline before it knows the connection is persistent. Clients MUST also be prepared to resend their requests if the server closes the connection before sending all of the corresponding responses. Clients SHOULD NOT pipeline requests using non-idempotent methods or non-idempotent sequences of methods (see section 9.1.2 of RFC2616). Otherwise, a premature termination of the transport connection could lead to indeterminate results. A client wishing to send a non-idempotent request SHOULD wait to send that request until it has received the response status for all previous outstanding requests made in the pipeline. 3.2. Multiplexed responses A server may choose to break a response into parts so that a large response does not consume the entire TCP connection. This allows multiple responses to be returned without any one waiting for another. When a response is broken into parts each part will consist of a normal HTTP header and body. These parts are called slices. The first slice sent in response to an HTTP request MUST contain either a Content-Length or specify Transfer-Encoding: chunked. Each slice MUST start with a valid Status-Line (RFC 2616 section 6.1) followed by response headers. The first slice MUST have the HTTP headers that would be present were the response transmitted unsliced. Subsequent slices MUST have only a Slice-Length (but see next paragraph) and Request-ID header. The minimal slice will consist of a Status-Line and a single Request-ID header. In satisfying an HTTP request the server MAY send multiple slices. All slices except the last one MUST contain a Slice-Length header specifying the number of bytes of content being transmitted in that slice. The final slice MUST NOT contain a Slice-Length header; the client MUST either use the Content-Length header sent in the first slice (if present) or the chunked transfer encoding to determine how much data is to be read. The HTTP response code MAY change from slice to slice if server conditions change. For example, if a server becomes unavailable while sending slices in response to a request the Status-Line on the initial slice could have indicated 200 OK but a subsequent slice may indicate 500 Internal Server Error. If the HTTP response code changes the server MUST send a complete set of HTTP headers as if the it were the first slice. Since there is no negotiation between client and server about sliced responses, a client sending a Request-ID header MUST be prepared to handle a sliced response. 3.3. Long responses A server MAY choose to use the slice mechanism in section 3.2 to implement a long response to a request. For example, a chat server could make a single HTTP request for lines of chat and the server could use the slice mechanism with chunked transfer encoding to send messages when they arrive. The client would simply wait for slices to arrive and decode the chunks within them. One simple mechanism would be to send a slice containing the same number of bytes as the chunk (the chunked encoding header would indicate X bytes and the Slice-Length would be X bytes plus the chunk header size). The client would then be able to read a complete slice containing a complete chunk and use it for rendering. 3.4. Example session In this example the HTTP version for HMURR is specified as 1.2. It shows a client making an initial request for a page without a Request-ID, receiving the complete response and then reusing the connection to send multiple requests and received sliced replies in a different order on a single TCP connection. client server GET / HTTP/1.2 Host: example.com Connection: keep-alive HTTP/1.2 200 OK Content-Length: 1234 Content-Type: text/html Connection: keep-alive (1234 bytes of data) GET /header.jpg HTTP/1.2 Host: example.com Request-ID: a1 GET /favicon.ico HTTP/1.2 Host: example.com Request-ID: b2 GET /hero.jpg HTTP/1.2 Host: example.com HTTP/1.2 200 OK Request-ID: c3 Content-Length: 632 Content-Type: image/jpeg GET /iframe.html HTTP/1.2 Request-ID: b2 Host: example.com Request-ID: d4 (632 bytes of data) HTTP/1.2 200 OK Content-Length: 65343 Request-ID: a1 Slice-Length: 1024 (1024 bytes of data) HTTP/1.2 200 OK Transfer-Encoding: chunked Request-ID: c3 Slice-Length: 4957 (4957 of chunked data) HTTP/1.2 200 OK Content-Length: 128 Request-ID: d4 (128 bytes of HTML) HTTP/1.2 200 OK Request-ID: a1 (64319 bytes of data) HTTP/1.2 200 OK Request-ID: c3 Slice-Length: 2354 (2354 bytes of chunked data) HTTP/1.2 200 OK Request-ID: c3
(chunked data that includes 00 block indicating end) In this example, the request for / is satisfied in full without using pipelining or slicing. The client then makes requests for four resources /header.jpg, /favicon.ico, /hero.jpg and /iframe.html and assigns them IDs a1, b2, c3 and d4 respectively. Since /favicon.ico (ID b2) is small it is sent while the client is generating requests and in full (the Request-ID header is present, but Slice-Length is not). /header.jpg is sent in two slices. The first has a Slice-Length of 1024 bytes and specifies the complete Content-Length of the resource. The second slice has no Slice-Length header indicating that it is the final slice satisfying the request with ID a1. /hero.jpg is sent using chunked encoding and in two slices. The first slice indicate a Slice-Length (of chunked data) and the second slice has no Slice-Length and the client reads the rest of the chunked data (which must include the 0 length final chunked block). /iframe.html is small and is satisfied with a non-sliced response. Responses are delivered in the order that is convenient for the server
and using slicing to prevent starvation. Since the client needs the / resource in its entirety before continuing it does not send a Request-ID header and receives the complete response. 4. Header Definitions This section defines the syntax and semantics of additional HTTP headers added with HMURR to the standard HTTP/1.1 header fields. 4.1. Request-ID The Request-ID is added to the HTTP request headers generated by a client to indicate that it intends to use HMURR and to uniquely identify the request. Request-ID = "Request-ID" ":" unique-request-tag When responding to the request the origin-server MUST insert a Request-ID header with the corresponding unique-request-tag so that the client can match requests and responses. 4.2. Slice-Length The Slice-Length response-header is added to a response by the origin-server to indicate the length of content that follows the HTTP response headers. Slice-Length = "Slice-Length" ":" 1*DIGIT
If this header is missing it indicates that the entire (or remaining unsent) response-body is being transmitted with this set of HTTP headers. If present it indicates the number of bytes of response that are being transmitted. The client MUST use the Content-Length to determine the total length expected, or if chunked transfer encoding is used the client MUST use the chunked encoding header to determine the end of the content.Obviously, this proposal does not provide all the functionality of SPDY (such as a forced TLS connection, header compression or built-in server push), but it does deal with connection multiplexing in a simple, textual manner.
There are probably reasons (that I've overlooked) why my proposal is a bad idea; what are they?