A Standard for Greatly Reducing HTTP Connections

  • Bigwebmaster
  • Site Admin
  • Site Admin
  • User avatar
  • Joined: Dec 20, 2002
  • Posts: 8922
  • Loc: Seattle, WA & Phoenix, AZ
  • Status: Offline

Post April 14th, 2009, 12:55 pm

I find this entire proposal interesting because I really do think it would decrease load on servers a great deal, as well as speed up page loads on the client's end. The majority of the load on the Ozzu server comes from HTTP connections.

I believe that in the past most browsers defaulted to a maximum of two concurrent connections (HTTP keep-alive connections or persistent connections) per server, per RFC 2616. This is actually a good thing for servers as it does help to spread out the load:

Quote:
Clients that use persistent connections SHOULD limit the number of simultaneous connections that they maintain to a given server. A single-user client SHOULD NOT maintain more than 2 connections with any server or proxy.


http://www.w3.org/Protocols/rfc2616/rfc ... l#sec8.1.4

However, I think I have heard that FF3 increases it to 6. I would have to double-check whether they are doing this and, if so, whether they are not following the RFC 2616 recommendation (or whether that recommendation has recently changed). So if a webpage you were visiting had 1000 objects to load (CSS, images, JavaScript, etc.), it could take some time depending on the server load, the client's computer processing and transfer speed, and whether or not anything is stored in cache.
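As a rough illustration of why that limit matters, here is a back-of-the-envelope Python sketch. It is not from the RFC or any browser; it assumes every object takes one full round trip on its connection, with no pipelining and nothing in cache, and the function name `fetch_rounds` is made up for this example.

```python
import math

def fetch_rounds(object_count: int, max_connections: int) -> int:
    """Estimate sequential request rounds when each connection fetches
    one object at a time (assumption: no pipelining, empty cache)."""
    if max_connections < 1:
        raise ValueError("max_connections must be at least 1")
    return math.ceil(object_count / max_connections)

# Hypothetical page with 1000 objects, as in the example above.
rounds_with_two = fetch_rounds(1000, 2)  # the RFC 2616 suggested limit
rounds_with_six = fetch_rounds(1000, 6)  # the limit reportedly used by FF3
print(rounds_with_two, rounds_with_six)
```

Real pages will not behave this neatly, but it shows how the number of sequential rounds shrinks as the connection limit grows.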

The first thing I thought of when seeing this thread was how this is similar to how game programmers package everything up into a few files. For instance, a particular game I have played from Westwood (now EA) packages up all of the game content into files called .mix files. It allows you to load one file versus thousands of files separately. In a way it's kind of the same concept, except here you would be reducing HTTP connections and also finding a way to get the most out of the RFC 2616 recommended persistent connection limit. By the way, in case you were interested in changing this limit with IE (not sure how you would do it with FF), I made a thread a long time ago on how to do it:

mswindows-forum/increase-browser-default-persistant-connection-t490.html

Per the RFCs, though, you probably shouldn't do it, out of respect for people running servers. I have tried it in the past and noticed a huge increase in the speed at which webpages load. That is one reason why I think Joebert's proposal could really improve load times on the client's end for websites, as well as decreasing load on the server end. If something like this was available, I would most likely implement it.
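To make the packaging idea a bit more concrete, here is a minimal Python sketch using the standard `zipfile` module. It only shows bundling several small assets into one archive and pulling a single member back out; the file names and the helpers `build_package` and `read_member` are hypothetical, and nothing here is part of any browser or server.

```python
import io
import zipfile

def build_package(files: dict[str, bytes]) -> bytes:
    """Bundle several small assets into one in-memory zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()

def read_member(package: bytes, name: str) -> bytes:
    """Extract one asset from the archive, as a client would after one download."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        return archive.read(name)

# Hypothetical site assets; in practice these would be read from disk.
assets = {
    "css/site.css": b"body { margin: 0; }",
    "js/site.js": b"console.log('hi');",
}
package = build_package(assets)
print(read_member(package, "css/site.css"))
```

One download of `package` would then replace one request per asset, which is the same trade the .mix files make.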
Ozzu Hosting - Want your website on a fast server like Ozzu?
  • spork
  • Brewmaster
  • Silver Member
  • User avatar
  • Joined: Sep 22, 2003
  • Posts: 6129
  • Loc: Seattle, WA
  • Status: Offline

Post April 14th, 2009, 1:07 pm

Bigwebmaster wrote:
The first thing I thought of when seeing this thread was how this is similar to how game programmers package everything up into a few files. For instance, a particular game I have played from Westwood (now EA) packages up all of the game content into files called .mix files.

*sigh*... I miss the days of Command & Conquer.
The Beer Monocle. Classy.
  • joebert
  • Sledgehammer
  • Genius
  • No Avatar
  • Joined: Feb 10, 2004
  • Posts: 13455
  • Loc: Florida
  • Status: Offline

Post April 14th, 2009, 3:01 pm

Quote:
Actually, quite the opposite.


Sorry spork, I was trying to point out that this would be a problem if it were implemented the way I first suggested. We're on the same page now. :)

spork wrote:
The only thing I don't like about using the hash notation is that it seems a bit hackish. Defining an actual structure for this added functionality allows for expansion/modification later on.


Now if it was able to be implemented with the zipfile name first, and the resource after the hash, I would have to disagree with it being hackish. Browsers already use the hash to denote specific places, or resources, within a webpage, so a hash pointing to a filename in a zipfile would seem natural.
However, since the only way to implement it without breaking pages in browsers that don't support it would be to place the archive name after the hash, using the hash would be counter-intuitive.
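To show why the ordering matters, here is a small Python sketch using `urllib.parse.urldefrag`. The fragment (everything after the hash) is never sent to the server, so a browser that knows nothing about archives would request only the part before it. The helper name `request_target` and the two example URLs are made up for illustration.

```python
from urllib.parse import urldefrag

def request_target(src: str) -> str:
    """Return the part of a URL that is actually requested from the server;
    the fragment after '#' stays on the client."""
    return urldefrag(src).url

# Archive name first: an unaware browser would fetch the zip as the image.
archive_first = "ui.zip#buttons/button.png"
# Resource first: an unaware browser still fetches the image itself.
resource_first = "buttons/button.png#ui.zip"

print(request_target(archive_first))
print(request_target(resource_first))
```

Only the resource-first form degrades gracefully, which is exactly the form that reads backwards.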

I really am leaning toward the proposed <link> element for an archive. However, I disagree with using a new file format, because the existing zip/gz/bz2 formats are readily available and would be easy to transition to.
I think the mime-type should remain application/x-gzip|application/zip|etc, the extension remain zip|gz|bz2|etc, and a rel attribute would be the only thing telling the browser what to do with the archive.

Code:
<link href="./ui.zip" rel="nra" type="application/zip"/>



Bigwebmaster wrote:
However, I think I have heard that FF3 increases it to 6. I would have to double-check whether they are doing this and, if so, whether they are not following the RFC 2616 recommendation (or whether that recommendation has recently changed). So if a webpage you were visiting had 1000 objects to load (CSS, images, JavaScript, etc.), it could take some time depending on the server load, the client's computer processing and transfer speed, and whether or not anything is stored in cache.


I've read a few things over the years that actually encourage people to increase the number of connections via Firefox's about:config or whatever it is.

I'm certain the default in Opera is eight connections.
Strong with this one, the sudo is.
  • Bogey
  • Bogey
  • Genius
  • User avatar
  • Joined: Jul 14, 2005
  • Posts: 8211
  • Loc: USA
  • Status: Offline

Post April 14th, 2009, 3:20 pm

I wish this would be so... it would be really awesome! Or is it so? lol
"Bring forth therefore fruits meet for repentance:" Matthew 3:8
  • spork

Post April 14th, 2009, 3:31 pm

joebert wrote:
I really am leaning toward the proposed <link> element for an archive. However, I disagree with using a new file format, because the existing zip/gz/bz2 formats are readily available and would be easy to transition to.
I think the mime-type should remain application/x-gzip|application/zip|etc, the extension remain zip|gz|bz2|etc, and a rel attribute would be the only thing telling the browser what to do with the archive.

Code:
<link href="./ui.zip" rel="nra" type="application/zip"/>

I completely agree that existing compression standards should be supported. I guess I was more or less using .nra as a sort of abstract idea to represent any kind of compressed file format.
  • Bozebo
  • Expert
  • Expert
  • User avatar
  • Joined: Feb 15, 2006
  • Posts: 709
  • Loc: 404
  • Status: Offline

Post April 14th, 2009, 3:40 pm

Sounds interesting. If it existed, it would exist in every browser but IE from day one and appear in IE about 10 years late, when something new replaces it...

Though, with images I already use one image with multiple areas on it, and have it shifted by pixels to reduce HTTP requests, similar to the common mouse-over image change technique.
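For anyone who hasn't used that sprite technique, here is a minimal Python sketch that works out the pixel shifts. It assumes the icons are stacked top to bottom in a single image in the order given; the icon names, heights, and the helper `sprite_offsets` are all hypothetical.

```python
def sprite_offsets(heights: dict[str, int]) -> dict[str, str]:
    """Map each icon name to a CSS background-position value, assuming the
    icons are stacked top to bottom in the sprite in the given order."""
    offsets = {}
    y = 0
    for name, height in heights.items():
        # Shift the sprite up by the combined height of the icons above it.
        offsets[name] = f"0 {-y}px"
        y += height
    return offsets

# Hypothetical icon heights in pixels.
icons = {"home": 32, "search": 32, "logout": 24}
for name, position in sprite_offsets(icons).items():
    print(f".icon-{name} {{ background-position: {position}; }}")
```

Each rule points at the same image file, so the browser makes one request no matter how many icons are on the page.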
  • joebert

Post April 14th, 2009, 4:04 pm

Quote:
I completely agree that existing compression standards should be supported. I guess I was more or less using .nra as a sort of abstract idea to represent any kind of compressed file format.


I see.


Quote:
Though, with images I already use one image with multiple areas on it, and have it shifted by pixels to reduce http requests - similar to the common mouse-over image change technique


Well, the thing that got me going with this in the first place is that I'm currently working on revamping a forum theme, and the way it's currently set up I either have to rewrite certain parts of the application itself to support CSS sprites, live with numerous HTTP connections, or get rid of some eye candy.

Basically, all of my options suck at the moment.
  • joebert

Post April 17th, 2009, 5:35 pm

I'm thinking that with a <link> element and a package, the directory structure of the compressed file would need to match that found in the elements using the resources.

This would be easy to do in most cases I think. Since the directory structure already exists on the server, it would just need to be packaged from the DocumentRoot of the site and the package would need to be loaded from the DocumentRoot.

I'm not sure whether a standard should require packages to be in the DocumentRoot though. And if packages are allowed to be loaded from any location on the server, for instance

Code:
<link href="/resources/ui.zip" rel="nra" type="application/zip"/>


Should the browser look for the items in that package assuming the package starts at DocumentRoot, or should the paths be relative to where the package was loaded?

For instance, if the above package was loaded and contained "/buttons/button.png", should the path available to further elements in the page be "/buttons/button.png" or "/resources/buttons/button.png" ?
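The two interpretations can be sketched in Python with `posixpath`. This is only an illustration of the question, not a proposed implementation; the function names `resolve_from_root` and `resolve_from_package` are made up here.

```python
import posixpath

def resolve_from_root(entry: str) -> str:
    """Interpretation 1: the archive root maps to the site's DocumentRoot."""
    return posixpath.normpath("/" + entry.lstrip("/"))

def resolve_from_package(package_url: str, entry: str) -> str:
    """Interpretation 2: the archive root maps to the directory
    the package was loaded from."""
    base = posixpath.dirname(package_url)
    return posixpath.normpath(posixpath.join(base, entry.lstrip("/")))

package_url = "/resources/ui.zip"
entry = "/buttons/button.png"
print(resolve_from_root(entry))
print(resolve_from_package(package_url, entry))
```

The first call gives the DocumentRoot-based path and the second gives the path under the package's own directory, so a standard would have to pick one or let the page choose.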

I suppose there's the option of adding support for a <meta> element to decide this, working in a similar fashion to a <base> element. Perhaps it could be called "nra-base".

Code:
<meta http-equiv="nra-base" value="/resources/"/>
  • spork

Post April 20th, 2009, 7:48 am

In my opinion the path specified in a particular element should be the direct path to the element within the archive, thus:

Code:
<link name="resources" href="/resources/ui.zip" rel="nra" type="application/zip"/>


would establish the location of the archive itself, and all references to files within that archive should be relative, with the root of the archive acting as the root of the resource itself:

Code:
<img nra="resources" src="images/buttons/button.png" alt="Home"/>
  • effim
  • Beginner
  • Beginner
  • User avatar
  • Joined: Apr 21, 2009
  • Posts: 35
  • Loc: Austin, TX
  • Status: Offline

Post April 21st, 2009, 6:57 pm

While the proposal is interesting and could be the start of solving the problem you present, I don't really think the problem being presented is in fact a problem currently or will become one in the future. Let me explain.

First, you're saying you're interested in reducing connections, not requests. Assuming that a server is using keep-alive and the browser supports it, hundreds of files can be served within a single connection through separate requests.

If you are getting at reducing requests, then obviously archiving resources into a single file or otherwise sending a multi-part response (like in email attachments) would be a way of doing that. Again, though, I don't really see any problem with the way it's currently done, provided that the server is configured properly, which brings me to the next thing...

Handling of an HTTP request for a static resource should be inexpensive, provided that the connection has adequate bandwidth and the file isn't large. A small server can determine the file requested, determine the best method for serving it, and serve it, all within a few milliseconds and without using much memory. Even for thousands of concurrent requests on thousands of unique static files, a suitably equipped system can cache the files in memory and incur no performance penalty on reading files from the disk.

Bigwebmaster wrote:
I find this entire proposal interesting because I really do think it would decrease load on servers a great deal, as well as speed up page loads on the client's end. The majority of the load on the Ozzu server comes from HTTP connections.


This is where things become problematic. Rather than serving requests using a server properly configured to serve static files, you're using a server configured as a one-size-fits-all solution that is likely loaded up with several mods for authorization, caching, ssl, and PHP. Due to the architecture of Apache, those mods incur a memory hit even when they aren't used (so each Apache thread has PHP capabilities, even for static resource requests).

The solution to this problem is simply to use several instances of the same server (Apache, for example) configured to handle different file types. I'm not as much of an Apache guy as I am a Lighttpd guy, though, so I can't tell you how to do it (though I know Dreamhost does). Alternatively, you can use Apache to continue serving dynamic requests as it currently does, and then use a lightweight server like Lighttpd configured to serve static files.

Photoshop is overkill for resizing JPG images. Apache configured for dynamic content is overkill for serving 3KB CSS files.

----

Just for kicks: Assuming we implemented some sort of file packaging architecture within HTTP, what happens with caching when a single file out of a package of 100 changes? While you might reduce the overall number of HTTP requests, you could very well increase the amount of bandwidth being served unless the server could selectively serve files out of a package (essentially your web server would need to do all the compiling of archives when the request comes through). I think a better solution would simply be to allow for multipart HTTP requests and responses, similar to the way emails are handled.
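A rough Python sketch of that caching trade-off, under loud assumptions: 100 files of 3 KB each (borrowing the 3KB CSS figure above purely as a placeholder), no compression, and header overhead ignored. The helper `refetch_bytes` is hypothetical.

```python
def refetch_bytes(file_sizes: list[int], changed: set[int], packaged: bool) -> int:
    """Bytes a client with a warm cache must download again after some
    files change (assumption: no compression, headers ignored)."""
    if not changed:
        return 0
    if packaged:
        # Any change invalidates the whole archive.
        return sum(file_sizes)
    # Served individually, only the changed files are fetched again.
    return sum(file_sizes[index] for index in changed)

sizes = [3 * 1024] * 100  # hypothetical: 100 files of 3 KB each
print(refetch_bytes(sizes, {0}, packaged=True))
print(refetch_bytes(sizes, {0}, packaged=False))
```

With these made-up numbers, one changed file costs a hundred times more bandwidth in the packaged case, which is the concern being raised.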
  • joebert

Post April 21st, 2009, 7:20 pm

Quote:
First, you're saying you're interested in reducing connections, not requests.

Assuming that a server is using keep-alive and the browser supports it, hundreds of files can be served within a single connection through separate requests.


Bad choice of words on my part. I intended connections to be all-inclusive of connections and the requests being made.

Any way you say it, the situation is still analogous to making a dozen trips to the grocery store to pick up a single carton of eggs.
Whether Keep-Alive is used or not is like whether you keep driving the same car or get in a new car each trip.

Quote:
Handling of an HTTP request for a static resource should be inexpensive, provided that the connection has adequate bandwidth and the file isn't large. A small server can determine the file requested, determine the best method for serving it, and serve it, all within a few milliseconds and without using much memory. Even for thousands of concurrent requests on thousands of unique static files, a suitably equipped system can cache the files in memory and incur no performance penalty on reading files from the disk.


How many bytes do you reckon are used in an HTTP request/response for headers ?
Did you know Google uses a CSS Sprite for their result page logo and buttons/etc ?
Did you know that recently Slashdot, a site with a term named after it for taking sites down with traffic, started using CSS Sprites to reduce their load ?
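For a feel of the header question, here is a small Python sketch that counts the bytes in one made-up request/response header pair and multiplies by the number of requests. The header values below are invented placeholders, not captured traffic, and real browsers and servers often send more (cookies, for example).

```python
# Hypothetical headers for one small image request.
REQUEST_HEADERS = (
    "GET /images/buttons/button.png HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "User-Agent: ExampleBrowser/1.0\r\n"
    "Accept: image/png,*/*;q=0.5\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)
RESPONSE_HEADERS = (
    "HTTP/1.1 200 OK\r\n"
    "Server: ExampleServer/1.0\r\n"
    "Content-Type: image/png\r\n"
    "Content-Length: 512\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)

def header_overhead(request: str, response: str, request_count: int) -> int:
    """Total header bytes exchanged for request_count identical requests."""
    per_exchange = len(request.encode("ascii")) + len(response.encode("ascii"))
    return per_exchange * request_count

# A dozen trips to the store versus one.
print(header_overhead(REQUEST_HEADERS, RESPONSE_HEADERS, 12))
print(header_overhead(REQUEST_HEADERS, RESPONSE_HEADERS, 1))
```

Keep-Alive saves the connection setup, but each of those requests still carries its own set of headers.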

Quote:
Rather than serving requests using a server properly configured to serve static files, you're using a server configured as a one-size-fits-all solution that is likely loaded up with several mods for authorization, caching, ssl, and PHP.


There's a lot of servers out there doing exactly that from what I've gathered while reading around.

Quote:
Just for kicks: Assuming we implemented some sort of file packaging architecture within HTTP, what happens with caching when a single file out of a package of 100 changes?


The same thing that happens when the CSS Sprite Google uses changes. The whole file is replaced.
If changes are often enough to increase resource usage, it's probably a good idea to rethink which files are in which packages. :)
  • effim

Post April 21st, 2009, 10:44 pm

joebert wrote:
Any way you say it, the situation is still analogous to making a dozen trips to the grocery store to pick up a single carton of eggs. Whether Keep-Alive is used or not is like whether you keep driving the same car or get in a new car each trip.


Not to nitpick, but we're talking about something inexpensive here. Driving to the store repeatedly consumes a vast amount of resources in comparison to the result. The HTTP headers consume some bandwidth, yes, but typically it's insignificant compared to the content being transferred. We still can't manage to get people to remove extra white space that accounts for several kilobytes from their HTML, CSS, and JavaScript files that go into a production environment, not to mention removing fundamentally useless server identifier tags that tend to suck up several hundred bytes.

I hardly think that we should implement a new HTTP protocol for Google and Slashdot, personally. Google, for one, could simply utilize their Google Gears (Slashdot could too, for that matter) to store static files on the user's machine and update them only when they need to be updated.

joebert wrote:
There's a lot of servers out there doing exactly that from what I've gathered while reading around.


I agree, and I think it's ridiculous, especially when it matters (like on Ozzu). Again, though, I don't think the solution is to create additional methods to solve it when a much more semantic one exists. To recycle your metaphor, these sites are using a large pickup truck to go get eggs instead of taking a scooter or a bicycle.

You didn't mention multi-part requests or responses. What are your thoughts on a system like that?
  • Bozebo

Post April 22nd, 2009, 5:49 am

effim poses an interesting argument. Though, the proposed technique would require changes on the client side, and the package of resources gathered in one HTTP request is just a normal archive file.
e.g.:
index.html
files.zip

files.zip contains the images and stylesheets (provided they are not dynamically produced).
And perhaps dynamic files could live outside the archive, so the server isn't rebuilding it for every request.
  • effim

Post April 22nd, 2009, 10:03 am

I think the current HTTP specs might actually support a multipart response like I was mentioning...I'm digging in to see what I can find...

http://www.w3.org/Protocols/rfc2616/rfc ... l#sec3.7.2

and

http://www.motobit.com/tips/detpg_multi ... e-request/

Update

Apparently this is solidly supported, using a multipart/related mimetype in the HTTP response. Unfortunately, the RFC calls for the client to explicitly 'Allow' a multipart response in the headers. I want to give it a try a little later and see what I can come up with.
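As a sketch of what a multipart/related body looks like, here is a short Python example built with the standard `email` package. It only constructs and re-parses the MIME body; it is not an HTTP server, it makes no claim about what any browser accepts, and the file names, placeholder image bytes, and helper names are all assumptions.

```python
import email
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def build_bundle() -> bytes:
    """Build a multipart/related body holding a stylesheet and an image."""
    bundle = MIMEMultipart("related")

    css = MIMEText("body { margin: 0; }", _subtype="css")
    css["Content-Location"] = "css/site.css"  # hypothetical path
    bundle.attach(css)

    # Placeholder bytes, not a real PNG; the subtype is given explicitly.
    png = MIMEImage(b"\x89PNG placeholder bytes", _subtype="png")
    png["Content-Location"] = "images/button.png"  # hypothetical path
    bundle.attach(png)

    return bundle.as_bytes()

def list_parts(raw: bytes) -> list[tuple[str, str]]:
    """Parse the body and report each part's location and content type."""
    message = email.message_from_bytes(raw)
    return [(part["Content-Location"], part.get_content_type())
            for part in message.get_payload()]

print(list_parts(build_bundle()))
```

The parts keep their own content types and locations, so a receiver could cache or replace them individually, unlike an opaque archive.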
  • Bigwebmaster

Post April 22nd, 2009, 11:40 am

Currently the Ozzu server is doing fine, but I was just pointing out where most of the load is coming from. I agree with your point about using different configurations for static vs dynamic content, etc. Eventually, down the road, if Ozzu gets big enough there would likely be a server for static content, a server for dynamic content such as scripts, and a SQL server for all the database stuff. For now, though, all of that is not needed since I try to optimize everything I can and the server is still able to handle the load. I would classify it as a one-size-fits-all at the moment.

effim wrote:
We still can't manage to get people to remove extra white space that accounts for several kilobytes from their HTML, CSS, and JavaScript files that go into a production environment, not to mention removing fundamentally useless server identifier tags that tend to suck up several hundred bytes.


I think most people do not do it because they simply don't know any better. I use this:

http://developer.yahoo.com/yui/compressor/

to compress all of the CSS and JavaScript on the site. If I recall correctly, that is saving about 40KB per user who visits the site. If your site doesn't get much traffic you probably don't need to worry about the nitty-gritty details like that, but once you get enough visitors every little thing adds up.

30,000 visitors per day x 40KB = 1.2 GB per day saved
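The same arithmetic as a tiny Python sketch, assuming decimal units (1 GB = 1,000,000 KB) and using the figures above; the function name `daily_savings_gb` is made up for the example.

```python
def daily_savings_gb(visitors_per_day: int, saved_kb_per_visitor: int) -> float:
    """Bandwidth saved per day in GB, using decimal units (1 GB = 1,000,000 KB)."""
    return visitors_per_day * saved_kb_per_visitor / 1_000_000

print(daily_savings_gb(30_000, 40))
```

Plugging in a different visitor count or per-visit saving shows how quickly small optimizations add up.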
© 2011 Unmelted, LLC. Ozzu® is a registered trademark of Unmelted, LLC.