<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>D90 Tools &#38; Techniques &#187; Web Hosting Tools</title>
	<atom:link href="http://www.d90.us/toolbox/category/web-hosting-tools/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.d90.us/toolbox</link>
	<description>So I can remember how I did stuff in the future...</description>
	<lastBuildDate>Fri, 26 Nov 2010 20:08:12 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.3</generator>
		<item>
		<title>Custom 404 for Apache (using PHP!)</title>
		<link>http://www.d90.us/toolbox/2010/11/04/custom-404-for-apache-using-php/</link>
		<comments>http://www.d90.us/toolbox/2010/11/04/custom-404-for-apache-using-php/#comments</comments>
		<pubDate>Thu, 04 Nov 2010 23:03:44 +0000</pubDate>
		<dc:creator>Matt</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Sysadmin Tools]]></category>
		<category><![CDATA[Web Hosting Tools]]></category>

		<guid isPermaLink="false">http://www.d90.us/toolbox/?p=210</guid>
		<description><![CDATA[Images, css, js just get a simple 404 Not Found page. Everything not in the list of items we&#8217;re checking gets redirected to the homepage. So a simple typo will get the redirect, but a missing PNG file that&#8217;s called by one of our pages won&#8217;t send a copy of the homepage to the client [...]]]></description>
			<content:encoded><![CDATA[<p>Images, css, js just get a simple 404 Not Found page.</p>
<p>Everything not in the list of extensions we&#8217;re checking gets redirected to the homepage.  So a simple typo will get the redirect, but a missing PNG file that&#8217;s called by one of our pages won&#8217;t cause a copy of the homepage to be sent to the client as if it were an image file!</p>
<pre><code>&lt;?php
/*
This is a custom 404 handler.

It makes a decision: if a small, auxiliary file like an image, style sheet, etc.
is being requested and not found, then we send a plain 404 page.

However, anything that looks like a website request (i.e. everything NOT in the list)
is given a 301 redirect to our homepage.

Invoke by:
ErrorDocument 404 /404.php

4 November 2010
Matt Kivela
*/

/* Currently set to filter:
   ashx
   asp
   aspx
   cgi
   css
   gif
   ico
   jpg
   js
   ogg
   png
*/

// Test the path only, so a query string can't hide the extension.
$path = strtok($_SERVER['REQUEST_URI'], '?');

// Group the alternatives so that \. and $ apply to every extension; /i ignores case.
if (preg_match('/\.(ashx|aspx?|cgi|css|gif|ico|jpg|js|ogg|png)$/i', $path))
  {
     // header() returns nothing, so there is nothing to echo.
     header("HTTP/1.1 404 Not Found");
     header("Status: 404 Not Found");
     // Escape the requested URL before printing it back to the client.
     $missing = htmlspecialchars("http://{$_SERVER['SERVER_NAME']}{$_SERVER['REQUEST_URI']}");
     echo "&lt;html&gt;&lt;body&gt;404 Error:&lt;br /&gt;File: $missing not found.&lt;br /&gt;";
     echo "If this is causing a problem, you may contact &lt;a href=\"mailto:admin@yourdomain.org\"&gt;admin@yourdomain.org&lt;/a&gt; or&lt;br /&gt;";
     echo "submit a ticket at &lt;a href=\"http://bugzilla.yourdomain.org/\"&gt;http://bugzilla.yourdomain.org/&lt;/a&gt;&lt;/body&gt;&lt;/html&gt;";
   }
else
  {
     $new_url = "http://{$_SERVER['SERVER_NAME']}/";
     header("HTTP/1.1 301 Moved Permanently");
     header("Location: $new_url");
  }
?&gt;</code></pre>
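<p>One detail worth illustrating: in a pattern of this shape, the leading <em>\.</em> and the trailing <em>$</em> only apply to every extension if the alternatives are wrapped in a group.  The following is a minimal sketch in Python rather than PHP, with a shortened extension list and made-up sample paths (both are assumptions for illustration), showing how an ungrouped pattern misclassifies ordinary page URLs:</p>

```python
import re

# Ungrouped: "\." binds only to the first alternative and "$" only to the last.
UNGROUPED = re.compile(r"\.ashx|asp|css|png$", re.IGNORECASE)
# Grouped: the dot and the end anchor apply to every extension.
GROUPED = re.compile(r"\.(ashx|asp|css|png)$", re.IGNORECASE)


def is_asset(pattern, uri):
    """Return True when the pattern classifies the URI as an auxiliary file."""
    return pattern.search(uri) is not None


# Hypothetical request paths, chosen to contain "asp" and "css" mid-path.
samples = ["/i/logo.PNG", "/raspberry-pi-guide/", "/css-tricks/"]
for uri in samples:
    print(uri, is_asset(UNGROUPED, uri), is_asset(GROUPED, uri))
```

<p>With the grouped pattern only the image path counts as an auxiliary file; the ungrouped one also flags any path that merely contains "asp" or "css", so a mistyped page URL would get the plain 404 instead of the redirect.</p>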
]]></content:encoded>
			<wfw:commentRss>http://www.d90.us/toolbox/2010/11/04/custom-404-for-apache-using-php/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Squid handling http &#8211;&gt; https redirects</title>
		<link>http://www.d90.us/toolbox/2009/05/29/squid-handling-http-https-redirects/</link>
		<comments>http://www.d90.us/toolbox/2009/05/29/squid-handling-http-https-redirects/#comments</comments>
		<pubDate>Sat, 30 May 2009 01:35:48 +0000</pubDate>
		<dc:creator>Matt</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Squid]]></category>
		<category><![CDATA[Sysadmin Tools]]></category>

		<guid isPermaLink="false">http://www.d90.us/toolbox/?p=95</guid>
		<description><![CDATA[In configuring Squid to handle both our port 80 and 443 traffic, we have the issue that we can&#8217;t use redirects at the webserver level to redirect certain pages to https:// . So this is handled in Squid. First, make a simple script.  There&#8217;s a possibility another redirector like Squirm might do a better job, [...]]]></description>
			<content:encoded><![CDATA[<p>In configuring Squid to handle both our port 80 and 443 traffic, we have the issue that we can&#8217;t use redirects at the webserver level to send certain pages to https://.</p>
<p>So this is handled in Squid.</p>
<p>First, make a simple script.  There&#8217;s a possibility another redirector like Squirm might do a better job, but I haven&#8217;t played with them.</p>
<blockquote><p>#!/usr/bin/perl<br />
$|=1;<br />
while (&lt;&gt;) {<br />
s@http://www7\.getmiro\.(com|net|org)/adopt(.*)$@301:https://www7.getmiro.com/adopt$2@;<br />
print;<br />
}</p></blockquote>
<p>Saved at /etc/squid3/squid_redirector.pl and chown/chmod so the user &#8220;proxy&#8221; that squid runs under can run it.  Your path, of course, may vary.</p>
<p>The key part for what we need is that we prepend &#8220;301:&#8221; to the https:// URL in the rewrite.  When this is returned to the user&#8217;s browser, it redirects them to the secure page.  This script also takes any request to the .com, .net, or .org domain and forces it to the .com TLD.</p>
<p>It&#8217;s easy to test this Perl script.  Simply type ./squid_redirector.pl, which launches it interactively.</p>
<blockquote><p><span style="color: #000000;"># ./squid_redirector.pl<br />
http://www7.getmiro.com/foo</span><br />
<span style="color: #ff0000;">http://www7.getmiro.com/foo</span><br />
<span style="color: #000000;">http://www7.getmiro.com/adopt/test</span><br />
<span style="color: #ff0000;">301:https://www7.getmiro.com/adopt/test</span><br />
<span style="color: #000000;">http://www7.getmiro.<strong>net</strong>/adopt/matt/is/an/evil/genius</span><br />
<span style="color: #ff0000;">301:https://www7.getmiro.<strong>com</strong>/adopt/matt/is/an/evil/genius</span></p></blockquote>
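<p>For readers who want to experiment with the substitution away from Squid, here is a rough Python equivalent of the rewrite step only.  It is a sketch, not the script we run: the function name <em>rewrite</em> is made up for this example, and the host dots are escaped so they match literally.</p>

```python
import re

# Same substitution as the Perl one-liner, with the host dots escaped.
ADOPT = re.compile(r"http://www7\.getmiro\.(?:com|net|org)/adopt(.*)$")


def rewrite(line):
    """Return the line with matching /adopt URLs turned into a 301 to https on .com."""
    return ADOPT.sub(r"301:https://www7.getmiro.com/adopt\1", line)


for url in [
    "http://www7.getmiro.com/foo",
    "http://www7.getmiro.com/adopt/test",
    "http://www7.getmiro.net/adopt/matt/is/an/evil/genius",
]:
    print(rewrite(url))
```

<p>URLs outside /adopt pass through unchanged, which is what the interactive test above shows for /foo.</p>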
<p>Next, tell Squid to use it.  We need to enable these lines in the squid.conf file:</p>
<p style="padding-left: 30px;">url_rewrite_program /etc/squid3/squid_redirector.pl<br />
url_rewrite_children 10<br />
url_rewrite_host_header off<br />
url_rewrite_bypass on</p>
<p>The first line tells Squid what to use to rewrite URLs, the second tells it to spawn 10 instances on startup.  I&#8217;m not sure, in the end, if host_header needs to be off.  url_rewrite_bypass on allows Squid to skip the rewriting step if all the redirectors are busy.  That&#8217;s a decision based on our security risks, users, and needs, and I&#8217;m going with more reliability over absolute security.  We should see skips showing up in the logs and can adjust settings from there if necessary.</p>
<p>Restart Squid, give it a test.  Famous last words &#8212; it should work now.</p>
<p>References:</p>
<p>http://wiki.squid-cache.org/Features/Redirectors</p>
<p>http://brainextender.blogspot.com/2009/01/simple-squid-redirector-perl-script.html</p>
]]></content:encoded>
			<wfw:commentRss>http://www.d90.us/toolbox/2009/05/29/squid-handling-http-https-redirects/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Lighttpd, virtual hosts, alternative ports</title>
		<link>http://www.d90.us/toolbox/2009/05/29/lighttpd-virtual-hosts-alternative-ports/</link>
		<comments>http://www.d90.us/toolbox/2009/05/29/lighttpd-virtual-hosts-alternative-ports/#comments</comments>
		<pubDate>Fri, 29 May 2009 20:25:50 +0000</pubDate>
		<dc:creator>Matt</dc:creator>
				<category><![CDATA[Lighttpd]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Sysadmin Tools]]></category>

		<guid isPermaLink="false">http://www.d90.us/toolbox/?p=92</guid>
		<description><![CDATA[In the configuration of our new server, all port 80 and 443 traffic is handled by Squid as a reverse proxy.  8080 is the &#8220;backdoor&#8221; that bypasses Squid and hits Lighttpd directly. But the standard format of a Lighttpd virtual host entry doesn&#8217;t recognize alternate ports appended after the tld.  Not a big deal, this [...]]]></description>
			<content:encoded><![CDATA[<p>In the configuration of our new server, all port 80 and 443 traffic is handled by Squid as a reverse proxy.  8080 is the &#8220;backdoor&#8221; that bypasses Squid and hits Lighttpd directly.</p>
<p>But the standard format of a Lighttpd virtual host entry doesn&#8217;t recognize alternate ports appended after the tld.  Not a big deal, this does the trick:</p>
<blockquote><p>$HTTP["host"] =~ "(^|\.)getmiro\.(com|net|org)($|:8080$)" {</p></blockquote>
<p>Translated:<br />
<em>(^|\.)</em> The bare domain, or any hostname in front of it<br />
<em>getmiro\.</em> in the getmiro domain<br />
<em>(com|net|org)</em> with a top level domain of com, net, or org<br />
<em>($|:8080$)</em> and ending with the TLD or :8080 will be processed by the rules that follow.</p>
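<p>A quick way to sanity-check a host pattern like this before reloading Lighttpd is to run it against a few hostnames.  This is a small Python sketch; the sample hostnames are invented for the test, and it assumes the expression behaves the same in Python&#8217;s re module as in Lighttpd&#8217;s PCRE, which should hold for a pattern this simple.</p>

```python
import re

# The virtual host pattern from the Lighttpd entry above.
HOST = re.compile(r"(^|\.)getmiro\.(com|net|org)($|:8080$)")

# Hypothetical Host header values, two that should match and two that should not.
samples = ["getmiro.com", "www7.getmiro.org:8080", "getmiro.com:8081", "notgetmiro.com"]
for host in samples:
    print(host, bool(HOST.search(host)))
```

<p>The first two hosts match; a different port or a domain that merely ends in &#8220;getmiro.com&#8221; does not.</p>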
]]></content:encoded>
			<wfw:commentRss>http://www.d90.us/toolbox/2009/05/29/lighttpd-virtual-hosts-alternative-ports/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Lighttpd, virtual hosts, and wildcard domains</title>
		<link>http://www.d90.us/toolbox/2009/05/29/lighttpd-virtual-hosts-and-wildcard-domains/</link>
		<comments>http://www.d90.us/toolbox/2009/05/29/lighttpd-virtual-hosts-and-wildcard-domains/#comments</comments>
		<pubDate>Fri, 29 May 2009 20:01:53 +0000</pubDate>
		<dc:creator>Matt</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Sysadmin Tools]]></category>
		<category><![CDATA[Web Hosting Tools]]></category>

		<guid isPermaLink="false">http://www.d90.us/toolbox/?p=86</guid>
		<description><![CDATA[So we&#8217;re setting up mirocommunity.com, and I don&#8217;t want to be hassled continuously to create new hostnames in DNS. To avoid that, it&#8217;s a simple wildcard entry like this in the appropriate named database: *.mirocommunity.com.    IN      CNAME   mirocommunity.com. Which directs everything to our server. Now our server hosts multiple sites via host entries, so we [...]]]></description>
			<content:encoded><![CDATA[<p>So we&#8217;re setting up mirocommunity.com, and I don&#8217;t want to be hassled continuously to create new hostnames in DNS.</p>
<p>To avoid that, it&#8217;s a simple wildcard entry like this in the appropriate named database:</p>
<blockquote><p>*.mirocommunity.com.    IN      CNAME   mirocommunity.com.</p></blockquote>
<p>Which directs everything to our server.</p>
<p>Now our server hosts multiple sites via host entries, so we can&#8217;t use a simple negation like this:</p>
<blockquote><p>$HTTP["host"] !~ "^(www|medfield)\.mirocommunity\.(com|net|org)($|:8080$)" {<br />
url.redirect = (<br />
"^(.*)$" =&gt; "http://www.mirocommunity.com$1",<br />
)<br />
}</p></blockquote>
<p>Note the negation by using !~ instead of =~.  That would work if all we had was mirocommunity sites to host, but when hitting another site on the server like www7.getmiro.com it would read it as not being www or medfield dot mirocommunity, and thus drop you to www.mirocommunity.com.  For the curious, the 8080 part of the url parsing is a bypass of the Squid proxies on ports 80 and 443.</p>
<p>Anything that doesn&#8217;t match a virtual host or alias on our server gets dropped by default to /var/www.</p>
<p>There lies the simple solution &#8212; put an index.php file there that does the redirect work:</p>
<blockquote><p>&lt;?php<br />
// Hostnames that aren't matched in Lighttpd get dropped here<br />
// by default.<br />
// This script removes the hostname(s) and drops them to<br />
// www.[domain].[tld]<br />
// 29 May 2009 MRK<br />
// explode() replaces split(), which was removed in PHP 7.<br />
$split_host = explode(".", $_SERVER['HTTP_HOST']);<br />
$domain = count($split_host) - 2;<br />
$tld = count($split_host) - 1;<br />
$new_host = "http://www.$split_host[$domain].$split_host[$tld]";<br />
// echo "$new_host";<br />
header("Location: $new_host");<br />
exit;<br />
?&gt;</p></blockquote>
<p>The <em>explode</em> function splits the $_SERVER['HTTP_HOST'] variable at each period and puts its contents, less the periods, into an array called $split_host.</p>
<p>The <em>count($split_host)</em> determines how many members we have in the $split_host array.  We know we always want the last (the top level domain &#8212; .com, etc) and second to last (the domain &#8212; mirocommunity, etc).  Since arrays start at 0, we simply count -1 for the tld and -2 for the domain.</p>
<p>By adding the <em>count</em> logic, we can handle domains like brooklyn.newyork.mirocommunity.com which have more than one hostname before the domain and tld.</p>
<p>$new_host then forms the URL we want, to catch wildcard hostnames that haven&#8217;t been configured yet.  It&#8217;s simply the www.domain.tld form.  That&#8217;s fed to an HTTP <em>header</em> which causes the user&#8217;s browser to redirect to the default website we want.</p>
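<p>The last-two-labels logic is easy to try out on its own.  Here is a minimal Python sketch of the same idea; the function name <em>default_host</em> is hypothetical, and unlike the PHP above it strips a port suffix first, which is an assumption about what you might want rather than what our script does.  Note it only suits single-label TLDs like .com; something like .co.uk would need extra handling.</p>

```python
def default_host(http_host):
    """Collapse any hostname to www.domain.tld, ignoring an optional :port suffix."""
    name = http_host.split(":")[0]   # drop ":8080" if present (assumption, see lead-in)
    labels = name.split(".")         # e.g. ["brooklyn", "newyork", "mirocommunity", "com"]
    return "http://www." + ".".join(labels[-2:])


for host in ["newyork.mirocommunity.com", "brooklyn.newyork.mirocommunity.com:8080"]:
    print(host, default_host(host))
```

<p>Slicing the last two labels does the same job as the count minus 1 and count minus 2 arithmetic, however many hostnames come before the domain.</p>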
<p>So as of today, while newyork.mirocommunity.com and brooklyn.newyork.mirocommunity.com have no virtual hosts, you do arrive successfully at www.mirocommunity.com.</p>
<p>Our developers can activate those hostnames simply by adding an entry in the appropriate lighttpd conf file and reloading lighttpd &#8212; no need to contact the sysadmin to go make an entry in our DNS system for each new city added.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.d90.us/toolbox/2009/05/29/lighttpd-virtual-hosts-and-wildcard-domains/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Installing Squid to handle both 80 and 443</title>
		<link>http://www.d90.us/toolbox/2009/05/26/installing-squid-to-handle-both-80-and-443/</link>
		<comments>http://www.d90.us/toolbox/2009/05/26/installing-squid-to-handle-both-80-and-443/#comments</comments>
		<pubDate>Tue, 26 May 2009 20:54:41 +0000</pubDate>
		<dc:creator>Matt</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Web Hosting Tools]]></category>

		<guid isPermaLink="false">http://www.d90.us/toolbox/?p=70</guid>
		<description><![CDATA[This outlines configuring Squid, running two instances, to handle both port 80 and 443 traffic on an Amazon EC2 instance running Ubuntu Jaunty.  We can bypass Squid by going directly to Lighttpd on port 8080. To answer a couple questions off the top, you should also read my post on how to configure http &#8211;&#62; [...]]]></description>
			<content:encoded><![CDATA[<p>This outlines configuring Squid, running two instances, to handle both port 80 and 443 traffic on an Amazon EC2 instance running Ubuntu Jaunty.  We can bypass Squid by going directly to Lighttpd on port 8080.</p>
<p>To answer a couple questions off the top, you should also read <a href="http://www.d90.us/toolbox/2009/05/29/squid-handling-http-https-redirects/" target="_blank">my post on how to configure http &#8211;&gt; https</a> redirects at the Squid level since the web server won&#8217;t be able to handle that in this configuration, and this <a href="http://www.d90.us/toolbox/2009/05/29/lighttpd-virtual-hosts-alternative-ports/" target="_blank">post</a> documents a little bit of magic that needs to be done to support 8080 with virtual hosts.</p>
<p>In configuring our new servers, the choice of Squid was pretty easy &#8212; it can handle SSL traffic, Varnish <a href="http://varnish.projects.linpro.no/wiki/FAQ#IsthereanywaytodoHTTPSwithVarnish" target="_blank">can&#8217;t</a> by itself.  We already use Squid to do ssl traffic on some of our physical servers being replaced, so I&#8217;d like to continue using that feature.  In a future post, we&#8217;ll talk about configuring Squid to use a Universal Certificate that can handle multiple domains on one IP (it looks doable in theory, but I haven&#8217;t purchased that yet).</p>
<p>Normally, installing Squid is a simple</p>
<p>apt-get install squid</p>
<p>However, Ubuntu doesn&#8217;t build its Squid package with OpenSSL support and, for license reasons, has no intention of doing so.  So you&#8217;re better off following <a href="http://www.d90.us/toolbox/2009/05/26/adding-ssl-support-to-squid-package-on-ubuntu/" target="_blank">these directions</a> and modifying a package to include SSL support, then installing that.</p>
<p>Modify /etc/squid/squid.conf</p>
<p>This is the port 80 traffic.  Note &#8212; we actually had a large number of &#8220;acl valid_dst dstdomain,&#8221; which block attempts to use Squid as a pass-thru proxy at the proxy level instead of having the webserver reject the traffic.</p>
<blockquote><p>http_port 80 accel vhost</p>
<p>cache_peer 127.0.0.1 parent 8080 0 no-query originserver login=PASS</p>
<p>logformat combined %&gt;a %ui %un [%{%d/%b/%Y:%H:%M:%S +0500}tl] "%rm %ru HTTP/%rv" %Hs %&lt;st "%{Referer}&gt;h" "%{User-Agent}&gt;h" %Ss:%Sh</p>
<p>access_log /var/log/squid/access.log combined</p>
<p>acl SSL_ports port 8080</p>
<p>acl Safe_ports port 8080</p>
<p>acl valid_dst dstdomain .somedomain.com</p>
<p>http_access allow valid_dst</p></blockquote>
<p>Copy squid.conf to squid_ssl.conf, comment out http_port and make the following changes:</p>
<blockquote><p>https_port 443 accel vhost cert=/(cert location) key=/(key location)<br />
cache_log /var/log/squid3/cache_ssl.log<br />
cache_store_log /var/log/squid3/store_ssl.log</p></blockquote>
<p>We have separate cache and store logs for troubleshooting, but both configurations use access.log to record traffic.  This simplifies using AWStats to analyze the logs; if we run into performance problems in the future we may need a tool like logmerge.pl to consolidate separate access logs.  While I can think of a few things that could go wrong, I don&#8217;t know that they will go wrong till we try, so let&#8217;s see if the simple way works first.</p>
<p>Now, let&#8217;s configure and initialize a separate spool for SSL traffic:</p>
<blockquote><p>mkdir /var/spool/squid3_ssl<br />
chown -R proxy:proxy /var/spool/squid3_ssl/<br />
squid3 -z -f /etc/squid3/squid_ssl.conf</p></blockquote>
<p>Copy /etc/init.d/squid3 to /etc/init.d/squid3_ssl and make the following changes:</p>
<blockquote><p>NAME=squid3_ssl<br />
SQUID_ARGS="-D -YC -f /etc/squid3/squid_ssl.conf"<br />
CONFIG=/etc/squid3/squid_ssl.conf<br />
$DAEMON -z -f $CONFIG</p></blockquote>
<p>And do a ln -s /etc/init.d/squid3_ssl /etc/rc2.d/S30squid3_ssl to make it start automatically.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.d90.us/toolbox/2009/05/26/installing-squid-to-handle-both-80-and-443/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Optimizing Website integration with Amazon&#8217;s S3 Service</title>
		<link>http://www.d90.us/toolbox/2009/02/28/optimizing-website-integration-with-amazons-s3-service/</link>
		<comments>http://www.d90.us/toolbox/2009/02/28/optimizing-website-integration-with-amazons-s3-service/#comments</comments>
		<pubDate>Sat, 28 Feb 2009 06:15:26 +0000</pubDate>
		<dc:creator>Matt</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Networking]]></category>
		<category><![CDATA[Sysadmin Tools]]></category>
		<category><![CDATA[Web Hosting Tools]]></category>

		<guid isPermaLink="false">http://www.d90.us/toolbox/?p=48</guid>
		<description><![CDATA[At Participatory Culture Foundation we use Amazon&#8217;s S3 Service to host our static content &#8212; css, js, and images. This accomplishes two things &#8212; it improves the performance for our visitors since Amazon has faster performance and reliability than we can afford on our own servers, and it does so at a lower cost. In [...]]]></description>
			<content:encoded><![CDATA[<p>At Participatory Culture Foundation we use Amazon&#8217;s S3 Service to host our static content &#8212; css, js, and images.</p>
<p>This accomplishes two things: it improves the performance for our visitors, since Amazon has faster performance and better reliability than we can afford on our own servers, and it does so at a lower cost.</p>
<p>In this post we&#8217;ll look at how much bandwidth and how many files and redirects we use without S3, then with various combinations of local and redirected files, up to code as optimized as I have been able to make it.  Fully optimized, our servers transfer only about 6.8% of the bytes that the &#8220;unoptimized&#8221; site would.  Optimizing this single popular page to use S3 efficiently saves PCF about $1,000 a year in hosting costs.</p>
<h4>1)<br />
Let&#8217;s look at a redacted Squid log after the February, 2009 redesign of www.getmiro.com when not using S3 at all:</h4>
<pre><em>"GET http://www.getmiro.com/ HTTP/1.1" 200 21781 "-" "Mozilla/5.0
 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.6) Gecko/2009011913 Firefox/3.0.6"
"GET http://www.getmiro.com//css/nav.css HTTP/1.1" 200 5004 "http://www.getmiro.com/" "Mozilla/5.0
"GET http://www.getmiro.com//css/styles.css HTTP/1.1" 200 21515 "http://www.getmiro.com/" "Mozilla/5.0
"GET http://www.getmiro.com/css/index.css HTTP/1.1" 200 7709 "http://www.getmiro.com/" "Mozilla/5.0
"GET http://www.getmiro.com//i/blue_bg.png HTTP/1.1" 200 1198 "http://www.getmiro.com//css/styles.css" "Mozilla/5.0
"GET http://www.getmiro.com//i/nav_back.gif HTTP/1.1" 200 530 "http://www.getmiro.com//css/nav.css" "Mozilla/5.0
[blah blah blah...]</em></pre>
<p>That&#8217;s 37 files, for a total of 338,359 Bytes.</p>
<h4>2)<br />
Now let&#8217;s look at what happens if we load CSS from our server, but use Apache to rewrite images and JS to the S3 service:</h4>
<pre><em>"GET http://www.getmiro.com/ HTTP/1.1" 200 21781 "http://www.getmiro.com/" "Mozilla/5.0
 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.6) Gecko/2009011913 Firefox/3.0.6"
"GET http://www.getmiro.com//css/nav.css HTTP/1.1" 200 5004 "http://www.getmiro.com/" "Mozilla/5.0
"GET http://www.getmiro.com//css/styles.css HTTP/1.1" 200 21515 "http://www.getmiro.com/" "Mozilla/5.0
"GET http://www.getmiro.com/css/index.css HTTP/1.1" 200 7709 "http://www.getmiro.com/" "Mozilla/5.0
"GET http://www.getmiro.com//i/blue_bg.png HTTP/1.1" 302 688 "http://www.getmiro.com//css/styles.css" "Mozilla/5.0
"GET http://www.getmiro.com//i/nav_back.gif HTTP/1.1" 302 690 "http://www.getmiro.com//css/nav.css" "Mozilla/5.0
[blah blah blah...]</em></pre>
<p>Now it&#8217;s four files, plus 33 redirects &#8212; and only 78,705 bytes.</p>
<h4>3)<br />
Now let&#8217;s use Apache to redirect the CSS to be pulled from Amazon S3.</h4>
<pre><em>"GET http://www.getmiro.com/ HTTP/1.1" 200 21781 "-" "Mozilla/4.0
 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
"GET http://www.getmiro.com//css/nav.css HTTP/1.1" 302 684 "http://www.getmiro.com/" "Mozilla/4.0
"GET http://www.getmiro.com//css/styles.css HTTP/1.1" 302 690 "http://www.getmiro.com/" "Mozilla/4.0
"GET http://www.getmiro.com/css/index.css HTTP/1.1" 302 688 "http://www.getmiro.com/" "Mozilla/4.0
"GET http://www.getmiro.com//i/blue_bg.png HTTP/1.1" 302 687 "http://www.getmiro.com/" "Mozilla/4.0
"GET http://www.getmiro.com//i/nav_back.gif HTTP/1.1" 302 689 "http://www.getmiro.com/" "Mozilla/4.0
[blah blah blah...]</em></pre>
<p>Still four files and 33 redirects, but down to 46,506 bytes.</p>
<p>So far it&#8217;s all been pretty standard stuff in Apache using mod_rewrite redirects.  Apache sees a CSS sheet being called, it redirects it to S3.</p>
<pre><em>RewriteRule ^/css/(.*) http://s3.getmiro.3.0.com.s3.amazonaws.com/css/$1</em></pre>
<p>And then a css sheet may have a line like this:</p>
<p><em>background: url(../i/screen_dropshadow.png) -20px -36px no-repeat;</em></p>
<p>Now an observant user may note in the logs above that I switched from using Firefox to IE.  Why?  The browsers interpret the CSS differently.</p>
<p>Firefox interprets &#8220;../i/&#8221; relative to where the CSS style sheet is LOADED from &#8212; in our case  http://s3.getmiro.3.0.com.s3.amazonaws.com/css/.<br />
Internet Explorer interprets &#8220;../i/&#8221; relative to where the CSS style sheet is CALLED from &#8212; in our case http://www.getmiro.com/css/.</p>
<p>Those familiar with unix notation know that &#8220;../i/&#8221; from &#8220;getmiro.com/css/&#8221; gets you to &#8220;getmiro.com/i/&#8221;.</p>
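<p>The two interpretations can be reproduced with any URL library.  As a small illustration (not something from our actual setup), Python&#8217;s urljoin resolves the same relative path against each base; the styles.css file name in the base URLs is just an example:</p>

```python
from urllib.parse import urljoin

RELATIVE = "../i/screen_dropshadow.png"
# Where the style sheet was finally LOADED from, after the redirect to S3.
LOADED_FROM = "http://s3.getmiro.3.0.com.s3.amazonaws.com/css/styles.css"
# Where the style sheet was originally CALLED from, on our own server.
CALLED_FROM = "http://www.getmiro.com/css/styles.css"

firefox_style = urljoin(LOADED_FROM, RELATIVE)
ie_style = urljoin(CALLED_FROM, RELATIVE)
print(firefox_style)
print(ie_style)
```

<p>The first result points at S3 directly; the second points back at our server, which then has to answer with yet another redirect for every image.</p>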
<h4>4)<br />
Now we get fancy.</h4>
<p>In implementing S3, we have a bash script which handles the synchronization between our servers and S3.  So in that script, let&#8217;s intercept the CSS sheets, run a simple sed substitution, and upload the modified files to a special location:</p>
<pre><em># getmiro css
# This substitutes ../i with http://s3.getmiro.3.0.com.s3.amazonaws.com/i in the getmiro css code
# and uploads the files to a special directory in amazon.  This is in turn re-written by Apache to point there.
# Having the full url hard coded saves tens of thousands of redirects and gigs of bandwidth.
# It's also more efficient than "php-ifying" css to do the url substitution.

  # First, copy the css to a working directory:
    cp /data/getmiro/css/*.css /scripts/getmiro_css

  # It's safer to just modify files we know about, rather than automate finding and modifying without foreknowledge:
    sed -i 's/\.\.\/i/http:\/\/s3.getmiro.3.0.com.s3.amazonaws.com\/i/g' /scripts/getmiro_css/download-features.css
    sed -i 's/\.\.\/i/http:\/\/s3.getmiro.3.0.com.s3.amazonaws.com\/i/g' /scripts/getmiro_css/index.css
    sed -i 's/\.\.\/i/http:\/\/s3.getmiro.3.0.com.s3.amazonaws.com\/i/g' /scripts/getmiro_css/nav.css
    sed -i 's/\.\.\/i/http:\/\/s3.getmiro.3.0.com.s3.amazonaws.com\/i/g' /scripts/getmiro_css/styles.css

  # And let's upload them:
    /usr/local/s3sync/s3sync.rb -r -p -v /scripts/getmiro_css/ s3.getmiro.3.0.com:css/s3_coded/</em></pre>
<p>This happens in the background, so the web developers don&#8217;t have to worry about modifying the CSS sheets to include the hard link.  It transforms lines like:</p>
<pre><em>background: url(../i/screen_dropshadow.png) -20px -36px no-repeat;
</em></pre>
<p>into</p>
<pre><em>background: url(http://s3.getmiro.3.0.com.s3.amazonaws.com/i/screen_dropshadow.png) -20px -36px no-repeat;</em></pre>
<p>It&#8217;s necessary to use the pattern \.\./i/ in sed in case a developer does hard code the amazon link.  The \. means literally a period; regexes like this otherwise use a . as a wildcard to match one character, and just ../i/ would match any two characters before /i/.</p>
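<p>To see why the escaping matters, here is a small Python sketch of the two patterns.  Python&#8217;s regex dialect is not identical to sed&#8217;s, but the meaning of an unescaped dot is the same in both; the function names are invented for the example.</p>

```python
import re

S3_IMAGES = "http://s3.getmiro.3.0.com.s3.amazonaws.com/i"


def escaped(line):
    """Replace only a literal ../i, as the sed pattern does."""
    return re.sub(r"\.\./i", S3_IMAGES, line)


def unescaped(line):
    """Replace any two characters followed by /i, which is too broad."""
    return re.sub(r"../i", S3_IMAGES, line)


relative = "background: url(../i/screen_dropshadow.png);"
hard_coded = "background: url(http://s3.getmiro.3.0.com.s3.amazonaws.com/i/screen_dropshadow.png);"

print(escaped(relative))
print(escaped(hard_coded) == hard_coded)
print(unescaped(hard_coded) == hard_coded)
```

<p>The escaped pattern rewrites the relative path and leaves an already hard-coded link alone.  The unescaped one also matches the &#8220;om/i&#8221; at the end of the S3 hostname and mangles the link.</p>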
<p>In Apache, we change the redirect to this:</p>
<pre><em>RewriteRule ^/css/(.*) http://s3.getmiro.3.0.com.s3.amazonaws.com/css/s3_coded/$1</em></pre>
<p>With this change implemented:</p>
<pre><em>"GET http://www.getmiro.com/ HTTP/1.1" 200 21781 "-" "Mozilla/4.0
 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
"GET http://www.getmiro.com//css/styles.css HTTP/1.1" 302 708 "http://www.getmiro.com/" "Mozilla/4.0
"GET http://www.getmiro.com//css/nav.css HTTP/1.1" 302 702 "http://www.getmiro.com/" "Mozilla/4.0
"GET http://www.getmiro.com/css/index.css HTTP/1.1" 302 706 "http://www.getmiro.com/" "Mozilla/4.0</em></pre>
<p>Much better!  One file and three redirects, for 23,897 bytes.  That one change represents nearly a 50% reduction in bandwidth usage on our physical servers compared with the example above, and it is only about a third of the bandwidth we would use if the CSS sheets were served locally, even with the hard links to S3 in them.</p>
<h4>5)<br />
Finally one more tweak.</h4>
<p>The default Apache redirect includes HTML code saying where a file has moved.  But this isn&#8217;t necessary &#8212; a web browser just needs the correct headers to tell it where to go.</p>
<p>So replacing the redirect to S3 we used before, we use this:</p>
<pre><em>RewriteRule ^/css/(.*) /custom_messages/css_rewrite.php</em></pre>
<p>This is css_rewrite.php:</p>
<pre><em>&lt;?php
/*
This rewrites css just using headers.  This saves about 300 bytes per
redirect -- which saves a heck of a lot of bandwidth over time when we're doing 3
css rewrites for every page view...works out to 30+ MB / day!

Invoke by:
RewriteRule ^/css/(.*) /custom_messages/css_rewrite.php
*/

$new_server = "http://s3.getmiro.3.0.com.s3.amazonaws.com/css/s3_coded/";
// Replace everything up to the last slash with the S3 prefix, keeping the file name.
$new_url = preg_replace('/^.*\//', $new_server, $_SERVER['REQUEST_URI']);

// header() returns nothing, so it is called directly rather than echoed.
header( "HTTP/1.1 301 Moved Permanently" );
header( "Location: $new_url" );
?&gt;</em></pre>
<p>This produces a very minimal redirect: under 400 bytes rather than over 700 bytes.</p>
<p>I haven&#8217;t done a complete analysis to know if this is significantly slower than a native Apache redirect; an initial review shows it is not noticeably slower for any given page load.  This step would need a very, very busy site, however, to make a meaningful performance or cost impact.  I&#8217;m noting it because there could be other situations where this type of optimization is useful, most importantly ones that use the variables Apache (or IIS) can pass to other programs like PHP.  See this <a href="http://koivi.com/apache-iis-php-server-array.php" target="_blank">link</a> for a list of them.</p>
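<p>The URL mapping in css_rewrite.php boils down to one greedy substitution.  A rough Python equivalent, with a made-up function name, looks like this:</p>

```python
import re

NEW_SERVER = "http://s3.getmiro.3.0.com.s3.amazonaws.com/css/s3_coded/"


def s3_location(request_uri):
    """Swap everything up to the last slash for the S3 prefix, keeping the file name."""
    return re.sub(r"^.*/", NEW_SERVER, request_uri)


for uri in ["/css/styles.css", "//css/nav.css"]:
    print(uri, s3_location(uri))
```

<p>Because the match is greedy, the doubled slash that shows up in some of our logged requests makes no difference; only the file name after the last slash survives.</p>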
<pre><em>"GET http://www.getmiro.com/ HTTP/1.1" 200 21781 "-" "Mozilla/4.0
 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"
"GET http://www.getmiro.com//css/styles.css HTTP/1.1" 301 397 "http://www.getmiro.com/" "Mozilla/4.0
"GET http://www.getmiro.com//css/nav.css HTTP/1.1" 301 394 "http://www.getmiro.com/" "Mozilla/4.0
"GET http://www.getmiro.com/css/index.css HTTP/1.1" 301 396 "http://www.getmiro.com/" "Mozilla/4.0</em></pre>
<p>Now just one file and three redirects, for 22,968 bytes.</p>
<h4>Bottom line?</h4>
<p>Let&#8217;s take a typical day when the getmiro.com homepage is called 20,000 times.</p>
<pre>Scenario    Size     Total Daily      Estimated
                       Bandwidth     Daily Cost**
1           338,359        6.5GB          $4.73
2            78,705        1.5GB           1.09
3            46,506        887MB*           .65
4            23,897        455MB            .33
5            22,968        438MB            .32
* Unadjusted for Firefox's interpretation of CSS paths
** This estimate is based on purchasing enough fixed bandwidth (Mbps)
to cover our peak daily usage.  Our communication costs for our
physical servers is approximately times as much as Amazon S3 based on
actual transfers.</pre>
<p>So without S3 or any optimization, we&#8217;d be looking at a monthly cost around $142.00.</p>
<p>With S3 and with all our optimization, we&#8217;re looking at a monthly cost around $54.00.</p>
<p>For a small non-profit, that&#8217;s a nice savings over time.</p>
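<p>For anyone who wants to rerun the bandwidth column with their own numbers, here is a short Python sketch of the arithmetic.  It assumes the table&#8217;s MB figures are units of 1,048,576 bytes, which is what the numbers above appear to be; it does not attempt the cost column, since that depends on our bandwidth pricing.</p>

```python
REQUESTS_PER_DAY = 20_000
# Bytes served by our own servers per homepage view, for scenarios 1 to 5.
BYTES_PER_PAGE_VIEW = {1: 338_359, 2: 78_705, 3: 46_506, 4: 23_897, 5: 22_968}


def daily_mib(bytes_per_view, requests=REQUESTS_PER_DAY):
    """Bytes served per day, expressed in units of 1,048,576 bytes."""
    return bytes_per_view * requests / 2**20


for scenario, size in BYTES_PER_PAGE_VIEW.items():
    share = size / BYTES_PER_PAGE_VIEW[1]
    print(f"{scenario}: {daily_mib(size):8.1f} MB/day, {share:.1%} of scenario 1")
```

<p>Changing REQUESTS_PER_DAY or the per-view byte counts is all it takes to model a different page or a busier day.</p>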
]]></content:encoded>
			<wfw:commentRss>http://www.d90.us/toolbox/2009/02/28/optimizing-website-integration-with-amazons-s3-service/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Creating &amp; Debugging SSL Certificates</title>
		<link>http://www.d90.us/toolbox/2007/10/03/debugging-ssl-certificate-problems/</link>
		<comments>http://www.d90.us/toolbox/2007/10/03/debugging-ssl-certificate-problems/#comments</comments>
		<pubDate>Wed, 03 Oct 2007 13:19:45 +0000</pubDate>
		<dc:creator>Matt</dc:creator>
				<category><![CDATA[Web Hosting Tools]]></category>

		<guid isPermaLink="false">http://www.d90.us/toolbox/2007/10/03/debugging-ssl-certificate-problems/</guid>
		<description><![CDATA[Generating a public SSL certificate: Information Needed: Country Name (2 letter code) [AU]: State or Province Name (full name) [Some-State]: Locality Name (e.g., city) [ ]: Organization Name (e.g., company) [Internet Widgets Pty Ltd]: Organizational Unit Name (e.g., section) [ ]: Common Name (e.g., YOUR name) [ ]: (See my email...) Email Address [ ]: Generate [...]]]></description>
			<content:encoded><![CDATA[<p>Generating a public SSL certificate:</p>
<blockquote>
<pre>Information Needed:
Country Name (2 letter code) [AU]:
State or Province Name (full name) [Some-State]:
Locality Name (e.g., city) [ ]:
Organization Name (e.g., company) [Internet Widgets Pty Ltd]:
Organizational Unit Name (e.g., section) [ ]:
Common Name (e.g., YOUR name) [ ]: (See my email...)
Email Address [ ]:</pre>
</blockquote>
<p>Generate a CSR from an existing private key:</p>
<blockquote>
<pre> openssl req -new -key private-key.pem -out /home/rfagundo/csr_2007.pem</pre>
</blockquote>
<p>==========================================<br />
Some handy commands:</p>
<p>View the key:<br />
openssl rsa -noout -text -in name.key</p>
<p>View the CSR:<br />
openssl req -noout -text -in name.csr</p>
<p>View the Certificate:<br />
openssl x509 -noout -text -in name.crt</p>
<p>The modulus (and exponent) should match between the key, the CSR, and the certificate; otherwise you get a key mismatch :)</p>
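<p>One way to compare the modulus without reading the full text dumps is the -modulus option on each of the three commands. This is a minimal sketch, assuming openssl is on the PATH and has a default config file; it builds a throwaway key, CSR, and self-signed certificate in a temp directory so it can be run anywhere, and every file name and the example.test subject are placeholders.</p>

```shell
# Assumption: openssl is on PATH; all file names below are throwaway examples.
workdir=$(mktemp -d)
key="$workdir/name.key"
csr="$workdir/name.csr"
crt="$workdir/name.crt"

# Create a test key, then a CSR and a self-signed certificate from that key.
openssl genrsa -out "$key" 2048 2>/dev/null
openssl req -new -key "$key" -subj "/CN=example.test" -out "$csr"
openssl x509 -req -in "$csr" -signkey "$key" -days 1 -out "$crt" 2>/dev/null

# Each command prints a single "Modulus=..." line for its input file.
key_mod=$(openssl rsa -noout -modulus -in "$key")
csr_mod=$(openssl req -noout -modulus -in "$csr")
crt_mod=$(openssl x509 -noout -modulus -in "$crt")

# All three must be identical, otherwise the files do not belong together.
if [ "$key_mod" = "$csr_mod" ] ; then
    if [ "$key_mod" = "$crt_mod" ] ; then
        echo "key, CSR and certificate match"
    else
        echo "certificate does not match the key"
    fi
else
    echo "CSR does not match the key"
fi

rm -rf "$workdir"
```

<p>For real files, only the three -modulus commands and the comparison are needed; point them at your own name.key, name.csr, and name.crt.</p>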
<p>==========================================<br />
Squid Reverse Proxy &amp; SSL:</p>
<p>Squid can&#8217;t handle separate chain files.  But it&#8217;s a pretty easy fix.</p>
<p>Use this sample for an Apache conf file for a site being moved to Squid:</p>
<blockquote><p>SSLEngine on<br />
SSLCertificateFile /etc/httpd/ssl/mysite/www.mysite.com.crt<br />
SSLCertificateKeyFile /etc/httpd/ssl/mysite/mysite.key<br />
SSLCertificateChainFile /etc/httpd/ssl/mysite/sf_intermediate.crt</p></blockquote>
<p>This process certainly can be modified / simplified as needed:<br />
Copy the www.mysite.com.crt to squid_www.mysite.com.crt<br />
Copy the text from sf_intermediate.crt and paste it to the bottom of squid_www.mysite.com.crt.</p>
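<p>The copy-and-paste steps above can also be done in one command. This is a sketch only: build_squid_bundle is a hypothetical helper name, not part of Squid or OpenSSL, and it assumes both input files are PEM text that ends with a newline.</p>

```shell
# Hypothetical helper: write the site certificate followed by the
# intermediate (chain) certificate into a single combined file.
build_squid_bundle() {
    site_cert=$1
    chain_cert=$2
    bundle=$3
    # Order matters: the site certificate first, the intermediate after it.
    cat "$site_cert" "$chain_cert" > "$bundle"
}

# Example call with the paths used in this post (left commented out):
# build_squid_bundle /etc/httpd/ssl/mysite/www.mysite.com.crt \
#     /etc/httpd/ssl/mysite/sf_intermediate.crt \
#     /etc/httpd/ssl/mysite/squid_www.mysite.com.crt
```

<p>The original www.mysite.com.crt is left untouched, so Apache can keep using it while Squid is being tested.</p>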
<p>Here&#8217;s the Squid SSL line that uses those:</p>
<blockquote><p>https_port 8081 cert=/etc/httpd/ssl/mysite/squid_www.mysite.com.crt key=/etc/httpd/ssl/mysite/mysite.key vhost defaultsite=www.mysite.com</p></blockquote>
<p>BTW, in the above sample&#8230;<br />
public ip:443 = Apache running SSL<br />
public ip:8081 = Squid running SSL, which then connects to 127.0.0.1:8080 on Apache *not* running SSL.</p>
<p>In production, you take public ip:443 off Apache and have Squid listen on public ip:443 instead.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.d90.us/toolbox/2007/10/03/debugging-ssl-certificate-problems/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Sun Java 5 &amp; Fedora Core 6</title>
		<link>http://www.d90.us/toolbox/2007/10/02/sun-java-5-fedora-core-6/</link>
		<comments>http://www.d90.us/toolbox/2007/10/02/sun-java-5-fedora-core-6/#comments</comments>
		<pubDate>Tue, 02 Oct 2007 15:44:27 +0000</pubDate>
		<dc:creator>Matt</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Web Hosting Tools]]></category>

		<guid isPermaLink="false">http://www.d90.us/toolbox/2007/10/02/sun-java-5-fedora-core-6/</guid>
		<description><![CDATA[Installation of Sun Java 5 on FC6 Download the rpm.bin from Sun.  When you run it, it will install the RPM too.  http://www.java.com/en/download/manual.jsp Now, update the java and javac called by default, via update-alternatives: The &#8220;1500&#8221; is the priority.  Higher number = precedence (on the machine I was installing on, the existing java was priority [...]]]></description>
			<content:encoded><![CDATA[<p>Installation of Sun Java 5 on FC6</p>
<p>Download the rpm.bin from Sun.  When you run it, it will install the RPM too.  <a href="http://www.java.com/en/download/manual.jsp">http://www.java.com/en/download/manual.jsp</a></p>
<p>Now, update the java and javac called by default, via update-alternatives:</p>
<p>The &#8220;1500&#8221; is the priority.  Higher number = precedence (on the machine I was installing on, the existing java was priority 1420)</p>
<p>And you may want to use &#8220;find / -name java&#8221;, etc to find the actual locations on your system for linking in the new versions.</p>
<p>This seems like a good reference, too:  <a href="http://www.gagme.com/greg/linux/fc6-tips.php">http://www.gagme.com/greg/linux/fc6-tips.php</a></p>
<blockquote><p># To display (and verify) settings before and after:<br />
/usr/sbin/update-alternatives --display java</p>
<p># Java itself:<br />
/usr/sbin/update-alternatives --install \<br />
/usr/bin/java java /usr/java/jdk1.5.0_12/jre/bin/java 1500 \<br />
--slave /usr/bin/keytool keytool /usr/java/jdk1.5.0_12/jre/bin/keytool \<br />
--slave /usr/bin/rmiregistry rmiregistry /usr/java/jdk1.5.0_12/jre/bin/rmiregistry</p>
<p># To display (and verify) settings before and after:<br />
/usr/sbin/update-alternatives --display javac</p>
<p># Javac:<br />
/usr/sbin/update-alternatives --install \<br />
/usr/bin/javac javac /usr/java/jdk1.5.0_12/bin/javac 1500 \<br />
--slave /usr/bin/javadoc javadoc /usr/java/jdk1.5.0_12/bin/javadoc \<br />
--slave /usr/bin/javah javah /usr/java/jdk1.5.0_12/bin/javah \<br />
--slave /usr/bin/jar jar /usr/java/jdk1.5.0_12/bin/jar \<br />
--slave /usr/bin/jarsigner jarsigner /usr/java/jdk1.5.0_12/bin/jarsigner \<br />
--slave /usr/bin/appletviewer appletviewer /usr/java/jdk1.5.0_12/bin/appletviewer \<br />
--slave /usr/bin/rmic rmic /usr/java/jdk1.5.0_12/bin/rmic<br />
# These *may* be needed&#8230;doesn&#8217;t seem to be used on the system I&#8217;m working on (i.e. no java_jdk actually exist outside of /etc/alternatives)<br />
# --slave /usr/lib/jvm/java java_sdk $sun/jdk \<br />
# --slave /usr/lib/jvm-exports/java java_sdk_exports $sun/jdk/jre/lib</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.d90.us/toolbox/2007/10/02/sun-java-5-fedora-core-6/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

