apachedude
24 Mar 2007, 03:36 PM
Original Article (http://www.askapache.com/2007/webmaster/faster-google-analytics-with-a-local-urchinjs.html)
This method uses crontab to run a shell script that downloads a fresh copy of urchin.js every 24 hours and saves it into your site's local directory. That's it!
The problem occurs because google-analytics.com/urchin.js is requested by an enormous number of web users all over the world at the same time. When that server is slow to respond, your site's pages load at a snail's pace, especially if you are using WordPress or a similar CMS.
Official Google Position (http://www.google.com/support/analytics/bin/answer.py?answer=43183&query=urchin.js&topic=&type=) on locally hosting urchin.js
Set up crontab by typing crontab -e at a unix-style command prompt (ssh), then add this line, which runs the script once a day at 12:11:
11 12 * * * /home/user/websites/urch.sh >/dev/null 2>&1
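If you would rather script the crontab change than edit it by hand, here is a minimal sketch. The function name ensure_cron_line is made up for this example; it reads an existing crontab on stdin and prints it back with the given line appended only if it is not already there, so running it twice does not create duplicates.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original article):
# reads a crontab from stdin and writes it to stdout, appending
# the line given as $1 only when no identical line exists yet.
ensure_cron_line() {
    line=$1
    existing=$(cat)
    # Echo back whatever was already in the crontab.
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # -x matches whole lines only, -F treats the entry as a literal string.
    printf '%s\n' "$existing" | grep -qxF -- "$line" || printf '%s\n' "$line"
}
```

Assuming your cron accepts a new table on stdin (most do, via "crontab -"), usage would look something like: crontab -l 2>/dev/null | ensure_cron_line '11 12 * * * /home/user/websites/urch.sh >/dev/null 2>&1' | crontab -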
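Shell script example:
#!/bin/sh
# Download a fresh urchin.js and replace the live copy only if the download succeeds,
# so a failed wget never leaves the site without the file.
DIR=/home/user/websites/askapache.com/z/j
cd "$DIR" || exit 1
if wget -q -O urchin.js.tmp http://www.google-analytics.com/urchin.js; then
    chmod 644 urchin.js.tmp
    mv -f urchin.js.tmp urchin.js
else
    rm -f urchin.js.tmp
    exit 1
fi
exit 0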
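One gap in a download-and-replace script like the one above is that a truncated or bogus download (an error page, for example) could end up being served to visitors. A rough sanity check, sketched here, is to confirm the new file is non-empty and mentions urchinTracker before putting it live. The function name looks_like_urchin and the choice of urchinTracker as a marker string are assumptions for this example, not something from the original article.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) only if the file
# given as $1 is non-empty and contains the string "urchinTracker".
# Assumption: a genuine urchin.js defines that function.
looks_like_urchin() {
    [ -s "$1" ] && grep -q "urchinTracker" "$1"
}
```

You could call it on the freshly downloaded copy (named urchin.js.new here purely for illustration) and only move it into place when the check passes: looks_like_urchin urchin.js.new && mv -f urchin.js.new urchin.js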
There are 2 pretty major things that you accomplish by hosting urchin.js locally:
You enable persistent connections.
You ensure that the correct 304 Not Modified response is sent back to your site visitors instead of re-serving the entire file.
One problem with the remotely hosted urchin.js is that the server it is served from does not allow persistent connections.
The other big reason is that even though google-analytics sets the Cache-Control headers correctly when serving urchin.js, it does not answer an If-Modified-Since request header with a 304 Not Modified response, which would indicate that the file has not changed. Instead it returns the entire urchin.js file again, which defeats the purpose of the Cache-Control header.
You can see this problem clearly with a Wireshark (http://wireshark.org) capture (http://www.askapache.com/2007/htaccess/sniff-http-to-debug-apache-htaccess-and-httpdconf.html). The request is shown first, followed by the response:
GET /urchin.js HTTP/1.1
Accept: */*
Referer: http://www.askapache.com
Accept-Language: en-us
UA-CPU: x86
Accept-Encoding: gzip, deflate
If-Modified-Since: Tue, 20 Mar 2007 22:49:11 GMT
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; SU 2.011; .NET CLR 1.1.4322; .NET CLR 2.0.50727; Alexa Toolbar; .NET CLR 3.0.04506.30)
Host: www.google-analytics.com
Connection: Keep-Alive
HTTP/1.1 200 OK
Cache-Control: max-age=604800, public
Content-Type: text/javascript
Last-Modified: Tue, 20 Mar 2007 22:54:02 GMT
Content-Encoding: gzip
Server: ucfe
Content-Length: 5675
Date: Sat, 24 Mar 2007 18:23:12 GMT
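If you don't want to fire up a packet sniffer, a conditional request can also be sent from the command line. This is only a sketch: http_status is a made-up helper name, and it simply prints the status code from the first line of whatever response headers it is fed on stdin.

```shell
#!/bin/sh
# Hypothetical helper: prints the status code (second field of the
# first line) from an HTTP response header dump read on stdin.
http_status() {
    awk 'NR == 1 { print $2; exit }'
}

# Usage sketch (needs network access, so it is left commented out):
# curl -s -o /dev/null -D - \
#     -H 'If-Modified-Since: Tue, 20 Mar 2007 22:49:11 GMT' \
#     http://www.google-analytics.com/urchin.js | http_status
```

A 304 means the server honored the conditional request; a 200 means it sent the whole file again. Pointing the same command at your locally hosted copy lets you compare the two.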
Note: You will need a caching scheme (http://www.askapache.com/2006/htaccess/speed-up-sites-with-htaccess-caching.html) on your server for optimum results.
Pretty sweet bit of overkill!