Archive for the ‘Tutorials’ Category


 

I recently purchased and installed a new SSL certificate from GoDaddy for Marrily. During the process, I learned more about SSL and the different steps needed to set everything up from scratch. There are plenty of articles and tutorials on how to get started, but surprisingly none that I found explain “why” you have to follow those steps. Truth is, I was pretty confused when I first started. There were a bunch of different steps and different key, pem, crt, and csr files that needed to be generated. The result was that I got lost and screwed up during the process. I then added insult to injury by accidentally revoking my certificate instead of re-keying it, and ended up having to call GoDaddy to undo it. Since any entrepreneur with a SaaS website will eventually need to implement SSL to protect their customers, a better understanding of SSL is greatly beneficial. This is my explanation of the entire process in plain English, in the hope that it helps clear up the confusion.

Why SSL?

To protect the communication between your web server and the client’s browser, you need an encrypted channel so that all data transferred back and forth can only be read by your server and the browser. Anyone who eavesdrops in between will just see gibberish. Only your web server and the client’s browser know the right “secrets” to unlock the encrypted messages. This communication protocol is called https, where the s stands for “secure”.

When a user requests a page via https, your server presents a certificate that identifies it, and the two sides use it to set up the encrypted channel. The browser checks that the certificate was vouched for by a well-known identity. If the certificate comes from an unknown identity, the browser will be very hesitant to accept it, and it will ask the user to make the hard decision whether to proceed or not.

Why Purchase an SSL Certificate?

To purchase an SSL certificate is to obtain a publicly verifiable identity for your domain that is accepted by all browsers. Most modern browsers include a list of well-known root Certificate Authority (CA) public keys, and any certificate signed by one of these CAs will be accepted by the browser. It is also possible to generate a root Certificate Authority key pair yourself; technically speaking, you become your own Certificate Authority. However, since your identity is unknown and not verifiable, the browser will not trust your keys and will pop up an alert to notify the user. Nonetheless, once you add your certificate to your browser’s list of accepted certificates, the browser knows about your identity and won’t bother popping up the alert anymore.

Since you can’t ask everyone to manually install your public key in their browser’s list of accepted certificates, you need to buy the certificate from an established vendor whose public key already comes bundled with the browser. I read somewhere that this is how browser vendors can make some money, i.e. the SSL guys need to pay to have their identity (the public key) included in the browser. In exchange, these SSL vendors can turn around and certify (or “sign”) anyone who wants an SSL certificate, for a fee.

If you are thinking about becoming an SSL vendor, you will need to convince all the browser makers that you’re completely trustworthy, and you will need to protect the private key used to sign SSL certificates with your life, since whoever gets their hands on it will be able to sign any SSL request, thus compromising your identity as a reputable Certificate Authority. All SSL vendors offer a warranty on their SSL certificate service, from $1,000 to $10,000 and a lot more, specifically as a statement that they keep their secret hidden really well to protect the identity of their customers’ SSL certificates.

Obtaining an SSL Certificate

Step 1: Generate your private key

To handle https requests, your web server needs to encrypt the data. Hence the first step is to generate a private key that will be used for the encryption. You can use different encryption algorithms, but an SSL vendor can ask you to use a specific method and key length. The longer the key, the stronger the encryption. If the key is too short, the bad guy can quickly run through all the possibilities and find out your private key; then he can pretend to be you. In my case, GoDaddy wants 2048 bits (256 bytes) for the private key. For personal use, a key strength of 1024 bits (128 bytes) was generally considered sufficient at the time, although 2048 bits is the safer default.

$ openssl genrsa -out private.key 2048
Generating RSA private key, 2048 bit long modulus
..............................+++
.+++
e is 65537 (0x10001)

Step 2: Generate a new SSL Request .csr file

The next step is to generate a “request” for a new SSL certificate using your private key. This request file has the extension .csr, which stands for Certificate Signing Request. It contains identity information about you (or your company) and, most importantly, where the SSL certificate will be valid: a single domain (cheapest) or any sub-domain (a.k.a. wildcard, and a bit more pricey). All of this information, together with your public key, is signed using your private key and saved to a file. The SSL vendor will then take this file and sign it to produce a valid SSL certificate that can be installed on your server.

EV SSL
If you pay more money, you can also get the identity in your SSL certificate confirmed as a legitimate business entity. This type of SSL certificate is called EV SSL (Extended Validation Certificate). Essentially, the SSL vendor verifies the identity of your company by asking you to submit your business registration paperwork, bank account, a letter from an attorney or accountant, etc., for an additional fee ($400 to $1,000). In return, you get green-bar status with your company’s name next to the browser’s address bar. The theory is that users can identify your company’s name and thus feel more secure, knowing that the website is the correct one and not a phishing site that just pretends to be your website. Most (if not all) banks and prominent businesses have this type of EV certificate to protect their identity.

To generate a new CSR from your private key, use the command:

$ openssl req -new -key private.key -out marrily.com.csr

As I mentioned, the most important bit of the CSR file is where the SSL cert should be valid, which is defined in the “Common Name” attribute. For a single domain (https://marrily.com or https://www.marrily.com), you can use either “domain.com” or “www.domain.com”, since the “www” subdomain is so commonly used that it can be omitted. See the Common Name prompt in the session below:

$ openssl req -new -key private.key -out marrily.com.csr
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:US
State or Province Name (full name) [Some-State]:
Locality Name (eg, city) []:
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Marrily
Organizational Unit Name (eg, section) []:
Common Name (eg, YOUR name) []:marrily.com
Email Address []:alexle@marrily.com

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:

I did not specify any challenge password in this case to keep everything simple.

Step 3: Submit your CSR to get an SSL Cert

Now that you have the CSR file containing your identity and the domain the SSL certificate will be valid for, you can submit it to the SSL vendor (of course, you will have to pay them first). They will take your CSR file and generate a new .crt (certificate) file using their own private key. Essentially, they “sign” your CSR file with their carefully guarded secret key. You will then get back your .crt file corresponding to the CSR, and another .crt file that belongs to the SSL vendor.
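To make the idea of “signing” a bit more concrete, here is a minimal sketch in JavaScript (Node.js, using its built-in crypto module). It is a toy model and not what a real CA runs: the vendor key pair, the csrText string, and all variable names are made up for illustration, and a real certificate is an X.509 structure rather than a plain string.

```javascript
const crypto = require("crypto");

// Hypothetical stand-in for the vendor's carefully guarded key pair.
const vendor = crypto.generateKeyPairSync("rsa", { modulusLength: 2048 });

// Hypothetical stand-in for the identity carried in a CSR.
const csrText = "CN=marrily.com, O=Marrily, C=US";

// The vendor signs the request with its PRIVATE key...
const signature = crypto.sign("sha256", Buffer.from(csrText), vendor.privateKey);

// ...and anyone holding the vendor's PUBLIC key can check that signature.
const isGenuine = crypto.verify(
  "sha256", Buffer.from(csrText), vendor.publicKey, signature);

// A tampered identity no longer matches the signature.
const isForged = crypto.verify(
  "sha256", Buffer.from("CN=evil.example, O=Marrily, C=US"),
  vendor.publicKey, signature);

console.log(isGenuine, isForged);
```

Only the holder of the private key can produce a signature that verifies, which is why the vendor guards that key so carefully, while the public key bundled with the browser is enough to check it.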

Chances are that the SSL vendor’s crt file actually contains a list of different certificates (public keys). The reason is that, more often than not, your SSL vendor is a reseller for another Certificate Authority, which can in turn be a reseller for another higher-level CA. So the first certificate belongs to your immediate SSL vendor, the one after that belongs to the higher-level CA that signed your vendor’s cert, and the cert listed after that belongs to an even higher CA that signed that CA’s cert. Essentially, it’s a chain of certificates that leads all the way up to the highest level of CA, which is a root certificate included in the browsers by default. For GoDaddy, the root CA is www.valicert.com, and for VeriSign, it is VeriSign’s own Class 3 Public Primary Certification Authority - G5.
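The walk up the chain can be sketched roughly like this in JavaScript. This is only a toy model under loud assumptions: the certificate names and the trustedRoots list are invented, and real validation also checks signatures, validity dates, and more, none of which is modelled here.

```javascript
// Toy model of a certificate chain: each entry names who it is and who signed it.
// The names below are illustrative only, not real certificate contents.
const chain = [
  { subject: "marrily.com", issuer: "Vendor CA" },
  { subject: "Vendor CA", issuer: "Intermediate CA" },
  { subject: "Intermediate CA", issuer: "Root CA" },
];

// Hypothetical list of roots that ship with the browser.
const trustedRoots = ["Root CA"];

// Walk the chain from the site certificate upwards. Each certificate must be
// issued by the next entry, and the last issuer must be a trusted root.
function chainLeadsToTrustedRoot(certs, roots) {
  if (certs.length === 0) return false;
  for (let i = 0; i < certs.length - 1; i++) {
    if (certs[i].issuer !== certs[i + 1].subject) return false;
  }
  return roots.includes(certs[certs.length - 1].issuer);
}

console.log(chainLeadsToTrustedRoot(chain, trustedRoots));             // true
console.log(chainLeadsToTrustedRoot(chain.slice(0, 2), trustedRoots)); // false
```

The second call drops the last link, so the chain stops at a CA the browser does not know, which is the situation a browser warns about.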


(notice the green bar, that’s the EV SSL which costs you some more money to obtain)

Step 4: Configure Your Web Server

Now you should have in your possession these files:

1) your private key
2) your .csr file (not used anymore)
3) your new SSL certificate provided by your vendor as a .crt file, which is valid for your domain.
4) your SSL vendor’s crt file, containing a list of different certificates.

You are now ready to configure the web server to use your private key and your new SSL certificate (which technically contains your public key) for the https-enabled website. The specific configuration differs for each web server, but the process is the same. Also, the .crt files sometimes have a “.pem” extension instead; for simplicity’s sake, the two can be treated as interchangeable here.

Nginx and GoDaddy SSL

In my case, I used nginx to serve my Rails application. I originally installed this nginx instance from source using Passenger’s installer, but SSL was not enabled by default (you can check this by running “nginx -V” and looking for --with-http_ssl_module). I re-ran Passenger’s installer and added the --with-http_ssl_module switch to the optional parameters, and everything was good to go.

One gotcha for Nginx is that you have to combine the two certs that GoDaddy gives you into one .crt file, with your SSL certificate first, then GoDaddy’s crt file (gd_bundle.crt). The browser reads this as: your SSL certificate was signed by the CA whose certificate is the next entry, which was signed by the one after it, and so on, all the way to the root CA.


$ cat www.marrily.com.crt gd_bundle.crt > marrily_combined.crt

I then added a new server{} block to listen for ssl requests on port 443. After restarting Nginx, Marrily is now ssl-protected with a green padlock.

server {
    listen          443;
    server_name     marrily.com;
    # passenger stuff

    ssl on;
    ssl_certificate         /your/ssl/folder/marrily_combined.crt;
    ssl_certificate_key     /your/ssl/folder/private.key;
}

Self-Signing your Certificate and Testing SSL Locally

Now that Marrily is https-enabled and some of the actions require SSL, I wanted to develop the site locally using SSL as well, to make sure all the logic worked correctly. For that, I needed to self-sign a new SSL certificate and install it locally.

Preparation
In my environment (Mac OS X Snow Leopard), I also have nginx installed using Homebrew. Homebrew installed nginx with SSL support by default, so no recompilation was needed. I also added a new entry to my hosts file so that I can use a fake domain to access my local site, and I’d use this fake domain in my CSR as well.

# /etc/hosts
127.0.0.1 marrilydev.com

Self-Signing a New Certificate
I generated a new private key using openssl:

$ openssl genrsa -out privatekey.pem 2048

Then I generated a CA cert using this private key:

$ openssl req -new -x509 -key privatekey.pem -out cacert.pem -days 3650
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:
State or Province Name (full name) [Some-State]:
Locality Name (eg, city) []:
Organization Name (eg, company) [Internet Widgits Pty Ltd]:
Organizational Unit Name (eg, section) []:
Common Name (eg, YOUR name) []:marrilydev.com
Email Address []:

I didn’t care about any of the details except for the Common Name field, for which I specified the fake domain.

Since the cacert.pem file was generated (a.k.a. signed) using the same privatekey.pem file, we can use it as the SSL certificate directly. All we need to do is point ssl_certificate at cacert.pem and ssl_certificate_key at privatekey.pem in the configuration:

upstream rails { server 127.0.0.1:3000; }

server {
   listen       443;
   server_name  marrilydev.com;

   ssl                  on;
   ssl_certificate      /Users/sr3d/projects/misc/ssl/cacert.pem;
   ssl_certificate_key  /Users/sr3d/projects/misc/ssl/privatekey.pem;
   ssl_session_timeout  5m;

   access_log    /Users/sr3d/projects/marrily/svn/marrily_marrily/m3/app/log/access.log;
   error_log     /Users/sr3d/projects/marrily/svn/marrily_marrily/m3/app/log/error.log;
   root          /Users/sr3d/projects/marrily/svn/marrily_marrily/m3/app/public/;

   location / {
     proxy_set_header  X-Real-IP  $remote_addr;
     proxy_set_header  X-Forwarded-For $proxy_add_x_forwarded_for;
     proxy_set_header Host $http_host;
     proxy_connect_timeout 74; # max is 75s
     proxy_redirect off;

     # Proxy to Backend
     if (!-f $request_filename) {
        proxy_pass http://rails;
        break;
     }
   }
}

(Note: locally I have the nginx proxy all traffic to the development server running on port 3000)

Also, since Mac OS X has special restrictions for port 80 and port 443, nginx must run with sudo to listen on port 443; otherwise it fails silently and you won’t be able to reach the site via https.

Getting Rid of SSL Warnings by Installing the Self-Signed Cert
With nginx configured to listen to secured requests, I opened up the site in Chrome, and saw a huge red error message complaining about the validity of the certificate, since Chrome did not recognize the identity of the cacert.pem. Obviously I could just ignore the warning and proceed to the https site for the current session, but there’s a better solution: add the cacert.pem to the list of approved certificates.

To install the self-signed certificate, just double click on the cacert.pem file in Finder. The cert would be added automatically to Keychain Access.

With the cert added to Keychain, the browsers that rely on the system keychain (Chrome and Safari, for instance) will gladly accept an https connection to https://marrilydev.com.

Summary

  • An SSL certificate is not all that confusing once you understand the gist of it and why each file is needed
  • The process in simple steps:
    • Generate a new private key for encryption
    • Using this private key, generate a CSR containing the domain information for the SSL certificate
    • Submit the CSR file to the SSL vendor to obtain a new CRT certificate file
    • Configure your web server to listen for https traffic on port 443 using the private key from step 1 and the CRT obtained from the vendor
  • GoDaddy has different pricing on their SSL stuff, so search around and don’t pay full price.
  • SSL is cheap; implement it to protect your customers and gain their trust
  • If you’re gzipping your site, you should add this line to your nginx conf file:
    gzip_buffers 16 8k; to make sure nginx doesn’t lose large gzipped JS or CSS

 

For a Rails/SQL Server application I’m working on, I had to deal with pagination for custom queries because of the different joins. The mislav-will_paginate plugin works great for MySQL, but for SQL Server, the paginated query generated by the current SQL Server adapter (I’m using activerecord-sqlserver-adapter-1.0.0.9250) does not work very well. The current implementation really targets SQL Server 2000 and older versions, since these versions do not support the ROW_NUMBER() function. It is a major pain in the butt to do pagination with these databases. With the newer SQL Server 2005, the job is a bit easier: Microsoft added the ROW_NUMBER() function to better support pagination, but it is still a drag because of the convoluted syntax.

Semergence wrote on his blog about patching the SQLServerAdapter to support pagination. Based on his post, I improved ActiveRecord::ConnectionAdapters::SQLServerAdapter::add_limit_offset! to make it work in a more general way with free-form queries, e.g. queries run with the paginate_by_sql() method provided by mislav-will_paginate.

Include this script in your environment.rb file, or an external file and “require” the file within environment.rb.

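  # Monkey-patch SQLServerAdapter to support SQL Server 2005-style pagination
  module ActiveRecord
    module ConnectionAdapters
      class SQLServerAdapter
        def add_limit_offset!(sql, options)
          puts sql # debug: the query before the rewrite
          options[:offset] ||= 0
          options_limit = options[:limit] ? "TOP #{options[:limit]}" : ""
          # Reuse the query's own ORDER BY for ROW_NUMBER(); otherwise fall
          # back to ordering by the id of the first table after FROM
          options[:order] ||= if order_by = sql.match(/ORDER BY(.*$)/i)
                                order_by[1]
                              else
                                sql.match('FROM (.+?)\b')[1] + '.id'
                              end
          # ORDER BY is not allowed inside the sub-select, so strip it
          sql.sub!(/ORDER BY.*$/i, '')
          # Wrap the query: the inner SELECT numbers the rows, the outer one applies TOP n
          sql.sub!(/SELECT/i, "SELECT #{options_limit} * FROM ( SELECT ROW_NUMBER() OVER( ORDER BY #{options[:order] } ) AS row_num, ")
          # Skip the rows before the requested offset
          sql << ") AS t WHERE row_num > #{options[:offset]}"
          puts sql # debug: the query after the rewrite
          sql
        end
      end
    end
  end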

The method above monkey-patches the SQLServerAdapter by overwriting the add_limit_offset! method.

Here’s a custom query that I used and the transformed result:

Resource.paginate_by_sql([
      %!SELECT  resources.*
        	,skills_count.skill_count
        FROM resources
        	,(
        		SELECT resource_id
        			, COUNT(*) AS skill_count
        		FROM resource_skills
            WHERE meta_skill_id IN (1,2,3,4,5,6,7,8,9,10)
        		GROUP BY resource_id
        	) AS skills_count
        WHERE resources.is_active = ?
          AND resources.id = skills_count.resource_id
        ORDER BY skill_count DESC
      !, true ], :page => page, :per_page => per_page)

With :page = 1, :per_page = 2, the resulting SQL is:

SELECT TOP 2 * FROM ( SELECT ROW_NUMBER() OVER( ORDER BY skill_count DESC ) AS row_num, resources.*
 	,skills_count.skill_count
 FROM resources
 	,(
 		SELECT resource_id
 			, COUNT(*) AS skill_count
 		FROM resource_skills
 WHERE meta_skill_id IN (1,2,3,4,5,6,7,8,9,10)
 		GROUP BY resource_id
 	) AS skills_count
 WHERE resources.is_active = 1
 AND resources.id = skills_count.resource_id

 ) AS t WHERE row_num > 0

will_paginate’s COUNT query is:

SELECT COUNT(*) FROM (
 SELECT resources.*
 	,skills_count.skill_count
 FROM resources
 	,(
 		SELECT resource_id
 			, COUNT(*) AS skill_count
 		FROM resource_skills
 WHERE meta_skill_id IN (21,22)
 		GROUP BY resource_id
 	) AS skills_count
 WHERE resources.is_active = 1
 AND resources.id = skills_count.resource_id
 ) AS count_table

The ORDER BY part is automatically removed from the main query (which becomes a sub-select) by the plugin to speed up the query. This in turn sanitizes the SQL so that SQL Server doesn’t complain about a nested “ORDER BY” within a sub-select. Neat!

The only catch with the current add_limit_offset! is that it does not support aliasing, because aliases confuse the regex that parses out the ORDER BY condition for the OVER() part of the query.

For regular find() queries, here’s a sample result

Resource.find(:first)
# original query:  SELECT * FROM resources
# transformed:   SELECT TOP 1 * FROM ( SELECT ROW_NUMBER() OVER( ORDER BY resources.id ) AS row_num, * FROM resources ) AS t WHERE row_num > 0

Hope this helps and cheers!

 

I randomly ran into Steven Levithan’s blog while searching for how JavaScript handles Unicode. Steven is a JavaScript, Unicode, and regular expression expert. He has a cool section called “Code Challenge” with some good food-for-thought challenges. It’s really JavaScript being pushed to the max in terms of brevity, creativity, and obscurity. Check out Steven’s “Roman Numeral Convert” challenge, for example.

Reading through the comments, I picked out a nugget explaining a JavaScript behavior that actually caused me some unexpected issues with TubeCaption’s Captionizer. Steven explained it best in his original comment:

… you might have already realized this, but the unary + operator and parseInt are not equivalent. + can convert strings to numbers, and returns NaN if the element cannot be converted. parseInt (which takes an optional second argument for the radix) does the same thing, but also extracts leading numbers from strings. E.g., parseInt("12x") returns 12, while +"12x" returns NaN. Additionally, parseInt and + make different assumptions about the radix when there’s a leading zero. +"012" returns 12, but parseInt("012") returns 10. The leading zero causes parseInt to treat it as an octal number in probably all browsers, despite octals being summarily deprecated in ES3. Of course, you can use parseInt("012",10) to get around that.

Here is a quick demo of how parseInt() behaves.

For the SRT import feature of TubeCaption’s Captionizer, I relied heavily on parseInt() to get the different time values. I was caught by surprise when a user notified me that his SRT file could not be imported into the timeline. After some debugging, it turned out that some values were zero-padded, and parseInt() read them as octal instead of decimal and returned incorrect results. I wish I had known about the “+” trick and this subtlety of JavaScript at the time.
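As a rough illustration of the fix, here is a minimal sketch of parsing a zero-padded SRT timestamp with an explicit radix. The srtTimeToMs helper and its regular expression are assumptions made up for this example, not the Captionizer’s actual code. Engines that follow ES5 or later no longer treat a leading zero as octal in parseInt, but passing the radix is safe either way.

```javascript
// Hypothetical helper: convert an SRT timestamp such as "00:01:08,090"
// into milliseconds. Passing the radix 10 explicitly keeps zero-padded
// fields like "08" and "090" from ever being read as octal.
function srtTimeToMs(timestamp) {
  var match = /^(\d{2}):(\d{2}):(\d{2}),(\d{3})$/.exec(timestamp);
  if (!match) return NaN;
  var hours = parseInt(match[1], 10);
  var minutes = parseInt(match[2], 10);
  var seconds = parseInt(match[3], 10);
  var millis = parseInt(match[4], 10);
  return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis;
}

console.log(srtTimeToMs("00:01:08,090")); // 68090
console.log(parseInt("12x", 10));         // 12
console.log(+"12x");                      // NaN
console.log(+"08");                       // 8
```

The last three lines show the difference Steven describes: parseInt extracts the leading digits and ignores the rest, while unary + rejects the whole string but reads zero-padded values as plain decimals.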

 

This is a quick summary for this process so that I can refer to it later on, and hopefully someone will find it useful as well.

Memcached requires libevent to handle its network IO. The libevent bundled in the standard yum repository is old, so it’s pretty useless. The newer versions of memcached run on the newer libevent library, so I ended up compiling libevent and memcached from the latest stable sources. I’m using libevent-1.4.4-stable and memcached-1.2.5.

First off, uninstall the libevent that yum may have installed on your machine

# sudo yum remove libevent

Download the sources for libevent and memcached, unzip them (# gunzip *.gz), untar them (# tar -xvf *.tar), and cd into the libevent folder. We will compile libevent first.

# ./configure --prefix=/usr/local

# make

# make install

Basically we are telling libevent to install itself under /usr/local/lib/. When we compile memcached, we need to point it to the same location. Once libevent is done installing (it’s really quick), we can move on and compile memcached.

cd into the untarred memcached folder:

# ./configure --with-libevent=/usr/local/

# make

# make install

After memcached is installed, you can try

# memcached

In my situation, I ran into an error

error while loading shared libraries: libevent-1.4.so.2: cannot open shared object file: No such file or directory

It turned out that when the new libevent gets installed, it doesn’t “register” the actual library file (similar to a DLL on Windows) with the system. When memcached runs, it looks for the libevent-1.4.so.2 file, but since libevent is still playing hide and seek somewhere, memcached cries.

To fix this, we need to make the libevent library visible to the system’s dynamic linker via the ld configuration. From the man page of ld:

ld combines a number of object and archive files, relocates their data and ties up symbol references. Usually the last step in compiling a program is to run ld.

I like to think of this as the regsvr32 step used to register DLLs on Windows. Now, to fix up the reference to the libevent .so file, we need to create a file under /etc/ld.so.conf.d/

# vi /etc/ld.so.conf.d/libevent-i386.conf

then enter

/usr/local/lib/

Write and quit (:wq!)

The path in libevent-i386.conf is the path where the actual .so files are located. We set this path when we ran ./configure --prefix=/usr/local during the libevent compilation. Reload the ld configuration with

# ldconfig

Now you can start memcached in verbose mode (-vv) for testing

# memcached -vv

If you see something like ..

slab class 1: chunk size 104 perslab 10082
slab class 2: chunk size 136 perslab 7710
slab class 3: chunk size 176 perslab 5957
slab class 4: chunk size 224 perslab 4681
slab class 5: chunk size 280 perslab 3744

….

slab class 37: chunk size 367192 perslab 2
slab class 38: chunk size 458992 perslab 2
<6 server listening
<7 send buffer was 126976, now 268435456
<7 server listening (udp)

Congratulations! Memcached is up and running!

PS:  I’m renting the VPS from www.slicehost.com and so far my experience with them ( 1.5 months) is excellent.

 

I’ve been working extensively with JavaScript for the past few weeks, developing an all-JavaScript application (the code base already exceeds 2000 lines of JavaScript). Since the application needs to scale well with a large number of dynamically generated elements, I have discovered, learned, and used quite a few shortcuts to boost speed, minimize memory usage with mem-leak checks, and reduce file size in general while still maintaining the readability of the code.

This article is mainly about the nuggets that I extracted from my experience. I learned a few things from reading the source of PrototypeJS and Scriptaculous, and other things (ideas, concepts) from different languages such as Ruby (famous for one-liners). I have to admit, nothing beats reading other people’s source code. Prototype is extremely well-written and there are tons of programming gems that one can learn from.

A few tricks are possible only in JavaScript because it is NOT Java, or C#, or C, and it is extremely powerful and flexible. With the Prototype Kool-Aid, object-oriented JavaScript is not only possible, but easy, standardized, and straightforward to implement.

Tip 0: Use a JavaScript framework
Pick a framework and use it. Any framework is *much* better than no framework at all. Different browsers behave differently, and the framework will help smooth out those wrinkles. You are simply insane if you are not using one.

Our menu comes with several choices, so it’s a matter of personal preference. The learning curve will be a bit steep, as you will have to Google a lot for the APIs, but once you are familiar and comfortable, you will become a much better scripter.

  • Prototype: Easy to use. One of the first to popularize the idea of frameworks. Lowest learning curve with virtually NO dependencies (1 file, ~130Kb)
  • jQuery: Almost feels like writing JavaScript in short-hand notation. Will take some time to get used to the syntax. The core library is very small (~30Kb)
  • YUI: There’s a lot on offer in this framework. Gets quite large (ZIP file: ~10MB!). I just don’t like everything being stuffed inside the YUI namespace, even when I register my own namespace.

Tip 1: Format your code well.
Nothing is worse than writing a large (or any) application with poor code readability. As the application gets more complex, as it did in my case, being able to scan through the code and pinpoint the hot spot to fix is extremely helpful. Simply by indenting the code consistently with NO EXCEPTION and holding yourself to a high coding standard, your code will look pretty, work well, and have fewer bugs. The key is to do this as you code, not after everything has been done. You think you won’t ever look at your code again once it’s done? Well, YOU ARE WRONG!

Once you are in the habit of formatting the code yourself, you become more aware of the “beauty” of your code. You’ll start seeing cool things, similar to how the guys in The Matrix see our world. You’ll become Neo and fly through your code and feel its force field.


Tip 2: Having a high coding standard.
The reason you have to keep your code in order and of a high standard with no exception is that I believe in the Broken-window theory (wikipedia). Once there is a sub-standard spot in your code and you ACCEPT that it is there, you’ll be more likely to repeat it and produce more sub-standard code. Nobody vandalizes your code better than you do. Soon, it will become an unmanageable, unreadable mess. Imagine debugging that kind of code. I’d be cursing all day long.

I highly recommend the Code Complete book. It is thick enough to use as a pillow, but it is a mandatory read if you are serious about programming.


Tip 3: Explicitly write out your coding style and standard.
This is essential when you are working in a team with other people. Also, once you have established a coding style guideline, you will spend less time thinking about formatting, how to name variables, and how to organize your code. I fondly remember DHH’s words: constraint is liberation. Yup, having certain constraints frees you up to do other, more productive programming.


Tip 4: Namespacing your code
If you are developing something more complicated than an alert box, put your code inside namespaces to avoid cluttering up the global scope, reduce potential conflicts from accidental overwrites by other scripts, and organize your code much better.

Repeat after me: Namespace your code, it’s easy. Do it.

Example:
Our application is a JavaScript drawing program called JsPaint. We can separate the code into different namespaces as follows:

/* Include in your first included JS file */
var JsPaint = {
  Data: {}
  ,UI: {}
  ,Util: {}
};

To add more methods or classes to JsPaint.UI:

JsPaint.UI = {
  Menu: {}
  ,MenuItem: {}
  ,Canvas: {}
  ,getCanvas:  function() { /* return reference to the canvas here */ }
};

To add a Renderer sub-namespace later on to the JsPaint.UI:

JsPaint.UI.Renderer = {
  Simple: function() {} /* class */
  ,Fancy:  function() {} /* class */
};

/* define shortcut to Renderer */
var jsr = JsPaint.UI.Renderer;
var simpleRenderer = new jsr.Simple();

Namespaces are awesome for organizing your code. However, if you have lots of levels within your namespace, you can, and should, use aliases for the different namespaces. By aliasing, you also reduce the number of “dot” operations needed to query the object, which makes property lookups faster.
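One thing to watch out for: assigning a whole object literal to an existing namespace replaces whatever was already in it. A minimal sketch of one way around that is a small helper that only creates the levels that are missing. The namespace() function below is a hypothetical helper written for this example, not part of Prototype or any other framework.

```javascript
var JsPaint = { Data: {}, UI: {}, Util: {} };

// Hypothetical helper: walk a dotted path under `root`, creating any
// missing level and reusing the ones that already exist.
function namespace(root, path) {
  var parts = path.split(".");
  var current = root;
  for (var i = 0; i < parts.length; i++) {
    if (typeof current[parts[i]] === "undefined") {
      current[parts[i]] = {};
    }
    current = current[parts[i]];
  }
  return current;
}

JsPaint.UI.getCanvas = function () { return "canvas"; };

// Adds JsPaint.UI.Renderer without replacing JsPaint.UI itself,
// and returns it so it can double as the short alias.
var jsr = namespace(JsPaint, "UI.Renderer");
jsr.Simple = function () {};

console.log(typeof JsPaint.UI.getCanvas);  // function
console.log(jsr === JsPaint.UI.Renderer);  // true
```

Because the helper returns the innermost object, the same call both makes sure the namespace exists and gives you the alias.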


Tip 5: Putting comma-separator at the beginning of the line.
I ran into this post on Thomas Fuchs’s blog a couple of days ago. Thomas is the guy behind Scriptaculous. I couldn’t help but smile.

Internet Explorer is very picky about the object literal (JSON-style) format (I think this is good in a way, as it forces you to pay attention to the code). If you leave an extra trailing comma, IE will silently crash, while good old Firefox still works just fine.

Example:

var ieWontLikeThis = {
  isIEBadForYou:  true
  ,hoursWastedOnDebuggingIE:  function() { throw "Number Out Of Range";}
  ,noticeTheTrailingCommaBelow: true
  ,
}

The extra comma will confuse IE so much that it throws up.

My solution is to always put the comma right before each property, so that you won’t end up forgetting about an extra comma. The code may look a little ugly and weird at first, but you’ll save yourself lots of pain and aggravation down the road. Moreover, you can comment out a property without re-formatting the rest of the code. If I don’t want the hoursWastedOnDebuggingIE() method, I only need to comment out its line. If you put the comma at the end of the previous line, you have to remember to remove that too. Not fun.
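As a small sketch of that last point (the settings object and its property names are made up for illustration), a property in the middle or at the end can be switched off by commenting out a single line:

```javascript
var settings = {
  width: 100
  ,height: 200
  // ,debug: true   // switched off by commenting out this one line
  ,title: "demo"
};

console.log(Object.keys(settings).join(",")); // width,height,title
```

The one exception is the very first property: it carries no leading comma, so commenting it out means fixing up the line after it.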

Bonus:
This coding style with the comma or the separator as the prefix is also useful for other languages as well, especially for SQL statements. It helps with 1) List of Columns to be SELECT’ed, 2) WHERE conditions, 3) Different temp variables in Common Table Expression (SQL Server 2005). (Besides that, I also force myself to indent the code in a consistent way, indentation matters!).

SELECT
  p.title
  ,p.description
  ,p.is_published
--  ,p.category_id   /* don't need this column*/
  ,a.user
  ,a.author_id
  ,p.created_at
FROM posts p
INNER JOIN author a
  ON p.author_id = a.author_id
WHERE a.user = 'alexle'
  AND p.type = 'post'
  AND p.is_published = true
--  AND p.category_id = 1   /* this line can be commented out without affecting the previous code */
ORDER BY p.created_at DESC
--   AND p.category_id  /* this line also can be commented out without affecting the code */

Here is another example, from writing and debugging Common Table Expressions. It comes directly from my experience writing a humongous CTE in SQL Server 2005:

WITH firstCTE AS (

)
-- SELECT * FROM firstCTE  /* uncomment this line to debug the first CTE */
, secondCTE AS (

)
-- SELECT * FROM secondCTE  /* uncomment this line to debug the second CTE */
, thirdCTE AS ( 

)
-- SELECT * FROM thirdCTE  /* uncomment this line to debug the third CTE */

SELECT *
FROM firstCTE
INNER JOIN secondCTE
  ON ...
INNER JOIN thirdCTE
  ON ...

Tip 6: Having fun with Function Arguments
If you have to write functions with lots of arguments, well, you don’t have to. Instead, pass an array (or a hash, to be exact) of parameters as the argument and have your code handle it intelligently. You gain flexibility, as the parameter list can change at any time. This technique is used extensively throughout Prototype, Scriptaculous, and other frameworks such as jQuery or ExtJs.

JsPaint.UI.Canvas = function( options ) {
  var defaultOptions = {
    canvasId:     'canvas_container'
    ,width:       100
    ,height:      200
    ,menuItems:   [ "File", "Edit", "View", "Help" ]
  };  

  /* merging default options with the passed in options  */
  this.canvasId   = options ? ( options.canvasId || defaultOptions.canvasId ) : defaultOptions.canvasId;
  this.width      = options ? ( options.width || defaultOptions.width ): defaultOptions.width;
  this.height     = options ? ( options.height || defaultOptions.height ): defaultOptions.height;
  this.menuItems  = options ? ( options.menuItems || defaultOptions.menuItems ) : defaultOptions.menuItems;
};

I use a shortcut with the ternary operator ? : and the “or” operator || to quickly merge the options with the defaultOptions.


Tip 7: Using or operator || for Coalescing nullable values
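If you need to choose the first not-null value from a list, you can use the || (‘or’) operator. Besides its use as a logical boolean operator, || works as a fall-through-if-null operator: it returns the first operand that is not “falsy”. Keep in mind that null and undefined are not the only falsy values; 0, the empty string, NaN, and false fall through as well.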

var width = options.width || defaultOptions.width || 100;

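It checks options.width first; if that is null (or otherwise falsy), defaultOptions.width is used, and finally 100 if both of the previous values are null. Note that options and defaultOptions themselves must be defined, otherwise the property lookup throws an error.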

In SQL Server and MySQL, you can use the function COALESCE( csv_list_of_variables ) to do the same, i.e. get the first non-null value.
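
Back in JavaScript, one thing to watch out for is that || skips every falsy value, not only null. The sketch below shows the difference; firstDefined() is a hypothetical helper written for this example and is not part of any library:

```javascript
/* Hypothetical helper: return the first argument that is neither null nor undefined */
function firstDefined() {
  for (var i = 0, len = arguments.length; i < len; ++i) {
    if (arguments[i] !== null && typeof arguments[i] !== 'undefined') {
      return arguments[i];
    }
  }
  return null;
}

var options        = { width: 0 };    /* the caller explicitly asks for a width of 0 */
var defaultOptions = { width: 100 };

/* 0 is falsy, so || falls through to the default */
var viaOr = options.width || defaultOptions.width;

/* the helper only skips null and undefined, so the 0 is kept */
var viaHelper = firstDefined( options.width, defaultOptions.width );

console.log( viaOr, viaHelper );
```

If 0 or an empty string is a legitimate value for one of your options, a check like the one in firstDefined() is the safer choice.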


Tip 8: Using ternary operator to reduce the amount of if-else in your code
I love the ternary operator. It may look cryptic at first, but after a while you’ll find your code gets quite a bit shorter while still remaining readable with proper formatting. Nested ternary operations can be hard to read, so it’s a matter of balancing shorthand against readability.
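
A quick sketch of the idea. The function names and thresholds are invented for illustration:

```javascript
/* The if-else version */
function labelWithIfElse( isEnabled ) {
  var label;
  if( isEnabled ) {
    label = 'Enabled';
  } else {
    label = 'Disabled';
  }
  return label;
}

/* The same logic with the ternary operator */
function labelWithTernary( isEnabled ) {
  return isEnabled ? 'Enabled' : 'Disabled';
}

/* A nested ternary stays readable if each condition gets its own line */
function sizeName( width ) {
  return ( width < 100 ) ? 'small'
       : ( width < 500 ) ? 'medium'
       : 'large';
}

console.log( labelWithTernary( true ), sizeName( 250 ) );
```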


Tip 9: Function Arguments, enhanced with Prototype
Do you see how ugly our previous code for getting the values from the arguments was? If you are using Prototype, you can replace all the ugly manual merging with a one-liner:

JsPaint.UI.Canvas = function( options ) {
  var defaultOptions = {
    canvasId:     'canvas_container'
    ,width:       100
    ,height:      200
    ,menuItems:   [ "File", "Edit", "View", "Help" ]
  };  

  /* merge defaultOptions with options */
  options = Object.extend( defaultOptions, options );
  this.canvasId   = options.canvasId;
  this.width      = options.width;
  this.height     = options.height;
  this.menuItems  = options.menuItems;
};

Done! Object.extend() merges all the options together so you can have fun writing other productive code.
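
If you are curious what such a merge boils down to, here is a simplified stand-in written in plain JavaScript. The extend() function below is my own sketch for illustration, not Prototype’s actual source:

```javascript
/* Simplified stand-in for a merge helper: copy every property of source onto destination */
function extend( destination, source ) {
  for( var property in source ) {
    destination[ property ] = source[ property ];
  }
  return destination;
}

var defaults = { width: 100, height: 200 };

/* copy the defaults into a fresh object first, then overlay the caller's options */
var merged = extend( extend( {}, defaults ), { width: 0 } );

console.log( merged.width, merged.height, defaults.width );
```

Two things are worth noticing. The destination object is modified in place, so merging into a fresh {} keeps the defaults untouched. And unlike the || approach, an explicit value such as 0 survives the merge.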


Tip 10: Optimize Loops
If you need to iterate through a large array, you should squeeze every bit of speed by caching the length. There’s always a trade-off between speed and storage, and if you want speed, you must trade RAM for it. The Prototype’s API page for Array has an excellent example:

// Custom loop with cached length property: maximum full-loop performance on very large arrays!
for (var index = 0, len = myArray.length; index < len; ++index) {
  var item = myArray[index];
  // Your code working on item here...
}

Tip 11: Object Inheritance made easy with Prototype
Since version 1.6, Prototype gives you full support for inheritance in your object-oriented JavaScript application. JavaScript has always supported inheritance through prototypes, but without a built-in class syntax it has never been easy to work with. Before, I had to write hacks and ended up with something monstrous, ugly, and kludgy. With Prototype, you get inheritance virtually for free using Object.extend() and Class.create().

To read more about inheritance, consult Prototype’s Tutorial page on Inheritance.


Tip 12: Leveraging Prototype’s Internal Methods
I highly recommend reading through Prototype’s source code. The file contains more than 4000 lines of high-quality JavaScript and you will learn a lot from it. One example is the Event.KEY_* attributes (~ line 3700), which contain the numerical codes for various keys so you don’t have to look them up or re-define them. Another example is the Prototype.emptyFunction attribute, which is an empty function. I often use emptyFunction in my base classes to make it clear that these methods are virtual methods, to be overridden by the sub-classes.

Example:

JsPaint.Shape.ShapeBase = Class.create( {
  initialize: function() {}

  /* only the object knows how to draw itself */
  ,draw: Prototype.emptyFunction

  /* only the object knows how to refresh the screen */
  ,refresh: Prototype.emptyFunction

} );

JsPaint.Shape.Rectangle = Class.create( JsPaint.Shape.ShapeBase, {
  initialize: function( $super, width, height, x, y ) {
    /* calling parent's constructor */
    $super();
  }

  ,draw: function( $super ) {
    /* drawing logic here */
  }

  ,refresh: function( $super ) {
    /* refresh logic here */
  }
} );

JsPaint.Shape.Square = Class.create( JsPaint.Shape.Rectangle, {
  initialize: function( $super, side, x, y ) {
    /* calling Rectangle's constructor */
    $super( side, side, x, y);
  }
  /* draw() and refresh() stay the same */
} );

Tip 13: Implementing Enumerable methods for your custom collections
For my application, I needed to implement a generic sorted list that behaves like a regular array while being optimized for speed. I set out to write a hybrid data structure that uses binary sorting to keep the array in order at all times. I’ll publish the code in a separate post, but the interesting idea I want to share is the implementation of methods from the Enumerable interface, which makes working with the custom sorted list a lot more enjoyable.

From Prototype’s API page for Enumerable, the methods are

  • all
  • any
  • collect
  • detect
  • each
  • eachSlice
  • entries
  • find
  • findAll
  • grep
  • inGroupsOf
  • include
  • inject
  • invoke
  • map
  • max
  • member
  • min
  • partition
  • pluck
  • reject
  • select
  • size
  • sortBy
  • toArray
  • zip

I did not implement many of these methods for my collection (I didn’t need them all), but my favorite ones are each(), first(), and last() (the last two are not listed, but they are still my favorites). Enumerable is a great time saver; implement it if you can.
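
The reason this pays off is that most of these methods can be written purely in terms of each(). Below is a library-free sketch of that idea; MiniEnumerable and NumberBag are hypothetical names made up for this example and are not Prototype’s implementation:

```javascript
/* Hypothetical mixin: every method here relies only on this.each() */
var MiniEnumerable = {
  map: function( iterator ) {
    var results = [];
    this.each( function( item, index ) { results.push( iterator( item, index ) ); } );
    return results;
  }
  ,findAll: function( iterator ) {
    var results = [];
    this.each( function( item, index ) { if( iterator( item, index ) ) results.push( item ); } );
    return results;
  }
  ,size: function() {
    var count = 0;
    this.each( function() { ++count; } );
    return count;
  }
};

/* Hypothetical collection that only knows how to iterate over itself */
function NumberBag( items ) { this.items = items; }
NumberBag.prototype.each = function( iterator ) {
  for( var i = 0, len = this.items.length; i < len; ++i ) iterator( this.items[ i ], i );
};

/* mix the derived methods into the collection */
for( var method in MiniEnumerable ) NumberBag.prototype[ method ] = MiniEnumerable[ method ];

var bag     = new NumberBag( [ 1, 2, 3, 4 ] );
var doubled = bag.map( function( n ) { return n * 2; } );
var evens   = bag.findAll( function( n ) { return n % 2 === 0; } );

console.log( doubled.join( ',' ), evens.join( ',' ), bag.size() );
```

Write one iteration method, and the rest of the interface comes along almost for free.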


Tip 14: A better Enumerable#each() implementation
The current implementation of Enumerable#each() in Prototype is not as nice as it could be, and it’s cumbersome to break out of the loop. I’m greedy and I want everything a native for-loop can offer: the item, its index, the ability to break and continue, plus the peace of mind of not worrying about index-out-of-range issues.

Here is my so-called better each() implementation:

  each: function( iterator ) {
    var temp;
    for( var i = 0, len = this.collection.length; i < len; ++i ) {
      temp = iterator( this.collection[i], i );
      if( temp === false )
        break;
    }
  }

The trick is to only break when the iterator explicitly returns false.

Example Usage:

var list = new SortedList ();
/* populate the list here */

/* iterating ... */
list.each( function( item, index ) {
  /* process here */
  item.process();

  /* skipping ...*/
  if( continueConditionIsTrue )
    return; /* this returns 'undefined' to each(), which moves on to the next item */

  /* break out early */
  if( breakOutConditionIsTrue )
    return false;

} );

Tip 15: Passing function reference: The Strategy Pattern
Once you realize how flexible JavaScript is, you begin to have fun with it. Passing a function reference is a neat trick that can be very useful. If you look at the implementation of each() above, you see that the iterator function is passed in as an argument. The iterator is then invoked directly via that argument, just as you would normally execute a function. Neato!

I used this trick in my SortedList implementation and love it. The items stored in the list are custom objects that do not implement an IComparable interface (you can’t call item.compareTo( otherItem )), yet to sort them I need to be able to compare two objects directly. So I implemented a comparator that contains (or encapsulates) the business logic for the comparison. The SortedList doesn’t care how the objects are constructed; as long as the comparator gives correct comparison results, the list is guaranteed to work.

Here is the constructor for my SortedList. I used Prototype Class.create() to create a new class and the initialize() method is the constructor.

/* I use Prototype */
var SortedList = Class.create( {
  initialize: function( options ) {
    this.collection = [];

    var defaultOptions = {
      comparator: function(a,b) { return ( a < b ) ? -1 : ( a == b ) ? 0 : 1; }
      ,jsonifier: function(a) { return a.toJSON() }
    };

    this.comparator = options ? options.comparator || defaultOptions.comparator : defaultOptions.comparator;
    this.jsonifier  = options ? options.jsonifier || defaultOptions.jsonifier : defaultOptions.jsonifier;
  }
} );

I have a default comparator which can be used for primitive values (numbers, strings, etc.), or I can pass in a custom comparator in the options argument to override the default one. The comparison strategy can be swapped out at any time while the rest of the implementation remains unchanged.
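
To make the swap concrete, here is a library-free sketch. The taskComparator() and insertSorted() functions and the task objects are hypothetical, invented for this example; they are not the actual SortedList code:

```javascript
/* The default comparator from above: fine for numbers and strings */
function defaultComparator( a, b ) { return ( a < b ) ? -1 : ( a == b ) ? 0 : 1; }

/* Hypothetical custom strategy: order task objects by priority, then by title */
function taskComparator( a, b ) {
  if( a.priority !== b.priority ) return ( a.priority < b.priority ) ? -1 : 1;
  return defaultComparator( a.title, b.title );
}

/* Hypothetical helper: insert into an already sorted array using whichever comparator is passed in */
function insertSorted( collection, item, comparator ) {
  var i = 0;
  while( i < collection.length && comparator( collection[ i ], item ) <= 0 ) ++i;
  collection.splice( i, 0, item );
  return collection;
}

var numbers = [];
insertSorted( numbers, 5, defaultComparator );
insertSorted( numbers, 1, defaultComparator );
insertSorted( numbers, 3, defaultComparator );

var tasks = [];
insertSorted( tasks, { priority: 2, title: 'write' }, taskComparator );
insertSorted( tasks, { priority: 1, title: 'test' },  taskComparator );
insertSorted( tasks, { priority: 1, title: 'plan' },  taskComparator );

console.log( numbers.join( ',' ) );
console.log( tasks[ 0 ].title, tasks[ 1 ].title, tasks[ 2 ].title );
```

The insertion code never looks inside the items. Only the comparator knows what “smaller” means.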

The indexOf(item) function can then be implemented as follows:

indexOf: function( item ) {
  if( this.comparator( this.collection[ 0 ], item ) == 0 )
    return 0;
  else if( this.comparator( this.collection[ this.collection.length - 1 ], item ) == 0 )
    return this.collection.length - 1;

  for( var i = 0, len = this.collection.length; i < len; ++i )
  {
    if( this.comparator( this.collection[ i ], item ) == 0 )
      return i;
  }

  return -1; // not found;
}
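
The version above scans the list from start to end. Since the list is always sorted, the same comparator can also drive a binary search. The sketch below is a standalone function written for illustration (binaryIndexOf is a made-up name); it assumes the array was sorted with the same comparator, and with duplicate items it returns the index of any one of the matches:

```javascript
/* Hypothetical binary search; comparator( a, b ) returns a negative number, 0, or a positive number */
function binaryIndexOf( collection, item, comparator ) {
  var low = 0, high = collection.length - 1;

  while( low <= high ) {
    var mid    = Math.floor( ( low + high ) / 2 );
    var result = comparator( collection[ mid ], item );

    if( result === 0 ) return mid;
    if( result < 0 ) low = mid + 1;   /* collection[mid] is smaller: search the right half */
    else high = mid - 1;              /* collection[mid] is larger: search the left half */
  }

  return -1; /* not found */
}

var compareNumbers = function( a, b ) { return ( a < b ) ? -1 : ( a == b ) ? 0 : 1; };
var sorted = [ 2, 3, 5, 8, 13, 21 ];

console.log( binaryIndexOf( sorted, 8, compareNumbers ) );
console.log( binaryIndexOf( sorted, 4, compareNumbers ) );
```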

Tip 16: Closure
Closures are powerful. Without this feature, JavaScript wouldn’t be as flexible and robust as it is now. There are lots of good tutorials on closures, so I won’t cover them again in depth. I’ll provide a code sample instead. Yup, show, don’t just tell. The simplest example is accessing the “this” keyword in a setTimeout or setInterval context.

Let’s build a StopWatch class that keeps track of its internal run time and updates its display accordingly. Here is a very simple implementation that I cooked up in 5 minutes:

var StopWatch = Class.create({
  initialize: function() {
    /* init these variables */
    this.runTime        = 0;
    this.tickDuration   = 100;  /* tick every 100 ms */
    this.timer          = null;
  }

  ,start: function() {
    /* exit if the stop watch already running */
    if( this.timer ) return; 

    var me = this;
    this.timer = setInterval( function() { me.tick() }, this.tickDuration );
  }

  ,tick: function() {
    this.runTime += this.tickDuration;
    this.updateDisplay();
  }

  ,stop: function() {
    clearInterval( this.timer );
    this.timer = null;  /* reset so that start() can run again */
  }

  ,updateDisplay: function() {
    /* update the time display here */;
  }

});

/* Now run the stop watches */
var exerciseStopWatch = new StopWatch();
exerciseStopWatch.start(); /* you'll see the time ticking here */

var cookingStopWatch = new StopWatch();
cookingStopWatch.start(); /* you'll see the time ticking here */

exerciseStopWatch.stop();
cookingStopWatch.stop();

In the start() method, we declare a “me” variable to hold a reference to the current object. If you remember, we can pass function references around as variables in JavaScript. Since the function passed to setInterval is defined inside start(), it still has access to start()’s local scope, including the “me” variable, which points back to “this” StopWatch.
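
To see why the “me” variable is needed at all, compare what happens when the method is handed over bare. The sketch below avoids real timers so that it runs instantly: fakeSetInterval() and timerContext are hypothetical stand-ins for setInterval and for the object a browser timer uses as “this” (the window), and Counter is a made-up class:

```javascript
/* Hypothetical stand-in for setInterval: stores the callbacks so we can fire them by hand */
var scheduled = [];
function fakeSetInterval( callback ) { scheduled.push( callback ); }

function Counter() { this.ticks = 0; }
Counter.prototype.tick = function() { this.ticks += 1; };

/* Broken: the method is passed bare, so "this" is no longer the Counter when it runs */
Counter.prototype.startBroken = function() {
  fakeSetInterval( this.tick );
};

/* Working: the closure remembers "me" */
Counter.prototype.startWithClosure = function() {
  var me = this;
  fakeSetInterval( function() { me.tick(); } );
};

var broken  = new Counter();
var working = new Counter();
broken.startBroken();
working.startWithClosure();

/* A timer invokes its callback with its own "this"; an empty object plays that role here */
var timerContext = {};
for( var i = 0; i < scheduled.length; ++i ) {
  scheduled[ i ].call( timerContext );
}

console.log( broken.ticks, working.ticks );
```

The broken counter never moves, because its tick() ran against the timer’s object instead of the Counter. The closure version increments as expected.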


Tip 17: Closure with Prototype Cool-aid
If you use Prototype, you can have a much nicer syntax for closures and accessing the “this” scope. Let’s rewrite the start() function of our StopWatch class.

  ,start: function() {
    /* exit if the stop watch already running */
    if( this.timer ) return;
    this.timer = setInterval( this.tick.bind( this ), this.tickDuration );
  }

Prototype extends the native Function object within JavaScript with some neat wrappers, bind() is one of them. The explanation for function.bind() :

Wraps the function in another, locking its execution scope to an object specified by thisObj.

Your code suddenly becomes much clearer and states its intention! Prototype’s developers borrowed the principle from the Ruby language: optimizing for the programmer’s fun and productivity. Utility methods such as bind() and bindAsEventListener() make writing JavaScript a lot more enjoyable and straightforward.
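
Under the hood, a wrapper like this is just another closure. Here is a simplified sketch of the idea; bindTo() is a made-up name, it is not Prototype’s source, and it leaves out the extra pre-filled arguments that the real bind() accepts:

```javascript
/* Simplified, hypothetical stand-in for bind(): lock "this" to thisObj */
function bindTo( fn, thisObj ) {
  return function() {
    return fn.apply( thisObj, arguments );
  };
}

/* Hypothetical object for the demo */
var watch = {
  runTime: 0
  ,tick: function( ms ) { this.runTime += ms; return this.runTime; }
};

var boundTick = bindTo( watch.tick, watch );

/* even when called as a plain function, "this" inside tick() is still watch */
boundTick( 100 );
boundTick( 100 );

console.log( watch.runTime );
```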


Final Words
This is by far the longest post I have ever written in a single day. I am excited to learn more about JavaScript, and I am more than glad to share my experience with everyone. I hope you can take a few good things away from this post and apply them in your everyday work.

Let me know if you find any bugs or typos. Disclaimer: the code samples in this post have not been tested, so please point out any problems you spot.
