 
 








 

I’ve been doing quite a bit of reading about High-Availability (HA) web clustering techniques for the past 2 weeks. Thanks to VMWare, it is now possible to create a virtual web server farm with multiple Linux instances running concurrently to simulate a cluster setting. The initial results are very heart-warming: I’ve successfully installed a MySQL database, configured fail-over (and thus highly available) web servers using Heartbeat, and set up name servers running BIND9 to do name lookups. The last piece of the puzzle is now the Linux Virtual Server (LVS) Director for load balancing. That will be my next experiment.

Here is my initial diagram of the web server farm and the IP/server name assignments:
[Diagram: Web Farm]

With this design, I tried to eliminate single points of failure by implementing redundant, ready-to-fail-over servers for the critical sections of the network: the DNS, the load balancers for incoming traffic, and the load balancers for the database farm.

Basically, incoming traffic will pass through A, the load balancer. Load balancer A will spread the load across the web servers C in the farm using the LVS-DR (direct routing) method. There are 2 kinds of web servers: one is the beefy, powerful server with a fast CPU to run the web applications, and the other is the media server, which doesn’t need a good CPU but requires fast HDDs (SCSI) and lots of RAM. The application servers will do the number crunching, churning out pages as fast as they can, while the media servers will provide all the images, CSS, and JavaScript files. Of course, since I am using VMWare, it costs me virtually $0.00 to add a new SCSI drive to a VM. Great!
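To make the "spread the load" idea concrete, here is a minimal sketch of round-robin scheduling, one of the simplest ways a director can pick a real server for each new connection. It only simulates the selection logic in Python; it is not LVS itself and does no packet forwarding. The server names `web01` to `web03` and the class name are made up for illustration.

```python
from itertools import cycle


class RoundRobinDirector:
    """Toy director: hands out real servers in strict rotation."""

    def __init__(self, real_servers):
        servers = list(real_servers)
        if not servers:
            raise ValueError("need at least one real server")
        # cycle() repeats the server list endlessly, in order.
        self._rotation = cycle(servers)

    def pick(self):
        """Return the real server that should get the next connection."""
        return next(self._rotation)


# Hypothetical server names, not the ones from the diagram.
director = RoundRobinDirector(["web01", "web02", "web03"])
assignments = [director.pick() for _ in range(6)]
print(assignments)
```

Each server receives every third connection. A real director would also drop a server from the rotation when a health check fails, which this sketch leaves out.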
The name servers B running BIND9 are located centrally to help with name resolution. Of course, DNS is critical, so we need a certain level of HA here too. Heartbeat will make sure the DNS is always up and ready.
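The fail-over idea behind this can be sketched in a few lines: the standby node listens for periodic "I am alive" messages from the primary and takes over the service once the primary has been silent for too long. This is a simplified simulation with an explicit clock, not the Heartbeat software's actual code; the class name, method names, and the 10-second threshold are assumptions for illustration.

```python
class StandbyNode:
    """Toy standby: takes over once the primary has been silent too long."""

    def __init__(self, deadtime):
        self.deadtime = deadtime  # seconds of silence tolerated (assumed value)
        self.last_beat = 0.0      # time of the most recent heartbeat
        self.active = False       # True once this node has taken over

    def receive_heartbeat(self, now):
        """Record that the primary was alive at time `now`."""
        self.last_beat = now

    def check(self, now):
        """Take over if the primary has been silent longer than deadtime."""
        if not self.active and now - self.last_beat > self.deadtime:
            self.active = True
        return self.active


standby = StandbyNode(deadtime=10)
for t in (2, 4, 6):                      # primary beats every 2 seconds
    standby.receive_heartbeat(t)

state_while_primary_alive = standby.check(now=8)   # only 2 s of silence
state_after_silence = standby.check(now=20)        # 14 s of silence
print(state_while_primary_alive, state_after_silence)
```

In a real setup, "taking over" would mean claiming the shared service IP address and starting the service, so clients keep using the same address.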

Meanwhile, the database farm E, which is on a separate network (supposed to be high-speed and low-latency, with very expensive switches), needs a pair of load balancers to spread out the “read” (SELECT) load. I’m not quite sure how to implement the “write” (UPDATE, DELETE, ALTER TABLE, etc.) DB servers yet, but I’m sure that we can improvise along the way. Again, Heartbeat will be used to keep the database load balancers up and happy. Our database farm will consist of 2 network storage nodes to store data and 2 “API” nodes to do the database heavy lifting. A fifth server serves as the management node to manage (add, delete, or update) the database servers.
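One possible way to think about the read/write split, purely as a sketch since the write side is still undecided: send SELECT statements to the read servers in rotation and send everything else to a single write server. The host names `db-read01`, `db-read02`, and `db-write` are hypothetical, and classifying a statement by its first word is a deliberate simplification.

```python
from itertools import cycle


class QueryRouter:
    """Toy router: SELECTs rotate across readers, the rest go to the writer."""

    def __init__(self, writer, readers):
        readers = list(readers)
        if not readers:
            raise ValueError("need at least one read server")
        self.writer = writer
        self._readers = cycle(readers)

    def route(self, sql):
        """Return the (hypothetical) host that should run this statement."""
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT":
            return next(self._readers)
        # UPDATE, DELETE, ALTER TABLE, and anything unrecognized.
        return self.writer


router = QueryRouter("db-write", ["db-read01", "db-read02"])
queries = [
    "SELECT * FROM posts",
    "UPDATE posts SET title = 'x' WHERE id = 1",
    "select id from users",
    "ALTER TABLE posts ADD COLUMN views INT",
]
routes = [router.route(q) for q in queries]
print(routes)
```

Routing unknown statements to the write server is the cautious default here: a misrouted read only costs some performance, while a write sent to a read-only server would fail.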

Finally (and not shown in the above diagram, as I just realized that I am missing something), a monitoring server running Nagios will handle health monitoring and network management. With the current design, all parts of the network can be scaled independently: if more web servers are needed, we add new boxes to section C. If we need more database storage nodes, we can quickly add a new NDB node to the MySQL database cluster F. The bottleneck will now be our gateway, the load balancers in A. However, since it’s been confirmed (see the linux-ha.org site) that a decent load balancer can easily handle enough traffic to saturate a 100Mbps connection, I would say that for a small/medium business setting, this is more than enough.
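For a rough sense of what saturating a 100Mbps link means, here is a back-of-the-envelope calculation. The 50 KB average response size is purely an assumption for illustration, and protocol overhead is ignored.

```python
LINK_MBPS = 100        # link speed from the text
AVG_RESPONSE_KB = 50   # assumed average response size, not a measured value


def responses_per_second(link_mbps, avg_response_kb):
    """Upper bound on responses per second, ignoring protocol overhead."""
    bytes_per_second = link_mbps * 1_000_000 / 8   # megabits -> bytes
    return bytes_per_second / (avg_response_kb * 1000)


rate = responses_per_second(LINK_MBPS, AVG_RESPONSE_KB)
print(rate)
```

Under these assumptions the link tops out at 250 responses per second. With smaller responses the ceiling is higher, with larger ones lower.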

If you are asking why I am writing all of this down, it is because I will begin to construct this web farm using VMWare with CentOS 4.4. The post and the diagram will serve as a guideline for this particular project. Moreover, I intend to do a screencast of the entire process of setting up this web farm. Yes, I’d like to commoditize the knowledge of building Linux clusters using off-the-shelf tools. It’s a noble goal, I know, but I’m doing it for myself first, so you don’t have to thank me now.
Now off to work I go. Keep checking back at alexle.net for more information about web clustering. “This is Alex Le doing it so you don’t have to.” (Yeah, I copied Ze Frank’s line, so excuse me for the plagiarism. :)


 

6 Responses to “VMWare Web Server Cluster: Initial Diagram.”



Catalinux
5:15 am
March 12, 2007
#18374

Do media01 and media02 have the same content?

How do you add capacity for media content?




4:27 pm
March 12, 2007
#18411

Media01 and Media02 should have the same content to distribute the load. I actually haven’t set up the cluster completely, so I’m not sure about the details of how to add more storage capacity to the media servers, but I guess it should be fairly standard: adding more partitions (physical drives), mounting NFS shared drives, etc.

If you want to read more about scaling up storage, Google has a very interesting article describing the Google File System at http://labs.google.com/papers/gfs-sosp2003.pdf. To scale their storage horizontally, they use central “master” server(s) to store the index to different chunk servers (file/media servers) running on top of the regular Linux file system.




gopal
12:10 am
September 4, 2007
#54197

What are all the steps involved in creating a web cluster project? The requirement is that when the master server goes down, another server takes over its processes.




2:48 pm
September 27, 2007
#60899


3:45 am
January 14, 2008
#101286

Did you finally manage to configure all that stuff? Will the virtual machines be publicly available?

Thanks !




Luis
4:23 pm
February 5, 2008
#111966

You may want to get 2 NAS boxes to host your environment, and you can use iSCSI so the systems can connect to them.




 
