Saturday, February 3, 2007

availability+scalability=load balancing

Sometimes you need to provide a highly available service, but you do not want to build a "real" cluster of servers. Linux Virtual Server provides a quick and easy way to make your server farm highly scalable and highly available.

The Linux Virtual Server is a highly scalable and highly available server built on a cluster of real servers, with the load balancer running on the Linux operating system. The architecture of the server cluster is fully transparent to end users, and the users interact as if it were a single high-performance virtual server.

The real servers and the load balancers may be interconnected by either a high-speed LAN or a geographically dispersed WAN. The load balancers dispatch requests to the different servers and make the parallel services of the cluster appear as a single virtual service on a single IP address. Request dispatching can use IP load balancing technologies or application-level load balancing technologies. Scalability of the system is achieved by transparently adding or removing nodes in the cluster. High availability is provided by detecting node or daemon failures and reconfiguring the system appropriately.
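To make the dispatching idea concrete, here is a minimal user-space sketch in Python. It is not LVS code and does not reflect how LVS is implemented in the kernel. The class names, the addresses and the choice of plain round-robin scheduling are all assumptions made for illustration. It only shows the three ideas above: one virtual address in front of a pool, nodes added or removed without the client noticing, and failed nodes skipped.

```python
from dataclasses import dataclass, field


@dataclass
class RealServer:
    """A backend node; `healthy` would be updated by a real health checker."""
    address: str
    healthy: bool = True


@dataclass
class VirtualService:
    """One virtual IP address fronting a changeable pool of real servers.

    Hypothetical illustration only, not an LVS data structure.
    """
    virtual_ip: str
    servers: list = field(default_factory=list)
    _next: int = 0  # round-robin position

    def add_server(self, address):
        # Scaling out: clients keep talking to virtual_ip and never see this.
        self.servers.append(RealServer(address))

    def remove_server(self, address):
        self.servers = [s for s in self.servers if s.address != address]

    def mark_failed(self, address):
        # High availability: a failed node stays listed but gets no traffic.
        for s in self.servers:
            if s.address == address:
                s.healthy = False

    def dispatch(self):
        """Return the address of the next healthy real server (round robin)."""
        alive = [s for s in self.servers if s.healthy]
        if not alive:
            raise RuntimeError("no healthy real servers")
        server = alive[self._next % len(alive)]
        self._next += 1
        return server.address


vs = VirtualService("192.0.2.10")          # example virtual IP (assumed)
for addr in ("10.0.0.1", "10.0.0.2", "10.0.0.3"):
    vs.add_server(addr)
print([vs.dispatch() for _ in range(3)])   # requests spread over all nodes
vs.mark_failed("10.0.0.2")
print([vs.dispatch() for _ in range(4)])   # the failed node is skipped
```

A real setup would use one of the LVS scheduling algorithms and a separate monitoring daemon to flip the health flag. The sketch keeps both in one object only to stay short.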

As an advanced load balancing solution, the Linux Virtual Server can be used to build highly scalable and highly available network services, such as scalable web, cache, mail, FTP, media and VoIP services.

There are three types of load balancing (Layer-2, Layer-4 and Layer-7), and LVS supports two of them: Layer-4 and Layer-7. For Layer-4 switching LVS has IPVS (IP Virtual Server). For Layer-7 switching it has KTCPVS (Kernel TCP Virtual Server), and TCPSP (TCP Splicing) can be used to speed up Layer-7 switching.

KTCPVS
KTCPVS stands for Kernel TCP Virtual Server. It implements application-level load balancing inside the Linux kernel, so-called Layer-7 switching. Since the overhead of Layer-7 switching in user space is very high, it is better to implement it inside the kernel and avoid the cost of context switching and memory copying between user space and kernel space. Although the scalability of KTCPVS is lower than that of IPVS (IP Virtual Server), it is more flexible, because the content of the request is known before the request is redirected to a server.
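The flexibility of Layer-7 switching comes from looking at the request itself before choosing a backend. The Python sketch below illustrates that idea in user space. It is not KTCPVS code, and the rule table, pool names and function names are hypothetical. It parses the first line of an HTTP request and picks a server pool from the requested path.

```python
import re

# Hypothetical rules: (regex applied to the request path, backend pool name).
RULES = [
    (re.compile(r"^/images/"), "image-servers"),
    (re.compile(r"\.(php|cgi)$"), "app-servers"),
]
DEFAULT_POOL = "static-servers"


def parse_request_path(raw_request: bytes) -> str:
    """Extract the path (without query string) from an HTTP request line."""
    request_line = raw_request.split(b"\r\n", 1)[0].decode("ascii", "replace")
    _method, target, _version = request_line.split(" ", 2)
    return target.split("?", 1)[0]


def choose_pool(raw_request: bytes) -> str:
    """Return the pool of the first rule whose pattern matches the path."""
    path = parse_request_path(raw_request)
    for pattern, pool in RULES:
        if pattern.search(path):
            return pool
    return DEFAULT_POOL


print(choose_pool(b"GET /images/logo.png HTTP/1.1\r\nHost: example\r\n\r\n"))
print(choose_pool(b"GET /index.php?id=1 HTTP/1.1\r\nHost: example\r\n\r\n"))
print(choose_pool(b"GET /about.html HTTP/1.0\r\n\r\n"))
```

A Layer-4 balancer cannot make this kind of decision, because it only sees addresses and ports. The sketch assumes a well-formed request line, and a malformed one raises `ValueError`.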

Kernel threads parse the content of requests, forward them to backend servers according to the scheduling rules, and relay data between client and server. The user-space program tcpvsadm is used to administer KTCPVS: it writes the virtual server rules into the kernel through setsockopt, and reads the KTCPVS rules back through getsockopt or the /proc file system.

TCP Splicing
TCPSP implements TCP splicing for the Linux kernel. TCP splicing is a technique that splices two connections together inside the kernel, so that data relaying between the two connections can run at near router speeds. This technique can be used to speed up Layer-7 switching, web proxies and application firewalls running in user space.
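To see what splicing saves, it helps to look at the relay loop a user-space proxy has to run without it. The Python sketch below is that baseline, not TCPSP itself. The function name and the use of `socket.socketpair()` as a stand-in for real client and server connections are assumptions for the demo. Every chunk is copied from the kernel into the process and back again, which is the overhead that in-kernel splicing avoids.

```python
import socket


def relay(src: socket.socket, dst: socket.socket, bufsize: int = 4096) -> int:
    """Copy bytes from src to dst until src reaches EOF; return bytes relayed."""
    total = 0
    while True:
        chunk = src.recv(bufsize)   # copy 1: kernel buffer -> user space
        if not chunk:               # empty read means the peer closed its side
            break
        dst.sendall(chunk)          # copy 2: user space -> kernel buffer
        total += len(chunk)
    return total


# Stand-ins for "client <-> proxy" and "proxy <-> server" connections.
client, proxy_in = socket.socketpair()
proxy_out, server = socket.socketpair()

client.sendall(b"GET / HTTP/1.0\r\n\r\n")
client.shutdown(socket.SHUT_WR)     # signal end of the request

relayed = relay(proxy_in, proxy_out)
print(relayed, server.recv(4096))

for s in (client, proxy_in, proxy_out, server):
    s.close()
```

With splicing, the proxy would read only enough of the request to pick a backend and then hand the two connections to the kernel, so the loop above would no longer run in the process.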

TCPSP is released as a small software component of the Linux Virtual Server project.

Load balancing on FreeBSD
Load balancing on FreeBSD is possible with LVS; see the LVS On FreeBSD Project.

Piranha
Piranha is the clustering product from Red Hat Inc. It includes the LVS kernel code, a GUI-based cluster configuration tool and a cluster monitoring tool. The Piranha whitepaper and the Piranha HOWTO are available on the Red Hat web site. The RPMs and SRPMs of Piranha can be found in the Red Hat 6.1 distribution, or can be downloaded from the ftp.redhat.com site.

Ultra Monkey
Ultra Monkey is a complete open source server farm solution for Linux, providing high availability and load balancing. See the Ultra Monkey site for more information.