Forum LAMS for Tech-Heads - General Forum: Scaling LAMS to handle over 2000 concurrent users


 

1: Scaling LAMS to handle over 2000 concurrent users
03/15/11 10:47 AM
Hi Everyone,

For about a week and a half I've been working on scaling LAMS to handle a lot of simultaneous (concurrent) users, specifically more than 2000 users.

As you might have seen before, LAMS scales quite well even on a single server or box. However, to take on 2000 concurrent users with low response times (the time it takes for you to get a page back once your browser requests it), we need several machines configured as a cluster.

A cluster is basically several machines or physical servers that act as one. We've done LAMS clustering before with some significant results using Amazon's cloud computing servers. But at the time we were looking at only 500 simultaneous users, and now we need 2000+!

This time I used SoftLayer Bare Metal Instances to set up the cluster, as you can rent their servers by the hour, which is very convenient for test cases like this.

The setup takes a little while as there are quite a few things to install and configure in separate places. But it breaks down like this:

  • 1 x DB server: 8-core 2.0 GHz Bare Metal Instance + 8 GB RAM, running MySQL 5.5.9.
  • 3 x JBoss nodes: 4-core 2.0 GHz Bare Metal Instance + 4 GB RAM

So 4 servers in total. On one of the JBoss nodes I put the load balancer, which is a web server (Apache in this case) that forwards each request coming from the web to one of the JBoss nodes for processing. So the load balancer is the software that splits the load across the JBoss servers.
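To illustrate what "splitting the load" means, here is a minimal Python sketch of round-robin dispatch, where each new request goes to the next node in turn. This is only an illustration: the node names are made up, and the post does not say which balancing policy Apache was configured with, so round-robin is an assumption.

```python
from collections import Counter
from itertools import cycle

# Hypothetical node names standing in for the three JBoss nodes.
JBOSS_NODES = ["jboss-node-1", "jboss-node-2", "jboss-node-3"]


def make_round_robin(nodes):
    """Return a function that picks the next node in strict rotation."""
    rotation = cycle(nodes)
    return lambda: next(rotation)


pick_node = make_round_robin(JBOSS_NODES)

# Dispatch 2000 simulated requests and count how many each node receives.
assignments = Counter(pick_node() for _ in range(2000))

for node in JBOSS_NODES:
    print(node, assignments[node])
```

With strict rotation the per-node counts differ by at most one request. A real balancer typically also tracks node health and session affinity, which this sketch leaves out.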

As the university won't allow me to recruit 2000 students and use them as my guinea pigs, we have to use our Test Harness, which allows us to simulate LAMS users going through lessons. So for this case, I used 20 different small servers (1 GB RAM + 1 core x 2.0 GHz) running the Test Harness, each simulating 100 users going through a lesson.
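For readers who want a feel for how this kind of simulation works, below is a rough Python sketch. It is not the LAMS Test Harness. The function `fake_page_request` is a hypothetical stand-in that just sleeps instead of making a real HTTP request, and one page per activity is an assumption. Only the 20 x 100 user layout and the 10-activity lesson come from the post.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

HARNESS_SERVERS = 20     # from the post
USERS_PER_SERVER = 100   # from the post
PAGES_PER_LESSON = 10    # 10-activity lesson; assumes one page per activity


def fake_page_request(user_id, page):
    """Hypothetical stand-in for one page request; returns elapsed seconds."""
    start = time.perf_counter()
    time.sleep(0.001)  # placeholder for network and server time
    return time.perf_counter() - start


def simulate_user(user_id):
    """One simulated learner stepping through every page of the lesson."""
    return [fake_page_request(user_id, page) for page in range(PAGES_PER_LESSON)]


def run_harness(users):
    """Run `users` simulated learners concurrently; return mean response time."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        per_user = list(pool.map(simulate_user, range(users)))
    timings = [t for user_timings in per_user for t in user_timings]
    return statistics.mean(timings)


total_users = HARNESS_SERVERS * USERS_PER_SERVER
print("Total simulated users across the harness servers:", total_users)

# Each harness server would run its own share of the users.
print("Mean response time (s):", run_harness(USERS_PER_SERVER))
```

The number reported at the end is the same kind of figure as the averages quoted in this post: the mean time per page request over all simulated users.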

The results using a 10-activity lesson sequence (noticeboards) were better than I expected: for every page request, with 2000 users in LAMS at the time, the cluster is able to return the page to the browser in 1.1 seconds on average.

Given these results, I tried to push my luck a bit further and test it with 2300 concurrent users instead. In this case, with 2300 simultaneous users in LAMS, the average response time is 2.8 seconds.

This is really good and the response times are quite low. But I'm sure there's more room for improvement in the cluster and database configuration to get even lower response times.

Additionally, I would like to do tests with other interactive activities like Q&A, Forums, etc. to get a more complete picture. I didn't do this as it requires changes in our Test Harness, but we'll get to this eventually.

Here's a page I added on the wiki with these results:

http://wiki.lamsfoundation.org/display/lams/Clustering+for+2000+concurrent+users

If you have any questions/comments on this or clustering in general, hit the reply button and ask away!

Thanks,

Ernie

Posted by Ernie Ghiglione

2: Re: Scaling LAMS to handle over 2000 concurrent users
In response to 1 07/17/14 12:13 AM
Hi,

Thank you so much for the detailed explanation.

I wanted to know, if I were to increase the limit to 3000 concurrent users, which part of the hardware/servers would I be increasing?
Would it simply mean I include more app servers in the cluster to handle the 3000+ load? Will that setup use the same single Apache load balancer?

Thanks!

Posted by Haroon Shoukat

3: Re: Re: Scaling LAMS to handle over 2000 concurrent users
In response to 2 07/19/14 04:17 PM
Hi Haroon,

Since I posted this original message, I've been doing a fair bit of clustering and learned a few good lessons.

First, I've been favouring HAProxy as a load balancer. It is very, very fast, gives a lot of flexibility to configure things, and you can also simply set up HAProxy clustering if your setup scales a lot more.

Additionally, for all the static content (.css, .js, .png, .jpg, etc) I set up a Varnish cache server.

This makes LAMS super snappy, and the JBoss nodes only have to serve the dynamic content, which brings better performance.
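The static/dynamic split can be pictured with a small Python sketch like the one below. It is only an illustration of the routing decision, not Varnish or HAProxy configuration. The backend labels and example URLs are made up, and the extension list covers only the types named above.

```python
from posixpath import splitext
from urllib.parse import urlsplit

# Extensions named in the post; a real setup would likely list more.
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg"}


def choose_backend(url):
    """Return "cache" for static files and "jboss" for everything else."""
    path = urlsplit(url).path              # drop any query string
    extension = splitext(path)[1].lower()  # e.g. ".css"
    return "cache" if extension in STATIC_EXTENSIONS else "jboss"


# Hypothetical example URLs.
for url in ["/lams/css/main.css?v=2", "/lams/images/logo.PNG", "/lams/home.do"]:
    print(url, "->", choose_backend(url))
```

Anything that matches a static extension never reaches the JBoss nodes, so they spend their CPU only on dynamic pages.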

As for your questions: yes, you'll need to increase the number of nodes. How many? Well, it depends on how LAMS is being used. Some educational activities are a lot more CPU-demanding than others. So you'll need to test according to your load and add nodes as you find bottlenecks.

Thanks

Ernie

Posted by Ernie Ghiglione
