ECF is choking - 502 bad gateway errors

Status
Not open for further replies.

NC_Fog

Super Member
ECF Veteran
Verified Member
Apr 29, 2011
644
1,167
Central N.C.
What happened? Friday about 10:30 pm EST ECF just... died. Nothing would load - at all - and pings only produced 100% packet loss with "request timed out".

That was going on as early as 05:00 am EST - the last time I checked it. I was having no problems doing anything else internetty or accessing any other sites - but this one. :confused:

Same here on Friday night. Saturday night starting around midnight I had to hit refresh several times to get a complete page to load. Sunday/Monday, so far site is running great.
 

catilley1092

Super Member
ECF Veteran
Verified Member
Nov 3, 2013
553
847
North Carolina, USA
I've noticed this too, especially late at night between 12AM & 4AM EST. I figure that ECF, being a huge worldwide forum, has a large following & the servers are under load during these peak hours. There are always many members online & several times more visitors.

There are other sites where I experience the same issue, in particular the more popular tech forums during the same time span.

Power-cycling one's modem & router (unplugging both for 30 seconds) sometimes helps, as it clears the devices' state & forces a fresh connection to the ISP. The browser is often more responsive afterwards & I do this at least once every 2 weeks.

Cat
 

rolygate

Vaping Master
Supporting Member
ECF Veteran
Verified Member
Sep 24, 2009
8,354
12,405
ECF Towers
In the last couple of days we have had a bunch of errors related to a new CDN we're trying, Cloudflare. This is a 'content delivery network' or distributed backup/delivery cloud service that is supposed to provide resilience against DDOS attacks (mass botnet attacks with tens of thousands of computers connecting to the site at the same time in order to take it offline).

We need something like this because large sites get occasional DDOS attacks for blackmail - pay up or we take you down. It can normally be fended off by the site hosts, who block bad IP ranges on the network until things slow down.
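For anyone curious what "blocking bad IP ranges" looks like in practice, here is a minimal sketch in Python. It is only an illustration, not what our hosts actually run: the `BLOCKED_RANGES` list and the `is_blocked` helper are made-up names, and the ranges are reserved documentation addresses rather than real attackers.

```python
import ipaddress

# Hypothetical blocklist. These are reserved documentation ranges
# (RFC 5737), used here only as stand-ins for "bad" networks.
BLOCKED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in BLOCKED_RANGES)


if __name__ == "__main__":
    for ip in ("192.0.2.77", "203.0.113.5"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

In a real attack this kind of filtering is done on network equipment in front of the servers, and the list is adjusted by hand as the attack shifts, which is why it depends so much on having capable staff on call.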

A decision was made to try a redundant backup service instead, to see if it works better. It probably won't work because giant forums are the one exception to the 'the cloud is best' rule, where the normal state of play is that if you distribute your site content around the network then it becomes highly resistant to attack as it's in several different places. This is easy to do with most kinds of website as the content is static or nearly so (nothing much is changing on the site). A forum is different because, in contrast, everything is changing.
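To make the static-versus-changing distinction concrete, here is a rough Python sketch of the kind of decision a caching layer has to make for each request. It is a simplification under my own assumptions: the `STATIC_EXTENSIONS` set, the `is_edge_cacheable` helper and the example URLs are all hypothetical, and a real CDN uses cache headers and configurable rules rather than a fixed list.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical set of file types treated as static (rarely changing) content.
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".gif", ".ico"}


def is_edge_cacheable(url: str) -> bool:
    """Return True if the URL looks like a static asset that copies
    spread around the network could serve without asking the origin."""
    path = urlsplit(url).path  # drops the query string and fragment
    return PurePosixPath(path).suffix.lower() in STATIC_EXTENSIONS


if __name__ == "__main__":
    for url in (
        "https://forum.example/styles/main.css?v=3",
        "https://forum.example/threads/502-errors.12345/",
    ):
        print(url, "->", "cache" if is_edge_cacheable(url) else "origin")
```

On a brochure-type site nearly every request lands in the cacheable branch. On a forum the thread pages, which are most of the traffic, have to go back to the one origin server and its database.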

The trouble with a giant forum is that the database is being written to all the time, and there are thousands of read/writes to the DB every second. It simply isn't possible to duplicate a giant DB (ours is tens of gigabytes in size) and then sync all those DBs - the disks would crash because they would have to multi-sync millions of DB read/writes across multiple servers. Therefore a giant forum needs a single fortified hosting service. This is easier said than done because you are talking about the necessity to have several hosting tech support staff able to repulse a network attack at 4am while also having the staff capability to run the rest of their hosting operations. It can't be done in a small hosting operation since by definition you have to use a full-service large-scale host who has top level tech staff (Level 1 staff) on hand 24/7. Most work is done by Level 3 techs but they are not versed in network defence.

We just moved to another host with a combination of better hardware and lower costs; however it is obvious their tech support staff cannot cope with a major DDOS attack, as one took us offline all night last week. I don't know how this situation will be resolved except by moving to a more capable host.
 

thebanik

Senior Member
ECF Veteran
Verified Member
Mar 8, 2014
283
179
New Delhi, India
Thanks a lot for your detailed reply. I was thinking that you guys were doing some sort of upgrade during off-peak hours.
 

DocTonyNYC

Vaping Master
ECF Veteran
Verified Member
Oct 21, 2013
5,870
6,803
San Juan, Puerto Rico (and NYC)
Thank you, rolygate, for the detailed explanation! Although it is frustrating when there are problems, I really appreciate how hard you all work to keep the site functioning. I'm certainly not very tech savvy, but given the size of ECF I find it amazing how smoothly things work most of the time.
 