“Glitches” is an easy catch-phrase for problems encountered in the world of technology; but “glitch” doesn’t capture the real human frustration people feel when they engage with a system that is experiencing a rather unique challenge.
I won’t say we are experiencing a glitch…. CTH 2.0 is experiencing a challenge that has actual emotional consequences… Here’s the issue in as much non-technical wording as I can muster.
SIDEBAR: For other websites with considerable scale, and considering the deplatforming issue underway, this might also serve as a guide. Additionally, for site users of any website this might explain some background decision-making on commenting functions that is often left unsaid outside of closed-door meetings.
First, my sincere apologies for the trouble everyone had, and is having, as we launched CTH 2.0 with new host servers. As you know, our commenting community is our #1 priority and we have years of relationship and trust together. The challenge before us today is considerable.
When we were told we had to leave the WordPress/Automattic platform, it was important to us to retain the entire site library, which includes over 56,000 published articles and over 7.2 million comments.
Some articles carry up to several hundred citations, and the 7.2 million comments alone contain well over 40 million lines of metadata.
All of that CTH data took approximately 60 hours to export and import to a new site and, eventually, new host servers. The data was first uploaded to a test site to gauge the scale of data and time. Once that initial transfer timeline was determined, the data was imported to the new host servers; CTH and CTH 2.0 were mirrored and fully migrated away from WordPress/Automattic servers.
The CTH 2.0 site was reviewed, some adjustments were made to ensure the user-friendly systems carried over, and we were ready to switch the site. That is called a DNS switchover.
As the domain name shifts from WordPress/Automattic to our new platform host, the worldwide system of interconnected data networks needs to learn that requests for “TheConservativeTreehouse.Com” now point to the new site destination, the new servers. This process, called DNS propagation, takes place in every country around the world, each region with a different refresh rate for picking up the change.
At approximately 7:30pm ET last evening that process was initiated. CTH lookups switched from resolving to WordPress/Automattic servers to our new server host.
Regionally, as each Internet Service Provider (ISP) updated its DNS mapping, users were directed to the new CTH servers. CTH 2.0 slowly showed up in every city, state and town as the regions came online with updated directories. The process executed as expected.
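For the technically curious, the reason regions come online at different times can be sketched as a cache with a time-to-live (TTL): each ISP’s resolver keeps the old answer until its copy expires, then asks for the new one. A minimal illustration (the addresses and TTL values below are invented for the example, not our real configuration):

```python
# Simulated DNS resolver caching: each ISP keeps the last answer it saw
# until that record's TTL (time-to-live) expires, then re-queries.

AUTHORITATIVE = {"theconservativetreehouse.com": "203.0.113.50"}  # new server (example IP)

class ResolverCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.cache = {}  # name -> (answer, expires_at)

    def resolve(self, name, now):
        answer, expires_at = self.cache.get(name, (None, 0))
        if now >= expires_at:                 # stale or missing: re-query upstream
            answer = AUTHORITATIVE[name]
            self.cache[name] = (answer, now + self.ttl)
        return answer

# Two ISPs, both still holding the OLD server address when the switch happens,
# but with different refresh rates for their cached copy:
isp_fast = ResolverCache(ttl_seconds=300)     # refreshes every 5 minutes
isp_slow = ResolverCache(ttl_seconds=86400)   # refreshes once a day
isp_fast.cache["theconservativetreehouse.com"] = ("192.0.2.10", 300)    # old server
isp_slow.cache["theconservativetreehouse.com"] = ("192.0.2.10", 86400)  # old server

# One hour after the switch, the fast ISP already sends users to the new
# server, while the slow ISP's customers still land on the old one:
print(isp_fast.resolve("theconservativetreehouse.com", now=3600))  # 203.0.113.50
print(isp_slow.resolve("theconservativetreehouse.com", now=3600))  # 192.0.2.10
```

That staggered expiry is why CTH 2.0 appeared town by town rather than everywhere at once.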
We anticipated a significant data pull from initial users that would hit the server network, so we purchased a secondary “cache” service, Cloudflare, to help offset the scale of the data load by hosting the static or older library files. It’s a strange arrangement, but the data starts to propagate across two systems simultaneously. The load on the servers is offset by the ability of Cloudflare to host and direct users to older content.
However, with the change to new servers there is no pre-existing store of old “cached” files to serve users directly. The entire CTH archive library is essentially being pulled from the host servers as the propagation continues, while a new set of cache files is being created that will eventually take some of the load away.
Depending on the library size, for the first 24 hours the new servers carry all of the load and the data capacity is stressed. The server host anticipates this heavy load and has a system in place called “autoscale” that activates to allow more server capacity during the DNS switch.
That is one initial load on the servers that improves with time. Time allows the DNS change to fully propagate and the secondary cache services to take on some of the burden. But in the first hours, all of the data load is on the host server.
The secondary issue is where our problem rests. As each provider gained access to and directed customers to the new servers, many of the inbound users, our community, started commenting. The scale of our community is significant, and we are proud of it.
The CTH 2.0 site content was propagating along new pathways (the DNS changeover) at the same time users started to engage with it. The cache service was also still populating. However, the commenting function is a live-time engagement, so the secondary “cache” host doesn’t interact at all with the commenting data load.
When you write a comment it creates a metadata file. Each comment is a unique URL. Your unique id is part of that metadata; your gravatar or avatar is part of that metadata; the content of your comment (what you write) is part of that metadata; and any article links, citations, pictures, gifs, or tweets, each carrying a unique URL, are part of that metadata.
The average comment carries six to ten lines of unique metadata, regardless of the length of what you write. Your comment creates a mini data file with its own unique id (the URL). The library of pre-existing comments is 7.2 million. Multiply that by the lines of metadata and you get well over 40 million lines of code in the pre-existing comment file alone.
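To make that concrete, a single comment’s metadata record might look roughly like the sketch below. The field names and values here are my own illustration of the idea, not the actual schema the site uses:

```python
# Illustrative only: one comment modeled as a small metadata record.
# Field names and values are invented for the example.

comment = {
    "comment_url":    "https://theconservativetreehouse.com/#comment-1234567",
    "user_id":        "treeper-001",
    "avatar_url":     "https://gravatar.example/treeper-001",
    "timestamp":      "2023-05-01T21:00:00-04:00",
    "in_reply_to":    None,
    "body":           "What a great article!",
    "embedded_links": ["https://example.com/cited-article"],
}

# Seven lines of metadata here; six to ten is typical per the post.
# Multiply by the 7.2 million pre-existing comments:
lines_of_metadata = len(comment) * 7_200_000
print(lines_of_metadata)   # 50,400,000 -- "well over 40 million lines"
```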
♦ Here’s the issue. Whenever you write a unique comment the pre-existing library is searched by a software program to match you to your commenting history. This avoids duplication (if you hit send/submit more than once), and it also looks for spam or bot signatures.
Because we are using new servers the library of 7.2 million comments (each with 6 to 10 lines of metadata) has to be searched by the same software built into the system to avoid comment duplication. In essence as you hit submit/send the search goes through the entire library of 40 million lines of metadata looking for your unique id and then responds back to allow the comment to go through.
With only a few people reaching the site, it’s not a problem. However, as more and more people are both drawing from the host library of posted content and simultaneously writing comments that trigger a search of 40 million lines of metadata, the servers get overloaded.
Put 1,000 people on the site at the same time and you get 1,000 x 40,000,000 lines of metadata being searched and queried at the same time.
With one thousand users the servers are trying to: (1) return library results on 11 years of archived posts; and (2) search through 40 billion lines of comment metadata and respond simultaneously. That’s the data load for 1,000 users.
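Using the post’s own numbers, the arithmetic works out like this:

```python
# The scan cost of comment submission, using the figures from the post.

lines_per_comment = 6                 # six to ten lines each; low end shown
comments = 7_200_000                  # pre-existing comment library
library_lines = lines_per_comment * comments
print(library_lines)                  # 43,200,000 -- "well over 40 million"

# Every submitted comment triggers a search of the whole library, so with
# 1,000 simultaneous users the servers face:
simultaneous_users = 1_000
lines_scanned = simultaneous_users * 40_000_000
print(lines_scanned)                  # 40,000,000,000 -- 40 billion line-scans
```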
[On an average day at CTH we hold around 25,000 simultaneous users, with a peak of roughly 35,000 to 50,000 around 9pm ET.]
Last night, typical for CTH, as word spread we had 30,000 simultaneous users return to the website. Do the math: 30,000 users times the 40-million-line comment library search, and the servers overloaded…. “504” and “524” error notifications were the result in your browsers.
That’s what happened.
It’s not an excuse for what happened; that’s just what happened.
As the DNS propagates more fully, the servers will be less taxed on article data retrieval and older files will shift to the cache service. The weight of the site data will lessen as the cache steps in, and the load will ease. That part is not a problem.
The problem is the sheer scale of the comment library. Every comment submitted draws a search amid 40 million lines of metadata. Over time that will also improve, but the comment library is going to take much longer to settle; and it doesn’t help that Google’s search crawlers are simultaneously crawling the entire site, including the totality of the comment library, to re-index their search results by pulling data from the new site servers.
One quick solution from the engineering side was to just get rid of every comment older than 90 days so the comment library shrinks to a manageable searchable size.
Another solution to regain the site was to shut down commenting again thereby stopping the search function associated with each unique user posting a comment.
The latter approach would buy time to find another solution that might avoid deleting the comment library. That is the approach we have chosen right now. We didn’t really have an option.
When I was asked, “why not just drop the old comments,” we ran into a conversation that is personal for our CTH community.
There are a few reasons why I wanted the total import of the comment files, one of which I never wanted to discuss…. but to explain to those here today who might also say just get rid of the older comments, I need to share something very important.
Our comment library holds 11 years of prayer request threads. It was the first function we added to this little hideout. You may not know it but the admins and I spend a lot of time reading those prayer requests; and yes, we pray for those who write.
Those prayer requests and replies are from real people, our people, our CTH community of people and they are the source of our compass heading. More importantly, we have lost friends who asked for prayers for themselves and/or their loved ones. We have held the hands of those people and we have at times had no words other than just to sit still and let them know we are there.
Many of those friends left us and they are in God’s hands now. Additionally, whether you know it or not, the families left behind visit those prayer threads, read the words, and remember the voices of those who are no longer with us. If we were to delete old comments we would be losing many of those prayer threads, and that would just be too high a price to pay. It would feel like we were breaking a promise. I cannot do it.
So the tech team has been told deleting comments by date is not an option. If I need to pay a service to host an archive of those prayer threads, so we can put a link to them, then that’s doable. But we are not deleting them, period.
The tech guys are looking at other options for how to turn comments back on while lessening the burden of the comment data by reducing the active library. There has to be a way to do it, and that’s the challenge.
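For readers wondering what such an option might look like: one standard technique (my own illustration here, not necessarily what the tech team will choose) is to index the comment library by user id, so a duplicate-check looks up only one user’s comments instead of scanning all 40 million lines:

```python
# Sketch of a full scan vs. an indexed lookup. The data is invented;
# real comment stores use database indexes, but the principle is the same.

# 100,000 comments spread across 1,000 users (100 comments each):
comments = [{"user_id": f"user-{i % 1_000}", "body": f"comment {i}"}
            for i in range(100_000)]

# Full scan: touch every record to find one user's comments (slow).
def find_by_scan(user_id):
    return [c for c in comments if c["user_id"] == user_id]

# Index: built once, then each lookup touches only that user's records.
index = {}
for c in comments:
    index.setdefault(c["user_id"], []).append(c)

def find_by_index(user_id):
    return index.get(user_id, [])

# Same answer, but the indexed lookup examines 100 records, not 100,000:
assert find_by_scan("user-42") == find_by_index("user-42")
```

An indexed lookup scales with one user’s history rather than the whole library, which is why it is a common answer to exactly this kind of overload.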
I apologize unreservedly for this issue.
I apologize that you cannot comment right now.
This issue within the launch of CTH 2.0 is entirely on me.
The tech people who worked on this massive export and import of data have done outstanding work. I am the person who wouldn’t budge on retention and transfer of the entire CTH data library to include every comment by this community. This 11-year labor of love will not be taken apart.
There’s a solution out there and we are sharp enough to figure it out.
Warmest love and deepest appreciation for your understanding,