If you follow any of the novels posted by Qidian, you've probably noticed the annoying interstitial pages they publish for their RSS feeds, as well as the fact that the latest few chapters of every series are hidden behind a watch-an-ad/paywall.
This is annoying, as it generally means the site's "latest" chapter is a few chapters ahead of what you can actually view, which makes your watches somewhat confusing.
As such, I've implemented a feed parser module that both unwraps the feed-interstitial page, and properly ignores releases you can't actually view. This should make following Qidian series much more pleasant, because you won't be falsely alerted for new chapters that aren't yet actually available.
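In practice, the release-filtering half of this amounts to dropping feed entries that carry a premium/locked marker before they reach the change-detection layer. A minimal sketch (the entry shape and the `premium` flag are illustrative assumptions, not Qidian's actual feed schema):

```python
# Illustrative sketch only: the dict shape and "premium" key are
# assumptions, not Qidian's real feed format.

def filter_viewable(entries):
    """Return only the feed entries a reader can actually view right now."""
    viewable = []
    for entry in entries:
        # Skip chapters still behind the watch-an-ad/paywall gate,
        # so watchers aren't alerted for releases they can't read.
        if entry.get("premium", False):
            continue
        viewable.append(entry)
    return viewable

feed = [
    {"title": "Chapter 101", "premium": False},
    {"title": "Chapter 102", "premium": True},   # not yet publicly viewable
]
print([e["title"] for e in filter_viewable(feed)])  # ['Chapter 101']
```

The point is simply that filtering happens before the "new chapter" comparison, so a paywalled release never counts as the latest one.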
I'm doing more reconfiguration of hardware, so the feed scraper system will be inactive while I physically move hardware in my server closet around for a little while.
Sorry about that!
Server restructuring is mostly complete. I have a few things I still want to change, but they're going to have to wait because I don't have all the parts I need on-hand.
The message broker server that backs the feed system is currently non-responsive, and I'm waiting on the server host to perform an actual, physical intervention.
I'm not sure when things will be fixed, but things may be down for a day or two.
Ok, normal feeds should be resuming now. The issue should be resolved.
For the technically inclined, the problems resulted from the combination of a bug in RabbitMQ (the message broker the feeds use) and an issue with the Linux tool mountall. The RabbitMQ issue (specifically, the "Shovel" plugin caused it to get stuck on exit) led me to simply reboot the server. The mountall issue then prevented the server from starting up properly after the reboot: it was stuck waiting for a partition that doesn't exist to mount.
I've removed the Shovel plugin entirely, and added a work-around to fstab to prevent the boot-blocking issue, so it shouldn't recur.
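For reference, the fstab-side work-around amounts to marking the offending mount so mountall won't hold up boot on it. On Ubuntu-era systems the `nobootwait` option does this; the device and mount point below are placeholders, not my actual entry:

```
# /etc/fstab -- illustrative entry only; device and mountpoint are placeholders.
# "nobootwait" tells Ubuntu's mountall not to block the boot sequence
# if this filesystem fails to mount.
/dev/sdb1  /mnt/scratch  ext4  defaults,nobootwait  0  2
```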
I need to do some large schema changes on the backend of the feed system, which requires taking it offline periodically while PostgreSQL churns.
Unfortunately, the size of the tables in the database makes this a many-hour, possibly multi-day process, so things may be shut down for a little while.
Ok, database updates are done, normal service should resume.
Some of the RSS feeds have been getting stuck this week, due to a combination of a software bug and the fact that I've been away for work and not had time to do any maintenance.
I've been restarting the feed scraper manually when I can, but the issue will probably persist until I have a chance to fix it properly this weekend, when I get back from my trip.
EDIT: I think I've fixed it remotely via SSH.
EDIT EDIT: Ok, it was a priority inversion in the page-fetch queue. It looks to be working correctly now.