The analysis of this morning's incident affecting the queue is complete.
Everything is currently working normally and will remain under human supervision for the next few hours to ensure the problem does not recur.
Actions will be taken as soon as possible to:
add better monitoring and alerting for the parts currently off our radar
decouple two parts of the infrastructure so that one can no longer affect the other (two queues of async tasks are currently mixed together: one can be processed with a delay of a few hours, while the other should work in near real time; mixing these two queues caused the background processing of the widget queues to be delayed too much)
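As an illustration only, the decoupling described above could look roughly like the sketch below. It is not the actual implementation: the queue names, task labels, and durations are hypothetical, and in-process threads stand in for whatever task system is really used. The point it shows is that giving each queue its own worker keeps slow, delay-tolerant jobs from holding up near-real-time ones.

```python
import queue
import threading
import time


def start_worker(task_queue, completed, name):
    """Start a thread that only ever consumes from its own queue."""
    def run():
        while True:
            task = task_queue.get()
            if task is None:  # sentinel: no more work for this worker
                break
            label, duration = task
            time.sleep(duration)  # simulate the work itself
            completed.append((name, label))

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread


def run_demo():
    # Hypothetical queues: one near-real-time, one that tolerates delay.
    realtime_queue = queue.Queue()
    batch_queue = queue.Queue()
    completed = []  # completion order, shared by both workers

    workers = [
        start_worker(realtime_queue, completed, "realtime"),
        start_worker(batch_queue, completed, "batch"),
    ]

    # Slow batch jobs are enqueued first...
    batch_queue.put(("report-1", 0.2))
    batch_queue.put(("report-2", 0.2))
    # ...but the widget task does not wait behind them.
    realtime_queue.put(("widget-entry", 0.0))

    for q in (realtime_queue, batch_queue):
        q.put(None)
    for worker in workers:
        worker.join()
    return completed


if __name__ == "__main__":
    print(run_demo())
```

With a single shared queue and a single worker, the widget task in this sketch would have waited behind both batch jobs, which is the kind of delay described above.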
Slow queue processing
Since this morning, we have had a performance issue in the queuing process for the widget. Customers are waiting longer than necessary at the entrance to the ticket widget before they can choose and buy their tickets.
The slowdown peaked at 11:00 (Europe/Paris).
We have found that a backend database has a failing maintenance process that silently delays many requests. We have changed the priorities of some processes so that queue management is fast again. We are investigating the underlying issue to determine the complete fix.