Date and Time of Incident: June 4, 2025 from approximately 4:57 AM to 5:31 AM PDT
Nature of Incident: Requests routing to an unavailable worker pod
Services Affected: Flatfile Platform UK Regional Environment
At approximately 4:57 AM PDT (11:57 AM BST) on June 4th, Flatfile UK Regional environment infrastructure experienced intermittent failures on file uploads due to the backend worker fleet restarting. This was due to a misconfiguration on the backend worker infrastructure which created a hard dependency on an internal observability component used by FlatFile for monitoring.
The internal observability component had experienced some downtime which led to backend worker fleet restarts. The infrastructure team were alerted to the incident and issued a patch to remove the hard dependency on the worker fleet.
The incident impacted only half of the total worker fleet due as the infrastructure safe guards kicked and rolled out extra workers in order to meet minimum capacity threshold of 50%. The UK regional environment experienced disruption for a total of 33 minutes. The API was unaffected.
The root cause was due to a misconfiguration on the infrastructure worker observability configuration made in a previous release 3 weeks ago.
Flatfile infrastructure team fixed the miconfiguration on the worker infrastructure components by removing the hard depedency on Flatfile’s internal observability stack.
Please be assured that this incident did not compromise the security or integrity of your data. Our commitment to data protection remains a top priority.