Performance Degradation: Experiencing Long Load Times

Incident Report for Flatfile

Postmortem

Incident Overview

Date and Time of Incident: November 4 08:15 PST - 10:51 PST

Nature of Incident: Exhausted Database Connections due to Thawing Expired Workbooks

Services Affected: Flatfile Platform US Region

Details of the Incident

On November 4 a daily event expired a large number of workbooks. Many of the expired workbooks had been backed up and moved to cold storage. The expiry event caused these workbooks to first thaw before being marked as expired and purged. This led to elevated database connections growing for a period of time that ultimately exhausted database resources at approximately 8:15 PST. This caused intermittent service degradations across the platform that manifested as API requests randomly failing when the API was unable to open a database connection.

Flatfile engineering took steps to short circuit the thawing process for expired workbooks and hardened the thaw process so that a large number of workbooks moving from cold storage to “hot” storage would not spike database resources.

Impact Assessment

During the incident window, some API requests would randomly fail leading to unpredictable behavior across the Platform.

Root Cause

The data retention policy feature allows a customer to automatically purge workbooks that have not been active in a set amount of days. This feature would run on a cron-like basis to identify and purge workbooks. It was discovered that this conflicted with the cold storage system for workbooks where any workbook that was backed up and moved to “cold” storage (S3 backup) would first be “thawed” and moved into “hot” storage. In the early morning of the 4th, a large number of workbooks in cold storage were identified for expiry and several thousand workbooks entered the queue to await thaw. Over the course of several hours, the database connection pool became saturated causing API requests to intermittently fail while also causing workbooks in the thaw process to fail and be re-enqueued which exacerbated the connection pool problem.

Resolution

Flatfile engineering took steps to implement a short circuit for expired workbooks to resolve the immediate symptoms. The engineering team followed up with a series of steps to harden the queue, the database connection pool, and the thaw mechanism as well.

Security and Data Integrity

Please be assured that this incident did not compromise the security or integrity of your data. Our commitment to data protection remains a top priority.

Posted Nov 07, 2025 - 08:17 PST

Resolved

This incident has been resolved.
Posted Nov 04, 2025 - 10:51 PST

Update

We are continuing to monitor for any further issues.
Posted Nov 04, 2025 - 09:00 PST

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Nov 04, 2025 - 08:57 PST

Identified

The issue has been identified and a fix is being implemented.
Posted Nov 04, 2025 - 08:51 PST

Investigating

We are currently investigating this issue.
Posted Nov 04, 2025 - 08:15 PST
This incident affected: Flatfile Platform (Platform API, Spaces, Dashboard).