# Data delivery standards
## 1. Delivery frequency and reliability

To ensure timely and efficient data processing, partners must finalize the following operational parameters with Klavi during the onboarding phase.

- **Scope & historical data:** partners are required to coordinate with the Klavi business team to define the specific data scope and the historical look-back period required for the initial data load.
- **Incremental update frequency:** establish the delivery cadence (e.g., daily, weekly, biweekly, or monthly) for incremental updates based on your business needs.
- **Customized SLAs:** because data volume and complexity vary significantly between partners, service level agreements (SLAs), including data delivery cut-off times, are customized. They are determined based on the specific volume of data being processed to ensure realistic and reliable delivery windows.

## 2. Schema customization & data dictionary

Klavi provides flexible data structures tailored to the specific business focus of our partners.

- **Customizable schemas:** data fields and definitions are not fixed. Partners can customize the schema to align with their internal analytics or reporting requirements.
- **Business consultation:** the final data dictionary and field mappings will be documented and confirmed in collaboration with Klavi business personnel prior to technical implementation.

## 3. Storage architecture and pathing rules

While partners may specify their own S3 bucket and base path for data reception, Klavi enforces a standardized directory and naming convention for the sub-paths to ensure automated ingestion compatibility.

### 3.1 Directory structure

Data is organized by source type, version, and delivery date using the following partition logic:

`[base_path]/external_sources_klavi/[data_type]/v[<version_number>]/delivery_ts=YYYY-MM-DDTHH:MM:SSZ/`

Example:

`s3://partner-bucket/custom-prefix/external_sources_klavi/credit_transactions/v1/delivery_ts=2026-01-23T17:44:58Z/`

### 3.2 Filename convention

Files are named using the data type and a Unix timestamp to ensure uniqueness and auditability:

`[data_type]_[unix_timestamp].parquet`

Example: `credit_transactions_1737557926.parquet` (a helper for constructing these paths and filenames is sketched after section 5).

## 4. Technical specifications

We utilize high-performance storage formats to ensure compatibility with modern data warehouses (such as Snowflake, BigQuery, or Redshift).

- **Format & compression:** all data is delivered in Apache Parquet format using ZSTD compression.
- **Strict data typing:** columns are defined with appropriate native types (integer, boolean, timestamp). We avoid generic string types to ensure data integrity.
- **Raw data integrity:** string values (such as merchant names) are delivered as raw strings, without normalization or pre-processing, so that the partner receives the original source of truth.

## 5. Data integrity and idempotency

To ensure reliability and prevent ingestion of partial data, we implement the following.

- **Immutability:** delivered data is immutable. Any re-delivery or correction will be issued under a new timestamped folder.
- **Completion signal:** a success marker file is uploaded to the delivery folder only after the data upload is 100% complete. Ingestion pipelines should be triggered by the appearance of this file (see the ingestion sketch below).
- **Timezone handling:** time and date fields carry the raw values returned by the institution; Klavi does not perform any conversion.
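The pathing rules in sections 3.1 and 3.2 are mechanical, so a consumer can generate and validate them in code. The following is a minimal sketch, assuming the separators shown in the examples above; the helper names `delivery_prefix` and `delivery_filename` are illustrative, not part of any Klavi SDK.

```python
from datetime import datetime, timezone


def delivery_prefix(base_path: str, data_type: str, version: int,
                    delivered_at: datetime) -> str:
    """Build the delivery folder for one batch per section 3.1."""
    ts = delivered_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return (f"{base_path}/external_sources_klavi/{data_type}"
            f"/v{version}/delivery_ts={ts}/")


def delivery_filename(data_type: str, delivered_at: datetime) -> str:
    """Build the filename for one batch per section 3.2."""
    return f"{data_type}_{int(delivered_at.timestamp())}.parquet"


batch = datetime(2026, 1, 23, 17, 44, 58, tzinfo=timezone.utc)
print(delivery_prefix("s3://partner-bucket/custom-prefix",
                      "credit_transactions", 1, batch))
# -> s3://partner-bucket/custom-prefix/external_sources_klavi/credit_transactions/v1/delivery_ts=2026-01-23T17:44:58Z/
print(delivery_filename("credit_transactions", batch))
```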
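Because a delivery folder may become visible before the upload finishes, ingestion should be gated on the completion signal described in section 5. Below is a minimal consumer sketch using boto3 and pyarrow; the marker filename `_SUCCESS` is an assumption to confirm with Klavi, and the bucket and prefix reuse the hypothetical values from the section 3.1 example.

```python
import boto3
import pyarrow.parquet as pq

BUCKET = "partner-bucket"  # hypothetical bucket from the section 3.1 example
PREFIX = ("custom-prefix/external_sources_klavi/credit_transactions/"
          "v1/delivery_ts=2026-01-23T17:44:58Z/")
MARKER = "_SUCCESS"  # assumption: confirm the actual marker name with Klavi

s3 = boto3.client("s3")


def marker_present(bucket: str, prefix: str) -> bool:
    """Return True once the completion marker exists in the delivery folder."""
    resp = s3.list_objects_v2(Bucket=bucket, Prefix=prefix + MARKER, MaxKeys=1)
    return resp.get("KeyCount", 0) > 0


def ingest(bucket: str, prefix: str) -> None:
    """Read every Parquet file in a delivery, but only once it is complete."""
    if not marker_present(bucket, prefix):
        return  # upload still in progress; poll again later
    # A single page (up to 1,000 keys) is enough for a sketch; use a
    # paginator for larger deliveries.
    resp = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    for obj in resp.get("Contents", []):
        key = obj["Key"]
        if not key.endswith(".parquet"):
            continue
        local = "/tmp/" + key.rsplit("/", 1)[-1]
        s3.download_file(bucket, key, local)
        table = pq.read_table(local)  # ZSTD decompression handled by pyarrow
        print(key, table.num_rows, "rows")


ingest(BUCKET, PREFIX)
```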
## 6. Record management and deduplication

- **Record deduplication:** the incremental and historical data delivered by Klavi have already been deduplicated; partners do not need to perform any deduplication of their own.
- **Record delay:** because some institutions post credit card transactions with a delay of 1 to 2 days, Klavi includes the late-arriving records in the next day's file (a merge sketch follows below).
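Since deliveries are immutable and late-arriving or corrected transactions surface in a later day's file, a partner maintaining a local copy should merge new files by key rather than blindly append. This is a minimal pandas sketch; the key column `transaction_id` and the file paths are hypothetical, and the real primary key comes from the data dictionary agreed with Klavi (section 2).

```python
import pandas as pd

# Hypothetical paths; "transaction_id" is an illustrative key, not a
# confirmed Klavi field name.
existing = pd.read_parquet("warehouse/credit_transactions.parquet")
todays_delivery = pd.read_parquet("/tmp/credit_transactions_1737557926.parquet")

# Late-arriving rows reappear in the next day's file (section 6), so keep
# the newest copy of each transaction. keep="last" wins because the new
# delivery is concatenated after the existing store.
merged = (
    pd.concat([existing, todays_delivery])
      .drop_duplicates(subset="transaction_id", keep="last")
)
merged.to_parquet("warehouse/credit_transactions.parquet", compression="zstd")
```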