Sizing Sequences with SQL and ClickHouse
Recently, I was planning a data extraction strategy for an API. The goal was to pick an extraction frequency that avoids cached responses while staying within the rate limits. To have data to analyze, I collected the API response every minute for 3 hours. This produced 185 files (3 full hours plus a few extra minutes) containing 55746 records, with a total size of 6.15 MB, saved in JSON Lines format and compressed with gzip. For the analysis I used clickhouse-local, a utility that runs SQL against local files without setting up a database, creating tables, or loading data.
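To give a feel for what running SQL against local files looks like, here is a minimal sketch that counts the collected files and records. The `responses/` directory and the `*.json.gz` naming are hypothetical (the actual file layout is not shown here), and the query assumes a recent clickhouse-local that can infer the schema of JSON Lines files on its own.

```shell
# Hypothetical layout: one gzip-compressed JSON Lines file per API call
# stored in ./responses/ (directory and file names are assumptions).
# _file is a virtual column that holds the name of the source file;
# gzip compression is detected from the .gz extension.
QUERY="
SELECT
    uniqExact(_file) AS files,
    count() AS records
FROM file('responses/*.json.gz', 'JSONEachRow')
"

# Run only when the tool and the sample directory are present.
# No server, table definition, or load step is involved.
if command -v clickhouse-local >/dev/null 2>&1 && [ -d responses ]; then
    clickhouse-local --query "$QUERY"
fi
```

For the dataset described above, a query like this would be expected to report 185 files and 55746 records.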