1) How does this deal with backups? Presumably the deltalake tables can't be backed up by Postgres itself, so I guess there's some special way to do backups?
2) Similarly for replication (physical or logical). Presumably that's not supported, right? I guess logical replication is more useful for OLTP databases from which the data flow to datalakes, so that's fine. But what's the HA story without physical replication?
3) Presumably all the benefits are from compression at the storage level? Or are there some tweaks to the executor to do columnar stuff? I looked at the hooks in pg_analytics, but I see only stuff to handle DML. But I don't speak Rust, so maybe I missed something.
1) and 2) Backups and replication are on the roadmap. It's the next major feature we're working on.
3) We store data in Parquet files, a heavily compressed file format for columnar data. This gives us good compression and compatibility with Arrow for in-memory columnar processing. We also hook in at the executor level and route queries on deltalake tables to the DataFusion query engine, which processes the data in a vectorized fashion for much faster execution.
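To make the routing idea concrete, here is a toy sketch in Python. It is not the pg_analytics code (that is written in Rust), and every name in it (`executor_hook`, `COLUMNAR_TABLES`, the two stores) is made up for illustration. It only shows the general shape, assuming a hook that checks which kind of table a query touches and hands columnar tables to a separate engine that reads just the needed column.

```python
from dataclasses import dataclass

# Hypothetical stand-ins; none of these names come from pg_analytics or DataFusion.
COLUMNAR_TABLES = {"events"}

# Columnar layout: one contiguous list per column.
COLUMN_STORE = {"events": {"amount": [10, 20, 30], "user_id": [1, 2, 1]}}

# Row layout: one record per row, as in a regular heap table.
ROW_STORE = {"orders": [{"amount": 5, "user_id": 1}, {"amount": 7, "user_id": 2}]}


@dataclass
class Query:
    table: str   # table being aggregated
    column: str  # column to sum


def columnar_sum(q: Query):
    # Touches only the requested column and processes it as one batch.
    return ("columnar", sum(COLUMN_STORE[q.table][q.column]))


def row_sum(q: Query):
    # Walks every row and picks the column out of each one.
    total = 0
    for row in ROW_STORE[q.table]:
        total += row[q.column]
    return ("row", total)


def executor_hook(q: Query):
    # Route queries on columnar tables to the columnar engine,
    # and leave everything else on the regular row-at-a-time path.
    if q.table in COLUMNAR_TABLES:
        return columnar_sum(q)
    return row_sum(q)
```

Calling `executor_hook(Query("events", "amount"))` takes the columnar path, while `executor_hook(Query("orders", "amount"))` falls through to the row path.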
Thanks. I wonder how you plan to replicate stuff, considering how heavily Postgres replication relies on WAL (I haven't found the answer in the code, but I'm not very familiar with Rust).
How large a part of the plan do you route for the deltalake tables? Just scans, or some more complex part? Can you point me to the part of the code doing that? I'm intrigued.