Postgres: Copying large subsets of data(10k+rows) from one schema to another for immediate use

postgresql sqlbulkcopy


I'm needing to push rows from one Postgres schema to another on demand. One schema stores the content and the other is a copy of a subset of data that the end user will add, modify, delete for their custom use. There are 10+million records and growing in each of the schema tables as more content is added and more content is absorbed and customized by the end user. The data moving would be several thousand records from multiple tables to multiple tables in the other schema. These schemas are in the same database. This data will be moving all the time and would need to be extremely quick so the end user can start customizing immediately after choosing the content. I've looked into the COPY function as well as trigger replication but none seem to perform as quickly as needed. I cannot remove indexes before insert as these tables are used constantly and heavily by consumers. Any advice would be greatly appreciated.

6/13/2017 5:57:59 PM

Accepted Answer

Since you are copying from a bunch of tables in one schema of a database to another schema of the same database, and cannot start until your fastest way to do so is going to be by using INSERT INTO ... SELECT FROM, run as a user who has access to both schema, e.g.:

INSERT INTO schema2.table_y ( field, field, field )
SELECT field, field, field
FROM schema1.table_y
WHERE user filter condition

Since you have to do this over multiple tables, you can either wrap it in one big transaction, or execute multiple insert statements using Writeable CTEs in one big query.

That is really going to be the fastest way to get a copy of the data. However it will still take non-zero time; both the SELECT queries to extract the data will take some time, as will inserting thousands of rows across multiple tables. Your ways to reduce this time are general query and performance optimization.

If the manipulation works such that the user doesn't need a copy of the data, but just a live subset of the data, then you could explore using VIEWs, including updateable ones. But it sounds like you need a copy.

6/13/2017 8:31:02 PM

Related Questions

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow