mirror of https://github.com/PostHog/posthog.git synced 2024-11-28 18:26:15 +01:00
posthog/ee/clickhouse/migrations
Harry Waye a819069128
chore(pdi): add data migration for pdi to pdi2 (#7792)
* chore(pdi): add data migration for pdi to pdi2

This adds an async migration to copy the latest non-deleted
`(team_id, person_id, distinct_id)` tuples from `pdi` to `pdi2`.

Note that this has already been performed for team_id = 2 on PostHog
Cloud, so we ensure we're maintaining parity with however this
migration was performed. I've done this by running:

```
SELECT * FROM <old_query> old
FULL JOIN <new_query> new
    ON old.distinct_id = new.distinct_id
WHERE old.person_id <> new.person_id
```

specifically for team_id = 2.
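
The same parity check can be sketched in Python over two in-memory
`distinct_id -> person_id` mappings. This is only an illustration: the
function name `find_mismatches` and the dict inputs are hypothetical and
not part of the migration, and treating a `distinct_id` that is missing
on one side as a mismatch is an assumption about the intended semantics.

```python
from typing import Dict, Optional, Tuple


def find_mismatches(
    old: Dict[str, str], new: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return distinct_ids whose person_id differs between old and new.

    Mimics a full join on distinct_id: keys present on only one side are
    reported with None for the missing side (an assumption, see above).
    """
    mismatches = {}
    for distinct_id in sorted(old.keys() | new.keys()):
        old_person = old.get(distinct_id)
        new_person = new.get(distinct_id)
        if old_person != new_person:
            mismatches[distinct_id] = (old_person, new_person)
    return mismatches
```

An empty result means the two mappings agree for every `distinct_id`.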

* Rename migration

* Skip 0003_fill_person_distinct_id2 on fresh installs

* Clarify version requirements

* Run async migrations using a while-loop instead of tail recursion

Python's default recursion limit is 1000, which we might easily hit
with the 0003 migration
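
A minimal sketch of the difference, assuming a migration is modelled as
a list of callables. The names `run_steps_recursive` and `run_steps_loop`
are hypothetical and are not the actual async migration runner:

```python
import sys
from typing import Callable, List


def run_steps_recursive(steps: List[Callable[[], None]], index: int = 0) -> None:
    """Run each step, then recurse for the next one.

    Python does not optimise tail calls, so every step adds a stack frame
    and long step lists raise RecursionError.
    """
    if index >= len(steps):
        return
    steps[index]()
    run_steps_recursive(steps, index + 1)


def run_steps_loop(steps: List[Callable[[], None]]) -> int:
    """Run each step in a while-loop; stack depth stays constant."""
    index = 0
    while index < len(steps):
        steps[index]()
        index += 1
    return index  # number of steps that were run
```

With more steps than `sys.getrecursionlimit()`, the recursive variant
fails while the loop variant completes.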

* Use built-in progress tracking

* Make description fit into database 400 char limit

* Add correctness test for new async migration

* Migrate person_distinct_id2 team-by-team

* Remove dead code

* Update migration notes

* Fix foss tests

Co-authored-by: Karl-Aksel Puulmann <oxymaccy@gmail.com>
2022-01-04 15:34:12 +02:00
..
__init__.py
0001_initial.py Add snapshot tests for clickhouse table schemas (#7572) 2021-12-08 16:07:34 +02:00
0002_events_materialized.py
0003_person.py chore(pdi): add data migration for pdi to pdi2 (#7792) 2022-01-04 15:34:12 +02:00
0004_kafka.py
0006_session_recording_events.py Add snapshot tests for clickhouse table schemas (#7572) 2021-12-08 16:07:34 +02:00
0007_static_cohorts_table.py Add snapshot tests for clickhouse table schemas (#7572) 2021-12-08 16:07:34 +02:00
0008_plugin_log_entries.py Add snapshot tests for clickhouse table schemas (#7572) 2021-12-08 16:07:34 +02:00
0009_person_deleted_column.py
0010_cohortpeople.py Add snapshot tests for clickhouse table schemas (#7572) 2021-12-08 16:07:34 +02:00
0011_cohortpeople_no_shard.py Add snapshot tests for clickhouse table schemas (#7572) 2021-12-08 16:07:34 +02:00
0012_person_id_deleted_column.py
0013_persons_distinct_ids_column.py
0014_persons_distinct_ids_column_remove.py
0015_materialized_column_comments.py
0016_collapsing_person_distinct_id.py Add snapshot tests for clickhouse table schemas (#7572) 2021-12-08 16:07:34 +02:00
0017_events_dead_letter_queue.py Add snapshot tests for clickhouse table schemas (#7572) 2021-12-08 16:07:34 +02:00
0018_group_analytics_schema.py Add snapshot tests for clickhouse table schemas (#7572) 2021-12-08 16:07:34 +02:00
0019_group_analytics_materialized_columns.py
0020_session_recording_events_window_id.py
0021_session_recording_events_materialize_full_snapshot.py
0022_person_distinct_id2.py Create and populate person_distinct_id2 table, add versioning to person_distinct_id (#7576) 2021-12-08 16:47:57 +02:00