mirror of https://github.com/PostHog/posthog.git synced 2024-12-01 12:21:02 +01:00
posthog/ee/clickhouse
migrations Create Omni-Person model for managing people in Clickhouse (#1712) 2020-09-25 11:05:50 +01:00
models Create Omni-Person model for managing people in Clickhouse (#1712) 2020-09-25 11:05:50 +01:00
queries Fix Master EE code (#1701) 2020-09-24 06:14:17 -04:00
sql Create Omni-Person model for managing people in Clickhouse (#1712) 2020-09-25 11:05:50 +01:00
test Clickhouse Elements Dedup (based on master) (#1698) 2020-09-24 06:47:28 -04:00
views Organizations – models (#1674) 2020-09-24 00:53:51 +02:00
__init__.py Fix Master EE code (#1701) 2020-09-24 06:14:17 -04:00
clickhouse_test_runner.py Fix Master EE code (#1701) 2020-09-24 06:14:17 -04:00
client.py Fix Master EE code (#1701) 2020-09-24 06:14:17 -04:00
process_event.py Create Omni-Person model for managing people in Clickhouse (#1712) 2020-09-25 11:05:50 +01:00
README.md first chunk of clickhouse framework (#1613) 2020-09-08 16:12:27 -07:00
util.py Clickhouse Elements Dedup (based on master) (#1698) 2020-09-24 06:47:28 -04:00

Clickhouse Support (Enterprise Feature)

To accommodate high-volume deployments, PostHog can use Clickhouse instead of Postgres. Clickhouse isn't used by default because Postgres is easier to deploy and maintain on smaller instances and on platforms such as Heroku.

Clickhouse Support works by swapping in separate queries and classes for the models that are most affected by high-volume usage (e.g. events and actions).
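
As a rough illustration of the swapping idea, the sketch below selects between two query classes that share one interface. `BaseQuery` and its `run` function are mentioned in the Tests section; the `run` signature, `PostgresTrends`, `ClickhouseTrends` and `get_trends_query` are hypothetical names made up for this example and are not taken from the codebase.

```python
from typing import Any, Dict, List


class BaseQuery:
    """Shared interface: every query class exposes run() (signature assumed)."""

    def run(self, filter: Dict[str, Any]) -> List[Dict[str, Any]]:
        raise NotImplementedError


class PostgresTrends(BaseQuery):
    """Hypothetical stand-in for a Django ORM backed query class."""

    def run(self, filter: Dict[str, Any]) -> List[Dict[str, Any]]:
        return [{"backend": "postgres", "filter": filter}]


class ClickhouseTrends(BaseQuery):
    """Hypothetical stand-in for a Clickhouse SQL backed query class."""

    def run(self, filter: Dict[str, Any]) -> List[Dict[str, Any]]:
        return [{"backend": "clickhouse", "filter": filter}]


def get_trends_query(use_clickhouse: bool) -> BaseQuery:
    """Pick the implementation; callers only depend on BaseQuery.run()."""
    return ClickhouseTrends() if use_clickhouse else PostgresTrends()
```

Because both classes expose the same `run` function, calling code does not need to know which storage backend is active.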

Migrations and Models

The django_clickhouse ORM is used to manage migrations and models. It mimics the Django model and migration structure in the main folder.

Queries

Queries parallel the queries folder in the main folder. However, Clickhouse queries are written in SQL and do not use the ORM.
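
A minimal sketch of what a SQL-based query could look like is shown below. The table name, column names, the `count_events` helper and the `execute` callable are all assumptions for illustration; the actual queries live in the sql and queries folders and may be shaped differently.

```python
from typing import Any, Callable, Dict, List, Tuple

# Assumed table and column names; the query is plain SQL with named parameters.
EVENT_COUNT_SQL = """
SELECT count(*)
FROM events
WHERE team_id = %(team_id)s AND event = %(event)s
"""

# Hypothetical executor type: takes SQL text and parameters, returns rows.
Executor = Callable[[str, Dict[str, Any]], List[Tuple[Any, ...]]]


def count_events(execute: Executor, team_id: int, event: str) -> int:
    """Run the SQL through the supplied executor and return the single count."""
    rows = execute(EVENT_COUNT_SQL, {"team_id": team_id, "event": event})
    return rows[0][0]


def fake_execute(sql: str, params: Dict[str, Any]) -> List[Tuple[Any, ...]]:
    """Stand-in for a real Clickhouse client so the sketch runs without a server."""
    return [(3,)]
```

Passing the executor in as an argument keeps the SQL testable without a running Clickhouse instance, for example `count_events(fake_execute, team_id=1, event="pageview")`.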

Tests

The tests are inherited from the main folder. The Clickhouse query classes are based on BaseQuery, so their run function should work just like that of the Django ORM backed query classes. These classes are called with the parameterized tests declared in the main folder, which allows the same suite of tests to be run against different implementations.
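
One way to picture this sharing is a test factory that takes the query class as a parameter. The sketch below uses only `unittest`; `OrmCount`, `SqlCount` and `count_test_factory` are hypothetical names for this example, and the real test factories in the main folder may look different.

```python
import unittest
from typing import Dict, List, Type


class OrmCount:
    """Hypothetical stand-in for a Django ORM backed query class."""

    def run(self, events: List[Dict[str, str]]) -> int:
        return len(events)


class SqlCount:
    """Hypothetical stand-in for a Clickhouse backed query class."""

    def run(self, events: List[Dict[str, str]]) -> int:
        return sum(1 for _ in events)


def count_test_factory(query_class: Type) -> Type[unittest.TestCase]:
    """Build a TestCase whose tests exercise whichever class is passed in."""

    class TestCount(unittest.TestCase):
        def test_counts_events(self) -> None:
            events = [{"event": "a"}, {"event": "b"}]
            self.assertEqual(query_class().run(events), 2)

    return TestCount


# Each implementation reuses the same tests by subclassing the factory output.
class TestOrmCount(count_test_factory(OrmCount)):
    pass


class TestSqlCount(count_test_factory(SqlCount)):
    pass
```

Adding a test to the factory makes it run for every implementation, which is what keeps the two backends behaving alike.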

Views

Views contain Viewset classes that are not backed by models. Instead, the views query Clickhouse tables using SQL. These views match the interface provided by the views in the main folder.