0
0
mirror of https://github.com/PostHog/posthog.git synced 2024-11-30 19:41:46 +01:00
posthog/ee/clickhouse
Eric Duong a0327587cb
Clickhouse setup (#1463)
* initial

* migration command

* migrations working

* add modelless views for clickhouse

* initial testing structure

* use test factory

* scaffold for all tests

* add insight and person api

* add basic readme

* add client

* change how migrations are run

* add base tables

* ingesting events

* restore delay

* remove print

* updated testing flow

* changed sessions tests

* update tests

* reorganized sql

* parametrize strings

* element list query

* change to seralizer

* add values endpoint

* retrieve with filter

* pruned code to prepare for staged merge

* working ingestion again

* tests for ee

* undo unneeded tests right now

* fix linting

* more typing errors

* fix tests

* add clickhouse image to workflow

* move to right job

* remove django_clickhouse

* return database url

* run super

* remove keepdb

* reordered calls

* fix type

* fractional seconds

* fix type error

* add checks

* remove retention sql

* fix tests

* add property storage and tests

* merge master

* fix tests

* fix tests

* .

* remove keepdb

* format python files

* update CI env vars

* Override defaults and insecure tests

* Update how ClickHouse database gets evaluated

* remove bootstrapping clickhouse database routine

* Don't initialize the clickhouse connection unless we say it's primary

* .

* fixed id generation

* remove dump

* black settings

* empty client

* add param

* move docker-compose for ch to ee dir

* Add _public_ key to repo for verifying self signed cert on server

* update ee compose file for ee dir

* fix a few issues with tls in migrations

* update migrations to be flexible about storage profile and engine

* black settings

* add elements prop tables

Co-authored-by: James Greenhill <jams@uber.com>
2020-09-03 10:27:45 -07:00
..
migrations Clickhouse setup (#1463) 2020-09-03 10:27:45 -07:00
models Clickhouse setup (#1463) 2020-09-03 10:27:45 -07:00
queries Clickhouse setup (#1463) 2020-09-03 10:27:45 -07:00
sql Clickhouse setup (#1463) 2020-09-03 10:27:45 -07:00
test Clickhouse setup (#1463) 2020-09-03 10:27:45 -07:00
views Clickhouse setup (#1463) 2020-09-03 10:27:45 -07:00
__init__.py Clickhouse setup (#1463) 2020-09-03 10:27:45 -07:00
clickhouse_test_runner.py Clickhouse setup (#1463) 2020-09-03 10:27:45 -07:00
client.py Clickhouse setup (#1463) 2020-09-03 10:27:45 -07:00
process_event.py Clickhouse setup (#1463) 2020-09-03 10:27:45 -07:00
README.md Clickhouse setup (#1463) 2020-09-03 10:27:45 -07:00
util.py Clickhouse setup (#1463) 2020-09-03 10:27:45 -07:00

Clickhouse Support (Enterprise Feature)

To accomodate high volume deployments, Posthog can use Clickhouse instead of Postgres. Clickhouse isn't used by default because Postgres is easier to deploy and maintain on smaller instances and on platforms such as Heroku.

Clickhouse Support works by swapping in separate queries and classes in the system for models that are most impacted by high volume usage (ex: events and actions).

Migrations and Models

The django_clickhouse orm is used to manage migrations and models. The ORM is used to mimic the django model and migration structure in the main folder.

Queries

Queries parallel the queries folder in the main folder however, clickhouse queries are written in SQL and do not utilize the ORM.

Tests

The tests are inherited from the main folder. The Clickhouse query classes are based off BaseQuery so their run function should work just as the Django ORM backed query classes. These classes are called with the paramterized tests declared in the main folder which allows the same suite of tests to be run with different implementations.

Views

Views contain Viewset classes that are not backed by models. Instead the views query Clickhouse tables using SQL. These views match the interface provide by the views in the main folder.