* Hotfix: Use materialized columns on cloud
This was broken since default_kind was different on Distributed tables
on cloud
* Improve __init__.py
* Make reverting DEFAULT column async, only ON CLUSTER for events table
* Use proper interval calculation in the funnel trends query
* Add some comments
* Update `test_filter`
* Rework `NULL_SQL` to use CH `INTERVAL` too
* Fix week-based relative `date_from` support not existing
* Make use of `toInterval*` functions and inject less
* Add fallback for `date_from` in `ClickhouseSessionsAvg`
* Revert "Revert "Add is_deleted column to person_distinct_id (#5151)" (#5193)"
This reverts commit 401268bdba.
* A tweak for docker-compose builds
Co-authored-by: James Greenhill <fuziontech@gmail.com>
* Update PERSONS_ACTIVE_USER_SQL query
* Remove dead import
* Update lifecycle queries
* Update BREAKDOWN_ACTIVE_USER_INNER_SQL to use new persons query
* Update STICKINESS_SQL
* Update STICKINESS_PEOPLE_SQL
* Update STICKINESS_ACTIONS_SQL
* Update paths query
* Update events query
* Update CALCULATE_COHORT_PEOPLE_SQL
* Update retention queries
* Update TOP_PERSON_PROPS_ARRAY_OF_KEY_SQL
* Update EVENT_JOIN_PERSON_SQL
* Update GET_PERSON_ID_BY_ENTITY_COUNT_SQL
* Remove remaining references to old get latest person query
* Update GET_DISTINCT_IDS_BY_PROPERTY_SQL
* Fix code style issue
* Update table engine for person_distinct_id table
* don't select team_id
* Make person deletion work
* Use replacingmergetree over collapsing with is_deleted
Replacing an existing engine is hard, let's not do it
* Update query in test
* add migration
* set database on materialized views
* Update plugin server to 1.1.6
Co-authored-by: James Greenhill <fuziontech@gmail.com>
Co-authored-by: posthog-bot <posthog-bot@users.noreply.github.com>
* Make DDLs more friendly towards running on a cluster
* Use primary CLICKHOUSE host for migrations and DDL
* loose ends on person kafka create
* posthog -> cluster typo
* add cluster to KAFKA create for plugin logs
* Feed the type monster
* clusterfy local clickhouse
* test docker-compose backed github action
* run just clickhouse and postgres from docker-compose
* move option to between up and <services>
* posthog all the things
* suggest tests run on cluster
* posthog cluster for ci
* use deploy path for docker-compose
* fix for a clickhouse bug 🐛
* complete CH bug fixes
* 5439 the github actions pg configs
* remove CLICKHOUSE_DATABASE (handled automatically)
* update DATABASE_URL for code quality checks
* Missed a few DDLs on Person
* 5439 -> 5432 to please the people
* cleanup persons and use f strings <3 f strings
* remove auto parens
* Update requirements to use our fork of infi.clickhouse_orm
* fix person.py formatting
* Include boilerplate macros for a cluster
* wip: pagination for persons on clickhouse funnels
* wip: added offset support for getting a list of persons; added support for conversion window;
* fixed mypy exception
* helper function to insert data for local testing
* moved generate code into separate class for more functionality later
* corrected person_distinct_id to use the person id from postgres
* minor corrections to generate local class along with addition of data cleanup via destroy() method
* reduce the number of persons who make it to each step
* moved funnel queries to a new folder for better organization; separated funnel_persons and funnel_trends_persons into individual classes;
* funnel persons and tests
* initial implementation
* invoke the funnel or funnel trends class respectively
* add a test
* add breakdown handling and first test
* add test stubs
* remove repeats
* mypy corrections and PR feedback
* run funnel test suite on new query implementation
* remove imports
* corrected tests
* minor test updates
* correct func name
* fix types
* func name change
* Make `SHELL_PLUS_PRINT_SQL` clearer
* Add ClickhouseFunnelTrendsNew
* Create test_funnel_trends_new.py
* Create test_funnel_trends_v2.py
* move builder functions to funnel base
* add test classe for new funnel
* Inherit from `ClickhouseFunnelNew` and fix intervals
* Add proper formatting of trends results
* Clean tests up a little bit
* Group `FunnelWindowDaysMixin` tests in `test_funnel_persons`
* Rename `ClickhouseFunnelTrendsNew` things for clarity
* Port some original `ClickhouseFunnel` trends tests for the new query
* Only fetch initial page (100) of persons in trends query
* Describe assumptions and rename things
* Finish porting old ClickhouseFunnelTrends tests and add some new ones
* Remove unused imports
* Try to fix `test_period_not_final`
* Try to fix `test_period_not_final` again
* remove persons lists
* rename
* fix test
* add timezone to results
* add funnel trends new to api path
* revert random change
Co-authored-by: Buddy Williams <buddy@posthog.com>
Co-authored-by: eric <eeoneric@gmail.com>
* checkpoint: refactoring funnel trends so that they work correctly
* wip: refactoring funnel trends query to return the results we actually need
* wip: added in new query for testing
* wip: moved sql into a separate file, converted list to dictionary, and added several tests to check data quality
* wip: with a better understaning of funnel trends I've refactored the query so that I can write a data transformer in python
* moved code into funnel_trends for both logic and tests to isolate the concern
* reordered methods for readability
* wip: refactoring funnel trends to support filters
* wip: added updated SQL which will replace the existing FUNNEL_TREND_SQL
* correct column name so that it's clearer
* added substitution variables to new query
* fixed missing substitution variable
* wip: integrating new query with correct params, added mixins for funnel_window, and working toward working test
* several query corrections
* summarize funnel trends
* moved method down
* removed unused code
* added data quality checks
* corrected cohort size for tests
* test window size and incomplete status
* corrected a few names
* removed unnecessary comment
* commented out old funnel trends tests
* removed print statement
* removed old funnel trend code
* made funnel trends response match existing data structure layout
* removed unused imports
* removed more unused imports
* fixed mypy errors
* Added ClickhouseFunnelBase to extract common methods for both ClickhouseFunnelTrends and ClickhouseFunnel; this also fixes issues with tests;
* removed unused type comment
* corrected test to account for new funnel_window_days mixin
* fixed clickhouse funnel tests
* fixes for automated tests
* changed team_id to use client substitution to avoid sql injection attempts in the future but since it's not user supplied it's not currently an issue
* corrections prompted by PR review
* corrected test to dict test with funnel_window_days
* fixing bug
* conslidate breakdown query
* remove clause
* add missing param
* fix name and bracket
* fix tests add none clause
* fix more tests
* more test adjustments
* add person prop filter test
* only use None when offset is 0 (first query)
* add null check and fix api helper
* swap out distinct_id table in queries with a subquery that will only consider latest distinct_ids
* wrong import
* fix missin params
* more missing params
* MV -> View for events_with_array_props_view
* sort
* convert another mv into view because mv's are not triggered by views
* pedantic - no mat in naming
* remove unused tables
* Finish the local dev w/ proto setup
* WIP manage events view
* Add task, add interface etc
* Move everything to 'manage events' view
* Move all settings into single dropdown (can be reverted)
* Urls for tabs
* Fix migration
* Clickhouse and humanize volume
* Fix cypress test
* Fix sidebar cypress
* Fix cypress again
* Fix some small issues
* Address comments
* Corect naming
* Fix test'
Co-authored-by: James Greenhill <fuziontech@gmail.com>
* remove person properties up to date
* remove person props mv
* move latest person
* prune rest of person materialized
* missing parenth
* add type
* remove migration
* Fix feature flags clickhouse
* Fix feature flags clickhouse
* Fix types
* Fix stuff
* Silly me
Co-authored-by: Eric <eeoneric@gmail.com>
* Clickhouse use elements chain
* Fix stuff
* Add action tests and start regex
* Progress
* Progress part deux
* Fix everything
* Add tag name filtering
* Fix funnels
* Fix tag name regex
* Fix ordering
* Fix type issues
* Fix empty nth-child
* Remove commented code
* Split with semicolon and escaped quotes
* Specify all select columns
* Use toDate to speed up ordering
* Add another check
* Try debugging test
* Bla
* aaaaa
* cba
* Only return 100 results
* not even sure
* order by desc both
* Add proper test
Co-authored-by: Eric <eeoneric@gmail.com>
* change parsing to include operators'
* make properties test into factory
* add clickhouse test implementation and fix another test
* add custom test to clickhouse filter tests
* all tests besides json filtering
* add json test
* fix tests
* fix type errors
* Clickhouse use elements chain
* Fix stuff
* Add action tests and start regex
* Progress
* Progress part deux
* Fix everything
* Add tag name filtering
* Fix funnels
* Fix tag name regex
* Fix ordering
* Fix type issues
* Fix empty nth-child
* Remove commented code
* Split with semicolon and escaped quotes
* Specify all select columns
* convert sessions table logic to TS
* convert rest of sessions to TS
* sessions table logic refactor, store date in the url
* add back/forward buttons
* load sessions based on the URL, not after mount --> avoids duplicate query if opening an url with a filter
* prevent multiple queries
* throw error if failed instead of returning an empty list
* date from filters
* rename offset to nextOffset
* initial limit/offset block
* indent sql
* support limit + offset
* load LIMIT+1 sessions in postgres, pop last and show load more sign. (was: show sign if exactly LIMIT fetched)
* based offset is always 0
* default limit to 50
* events in clickhouse sessions
* add elements to query results
* add person properties to sessions query response
* show seconds with two digits
* fix pagination, timestamp calculation and ordering on pages 2 and beyond
* mypy
* fix test
* add default time to fix test, fix some any(*) filter issues
* remove reverse
* WIP event list
* Events progress
* Finish off event listing, skip live actions for now
* Fix mypy
* Fix mypy again
* Try fixing mypy
* Fix assertnumqueries
* Fix tests
* Fix tests
* fix test
* Fix tests
* Fix tests
* Fix tests again
* Fix person querying
* Fix flake
* Fix person stuff
* Fix test
Co-authored-by: Marius Andra <marius.andra@gmail.com>
Co-authored-by: Eric <eeoneric@gmail.com>
* convert sessions table logic to TS
* convert rest of sessions to TS
* sessions table logic refactor, store date in the url
* add back/forward buttons
* load sessions based on the URL, not after mount --> avoids duplicate query if opening an url with a filter
* prevent multiple queries
* throw error if failed instead of returning an empty list
* date from filters
* rename offset to nextOffset
* initial limit/offset block
* indent sql
* support limit + offset
* load LIMIT+1 sessions in postgres, pop last and show load more sign. (was: show sign if exactly LIMIT fetched)
* based offset is always 0
* default limit to 50
* events in clickhouse sessions
* add elements to query results
* add person properties to sessions query response
* show seconds with two digits
* fix pagination, timestamp calculation and ordering on pages 2 and beyond
* mypy
* fix test
* add default time to fix test, fix some any(*) filter issues
* remove reverse
* WIP event list
* Events progress
* Finish off event listing, skip live actions for now
* Fix mypy
* Fix mypy again
* Try fixing mypy
* Fix assertnumqueries
* Fix tests
* Fix tests
* fix test
* Fix tests
* Fix tests
* Fix tests again
Co-authored-by: Marius Andra <marius.andra@gmail.com>
Co-authored-by: Eric <eeoneric@gmail.com>
* initial
* migration command
* migrations working
* add modelless views for clickhouse
* initial testing structure
* use test factory
* scaffold for all tests
* add insight and person api
* add basic readme
* add client
* change how migrations are run
* add base tables
* ingesting events
* restore delay
* remove print
* updated testing flow
* changed sessions tests
* update tests
* reorganized sql
* parametrize strings
* element list query
* change to seralizer
* add values endpoint
* retrieve with filter
* pruned code to prepare for staged merge
* working ingestion again
* tests for ee
* undo unneeded tests right now
* fix linting
* more typing errors
* fix tests
* add clickhouse image to workflow
* move to right job
* remove django_clickhouse
* return database url
* run super
* remove keepdb
* reordered calls
* fix type
* fractional seconds
* fix type error
* add checks
* remove retention sql
* fix tests
* add property storage and tests
* merge master
* fix tests
* fix tests
* .
* remove keepdb
* format python files
* update CI env vars
* Override defaults and insecure tests
* Update how ClickHouse database gets evaluated
* remove bootstrapping clickhouse database routine
* Don't initialize the clickhouse connection unless we say it's primary
* .
* fixed id generation
* remove dump
* black settings
* empty client
* add param
* move docker-compose for ch to ee dir
* Add _public_ key to repo for verifying self signed cert on server
* update ee compose file for ee dir
* fix a few issues with tls in migrations
* update migrations to be flexible about storage profile and engine
* black settings
* add elements prop tables
* add elements prop tables
* working filter
* refactored
* better url handling
* add mapping table
* add processing to worker task
* working cohort with actions
* add cohort property filtering
* add cohort property filtering
* reformat and add cohort processing
* prop clauses
* add util
* add more util
* add clickhouse modifier
* Clickhouse Sessions (#1623)
* sessions sql
* skeleton
* add endpoint
* better tests
* sessions list
* merge clickhouse-actions
* added session endpoint
* sessions sql working again
* add clickhouse modifier
* session avg with props working
* add dist
* tests working (no list)
* list working
* add formatting
* more formatting
* fix tests
* dummy commit
* fix types
* remove unnecessary improt
* ignore type when importing from ee in task
* fix test running
* Clickhouse Trends Base (#1609)
* initial working
* date param almost working
* fix date range and labels
* fixed monthly math
* handle compare
* change table
* using new event ingestion
* direct query actions working
* remove interface
* fix date range
* properties initial working
* handle operator
* handle operator
* move timestamp parse
* move more to util
* inital breaking down working
* working cohort breakdown
* some tests running
* fix sessions
* cohort tests
* action and interval test
* reorder cohort filtering
* rename retention test
* fix inits
* change multitenancy tests
* fix types
* fix optional types
* replace ch_client.execute with sync_execute
* replace ch_client.execute with sync_execute, part 2
* Clickhouse Stickiness + Process Event (#1654)
* generate clickhouse uuid script
* set CLICKHOUSE_SECURE=False by default if running in TEST or DEBUG
* convert person_id to UUID, make adding `person_id` optional, add distinct_ids already in the `create_person` function
* Fix test_process_event_ee.py, remove all calls to Person.objects.*
* add back util
* fix broken imports
* improve process_event test clickhouse queries
* Basic stickiness query
* Clickhouse Stickiness tests
* stickiness test [WIP, actions fail]
* generate clickhouse uuid script
* change default test runner if PRIMARY_DB=clickhouse
* fix stickiness test for actions
* fix merge bug
* remove _create_person stub; cohort person_id is UUID now
* fix typing
* Clickhouse trends process math (#1660)
* most of process math works
* all process math
* fix ordering issue
* unusued imports
* update property comparison for process_event_ee
* indentation wrong missing calls
* demo users and events (#1661)
* finish breakdown filtering tests and reformat label function
* add increment to demo_data
* update demo data populating
* Add people endpoint for ch (#1670)
* add people endpoint for ch
* stickiness people
* fix value padding
* add process math to breakdown and
* add limit
* fix tests
* condensed code
* converted test to factory
* add people tests
* add month handling
* add typing fix
* change people test handling
* fix tests
* Clickhouse funnels 2 (#1668)
* add elements to create_event
* WIP closes #1663 Add funnels to clickhouse
* Make funnels work
* Clean up
* Move filtering around
* Add mypy tests and fix
* Performance improvements
* fix person tests again
* add people for funnel endpoint
* fix prop numbering
Co-authored-by: Marius Andra <marius.andra@gmail.com>
Co-authored-by: Eric <eeoneric@gmail.com>
* merge master
* add retention
* update types
* more typing errors
* fix types
* bug with kafka payload, elements insert, and demo data
* Clickhouse Paths (#1657)
* paths clickhouse test (fails)
* add elements to create_event
* make this fail for clickhouse
* hardcoded query that returns good results for $pageviews, no filters yet
* clean up queries
* bound by time, fix 30min new session boundary
* support screen and custom events
* add properties filter
* paths url
* filter by path start
* better path start test
* even better path start test
* start from the first "path start" in a group
* test for person_id in paths
* partition by person_id for POSTGRES paths
* partition by person_id for Clickhouse paths
* clean up order in paths test
* clean up order in paths test
* join elements
* force element order on element group creation
* remove "order" when creating elements in tests and demo
* get list of elements for paths
* add limit to paths query
* use materialized view
* rename "element_hash" to "elements_hash" (no change in db)
* cull rows that are definitely unused
* simplify query
* New highly optimized paths clickhouse query
* start_point for $autocapture paths
* extract event property values from clickhouse
* prevent crash
* select one element sql
* get elements for event
* remove lodash
* remove host from $pageview path elements if same domain as incoming path
* show metadata based on loaded paths filter, not in flight filter
* fix order (all soures and targets in order, not all sources first, then all targets after) - makes for a better looking graph
* add test that makes the Postgres paths query fail
* fix postgres paths --> no fuzzy matching, breaks "starts with" for urls and gives too many incorrect start points
* create automatic /demo urls that match the real urls (no ending /)
* fix elements queries
* path element joins
* create persons via postgres in paths test
* change serializers back to id
* fix tests with uuid
* fix demo
* more bugs
* fix type
* change now to timezone aware
* [clickhouse] retention filters (#1725)
* implemented target entity and prop filtering
* add insight view override
* fix endpoint and filters
* include tests
* fix tests
* add period filtering
* .
* fix pg param name
* add filtering params to both queries in retention sql
* fix param again
* change to todatetime
* change tz to timezone
* add back timezone in model/event
* [clickhouse] feature flag endpoint requests (#1731)
* add feature flags to endpoints
* add flags to endpoints that check on request
* remove magic strings and fill in missing flags
* fix types
* add missing flag
* change from iso
* fix more timestamps and comparator
* change _people to get_people in actions view
* remove action and cohort populating
* change inheritance
* "Clickhouse Features V2 (#1565)"
This reverts commit 0b371d43ec.
* fix types
* change to super
* change to super x2
Co-authored-by: Eric <eeoneric@gmail.com>
Co-authored-by: Marius Andra <marius.andra@gmail.com>
Co-authored-by: Tim Glaser <tim.glaser@hiberly.com>
* initial
* migration command
* migrations working
* add modelless views for clickhouse
* initial testing structure
* use test factory
* scaffold for all tests
* add insight and person api
* add basic readme
* add client
* change how migrations are run
* add base tables
* ingesting events
* restore delay
* remove print
* updated testing flow
* changed sessions tests
* update tests
* reorganized sql
* parametrize strings
* element list query
* change to seralizer
* add values endpoint
* retrieve with filter
* pruned code to prepare for staged merge
* working ingestion again
* tests for ee
* undo unneeded tests right now
* fix linting
* more typing errors
* fix tests
* add clickhouse image to workflow
* move to right job
* remove django_clickhouse
* return database url
* run super
* remove keepdb
* reordered calls
* fix type
* fractional seconds
* fix type error
* add checks
* remove retention sql
* fix tests
* add property storage and tests
* merge master
* fix tests
* fix tests
* .
* remove keepdb
* format python files
* update CI env vars
* Override defaults and insecure tests
* Update how ClickHouse database gets evaluated
* remove bootstrapping clickhouse database routine
* Don't initialize the clickhouse connection unless we say it's primary
* .
* fixed id generation
* remove dump
* black settings
* empty client
* add param
* move docker-compose for ch to ee dir
* Add _public_ key to repo for verifying self signed cert on server
* update ee compose file for ee dir
* fix a few issues with tls in migrations
* update migrations to be flexible about storage profile and engine
* black settings
* add elements prop tables
* add elements prop tables
* working filter
* refactored
* better url handling
* add mapping table
* add processing to worker task
* working cohort with actions
* add cohort property filtering
* add cohort property filtering
* reformat and add cohort processing
* prop clauses
* add util
* add more util
* add clickhouse modifier
* Clickhouse Sessions (#1623)
* sessions sql
* skeleton
* add endpoint
* better tests
* sessions list
* merge clickhouse-actions
* added session endpoint
* sessions sql working again
* add clickhouse modifier
* session avg with props working
* add dist
* tests working (no list)
* list working
* add formatting
* more formatting
* fix tests
* dummy commit
* fix types
* remove unnecessary improt
* ignore type when importing from ee in task
* fix test running
* Clickhouse Trends Base (#1609)
* initial working
* date param almost working
* fix date range and labels
* fixed monthly math
* handle compare
* change table
* using new event ingestion
* direct query actions working
* remove interface
* fix date range
* properties initial working
* handle operator
* handle operator
* move timestamp parse
* move more to util
* inital breaking down working
* working cohort breakdown
* some tests running
* fix sessions
* cohort tests
* action and interval test
* reorder cohort filtering
* rename retention test
* fix inits
* change multitenancy tests
* fix types
* fix optional types
* replace ch_client.execute with sync_execute
* replace ch_client.execute with sync_execute, part 2
* Clickhouse Stickiness + Process Event (#1654)
* generate clickhouse uuid script
* set CLICKHOUSE_SECURE=False by default if running in TEST or DEBUG
* convert person_id to UUID, make adding `person_id` optional, add distinct_ids already in the `create_person` function
* Fix test_process_event_ee.py, remove all calls to Person.objects.*
* add back util
* fix broken imports
* improve process_event test clickhouse queries
* Basic stickiness query
* Clickhouse Stickiness tests
* stickiness test [WIP, actions fail]
* generate clickhouse uuid script
* change default test runner if PRIMARY_DB=clickhouse
* fix stickiness test for actions
* fix merge bug
* remove _create_person stub; cohort person_id is UUID now
* fix typing
* Clickhouse trends process math (#1660)
* most of process math works
* all process math
* fix ordering issue
* unusued imports
* update property comparison for process_event_ee
* indentation wrong missing calls
* demo users and events (#1661)
* finish breakdown filtering tests and reformat label function
* add increment to demo_data
* update demo data populating
* Add people endpoint for ch (#1670)
* add people endpoint for ch
* stickiness people
* fix value padding
* add process math to breakdown and
* add limit
* fix tests
* condensed code
* converted test to factory
* add people tests
* add month handling
* add typing fix
* change people test handling
* fix tests
* Clickhouse funnels 2 (#1668)
* add elements to create_event
* WIP closes #1663 Add funnels to clickhouse
* Make funnels work
* Clean up
* Move filtering around
* Add mypy tests and fix
* Performance improvements
* fix person tests again
* add people for funnel endpoint
* fix prop numbering
Co-authored-by: Marius Andra <marius.andra@gmail.com>
Co-authored-by: Eric <eeoneric@gmail.com>
* merge master
* add retention
* update types
* more typing errors
* fix types
* bug with kafka payload, elements insert, and demo data
* Clickhouse Paths (#1657)
* paths clickhouse test (fails)
* add elements to create_event
* make this fail for clickhouse
* hardcoded query that returns good results for $pageviews, no filters yet
* clean up queries
* bound by time, fix 30min new session boundary
* support screen and custom events
* add properties filter
* paths url
* filter by path start
* better path start test
* even better path start test
* start from the first "path start" in a group
* test for person_id in paths
* partition by person_id for POSTGRES paths
* partition by person_id for Clickhouse paths
* clean up order in paths test
* clean up order in paths test
* join elements
* force element order on element group creation
* remove "order" when creating elements in tests and demo
* get list of elements for paths
* add limit to paths query
* use materialized view
* rename "element_hash" to "elements_hash" (no change in db)
* cull rows that are definitely unused
* simplify query
* New highly optimized paths clickhouse query
* start_point for $autocapture paths
* extract event property values from clickhouse
* prevent crash
* select one element sql
* get elements for event
* remove lodash
* remove host from $pageview path elements if same domain as incoming path
* show metadata based on loaded paths filter, not in flight filter
* fix order (all soures and targets in order, not all sources first, then all targets after) - makes for a better looking graph
* add test that makes the Postgres paths query fail
* fix postgres paths --> no fuzzy matching, breaks "starts with" for urls and gives too many incorrect start points
* create automatic /demo urls that match the real urls (no ending /)
* fix elements queries
* path element joins
* create persons via postgres in paths test
* change serializers back to id
* fix tests with uuid
* fix demo
* more bugs
* fix type
* change now to timezone aware
* [clickhouse] retention filters (#1725)
* implemented target entity and prop filtering
* add insight view override
* fix endpoint and filters
* include tests
* fix tests
* add period filtering
* .
* fix pg param name
* add filtering params to both queries in retention sql
* fix param again
* change to todatetime
* change tz to timezone
* add back timezone in model/event
* [clickhouse] feature flag endpoint requests (#1731)
* add feature flags to endpoints
* add flags to endpoints that check on request
* remove magic strings and fill in missing flags
* fix types
* add missing flag
* change from iso
* fix more timestamps and comparator
* change _people to get_people in actions view
* remove action and cohort populating
Co-authored-by: James Greenhill <jams@uber.com>
Co-authored-by: Marius Andra <marius.andra@gmail.com>
Co-authored-by: Tim Glaser <tim.glaser@hiberly.com>
* Do not shadow Kafka default columns _timestamp and _offset
* drop the columns only on the kafka_ tables
* include data types
* fields cleanups
* datetime -> datetime64
* double import datetime
* Create Omni-Person model for managing people in Clickhouse
* type fixes
* rebase all the things
* cleanups
* id -> uuid for events in clickhouse
* cleanups and type checks
* Further cleanups and uuid conversions
* kafka fix
* break out serializer across kafka clients
* fix a few bugs w/ datetime types
* basic fix for people kafka table
* fix migration errors (copy pasta errors)
* Use KafkaProducer for Omni Person emitting
* setup mock kafka producer
* undo some work for inserting
* Test TestKafkaProducer
* change if order, obvious mistake
* remove unnecessary function arg
* Fix getters for new column
* Test fixes
* mirror columns across element queries
* firm up handling of timestamps
* only return timestamps for handle_timestamp
* Correct heroku config for Kafka
* Use ReplacingMergeTree for elements, remove element_groups and use elements_hash as a virtual "pk"
* remove unused ELEMENT_GROUP_TABLE_SQL
* merge fixes
* use redis cache to avoid writing duplicate elements to clickhouse
* move fakeredis to requirements.txt
* add team_id to cache key
* remove elements_group kafka table references
* add elements_hash to clickhouse element serializer
* fix cache key
* rename few keys
* add test runner to ease pycharm dev
* fix a some mypy errors
* remove typo
Co-authored-by: Eric <eeoneric@gmail.com>
* add test runner to ease pycharm dev
* fix broken import
* drop and recreate the clickhouse test db before running tests
* fix person uuid str json serialization issue
* make kafka optional in tests
* fix inits
* remove need for kafka in person.py
* fix a bunch of mypy errors
* fix function and add process_event to pipeline
* fixed missing params and tests
* change uuid and fix types
* types
* optimize for merge prop test
* make ClickhouseProducer to produce to clickhouse one way or another
* annotate types
Co-authored-by: Eric <eeoneric@gmail.com>
Co-authored-by: James Greenhill <fuziontech@gmail.com>
* Query optimizations
* more sql optimizations
* checkpoint
* fix migration
* add UUID field to person
* use django signals to signal that clickhouse needs to be updated
* cleanup person logic
* cleanups
* update migration
* Don't setup the django signals unless we are for sure using ee setup
* expecting back to 30 queries for capturing with new person
* add .venv to .gitignore
* add env back to .gitignore
Co-authored-by: Ubuntu <ubuntu@ip-172-31-73-18.ec2.internal>
* Publish events to Kafka for consumption
* Commit avro idl's for event schemas
* convert client to use github.com/dpkp/kafka-python
* events loaded into clickhouse from Kafka
* remove cruft
* Publish events to Kafka for consumption
* convert client to use github.com/dpkp/kafka-python
* remove cruft
* include kafka migrations
* bugfixes for migrations
* use constants for consistency
* wrap up local migrations
* small fixes
* tune ups
* generate clickhouse uuid script
* set CLICKHOUSE_SECURE=False by default if running in TEST or DEBUG
* convert person_id to UUID, make adding `person_id` optional, add distinct_ids already in the `create_person` function
* Fix test_process_event_ee.py, remove all calls to Person.objects.*
* add back util
* fix broken imports
* improve process_event test clickhouse queries
* change property parsing
* indentation wrong missing calls
* uuid4 instead of call to CH
Co-authored-by: Eric <eeoneric@gmail.com>
Co-authored-by: James Greenhill <fuziontech@gmail.com>