* Remove dead argument
* Make allow_denormalized_props always explicit
* Change prop_clauses default
* Create a testing decorator for checking materialized columns
This makes it easier to have proper test coverage for materialized
columns and make sure no bugs creep up :)
* Fix event_query
* Test more materialized columns in trends
* Add materialized column tests for funnels
* Cleanup path_event_query
* Fix default
* Fix issue with clashing parameter names
* WIP: port process_math to support materialized columns
* Add skipped test showing trend breakdowns don't use materialized columns
* Simplify testing and test&fix math property aggregation w/ materialized columns
* Add (failing) test for filtering with materialized action props
* Add test around materialized property filtering
* Refactor entity.math materialization impl
* Make trends breakdowns work with materialized columns
* Simplify process_math further
* Handle denormalized properties in format_action_filter for step.properties
Note the following files all called this method:
ee/clickhouse/views/events.py
ee/clickhouse/views/actions.py
ee/clickhouse/queries/trends/util.py
ee/clickhouse/queries/trends/lifecycle.py
ee/clickhouse/queries/trends/breakdown.py
ee/clickhouse/queries/funnels/base.py
ee/clickhouse/queries/sessions/util.py
ee/clickhouse/queries/clickhouse_stickiness.py
ee/clickhouse/queries/clickhouse_retention.py
ee/clickhouse/models/cohort.py
I verified all of them are OK since they query events table directly
with the passed filter
* Handle materialized $current_url in action step filtering
* Remove now unneeded clause
* Update test helper
* Allow denormalized props for filtering with breakdowns
* Allow denormalized props for filtering with lifecycle
* Allow denormalized props for some views
* Fix entity math yet again
* Query materialized columns in insights > sessions
* Handle breakdown edge case
* Allow denormalized props for more views
* PR feedback
* reformat
* Make DDLs more friendly towards running on a cluster
* Use primary CLICKHOUSE host for migrations and DDL
* loose ends on person kafka create
* posthog -> cluster typo
* add cluster to KAFKA create for plugin logs
* Feed the type monster
* clusterfy local clickhouse
* test docker-compose backed github action
* run just clickhouse and postgres from docker-compose
* move option to between up and <services>
* posthog all the things
* suggest tests run on cluster
* posthog cluster for ci
* use deploy path for docker-compose
* fix for a clickhouse bug 🐛
* complete CH bug fixes
* 5439 the github actions pg configs
* remove CLICKHOUSE_DATABASE (handled automatically)
* update DATABASE_URL for code quality checks
* Missed a few DDLs on Person
* 5439 -> 5432 to please the people
* cleanup persons and use f strings <3 f strings
* remove auto parens
* Update requirements to use our fork of infi.clickhouse_orm
* fix person.py formatting
* Include boilerplate macros for a cluster
* Load session events asynchronously from a separate endpoint
This mirrors the behavior of postgres query
* Simplify backend & query
event_count is unused
don't select unused columns in list query
* Rename filter_by_session_recordings to filter_by_session_recordings
This is more in-line with what the function actually does
* Update types, handle start/end url properly
* start_url / end_url to session result
* Update sessions list builder tests
* Remove some `session.events` references
* Remove unneeded code
* Simplify filteredSessions
* Fix type issue
* Add test for session properties
* Test and fix start_url/end_url
* Add test for the new sessions API endpoint
* Improve types
* Update py types again
* Fix bug
* Fix limit of events in CSV export
* Limit to 100 instead of 101
* Optimize listing events
* Fix typing error
* Limit events to 100 better
* Fix len condition for using broader events queryset
* Add regression test
* Adjust ClickhouseEventsViewSet
* Fix CH events limit in CSV export
* Fix typing and missing +1
* Use limit in _query_events_list
* Fix and test more sessions pagination cases
Pagination previously did not work correctly on CH/postgres due to
LIMIT clauses
Simplifying the sessions query helps on clickhouse, though might
introduce new issues.
* Make sessions list pagination work
The key idea is to divorce distinct_id lookups from result lookups.
This now works in the scenario where 101 users match person filter/have done
an event in time range, but only the 101st has a session matching
action/event filter (see tests)
This will perform even on superdaily, though it might slow down for very
specialized queries.
Potential future speedups:
- apply action/event filters on the distinct_id query -
only return those who have the possibility of matching.
- Make distinct_id LIMIT higher if we know action/event limit is
involved
- Caching the distinct_id query heavily
* Reorganize code
* Make session list tests pass w/ pagination
* Add tests, fix another corner case for postgres sessions list
distinct_ids were not always returned in the right order.
* Include distinct_id in sessions query
This should now solve https://github.com/PostHog/posthog/issues/3055
* Add model for session recording viewed
* Save view when querying for session recording data
* Send information to FE about whether session has been viewed
* Allow filtering by "recording unseen"
* Rename property
* Update migration
* Remove (apparently) useless person joins from sessions sql
* WIP: Make sessions list query use python iteration
* WIP: Show loader while session data is loading
* Aggregate together sessions
* Calculate start and end points of session separately
* Remove cruft code
* Load session events asynchronously for self-hosted
Note clickhouse behavior is unchanged.
* Update pagination logic for sessions
In addition to offset, postgres now returns a dict containing person_id,
timestamp which is used to make sure we filter events on different pages
correctly
* Add some tests to SessionListBuilder
* Fix typing errors
* Fix pagination limit
* Move tests to right file
* Query fewer events for sessions list
Since we're ordering by end_time we know events before last end_time are
all processed.
* Add test for current_url behavior
* Make sure old tests remain working
* Remove unused base class
* Move sessions-related queries to separate subfolder
* Extract sessions list code to separate file(s)
* Sort sessions by end time in ch as well
* List end time in sessions table
* Return person email when querying sessions lists on postgres
This gets used by the view
* Return email over all user properties for sessions in clickhouse and view
* Fix an ordering bug
* Fix a pagination bug
* Fix endpoint
* Fix basic sessions tests for pagination
* Sort consistently for sessions list builder
* Roll pagination into filters
* Remove a dead TODO
This was solved in PR #1849
* Make sessions query more readable
* Remove dead code from sessions
* Make it possible to test with imports in jest
* Fix a bug and add tests to createActionFromEvent
I added tests when initially solving #2848, but the original solution
had to be scrapped. The tests are still valid though
* Duplicate existing endpoints under /api/event/sessions
* Use different endpoint for sessions table logic in sessions page
* drop irrelevant code for insights endpoint
* update documentation
* extract method
* move code under only relevant path
* Extract clickhouse sessions logic
* Refactor: Separate session list logic from rest of sessions
* move sessions list to separate file
* Move sessions list tests to separate file
* add clickhouse session list tests
* add support for duration filtering on postgres
* move api tests to right place
* kill dead code
* Use return value of add_session_recording_ids
The mutation here is incidental, this is symmetrical with how postgres
behaves.
* Implement sessions_filter
This will hold session-recording specific filters
* WIP: make filtering by session recording length possible on ch
* Kill dead method
* Allow filtering by session length
* Make logic a bit smoother around filtering by recording duration
* More common code for clickhouse_session_recording
* Add filtering tests for ch
* Resolve mypy issues
* Only run duration tests for ch
* Put new filters logic behind a feature flag
* Kill dead const
* Solve operators-related comment
* Review feedback: rename variable
* Fix test failures
* Nest endpoints under /project/ with StructuredViewSetMixin
* Rewrite URLs
* isort
* Update utils.py
* Fix errors
* Fix almost all the errors
Last left to do: shared dashboards and permission classes.
* isort
* Adjust for master
* Add compatibility with shared dashboards
* Debug ClickHouse
* Remove some # type: ignores
* Simplify CursorPagination
* Move test base from posthog.api.test to posthog.test
* Improve API structure
* Bring back legacy endpoints
* Fix legacy compatibility
* Fix bugs and typing
* isort
* Fix hooks test
* Try fixing errors
* Fix oversight
* isort
* Fix problems
* isort
* Be more tolerant
* Fix naming and remove redundant code
* Fix imports
* Update deleteWithUndo
* Roll back
* Roll back more
* Update .gitignore
* Rollll back
* Rollllllll
* back
* Betterify
* Address feedback
* remove person properties up to date
* remove person props mv
* move latest person
* prune rest of person materialized
* missing parenth
* add type
* remove migration
* Fix feature flags clickhouse
* Fix feature flags clickhouse
* Fix types
* Fix stuff
* Silly me
Co-authored-by: Eric <eeoneric@gmail.com>
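Note on the materialized-column testing decorator from the first commits above: one plausible shape for such a helper is a decorator that runs the same test body twice, first with no materialized columns and then with the listed properties materialized. The sketch below only illustrates that idea. `with_materialized_columns` and the `MATERIALIZED` set are hypothetical names, not the helper these commits actually add.

```python
import functools
from typing import Callable, List, Set

# Hypothetical stand-in for "which properties are currently materialized".
MATERIALIZED: Set[str] = set()


def with_materialized_columns(properties: List[str]) -> Callable:
    """Run the decorated test twice: without and with materialized columns."""

    def decorator(test_fn: Callable) -> Callable:
        @functools.wraps(test_fn)
        def wrapper(*args, **kwargs):
            # First pass: nothing materialized, queries read raw properties.
            test_fn(*args, **kwargs)
            # Second pass: same assertions with the properties materialized.
            MATERIALIZED.update(properties)
            try:
                test_fn(*args, **kwargs)
            finally:
                # Always restore state so other tests are unaffected.
                MATERIALIZED.difference_update(properties)

        return wrapper

    return decorator
```

Running both passes through one test body is what keeps the two query paths from drifting apart; a real helper would create and drop actual columns instead of mutating a set.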
* Add scheduled task to wipe session recordings
* Create a new table for session recording
* Save snapshot events to different table
* Use SessionRecordingEvent over Events everywhere
We can remove a ton of cruft this way as well
* Add missing signature
* Extract util from models/event
* Attempt to update ingest side of clickhouse session recording events
Note that it's using main kafka topic - not sure if a good idea.
* Get separate table in ch working for session recording events
* WIP: query sessions
* Make both session recording queries work
* Make linter happy
* Rebase migration
* Make tests work
* Apply a TTL to session recordings and other configuration:
- toYYYYMMDD partitioning should be smoother with TTL setup
- TTL achieves not needing to archive the data ourselves
- index_granularity will enable smaller reads per session_id
- ORDER BY clause is to make single session as well as time range query
reasonable
* Convert retention cronjob to new model
* Add tests to process_event changes
* Add test for ee_capture change
* Fixup migration
* Make clickhouse tests drop/create session recording tables
* Make TTL not be there in tests
Otherwise writes get eaten by it during tests when mocking time
* Fix retention task
Co-authored-by: Tim Glaser <tim@glsr.nl>
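Note on the session recording table settings listed above (toYYYYMMDD partitioning, TTL, index_granularity, ORDER BY): a minimal sketch of how such a DDL could be built, with the TTL clause made optional so tests can leave it out. Table name, column names, the three-week TTL and the granularity value are all assumptions for illustration, not taken from these commits.

```python
def session_recording_table_sql(with_ttl: bool = True) -> str:
    """Build an illustrative ClickHouse DDL string (names and values assumed)."""
    # Skipping the TTL in tests keeps rows written under a mocked clock
    # from being expired before the test reads them back.
    ttl_clause = "TTL toDate(created_at) + toIntervalWeek(3)" if with_ttl else ""
    return f"""
CREATE TABLE session_recording_events
(
    uuid UUID,
    timestamp DateTime64(6, 'UTC'),
    team_id Int64,
    distinct_id String,
    session_id String,
    snapshot_data String,
    created_at DateTime64(6, 'UTC')
)
ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(timestamp)
ORDER BY (team_id, toHour(timestamp), session_id, timestamp, uuid)
{ttl_clause}
SETTINGS index_granularity = 512
"""
```

Daily partitions line up with a date-based TTL, so expired data can be dropped a partition at a time, and a sort key that contains both time and session_id serves single-session and time-range reads.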
* Revert "Revert "Use postgres to grab person (#1957)" (#1963)"
This reverts commit 94f44bdf46.
* Filter by team_id
Co-authored-by: Tim Glaser <tim@glsr.nl>
* Backend changes to implement #1461
* Added the missing migration files
* Fixes Typecheck errors
* Refactor request.user.team_set.get() to use request.user.team
* Updated user patch method to change current_team on team id instead of name
* Merged migration file
* Changes team property to return first item in queryset
* Fixes failing tests
* Changed User api to return the ids of the teams they are currently part of
* Frontend for changing teams
* Update and simplify migrations
* Improve team and user buttons
* Make team changing backend more logical
* Improve current_team mechanics
* Update test_team.py
* Fix Team.objects.create_with_data
* Update migration
* Update tests
* Make setup_review more convenient
* Add Organization and OrganizationMembership
* Replace is_admin with level
* Extend API
* Update team.py
* Improve modeling
* Improve handling of new mechanics
* Add proper migration
* Remove _ensure_organization_and_team
* Update 0084_org_team_user.py
* Improve user, org and team creation
* Make MembershipLevel more flexible for the future
* Add member deletion
* Fix naive datetime warnings
* Update setup_review.py
* Update API route
* Make PersonalAPIKey changes
* Update models and migrations, fix typing
* Fix typing
* Use MAC-less UUID v1 instead of v4 for better performance
* Add abstract UUIDModel
* Update utils.py
* Update utils.py
* Fix multi/unicast bit
* Update APIs, frontend and tests
* Update pull_request_template.md
* Fix comment
* Fix migration error
* Compress migrations
* Updates with minimal renaming
* More updates
* Make further updates
* Update test_team_user.py
* Fix issues
* Add migration
* Satisfy mypy
* Remove Signup redirect on logged in
* Use uuid1_macless in Person
* Fix typing
* Update tests
* Update /api/team/signup to /api/organization/signup
* Fix mypy issues and update tests
The remaining failures are actually missing functionality (TDD applied), so filling these in.
* Update 0086_org_live
* Make small improvements
* Implement permissions
* Remove now unnecessary membership check
* Update setup_dev.py
* Make small frontend improvements
* Add drf-nested-routers as requirement
* Remove unused import
* Implemented nested routes
* Remove cruft
* Add relevant org/proj/user name to headings
* Fix imports
* Update migration
* Replace unreliable drf-nested-routers with drf-extensions
* Improve unset team handling
* Make org and team creation proper
* Update migration
* Fix migration
* Update TopContent
* Update command palette for new sidebar structure
* Remove deprecated demo data deletion
* Assume that each org has a project and fix typing
* Require paid plan for multiple orgs and projects
* Make HogFlix demo a separate team
* Update migration
* Slightly improve style
* Adjust page layout bottom padding
* Make user dropdown nicer
* Fix base app tests
* Satisfy mypy
* Fix test_leave_organization
* Improve wording
* Possibly fix import
* Remove misplaced None check
* Enhance org and teams APIs and add tests
* Fix /api/projects for particular Team
* Improve invites and demo data
* Address feedback
* Put everything related to billing on Organization
* Fix minor issues
* Simplify invitation creation
* Update team model
* Make orgs and projects premium only on self-hosted
* Improve testing
* Update migration
* Remove extra License import
* Fix minor issues
* Fix Django tests
* Fix Cypress
* Fix yarn build
* Fix TeamSignupViewset
* Fix posthog-production incompatibility
* Remove extraneous insight endpoint registration
* Adjust tests for posthog-production
* Simplify invitations and fix email validation
* Address all feedback
* Satisfy mypy
* Update migration
* Fix constraint removal in migration
* Update tests
* Fix test creation edge case
* Run posthog-production CI tests against this branch and teams-live
* Ensure that js_posthog_api_key is always passed
* Fix preflight check pre-login
* Update cypress tests
* Update instanceStatus.js
* Bring ee tests up to par
* Bring actions-ux-201012 back
* Cypress retry in cypress.json
* Revert "Run posthog-production CI tests against this branch and teams-live"
This reverts commit d79cb844d8.
Co-authored-by: anna <ms.annaphilips@gmail.com>
Co-authored-by: Anna Philips <aphilips@matmacorp.com>
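Note on the "MAC-less UUID v1" and "Fix multi/unicast bit" commits above: a minimal sketch of what a helper like `uuid1_macless` could look like. The function name comes from the commit list; the body is an assumption based on RFC 4122 section 4.5, which asks that a random node ID have its multicast bit set so it cannot collide with a real MAC address.

```python
import secrets
import uuid


def uuid1_macless() -> uuid.UUID:
    """Time-based UUID (version 1) that does not leak the host's MAC address."""
    # 48 random bits stand in for the MAC address. OR-ing in 0x010000000000
    # sets the multicast bit (least significant bit of the first octet), which
    # real network card addresses never have set.
    node = secrets.randbits(48) | 0x010000000000
    return uuid.uuid1(node=node)
```

The result is still a version 1 UUID, so it keeps the embedded timestamp while the node field is random on every call.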
* Clickhouse use elements chain
* Fix stuff
* Add action tests and start regex
* Progress
* Progress part deux
* Fix everything
* Add tag name filtering
* Fix funnels
* Fix tag name regex
* Fix ordering
* Fix type issues
* Fix empty nth-child
* Remove commented code
* Split with semicolon and escaped quotes
* Specify all select columns
* Use toDate to speed up ordering
* Add another check
* Try debugging test
* Bla
* aaaaa
* cba
* Only return 100 results
* not even sure
* order by desc both
* Add proper test
Co-authored-by: Eric <eeoneric@gmail.com>
* convert sessions table logic to TS
* convert rest of sessions to TS
* sessions table logic refactor, store date in the url
* add back/forward buttons
* load sessions based on the URL, not after mount --> avoids duplicate query if opening an url with a filter
* prevent multiple queries
* throw error if failed instead of returning an empty list
* date from filters
* rename offset to nextOffset
* initial limit/offset block
* indent sql
* support limit + offset
* load LIMIT+1 sessions in postgres, pop last and show load more sign. (was: show sign if exactly LIMIT fetched)
* based offset is always 0
* default limit to 50
* events in clickhouse sessions
* add elements to query results
* add person properties to sessions query response
* show seconds with two digits
* fix pagination, timestamp calculation and ordering on pages 2 and beyond
* mypy
* fix test
* add default time to fix test, fix some any(*) filter issues
* remove reverse
* WIP event list
* Events progress
* Finish off event listing, skip live actions for now
* Fix mypy
* Fix mypy again
* Try fixing mypy
* Fix assertnumqueries
* Fix tests
* Fix tests
* fix test
* Fix tests
* Fix tests
* Fix tests again
* Fix person querying
* Fix flake
* Fix person stuff
* Fix test
Co-authored-by: Marius Andra <marius.andra@gmail.com>
Co-authored-by: Eric <eeoneric@gmail.com>
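Note on the "load LIMIT+1 sessions ... pop last and show load more sign" commit above: a minimal sketch of that pagination pattern. `fetch_page` and its `fetch` callback are hypothetical names; the default limit of 50 and the `next_offset` naming follow the commit list.

```python
from typing import Callable, List, Optional, Tuple, TypeVar

T = TypeVar("T")


def fetch_page(
    fetch: Callable[[int, int], List[T]], limit: int = 50, offset: int = 0
) -> Tuple[List[T], Optional[int]]:
    """Return one page of rows plus the offset of the next page, if any."""
    # Ask for one row more than needed: the extra row only signals
    # that another page exists and is never shown.
    rows = fetch(limit + 1, offset)
    has_more = len(rows) > limit
    if has_more:
        rows = rows[:limit]
    next_offset = offset + limit if has_more else None
    return rows, next_offset
```

Compared with showing "load more" whenever exactly LIMIT rows come back, this avoids offering an empty extra page when the result set ends exactly on a page boundary.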
* initial
* migration command
* migrations working
* add modelless views for clickhouse
* initial testing structure
* use test factory
* scaffold for all tests
* add insight and person api
* add basic readme
* add client
* change how migrations are run
* add base tables
* ingesting events
* restore delay
* remove print
* updated testing flow
* changed sessions tests
* update tests
* reorganized sql
* parametrize strings
* element list query
* change to serializer
* add values endpoint
* retrieve with filter
* pruned code to prepare for staged merge
* working ingestion again
* tests for ee
* undo unneeded tests right now
* fix linting
* more typing errors
* fix tests
* add clickhouse image to workflow
* move to right job
* remove django_clickhouse
* return database url
* run super
* remove keepdb
* reordered calls
* fix type
* fractional seconds
* fix type error
* add checks
* remove retention sql
* fix tests
* add property storage and tests
* merge master
* fix tests
* fix tests
* .
* remove keepdb
* format python files
* update CI env vars
* Override defaults and insecure tests
* Update how ClickHouse database gets evaluated
* remove bootstrapping clickhouse database routine
* Don't initialize the clickhouse connection unless we say it's primary
* .
* fixed id generation
* remove dump
* black settings
* empty client
* add param
* move docker-compose for ch to ee dir
* Add _public_ key to repo for verifying self signed cert on server
* update ee compose file for ee dir
* fix a few issues with tls in migrations
* update migrations to be flexible about storage profile and engine
* black settings
* add elements prop tables
* add elements prop tables
* working filter
* refactored
* better url handling
* add mapping table
* add processing to worker task
* working cohort with actions
* add cohort property filtering
* add cohort property filtering
* reformat and add cohort processing
* prop clauses
* add util
* add more util
* add clickhouse modifier
* Clickhouse Sessions (#1623)
* sessions sql
* skeleton
* add endpoint
* better tests
* sessions list
* merge clickhouse-actions
* added session endpoint
* sessions sql working again
* add clickhouse modifier
* session avg with props working
* add dist
* tests working (no list)
* list working
* add formatting
* more formatting
* fix tests
* dummy commit
* fix types
* remove unnecessary import
* ignore type when importing from ee in task
* fix test running
* Clickhouse Trends Base (#1609)
* initial working
* date param almost working
* fix date range and labels
* fixed monthly math
* handle compare
* change table
* using new event ingestion
* direct query actions working
* remove interface
* fix date range
* properties initial working
* handle operator
* handle operator
* move timestamp parse
* move more to util
* initial breaking down working
* working cohort breakdown
* some tests running
* fix sessions
* cohort tests
* action and interval test
* reorder cohort filtering
* rename retention test
* fix inits
* change multitenancy tests
* fix types
* fix optional types
* replace ch_client.execute with sync_execute
* replace ch_client.execute with sync_execute, part 2
* Clickhouse Stickiness + Process Event (#1654)
* generate clickhouse uuid script
* set CLICKHOUSE_SECURE=False by default if running in TEST or DEBUG
* convert person_id to UUID, make adding `person_id` optional, add distinct_ids already in the `create_person` function
* Fix test_process_event_ee.py, remove all calls to Person.objects.*
* add back util
* fix broken imports
* improve process_event test clickhouse queries
* Basic stickiness query
* Clickhouse Stickiness tests
* stickiness test [WIP, actions fail]
* generate clickhouse uuid script
* change default test runner if PRIMARY_DB=clickhouse
* fix stickiness test for actions
* fix merge bug
* remove _create_person stub; cohort person_id is UUID now
* fix typing
* Clickhouse trends process math (#1660)
* most of process math works
* all process math
* fix ordering issue
* unused imports
* update property comparison for process_event_ee
* indentation wrong missing calls
* demo users and events (#1661)
* finish breakdown filtering tests and reformat label function
* add increment to demo_data
* update demo data populating
* Add people endpoint for ch (#1670)
* add people endpoint for ch
* stickiness people
* fix value padding
* add process math to breakdown and
* add limit
* fix tests
* condensed code
* converted test to factory
* add people tests
* add month handling
* add typing fix
* change people test handling
* fix tests
* Clickhouse funnels 2 (#1668)
* add elements to create_event
* WIP closes #1663 Add funnels to clickhouse
* Make funnels work
* Clean up
* Move filtering around
* Add mypy tests and fix
* Performance improvements
* fix person tests again
* add people for funnel endpoint
* fix prop numbering
Co-authored-by: Marius Andra <marius.andra@gmail.com>
Co-authored-by: Eric <eeoneric@gmail.com>
* merge master
* add retention
* update types
* more typing errors
* fix types
* bug with kafka payload, elements insert, and demo data
* Clickhouse Paths (#1657)
* paths clickhouse test (fails)
* add elements to create_event
* make this fail for clickhouse
* hardcoded query that returns good results for $pageviews, no filters yet
* clean up queries
* bound by time, fix 30min new session boundary
* support screen and custom events
* add properties filter
* paths url
* filter by path start
* better path start test
* even better path start test
* start from the first "path start" in a group
* test for person_id in paths
* partition by person_id for POSTGRES paths
* partition by person_id for Clickhouse paths
* clean up order in paths test
* clean up order in paths test
* join elements
* force element order on element group creation
* remove "order" when creating elements in tests and demo
* get list of elements for paths
* add limit to paths query
* use materialized view
* rename "element_hash" to "elements_hash" (no change in db)
* cull rows that are definitely unused
* simplify query
* New highly optimized paths clickhouse query
* start_point for $autocapture paths
* extract event property values from clickhouse
* prevent crash
* select one element sql
* get elements for event
* remove lodash
* remove host from $pageview path elements if same domain as incoming path
* show metadata based on loaded paths filter, not in flight filter
* fix order (all sources and targets in order, not all sources first, then all targets after) - makes for a better looking graph
* add test that makes the Postgres paths query fail
* fix postgres paths --> no fuzzy matching, breaks "starts with" for urls and gives too many incorrect start points
* create automatic /demo urls that match the real urls (no ending /)
* fix elements queries
* path element joins
* create persons via postgres in paths test
* change serializers back to id
* fix tests with uuid
* fix demo
* more bugs
* fix type
* change now to timezone aware
* [clickhouse] retention filters (#1725)
* implemented target entity and prop filtering
* add insight view override
* fix endpoint and filters
* include tests
* fix tests
* add period filtering
* .
* fix pg param name
* add filtering params to both queries in retention sql
* fix param again
* change to todatetime
* change tz to timezone
* add back timezone in model/event
* [clickhouse] feature flag endpoint requests (#1731)
* add feature flags to endpoints
* add flags to endpoints that check on request
* remove magic strings and fill in missing flags
* fix types
* add missing flag
* change from iso
* fix more timestamps and comparator
* change _people to get_people in actions view
* remove action and cohort populating
* change inheritance
* Revert "Clickhouse Features V2 (#1565)"
This reverts commit 0b371d43ec.
* fix types
* change to super
* change to super x2
Co-authored-by: Eric <eeoneric@gmail.com>
Co-authored-by: Marius Andra <marius.andra@gmail.com>
Co-authored-by: Tim Glaser <tim.glaser@hiberly.com>
Co-authored-by: James Greenhill <jams@uber.com>
Co-authored-by: Marius Andra <marius.andra@gmail.com>
Co-authored-by: Tim Glaser <tim.glaser@hiberly.com>
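Note on the "bound by time, fix 30min new session boundary" commit in the Clickhouse Paths work above: a minimal sketch of splitting one person's event timestamps into sessions on a 30-minute gap. `split_into_sessions` is a hypothetical name, and treating a gap of exactly 30 minutes as still the same session is an assumption, since the commit does not say which side of the boundary is inclusive.

```python
from datetime import datetime, timedelta
from typing import List

SESSION_GAP = timedelta(minutes=30)


def split_into_sessions(timestamps: List[datetime]) -> List[List[datetime]]:
    """Group timestamps into sessions; a gap over 30 minutes starts a new one."""
    sessions: List[List[datetime]] = []
    for ts in sorted(timestamps):
        # Compare against the last event of the current session.
        if sessions and ts - sessions[-1][-1] <= SESSION_GAP:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions
```

Paths are then built per session, so an event that follows a long pause is not linked to the page viewed before the pause.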