* cherry pick client from metahog
* cherrypick celery and requirements from metahog
* Change key to be based on hash of query and add test
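
A minimal sketch of what a key "based on hash of query" could look like, assuming the key covers both the SQL text and its arguments. The helper name `query_key`, the `query` prefix, and the choice of SHA-256 are illustrative assumptions, not details taken from this change.

```python
import hashlib
import json
from typing import Any, Mapping, Optional


def query_key(query: str, args: Optional[Mapping[str, Any]] = None, prefix: str = "query") -> str:
    """Build a deterministic key from a query and its arguments (hypothetical helper)."""
    # sort_keys makes the key independent of the order the arguments were passed in;
    # default=str is a simplification for values JSON cannot encode natively.
    payload = json.dumps({"query": query, "args": dict(args or {})}, sort_keys=True, default=str)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"{prefix}:{digest}"
```

Because the key is derived from the query rather than generated randomly, the same query with the same arguments maps to the same key, which is what makes a later lookup possible.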
* test that caching works
* black formatting
* Remove last references to uuid since it's not a uuid anymore
* don't statically set CLICKHOUSE_DATABASE
* mypy fixes (oof)
* add more tests!
* last test
* black format again
* only a bit of feedback incorporated - more to come
* add query_id override, force, and tests
* black format
* Flake8 and test docs
* black format tests
* mypy fixes
* from_json typing pains
* Feedbacked
* mypy feedback
* pin redis to 6.x
* fix(sharding): Improve the 0004 sharding migration
This change:
- Makes the rollbacks always work by
  - Fixing some operations from before
  - Creating/deleting materialized views correctly
  - Ensuring zookeeper paths are unique
- Handles an edge case around moving tables by retrying
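
The retry approach mentioned above can be sketched roughly as follows. This is a generic illustration under assumptions: the helper name `retry`, the attempt count, and the delay are hypothetical and do not come from the migration itself.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(
    operation: Callable[[], T],
    attempts: int = 3,
    delay_seconds: float = 0.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
) -> T:
    """Call `operation` until it succeeds or `attempts` is exhausted (hypothetical helper)."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            # Re-raise the original error once the final attempt has failed.
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)
    raise AssertionError("unreachable")  # keeps type checkers satisfied
```

In a migration, `operation` would wrap the table move, and `retry_on` would be narrowed to the specific error that signals the transient condition.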
* Rename a parameter
* Verify migration status
* Update a test
* >=
* Update ee/clickhouse/errors.py
Co-authored-by: Tiina Turban <tiina303@gmail.com>
* Update code comment
* Update log message
* Type:ignore
Co-authored-by: Tiina Turban <tiina303@gmail.com>
* Make CLICKHOUSE_REPLICATION default to True
* Update some insert statements
* Create distributed tables during tests
* Delete from sharded_events
* Update test_migrations_not_required.py
* Improve 0002_events_sample_by is_required
1. SHOW CREATE TABLE is truncated if table has tens of materialized
columns, reasonably causing failures
2. We need to handle CLICKHOUSE_REPLICATED setups
* Update test_schema to handle CLICKHOUSE_REPLICATED, better test naming
* Fix issue with materialized columns
Note: Should make sure that these tests have coverage both ways
* Update test for recordings TTL
* Reorder table creation
* Correct schema for materialized columns on distributed tables
* Do correct setup in test_columns
* Lazily decide table to delete data from
* Make test_columns resilient to CLICKHOUSE_REPLICATION
* Make inserts resilient to CLICKHOUSE_REPLICATION
* Reset CLICKHOUSE_REPLICATION
* Create distributed tables conditionally
* Update snapshots, tests
* Fixup conftest
* Remove Event dependency on action api tests
* Remove a dead function
* Remove BaseQuery
* Remove dead imports
* Remove Event creation from posthog/test/test_person_model.py
* Remove Event.earliest_timestamp function
* Remove some unused event model methods
* Remove query_db_by_action + associated migration code
* Remove dead filtering methods from Events model
* Remove a dead test class
* Remove some event model usage
* Remove events model usage from actions test
* Remove session recording related views
* Remove model usage in posthog/queries/session_recordings/session_recording.py
* Remove old pg-session recording code
* Remove dead import
* Re-add missing dependency
* Make lint/tests pass
* Make filter tests uuid-based
* Remove event admin
* Move posthog/tasks/test/test_org_usage_report.py clickhouse version inline
* Remove postgres-specific code from org usage report
* Kill dead on_perform method
* Remove dead EventSerializer
* Remove a dead import
* Remove a dead command
* Clean up test, don't create a model
* Remove dead code
* Clean up test_element
* Clean up test event code
* Remove a dead function
* Clean up dead imports
* Remove dead code
* Code style cleanup
* Fix foss test
* Simplify fn
* Org usage fixup #3
* Add logging to all postgresql queries with query context
Uses the exact same pattern as we do currently for clickhouse, just
hooking in there differently
* Support psycopg2.sql.SQL
* Better docs
* update a test
* implement multi property breakdown as an array from the spike
* correct type hint on method
* really resolve the conflict
* don't break groups
* refactor test assertions for breakdown cases
* adds a test to prove that funnels can receive a string and not an array
* protect saved dashboards from multi property changeover
* WIP
* multi breakdown working with funnel step breakdown
* prove funnel step person breakdown works with multi property breakdown
* don't need to protect cached dashboards from multi property breakdowns when they can't be set from the UI
* capitalise keywords in SQL
* convert a single test to journey helper
* wip
* account for funnel step breakdown sometimes being an array sent as a string
* safer handling of funnel step breakdown
* convert a test
* revert commits that made things worse
* simpler handling of funnel step breakdown
* no need to change funnel step breakdown type hint
* update imports
* guard against integer properties
* compare funnel step breakdown differently now there are arrays involved
* look for strict intersection for funnel step breakdown
* update test snapshots
* need to set breakdown_values earlier in processing
* remove tests that cover speculative functionality
* update snapshot
* move setup of breakdown values back out of update_filters
* update snapshots
* remove a sql parameter that was never assigned to
* Update ee/clickhouse/models/test/test_property.py
Co-authored-by: Harry Waye <harry@posthog.com>
* Update ee/clickhouse/queries/funnels/base.py
Co-authored-by: Harry Waye <harry@posthog.com>
* address review comment to simplify reading json expressions for breakdown
* clarify why some uses of get_property_string_expr escape params before passing
* add keyword arguments for calls to getting property string expressions in funnels
* switch to keyword arguments in test helper method
* fix parameterised test
* add multi property materialized column tests
* introduce the shim to allow new API for breakdown properties
* can't remove the naive funnel step breakdown list detection
* move funnel step breakdown list handling
* better handling of numeric funnel step breakdown values
* update snapshots
Co-authored-by: Harry Waye <harry@posthog.com>
* dev(clickhouse): strip out comments before executing sql
This is so we can easily copy/paste queries from e.g. Metabase when
querying system.query_log. Metabase doesn't display new lines (you can
download the results to a file to get them, but that's not very practical).
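
To illustrate why stripping comments helps: once new lines are lost, a `--` line comment swallows the rest of the query. Below is a minimal regex-based sketch of comment stripping. It is an assumption-laden illustration only; the function name `strip_sql_comments` is hypothetical, and the actual change may well rely on a SQL parsing library instead.

```python
import re

# Match string literals first so comment markers inside them are left alone.
_SQL_TOKEN = re.compile(
    r"""
    ('(?:[^'\\]|\\.)*')   # single-quoted string literal, kept as-is
    | (--[^\n]*)          # line comment up to the end of the line
    | (/\*.*?\*/)         # block comment, possibly spanning lines
    """,
    re.VERBOSE | re.DOTALL,
)


def strip_sql_comments(sql: str) -> str:
    """Remove line and block comments from a SQL string (hypothetical helper)."""

    def keep_string_literals(match: "re.Match[str]") -> str:
        # Group 1 is set only when a string literal matched; comments become "".
        return match.group(1) or ""

    return _SQL_TOKEN.sub(keep_string_literals, sql).strip()
```

This sketch does not handle every SQL dialect feature (for example, nested block comments or other quoting styles), so it should be read as a demonstration of the idea rather than a drop-in replacement.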
* test(clickhouse): use `capture_select_queries` in comment strip test
* test(clickhouse): only sub. params if non-insert query
This parallels `clickhouse_driver` behaviour.
* chore(clickhouse): move sql preparation to dedicated function
* refactor: rearrange func and type definitions
* Show clickhouse disk and system.stats on /instance/status
Part of https://github.com/PostHog/vpc/issues/45
* Show stats on clickhouse table sizes, remove postgres table size stats
* Add metric for whether clickhouse is alive
* Move clickhouse stats above redis
* Compile requirements-dev.txt with latest pip-tools
* Install pytest
* Avoid picking up factories as tests
* New runner
* Always set TEST env variable running tests
Some of our tests rely on it.
* Remove repetition
* Fix a broken test
* Cut down noise from bin/tests
* Rename test factory
* Fix stickiness filter
* Skip a broken test
This has been broken since the numpy removal PR. Sadly, tests were not
running for this submodule.
* Fix import on ee
* Run ee tests properly
The django_db_setup fixture will be automatically run when running ee/
module tests.
* Make tests run on CI
* Include REDIS_URL, fix cloud
* Set TEST env variable
* Hack cloud tests to work
* Attempt at workflow fix
* Import Person model when running ee tests
This module implicitly adds hooks, so this is needed when running tests
* Respect reuse-db for clickhouse
* Add custom markers to avoid warnings
* pytest: use ch test database always
Accidentally wiped by ch setup a few times without this. Oops
* Remove repetition in tests
* Pytest: Always run migrations
Testing a state cleanup fix
* Use same DB in conftest and main code
* Pytest: autoset TEST setting without env variable
* fix broken test
Co-authored-by: eric <eeoneric@gmail.com>
* Debug CH queries
* tests
* Logout when impersonated session
* Put "Debug ClickHouse queries" in its own command
* Clean up ClickHouse modal
Co-authored-by: Michael Matloka <dev@twixes.com>
* Finish the local dev w/ proto setup
* WIP manage events view
* Add task, add interface etc
* Move everything to 'manage events' view
* Move all settings into single dropdown (can be reverted)
* Urls for tabs
* Fix migration
* Clickhouse and humanize volume
* Fix cypress test
* Fix sidebar cypress
* Fix cypress again
* Fix some small issues
* Address comments
* Correct naming
* Fix test
Co-authored-by: James Greenhill <fuziontech@gmail.com>
* Basic caching for Clickhouse to redis
* Use redis for caching results
* add tests and fix bugs
* fix mypy
* add fakeredis as req
* add fakeredis to github action for testing
* add fakeredis to cloud tests too
* pickle -> json
* bytes
* json in tests
* tuplefy
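
A rough sketch of why the move from pickle to JSON would need a "tuplefy" step, assuming query results are a list of row tuples: JSON has no tuple type, so rows come back as lists and must be converted. The function names here are hypothetical and the sketch ignores values JSON cannot encode (such as datetimes).

```python
import json
from typing import Any, List, Sequence, Tuple


def serialize_rows(rows: Sequence[Tuple[Any, ...]]) -> bytes:
    """Encode result rows as JSON bytes suitable for a byte-oriented cache (hypothetical helper)."""
    return json.dumps(rows).encode("utf-8")


def deserialize_rows(raw: bytes) -> List[Tuple[Any, ...]]:
    """Decode JSON bytes and turn each row back into a tuple (hypothetical helper)."""
    # JSON round-trips tuples as lists, so each top-level row is converted back.
    return [tuple(row) for row in json.loads(raw.decode("utf-8"))]
```

Only the top-level rows are converted here; nested lists inside a row are left as lists.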
* skip celery for ee path
* mypy fixes
* take celery out and fix types as cleanly and performantly as possible
* add timing
* setup statsd, need to clean this up
* use sane defaults for statsd
* Add scheduled task to wipe session recordings
* Create a new table for session recording
* Save snapshot events to different table
* Use SessionRecordingEvent over Events everywhere
We can remove a ton of cruft this way as well
* Add missing signature
* Extract util from models/event
* Attempt to update ingest side of clickhouse session recording events
Note that it's using main kafka topic - not sure if a good idea.
* Get separate table in ch working for session recording events
* WIP: query sessions
* Make both session recording queries work
* Make linter happy
* Rebase migration
* Make tests work
* Apply a TTL to session recordings and other configuration:
- toYYYYMMDD partitioning should be smoother with TTL setup
- TTL achieves not needing to archive the data ourselves
- index_granularity will enable smaller reads per session_id
- ORDER BY clause makes both single-session and time-range queries
reasonable
* Convert retention cronjob to new model
* Add tests to process_event changes
* Add test for ee_capture change
* Fixup migration
* Make clickhouse tests drop/create session recording tables
* Make TTL not be there in tests
Otherwise writes get eaten by it during tests when mocking time
* Fix retention task
Co-authored-by: Tim Glaser <tim@glsr.nl>
* add new table migrations and change table names
* include necessary config for new tables in tests
* fix tests and table
* fix table name param
* add populate clause
* added table for key value person props
* adjust person filtering to use new table
* .
* add ordering on updated_at
* add back all the condition handling on persons filtering endpoint
* fix typing
* remove print
* re-order sort key for persons_up_to_date
Co-authored-by: James Greenhill <fuziontech@gmail.com>
* Clickhouse use elements chain
* Fix stuff
* Add action tests and start regex
* Progress
* Progress part deux
* Fix everything
* Add tag name filtering
* Fix funnels
* Fix tag name regex
* Fix ordering
* Fix type issues
* Fix empty nth-child
* Remove commented code
* Split with semicolon and escaped quotes
* Specify all select columns