* launch celery with debug logging
* auto-import a single task that decides which type of export to run
* still need to manually inject the root folder so tests can clean up
* fix mock
* sketch the interaction
* correct field type
* explode a filter into day-length steps (sketched below)
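A minimal sketch of the idea, with hypothetical names (the real task applies this to the export's date filter):
```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple

def explode_to_days(start: datetime, end: datetime) -> Iterator[Tuple[datetime, datetime]]:
    # Split [start, end) into consecutive windows of at most one day,
    # so each export chunk stays small enough to stream.
    cursor = start
    while cursor < end:
        window_end = min(cursor + timedelta(days=1), end)
        yield cursor, window_end
        cursor = window_end
```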
* write to object storage maybe
* very shonky storing of gzipped files
* doesn't need an export type
* mark export type choices as deprecated
* order methods
* stage to temporary file
* can manually export the uncompressed content
* shonkily export as a CSV (staging sketch below)
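Roughly how the staging step fits together, as a hedged sketch (helper name and streaming details are assumptions):
```python
import csv
import gzip
import tempfile

def stage_csv_gzipped(rows, fieldnames):
    # Stage the export to a named temporary file, gzip-compressed,
    # before handing it off to object storage.
    tmp = tempfile.NamedTemporaryFile(suffix=".csv.gz", delete=False)
    tmp.close()
    with gzip.open(tmp.name, "wt", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return tmp.name
```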
* wip
* with test for requesting the export
* with polling test for the API (flow sketched below)
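The flow the polling test exercises, sketched with assumed endpoint paths and fixtures (`client`, `team_id`):
```python
import time

response = client.post(f"/api/projects/{team_id}/exports", {"export_format": "text/csv"})
export_id = response.json()["id"]
for _ in range(30):  # poll until the worker has attached content
    status = client.get(f"/api/projects/{team_id}/exports/{export_id}").json()
    if status.get("has_content"):
        break
    time.sleep(1)
```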
* put existing broken CSV download back before implementing UI change
* OpenAPI change
* even more waffle
* less passive waffle
* sometimes less specific is more correct
* refactor looping
* okay snapshots
* remove unused exception variable
* fix mocks
* Update snapshots
* Update snapshots
* lift storage location to the exported asset model
* split the export tasks
* improve the temp file usage in csv exporter
* delete the test files we're creating
* add a commit to try and trigger GitHub Actions
Co-authored-by: pauldambra <pauldambra@users.noreply.github.com>
* refactor(ingestion): establish setup for json consumption from kafka into clickhouse [nuke protobuf pt. 1]
* address review
* fix kafka table name across the board
* Update posthog/async_migrations/test/test_0004_replicated_schema.py
* run checks
* feat(persons-on-events): add required person and group columns to events table
* rename
* update snapshots
* address review
* Revert "update snapshots"
This reverts commit 63d7126e08.
* address review
* update snapshots
* update more snapshots
* use RunPython
* update schemas
* update more queries
* some improvements :D
* fix naming
* fix breakdown prop name
* update snapshot
* fix naming
* fix ambiguous test
* fix queries
* last bits
* fix typo to retrigger tests
* also handle kafka and mv tables in migration
* update snapshots
* drop tables with IF EXISTS (sketched below)
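A hedged sketch of the guarded drops (table names and the `sync_execute` import path are assumptions from PostHog conventions):
```python
from ee.clickhouse.client import sync_execute  # import path assumed

# Drop the Kafka and materialized-view tables guardedly before recreating them.
for table in ("kafka_events", "events_mv"):
    sync_execute(f"DROP TABLE IF EXISTS {table}")
```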
Co-authored-by: eric <eeoneric@gmail.com>
* Check async migrations instead of CLICKHOUSE_REPLICATION for materialized columns
* Update a comment
* Default for CLICKHOUSE_REPLICATION
* add replication file
* Assert is replicated in tests
* Remove DROP TABLE query from cohortpeople migration
* Update snapshots
* Ignore migration in typechecker
* Truncate right table
* Add KAFKA_COLUMNS to distributed tables
* Make CLICKHOUSE_REPLICATION default to True
* Update some insert statements
* Create distributed tables during tests
* Delete from sharded_events
* Update test_migrations_not_required.py
* Improve 0002_events_sample_by is_required
1. SHOW CREATE TABLE output is truncated when a table has tens of
materialized columns, which understandably causes failures
2. We need to handle CLICKHOUSE_REPLICATED setups
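One direction for the fix, as a loose sketch rather than the actual check (`sync_execute` import path assumed, condition illustrative):
```python
from ee.clickhouse.client import sync_execute  # import path assumed

def is_required() -> bool:
    # system.tables.create_table_query is not truncated the way SHOW CREATE
    # TABLE output is once a table carries tens of materialized columns.
    rows = sync_execute(
        "SELECT create_table_query FROM system.tables "
        "WHERE database = currentDatabase() AND name = 'events'"
    )
    # Illustrative condition only: migrate when the new SAMPLE BY clause is
    # missing; the real check must also handle CLICKHOUSE_REPLICATED setups.
    return bool(rows) and "SAMPLE BY" not in rows[0][0]
```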
* Update test_schema to handle CLICKHOUSE_REPLICATED, better test naming
* Fix issue with materialized columns
Note: Should make sure that these tests have coverage both ways
* Update test for recordings TTL
* Reorder table creation
* Correct schema for materialized columns on distributed tables
* Do correct setup in test_columns
* Lazily decide which table to delete data from (sketched below)
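The laziness amounts to resolving the target at call time instead of import time, roughly:
```python
def table_to_delete_from() -> str:
    from django.conf import settings  # deferred so the lookup stays lazy
    # With replication on, the data lives in sharded_events; tests that flip
    # CLICKHOUSE_REPLICATION see the right target because we resolve here.
    return "sharded_events" if getattr(settings, "CLICKHOUSE_REPLICATION", False) else "events"
```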
* Make test_columns resilient to CLICKHOUSE_REPLICATION
* Make inserts resilient to CLICKHOUSE_REPLICATION
* Reset CLICKHOUSE_REPLICATION
* Create distributed tables conditionally
* Update snapshots, tests
* Fixup conftest
* Remove event admin
* Move the ClickHouse version of posthog/tasks/test/test_org_usage_report.py inline
* Remove postgres-specific code from org usage report
* Kill dead on_perform method
* Remove dead EventSerializer
* Remove a dead import
* Remove a dead command
* Clean up test, don't create a model
* Remove dead code
* Clean up test_element
* Clean up test event code
* Remove a dead function
* Clean up dead imports
* Remove dead code
* Code style cleanup
* Fix FOSS test
* Simplify fn
* Org usage fixup #3
* version insights
* version and lock update
* make sure all tests work
* restore exception
* fix test
* fix test
* add specific id
* update plugin server test utils
* cleanup
* match filtering
* use timestamp comparison
* make tests work
* one more test field
* fix more tests
* more cleanup
* lock frontend when updating and restore refresh
* pass undefined
* add timestamp to background update
* use incrementer
* add field
* snapshot
* types
* more cleanup
* update tests
* remove crumbs
* use expressions
* make nullable
* batch delete
* fill null for static cohorts
* batch_delete
* typing
* remove queryset function
* working for unique_groups math
* fix types
* add null check
* update snapshots
* update payload
* update snapshots
* use constructor
* adjust queries
* introduce base class
* consolidate querying
* shared serializer and typed
* sort imports
* snapshots
* typing
* change name
* Add group model
```sql
BEGIN;
--
-- Create model Group
--
CREATE TABLE "posthog_group" (
    "id" serial NOT NULL PRIMARY KEY,
    "group_key" varchar(400) NOT NULL,
    "group_type_index" integer NOT NULL,
    "group_properties" jsonb NOT NULL,
    "created_at" timestamp with time zone NOT NULL,
    "properties_last_updated_at" jsonb NOT NULL,
    "properties_last_operation" jsonb NOT NULL,
    "version" bigint NOT NULL,
    "team_id" integer NOT NULL
);
--
-- Create constraint unique team_id/group_key/group_type_index combo on model group
--
ALTER TABLE "posthog_group" ADD CONSTRAINT "unique team_id/group_key/group_type_index combo" UNIQUE ("team_id", "group_key", "group_type_index");
ALTER TABLE "posthog_group" ADD CONSTRAINT "posthog_group_team_id_b3aed896_fk_posthog_team_id" FOREIGN KEY ("team_id") REFERENCES "posthog_team" ("id") DEFERRABLE INITIALLY DEFERRED;
CREATE INDEX "posthog_group_team_id_b3aed896" ON "posthog_group" ("team_id");
COMMIT;
```
* Remove a dead import
* Improve typing for groups
* Make groups updating more generic, avoid mutation
This simplifies reusing the same logic for groups.
Note there's a behavioral change: we no longer produce a new Kafka message
if nothing has been updated.
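The updater itself is plugin-server TypeScript; the shape of the change, sketched in Python with hypothetical names:
```python
from typing import Dict, Tuple

def compute_updated_properties(stored: Dict, incoming: Dict) -> Tuple[Dict, bool]:
    # Pure function: build a new dict instead of mutating `stored`, and report
    # whether anything changed so the caller can skip the Kafka message.
    updated = {**stored, **incoming}
    return updated, updated != stored
```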
* Rename a function
* WIP: Handle group property updates
... by storing them in Postgres. Uses the same pattern as person property
updates, except we also handle the first-seen case within updates.
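The upsert pattern, sketched against the posthog_group schema above (the real implementation lives in the plugin server; placeholders are illustrative):
```python
# Insert the group; on conflict with the (team_id, group_key, group_type_index)
# unique constraint, merge properties and bump version. First-seen and update
# are handled by the same statement.
UPSERT_GROUP_SQL = """
INSERT INTO posthog_group
    (team_id, group_key, group_type_index, group_properties,
     created_at, properties_last_updated_at, properties_last_operation, version)
VALUES
    (%(team_id)s, %(group_key)s, %(group_type_index)s, %(properties)s,
     %(created_at)s, %(updated_at)s, %(operations)s, 1)
ON CONFLICT (team_id, group_key, group_type_index) DO UPDATE SET
    group_properties = EXCLUDED.group_properties,
    properties_last_updated_at = EXCLUDED.properties_last_updated_at,
    properties_last_operation = EXCLUDED.properties_last_operation,
    version = posthog_group.version + 1
"""
```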
* Get rid of boolean option
* WIP continued
* fetchGroup() and upsertGroup()
* Test more edge cases
* Add tests for upsertGroup() in properties-updater
* Rename to PropertyUpdateOperation
* Followup
* Solve typing issues
* changed implementation to use pg
* unused
* update type
* update snapshots
* rename and remove inlining
* restore bad merge code
* adjust types
* add flag
* remove var
* misnamed
* change to uuid
* make sure to use string when passing result
* remove group join logic from the ColumnOptimizer and have each insight's event query class implement it
* remove unnecessary logic
* typing
* remove dead imports
* remove verbosity
* update snapshots
* typos
* remove signals
* remove plugin excess
Co-authored-by: Karl-Aksel Puulmann <oxymaccy@gmail.com>
* Extract GroupsJoinQuery
* Add test for breakdown filtering
* Unify breakdown mixins
* Allow passing breakdown_type == 'group' with breakdown_group_type_index
* Allow breakdown by group properties in trends (example filter below)
* Add tests for trends breakdown_props function on group breakdowns
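An example of the new filter shape (the property name is hypothetical):
```python
filter_payload = {
    "events": [{"id": "$pageview"}],
    "breakdown": "industry",          # a group property
    "breakdown_type": "group",
    "breakdown_group_type_index": 0,  # which group type the property belongs to
}
```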
* Solve common issues
* Output snapshot diff into console
* Clean up materialized columns after tests
* Add zero protection
* Solve test failure
* Type math in Entity
* Allow passing group_type_index from FE to BE
* Get an initial query running
* Add group value filter if aggregating by groups
* Add snapshot testing for trends queries
* isort
* Update tests
* Add test for column_optimizer
* Update ee/clickhouse/queries/trends/util.py
Co-authored-by: Neil Kakkar <neilkakkar@gmail.com>