mirror of https://github.com/PostHog/posthog.git synced 2024-12-01 04:12:23 +01:00
Commit Graph

156 Commits

Author SHA1 Message Date
Karl-Aksel Puulmann
fda07665e5
Update test snapshots and test code (#8220) 2022-01-24 14:18:34 +02:00
Eric Duong
12da43034f
yeet(actions): Consolidate clickhouse actions to actions (#8150)
* actions working

* tests working

* types

* snapshots

* update snapshots
2022-01-21 09:42:18 -05:00
Rick Marron
2b9917a915
Recordings in paths (#8015)
* add recordings to path query

* uncomment cache

* add clarifying comment

* works for start/end paths

* move to extra fields/properties

* add tests

* cleanup

* update ff name

* fix flaky test

* test and handle path_dropoff_key case
2022-01-18 15:29:52 -08:00
Karl-Aksel Puulmann
cae9b59779
Fix lifecycle rounding logic (#8057) 2022-01-18 09:52:49 +02:00
Paolo D'Amico
2c5d9997ca
Extra sessions cleanup (#8037) 2022-01-13 19:20:47 -06:00
Karl-Aksel Puulmann
d9bc06b7dd
Speed up lifecycle query (#8021)
* refactor(lifecycle): simplify clickhouse sql logic

This updates the SQL to be comprised of two queries, one for getting
new, returning, and resurrecting periods of activity, one for getting
dormant periods right after periods of activity.

Refers to https://github.com/PostHog/posthog/issues/7382

* refactor(lifecycle): use `ClickhouseEventQuery` to build event query

* format

* Use bounded_person_activity_by_period for both sides of dormant join

* refactor(lifecycle): reduce pdi2 join by one

This means we're now under the current query memory limit for orgs with
around 20m distinct_ids. It does remove some readability though :(

* update snapshot

* Add further comments to query

* Add further comments to query

* Add further comments to query

* Remove dead variables

* Refactor person_query overriding

* Lifecycle refactoring continued

* Update lifecycle tests (except people ones)

* Make lifecycle people endpoint happy

* Remove django lifecycle tests

* Add some edge case tests

* Add missing type
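
As a rough illustration of the four lifecycle statuses named in this commit (new, returning, resurrecting, dormant), the Python sketch below classifies one person's periods of activity. It is not the ClickHouse SQL from this change. `classify_periods` and its arguments are hypothetical, and it assumes that a person's first active period counts as "new".

```python
from typing import Dict, Set


def classify_periods(active: Set[int], last_period: int) -> Dict[int, str]:
    """Label period indexes for one person (hypothetical helper, not PostHog code)."""
    statuses: Dict[int, str] = {}
    if not active:
        return statuses
    first = min(active)
    # Pass 1: periods with activity are new, returning, or resurrecting.
    for period in active:
        if period == first:
            statuses[period] = "new"  # assumption: first activity means "new"
        elif period - 1 in active:
            statuses[period] = "returning"
        else:
            statuses[period] = "resurrecting"
    # Pass 2: a period right after activity, with no activity itself, is dormant.
    for period in active:
        following = period + 1
        if following not in active and following <= last_period:
            statuses[following] = "dormant"
    return statuses
```

The two passes mirror the split described above: one pass for periods of activity, one for the dormant period that directly follows activity.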

Co-authored-by: Harry Waye <harry@posthog.com>
2022-01-13 16:31:09 +02:00
Eric Duong
34d45e3436
[cohort] insight cohorts (#7569)
* initial working

* fix tests

* correct tests

* typing

* update snapshot

* add funnel test

* remove unused

* rest of tests

* function name

* typing

* raise if debugging otherwise send to sentry

* no limit option for cohorts

* remove duplicate

* propagate types correctly

* add param

* cleanup

* update snapshots

* add comment

* change var name

* reverse arg

* use func

* fix tests and types

* add simplify

* move simplification

* adjust checks

* explicit type

* don't init
2022-01-06 10:38:29 -05:00
Karl-Aksel Puulmann
9141996e1c
Proposal: Set time bounds for "all of time" filter (#7849)
* Set time bounds for "all of time" filter

We won't display data points from before 2015 anymore, avoiding
confusion like in https://github.com/PostHog/posthog/issues/7626

* Disable dates before 2015, add tooltip
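
A minimal sketch of the bounding idea, assuming the lower bound is the start of 2015 in UTC. The names `EARLIEST_ALLOWED` and `all_time_start` are hypothetical and are not taken from the PostHog codebase.

```python
from datetime import datetime, timezone

# Assumption: "before 2015" means before 2015-01-01 00:00 UTC.
EARLIEST_ALLOWED = datetime(2015, 1, 1, tzinfo=timezone.utc)


def all_time_start(earliest_event: datetime) -> datetime:
    """Return the start of an "all of time" range, never earlier than the bound."""
    return max(earliest_event, EARLIEST_ALLOWED)
```

An event with a bogus 1970 timestamp would then no longer stretch the chart back to 1970.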
2022-01-05 14:34:33 +02:00
Karl-Aksel Puulmann
e5ee7b4270
Read from and write to person_distinct_id2 if async migration is done (#7846)
* Run queries against person_distinct_id2 when async migration is done

* Only write to clickhouse_person_unique_id topic if async migration is incomplete

* Update query snapshots

* Update plugin-server

* Adjust caching logic
2022-01-05 13:11:33 +02:00
Karl-Aksel Puulmann
afce8efafb
Add benchmark for funnel query (#7813)
* Add benchmark for funnel query

Testing new sorting order takes the benchmark from 3.5s -> 1.5s \o/

* Update snapshots
2021-12-21 13:09:00 +02:00
Harry Waye
fdb4255303
chore(lifecycle): add comments and CTEs to clickhouse sql (#7773)
* chore(lifecycle): add comments and CTEs to clickhouse sql

It was really big and confusing, but hopefully this clarifies a little
what is going on. As a followup PR I'll be doing some work to make the
query faster :fingerscrossed: but I think worth at least getting this
in, assuming I haven't broken any tests!

* update snapshots

* remove day references

* update snapshots
2021-12-17 17:09:28 +00:00
Eric Duong
30b5658fb3
[fix] Make sure ordering is consistent (#7738)
* make sure ordering is consistent

* ordering
2021-12-15 15:25:56 -05:00
Harry Waye
c595976779
fix(retention): fix breakdown people urls (#7642)
* fix(retention): fix breakdown people urls

This change returns people_url for each breakdown cohort in the
response. We also merge the initial and returning queries together,
as this makes it easier to align the people query also.

Note that I'm talking about person_id as opposed to actor_type etc.
but perhaps that can be a followup.

* clean up clickhouse params

* tidy up a little

* remove import

* remove non-breakdown specific code

* make cohort by initial event date a special breakdown case

* keep date for backwards compat

* Remove unused sql

* make test stable

* wip

* Get most of the tests working

* test(retention): remove graph retention test

We no longer need this, we have all the information we need from the
table response for retention, and can construct this on the frontend.

* revert any changes to posthog/queries/retention.py

* revert any changes to ee/clickhouse/models/person.py

* Revert posthog/queries/retention.py to merge-base

* Ensure actor id is a str

* Add type for actor serialiser for type narrowing

* run black

* sort imports

* Remove retention_actors.py

* fix typings

* format

* reverse str type

* sort imports

* rename

* split out functions

* remove duplicate logic

* working

* fix type

* don't stringify

* fix test

* ordering doesn't matter

* trigger ci

Co-authored-by: eric <eeoneric@gmail.com>
2021-12-15 18:20:56 +00:00
Eric Duong
a051f7ee1f
[actors] retention actors (#7495)
* convert to actor form

* change var name

* remove unused imports

* typing issue

* use subquery

* bad import

* groups for general retention query

* actor in period

* update imports

* update test

* remove comment
2021-12-09 10:11:21 -05:00
Karl-Aksel Puulmann
3a4beda2ad
Speed up person_distinct_id queries for select clients (#7577)
* Create a new way to get distinct id queries that's gated by team_id

* Update most cases to use the new query

* Convert EVENT_JOIN_PERSON_SQL to new query

* Mostly convert GET_DISTINCT_IDS_BY_PERSON_ID_FILTER

* Mostly convert GET_DISTINCT_IDS_BY_PROPERTY_SQL

* Convert GET_PERSON_IDS_BY_FILTER

* Flag benchmarks

* Resolve circular imports

* Update a snapshot test

* Add a test for the new query logic
2021-12-08 15:59:41 +02:00
Eric Duong
cb470c160e
Change timeout to check execution time to 60 seconds (#7562)
* change timeout to 60 seconds

* update snapshots
2021-12-07 18:23:39 +00:00
Tim Glaser
180be54132
Don't capture error messages for trends (#7557) 2021-12-07 16:55:33 +00:00
Eric Duong
110aa9460b
[actor] Funnel correlation actor (#7456)
* match filter var name

* temp

* working tests

* wrong param

* types

* funnel correlation groups working

* typing

* funnel_people -> funnel_actors

* change naming of people -> actors

* update snapshots

* remove dirty change

* typing

* add check and types

* compare with master

* trigger tests
2021-12-06 09:43:58 -05:00
Tim Glaser
7fe96b813d
Fix error too deep recursion (#7445)
* Fix error too deep recursion

* Add space

* space no 'AND'

* update snapshots

* fix test

* fix test

* Remove team_id from prop_clause
2021-12-02 12:12:14 +01:00
Eric Duong
f6b300b1be
[actors] convert paths to actor pattern (#7437)
* convert paths to actor pattern

* used serialized result
2021-12-01 08:46:14 +02:00
Harry Waye
f914e77f4f
refactor(stickiness): refactor one stickiness test to use api (#7398)
* refactor(stickiness): refactor one stickiness test to use api

This change demonstrates how to migrate a `Query` object level test to
an api based test. It purely focuses on the method of action invocation
and not on any of the e.g. setup or assertions. The StickinessQuery
object is only used by the REST API (and benchmarking), whereas the
REST stickiness API is used by external users including our own frontend
developers, so it makes sense to test at this level.

* Migrate stickiness query tests to api

This doesn't touch the stickiness people API however

* Migrate clickhouse specific stickiness tests

* Migrate stickiness people query tests to http api level

NOTE! This isn't just a straight migration, but also makes one important
change to application code that would otherwise result in a test
failure. Specifically, when trying to find an action based on the
`entity_id` query param, we need to consider that the entity_id is a
string. This is fine for when trying to find events, as we are comparing
event ids which are strings, but for actions the id is an int, so we
need to ensure we cast the action id to a string before comparison.

* Move stickiness query tests to api tests location

* make stickiness tests stable across postgres/clickhouse

* Add comment regarding casting action ids to strings
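
The casting issue described in the people-query note can be sketched as follows. This is an illustration only: `find_action` and the dictionary shape are hypothetical. The only details taken from the commit message are that `entity_id` arrives as a string and that action ids are ints.

```python
from typing import Dict, List, Optional


def find_action(actions: List[Dict], entity_id: str) -> Optional[Dict]:
    """Find an action by the `entity_id` query param, which is always a string."""
    for action in actions:
        # Action ids are ints, so cast to str before comparing with the param.
        if str(action["id"]) == entity_id:
            return action
    return None
```

Comparing `action["id"] == entity_id` directly would never match, because `3 == "3"` is false in Python.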
2021-11-30 13:40:15 +00:00
Paolo D'Amico
87e91e1352
Remove legacy sessions (#7401)
Co-authored-by: Rick Marron <rcmarron@gmail.com>
2021-11-29 21:11:10 -08:00
Karl-Aksel Puulmann
065947877e
Groups: Make test_account_filters work with groups (sessions, lifecycle support) (#7387)
* Extract common subquery into a variable

* BE: handle group properties in more cases

* Add tests for lifecycle and sessions query changes

* Better docs

* Stable date range
2021-11-29 14:41:10 +02:00
Eric Duong
7979f52e8a
[groups persons] API for returning groups based on trend results (#7144)
* working for unique_groups math

* fix types

* add null check

* update snapshots

* update payload

* update snapshots

* use constructor

* adjust queries

* introduce base class

* consolidate querying

* shared serializer and typed

* sort imports

* snapshots

* typing

* change name

* Add group model

```sql
BEGIN;
--
-- Create model Group
--
CREATE TABLE "posthog_group" ("id" serial NOT NULL PRIMARY KEY, "group_key" varchar(400) NOT NULL, "group_type_index" integer NOT NULL, "group_properties" jsonb NOT NULL, "created_at" timestamp with time zone NOT NULL, "properties_last_updated_at" jsonb NOT NULL, "properties_last_operation" jsonb NOT NULL, "version" bigint NOT NULL, "team_id" integer NOT NULL);
--
-- Create constraint unique team_id/group_key/group_type_index combo on model group
--
ALTER TABLE "posthog_group" ADD CONSTRAINT "unique team_id/group_key/group_type_index combo" UNIQUE ("team_id", "group_key", "group_type_index");
ALTER TABLE "posthog_group" ADD CONSTRAINT "posthog_group_team_id_b3aed896_fk_posthog_team_id" FOREIGN KEY ("team_id") REFERENCES "posthog_team" ("id") DEFERRABLE INITIALLY DEFERRED;
CREATE INDEX "posthog_group_team_id_b3aed896" ON "posthog_group" ("team_id");
COMMIT;
```

* Remove a dead import

* Improve typing for groups

* Make groups updating more generic, avoid mutation

This simplifies using the same logic for groups

Note there's a behavioral change: We don't produce a new kafka message
if nothing has been updated anymore.

* Rename a function

* WIP: Handle group property updates

... by storing them in postgres

Uses identical pattern to person property updates, except we handle
first-seen case within updates as well.

* Get rid of boolean option

* WIP continued

* fetchGroup() and upsertGroup()

* Test more edge cases

* Add tests for upsertGroup() in properties-updater

* Rename to PropertyUpdateOperation

* Followup

* Solve typing issues

* changed implementation to use pg

* unused

* update type

* update snapshots

* rename and remove inlining

* restore bad merge code

* adjust types

* add flag

* remove var

* misnamed

* change to uuid

* make sure to use string when passing result

* remove from columnoptimizer logic and have group join logic implemented by event query classes per insight

* remove unnecessary logic

* typing

* remove dead imports

* remove verbosity

* update snapshots

* typos

* remove signals

* remove plugin excess
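
The behavioral change noted above (avoid mutation, and do not produce a Kafka message when nothing was updated) can be sketched in Python as below. The actual change lives in the plugin-server and is not shown here. `apply_updates`, `upsert`, and the `publish` callback are hypothetical stand-ins.

```python
from typing import Any, Callable, Dict, Tuple

Properties = Dict[str, Any]


def apply_updates(current: Properties, updates: Properties) -> Tuple[Properties, bool]:
    """Return a new properties dict and whether it differs from `current`."""
    updated = dict(current)  # copy so the caller's dict is not mutated
    updated.update(updates)
    return updated, updated != current


def upsert(
    current: Properties,
    updates: Properties,
    publish: Callable[[Properties], None],
) -> Properties:
    """Apply updates and publish a message only when something changed."""
    updated, changed = apply_updates(current, updates)
    if changed:
        publish(updated)  # stand-in for producing a Kafka message
    return updated
```

Because the update step is a pure function of its inputs, the same logic could be reused for persons and groups, which is the simplification the commit message describes.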

Co-authored-by: Karl-Aksel Puulmann <oxymaccy@gmail.com>
2021-11-18 11:58:48 -05:00
Karl-Aksel Puulmann
88db4845f4
Speed up stickiness queries & allow aggregating/filtering by groups (#7117)
* Refactor stickiness to have its own event_query

This will speed up queries significantly and allow for filtering by
group properties

* Use same event_query for stickiness people

* Minor cleanup

* Add tests (and missing file) to group filtering in stickiness

* Allow aggregating by groups in stickiness

* Show group property filters in FE for stickiness
2021-11-17 12:49:49 +02:00
Eric Duong
b7667f528f
fix issue when breaking down on group property and trying to return persons (#7145) 2021-11-16 09:39:48 +02:00
Neil Kakkar
13a93098cc
Enable different Property lists based on group type for Property Correlations (#7093)
* Enable different property lists based on group type

* fix test again
2021-11-12 11:45:31 +00:00
Neil Kakkar
7a5bb93e2a
Enable Correlations for Groups (#7056)
* enable correlations for groups

* address comments, refactor bits

* address comments
2021-11-12 11:32:55 +00:00
Eric Duong
e5503cfd26
remove changing timestamp (#7046) 2021-11-11 11:30:50 +00:00
Eric Duong
5d264b6c7d
[insights—persons] Trends person urls (#6966)
* basic backend implementation

* fix backend url formatting

* frontend

* provide params alongside url

* remove console log

* remove load function

* aggregate handling

* aggregate test

* update tests

* add pie handling

* type

* aggregate breakdown

* check jsonextract on properties

* make all tests work

* actionsbarvalue

* use new function

* remove url

* typing and cleanup

* use api.get

* typing

* cumulative

* add breakdown
2021-11-10 16:54:51 -05:00
Neil Kakkar
62af72bf22
Paths filtering by Groups backend (#7008)
* Paths filtering by groups backend

* update correlation tests, now that CTEs are included in sqls

* use decorator for materialising to ensure clean up happens

* cleanup offending tests
2021-11-10 09:47:02 +00:00
Neil Kakkar
82352c4a62
Funnel Groups filtering (#6940)
* enable groups filtering

* rerun snapshots

* update team

* fix tests
2021-11-09 08:32:55 +00:00
Karl-Aksel Puulmann
a39596c092
Groups: Use materialized columns for groups (#6938)
* Migration to use materialized columns for groups

Workaround for https://github.com/PostHog/posthog/issues/6422

* Use groups materialized columns in queries

* Update mat column creation tests

* Simplify aggregation_target_field

* Fix migration

* Update snapshots
2021-11-08 15:49:39 +02:00
Karl-Aksel Puulmann
0353b2f26f
BE (Groups/Trends): Testing follow-up PR (#6923)
* Improve process_math

* Add test for overlapping group keys

* Improve event query tests

* Add test for filtering by person properties together with groups

* Avoid flaky tests due to cohort_id changing

* Update queries and snapshots
2021-11-08 10:52:10 +02:00
Karl-Aksel Puulmann
7493f76802
BE (Groups/Retention): Support aggregating by groups (#6922)
* Add groups stuff

* Rename column from person_id to `target` in retention queries

No behavioral change, preparing for groups work :)

* Remove dead if statement

* WIP: Retention aggregation by groups

* Handle aggregation by groups in retention

Also handles the case where not every event has a property defined

* Test groups validation mixin

* Reformat

* Improve test for aggregation in retention
2021-11-08 09:23:02 +02:00
Karl-Aksel Puulmann
ea956fecae
Support filtering by group properties in retention (#6904) 2021-11-05 13:56:27 +02:00
Karl-Aksel Puulmann
42192e07c7
BE (Groups/Trends): Make breakdowns work with groups (#6899)
* Extract GroupsJoinQuery

* Add test for breakdown filtering

* Unify breakdown mixins

* Allow passing breakdown_type == 'group' with breakdown_group_type_index

* Allow breakdown by group props in trends

* Add tests for trends breakdown_props function on group breakdowns

* Solve common issues

* Output snapshot diff into console

* Clean up materialized columns after tests

* Add zero protection

* Solve test failure
2021-11-05 13:47:41 +02:00
Karl-Aksel Puulmann
fa79f8ea67
BE (Groups/Trends): Allow aggregating by groups (#6894)
* Type math in Entity

* Allow passing group_type_index from FE to BE

* Get an initial query running

* Add group value filter if aggregating by groups

* Add snapshot testing for trends queries

* isort

* Update tests

* Add test for column_optimizer

* Update ee/clickhouse/queries/trends/util.py

Co-authored-by: Neil Kakkar <neilkakkar@gmail.com>
2021-11-05 13:08:32 +02:00
Eric Duong
d423a3ad09
[proposal] suggestion for reducing runtimes (#6725)
* suggestion for reducing runtimes

* refactor tests ingestion

* django tests

* fix types

* add comment
2021-11-04 11:34:40 -04:00
Karl-Aksel Puulmann
273228cdcf
BE: Allow filtering by group properties in trends (#6761)
* Add group type, group_type_index

* Raise an error when handling unsupported properties in CH

* Improve repr

* Fix is_superset function

This was previously broken - sorting and zipping doesn't really work for
this intent.

* Add group_type_index to analysis results

* Add `group_types_to_query`

* Minor typing fixes

* Create groups tables in tests

* Simple first filter by groups query

* isort

* Use snapshot testing in event_query tests, add test for groups
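
To show why sorting and zipping fails as a superset check, here is a small Python sketch. Both functions are hypothetical illustrations and are not the code from this commit. The working version uses `collections.Counter`, which is one reasonable approach and not necessarily the one the fix adopted.

```python
from collections import Counter
from typing import Hashable, Sequence


def is_superset_zip(candidate: Sequence[Hashable], required: Sequence[Hashable]) -> bool:
    """Broken approach: pairs items positionally after sorting."""
    return all(a == b for a, b in zip(sorted(candidate), sorted(required)))


def is_superset(candidate: Sequence[Hashable], required: Sequence[Hashable]) -> bool:
    """True when every required item appears in candidate at least as often."""
    # Counter subtraction keeps only positive counts, i.e. what is still missing.
    missing = Counter(required) - Counter(candidate)
    return not missing
```

With `candidate = ["a", "b", "c"]` and `required = ["b", "c"]`, the zip version compares `"a"` with `"b"` and wrongly reports a mismatch, while the counter version reports a superset.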
2021-11-03 20:43:22 +02:00
Eric Duong
df583d528b
6331 pie chart persons bug (#6642)
* backend fixes and test

* add breakdown value to pie chart

* adjust test

* fix faulty test

* fill param

* fix formula tests

* more date passing

* more cleanup

* all tests working

* make test data explicit and add better checks

* support both ee and postgres

* length checks
2021-10-26 14:53:34 -04:00
Rick Marron
f2076f4008
Paginate recording compression (#6563)
* paginate recording compression

* some tests

* more accurate duration calculation

* add tests and types

* tons of decompression fixes

* rename test file to avoid conflict

* move decompression to helper

* add test for helper

* type fix

* rename method

* simplify paginated decompression

* handle case where offset exceeds length

* clean up

* test fixes

* clean up on aisle 12

* Add surrounding object for metadata response
2021-10-25 12:59:54 -07:00
Eric Duong
1f143d109e
Path cleaning integration (#6488)
* initial refactoring

* popup UI

* refactor path cleaning logic

* add nullable

* all ui working

* fix migration

* use regex replacement from team object

* add flag

* add switch

* fix type

* fix type

* UI update

* restore removed arg

* add local path cleaning filters to api

* add test for local path filters

* working new UI

* reduced repeated code

* fix numbering

* minor refactoring

* update copy

* add under advanced features

* address comments, minor cleanup

Co-authored-by: Neil Kakkar <neilkakkar@gmail.com>
2021-10-22 11:51:45 -04:00
Eric Duong
276200429d
Paths: API for regex replacement (#6397)
* api for regex replacement

* typing

* change to None for default
2021-10-14 13:15:01 -04:00
Karl-Aksel Puulmann
457e151f58
Push person predicates down (#6346)
* Refactor column_optimizer to work differently

* WIP: Use counter over set

* Handle person filters in person query

* Remove a dead argument

* Use enum over parameter for determining behavior

* Allow excluding person properties mode when handled in person query

* Fix _get_person_query type

* Use correct table for funnel_event_query

* Remove unneeded override

* Add extra typing

* Filter by entity.properties in person query for trends

* Handle error 184 due to naming clash

* Better default for prop_filter_json_extract

* Update column_optimizer tests for Counter

* Handle person_props as extra_fields

* Handle breakdowns and person property filter pushdown

* Transform values correctly

* Simplify get_entity_filtering_params

* Fix funnel correlations

* Solve caching issues in trend people queries

* Remove @skip test

* Add syrupy tests for parse_prop_clauses

Can update these via --snapshot-update

* Add snapshot tests for person queries

* Add a few notes

* Update test to avoid collision

* Kill dead code

* Handle PR comments

* Update ee/clickhouse/queries/person_query.py

Co-authored-by: Neil Kakkar <neilkakkar@gmail.com>
2021-10-13 14:00:47 +00:00
Karl-Aksel Puulmann
7461f90153
Simplify cohort filters (#6277)
* WIP: Create new property types for simplified cohorts

* Add documentation on simplified_cohort_filter_properties

* Handle static-cohort/precalculated-cohort property types

* Handle new property filters properly

* Add casting

* Test cohorts in more cases

* Fix a bug

* Fix benchmark simplifying

* Avoid redoing work every setup for benchmarks

* Update typing

* Remove unneeded scope

* Add tests for simplifying and cohorts

* Roll more of "do we need to join persons table" behavior into ClickhousePersonQuery class

* Handle precalculated cohort logic in sessions

* Simplify event query

* More tests without any JSONExtract

* Simplify entity properties as well

* Improve docstring

* Add test for breakdown & precalculated cohorts

* Add test for filtering sessions by precalculated cohorts

* Reset unneeded change

* Update cohort

* Solve some typing issues

* Update benchmarking

* Fix cohort filtering tests

* Fix cohort tests

* Fix a caching issue

* Typecheck

* Handle exclusion filters
2021-10-08 10:51:11 +03:00
Karl-Aksel Puulmann
ef7f31c482
Simplify test accounts (#6221)
* Simplify filters code

* Simplify filters ASAP if filter is created

* Simplify route

* Remove simplification-specific logic from queries

* Remove recursion, update tests

* Pass team in more cases

* Update column optimizer specs

* Test simplify

* Update trends test

* Fix rebase fail
2021-10-07 23:14:35 +03:00
Neil Kakkar
899ad24722
Support multiple correlation properties (#6312)
* Support multiple correlation properties

* extensive comments, now that they get stripped at runtime

* rename funnel_correlation_values
2021-10-07 17:22:10 +01:00
Neil Kakkar
8b2cc5d02e
Write few more tests for Correlation + ColumnOptimizer (#6309) 2021-10-07 13:29:16 +00:00
Eric Duong
1a9eafe0ed
paths: remove funnel limit when querying path funnels (#6210)
* remove funnel limit when querying path funnels

* add limit test and fix args

* fix args

* typos
2021-10-04 09:56:20 -04:00