mirror of https://github.com/PostHog/posthog.git synced 2024-11-28 18:26:15 +01:00
Commit Graph

33 Commits

Author SHA1 Message Date
Tim Glaser
7fe96b813d
Fix error too deep recursion (#7445)
* Fix error too deep recursion

* Add space

* space no 'AND'

* update snapshots

* fix test

* fix test

* Remove team_id from prop_clause
2021-12-02 12:12:14 +01:00
Paolo D'Amico
87e91e1352
Remove legacy sessions (#7401)
Co-authored-by: Rick Marron <rcmarron@gmail.com>
2021-11-29 21:11:10 -08:00
Karl-Aksel Puulmann
065947877e
Groups: Make test_account_filters work with groups (sessions, lifecycle support) (#7387)
* Extract common subquery into a variable

* BE: handle group properties in more cases

* Add tests for lifecycle and sessions query changes

* Better docs

* Stable date range
2021-11-29 14:41:10 +02:00
Eric Duong
df583d528b
6331 pie chart persons bug (#6642)
* backend fixes and test

* add breakdown value to pie chart

* adjust test

* fix faulty test

* fill param

* fix formula tests

* more date passing

* more cleanup

* all tests working

* make test data explicit and add better checks

* support both ee and postgres

* length checks
2021-10-26 14:53:34 -04:00
Karl-Aksel Puulmann
7461f90153
Simplify cohort filters (#6277)
* WIP: Create new property types for simplified cohorts

* Add documentation on simplified_cohort_filter_properties

* Handle static-cohort/precalculated-cohort property types

* Handle new property filters properly

* Add casting

* Test cohorts in more cases

* Fix a bug

* Fix benchmark simplifying

* Avoid redoing work every setup for benchmarks

* Update typing;

* Remove unneeded scope

* Add tests for simplifying and cohorts

* Roll more of "do we need to join persons table" behavior into ClickhousePersonQuery class

* Handle precalculated cohort logic in sessions

* Simplify event query

* More tests without any JSONExtract

* Simplify entity properties as well

* Improve docstring

* Add test for breakdown & precalculated cohorts

* Add test for filtering sessions by precalculated cohorts

* Reset unneeded change

* Update cohort

* Solve some typing issues

* Update benchmarking

* Fix cohort filtering tests

* Fix cohort tests

* Fix a caching issue

* Typecheck

* Handle exclusion filters
2021-10-08 10:51:11 +03:00
Karl-Aksel Puulmann
ef7f31c482
Simplify test accounts (#6221)
* Simplify filters code

* Simplify filters ASAP if filter is created

* Simplify route

* Remove simplification-specific logic from queries

* Remove recursion, update tests

* Pass team in more cases

* Update column optimizer specs

* Test simplify

* Update trends test

* Fix rebase fail
2021-10-07 23:14:35 +03:00
Karl-Aksel Puulmann
c9003a8260
Better test coverage for materialized columns (#5682)
* Remove dead argument

* Make allow_denormalized_props always explicit

* Change prop_clauses default

* Create a testing decorator for checking materialized columns

This makes it easier to have proper test coverage for materialized
columns and make sure no bugs creep up :)

* Fix event_query

* Test more materialized columns in trends

* Add materialized column tests for funnels

* Cleanup path_event_query

* Fix default

* Fix issue with clashing parameter names
2021-08-23 17:17:24 +03:00
Karl-Aksel Puulmann
0f58482a66
Handle denormalized properties everywhere* (#5635)
* WIP: port process_math to support materialized columns

* Add skipped test showing trend breakdowns don't use materialized columns

* Simplify testing and test&fix math property aggregation w/ materialized columns

* Add (failing) test for filtering with materialized action props

* Add test around materialized property filtering

* Refactor entity.math materialization impl

* Make trends breakdowns work with materialized columns

* Simplify process_math further

* Handle denormalized properties in format_action_filter for step.properties

Note the following files all called this method:
ee/clickhouse/views/events.py
ee/clickhouse/views/actions.py
ee/clickhouse/queries/trends/util.py
ee/clickhouse/queries/trends/lifecycle.py
ee/clickhouse/queries/trends/breakdown.py
ee/clickhouse/queries/funnels/base.py
ee/clickhouse/queries/sessions/util.py
ee/clickhouse/queries/clickhouse_stickiness.py
ee/clickhouse/queries/clickhouse_retention.py
ee/clickhouse/models/cohort.py

I verified all of them are OK since they query events table directly
with the passed filter

* Handle materialized $current_url in action step filtering

* Remove now unneeded clause

* Update test helper

* Allow denormalized props for filtering with breakdowns

* Allow denormalized props for filtering with lifecycle

* Allow denormalized props for some views

* Fix entity math yet again

* Query materialized columns in insights > sessions

* Handle breakdown edge case

* Allow denormalized props for more views

* PR feedback

* reformat
2021-08-19 16:09:40 +03:00
Michael Matloka
c2bc2fecd0
Use proper interval calculation in the funnel trends query (#5467)
* Use proper interval calculation in the funnel trends query

* Add some comments

* Update `test_filter`

* Rework `NULL_SQL` to use CH `INTERVAL` too

* Fix week-based relative `date_from` support not existing

* Make use of `toInterval*` functions and inject less

* Add fallback for `date_from` in `ClickhouseSessionsAvg`
2021-08-06 11:29:35 +02:00
Karl-Aksel Puulmann
a1dd96e47d
Improve cached_property typing (#5465)
* Improve cached_property typing

Noticed that e.g. `filter.breakdown` was getting inferred to be `Any`
which is not correct. Added generics to fix it :)

* Proposed fix: make filter_test_accounts return a bool always

* Fix warning in clickhouse_sessions.py

* Cast in session.events

* Cast 2x in funnel queries

* Ignore error in session recording

We know the valid values here

* Add assertions in stickiness filters

* Cast in more funnel queries

* Untyped dict where inferred type is wrong

* Add types to abstract methods

* Type prop_vals

* Add a lot of casts

These are correct. We should really validate while parsing instead

* Add more casts to funnel trends

* Last fixes
2021-08-05 22:31:31 +03:00
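The typing fix described in the "Improve cached_property typing" commit above can be illustrated with a minimal sketch. This is not PostHog's actual implementation; the `cached_property` class body and the `ExampleFilter` class below are hypothetical, written only to show how adding generics lets a type checker infer the property's real return type instead of `Any`.

```python
from typing import Any, Callable, Dict, Generic, Optional, TypeVar

T = TypeVar("T")


class cached_property(Generic[T]):
    """Hypothetical generic cached property (assumption, not the project's code)."""

    def __init__(self, func: Callable[[Any], T]) -> None:
        self.func = func
        self.name = func.__name__

    def __get__(self, instance: Any, owner: Optional[type] = None) -> T:
        if instance is None:
            # Accessed on the class itself: return the descriptor.
            return self  # type: ignore[return-value]
        # Compute once and store on the instance. Because this descriptor
        # defines no __set__, the stored value shadows it on later lookups.
        value = self.func(instance)
        instance.__dict__[self.name] = value
        return value


class ExampleFilter:
    """Hypothetical stand-in for a filter object."""

    def __init__(self, data: Dict[str, str]) -> None:
        self._data = data
        self.calls = 0

    @cached_property
    def breakdown(self) -> Optional[str]:
        # With the generic descriptor, a type checker can infer
        # Optional[str] here rather than Any.
        self.calls += 1
        return self._data.get("breakdown")


example = ExampleFilter({"breakdown": "browser"})
first = example.breakdown
second = example.breakdown  # served from the instance dict, not recomputed
```

The design choice here is to bind the wrapped function's return type to `T`, so `__get__` can declare that same type as its result.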
Karl-Aksel Puulmann
ac0ef1b30e
Load session events async on clickhouse (#5056)
* Load session events asynchronously from a separate endpoint

This mirrors the behavior of postgres query

* Simplify backend & query

event_count is unused
don't select unused columns in list query

* Rename filter_by_session_recordings to filter_by_session_recordings

This is more in-line with what the function actually does

* Update types, handle start/end url properly

* start_url / end_url to session result

* Update sessions list builder tests

* Remove some `session.events` references

* Remove unneeded code

* Simplify filteredSessions

* Fix type issue

* Add test for session properties

* Test and fix start_url/end_url

* Add test for the new sessions API endpoint

* Improve types

* Update py types again

* Fix bug
2021-07-12 10:43:40 +03:00
Michael Matloka
a599056234
Handle Action.DoesNotExist more (#5028)
* Handle `Action.DoesNotExist` more

* Reuse try-except logic for `Action.DoesNotExist`

* Fix circular import

* Add `test_funnel_invalid_action_handled`
2021-07-08 21:47:14 +00:00
Eric Duong
c40a0716ce
Insight session entities (#3582)
* Fix session tests

* fix sessions test

* add postgres functionality

* clickhouse logic

* fix test

* fix pg test

* fix type

* add default condition

* change prompt

* make prompt better

* fix test

* add or tag

Co-authored-by: Tim Glaser <tim@glsr.nl>
2021-03-29 16:03:20 -04:00
Karl-Aksel Puulmann
e234b8dbd5
Speed up filtering by events in sessions (#3707)
* restructure the code

* Clean up complicated function using namedtuple

* Speed up action/event filtering in clickhouse

For very rare events, filtering by them in sessions previously was slow.
This was because the distinct_id query contained a lot of users who
hadn't done the action, resulting in more looping.

This commit speeds things up by querying users who have done any events matching the
event/action filter.
2021-03-19 20:16:27 +02:00
Tim Glaser
bf2c4429b5
Auto filter test accounts (#3492)
* WIP auto filter test accounts

* finish off

* Fix tests

* Non generic emails

* add list of generic emails

* Move location to below property filters

* Fix typescript errors

* as any

* fix tests

* filters

* fix tests

* Feature flag doesn't really make sense for this feature

* fix tests

* fix test

* Add clickhouse + tests for each insight

* Fix lifecycle and paths

* Fix sessions

* Fix session tests

* fix sessions test

* fix migrations

* fix migration chain

* refactor path & remove stale console.log

* adjust useAnchor & minor copy

* rename to avoid confusion with inline component

* test account filter tweaks

* fix filters

* hardcode

* Add tests for funnel trends

* Make generic emails super fast

* Fix migrations

* Default to false for now

* Default to false, remember a user's preference

Co-authored-by: Paolo D'Amico <paolodamico@users.noreply.github.com>
2021-03-11 18:16:38 +01:00
Karl-Aksel Puulmann
828c301299
Fix filtering sessions by action with property filters (#3500)
Fixes #3499
2021-02-26 11:24:31 +02:00
Karl-Aksel Puulmann
09fe2859ae
Rework sessions list query to account for pagination (#3296)
* Fix and test more sessions pagination cases

Pagination previously did not work correctly on CH/postgres due to
LIMIT clauses

Simplifying the sessions query helps on clickhouse, though might
introduce new issues.

* Make sessions list pagination work

The key idea is to divorce distinct_id lookups from result lookups.

This now works in the scenario where 101 users match person filter/have done
an event in time range, but only the 101st has a session matching
action/event filter (see tests)

This will perform even on superdaily, though it might slow down for very
specialized queries.

Potential future speedups:
- apply action/event filters on the distinct_id query -
  only return those who have the possibility of matching.

- Make distinct_id LIMIT higher if we know action/event limit is
  involved

- Caching the distinct_id query heavily

* Reorganize code

* Make session list tests pass w/ pagination

* Add tests, fix another corner case for postgres sessions list

distinct_ids were not always returned in the right order.

* Include distinct_id in sessions query

This should now solve https://github.com/PostHog/posthog/issues/3055
2021-02-12 09:41:10 +02:00
Karl-Aksel Puulmann
fba542af8a
Fix filtering by user property in CH sessions list (#3285)
Fixes https://github.com/PostHog/posthog/issues/3282

The problem was in pagination as expected.

Ran the query against our team in production clickhouse - it now returns
expected data and speed did not seem to change much.
2021-02-10 10:37:16 +02:00
Eric Duong
97f0cbd27a
Round sessions value (#2925)
* add scaling function

* adjust test

* update tests

* patch errors

* fix import
2021-01-21 13:24:59 -05:00
Karl-Aksel Puulmann
29c1ed954d
Allow filtering by unseen recordings (#3000)
* Add model for session recording viewed

* Save view when querying for session recording data

* Send information to FE about whether session has been viewed

* Allow filtering by "recording unseen"

* Rename property

* Update migration
2021-01-21 09:42:00 +02:00
Karl-Aksel Puulmann
9354b7ce64
Highlight filtered events in events table and in session recording (#2954)
* Highlight rows from sessions which are matched by the filter

* Send start_time of recording to frontend

This helps us calculate offsets a bit better

* Use timestamp by server for date shown

* Load session events via kea logic

* Update rrweb, rrweb-player

* Highlight events user is filtering for in sessions player

* Handle empty case properly

* Add positive test

* Order session recording events in query

* Fix filtering by multiple events with differing names

previously only the first would have been used due to overlapping
params.

* Return all highlighted times as action_filter_times

* Send back ids not timestamps

This avoids weird rounding errors causing issues

* Show skeleton for longer

* update typing

Co-authored-by: Paolo D'Amico <paolodamico@users.noreply.github.com>
2021-01-19 21:16:42 -06:00
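The "overlapping params" bug mentioned in the commit above (only the first of several event filters being applied) can be sketched as follows. This is a hedged illustration, not the project's query builder: the function name `build_event_clauses` and the `event_<index>` key scheme are assumptions chosen for the example. The point is that each filter gets its own parameter key, so a later filter cannot collide with an earlier one.

```python
from typing import Dict, List, Tuple


def build_event_clauses(event_names: List[str]) -> Tuple[str, Dict[str, str]]:
    """Hypothetical helper: build an OR clause with one unique param per event."""
    clauses: List[str] = []
    params: Dict[str, str] = {}
    for index, name in enumerate(event_names):
        # Suffix the key with the filter's index. Reusing a single key
        # such as "event" for every filter would leave only one value
        # in the params dict.
        key = f"event_{index}"
        clauses.append(f"event = %({key})s")
        params[key] = name
    return " OR ".join(clauses), params


clause, params = build_event_clauses(["pageview", "signup"])
```

With two differently named events, `params` holds two distinct entries and the clause references each of them by its own key.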
Karl-Aksel Puulmann
c245af6a3e
Filters design followups (#2993)
Co-authored-by: Paolo D'Amico <paolodamico@users.noreply.github.com>
2021-01-19 20:58:21 -06:00
Karl-Aksel Puulmann
3009e0aa2e
Support multiple action filters in sessions (#2946)
* Make it possible to filter by (multiple) action filters in postgres

Session will now contain "action_filter_times" key which lists when each
action filter occurred for the first time within the session.

This will be used to highlight rows/show special values in sessions
player.

* Clickhouse: support multiple action filters

* Remove dead code
2021-01-15 11:28:46 +01:00
Karl-Aksel Puulmann
17a31f0b43
Speed up sessions list query (#2934)
* Remove (apparently) useless person joins from sessions sql

* WIP: Make sessions list query use python iteration

* WIP: Show loader while session data is loading

* Aggregate together sessions

* Calculate start and end points of session separately

* Remove cruft code

* Load session events asynchronously for self-hosted

Note clickhouse behavior is unchanged.

* Update pagination logic for sessions

In addition to offset, postgres now returns a dict containing person_id,
timestamp which is used to make sure we filter events on different pages
correctly

* Add some tests to SessionListBuilder

* Fix typing errors

* Fix pagination limit

* Move tests to right file

* Query fewer events for sessions list

Since we're ordering by end_time we know events before last end_time are
all processed.

* Add test for current_url behavior

* Make sure old tests remain working

* Remove unused base class

* Move sessions-related queries to separate subfolder

* Extract sessions list code to separate file(s)

* Sort sessions by end time in ch as well

* List end time in sessions table

* Return person email when querying sessions lists on postgres

This gets used by the view

* Return email over all user properties for sessions in clickhouse and view

* Fix an ordering bug

* Fix a pagination bug

* Fix endpoint

* Fix basic sessions tests for pagination

* Sort consistently for sessions list builder

* Roll pagination into filters
2021-01-15 01:53:28 +02:00
Eric Duong
a0ab699588
Proper interval rounding on normal trends (#2901)
* do proper interval rounding on normal trends

* patch inconsistency

* remove microseconds

* conditionally round interval

* adjust how date_from is handled

* add retention test

* edits and split test
2021-01-13 14:35:46 +01:00
Karl-Aksel Puulmann
bcffd30092
Allow going from insights -> sessions (on cloud) (#2790)
* extract code to format entity filter

* make it possible to filter by action

* Hack to make filtering by actions subfilters possible

Example url:
http://localhost:8000/api/event/sessions?date_from=2020-12-10&date_to=2020-12-10&action_filter={%22type%22:%22events%22,%22id%22:%22test-event%22,%22properties%22:[{%22key%22:%22email%22,%22value%22:%22example.com%22,%22operator%22:%22icontains%22,%22type%22:%22person%22}]}

Not sure if the team filtering is legit here

* Use discriminated union in types

* Add kludge support for action_filter on sessions via url

* Reduce code in buildURL

* Add link to sessions page from persons modal

* Add muted overview of the invisible filter

* Add link to cohort sessions from persons page

* drop irrelevant test code

* Test clickhouse filtering by action filter

* put filter behind a cloud-only conditional

* remove dead import

* Add icons/data-attr to sessions links

* Appease linter gods
2021-01-06 12:59:52 +02:00
Eric Duong
4ece4ce3e8
Major filter refactor (#2736)
* split params into mixins

* add typing

* split func

* dedicated paths filter

* make basefilter

* restructure directories

* move prop mixin

* implement cached_property and fix circular import

* remove unused imports

* stickiness filter converted

* temp patch for breakdown

* correct naming for derived mixins in paths

* fix types

* fix types

* fix casing

* fix breakdown arg

* merge master

* add to_dict

* DRY to_dict

* scoped to_dict ability

* fix types

* add missing fields

* refactor session filter

* remove unused imports

* remove unused imports

* add stickiness filter test

* fix dict for pathfilter

* properly load strs

* add default

* standardize loading

* retention filter separate

* add update filter to retention

* change back decorator

* don't allow setting on filter date_from date_to

* remove seconds on default timestamp

* remove derived values

* fix date formatting

* fix typo
2021-01-05 12:15:24 +01:00
Tim Glaser
3f7e95d14a
Improved insights history (#2745)
* Deprecate dashboard item type and move to display

* Mypy errors

* fix test

* fix

* Fix test

* Fix another test

* WIP save history refactor

* Remove determineInsightType

* Progress

* Get rid of RetentionTable display types

* Sync all together

* Fix update dashboard

* Improved saving

* More bugfixes

* Fix insight caching and filters

* Fix insights

* Fix tests

* Fix import

* Fix saving issues

* Don't duplicate

* fix

* Use session instead of session_type everywhere for consistency

* Remove prints

* Use get_filter

* Fix retention filters

* Fix UI issues

Co-authored-by: Michael Matloka <dev@twixes.com>
2020-12-16 16:37:55 +01:00
Karl-Aksel Puulmann
3f0f051758
Sessions filtering postgres support (#2728)
* Duplicate existing endpoints under /api/event/sessions

* Use different endpoint for sessions table logic in sessions page

* drop irrelevant code for insights endpoint

* update documentation

* extract method

* move code under only relevant path

* Extract clickhouse sessions logic

* Refactor: Separate session list logic from rest of sessions

* move sessions list to separate file

* Move sessions list tests to separate file

* add clickhouse session list tests

* add support for duration filtering on postgres

* move api tests to right place

* kill dead code
2020-12-11 11:01:14 +02:00
Karl-Aksel Puulmann
b03d4da7e0
Allow filtering sessions by recording duration (#2721)
* Use return value of add_session_recording_ids

The mutation here is incidental, this is symmetrical with how postgres
behaves.

* Implement sessions_filter

This will hold session-recording specific filters

* WIP: make filtering by session recording length possible on ch

* Kill dead method

* Allow filtering by session length

* Make logic a bit smoother around filtering by recording duration

* More common code for clickhouse_session_recording

* Add filtering tests for ch

* Resolve mypy issues

* Only run duration tests for ch

* Put new filters logic behind a feature flag

* Kill dead const

* Solve operators-related comment

* Review feedback: rename variable

* Fix test failures
2020-12-10 10:26:47 +02:00
Eric Duong
8fbbe679f5
Stickiness improvement and filter refactor (#2638)
* stickiness filter refactor

* stickiness clickhouse

* parametrize clickhouse trunc

* add interval tests

* fix type and casing

* change name to interval not period

* change defaults

* remove offsets

* move stickiness people endpoint

* move imports

* remove unused imports

* fix time defaults

* swap endpoint

* add interval tests

* move api test

* fix all time calculation and add team_id filter to earliest timestamp ch
2020-12-04 20:42:01 +01:00
Eric Duong
997ec36916
Lifecycle Graph (#2460)
* initial working

* initial working with test

* return tests

* fix interval and add monthly test

* add in person_id join

* small restyle

* postgres lifecycle working

* add action handling to postgres query

* add action handling and tests

* fix typing

* visualizing with temp params and added negatives for dormant

* frontend

* fix intervals

* remove unnecessary import

* add person endpoint

* add person endpoint and tests

* add next

* add paginated test

* fix types

* add frontend logic

* fixed date range

* disable compare filter on lifecycle

* add disabled to chartfilter

* return class to personviewset

* added constant

* fix distinct_id and new event within period condition

* replace people queries and fix ch query too

* DRY

* fix null states

* comparefilter nullstate

* add wrapper back to endpoint

* fix datetime formatting

* remove extra stack flag

* reduce filters when it's lifecycle and replace constants

* add default for lifecycle into trendlogic
2020-12-02 16:53:06 +01:00
Eric Duong
ec64e467ee
Organize logic (#2358)
* clean up reused function

* remove fluff and use constants

* standardizing compare

* segment logic for clickhouse queries that have a lot

* add default to filter.date_to

* minor unwrap

* fix test
2020-11-18 13:12:26 +01:00