* working for unique_groups math
* fix types
* add null check
* update snapshots
* update payload
* update snapshots
* use constructor
* adjust queries
* introduce base class
* consolidate querying
* shared serializer and typed
* sort imports
* snapshots
* typing
* change name
* Add group model
```sql
BEGIN;
--
-- Create model Group
--
CREATE TABLE "posthog_group" ("id" serial NOT NULL PRIMARY KEY, "group_key" varchar(400) NOT NULL, "group_type_index" integer NOT NULL, "group_properties" jsonb NOT NULL, "created_at" timestamp with time zone NOT NULL, "properties_last_updated_at" jsonb NOT NULL, "properties_last_operation" jsonb NOT NULL, "version" bigint NOT NULL, "team_id" integer NOT NULL);
--
-- Create constraint unique team_id/group_key/group_type_index combo on model group
--
ALTER TABLE "posthog_group" ADD CONSTRAINT "unique team_id/group_key/group_type_index combo" UNIQUE ("team_id", "group_key", "group_type_index");
ALTER TABLE "posthog_group" ADD CONSTRAINT "posthog_group_team_id_b3aed896_fk_posthog_team_id" FOREIGN KEY ("team_id") REFERENCES "posthog_team" ("id") DEFERRABLE INITIALLY DEFERRED;
CREATE INDEX "posthog_group_team_id_b3aed896" ON "posthog_group" ("team_id");
COMMIT;
```
* Remove a dead import
* Improve typing for groups
* Make groups updating more generic, avoid mutation
This simplifies using the same logic for groups
Note there's a behavioral change: We don't produce a new kafka message
if nothing has been updated anymore.
* Rename a function
* WIP: Handle group property updates
... by storing them in postgres
Uses identical pattern to person property updates, except we handle
first-seen case within updates as well.
* Get rid of boolean option
* WIP continued
* fetchGroup() and upsertGroup()
* Test more edge cases
* Add tests for upsertGroup() in properties-updater
* Rename to PropertyUpdateOperation
* Followup
* Solve typing issues
* changed implementation to use pg
* unusd
* update type
* update snapshots
* rename and remove inlining
* restore bad merge code
* adjust types
* add flag
* remove var
* misnamed
* change to uuid
* make sure to use string when passing result
* remove from columnoptimizer logic and have group join logic implemented by event query classes per insight
* remove unnecessary logic
* typing
* remove dead imports
* remove verbosity
* update snapshots
* typos
* remove signals
* remove plugin excess
Co-authored-by: Karl-Aksel Puulmann <oxymaccy@gmail.com>
* convert the smallest of the funnel test files
* convert unorder funnel person test to journey for helper
* convert another file
* convert another file
* convert another file
* convert another file
* convert another file
* undelete snapshot files
* undelete snapshot files
* Revert "convert another file"
This reverts commit ef08511509.
* remove cohort comment
* feat(funnels): update frontend to use new people_urls in response
This change updates the `FunnelBarGraph` and `FunnelStepTable`
components to use the new `converted_people_url` and
`dropped_poeple_url` in the funnels response.
I need to check that this covers all the cases for funnels. I've only
added people urls for funnels of type step for ordered/unordered/strict
and haven't yet touched the time bins and conversion time funnel
endpoint variants.
* add /some/people/url/ to tests
* chore: add action back, update tests to reference it
* fix typing
* adds a first draft of a test helper for clearer event setup in tests
* one spelling of journeys
* obey the type checker
* convert all of breakdown_cases to the new test helper
* Removes unused helper method
* test(funnel-people): add tests for converted/dropped people urls
As per https://github.com/PostHog/posthog/issues/6935 we are adding urls
to responses to attempt to improve consistency between insight responses
and the people we display in the UI.
This simply adds tests for these urls, just for funnel step response
shape. The other two types of response will be handled separately.
* feat(funnel-people): return converted/dropped people url in funnel
Here I have simply added converted / dropped urls in the base funnel.
This touches both clickhouse and postgres functionality, so I will
probably add in tests for postgres also, or just disable to postgres.
It doesn't pass yet as there is an issue in that the order appears to be
ignored by the people endpoint.
* fix(funnels): actually use strict/unordered persons querying
Previously we were always using the standard `ClickhouseFunnelPersons`
class for retrieving people from the `/api/persons/funnel/` endpoint.
This change selects from the unordered and strict variants based on the
`funnel_order_type` setting.
Refers to https://github.com/PostHog/posthog/issues/7058 although there
is a frontend component to add that will truely resolve the issue.
* feat(funnel-people): add people urls for funnels with breakdown
In this I have also removed any changes from the non-clickhouse code.
Note that there are two places I've added the url generation code, one
in the `ClickhouseFunnelBase` and another in the `ClickhouseFunnel`
I think the former covers the unordered and strict cases, and the later
for the breakdown case. I think there are further test that I'll need to
add to validate e.g. if breakdown + strict work as expected.
* Fix people urls for unordered funnels + breakdown
* test: add test for strict breakdown with people urls
* make funnel response assertions not check people url equality
* fix typing
* fix tests
* remove new line in postgres funnel.py
* clear cache on insight test start
* no really, clear the cache
* remove flakiness from strict funnel test
* correct the unordered test
* use absolute uris
* use step.index not step.order
* remove out of date comment
* use step.index, not step.order
* use step.index, remove unordered funnels comment
* use journeys_for instead of create_events
* add test_helper_methods
* move all the tests where the properties are the same for all events to the journey helper
* compare funnel results without caring about person order
* spell words correcterly
* Revert "spell words correcterly"
This reverts commit befb83b183.
* Revert "compare funnel results without caring about person order"
This reverts commit 268927b8ba.
* correct types for test props
* feat(correlation): use people_url to load people modal
This change adds a new method to the `personsModelLogic` to allow for
specifying a url to be used to retrieve people from. I've avoided
overloading the existing `loadPeople` and rather added a new one,
`lostPeopleFromUrl`.
Currenlty this is only being used by correlations, but the intention is
to extend it's usage to anywhere else that is performing drilldowns to
people from, for instance, the trends aggregation.
* add paycard back
* make labels look right
* remove unused code
* fix typing in test
* fix typing in personsModelLogic
* refactor to make eventUsageLogic work
* fix typing
* fix typing again
* remove some funnel stories
* get storybook working with paycard
* cleaned up the interface a little
* chore(correlation): use `api.get` for fetching people
This updates to use api.get, and also converts the returned uri from the
correlation endpoint to an absolut uri, instead of just an absolute
path.
The `api.get` method expects either an abolute uri, or a relative path.
I'd rather not use an relative path as it's not obvious what it should
be relative to.
* fix tests
* make storybook uris absolute
* Paths filtering by groups backend
* update correlation tests, now that CTEs are included in sqls
* use decorator for materialising to ensure clean up happens
* cleanup offending tests
* feat(correlation): add people drill down urls for event correlation
This change simply adds urls in the response of the correlation endpoint
for each correlation returned. Note that this only adds support when
using correlation_type=events, as the
correlation_type=events_with_properties requires further changes that
I'd like the separate out to keep this PR small.
* feat(correlations): add people by property to correlation response
This commit covers the PROPERTIES case for making sure we return a url
that can be used to retrieve the people associated with a correlation,
either success or failure. This case is a completely different url than
for the events people endpoint as we are reusing the funnel people
endpoint, and filtering by "funnel_step_breakdown".
I'm uncertain if this is actually the correct url at this point in time,
as I have just attempted to copy what I have seen in the request from
the live running app.
* fix test
* Remove another param
* fix mypy typing
* Fix tests
* Add url response support for event_with_properties
* remove type assertion
* remove variable hack
* Remove funnel_steps
* use with_data and to_params
* Remove comment regarding complexity, it seems to pass the tests so...
* Add autocapture support for people url
* refactor people url construction a little
* remove properties attr for event people request
* fix if
* no really, fix if not None
* Remove _assert_never, mypy needs updating
* Fix no return mypy error
* Remove json.dumps
* Migration to use materialized columns for groups
Workaround for https://github.com/PostHog/posthog/issues/6422
* Use groups materialized columns in queries
* Update mat column creation tests
* Simplify aggregation_target_field
* Fix migration
* Update snapshots
* smallest change to make aggregation work
* address comments
* add snapshot
* move function to groups model
* update funnel snapshot
* rename person_id to aggregation_target
* update snapshots as well
* dont support persons query mods for now
* update snapshot
* make array orders deterministic
Previously we would attempt to generate a response even though there
were no steps. There appears to be some code paths that blow up if this
happens, so instead we return as soon as we can in this case. This
appears to be the behaviour elsewhere also.
This resolves the sentry error found here:
https://sentry.io/organizations/posthog/issues/2718768248/
* Refactor column_optimizer to work differently
* WIP: Use counter over set
* Handle person filters in person query
* Remove a dead argument
* Use enum over parameter for determining behavior
* Allow excluding person properties mode when handled in person query
* Fix _get_person_query type
* Use correct table for funnel_event_query
* Remove unneeded override
* Add extra typing
* Filter by entity.properties in person query for trends
* Handle error 184 due to naming clash
* Better default for prop_filter_json_extract
* Update column_optimizer tests for Counter
* Handle person_props as extra_fields
* Handle breakdowns and person property filter pushdown
* Transform values correctly
* Simplify get_entity_filtering_params
* Fix funnel correlations
* Solve caching issues in trend people queries
* Remove @skip test
* Add syrupy tests for parse_prop_clauses
Can update these via --snapshot-update
* Add snapshot tests for person queries
* Add a few notes
* Update test to avoid collision
* Kill dead code
* Handle PR comments
* Update ee/clickhouse/queries/person_query.py
Co-authored-by: Neil Kakkar <neilkakkar@gmail.com>
Co-authored-by: Neil Kakkar <neilkakkar@gmail.com>
This seems to be semi-consistently failing locally for me and
inconsistently on CI. The cause is unknown - debugging locally, this
seems more of a query correctness issue than a test data incomplete
issue.
* fix(correlation): ensure correlation partitioned by team
Previously if distinct_id's were the same between two teams, we'd end up
pulling in the event data between the teams.
* dev(tests): clear cache between partition calls
* fix(correlation): add lower bounds for selected events
Previously we would consider all events for correlation calculation. Now
we use the funnel `date_from` as the lower bounds.
* chore(correlation): exclude funnel steps
* chore(correlation): make sure cache is cleared before each test
* Update funnel success comment
Co-authored-by: Neil Kakkar <neilkakkar@gmail.com>
* fix: filter events by team_id
* chore(correlation): remove CTEs from correlation query
There seems to be an issue with the CTEs and production clickhouse, see
https://github.com/ClickHouse/ClickHouse/issues/29748
Instead of risking it, I'm just removing them.
* chore: update entities -> events for funnel step exclusion
* fix team_id = team_id issue
Co-authored-by: Neil Kakkar <neilkakkar@gmail.com>
* chore(diagnose): add a stub endpoint for event diagnosis
This adds an insight endpoint that takes a `target_entity` and returns
a list of events ordered by significance of a person reaching
`target_entity`
Followup PRs will add actual calculations but this should act as a
conversation piece around the structure of request and response, as well
as a stub from which UI development can start.
* play around to get mypy typings right
* Sort out test reponse structurea
* refactor: address CR comments
* feat(diagnose): add diagnose stubs for implementation
* feat(funnel): add event correlation calculation implementation
This adds to the `/api/projects/<team_id>/funnel/correlation` endpoint
an implementation that calculates the odds_ratios for each event that a
user that has been part of a funnel, successful or otherwise.
* chore(correlation): get query working
* refactor(correlation): move functions to Query methods
* split out funnel apis and fix histogram paint order bug
* fix broken tests
* simplify some more
* make this fail if no steps returned
Co-authored-by: Marius Andra <marius.andra@gmail.com>
* make timestamp dynamic
* change funnel path type to str
* change include_timestamp_step name
* type issues
* add argument
* add type
* explicit none check
* change field names
* change constant name
* between step test
* use different arguments for defining funnel steps and add tests
* after step test
* fix types
* test for valueerror
* add test for before dropoff
* custom logic for finding paths up to a dropoff
* fix type
* update test
* add comment
* change event names in test to be clearer
* run again
* test without caching
* return caching
* try not caching again
* try not caching again
* try not caching again
* restore caching
* remove funnel_window_days
* remove unused import
* Make breakdown join with person if needed
* Refactor trends to be purely class-based, don't rely on wonky inheritance
* Extract method
* Improve person join behavior
* Remove unneeded parameter
* Mark a function thats always passed as such
* Add test case demonstrating previous non-join case
* Unify two get_breakdown_prop_values methods
* Add test for materialized columns
* Simplify trends breakdown query
* Unify _breakdown_prop_params for events/person breakdowns
* Use shared column_optimizer
* Typing fix
* Cleanup
* Code style cleanup
* Code style cleanup
* Fix param ordering
* ColumnOptimizer: Add functionality for person materialized columns
* WIP: Use person properties in trends/funnels
* Test person materialized props in trends cohort breakdown query
* Make use of materialized person properties in breakdowns
* Mark some cases working with materialized columns
* Test and fix breakdown by person props without filtering
* Make filtering by entity/person props on a join work better
By not assuming everything is under `properties`
* Add test case around breakdown with person properties
* Add test cases around materialization
* Add another materialization test
* Test cohorts and fix breakdowns with person filter
This exposes a limitation in the current implementation
* Fix some cohort tests
* Fix event query tests
* Get a funnel materialized column test running
* Cover more funnel breakdown tests with materialized
* Handle person property breakdowns in funnels
* Fixup funnel typo
* Add tests, fix an indentation issue
* TestFunnelPersons with materialized columns
* Test funnels against actions with person filters
* Add failing test for entity filtering failing
* Add test case for filtering with entity properties
* Show my 'broken' test is actually doing a subquery
* Resolve linting issues