mirror of https://github.com/PostHog/posthog.git synced 2024-12-01 12:21:02 +01:00
posthog/ee/clickhouse/sql/retention/retention.py
Harry Waye c595976779
fix(retention): fix breakdown people urls (#7642)
* fix(retention): fix breakdown people urls

This change returns people_url for each breakdown cohort in the
response. We also merge the initial and returning queries together,
as this makes it easier to align the people query also.

Note that I'm talking about person_id as opposed to actor_type etc.
but perhaps that can be a followup.

* clean up clickhouse params

* tidy up a little

* remove import

* remove non-breakdown specific code

* make cohort by initial event date a special breakdown case

* keep date for backwards compat

* Remove unused sql

* make test stable

* wip

* Get most of the tests working

* test(retention): remove graph retention test

We no longer need this, we have all the information we need from the
table response for retention, and can construct this on the frontend.

* revert any changes to posthog/queries/retention.py

* revert any changes to ee/clickhouse/models/person.py

* Revert posthog/queries/retention.py to merge-base

* Ensure actor id is a str

* Add type for actor serialiser for type narrowing

* run black

* sort imports

* Remove retention_actors.py

* fix typings

* format

* reverse str type

* sort imports

* rename

* split out functions

* remove duplicate logic

* working

* fix type

* don't stringify

* fix test

* ordering doesn't matter

* trigger ci

Co-authored-by: eric <eeoneric@gmail.com>
2021-12-15 18:20:56 +00:00


# Counts distinct actors per (breakdown_values, intervals_from_base) pair.
# `{actor_query}` is a placeholder for the actor subquery.
RETENTION_BREAKDOWN_SQL = """
    WITH actor_query AS ({actor_query})

    SELECT
        actor_activity.breakdown_values AS breakdown_values,
        actor_activity.intervals_from_base AS intervals_from_base,
        COUNT(DISTINCT actor_activity.actor_id) AS count

    FROM actor_query AS actor_activity

    GROUP BY
        breakdown_values,
        intervals_from_base

    ORDER BY
        breakdown_values,
        intervals_from_base
"""
# Yields one distinct (breakdown_values, intervals_from_base, actor_id) row per
# actor and interval. `%(...)s` values are query parameters; the `{...}`
# placeholders take the event subqueries. Interval 0 rows come from the target
# events themselves; later intervals come from returning events that occur
# strictly after the target event.
RETENTION_BREAKDOWN_ACTOR_SQL = """
    WITH %(period)s as period,
         %(breakdown_values)s as breakdown_values_filter,
         %(selected_interval)s as selected_interval,
         returning_event_query as ({returning_event_query}),
         target_event_query as ({target_event_query})

    -- Wrap such that CTE is shared across both sides of the union
    SELECT
        DISTINCT
        breakdown_values,
        intervals_from_base,
        actor_id

    FROM (
        SELECT
            target_event.breakdown_values AS breakdown_values,
            datediff(
                period,
                target_event.event_date,
                returning_event.event_date
            ) AS intervals_from_base,
            returning_event.target AS actor_id

        FROM
            target_event_query AS target_event
            JOIN returning_event_query AS returning_event
                ON returning_event.target = target_event.target

        WHERE
            returning_event.event_date > target_event.event_date

        UNION ALL

        SELECT
            target_event.breakdown_values AS breakdown_values,
            0 AS intervals_from_base,
            target_event.target AS actor_id

        FROM target_event_query AS target_event
    )

    WHERE
        (breakdown_values_filter is NULL OR breakdown_values = breakdown_values_filter)
        AND (selected_interval is NULL OR intervals_from_base = selected_interval)
"""