mirror of https://github.com/PostHog/posthog.git synced 2024-11-30 19:41:46 +01:00
19 Commits

Author SHA1 Message Date
Brett Hoerner
30bafdd382
chore(plugin-server): kafka ack cleanup and metric (#21111)
* cleanup: remove unused team arg from registerLastStep

* cleanup: rename promises to ackPromises to make it more clear that's what they are

* cleanup(plugin-server): make waitForAck explicit/required

* add Kafka produce/ack metrics

* Clarify Kafka produce metric/labels
2024-03-25 13:01:15 +00:00
Paul D'Ambra
31c1cdf301
chore: yeet CH recordings ingestion (#17572)
Removing ClickHouse based recordings

One big yeet for a man, a great yeet for humanity
2023-10-11 14:23:41 +01:00
Harry Waye
f901665bfa
chore: make sure dlqs exist in function tests before consuming (#16550)
In CI it's often the case that we get an error saying the
topic-partition pair doesn't exist. This creates the topic explicitly.
2023-07-13 10:49:01 +01:00
Harry Waye
c85d94266c
chore: disconnect consumer on error handling functional_tests (#16480)
If we don't, we end up with a bunch of errors in the logs about imports
happening after jest tests have finished.
2023-07-11 10:40:40 +00:00
Harry Waye
39224b018e
chore: fix flaky dlq functional tests (#15439)
There is a race condition in these tests where the consumer isn't
consuming in time to pick up bad messages, so we ensure that we set the
offsets to the earliest messages.
2023-05-09 14:36:14 +01:00
Harry Waye
2f9e2928fe
chore(plugin-server): use librdkafka producer everywhere (#15314)
* chore(plugin-server): use librdkafka producer everywhere

We saw some 10x improvements in the throughput for session recordings.
Hopefully there will be more improvements here as well, although it's a
little less clear cut.

I don't try to provide any improvements in guarantees around message
production here.

* we still need to enable snappy for kafkajs
2023-05-04 13:02:44 +00:00
Harry Waye
3f4c0498df
chore(plugin-server): remove recording forwarding (#15230)
We were forwarding events for backwards compatibility with the old
session recording system. Now that we've removed that, we can remove
this code.
2023-04-25 17:03:43 +01:00
Ben White
fdb2c71a39
feat: S3 backed recording ingestion (take 2) (#14864)
2023-04-25 09:43:07 +00:00
Xavier Vello
92d961835b
fix(plugin-server): don't DLQ session recordings with invalid token (#14500)
* fix(plugin-server): don't DLQ session recordings with invalid token

* chore(session-recordings): add test for not DLQing on no token/team

---------

Co-authored-by: Harry Waye <harry@posthog.com>
2023-03-02 13:03:23 +00:00
Harry Waye
01da91d1e2
test: remove explicit dependencies from functional tests (#14471)
* test: remove explicit dependencies from functional tests

* Install pnpm in devcontainer
2023-03-01 12:03:54 +00:00
Harry Waye
a4a3a0c902
test(plugin-server): use librdkafka for functional tests (#14468)
* test(plugin-server): use librdkafka for functional tests

While trying to port the session recordings to use node-librdkafka I
found it useful to first implement it in the functional tests.

* use obj destructuring to make calls more self explanatory
2023-03-01 11:03:13 +00:00
Harry Waye
dcc9acc47d
chore(recordings): remove hub dependency on recordings ingestion (#14418)
* chore(recordings): remove hub dependency on recordings ingestion

Hub is a grab bag of dependencies that are not all required for
recordings ingestion. To keep the recordings ingestion lean, we
remove the hub dependency and use the postgres and kafka client
directly.

This should increase the availability of the session recordings
workload, e.g. it should not go down if Redis or ClickHouse is down.

* fix capabilities call

* reuse clients if available

* wip

* wip

* wip

* fix tests

* fix healthcheck
2023-02-28 10:23:07 +00:00
Alex Gyujin Kim
4ec1990270
chore: add ttl to performance events table (#13921)
2023-01-30 09:37:31 -05:00
Harry Waye
da482a3cba
refactor(recordings): remove session code from event pipeline (#13919)
* refactor(recordings): remove session code from event pipeline

We have moved session recordings to a separate topic and consumer. There
may be session recordings in the old topic, but we divert these to the
new logic for processing them.

* refactor to just send to the new topic!

* fix import

* remove empty line

* fix no team_id test

* implement recordings opt in

* remove old $snapshot unit tests

* remove performance tests

* Update plugin-server/functional_tests/session-recordings.test.ts

Co-authored-by: Tiina Turban <tiina303@gmail.com>

* Update plugin-server/functional_tests/session-recordings.test.ts

Co-authored-by: Tiina Turban <tiina303@gmail.com>

* add back $snapshot format test

* Add comment re functional test assumptions

Co-authored-by: Tiina Turban <tiina303@gmail.com>
2023-01-27 12:36:45 +00:00
Harry Waye
5eba9c55dc
chore(session-recordings): add batch size and timestamp metrics (#13772)
The timestamp is a requirement for the alert defined in
https://github.com/PostHog/charts-clickhouse/pull/669

The batch size metric is added because I'm curious about 1. how many
batches we fetch and 2. what effect setting [KafkaJS's
`minBytes`](https://kafka.js.org/docs/consuming#a-name-options-a-options)
might have on the number and size of batches we fetch, perhaps reducing
the amount of IO we're performing both on consuming and on flushing
the producer wrapper.
2023-01-18 13:28:59 +00:00
Harry Waye
49f1dcb9a2
fix(session-recordings): fix missing distinct_id for session recordings (#13757)
* fix(session-recordings): fix missing distinct_id for session recordings

Previously I'd assumed that the distinct_id would be in the event.
That's not true, rather it is at the top level of the Kafka message
value JSON.

This commit fixes that, and also updates all functional tests to not
include the `distinct_id` within the event body.

* Revert "chore(session-recordings): revert to sending events to old topic (#13756)"

This reverts commit 41874de277.

* add test for session without team_id, only token

* pull out event names as variable

* Change info -> debug otherwise it's very noisy
2023-01-17 13:42:31 +00:00
Harry Waye
51e134e98c
chore(session-recordings): separate topics for events as recordings (#13654)
* chore(session-recordings): separate topics for events as recordings

WIP

* fix tests

* Use simpler consumer for session recordings

* wip

* still batch things by batchSize

* add tests, improve comments

* rename topic var

* push performance_events to session recordings topic also

* Add completely separate consumer for session-recordings

* wip

* use session_id for partition key

* fix test

* handle team_id/token null

* wip

* fix tests

* wip

* use kafka_topic var in logs

* use logger

* fix test

* Fix $performance_event topic usage

* fix tests

* fix check for null/undefined

* Update posthog/api/capture.py

Co-authored-by: Tomás Farías Santana <tomas@tomasfarias.dev>

* Add test for kafka error handling

* Remove falsy teamId check

* fix statsd error

* kick ci

* Use existing getTeamByToken

* remove partition key from recordings

* Make sure producer is connected!

* fix session id kafka key test

* add back throws!

* set producer on each test

* skip flaky test

* add flush error logs

* wait for persons to be ingested

* fix skip

Co-authored-by: Tomás Farías Santana <tomas@tomasfarias.dev>
2023-01-17 12:04:03 +00:00
Harry Waye
63ba5e2fb7
chore: remove usage of delayUntilEventsIngested (#13509)
Its usage is odd and it's not clear what it's doing.
2022-12-29 21:54:44 +00:00
Harry Waye
0869801a8e
chore(plugin-server): split functional tests into feature based files (#13031)
* chore(plugin-server): split functional tests into feature based files

This is intended to make it more obvious what we are testing, and to try
and identify the major themes of the plugin-server functionality.

As a by-product it should make things more parallelizable for jest as
the tests in different files will be isolated, runnable in separate
workers.

* use random api token, avoid db constraints

* make tests silent

* format

* chore: set number of jest workers

These tests should be pretty light given they just hit other APIs and
don't do much themselves. Memory could be an issue on constrained
environments. We shall see.
2022-11-30 12:49:17 +00:00