posthog

mirror of https://github.com/PostHog/posthog.git synced 2024-11-29 03:04:16 +01:00

Author	SHA1	Message	Date
Harry Waye	9dcb8aa030	chore: fix plugin server tests for acitonMatcher (#16548 ) * chore: fix plugin server tests for acitonMatcher I broke these but there was an issue with CI so they were merges broken but now maybe they are fixed? * more fixes * wip * wip * wip	2023-07-13 10:56:43 +01:00
Ben White	f3fedaa91d	fix: Session manager cleanup and deletion (#16521 )	2023-07-13 11:48:49 +02:00
Paul D'Ambra	78a4ade041	chore: switch back to sync commit and gauge actual offset committed (#16540 )	2023-07-12 21:30:09 +01:00
Ben White	cd2f7f398a	feat: Optimised blob storage team loading (#16486 )	2023-07-12 10:21:06 +02:00
Ben White	0a84d018fc	feat: Reduce offset tracking (#16482 )	2023-07-11 16:56:45 +02:00
Harry Waye	634472c3ad	fix: plugin mode string check (#16490 ) * fix: plugin mode string check Previously we had to keep these in sync, but now we can just use the PluginServerMode enum directly. * add test	2023-07-11 14:09:28 +00:00
Ben White	5a636f6bd1	feat: Optimise resource usage for blob ingester (#16478 )	2023-07-11 15:11:36 +02:00
Paul D'Ambra	d068ba9410	feat: don't track byte size, it isn't useful enough (#16481 ) * feat: track file line count not byte size * just remove it	2023-07-11 13:23:01 +01:00
Harry Waye	8adac54130	chore(webhooks): remove abstractions from webhook consumer logic (#16418 ) * chore(webhooks): remove abstractions from webhook consumer logic Previously we were jumping through a few hoops to make webhook calls e.g. still using the piscina abstraction, still using the runner code. This commit removes those abstractions while still maintaining the existing functionality wrt error handling and metrics gathering. I'll leave further refactoring of the webhook consumer code to a separate PR. For example, moving the statsd metrics to be based on OpenMetrics instead. And further adding some tracing around key parts of the webhook matching and firing logic. * fix typing * fix typing * fix typing * fix unit tests * fix tests * chore(plugin-server): simplify action manager deps Previously we were passing in the kitchen sync, but the only dependency is postgres. This should make it easier to e.g. refactor to not need to load the kitchen sync on some deployments. * chore(plugin-server): simplify hook commander deps Previously we passed in DB which is a lot of stuff. Now we just pass in the postgres pool. * fix import	2023-07-10 11:02:04 +00:00
Harry Waye	12d0a29957	chore(plugin-server): simplify action matcher deps (#16429 ) * chore(plugin-server): simplify action matcher deps Specifically this only depends on postgres, so passing in the `DB` object is unnecessary. This should make refactors easier to e.g. only load the required dependencies when they are needed. * pass only postgres to ActionMatcher	2023-07-07 15:02:46 +01:00
Paul D'Ambra	efb77278b4	feat: async commits in the blob ingester (#16387 ) * feat: async commits in the blob ingester * Fix	2023-07-05 14:03:29 +01:00
Tiina Turban	a45c51c8c9	feat: Break up webhooks from onEvent (#16316 )	2023-07-05 12:29:28 +02:00
Tiina Turban	876ce6a43c	fix: Nuke siteUrlManager reducing PG load (#16290 )	2023-07-05 12:19:32 +02:00
Paul D'Ambra	0130bea5be	chore: add some realtime snapshot playback observability (#16363 ) * chore: add some realtime snapshot playback observability * fixes	2023-07-04 12:32:16 +00:00
Harry Waye	055776d461	chore(exports): try calling heartbeat a bit more often (#16295 ) * chore(exports): try calling heartbeat a bit more often Looks like we end up rebalancing often. Possibly because we're not sending the heartbeats in time and the session timing out. * wip * fix tests	2023-06-29 13:55:31 +01:00
Paul D'Ambra	8a29cc679f	feat: estimate size of session on ingestion (#16241 ) * migration to add the new column * and populate it * fix * Update query snapshots * test assertions --------- Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>	2023-06-27 08:12:42 +01:00
Paul D'Ambra	4d0b51f764	fix: fake timer blob consumer tests (#16245 ) * fix: fake timer blob consumer tests * Update plugin-server/tests/main/ingestion-queues/session-recording/session-recordings-blob-consumer.test.ts * remove unnecessary advance time	2023-06-26 12:52:01 +01:00
Paul D'Ambra	2623a77c05	chore: skip test to unblock CI (#16243 ) YOLO to unblock CI	2023-06-26 12:04:13 +01:00
Ben White	34f6dde752	feat: New offset committing logic (#16220 )	2023-06-23 11:18:35 +00:00
Ben White	ffdda1d392	feat: Realtime playback for new ingestion flow (#15627 )	2023-06-23 12:39:07 +02:00
Paul D'Ambra	6480a2d30f	fix: high-water mark committing before updating (#16202 ) * fix: high water mark committing before updating * make it safe to set as well as get before getAll has run * hand the shopkeeper my wallet 😂 * 🤦	2023-06-22 16:10:33 +00:00
Paul D'Ambra	756e2ed91a	feat: Do not re process s3 writes (#15777 )	2023-06-22 11:44:56 +02:00
Paul D'Ambra	ccbbee93e1	feat: control accepted session replay dates (#16142 )	2023-06-20 15:33:16 +01:00
Xavier Vello	3414f8ebbc	chore(ingestion): re-introduce rdkafka consumer alongside kafkajs (#16048 )	2023-06-20 14:29:26 +02:00
Harry Waye	924deae8dc	fix(ingestion): add DLQ for non-retriable errors (#16124 ) * fix(ingestion): add DLQ for non-retriable errors This is due to https://posthog.slack.com/archives/C0185UNBSJZ/p1687006425094159 which is causing some lag on ingestion. * fix error handleing * fix tests	2023-06-17 22:14:00 +01:00
Ben White	27b75226b0	feat: Completely separate ingestion for replay events (#16024 ) --------- Co-authored-by: Paul D'Ambra <paul@posthog.com> Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>	2023-06-15 14:13:28 +01:00
Paul D'Ambra	bdc346de41	feat: summarise console logs too (#15954 ) We collect console logs in session recordings but you can't filter to find recordings that have them. Worse, when we move to storing recordings in blob storage it would be impossible to filter for them. Changes adds new columns to the session replay summary table that will let us add counts of levels log, warn, and error from console logs collected with recordings alters the recordings-ingestion consumer to count those three levels of log	2023-06-14 15:26:34 +01:00
Tiina Turban	7376e4fdff	feat: Person creation and update retries (#15925 )	2023-06-12 16:07:13 +02:00
Michael Matloka	1aee725409	feat(webhooks): Support person display name preferences and nested properties (#15882 ) * Use same logic for `[person]` webhook token as person display in the app * Allow accessing nested properties in webhook message * Update hooks.test.ts * Fix team fetching test * Update query snapshots * Update query snapshots --------- Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>	2023-06-05 16:25:39 +00:00
Xavier Vello	6ff188c09e	perf(ingestion): don't block for each event's kafka produce (#15835 )	2023-06-01 15:05:21 +02:00
Paul D'Ambra	dd796c8cab	fix: only track timestamps of completed chunks (#15801 )	2023-05-30 22:00:24 +01:00
Paul D'Ambra	2e2721cb86	fix: idle partitions never flush (#15776 )	2023-05-30 17:35:21 +01:00
Xavier Vello	bece269c32	perf(ingestion): revert rdkafka consumer for both ingestion queues (#15711 ) * Revert "perf(ingestion): use rdkafka consumer for both ingestion queues (#15695)" This reverts commit `fea9e4d77c`. * format * fix split test * no really, fix the test --------- Co-authored-by: Harry Waye <harry@posthog.com>	2023-05-30 17:02:11 +01:00
Xavier Vello	1ea6619e3a	perf(ingestion): re-add: don't batch events by distinct_id when consuming from overflow (#15785 )	2023-05-30 15:17:01 +02:00
Xavier Vello	81c4214f5b	perf(ingestion): revert: don't batch events by distinct_id when consuming from overflow (#15782 ) Revert "perf(ingestion): don't batch events by distinct_id when consuming from overflow (#15744)" This reverts commit `89bd1d30aa`.	2023-05-30 11:39:27 +01:00
Paul D'Ambra	42b54af3bd	chore: very specific logging (#15769 ) * chore: add some very specific logging to figure out the impossible * even more very specific logging * track offsets too * even more careful	2023-05-29 18:50:22 +01:00
Xavier Vello	89bd1d30aa	perf(ingestion): don't batch events by distinct_id when consuming from overflow (#15744 )	2023-05-26 11:18:31 +02:00
Michael Matloka	39ad3cd68c	feat(actions): Support "Link target contains/matches regex" (#15535 ) * Add `ActionStep.href_matching` + `ActionStep.text_matching` * Use `href_matching` and `text_matching` in matching * Show new matching options in the UI * Update query snapshots * Update UI snapshots for `chromium` (1) * Update UI snapshots for `chromium` (2) * Add support in the API * Fix `LemonLabel` overflow * Add support in the toolbar * Update plugin server tests * Add Django support * Update query snapshots * Add Django test * Don't italicize text input placeholder * Update UI snapshots for `chromium` (2) * Update UI snapshots for `chromium` (1) * Update query snapshots * Fix typing * Update query snapshots * Fix typing more * Update query snapshots * Update UI snapshots for `chromium` (2) * Update UI snapshots for `chromium` (1) * Update query snapshots --------- Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>	2023-05-25 16:35:02 -07:00
Paul D'Ambra	0edad3ca33	chore: tidy up after blob tests (#15701 )	2023-05-25 12:02:08 +01:00
Ben White	57e7ed7cfa	fix: Still commit offsets for empty buffer (#15710 )	2023-05-25 08:45:59 +00:00
Xavier Vello	fea9e4d77c	perf(ingestion): use rdkafka consumer for both ingestion queues (#15695 )	2023-05-25 10:42:55 +02:00
Xavier Vello	d389cf0ead	chore(ingestion): remove old batching code (#15689 )	2023-05-25 09:58:55 +02:00
Paul D'Ambra	09164ab35d	fix: commit sync and commit one more (#15703 )	2023-05-25 08:43:27 +02:00
Ben White	9dd75c770c	fix: Change the offset tracking logic for testing purposes (#15697 )	2023-05-24 15:24:40 +00:00
Paul D'Ambra	631f799c6a	fix: remove empty session managers (#15683 )	2023-05-24 13:04:30 +00:00
Tiina Turban	bf535847a5	chore: nuke person redis cache (#15458 )	2023-05-24 13:46:12 +02:00
Xavier Vello	6cb7e53893	chore(ingestion): compute and report actual parallelism (#15688 )	2023-05-24 13:31:24 +02:00
Paul D'Ambra	f3f6d9a77f	fix: less blocking chunks (#15687 ) * fix: less blocking chunks * fix	2023-05-24 10:12:04 +01:00
Paul D'Ambra	16d7a605dd	fix: pending chunks idle test (#15681 ) * fix: pending chunks idle test * remove dangling test * fix	2023-05-23 18:28:54 +01:00
Paul D'Ambra	7858d5ad4c	chore: worrying at ingestion still (#15673 ) * less pending chunk logging * the flush threshold multiplies is too confusing to operate * must provide all chunk offsets and the correct chunks when processing pending chunks	2023-05-23 13:48:55 +00:00
Paul D'Ambra	e6551c31c2	feat: key blobs by event times (#15640 ) * feat: key blobs by event times * remove TODO * softly softlly * Update plugin-server/src/main/ingestion-queues/session-recording/blob-ingester/session-manager.ts Co-authored-by: Ben White <ben@posthog.com> --------- Co-authored-by: Ben White <ben@posthog.com>	2023-05-22 16:36:30 +01:00
Paul D'Ambra	10e34efd93	feat: handle chunks less aggresively on flush (#15628 ) Co-authored-by: Ben White <ben@posthog.com>	2023-05-22 14:04:00 +01:00
Paul D'Ambra	473b31bf58	feat: back off buffer age threshold as lag grows (#15626 ) * feat: back off buffer age threshold as lag grows * but not _forever_ * max backoff configurable	2023-05-19 17:20:09 +01:00
Xavier Vello	2f27f43406	perf(ingestion): port overflow logic into eachBatchParallelIngestion (#15621 )	2023-05-19 16:37:27 +02:00
Paul D'Ambra	5f936b45f0	feat: be more tolerant of lag when bundling (#15596 )	2023-05-19 13:48:06 +01:00
Xavier Vello	4f8b981057	perf(ingestion): increase event processing concurrency (#15612 )	2023-05-19 08:46:34 +00:00
Paul D'Ambra	9a7bcfce5c	chore: track blocking session when committing offsets (#15606 ) * chore: track blocking session when committing offsets * track lowest session id even after removal	2023-05-18 11:06:15 +01:00
Paul D'Ambra	aa4ec230c2	fix: respect opt out (#15583 )	2023-05-17 12:19:02 +01:00
Xavier Vello	10f1bcc28d	chore(ingestion): dont wait on logs to be persisted to Kafka (#15579 ) * chore(ingestion): dont wait on logs to be persisted to Kafka With how bad concurrency is atm re. batching, we end up waiting a long time on logs to be persisted to Kafka. We don't need to guarantee logs so instead we let these be async. We still want to await for the message to have been handled by librdkafka queuing internals though so we still await but ask that we resolve as soon as we've handed off the message to librdkafka. --------- Co-authored-by: Harry Waye <harry@posthog.com>	2023-05-17 10:21:26 +00:00
Paul D'Ambra	d068f2fd53	fix: ensure that offsets are sorted (#15580 )	2023-05-17 11:09:09 +01:00
Paul D'Ambra	4509f98628	feat: remove summary config guard (#15572 ) * feat: remove summary config guard * fix	2023-05-17 09:14:48 +01:00
Paul D'Ambra	d1c2fa84fa	fix: support double decoding base64 on decompress (#15542 )	2023-05-15 13:48:17 +01:00
Paul D'Ambra	e962afabe1	fix: from logs to changes (#15539 ) * fix: from logs to changes * Update plugin-server/src/main/ingestion-queues/session-recording/blob-ingester/session-manager.ts * use the key from the map instead of calculating it * 🙈	2023-05-14 11:59:21 +01:00
Paul D'Ambra	9305651289	chore: worrying at what is happening (#15479 ) * chore: worrying at why the blob ingester gets unhappy * log when file deletion succeeds * can you wait for e2e ci step without a specified build name * wait on build posthog cloud? * disable the step for now * rugh	2023-05-11 10:57:39 +01:00
Harry Waye	616389713b	revert: use rdkafka consumer for analytics ingestion and onEvent (#15469 ) Revert "chore: use rdkafka consumer for analytics ingestion and onEvent (#15432)" This reverts commit `85bb582cee`.	2023-05-10 12:22:38 +00:00
Harry Waye	85bb582cee	chore: use rdkafka consumer for analytics ingestion and onEvent (#15432 ) This _should_ give us better performance and reliability, but it's hard to tell without a lot of testing. Will monitor closely on rollout. Note that this will require a delete on the old consumer members as they are using round eager robin partition strategy, whereas this is using the cooperative sticky partition strategy. librdkafka doesn't support mixing the two, unlike the Java Kafka Client. --------- Co-authored-by: Tiina Turban <tiina303@gmail.com>	2023-05-10 11:15:02 +00:00
Ben White	6378d66d30	fix: Correct decision for oldest timestamp in blob consumer (#15452 )	2023-05-09 15:40:18 +00:00
Paul D'Ambra	067d73cb4f	feat: write recording summary events (#15245 ) Problem see #15200 (comment) When we store session recording events we materialize a lot of information using the snapshot data column. We'll soon not be storing the snapshot data so won't be able to use that to materialize that information, so we need to capture it earlier in the pipeline. Since this is only used for searching for/summarizing recordings we don't need to store every event. Changes We'll push a summary event to a new kafka topic during ingestion. ClickHouse can ingest from that topic into an aggregating merge tree table. So that we store (in theory, although not in practice) only one row per session. add config to turn this on and off by team in plugin server add code behind that to write session recording summary events to a new topic in kafka add ClickHouse tables to ingest and aggregate those summary events	2023-05-09 14:41:16 +00:00
Tiina Turban	4cd5447f0e	chore: Nuke buffer pipeline code (#15404 )	2023-05-09 14:50:18 +02:00
Paul D'Ambra	92b04ae84b	fix: flush sessions when idle not when buffer has reached an age (#15405 ) * get rid of the annoying type errors * fix: flush sessions when idle not based on buffer file age * inline * push timestamp into metadata arg	2023-05-05 17:45:12 +01:00
Harry Waye	cff0dab1ee	fix(plugin-server): send headers as well with KafkaProducerWrapper (#15382 ) I forgot to pass this through. I think we nuked the buffer tests so was only apparent in production :grimace:	2023-05-04 15:28:53 +00:00
Harry Waye	2f9e2928fe	chore(plugin-server): use librdkafka producer everywhere (#15314 ) * chore(plugin-server): use librdkafka producer everywhere We say some 10x improvements in the throughput for session recordings. Hopefully there will be more improvements here as well, although it's a little less clear cut. I don't try to provide any improvements in guarantees around message production here. * we still need to enable snappy for kafkajs	2023-05-04 13:02:44 +00:00
Tiina Turban	a5544cf7e4	feat: Async handlers use person info from event (#15307 )	2023-05-04 13:25:56 +02:00
Ben White	83d57c5d77	fix: Fixed the logic for assigning and revoking partitions (#15350 ) * fix: Fixed the logic for assigning and revoking partitions * Fix * reverse the offset manager revoke logic --------- Co-authored-by: Paul D'Ambra <paul@posthog.com>	2023-05-03 17:42:54 +01:00
Harry Waye	7ba6fa7148	chore(plugin-server): remove piscina workers (#15327 ) * chore(plugin-server): remove piscina workers Using Piscina workers introduces complexity that would rather be avoided. It does offer the ability to scale work across multiple CPUs, but we can achieve this via starting multiple processes instead. It may also provide some protection from deadlocking the worker process, which I believe Piscina will handle by killing worker processes and respawning, but we have K8s liveness checks that will also handle this. This should simplify 1. prom metrics exporting, and 2. using node-rdkafka. * remove piscina from package.json * use createWorker * wip * wip * wip * wip * fix export test * wip * wip * fix server stop tests * wip * mock process.exit everywhere * fix health server tests * Remove collectMetrics * wip	2023-05-03 14:42:16 +00:00
Harry Waye	96fe16fd3c	chore(recordings): use cooperative-sticky rebalance strategy (#15260 ) Revert "revert(recordings): use cooperative-sticky rebalance strategy (#15211)" This reverts commit `a40f01138e`.	2023-04-26 13:09:13 +00:00
Ben White	fdb2c71a39	feat: S3 backed recording ingestion (take 2) (#14864 )	2023-04-25 09:43:07 +00:00
Harry Waye	a40f01138e	revert(recordings): use cooperative-sticky rebalance strategy (#15211 ) Revert "chore(recordings): use cooperative-sticky rebalance strategy (#15197)" This reverts commit `3eddb96b9b`.	2023-04-24 15:06:33 +00:00
Harry Waye	3eddb96b9b	chore(recordings): use cooperative-sticky rebalance strategy (#15197 ) * chore(recordings): use cooperative-sticky rebalance strategy This should make rebalances and lag during deploys a little less painful. I'm setting this as the globally used strategy, when we e.g. want to use another strategy for a specific consumer group, we can make this configurable. * disable rebalance_callback * use node-rdkafka-acosom fork instead, for cooperative support	2023-04-24 13:25:24 +00:00
Xavier Vello	5058f71ccf	chore(ingestion): move mmdb database in worker process (#15173 )	2023-04-24 11:34:52 +02:00
Harry Waye	67465fce9c	chore(recordings): Add librdkafka to recordings consumer (#15091 ) * chore(recordings): Add librdkafka to recordings consumer This is the sister PR to the change to use the librdkafka producer in the recordings consumer. Things of interest here: 1. we use offset auto commit 2. we handle storing the offset ourselves, after the message has been processed 3. we do everything concurrently 4. we implement back pressure based on the number of messages in the flight * Update plugin-server/src/kafka/admin.ts Co-authored-by: Xavier Vello <xavier.vello@gmail.com> * Update plugin-server/src/kafka/admin.ts Co-authored-by: Xavier Vello <xavier.vello@gmail.com> * Update plugin-server/src/kafka/admin.ts Co-authored-by: Xavier Vello <xavier.vello@gmail.com> * Update plugin-server/src/kafka/consumer.ts Co-authored-by: Xavier Vello <xavier.vello@gmail.com> * Update plugin-server/src/kafka/consumer.ts Co-authored-by: Xavier Vello <xavier.vello@gmail.com> * add default queued values * clarify linger --------- Co-authored-by: Xavier Vello <xavier.vello@gmail.com>	2023-04-19 14:42:34 +00:00
Harry Waye	997b3ff9dd	fix(plugin-server): only set ssl config when defined (#15071 ) * Revert "Revert "perf(recordings): use node-librdkafka for ingester production" (#15069)" This reverts commit `ac5e084f48`. * fix(plugin-server): only set ssl config when defined Hopefully this means it will use the global CA bundle. * hack: enable debug logs * honor KAFKAJS_LOG_LEVEL envvar * add SegfaultHandler * disable ssl verification * debug -> info * only log brokers * Revert "add SegfaultHandler" This reverts commit `b22f40b802`. --------- Co-authored-by: Xavier Vello <xavier@posthog.com>	2023-04-13 16:52:51 +01:00
Xavier Vello	ac5e084f48	Revert "perf(recordings): use node-librdkafka for ingester production" (#15069 ) Revert "perf(recordings): use node-librdkafka for ingester production (#15041)" This reverts commit `7f852ab618`.	2023-04-13 11:23:22 +01:00
Xavier Vello	7f852ab618	perf(recordings): use node-librdkafka for ingester production (#15041 )	2023-04-13 11:55:16 +02:00
Xavier Vello	adc0acc4bc	chore(recordings): revert: use node-librdkafka for ingester production (#15032 ) Revert "chore(recordings): use node-librdkafka for ingester production (#14460)" This reverts commit `c34979853e`.	2023-04-11 17:10:49 +00:00
Harry Waye	c34979853e	chore(recordings): use node-librdkafka for ingester production (#14460 ) Previously we've been using the KafkaJS Producer with a wrapper around it to handle batching. There are a number of issues with the batching implementation e.g. not having a way to provide guarantees on delivery and rather than fix that, we can simply use the librdkafka Producer which is a lot more mature and battle-tested.	2023-04-11 16:44:39 +01:00
Xavier Vello	2a5f6d3691	fix(tests): make getEventsByPerson output stable to avoid flakes (#15009 )	2023-04-07 12:00:11 +00:00
Tiina Turban	0163b63dad	fix: Remove sentry noise for missing group_type (#14825 )	2023-03-20 16:16:54 +01:00
Tiina Turban	f065757ae7	revert "chore: person props last-op/ts clean-up" (#14741 ) Revert "chore: person props last-op/ts clean-up (#14316)" This reverts commit `4bf7ddd8e9`.	2023-03-14 16:36:44 +00:00
Tiina Turban	4bf7ddd8e9	chore: person props last-op/ts clean-up (#14316 )	2023-03-14 16:35:53 +01:00
Xavier Vello	504a3edab8	chore: rename runLightweightCaptureEndpointEventPipeline to runEventPipeline (#14365 )	2023-03-14 15:38:30 +01:00
Tiina Turban	d35239c658	feat: Add merge_dangerously event (#14625 )	2023-03-08 14:37:48 +01:00
Harry Waye	d840c5ac09	chore: fix flakey session recordings error case test (#14562 ) * chore: fix flakey session recordings error case test We need to make sure that we hit the case where the events are produced for ClickHouse ingestion. The signature of the `queueMessage` function is interesting in that it's behaviour depends on some heiuristics as to if it will flush and therefore reject or not. I would like to change this behavior but my preference would be to update to use rdkafka and update to have more sensible behaviour then. * Add call expects	2023-03-06 13:01:23 +00:00
Harry Waye	fdf97c9a92	chore(kafka): ensure retry on kafkajs produce failure (#14543 ) * chore(kafka): ensure retry on kafkajs produce failure This is a fix to ensure that we do not simply drop events when Kafka is e.g. down. We were previously catching the KafkaJSError but it seems the errors are always run using the `retry` function, which means we always get a KafkaJSNumberOfRetriesExceeded error. * wip * use real timers	2023-03-06 10:37:16 +00:00
Harry Waye	dcc9acc47d	chore(recordings): remove hub dependency on recordings ingestion (#14418 ) * chore(recordings): remove hub dependency on recordings ingestion Hub is a grab bag of depencencies that are not all required for recordings ingestion. To keep the recordings ingestion lean, we remove the hub dependency and use the postgres and kafka client directly. This should increase the availability of the session recordings workload, e.g. it should not go down it Redis or ClickHouse is down. * fix capabilities call * reuse clients if available * wip * wip * wip * fix tests * fix healthcheck	2023-02-28 10:23:07 +00:00
Raquel Smith	bc9b449ae5	fix: set msclkid on person like gclid (#14386 ) * set msclkid on person like gclid * fix typos (thx copilot)	2023-02-24 19:35:35 +00:00
Xavier Vello	25cb653c1b	feat(ingestion): run all analytic events through populateTeamDataStep (#14341 ) * team-manager: expire negative lookups after 5 minutes, improve docs * populateTeamDataStep: don't drop token, keep team_id from capture if present, report results * ingestEvent: run all analytic events through runLightweightCaptureEndpointEventPipeline * continue accepting events with no token but a team_id	2023-02-22 14:24:05 +01:00
Tiina Turban	5f24e281b1	fix: groupidentify to always update props and ignore timestamps (#14240 )	2023-02-21 17:07:47 +01:00
Tomás Farías Santana	4f2412ea9a	fix(slowlane-ingestion): Access db attribute instead of hub (#14292 )	2023-02-17 15:27:36 +01:00
Tomás Farías Santana	66750cf2cc	refactor: Set key to null during batching to process in parallel (#14278 ) * fix(slowlane-ingestion): Clarify in the warning we are still processing * docs(slowlane-ingestion): Clarify we are re-producing when running with ingestionOverflow enabled * refactor(slowlane-ingestion): Set key to null during batching to process in parallel * refactor(slowlane-ingestion): Simplify batching logic and send warning on eachMessage * fix: Add missing whitespace in comment Co-authored-by: Tiina Turban <tiina303@gmail.com> * test(slowlane-ingestion): Assert event pipeline doesn't run if overflowing * fix: Check for batch length bigger than batchSize Co-authored-by: Tiina Turban <tiina303@gmail.com> * refactor(slowlane-ingestion): Raise warning on overflow consumer instead * test(slowlane-ingestion): Add tests for overflow consumer * refactor(slowlane-ingestion): Use groupIntoBatches utility in overflow consumer --------- Co-authored-by: Tiina Turban <tiina303@gmail.com>	2023-02-17 10:42:58 +01:00
Tomás Farías Santana	1e94d8e138	feat(ingestion-slowlane): Re-route events in plugin-server on capacity exceeded (#14211 ) * feat(ingestion-slowlane): Add token-bucket utility * feat(ingestion-slowlane): Re-route overflow events * fix: Import missing stringToBoolean * fix(ingestion-slowlane): Flip around kafka topics according to mode * refactor(ingestion-slowlane): Use dash instead of underscore in filename * fix(ingestion-slowlane): Do not increase tokens beyond bucket capacity * feat(ingestion-slowlane): Add ingestion-overflow mode/capability/consumer * feat(ingestion-slowlane): Add ingestion warning for capacity overflow * test(ingestion-slowlane): Add test for ingestion of overflow events * fix(ingestion-slowlane): Rate limit warnings to 1 per hour * test(ingestion-slowlane): Add a couple more tests for overflow re-route * fix(slowlane-ingestion): Look at batch topic to determine message topic * refactor(slowlane-ingestion): Use refactored consumer model * fix(slowlane-ingestion): Undo topic requirement in eachMessageIngestion * refactor(slowlane-ingestion): Only produce events if ingestionOverflow is also enabled * refactor(slowlane-ingestion): Use an env variable to determine if ingestionOverflow is enabled * chore(slowlane-ingestion): Add a comment explaining env variable	2023-02-16 14:30:13 +01:00
Harry Waye	ce777f7efa	fix: make sort order deterministic in property definitions manager test (#14266 ) * fix: make sort order deterministic in property definitions manager test I'd added an order in a previous PR, but its a UUID. * fix sql statements	2023-02-16 12:10:53 +00:00
Harry Waye	f62041a833	refactor(ingestion): pull out topic/groupid from kafka-queue (#14249 ) * refactor(ingestion): pull out topic/groupid from kafka-queue We have `IngestionConsumer` at the moment that holds a lot of complexity in it regarding topics/groupid/message handlers. This is a step towards moving that logic out of the `IngestionConsumer`, and making the top level of the pluginsServer simpler to reason about. * wip * wip * wip * wip	2023-02-15 14:16:23 +00:00
Harry Waye	235b379707	refactor(plugin-server): remove `nextStep` functionality from pipeline (#13964 ) This is intended to make the pipeline a little more readable by moving the control flow out of the steps and into the runner. It also makes it easier to add new steps to the pipeline. Co-authored-by: Tiina Turban <tiina303@gmail.com>	2023-01-31 10:21:05 +00:00
Harry Waye	da482a3cba	refactor(recordings): remove session code from event pipeline (#13919 ) * refactor(recordings): remove session code from event pipeline We have moved session recrodings to a separate topic and consumer. There may be session recordings in the old topic, but we divert these to the new logic for processing them. * refactor to just send to the new topic! * fix import * remove empty line * fix no team_id test * implement recordings opt in * remove old $snapshot unit tests * remove performance tests * Update plugin-server/functional_tests/session-recordings.test.ts Co-authored-by: Tiina Turban <tiina303@gmail.com> * Update plugin-server/functional_tests/session-recordings.test.ts Co-authored-by: Tiina Turban <tiina303@gmail.com> * add back $snapshot format test * Add comment re functional test assumptions Co-authored-by: Tiina Turban <tiina303@gmail.com>	2023-01-27 12:36:45 +00:00
Tiina Turban	3cdad732fd	feat: PoE placeholder for ingestion and testing enabling (#13881 )	2023-01-26 15:18:25 +01:00
Karl-Aksel Puulmann	8930d9e460	feat: capture person/group property definitions (2/2) (#13816 ) * feat: ingest person and group property definitions 2/2 * Update test	2023-01-20 15:42:00 +02:00
Karl-Aksel Puulmann	15b6ade4a0	feat: capture person/group property definitions (1/2) (#13809 ) * migration for person/group property support in property definitions table * Use database default * Validate correct constraint * Ingest person and group type property definitions * Exclude person/group type definitions from API * Update property definitions test * Ignore $groups * Add new unique index which accounts for type and group_type_index * Run new code only in test * Ignore errors from propertyDefinitionsManager which may occur due to migrations * Update constraint name * Update test describe * ON CONFLICT based on the index expression * Add a -- not-null-ignore * Combine migrations * Remove some test code temporarily * fixup latest_migrations	2023-01-20 14:47:32 +02:00
Harry Waye	51e134e98c	chore(session-recordings): separate topics for events as recordings (#13654 ) * chore(session-recordings): separate topics for events as recordings WIP * fix tests * Use simpler consumer for session recordings * wip * still batch things by batchSize * add tests, improve comments * rename topic var * push performance_events to session recordings topic also * Add completely separate consumer for session-recordings * wip * use session_id for partition key * fix test * handle team_id/token null * wip * fix tests * wip * use kafka_topic var in logs * use logger * fix test * Fix $performance_event topic usage * fix tests * fix check for null/undefined * Update posthog/api/capture.py Co-authored-by: Tomás Farías Santana <tomas@tomasfarias.dev> * Add test for kafka error handling * Remove falsy teamId check * fix statsd error * kick ci * Use existing getTeamByToken * remove partition key from recordings * Make sure producer is connected ! * fix session id kafka key test * add back throws! * set producer on each test * skip flaky test * add flush error logs * wait for persons to be ingested * fix skip Co-authored-by: Tomás Farías Santana <tomas@tomasfarias.dev>	2023-01-17 12:04:03 +00:00
Ben White	cb7e7d5e5e	feat: Added performance API (#13452 )	2023-01-06 09:51:51 +01:00
Harry Waye	a27d452171	feat(person-on-events): add option to delay all events (#13505 ) * feat(person-on-events): add option to delay all events This change implements the option outlined in https://github.com/PostHog/product-internal/pull/405 Here I do not try to do any large structural changes to the code, I'll leave that for later although it does mean the code has a few loose couplings between pipeline steps that probably should be strongly coupled. I've tried to comment these to try to make it clear about the couplings. I've also added a workflow to run the functional tests against both configurations, which we can remove once we're happy with the new implementation. Things of note: 1. We can't enable this for all users yet, not without the live events view and not without verifying that the buffer size is sufficiently large. We can however enable this for the test team and verify that it functions as expected. 2. I have not handled the case mentioned in the above PR regarding guarding against processing the delayed events before all events in the delay window have been processed. wip test(person-on-events): add currently failing test for person on events This test doesn't work with the previous behaviour of the person-on-events implementation, but should pass with the new delay all events behaviour. * add test for KafkaJSError behaviour * add comment re delay * add test for create_alias * chore: increase exports timeout It seems to fail in CI, but only for the delayed events enabled tests. I'm not sure why, but I'm guessing it's because the events are further delayed by the new implementation. * chore: fix test * add test for ordering of person properties * use ubuntu-latest-8-cores runner * add tests for plugin processEvent * chore: ensure plugin processEvent isn't run multiple times * expand on person properties ordering test * wip * wip * add additional test * change fullyProcessEvent to onlyUpdatePersonIdAssociations * update test * add test to ensure person properties do not propagate backwards in time * simplicfy person property tests * weaken guarantee in test * chore: make sure we don't update properties on the first parse We should only be updating person_id and asociated distinct_ids on first parse. * add tests for dropping events * increase export timeout * increase historical exports timeout * increase default waitForExpect interval to 1 second	2023-01-05 16:38:43 +00:00
Tiina Turban	a051f37a7a	feat(plugin-server): track person creation event uuid (#13102 )	2022-12-05 20:11:23 +01:00
Harry Waye	1e6c062095	feat(plugin-server): distribute scheduled tasks i.e. runEveryX (#13124 ) * chore(plugin-server): disrtibute scheduled tasks Changes I've made here from the original PR: 1. add some logging of task run times 2. add concurrency, except only one task of a plugin will run at a time 3. add a timeout to task run times This reverts commit `23db43a0dc`. * chore: add timings for scheduled tasks runtime * chore: add timeouts for scheduled tasks * chore: clarify duration unit * chore: deduplicate tasks in a batch, add partition concurrency * chore: add flag to switch between old and new behaviour This defaults to new, but can be set to old by setting environment variable `USE_KAFKA_FOR_SCHEDULED_TASKS=false` * fix tests * enable USE_KAFKA_FOR_SCHEDULED_TASKS in tests	2022-12-05 12:30:52 +00:00
Harry Waye	23db43a0dc	Revert "Revert "Revert "fix(plugin-server): ignore old cron tasks from graphile-worker """ (#13107 ) Revert "Revert "Revert "fix(plugin-server): ignore old cron tasks from graphile-worker "" (#13100)" This reverts commit `8eec4c9346`.	2022-12-03 00:13:27 +00:00
Harry Waye	8eec4c9346	Revert "Revert "fix(plugin-server): ignore old cron tasks from graphile-worker "" (#13100 ) Revert "Revert "fix(plugin-server): ignore old cron tasks from graphile-worker " (#13095)" This reverts commit `5634ab4d7f`.	2022-12-02 18:11:17 +00:00
Harry Waye	5634ab4d7f	Revert "fix(plugin-server): ignore old cron tasks from graphile-worker " (#13095 ) Revert "fix(plugin-server): ignore old cron tasks from graphile-worker (#13094)" This reverts commit `b079a8cc8e`.	2022-12-02 16:28:30 +00:00
Harry Waye	b079a8cc8e	fix(plugin-server): ignore old cron tasks from graphile-worker (#13094 ) * Revert "Revert "feat(plugin-server): distribute scheduled tasks i.e. runEveryX" (#13087)" This reverts commit `78e6f48660`. * fix(plugin-server): ignore old cron tasks from graphile-worker When we are backed up on jobs, we end up still creating tasks in the graphile-worker job table, i.e. there is no backpressure. This change makes us skip over old tasks, so that we don't get backed up. * fix tests	2022-12-02 15:20:16 +00:00
Harry Waye	78e6f48660	Revert "feat(plugin-server): distribute scheduled tasks i.e. runEveryX" (#13087 ) Revert "feat(plugin-server): distribute scheduled tasks i.e. runEveryX (#13037)" This reverts commit `45912e839c`.	2022-12-02 10:40:58 +00:00
Harry Waye	45912e839c	feat(plugin-server): distribute scheduled tasks i.e. runEveryX (#13037 ) * feat(plugin-server): distribute scheduled tasks i.e. runEveryX At the moment we only run on which ever Graphile worker node picks up the scheduled tasks. Tasks are run in sequence, running through each of the associated pluginConfigIds. We tried to spread the workload by creating a Graphile Worker job for each pluginConfigId, but this caused a lot of load on the Graphile Worker database. One thing this PR doesn't tackle is what happens if we end up having the jobs back up. There is probably some logic we should add to avoid really old scheduled tasks from running. * wip * wip * fix tests * fix tests * types * update unit test * add key * fix order * Update plugin-server/src/main/ingestion-queues/scheduled-tasks-consumer.ts * chore: skip stale scheduled tasks * update comments * add statsd counter	2022-12-02 09:42:55 +00:00
Yakko Majuri	aa89545a66	fix(ingestion): do not create or update person from $snapshot events (#13048 ) * fix(ingestion): do not create or update person from events * fix tests	2022-12-01 10:37:53 -03:00
Yakko Majuri	90f1b16285	feat(ingestion): remove postgres dependency from capture endpoint (#12802 ) * add support for token field in kafka message * formPipelineEvent * rename pipeline files according to new order * wip team_id and anonymize ips * conditional handlers and tests * some plugin server fixes * fix capture bug * fix * more fixes * fix capture tests * pipeline update * fix + investigate database resets * fix import order * testing and typing updates * add test for capture endpoint * testing * python typing * plugin server test * functional test * fix test * another fix * make sure no team ids clash in tests * fix * add more metrics and logs * cache nulls * updates * add more metrics	2022-11-23 09:55:26 -03:00
Tiina Turban	41c983cc93	chore: throw when we ran out of wait time waiting for CH ingestion (#12126 )	2022-11-09 15:26:52 +01:00
Yakko Majuri	469057b905	refactor(plugin-server): rename KafkaQueue to IngestionConsumer (#12540 ) * refactor(plugin-server): rename KafkaQueue to IngestionConsumer * fix * final fix * welp	2022-11-08 13:44:29 -03:00
Harry Waye	ac5a40f5b2	chore(ingestion): remove graphile as dependency of ingestion pipeline (#12551 ) * chore(ingestion): remove graphile as dependency of ingestion pipeline This allows us to run just the ingestion part of the plugin-server without needing to perform any graphile operations e.g. creating connections to the graphile database. This has the advantage that: 1. if the graphile database is down, the ingestion pods can still start up and will function correctly. 2. avoids creating a connection pool to the graphile database for each ingestion pod, which could be a lot of connections and could cause the database to scale. 3. avoids running the graphile migrations on each ingestion pod, which is unnecessary and could cause unnecessary database load. * wip * wip * wip * wip	2022-11-01 16:01:08 +00:00
Karl-Aksel Puulmann	f07f8763e4	fix(person-on-events): Fix groups caching in ingestion (#12547 ) * fix(person-on-events): Fix groups caching in ingestion We were seeing some groups-related events never get ingested in playground. Digging in, it turned out that these events were serialized with invalid timestamps due to cache containing dates in different formats. The bug was introduced in https://github.com/PostHog/posthog/pull/12403 and makes for a good case study for this common class of errors There were multiple practices that could have indicated the error sooner: 1. Tests for the feature mocked out the DB and used a different data format than what is used properly 2. Some methods related to caching were not properly updated to test the caching logic 3. timestamps-as-strings: we deal with both ISO and clickhouse-format timestamps, and the code didn't differentiate between them properly 4. `getGroupsColumns` signature was very loose, allowing for everything to pass by This change fixes the issue as well as updates relevant code to be more in-line with best practices. * Solve minor typing related issue	2022-11-01 14:27:34 +02:00
Yakko Majuri	5aafc7a115	feat(ingestion): buffer events in kafka if postgres is down (#12532 ) * feat(ingestion): buffer events in kafka if postgres is down * also add DependencyUnavailableError to transaction * Update plugin-server/src/utils/db/db.ts	2022-10-31 19:11:10 +00:00
Harry Waye	13bca71383	chore(ingestion): remove old graphile bufferJob handling (#12528 ) * chore(ingestion): remove old graphile bufferJob handling This removes the emitting of graphile-worker events from the ingestion anonymous events path. Note that we still have the graphile worker running on ingestion, as we need to ensure that we have drained all of these jobs. I'll handle this by first enabling the topic for all users on prod then deploying this. For self hosted I suggest we just go with adding a comment that anonymous events that have been send to graphile in the meantime will be lost. Or something else that makes sense. * fix typing * remove test	2022-10-31 12:09:20 +00:00
Harry Waye	cc2f424452	chore(plugins-server): use Kafka to buffer app jobs requests (#12345 ) * chore(plugins-server): use Kafka to buffer app jobs requests To remove the dependency on the Graphile Worker database on things that may be requesting app job runs we push the jobs to a Kafka topic. * chore: use KAFKA_JOBS instead of string literal `'jobs'` * chore: rename startJobsBufferConsumer -> startJobsConsumer * avoid checking eventId * fix lint * fix producer wrapper tests * fix retries test * handle offset sync * wip * wip * remove exports * do better * use Producer not wrapper * reset db * mock once * Add test for raising to the consumer * Update plugin-server/tests/main/ingestion-queues/run-async-handlers-event-pipeline.test.ts Co-authored-by: Yakko Majuri <38760734+yakkomajuri@users.noreply.github.com> * and in the darkness bind them * fix tests * don't forget the name update! * rename DependencyError to DependencyUnavailable * separate dlq * update comment Co-authored-by: Yakko Majuri <38760734+yakkomajuri@users.noreply.github.com>	2022-10-28 11:05:15 +01:00
Yakko Majuri	1c2713a7b9	Revert "feat(scheduler): allow spreading scheduled tasks load across the fleet" (#12482 ) Revert "feat(scheduler): allow spreading scheduled tasks load across the fleet (#12477)" This reverts commit `98a14fc7c8`.	2022-10-27 15:34:16 -03:00
Yakko Majuri	98a14fc7c8	feat(scheduler): allow spreading scheduled tasks load across the fleet (#12477 ) * feat(scheduler): allow spreading scheduled tasks load across the fleet * update test * Update plugin-server/src/main/graphile-worker/worker-setup.ts Co-authored-by: Harry Waye <harry@posthog.com> * tweaks Co-authored-by: Harry Waye <harry@posthog.com>	2022-10-27 17:35:45 +00:00
Yakko Majuri	4f372c05f9	feat(plugin-server): simplify groups caching (#12403 ) * refactor(plugin-server): simplify groups caching * add multi groups test * remove comments * fix type, add debug * fix * stringify * add groups created_at to types * more test fixes * use the right clickhouse timestampo format * update created at to ch format in tests * finally * more fixes	2022-10-25 15:35:47 -03:00
Yakko Majuri	8ed495b327	fix: groups data fetching bugs (#12371 ) * fix: groups data fetching bugs * add tests	2022-10-21 12:33:36 -03:00
Yakko Majuri	c47a73165a	feat(plugin-server): use graphile-worker crontab (#12242 ) * yeet references to redlock * rename jobs/ to graphile-worker/ * feat(plugin-server): use graphile-worker crontab * remove debugging * yeet redlock dependency * remove legacy test * Update comment * Update plugin-server/src/main/pluginsServer.ts Co-authored-by: Harry Waye <harry@posthog.com> * address review, update tests * fix old tests * testing, testing * maybe fix sigterm Co-authored-by: Harry Waye <harry@posthog.com>	2022-10-18 11:44:41 -03:00
Tiina Turban	377f2ae47f	chore: Rollout groups properties writes to events (#12233 ) * chore: Rollout groups properties writes to events * forgotten save * fix test	2022-10-17 09:50:20 -03:00
Yakko Majuri	53b527dbbe	refactor(graphile-worker): update terminology, clearer capabilities approach for setup (#12203 ) * rename legacy references to queue to more appropriate worker terminology * rename startJobsConsumer -> startGraphileWorker, no-op refactor * add back enqueue success and failure metrics * fix mock import * fix test for good	2022-10-12 10:24:22 -03:00
Yakko Majuri	a34228c49f	refactor: yeet job queues scaffolding in favor of only graphile worker (#12178 ) * refactor: rename graphile queue to graphile worker * refactor: rename job-queues/ to jobs/ * refactor: move graphile-worker to top level jobs/ dir * refactor: remove references to jobQueueManager * remove promise from startJobQueueConsumer * remove job-queue-manager.ts * remove non-test references to JobQueueBase * make fs-queue independent from JobQueueBase * rename FsQueue to MockGraphileWorker * add missing pauseConsumer method to MockGraphileWorker * rename fs-queue.ts --> mock-graphile-worker.ts * delete job-queue-base.ts * get rid of JobQueue type * rename graphileQueue --> graphileWorker * rename JobQueueConsumerControl --> JobsConsumerControl * remove unused jobs test * rename startJobQueueConsumer --> startJobsConsumer * fix tests job imports * rename jobQueueManager --> graphileWorker in tests * remove JobQueueManager tests * fix import * handle metrics and retries on graphileWorker.enqueue * minor fix * Delete buffer.ts * Revert "Delete buffer.ts" This reverts commit `40f1761d31`. * add initial test scaffolding * bring back relevant worker control promises * fix existing tests * add tests for graphile worker * fix exportEvents retries test * update e2e buffer test	2022-10-11 15:40:34 -03:00
Tiina Turban	20b9205877	fix: Always update is_identified = true for first identify or alias user (#12121 )	2022-10-11 15:22:10 +02:00
Tiina Turban	c6b1da5932	fix: hide initial referrer as event property (#11536 )	2022-08-30 18:07:02 +02:00
Karl-Aksel Puulmann	14b420da0a	fix(plugin-server): Fix cohort matching in actions (#11388 ) * fix(plugin-server): Remove wild clickhouseQuery in ingestion pipeline Point queries against clickhouse are slow and we should avoid them. They're also not instrumented. The postgres table already used in the method previously contains the right data. Use that instead. Reference: https://github.com/PostHog/posthog/blob/master/posthog/models/cohort/cohort.py#L274-L316 * Fixup and test doesPersonBelongToCohort * Handle NULLs	2022-08-22 11:07:56 +03:00
Michael Matloka	7bd3cac2f5	refactor(plugin-server): Unify event types (#10612 ) * Simplify Event, ClickHouseEvent, PreIngestionEvent, IngestionEvent * Unify `ClickhouseEventKafka` with `RawEvent` * Fix imports * Eliminate PostgresSessionRecordingEvent * Parse `Event.elements_chain` too * Update process-event.test.ts * Update tests * Make `IngestionEvent['timestamp']` consistent * Update tests * Restore `PreIngestionEvent` vs. `PostIngestionEvent` split * Update worker.test.ts * Improve typing a bit * Update tests to work with mandatory `DateTime` * Remove ClonableIngestionEvent * Rename RawEvent -> RawClickHouseEvent * Rename Event -> ClickHouseEvent * Update prepareEventStep tests * Update convertToIngestionEvent behavior back to master * Update tests to compile * Use branded types for ISO/Clickhouse timestamp string disambiguation * Test for parseRawClickHouseEvent() * Update each-batch tests * Tests for clickHouseTimestampToDateTime() Co-authored-by: Karl-Aksel Puulmann <oxymaccy@gmail.com>	2022-08-15 10:54:09 +03:00
Harry Waye	635fc7b23d	chore(plugin-server): remove healthcheck topic references (#11252 ) * chore(plugin-server): remove healthcheck topic references Rather than doing an end to end produce/consume from this topic, we instead rely on the intrumentation of KafkaJS to understand if the consumer is ready. Note that this code is not being used since the change to just return an HTTP 200 from the liveness endpoint: https://github.com/PostHog/posthog/pull/11234 This is just a cleanup of dead code. * Remove Kafka healthcheck tests	2022-08-11 12:11:43 +00:00
Karl-Aksel Puulmann	f8c203fe5a	fix(plugin-server): refactor groups caching (#11141 ) * Remove unneeded method * Refactor how groups are handled * Remove .only	2022-08-05 09:26:45 +03:00
Karl-Aksel Puulmann	4f648268f2	feat(ingestion): Make person loading lazy (#11091 ) * fix issues with fetchPerson() and add tests - fetchPerson() returned extra columns that were not needed * Add LazyPersonContainer class * Load person data lazily through the event pipeline * Make webhooks and action matching lazy * Update runAsyncHandlersStep * Return own person properties in process-event.ts * Remove snapshots that caused pain * Handle serialization of LazyPersonContainer * Merge: Handle LHS only existing .get() would be cached in that case not to do a query, which we can avoid * Serialize result args as well * Make personContainer functional * Resolve feedback	2022-08-04 09:57:43 +03:00
Karl-Aksel Puulmann	9c6f20b697	chore(plugin-server): Improve tracing (#11042 ) * Include kafka topic for setup * Sample runEventPipeline/runBufferEventPipeline less frequently comparatively This is done by duration - we still want the long transactions, but not the short ones * Trace enqueue plugin jobs * Trace node-fetch * Trace worker creation * Various fixes * Line up query tags properly * Make fetch mocking work * Resolve typing-related issues	2022-08-03 16:12:56 +03:00
Ben White	f0f0cd4e15	feat: Testing alpha releases of JS libs (#11011 ) * feat: Updated to alpha version of posthog-js * Swap to alpha versions of other libs	2022-07-28 11:19:56 +00:00
Karl-Aksel Puulmann	156fa2353f	feat(plugin-server): Use Snappy compression codec for kafka production (#10974 ) * feat(plugin-server): Use Snappy compression codec for kafka production This helps avoid 'message too large' type errors (see https://github.com/PostHog/posthog/pull/10968) by compressing in-flight messages. I would have preferred to use zstd, but the libraries did not compile cleanly on my machine. * Update tests	2022-07-28 11:58:33 +03:00
Karl-Aksel Puulmann	d00d587b1c	chore(plugin-server): Improve kafka producer wrapper (#10968 ) * chore(plugin-server): include extra information on kafka producer errors We're failing to send batches of messages to kafka on a semi-regular basis due to message sizes. It's unclear why this is the case as we try to limit each message batch size. This PR adds information on these failed batches to sentry error messages. Example error: https://sentry.io/organizations/posthog2/issues/3291755686/?project=6423401&query=is%3Aunresolved+level%3Aerror * refactor(plugin-server): Remove Buffer.from from kafka messages This allows us to be much more accurate estimating message sizes, hopefully eliminating a class of errors * estimateMessageSize * Track histogram with message sizes * Flush immediately for too large messages * fud	2022-07-27 11:26:19 +00:00
Yakko Majuri	4bce5dfa8a	feat: (bring back) buffer 3.0 again (#10896 ) * Revert "Revert "feat: (bring back) buffer 3.0 (#10874)" (#10883)" This reverts commit `e203bc7cfa`. * reduce graphile load	2022-07-20 12:16:13 +00:00
Yakko Majuri	e203bc7cfa	Revert "feat: (bring back) buffer 3.0 (#10874 )" (#10883 ) This reverts commit `3e772b8614`.	2022-07-19 17:50:06 +00:00
Yakko Majuri	3e772b8614	feat: (bring back) buffer 3.0 (#10874 ) * Revert "Revert "feat: buffer 3.0 (graphile) (#10735)" (#10802)" This reverts commit `ca8c4d0271`. * add metrics and error tracking	2022-07-19 16:34:07 +00:00
Yakko Majuri	ca8c4d0271	Revert "feat: buffer 3.0 (graphile) (#10735 )" (#10802 ) This reverts commit `9a2a9046cb`.	2022-07-14 18:24:58 +00:00
Yakko Majuri	9a2a9046cb	feat: buffer 3.0 (graphile) (#10735 ) * feat: buffer 3.0 (graphile) * fixes * test * address review * add test for buffer processAt	2022-07-13 11:32:00 +00:00
Yakko Majuri	985148ee7e	feat: buffer 2.0 (#10653 ) * feat: buffer 2.0 proposal * add tests * prevent infinite retrying * perf * updates * tweaks * Update latest_migrations.manifest * Update plugin-server/src/main/ingestion-queues/buffer.ts * update * updates * fix migrations issue * reliability uopdates * fix tests * test fix * e2e test * test * test * ?? * cleanup	2022-07-08 10:48:25 +00:00
Yakko Majuri	58a1fea111	fix: handle stale batches in buffer (#10643 ) * Revert "Revert "fix: handle stale batches in buffer (#10641)" (#10642)" This reverts commit `b564688ad8`. * fix test	2022-07-05 18:16:49 +00:00
Michael Matloka	b04015f25e	chore(plugin-server): Consume from buffer topic (#10475 ) * chore(plugin-server): Consume from buffer topic * Refactor `posthog` extension for buffering * Properly form `bufferEvent` and don't throw error * Add E2E test * Test buffer more end-to-end and properly * Put buffer-enabled test in a separate file * Update each-batch.test.ts * Test that the event goes through the buffer topic * Fix formatting * Refactor out `spyOnKafka()` * Ensure reliability batching-wise * Send heartbeats every so often * Make test less flaky * Commit offsets if necessary before sleep too * Update tests * Use seek-based mechanism (with KafkaJS 2.0.2) * Add comment to clarify seeking * Update each-batch.test.ts * Make minor improvements	2022-06-28 13:30:10 +02:00
Yakko Majuri	a598c7b664	feat(persons-on-events): cache + send persons and groups created_at with events (#10457 ) * feat(persons-on-events): cache + send persons and groups created_at with events * more testing * Update plugin-server/src/utils/db/db.ts * better naming * fixes * testing * update test	2022-06-27 11:39:58 +00:00
Neil Kakkar	9712fd9bb5	chore(feature-flags): Upsert hash key overrides on people merges (#10418 )	2022-06-24 10:58:42 +01:00
Karl-Aksel Puulmann	773f922eef	feat(apps): Remove onAction plugin function (#10414 ) * Remove onAction * Avoid fetching actions that dont deal with REST - 99% reduction * Plural hooks * Avoid hook fetching where not needed * Remove dead code * Update lazy VM test * Rename a function * Update README * Explicit reload actions in tests * Only reload actions which are relevant for plugin server * Remove excessive logging * Reload actions when hooks are updated * update action matching tests * Remove commented code * Solve naming issues	2022-06-24 12:29:10 +03:00
Karl-Aksel Puulmann	f4668ed855	refactor(plugin-server): move buffer as first step of event pipeline & more (#10360 ) * WIP: Move person creation earlier * WIP: move person updating, handle person property changing * WIP: leverage person information * Update `updatePersonDeprecated` signature * Avoid (and test avoiding) unneeded lookups whether 'creating' person is needed Note there were two tricky interactions within handleIdentify, which again got solved by indirect message passing. * Solve TODO * Normalize event before updatePersonIfTouchedByPlugins * Avoid another lookup for person in updatePersonProperties * Avoid lookup for newPerson in handleIdentifyOrAlias * Add kludge comments * Fix runBufferEventPipeline * Rename upsertPersonsStep => processPersonsStep * Update emitToBufferStep tests * Update some event pipeline step tests * Update prepareEventStep tests * Test processPersonStep * Add tests for updatePersonIfTouchedByPlugins step * Update runner tests * verify person vesrion in event-pipeline-integration test * Update process-event test suite * Argument ordering for person state tests * Update runner test snapshots * Cast to UTC * Fixup person-state tests * Dont refetch persons needlessly on $identify * Add missing version assertion * Cast everything to UTC * Remove version assertion * Undo radical change to event pipeline - will re-add it later! * Resolve comments	2022-06-23 10:27:01 +03:00
Tiina Turban	c659bad2ef	Revert "revert: Rollout ingestion batch breakup by distinctId (#10393 )" (#10398 ) This reverts commit `744d4ddf84`.	2022-06-21 14:34:45 -07:00
Michael Matloka	744d4ddf84	revert: Rollout ingestion batch breakup by distinctId (#10393 ) This reverts commit `9a085cb1f6`.	2022-06-21 19:06:31 +02:00
Tiina Turban	9a085cb1f6	chore: Rollout ingestion batch breakup by distinctId (#10370 ) * chore: Rollout ingetion batch breakup by distinctId * Update task-definition.plugins-ingestion.json Co-authored-by: Michael Matloka <dev@twixes.com>	2022-06-21 17:31:53 +02:00
Karl-Aksel Puulmann	dea8c6973a	perf(plugin-server): reduce number of `person` lookups in the event pipeline (#10324 ) * Return person in PreIngestionEvent if possible * Avoid unneccessarily fetching person in emitToBufferStep * Avoid unneccessarily fetching person in createEvent * Use unified type instead of separate type for cached data * Pass person info forward explicitly in each event-pipeline step * minor typing improvement * Remove person from type * Remove unneeded `undefined` * Add person check for prepareEventStep test * Fix hook test * Update getPersonData tests * Cast created_at to UTC * Cast created_at to utc on fetch * Remove personUuid var - unneeded * Add unit tests for process-event.ts#createEvent	2022-06-21 09:18:22 +03:00
Michael Matloka	313226838c	revert: revert: Revert person properties updates refactor (#10349 ) * Revert "revert: Revert person properties updates refactor (#10348)" This reverts commit `6b3c4691b3`. * sanitizeEvent -> normalizeEvent * Ensure we handle property updates from within plugins, test Co-authored-by: Karl-Aksel Puulmann <oxymaccy@gmail.com>	2022-06-20 09:49:11 +03:00
Neil Kakkar	6b3c4691b3	revert: Revert person properties updates refactor (#10348 )	2022-06-17 17:48:20 +02:00
Karl-Aksel Puulmann	d6ec3aedc6	refactor(plugin-server): person state updating (#10321 ) * Remove some excessive call signatures * refactor: move property sanitization outside of .capture * Move event sanitization into event sanitization logic * Move person creation/updating logic outside of capture/createSnapshot * refactor: remove personManager from arguments * refactor: remove various properties from arguments * Update `handleIdentifyOrAlias` signature * refactor: inline timeoutGuard into personStateManager * refactor: prefix pipeline steps with indexes * Extract timestamp parsing logic from process-event.ts * refactor: move timestamp tests over from process-event.ts * refactor: update process-event.test.ts * refactor: person-state-manager -> person-state * Move sanitizeEvent to a more suitable module * Fix some process-event tests	2022-06-17 09:17:08 +03:00
Karl-Aksel Puulmann	b4fee54222	refactor(plugin-server): extract person creation/handling logic from EventsProcessor (#10271 ) * refactor: Start with PersonStateManager * refactor: move createPerson to new service * refactor: move team fetching before aliasing * refactor: move `createPersonIfDistinctIdIsNew` * refactor: move `updatePersonProperties` * refactor: move `handleIdentifyOrAlias` * refactor: `createPerson` to private * Fix an import * Remove weird mocking in an e2e integration test	2022-06-14 11:51:58 +03:00
Karl-Aksel Puulmann	c51a2f7bc1	fix(plugin-server): use histogram in metrics properly (#10276 )	2022-06-13 16:48:57 +03:00
Karl-Aksel Puulmann	ebfc8251a7	fix(plugin-server): Properly set `version` in deletePerson (#10207 ) * Use correct style for querying postgres * Add test showing problems with deletePerson logic * Fix deleting persons from clickhouse * Fix concurrent tests * Version + 100 * Fixup FINAL * Remove console.log	2022-06-13 12:58:44 +03:00
Tiina Turban	569b50b4ec	feat(plugin-server): batching ingestion events to single process per distinct id (#10071 )	2022-06-08 19:20:40 +02:00
Yakko Majuri	004ba66349	fix: pass ISO timestamp to onAction/onEvent (#10178 ) * fix: pass ISO timestamp to onAction/onEvent * fix prettier * fix import * update timestamps	2022-06-08 11:05:54 +00:00
Karl-Aksel Puulmann	c3c5eaad02	fix(plugin-server): properly unparse event.properties when PLUGIN_SERVER_MODE=async (#10156 ) * Handle string properties in plugin-server convertToIngestionEvent * Update typing * fix: Add multi-server process event test This got accidentally yeeted from my previous PR. Shame! * Improve tests * Update test to reflect reality	2022-06-07 13:42:07 +03:00
Karl-Aksel Puulmann	59797efce8	refactor(plugin-server): yeet element-group related postgres code (#10161 )	2022-06-07 12:23:20 +03:00
Tiina Turban	95c9045cc1	fix: Postgres and CH Person version across other columns to match (#10135 )	2022-06-06 17:57:03 +02:00
Tiina Turban	26435cb70d	feat: Add version column to person in CH (#10117 )	2022-06-06 13:42:39 +02:00
Michael Matloka	64317238e6	refactor: Eliminate the `KAFKA_ENABLED` setting (#10059 ) * refactor: Eliminate the `KAFKA_ENABLED` setting * Remove dead code * Consolidate plugin server test scripts and CI * Fix CI command * Remove Celery queues * Rearrange test directories * Update import paths	2022-05-30 18:39:33 +00:00

326 Commits