mirror of https://github.com/PostHog/posthog.git synced 2024-11-29 03:04:16 +01:00
Commit Graph

201 Commits

Author SHA1 Message Date
Michael Matloka
7bd3cac2f5
refactor(plugin-server): Unify event types (#10612)
* Simplify Event, ClickHouseEvent, PreIngestionEvent, IngestionEvent

* Unify `ClickhouseEventKafka` with `RawEvent`

* Fix imports

* Eliminate PostgresSessionRecordingEvent

* Parse `Event.elements_chain` too

* Update process-event.test.ts

* Update tests

* Make `IngestionEvent['timestamp']` consistent

* Update tests

* Restore `PreIngestionEvent` vs. `PostIngestionEvent` split

* Update worker.test.ts

* Improve typing a bit

* Update tests to work with mandatory `DateTime`

* Remove ClonableIngestionEvent

* Rename RawEvent -> RawClickHouseEvent

* Rename Event -> ClickHouseEvent

* Update prepareEventStep tests

* Update convertToIngestionEvent behavior back to master

* Update tests to compile

* Use branded types for ISO/Clickhouse timestamp string disambiguation

* Test for parseRawClickHouseEvent()

* Update each-batch tests

* Tests for clickHouseTimestampToDateTime()

Co-authored-by: Karl-Aksel Puulmann <oxymaccy@gmail.com>
2022-08-15 10:54:09 +03:00
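The "branded types" bullet above refers to a TypeScript idiom for keeping two string formats apart at compile time. A minimal sketch of the idea follows; the names `Brand`, `ISOTimestamp`, `ClickHouseTimestamp` and both conversion helpers are assumptions for illustration, not necessarily the names used in the plugin server.

```typescript
// Hypothetical brand helper: the `__brand` property exists only at the type level.
type Brand<T, B extends string> = T & { readonly __brand: B }

type ISOTimestamp = Brand<string, 'ISOTimestamp'>
type ClickHouseTimestamp = Brand<string, 'ClickHouseTimestamp'>

// Convert an ISO 8601 UTC string ("2022-08-15T07:54:09.000Z")
// into a ClickHouse-style string ("2022-08-15 07:54:09.000").
function isoToClickHouse(iso: ISOTimestamp): ClickHouseTimestamp {
    return iso.replace('T', ' ').replace('Z', '') as ClickHouseTimestamp
}

// Reverse conversion, assuming the ClickHouse string is in UTC.
function clickHouseToISO(ts: ClickHouseTimestamp): ISOTimestamp {
    return (ts.replace(' ', 'T') + 'Z') as ISOTimestamp
}

const iso = '2022-08-15T07:54:09.000Z' as ISOTimestamp
const ch = isoToClickHouse(iso)
// Passing `iso` to clickHouseToISO() would not compile: the brands do not match.
```

Both types are plain strings at runtime, so the brand adds no overhead; it only stops one format from being passed where the other is expected.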
Harry Waye
635fc7b23d
chore(plugin-server): remove healthcheck topic references (#11252)
* chore(plugin-server): remove healthcheck topic references

Rather than doing an end to end produce/consume from this topic, we
instead rely on the instrumentation of KafkaJS to understand if the
consumer is ready.

Note that this code is not being used since the change to just return an
HTTP 200 from the liveness endpoint:
https://github.com/PostHog/posthog/pull/11234

This is just a cleanup of dead code.

* Remove Kafka healthcheck tests
2022-08-11 12:11:43 +00:00
Karl-Aksel Puulmann
aedba02586
fix(plugin-server): Handle race between merge and updatePersonProperties (#11217)
Seeing this (rarely) in the wild: https://sentry.io/organizations/posthog2/issues/3348207612/?project=6423401&query=is%3Aunresolved+level%3Aerror

The 'fix' is to re-fetch person and try again - this is sufficient since
we don't do A -> B -> C merges
2022-08-10 10:57:17 +03:00
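The re-fetch-and-retry approach described in the commit above can be sketched roughly as follows. Everything here is hypothetical: `FakePersonStore` uses plain objects in place of Postgres, and the `afterFetch` hook exists only to simulate a merge landing between the fetch and the update.

```typescript
interface Person {
    id: number
    properties: Record<string, unknown>
}

// Hypothetical in-memory stand-in for the person tables.
class FakePersonStore {
    private personsById: Record<number, Person> = {}
    private personIdByDistinctId: Record<string, number> = {}

    add(distinctId: string, person: Person): void {
        this.personsById[person.id] = person
        this.personIdByDistinctId[distinctId] = person.id
    }

    fetch(distinctId: string): Person | undefined {
        return this.personsById[this.personIdByDistinctId[distinctId]]
    }

    // Returns undefined when the row no longer exists (for example, it was merged away).
    update(personId: number, properties: Record<string, unknown>): Person | undefined {
        const person = this.personsById[personId]
        if (!person) {
            return undefined
        }
        person.properties = { ...person.properties, ...properties }
        return person
    }

    // Merge `from` into `into`: repoint distinct IDs, then delete the source row.
    merge(from: string, into: string): void {
        const sourceId = this.personIdByDistinctId[from]
        const targetId = this.personIdByDistinctId[into]
        if (sourceId === targetId) {
            return
        }
        for (const distinctId of Object.keys(this.personIdByDistinctId)) {
            if (this.personIdByDistinctId[distinctId] === sourceId) {
                this.personIdByDistinctId[distinctId] = targetId
            }
        }
        delete this.personsById[sourceId]
    }
}

function updatePersonProperties(
    store: FakePersonStore,
    distinctId: string,
    properties: Record<string, unknown>,
    afterFetch?: () => void // test hook: a place for a concurrent merge to happen
): Person {
    // One retry is enough as long as merges are never chained (A -> B -> C).
    for (let attempt = 0; attempt < 2; attempt++) {
        const person = store.fetch(distinctId)
        if (!person) {
            break
        }
        if (attempt === 0 && afterFetch) {
            afterFetch()
        }
        const updated = store.update(person.id, properties)
        if (updated) {
            return updated
        }
    }
    throw new Error(`Could not update person for distinct ID ${distinctId}`)
}
```

If a merge deletes the fetched row before the update runs, the first attempt finds nothing to update, and the second attempt resolves the distinct ID to the surviving person.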
Karl-Aksel Puulmann
54c1fddc7e
chore(plugin-server): iterate tracking timeouts, include callsite location (#11205)
2022-08-10 00:20:28 +03:00
Karl-Aksel Puulmann
3c656098c5
chore(plugin-server): Debug TimeoutError (#11201)
* Include information on plugin/source in timeout messages

* Call __asyncGuard correctly

Previously __asyncGuard could be called without context. We now include
await/Promise.then rather than random promise objects as arguments

* Update transforms testing code
2022-08-09 14:46:59 +03:00
Karl-Aksel Puulmann
154a1a16dc
perf(plugin-server): Do bulk inserts within team-manager (#11195)
* Clean up `updateEventNamesAndProperties`

1. Add parallelization
2. Add error handling
3. Fix up weird handling of statsd timing

* Use EXCLUDED over interpolated value in team-manager.ts

* Convert teamManager to work via bulk inserts

* Refactor team-manager tests
2022-08-09 13:15:17 +03:00
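As an illustration of the two SQL-related bullets above, a bulk upsert that uses Postgres's `EXCLUDED` pseudo-table might be built like this. The table name `event_names` and its columns are made up for the example, and `buildBulkUpsert` is not a function from the plugin server.

```typescript
// Build one multi-row upsert for Postgres.
// Table and column names are hypothetical.
function buildBulkUpsert(rows: Array<[number, string, number]>): { text: string; values: unknown[] } {
    const values: unknown[] = []
    const tuples = rows.map((row, i) => {
        values.push(row[0], row[1], row[2])
        const base = i * 3
        // Positional placeholders: ($1, $2, $3), ($4, $5, $6), ...
        return `($${base + 1}, $${base + 2}, $${base + 3})`
    })
    const text =
        'INSERT INTO event_names (team_id, name, last_seen_at) VALUES ' +
        tuples.join(', ') +
        // EXCLUDED refers to the row that was proposed for insertion, so the
        // conflicting value does not have to be interpolated a second time.
        ' ON CONFLICT (team_id, name) DO UPDATE SET last_seen_at = EXCLUDED.last_seen_at'
    return { text, values }
}

const upsert = buildBulkUpsert([
    [1, '$pageview', 1660000000],
    [1, '$autocapture', 1660000001],
])
```

One statement with many value tuples replaces one round trip per row, which is the usual motivation for bulk inserts.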
Karl-Aksel Puulmann
606eee5c1c
chore(plugin-server): iterate tracking of person lazy loading (#11169)
* Iterate logging for person-state more

* iterate

* Add test for race
2022-08-05 15:55:48 +03:00
Karl-Aksel Puulmann
0467245276
fix(plugin-server): set right person_id if person created in a race (#11148)
* chore(plugin-server): Add test for $snapshot event never fetching person data

Follow-up to previous lazy person loading PR

* Verify person data loaded once

* fix(plugin-server): set right person_id if person created in a race

Previously (even after lazy-loading persons), person_id could be set to
undefined if for a new user, two events (A & B) were processed in parallel.
1. A runs buffer step, preloading person as undefined
2. B runs event pipeline, creating person
3. A runs person update, finding that the person already exists in pg
4. A sets person_id column as `undefined` since person was loaded in
step (1)

We now reset the personContainer instead if we suspect person might have
been created in a race.

* Fix test
2022-08-05 09:26:57 +03:00
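The four-step race and the reset-based fix described above can be replayed in a small, synchronous sketch. `LazyPersonSketch`, `personsTable` and `createPerson` are hypothetical stand-ins; the real container in the plugin server is presumably asynchronous and backed by Postgres.

```typescript
interface Person {
    id: number
}

// Hypothetical, synchronous stand-in for a lazily loaded person.
class LazyPersonSketch {
    loadCount = 0
    private loaded = false
    private value: Person | undefined = undefined
    private loader: () => Person | undefined

    constructor(loader: () => Person | undefined) {
        this.loader = loader
    }

    get(): Person | undefined {
        if (!this.loaded) {
            this.value = this.loader()
            this.loaded = true
            this.loadCount++
        }
        return this.value
    }

    // Forget the cached result so that the next get() queries again.
    reset(): void {
        this.loaded = false
        this.value = undefined
    }
}

const personsTable: Person[] = [] // stands in for Postgres

// Returns false when the row already exists, as a unique constraint would report.
function createPerson(id: number): boolean {
    if (personsTable.length > 0) {
        return false
    }
    personsTable.push({ id })
    return true
}

const containerA = new LazyPersonSketch(() => personsTable[0])
containerA.get() // 1. A preloads the person and caches `undefined`
createPerson(42) // 2. B creates the person
const createdByA = createPerson(43) // 3. A's create fails: the person exists
if (!createdByA) {
    containerA.reset() // suspected race: drop the cached `undefined`
}
const personA = containerA.get() // 4. A re-fetches instead of reusing step 1
const personIdForA = personA ? personA.id : undefined
```

Without the `reset()` call, step 4 would return the `undefined` cached in step 1, which is the bug the commit describes.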
Karl-Aksel Puulmann
f8c203fe5a
fix(plugin-server): refactor groups caching (#11141)
* Remove unneeded method

* Refactor how groups are handled

* Remove .only
2022-08-05 09:26:45 +03:00
Karl-Aksel Puulmann
449c255302
chore(plugin-server): Add test for $snapshot event never fetching person data (#11147)
* chore(plugin-server): Add test for $snapshot event never fetching person data

Follow-up to previous lazy person loading PR

* Verify person data loaded once
2022-08-05 09:25:54 +03:00
Tiina Turban
c41b24085e
chore: remove enabled for months envs - attempt 2 (#11118)
2022-08-04 17:03:47 +02:00
Karl-Aksel Puulmann
4f648268f2
feat(ingestion): Make person loading lazy (#11091)
* fix issues with fetchPerson() and add tests

- fetchPerson() returned extra columns that were not needed

* Add LazyPersonContainer class

* Load person data lazily through the event pipeline

* Make webhooks and action matching lazy

* Update runAsyncHandlersStep

* Return own person properties in process-event.ts

* Remove snapshots that caused pain

* Handle serialization of LazyPersonContainer

* Merge: Handle LHS only existing

.get() would be cached in that case not to do a query, which we can
avoid

* Serialize result args as well

* Make personContainer functional

* Resolve feedback
2022-08-04 09:57:43 +03:00
Tiina Turban
49f3311a65
chore: nuke protobuf fully (#10932)
2022-08-03 15:41:44 +02:00
Karl-Aksel Puulmann
9c6f20b697
chore(plugin-server): Improve tracing (#11042)
* Include kafka topic for setup

* Sample runEventPipeline/runBufferEventPipeline less frequently comparatively

This is done by duration - we still want the long transactions, but not
the short ones

* Trace enqueue plugin jobs

* Trace node-fetch

* Trace worker creation

* Various fixes

* Line up query tags properly

* Make fetch mocking work

* Resolve typing-related issues
2022-08-03 16:12:56 +03:00
Karl-Aksel Puulmann
c9f05fdaf1
chore(plugin-server): support tracing in plugin-server (#11029)
* Experimental tracing support for plugin server

* Add tag to postgresTransaction

* Track event pipeline steps as separate spans

* Track kafka queueMessage?

* Tracing for processEvent, onEvent, onSnapshot

* plugin.runTask

* Move sentry code

* Make tracing rate configurable
2022-07-28 15:05:00 +03:00
Ben White
f0f0cd4e15
feat: Testing alpha releases of JS libs (#11011)
* feat: Updated to alpha version of posthog-js
* Swap to alpha versions of other libs
2022-07-28 11:19:56 +00:00
Karl-Aksel Puulmann
156fa2353f
feat(plugin-server): Use Snappy compression codec for kafka production (#10974)
* feat(plugin-server): Use Snappy compression codec for kafka production

This helps avoid 'message too large' type errors (see
https://github.com/PostHog/posthog/pull/10968) by compressing in-flight
messages.

I would have preferred to use zstd, but the libraries did not compile
cleanly on my machine.

* Update tests
2022-07-28 11:58:33 +03:00
Karl-Aksel Puulmann
d00d587b1c
chore(plugin-server): Improve kafka producer wrapper (#10968)
* chore(plugin-server): include extra information on kafka producer errors

We're failing to send batches of messages to kafka on a semi-regular
basis due to message sizes. It's unclear why this is the case as we try
to limit each message batch size.

This PR adds information on these failed batches to sentry error
messages.

Example error: https://sentry.io/organizations/posthog2/issues/3291755686/?project=6423401&query=is%3Aunresolved+level%3Aerror

* refactor(plugin-server): Remove Buffer.from from kafka messages

This allows us to be much more accurate estimating message sizes,
hopefully eliminating a class of errors

* estimateMessageSize

* Track histogram with message sizes

* Flush immediately for too large messages

* fud
2022-07-27 11:26:19 +00:00
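One way to read the size-related bullets above is a producer wrapper that tracks an estimated batch size and flushes early. The sketch below is an interpretation, not the actual wrapper: `BatchingProducer`, its constructor arguments and the character-count estimate are all assumptions (a real producer would count encoded bytes and send through KafkaJS).

```typescript
interface Message {
    key?: string
    value: string
}

// Rough estimate in UTF-16 code units; assumed here for simplicity.
function estimateMessageSize(message: Message): number {
    return (message.key ? message.key.length : 0) + message.value.length
}

// Hypothetical batching wrapper; `send` stands in for the real Kafka producer call.
class BatchingProducer {
    private queue: Message[] = []
    private queuedSize = 0
    private maxBatchSize: number
    private send: (batch: Message[]) => void

    constructor(maxBatchSize: number, send: (batch: Message[]) => void) {
        this.maxBatchSize = maxBatchSize
        this.send = send
    }

    queueMessage(message: Message): void {
        const size = estimateMessageSize(message)
        // Flush what is queued first if this message would overflow the batch.
        if (this.queuedSize + size > this.maxBatchSize) {
            this.flush()
        }
        this.queue.push(message)
        this.queuedSize += size
        // A message that is too large on its own is sent right away.
        if (size > this.maxBatchSize) {
            this.flush()
        }
    }

    flush(): void {
        if (this.queue.length === 0) {
            return
        }
        const batch = this.queue
        this.queue = []
        this.queuedSize = 0
        this.send(batch)
    }
}

const sentBatches: Message[][] = []
const producer = new BatchingProducer(10, (batch) => {
    sentBatches.push(batch)
})
producer.queueMessage({ value: 'aaaa' })
producer.queueMessage({ value: 'bbbb' })
producer.queueMessage({ value: 'cccc' }) // would overflow: first two are flushed
producer.queueMessage({ value: '01234567890123456789' }) // oversized: flushed alone
```

The closer the estimate is to the size of the bytes actually sent, the less often a batch that looked small enough is rejected by the broker.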
Karl-Aksel Puulmann
6457f0296b
fix(ingestion): Change overrides order when parsing Kafka messages (#10998)
* Move formPluginEvent

* Work around risky behavior

Previously, users could override some important event fields by passing
values in their payload. This bug was introduced way back in https://github.com/PostHog/plugin-server/pull/34

This bug indirectly caused the following sentry errors:
- https://sentry.io/organizations/posthog2/issues/3289550563/?project=6423401&query=is%3Aunresolved+level%3Aerror
- https://sentry.io/organizations/posthog2/issues/3455742732/?project=6423401&query=is%3Aunresolved+level%3Aerror
- https://sentry.io/organizations/posthog2/issues/3382895905/?project=6423401&query=is%3Aunresolved+level%3Aerror

One area I'm unsure on is specifically `ip` field and its expected
behavior, but looking at old code from 2020 it seems we always took the
ip from request rather than looking at event body.
2022-07-27 13:26:50 +03:00
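The effect of the override order can be shown with object spread, where later keys win. This is a sketch under assumed field names (`data`, `team_id`, `uuid`, `ip`); `parseRisky` and `parseSafe` are illustrative and are not the plugin server's `formPluginEvent`.

```typescript
// Hypothetical shape of a message taken off Kafka.
interface RawKafkaMessage {
    data: string // JSON payload sent by the client
    team_id: number
    uuid: string
    ip: string | null
}

// Risky order: payload keys are spread last, so a client can overwrite
// trusted fields such as team_id or uuid.
function parseRisky(message: RawKafkaMessage): Record<string, unknown> {
    const { data, ...trusted } = message
    return { ...trusted, ...JSON.parse(data) }
}

// Safer order: trusted envelope fields are spread last and always win.
function parseSafe(message: RawKafkaMessage): Record<string, unknown> {
    const { data, ...trusted } = message
    return { ...JSON.parse(data), ...trusted }
}

const incoming: RawKafkaMessage = {
    data: JSON.stringify({ event: '$pageview', team_id: 999 }),
    team_id: 2,
    uuid: 'server-generated-uuid',
    ip: null,
}
const risky = parseRisky(incoming)
const safe = parseSafe(incoming)
```

With the risky order, `team_id` ends up as the client-supplied 999; with the safe order it stays 2 while `event` is still taken from the payload.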
Karl-Aksel Puulmann
612d62610e
perf(plugin-server): speed up getPluginConfigRows (#10952)
This is currently one of our slowest queries. The query itself is fine
but the amount of data returned is causing issues.

The `error` column constitutes >80% of the data stored in the table - by
removing it in the query we should see significant speedups.

pganalyze page: https://app.pganalyze.com/databases/-675880137/queries/5301003665?t=7d
2022-07-25 16:27:18 +03:00
Karl-Aksel Puulmann
e00b458dd8
fix(plugin-server): don't query source column for posthog_plugin (#10949)
This column is quite large and unused. Large columns slow down queries
due to extra data being sent back and forth.
2022-07-25 16:00:22 +03:00
Harry Waye
d7998cef30
Revert "chore(dev): use network mode host for docker-compose services (#10917)" (#10926)
This reverts commit 225a41db72.
2022-07-22 10:25:59 +01:00
Yakko Majuri
a8687f896e
fix: don't pull unnecessary plugin data in the plugin server (#10909)
* fix: don't pull unnecessary plugin data in the plugin server

* fix tests

* fix test 2

Co-authored-by: Tiina Turban <tiina303@gmail.com>
2022-07-21 16:47:55 +00:00
Harry Waye
225a41db72
chore(dev): use network mode host for docker-compose services (#10917)
* chore(dev): use network mode host for docker-compose services

This removes the need to add kafka to /etc/hosts.

As far as I can tell this should be fine for people's local dev except
they will be required to reset and re-migrate ClickHouse tables as they
will be trying to pull from `kafka` instead of `localhost`.

* remove ports from redis

* Update a few more references
2022-07-21 15:29:31 +01:00
Harry Waye
e7a9b7de79
fix(autocapture): ensure $elements passed to onEvent (#10880)
* fix(autocapture): ensure `$elements` passed to `onEvent`

Before calling `onEvent` the plugin does, among other things, a delete
on `event.properties` of the `$elements` associated with `$autocapture`.
This means that for instance the S3 plugin doesn't include this data in
its dump.

We could also include other data like `elements_chain` that we also
store in `ClickHouse` but I've gone for just including `elements` for
now as `elements_chain` is derived from `elements` anyhow.

* revert .env changes, I'll do that separately

* run prettier

* update to scaffold 1.3.0

* fix lint

* chore: update scaffold to 1.3.1

* update scaffold
2022-07-20 14:33:32 +01:00
Neil Kakkar
2b370c2d1a
fix(data-management): Allow property type updates (#10897)
2022-07-20 13:55:25 +01:00
Yakko Majuri
4bce5dfa8a
feat: (bring back) buffer 3.0 again (#10896)
* Revert "Revert "feat: (bring back) buffer 3.0  (#10874)" (#10883)"

This reverts commit e203bc7cfa.

* reduce graphile load
2022-07-20 12:16:13 +00:00
Yakko Majuri
e203bc7cfa
Revert "feat: (bring back) buffer 3.0 (#10874)" (#10883)
This reverts commit 3e772b8614.
2022-07-19 17:50:06 +00:00
Yakko Majuri
3e772b8614
feat: (bring back) buffer 3.0 (#10874)
* Revert "Revert "feat: buffer 3.0 (graphile) (#10735)" (#10802)"

This reverts commit ca8c4d0271.

* add metrics and error tracking
2022-07-19 16:34:07 +00:00
Yakko Majuri
ca8c4d0271
Revert "feat: buffer 3.0 (graphile) (#10735)" (#10802)
This reverts commit 9a2a9046cb.
2022-07-14 18:24:58 +00:00
Yakko Majuri
9a2a9046cb
feat: buffer 3.0 (graphile) (#10735)
* feat: buffer 3.0 (graphile)

* fixes

* test

* address review

* add test for buffer processAt
2022-07-13 11:32:00 +00:00
Yakko Majuri
25152334f9
refactor(plugin-jobs): make job queues extensible (#10713)
* refactor(plugin-jobs): make job queues extensible

* fix enqueue callsites

* final fix

* use enum for job names

* test fix

* fix tests
2022-07-11 15:34:36 +00:00
Yakko Majuri
985148ee7e
feat: buffer 2.0 (#10653)
* feat: buffer 2.0 proposal

* add tests

* prevent infinite retrying

* perf

* updates

* tweaks

* Update latest_migrations.manifest

* Update plugin-server/src/main/ingestion-queues/buffer.ts

* update

* updates

* fix migrations issue

* reliability updates

* fix tests

* test fix

* e2e test

* test

* test

* ??

* cleanup
2022-07-08 10:48:25 +00:00
Yakko Majuri
9a33696351
feat: split plugin server healthcheck into readiness and health (#10638)
* feat: split plugin server healthcheck into readiness and health

* add ingestion capability to test
2022-07-07 10:39:22 +00:00
Yakko Majuri
58a1fea111
fix: handle stale batches in buffer (#10643)
* Revert "Revert "fix: handle stale batches in buffer (#10641)" (#10642)"

This reverts commit b564688ad8.

* fix test
2022-07-05 18:16:49 +00:00
Yakko Majuri
1936700172
fix: do not send events from mobile libraries to the buffer (#10628)
* fix: do not send events from mobile libraries to the buffer

* Update plugin-server/src/worker/ingestion/event-pipeline/1-emitToBufferStep.ts
2022-07-05 12:33:12 +00:00
Michael Matloka
4b674100cc
fix(event-buffer): Remove buffering for recently created persons (#10553)
* fix(event-buffer): Remove buffering for recently created persons

* Update emitToBufferStep.test.ts

* Update comment
2022-06-29 12:22:45 +00:00
Michael Matloka
b04015f25e
chore(plugin-server): Consume from buffer topic (#10475)
* chore(plugin-server): Consume from buffer topic

* Refactor `posthog` extension for buffering

* Properly form `bufferEvent` and don't throw error

* Add E2E test

* Test buffer more end-to-end and properly

* Put buffer-enabled test in a separate file

* Update each-batch.test.ts

* Test that the event goes through the buffer topic

* Fix formatting

* Refactor out `spyOnKafka()`

* Ensure reliability batching-wise

* Send heartbeats every so often

* Make test less flaky

* Commit offsets if necessary before sleep too

* Update tests

* Use seek-based mechanism (with KafkaJS 2.0.2)

* Add comment to clarify seeking

* Update each-batch.test.ts

* Make minor improvements
2022-06-28 13:30:10 +02:00
Yakko Majuri
a598c7b664
feat(persons-on-events): cache + send persons and groups created_at with events (#10457)
* feat(persons-on-events): cache + send persons and groups created_at with events

* more testing

* Update plugin-server/src/utils/db/db.ts

* better naming

* fixes

* testing

* update test
2022-06-27 11:39:58 +00:00
Neil Kakkar
9712fd9bb5
chore(feature-flags): Upsert hash key overrides on people merges (#10418)
2022-06-24 10:58:42 +01:00
Karl-Aksel Puulmann
773f922eef
feat(apps): Remove onAction plugin function (#10414)
* Remove onAction

* Avoid fetching actions that don't deal with REST - 99% reduction

* Plural hooks

* Avoid hook fetching where not needed

* Remove dead code

* Update lazy VM test

* Rename a function

* Update README

* Explicit reload actions in tests

* Only reload actions which are relevant for plugin server

* Remove excessive logging

* Reload actions when hooks are updated

* update action matching tests

* Remove commented code

* Solve naming issues
2022-06-24 12:29:10 +03:00
Neil Kakkar
350cf11a1b
fix(person-state): Reduce dependency between tests (#10455)
* fix(person-state): Reduce dependency between tests

* do the nasty way

* remove console

* Update clickhouse tests

Co-authored-by: Karl-Aksel Puulmann <oxymaccy@gmail.com>
2022-06-23 19:46:49 +03:00
Karl-Aksel Puulmann
5052c61947
fix(plugin-server): only load actions in PLUGIN_SERVER_MODE=async (#10438)
Currently ingestion plugin-servers load actions every thread for no
reason. Stopping that from happening.
2022-06-23 15:39:25 +03:00
Karl-Aksel Puulmann
d14296f615
perf(plugin-server): load less team data, increase caching (#10439)
* perf(plugin-server): load less data for each team

Our `team` model is pretty fat and we were fetching columns not used in
plugin server from app. Reducing the number of columns will make lookups
faster.

* perf(plugin-server): cache team data for longer

We now cache team data for 2 minutes instead of 30 seconds. The trade-off is
that `anonymize_ips` setting will take longer to propagate, but we
already check that in capture endpoint
2022-06-23 15:38:58 +03:00
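The caching trade-off described above (a longer time-to-live means fewer lookups but slower propagation of settings such as `anonymize_ips`) can be sketched with a small time-based cache. `TimedCache` and its injectable clock are assumptions for illustration; the team manager's actual cache may work differently.

```typescript
// Hypothetical TTL cache; the clock is injectable so that expiry can be tested.
class TimedCache<T> {
    private entries: Record<string, { value: T; expiresAt: number }> = {}
    private ttlMs: number
    private now: () => number

    constructor(ttlMs: number, now: () => number = Date.now) {
        this.ttlMs = ttlMs
        this.now = now
    }

    // Return the cached value while it is fresh, otherwise call `load` and cache the result.
    get(key: string, load: () => T): T {
        const entry = this.entries[key]
        if (entry && entry.expiresAt > this.now()) {
            return entry.value
        }
        const value = load()
        this.entries[key] = { value, expiresAt: this.now() + this.ttlMs }
        return value
    }
}

let fakeTime = 0
let teamLoads = 0
const loadTeam = (): { anonymize_ips: boolean } => {
    teamLoads++
    return { anonymize_ips: false }
}

const TWO_MINUTES_MS = 2 * 60 * 1000
const teamCache = new TimedCache<{ anonymize_ips: boolean }>(TWO_MINUTES_MS, () => fakeTime)

teamCache.get('team:2', loadTeam) // miss: loads from the "database"
fakeTime = TWO_MINUTES_MS - 1
teamCache.get('team:2', loadTeam) // still fresh: served from cache
fakeTime = TWO_MINUTES_MS
teamCache.get('team:2', loadTeam) // expired: loads again
```

A change to the team row made just after a load stays invisible to this process for up to the full TTL, which is the propagation delay the commit message accepts.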
Karl-Aksel Puulmann
d04d0ce0f9
fix(plugin-server): don't load/buffer $snapshot events (#10436)
This behavior was added in
https://github.com/PostHog/posthog/pull/10360, after which we loaded persons
needlessly for snapshot events.
2022-06-23 11:08:01 +03:00
Karl-Aksel Puulmann
f4668ed855
refactor(plugin-server): move buffer as first step of event pipeline & more (#10360)
* WIP: Move person creation earlier

* WIP: move person updating, handle person property changing

* WIP: leverage person information

* Update `updatePersonDeprecated` signature

* Avoid (and test avoiding) unneeded lookups whether 'creating' person is needed

Note there were two tricky interactions within handleIdentify, which
again got solved by indirect message passing.

* Solve TODO

* Normalize event before updatePersonIfTouchedByPlugins

* Avoid another lookup for person in updatePersonProperties

* Avoid lookup for newPerson in handleIdentifyOrAlias

* Add kludge comments

* Fix runBufferEventPipeline

* Rename upsertPersonsStep => processPersonsStep

* Update emitToBufferStep tests

* Update some event pipeline step tests

* Update prepareEventStep tests

* Test processPersonStep

* Add tests for updatePersonIfTouchedByPlugins step

* Update runner tests

* verify person version in event-pipeline-integration test

* Update process-event test suite

* Argument ordering for person state tests

* Update runner test snapshots

* Cast to UTC

* Fixup person-state tests

* Don't refetch persons needlessly on $identify

* Add missing version assertion

* Cast everything to UTC

* Remove version assertion

* Undo radical change to event pipeline - will re-add it later!

* Resolve comments
2022-06-23 10:27:01 +03:00
Michael Matloka
3578a0c1c2
perf(apps): Only load PluginSourceFiles instead of Plugin.archive (#10374)
* perf(apps): Cache app code at install instead of fetching `archive` blob

* Fix typing

* Add migration

* Use `cast()` instead of `assert`

* Update plugin test helpers

* Restore `reload_plugins_on_workers()`

* Save plugin source file rows separately from plugin rows

* Fix migration

* Update latest_migrations.manifest

* Update plugins.test.ts

* Separate `plugin60WithSource` from `plugin60`

* Handle `PluginSourceFile` at `insertRow()` level

* Unify `plugin60WithSource` and `plugin60` again

* Update sql.test.ts

* Fix and simplify reading files in `loadPlugin()`

* Update plugins.test.ts

* Update more tests

* Clean code up

* Specify `json_parse=True` in lower-level function tests

* Add tests for `update_or_create_from_plugin_archive()`

* Update migration with updated function

* Update plugin `upgrade` endpoint

* Remove leftover `print()`

* Add error handling to `0243_unpack_plugin_source_files`

* Add `upgrade` assertions

* Revert plugin server changes

* Fix typing

* perf(apps): Only load `PluginSourceFile`s instead of `Plugin.archive`

This reverts commit 180ed228b8.

* Update logging level and message

* Refactor `find_index_ts_in_archive()` and `extract_plugin_code()` out

* Don't select `latest_tag` and `latest_tag_checked_at`

* Use kwargs in logging

Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com>

* Fix missing comma

* Only throw if `plugin_json_parsed` is `None`

* Fix `reverse_func`

* Accept empty files

* Make sure files which are deleted between versions are gone

* Update 0243_unpack_plugin_source_files.py

* Update 0243_unpack_plugin_source_files.py

* Explain query counts

* Use `@snapshot_postgres_queries` instead of `assertNumQueries()`

Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com>
2022-06-22 16:33:23 +02:00
Yakko Majuri
149604f530
fix(plugin-server): start main thread consumer only after plugins are loaded on threads (#10385)
* fix(plugin-server): start main thread consumer only after plugins are loaded on threads

* updates
2022-06-22 11:37:30 +01:00
Tiina Turban
c659bad2ef
Revert "revert: Rollout ingestion batch breakup by distinctId (#10393)" (#10398)
This reverts commit 744d4ddf84.
2022-06-21 14:34:45 -07:00
Michael Matloka
744d4ddf84
revert: Rollout ingestion batch breakup by distinctId (#10393)
This reverts commit 9a085cb1f6.
2022-06-21 19:06:31 +02:00