Issue: https://github.com/PostHog/posthog/issues/10337
Previously, when processing an $identify event that resulted in a merge
and also carried $set/$set_once, plugin-server would:
1. Merge the two persons' properties
2. Update person properties separately
This caused a race: a single event-pipeline flow emitted two different
person update rows to ClickHouse, with no guarantee which one would win
because their _timestamps were identical.
We eliminate the need to double-update in a single message and add
tests.
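A rough sketch of the single-update idea follows. It is not the actual
plugin-server code: `combineProperties`, its argument names, and the
precedence rules (merge target wins over the other person, $set_once
only fills missing keys, $set overrides everything) are assumptions made
for illustration.

```typescript
type Properties = Record<string, unknown>

// Hypothetical helper: fold the merge and the $set/$set_once changes into
// one property map, so that a single person update row can be emitted
// instead of two rows sharing the same _timestamp.
function combineProperties(
    mergeInto: Properties,
    otherPerson: Properties,
    set: Properties = {},
    setOnce: Properties = {}
): Properties {
    // Assumption: on a merge, the target person's values take precedence.
    const merged: Properties = { ...otherPerson, ...mergeInto }
    // $set_once must not overwrite keys that already exist after the merge.
    const withSetOnce: Properties = { ...setOnce, ...merged }
    // $set always overrides.
    return { ...withSetOnce, ...set }
}
```

With one combined map there is only one row to write, so the ordering
between the merge row and the property update row no longer matters.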
* Update person-state to return person
* Add personState#update for joint updates
* Create personManager on hub
* Fix bug when $identify is the first user event
Bug was this:
1. We created the person with blank properties, then updated the
properties separately.
2. This resulted in two Kafka messages with identical _timestamps due to
batching
3. There was a 50/50 chance that the event would not be handled properly
We now don't create multiple messages if not needed. I also added tests
to person-state, but not exhaustive ones.
Also fixed an issue with `setIsIdentified` being called too many times
Main issue: https://github.com/PostHog/posthog/issues/10208
* Fixup - avoid errors
* Set time limit
* Re-add distinct_id as argument to fix tests
* Remove unneeded .toString()
* Fix is_identified as identified by tests
* Bump testTimeout
* Combine `runOnEvent` and `runOnSnapshot`
* Commoditize `RetryError` handling and use it in `on*` functions
* Fix names of custom errors
* Fix `runProcessEvent` signature
* Separate `runOnEvent` and `runOnSnapshot` again
* Fix `runRetriableFunction`
* Refactor `getPluginsForTeam` into `getPluginMethodsForTeam`
* Create retries.test.ts
* Remove debug code
* Fix formatting
* Put `runRetriableFunction` and `getNextRetryMs` in `retries.ts`
* Rework `runRetriableFunction` to void subsequent retries
* Capture `iterateRetryLoop` exceptions
* Fix total timing
* Fix an import
* Remove useless `teamIdString`
* Fix another import
* Revert small refactor
* Add `getNextRetryMs` tests
* Use `PromiseManager`
* refactor: Start with PersonStateManager
* refactor: move createPerson to new service
* refactor: move team fetching before aliasing
* refactor: move `createPersonIfDistinctIdIsNew`
* refactor: move `updatePersonProperties`
* refactor: move `handleIdentifyOrAlias`
* refactor: `createPerson` to private
* Fix an import
* Remove weird mocking in an e2e integration test
* Use correct style for querying postgres
* Add test showing problems with deletePerson logic
* Fix deleting persons from clickhouse
* Fix concurrent tests
* Version + 100
* Fixup FINAL
* Remove console.log
* Reorder blocks in `member_join.html` to make sense
* Add Celery task to send "fatal app error" email
* Add fallback to `Plugin.__str__` if `name` is empty
* Add lightweight `celeryApplyAsync` to `DB`
* Refactor `LazyPluginVM` to send "fatal app error" email
* Add tests
* Update email punctuation
* Update comment to be accurate
* Address feedback
* Add comment
* refactor tests to be more extensible
* Move elements-related code to separate file
* Copy over tests for element chain
* Handle `undefined` match group
As this test from Python indicates, the right-most match group may be
empty. In JavaScript the behavior differs from Python, and the match
group may be `undefined`.
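A minimal sketch of the JavaScript side of this, assuming a simplified
pattern: the regex and the `splitTagAndAttributes` helper below are
hypothetical and are not the actual element chain code. An optional
capture group that does not participate in the match comes back as
`undefined`, so it needs an explicit fallback.

```typescript
// Hypothetical helper: split "tag:attributes" where the attribute part is optional.
function splitTagAndAttributes(element: string): { tag: string; attributes: string } {
    // Group 2 is optional; when it does not participate, JavaScript sets it to undefined.
    const match = /^([a-z0-9]+)(:.*)?$/i.exec(element)
    if (match === null) {
        throw new Error(`Unparseable element: ${element}`)
    }
    // Normalize the missing group to an empty string before using it.
    return { tag: match[1], attributes: match[2] ?? '' }
}
```

Without the `?? ''` fallback, later string operations on the second
group would throw or produce the literal text "undefined".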
* Update import
* Fix http capabilities
* Handle string properties in plugin-server convertToIngestionEvent
* Update typing
* fix: Add multi-server process event test
This got accidentally yeeted from my previous PR. Shame!
* Improve tests
* Update test to reflect reality
* refactor: Eliminate the `KAFKA_ENABLED` setting
* Remove dead code
* Consolidate plugin server test scripts and CI
* Fix CI command
* Remove Celery queues
* Rearrange test directories
* Update import paths
* Remove test logging
* Support unsetting person properties in the plugin server
* Update persons via clickhouse
Note that this doesn't support deleting person properties yet due to
limitations on the ClickHouse side.
https://github.com/PostHog/posthog/issues/9856
* Separate endpoint for deleting user properties
* Tests for person property updating/unsetting
* Fix eslint
* Fix a previous test
* Fixup
* Don't update Postgres for property updates
* Add toast
* Add PLUGIN_SERVER_MODE
* Make capabilities dependent on PLUGIN_SERVER_MODE
* Subscribe to kafka-events topic
* runAsyncHandlersEventPipeline
* Test fixup: fix typing error
* Test fixup: flush right after queueing message
* Parse clickhouse event correctly
* Different consumer group ids for kafka queue based on mode
* Set different prompts for different modes
* Capability for http, disabled in tests
* Elements chain handling in async ingestion
* Test for runner.test.ts
* Update a snapshot
* Update plugin-server/README.md
Co-authored-by: Yakko Majuri <38760734+yakkomajuri@users.noreply.github.com>
* Solve review-related issues
* Fix a test
* Fix imports
* Capabilities test fix
* Update tests
* Fix(plugin-server): run all plugin-server tests
* Broken test fixes
* Fix integration test suite issue
* Remove dead vars
* Fixup package.json
* Fix a test
* feat(plugin-server-split): only setup plugins the server has a use for
* fix tests
* Update plugin-server/tests/worker/vm/capabilities.test.ts
Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com>
* address review
* SiteUrlManager
* Update site_url
* Update imports
* Use SiteUrlManager with webhooks
* Remove most siteUrl use cases from preIngestionEvent
* Run onEvent even if action lookup fails
* Add tests
* Re-add some null guards
* Add missing test file
* Missing await
* refactor(plugin-server): move non postgres ingestion related tests out of postgres folder
* revert plugin logs move
* paths
Co-authored-by: Michael Matloka <dev@twixes.com>
* refactor(plugin-server): Require `clickhouse`, `kafka`, `kafkaProducer`
* Update Redis status log
* Don't use `kafkaProducer` as a proxy for `KAFKA_ENABLED`
* Delete file that's not in `master`
* Fix bad text replacement
* Fix more bad text replacement
* Apply suggestions from code review
Co-authored-by: Tiina Turban <tiina303@gmail.com>
* Fix imports
* Fix passing config in tests
* Start refactoring event pipeline
* Add some initial metrics
* Handle DLQ error messages in pipeline runner
* Add public functions for the pipeline
* Tests for runner.ts
* Tests for every step in event pipeline
* yeet some now-unneeded worker code
* Add timeoutGuard
* Emit to DLQ from buffer
* Move some tests to a separate file
* fix internal metrics
* Refactor method location, WIP
* Fix code determining if user is a recent person or not
* Update tests to deal with new pipeline
* Rename methods for consistency
* Remove now-dead test
* Update process-event.test.ts
* Update DLQ test
* Ignore test under yeet
* Remove mocked
* Remove dead code
* Update naming
* fix: make kafka health check timeout test reliable
This was reported to be flaky. I believe the cause is that pausing is
not immediate, which leads to timing-related issues.
* Try again for more reliability