* everything except plugin server and sync_available_features
* sync_available_features_done, some plugin_server done?
* and a tiny bit more
* linting
* try to fix some tests
* more test fixes
* clean up typos
* weed-whacking bugs
* more test shenanigans
* fix plugin server
* actually fix plugin server test?
* still fixing tests
* another attempt
* some pr feedback
* small fix
* fix database query accessor
* fix functional tests
* fix tests
* Update query snapshots
* Update query snapshots
* Update query snapshots
* update some comments and function names
* fix plugin server test
* Update query snapshots
* Update query snapshots
* Update query snapshots
---------
Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
* split out InternalPerson and Person types
* disable old backfill code
* add lazy person env var setting
* lazily create person/distinct_id rows
* remove backfill.ts
* drop properties after person lookup
* remove leftover backfill bit
* chore(plugin-server): remove piscina workers
Using Piscina workers introduces complexity that we would rather
avoid. It does offer the ability to scale work across multiple CPUs,
but we can achieve this via starting multiple processes instead. It may
also provide some protection from deadlocking the worker process, which
I believe Piscina will handle by killing worker processes and
respawning, but we have K8s liveness checks that will also handle this.
This should simplify 1. prom metrics exporting, and 2. using
node-rdkafka.
* remove piscina from package.json
* use createWorker
* wip
* wip
* wip
* wip
* fix export test
* wip
* wip
* fix server stop tests
* wip
* mock process.exit everywhere
* fix health server tests
* Remove collectMetrics
* wip
* team-manager: expire negative lookups after 5 minutes, improve docs
* populateTeamDataStep: don't drop token, keep team_id from capture if present, report results
* ingestEvent: run all analytic events through runLightweightCaptureEndpointEventPipeline
* continue accepting events with no token but a team_id
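A minimal sketch of the negative-lookup expiry mentioned in the team-manager entry above, assuming a simple in-memory map keyed by token. `TeamLookupCache` and `NEGATIVE_TTL_MS` are hypothetical names, not the actual team-manager API; only the 5 minute expiry comes from the commit title.

```typescript
// Hypothetical sketch; not the real team-manager code.
const NEGATIVE_TTL_MS = 5 * 60 * 1000 // 5 minutes, per the commit title

type CacheEntry<T> = { value: T | null; expiresAt: number | null }

class TeamLookupCache<T> {
    private entries = new Map<string, CacheEntry<T>>()
    private now: () => number

    // The clock is injectable so expiry can be exercised without waiting.
    constructor(now: () => number = () => Date.now()) {
        this.now = now
    }

    // Store a lookup result. A null value records a miss ("no such team").
    set(token: string, value: T | null): void {
        // Assumption: hits never expire in this sketch; only misses do.
        const expiresAt = value === null ? this.now() + NEGATIVE_TTL_MS : null
        this.entries.set(token, { value, expiresAt })
    }

    // Returns undefined when nothing usable is cached, so callers can tell
    // "not cached" apart from a cached miss (null).
    get(token: string): T | null | undefined {
        const entry = this.entries.get(token)
        if (entry === undefined) {
            return undefined
        }
        if (entry.expiresAt !== null && this.now() >= entry.expiresAt) {
            this.entries.delete(token)
            return undefined
        }
        return entry.value
    }
}
```

Expiring only the misses means a token that starts to exist later is picked up within the TTL, while known teams are not looked up again.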
* fix: for empty group props
We were failing if the props were empty.
* refactor: make property definitions db dependencies explicit
We were relying on implicit details set up via `resetTestDatabase`, which
makes it hard to reason about what's going on. It also uses hard-coded
IDs, which limits our ability to isolate tests properly. Better isolation
would help both for writing tests with confidence that they are correct,
and for adding, for example, parallelism.
* add test
This removes the timekeeper library and uses jest fake timers instead.
This also creates the hub once and reuses it for all tests, which is
faster than creating a new hub for each test.
* feat: add the performance events clickhouse schema
* add support for replication being off
* move and refactor performance events sql file
* with the same kafka columns as other tables
* don't account for replication settings in migration
* update snapshots
* kafka table doesn't have explicit kafka columns
Co-authored-by: Ben White <ben@posthog.com>
* add support for token field in kafka message
* formPipelineEvent
* rename pipeline files according to new order
* wip team_id and anonymize ips
* conditional handlers and tests
* some plugin server fixes
* fix capture bug
* fix
* more fixes
* fix capture tests
* pipeline update
* fix + investigate database resets
* fix import order
* testing and typing updates
* add test for capture endpoint
* testing
* python typing
* plugin server test
* functional test
* fix test
* another fix
* make sure no team ids clash in tests
* fix
* add more metrics and logs
* cache nulls
* updates
* add more metrics
* chore(plugin-server): use Kafka to buffer app jobs requests
To remove the dependency on the Graphile Worker database from things that
may be requesting app job runs, we push the jobs to a Kafka topic.
* chore: use KAFKA_JOBS instead of string literal `'jobs'`
* chore: rename startJobsBufferConsumer -> startJobsConsumer
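A rough sketch of the idea in the Kafka jobs entries above: requesters write a serialized job to a topic instead of touching the Graphile Worker database, and a consumer picks it up later. `JobProducer`, `AppJob`, and `enqueueJob` are hypothetical, and the value of `KAFKA_JOBS` is assumed from the string literal it replaced; this is not the plugin-server's actual producer API.

```typescript
// Hypothetical sketch; only the KAFKA_JOBS constant name comes from the commits.
const KAFKA_JOBS = 'jobs' // assumed value, based on the literal it replaced

interface JobMessage {
    topic: string
    key: string
    value: string
}

// Minimal stand-in for whatever Kafka producer is in use.
interface JobProducer {
    send(message: JobMessage): Promise<void>
}

interface AppJob {
    pluginConfigId: number
    type: string
    payload: Record<string, unknown>
}

// Serialize the job and hand it to Kafka. The requester needs no access to
// the jobs database; a separate consumer is expected to schedule the job.
async function enqueueJob(producer: JobProducer, job: AppJob): Promise<void> {
    await producer.send({
        topic: KAFKA_JOBS,
        key: String(job.pluginConfigId),
        value: JSON.stringify(job),
    })
}
```

Keying by plugin config ID is an assumption made here so that jobs for one config land on the same partition.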
* avoid checking eventId
* fix lint
* fix producer wrapper tests
* fix retries test
* handle offset sync
* wip
* wip
* remove exports
* do better
* use Producer not wrapper
* reset db
* mock once
* Add test for raising to the consumer
* Update plugin-server/tests/main/ingestion-queues/run-async-handlers-event-pipeline.test.ts
Co-authored-by: Yakko Majuri <38760734+yakkomajuri@users.noreply.github.com>
* and in the darkness bind them
* fix tests
* don't forget the name update!
* rename DependencyError to DependencyUnavailable
* separate dlq
* update comment
Co-authored-by: Yakko Majuri <38760734+yakkomajuri@users.noreply.github.com>
* chore(plugin-server): use DELETE instead of TRUNCATE
Truncate seems a little slow. Other options to consider:
1. PostgreSQL fsync settings in tests
2. using tmpfs for "persistence"
3. use transaction/rollback: not totally sure we'd be able to do this
in our tests but may be worth a try.
* wip
* wip
* set fsync=off
* Delete all tables in current schema
* don't bother with fsync=off
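A small sketch of the DELETE-based reset described above, assuming the list of table names has already been read from the current schema. `quoteIdent` and `buildResetStatements` are hypothetical helpers, not functions from the test suite.

```typescript
// Hypothetical sketch; helper names are illustrative only.

// Quote a PostgreSQL identifier by doubling any embedded double quotes.
function quoteIdent(name: string): string {
    return '"' + name.replace(/"/g, '""') + '"'
}

// Build one DELETE statement per table, in the order given.
function buildResetStatements(tables: string[]): string[] {
    return tables.map((table) => `DELETE FROM ${quoteIdent(table)}`)
}
```

Unlike a single TRUNCATE with CASCADE, plain DELETE statements are subject to foreign key checks, so the caller would need to order the tables accordingly or rely on `ON DELETE CASCADE` constraints.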
* refactor(plugin-server): split out plugin server functionality
To get better isolation we want to allow specific functionality to run
in separate pods. We already have the ingestion / async split, but there
are further divides we can make e.g. the cron style scheduler for plugin
server `runEveryMinute` tasks.
* split jobs as well
* Also start Kafka consumers on processAsyncHandlers
* add status for async
* add runEveryMinute test
* avoid fake timers, just accept slower tests
* make e2e concurrent
* chore: also test ingestion/async split
* increase timeouts
* increase timeouts
* lint
* Add functional tests dir
* fix
* fix
* hack
* hack
* fix
* fix
* fix
* wip
* wip
* wip
* wip
* wip
* fix
* remove concurrency
* remove async-worker mode
* add async-handlers
* wip
* add modes to overrideWithEnv validation
* fix: async-handlers -> exports
* update comment
* tests(plugin-server): run same tests for single and multi process modes
Previously we were running different tests, now we run the same.
* exclude async on ingestion capability
* fix typing
* do not test for onSnapshot
* fix autocapture test
* refactor(e2e): make e2e tests 4x faster
I'm a little scared to make changes to the pluginsServer without having
some way to test that is close to production. I started changing some of
the internals in https://github.com/PostHog/posthog/pull/12191 but
started to see test failures for internals that I'm not confident in
handling without having some higher level tests testing the
pluginsServer functionality as a whole, so I started looking at the e2e
tests which are very slow and look like they don't have great coverage.
Rather than restarting the pluginsServer each time, we can use the
feature of partitioning by teamId to run multiple tests against the same
server.
To do this I've added a few helper functions for pulling out of
ClickHouse which enforce that you include a filter on teamId. It's a
little more verbose but also hopefully serves as good documentation for
how e.g. plugins work and are loaded.
I've also avoided reaching into the internals of the pluginsServer as
much as possible, only doing so to reload the plugins. Ideally we'd also
handle the reload via the PubSub which would allow us to, for example,
run the e2e tests against a separate pluginsServer process thereby
getting more confidence, but that's a stretch and maybe something we can
do without. This would, however, make it easy to, for instance:
1. make big refactors without fear of breaking things
2. swap out, e.g., the anonymous consumer for something written in
bash, or maybe something not quite as absurd.
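To illustrate the helper idea described above, here is a minimal sketch of a query builder that cannot be called without a teamId, so tests sharing one server only ever see their own rows. `buildEventsQuery` and the table and column names are assumptions for illustration, not the helpers added in this change.

```typescript
// Hypothetical sketch; table and column names are assumed.
function buildEventsQuery(teamId: number, event?: string): string {
    // teamId is a required parameter, so every query is scoped to one team.
    if (!Number.isInteger(teamId)) {
        throw new Error('teamId must be an integer')
    }
    const conditions = [`team_id = ${teamId}`]
    if (event !== undefined) {
        // Escape backslashes and single quotes for a ClickHouse string literal.
        const escaped = event.replace(/\\/g, '\\\\').replace(/'/g, "\\'")
        conditions.push(`event = '${escaped}'`)
    }
    return `SELECT * FROM events WHERE ${conditions.join(' AND ')} ORDER BY timestamp`
}
```

A test would create its own team, capture events against it, and then poll with `buildEventsQuery(teamId)` until the expected rows appear.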
* enable buffer, remove random teamId
* Add new app_metrics feature available on scale and enterprise
* chore(ingestion): cache available_features for a longer period in OrganizationManager
This will come in handy for app metrics and is generally a performance win
* Add service to track app metrics
* refactor(historical-exports): Move retry limit handling code to same place as other error handling
* Track app metrics in processEvent/onEvent/exportEvents and historical exports
* Add missing app-metrics file
* Add missing __init__.py module
* Use correct topic + columns for app metrics
* Add a placeholder schema
* Set timestamp correctly
* Fix a TypeError in organization-manager.ts
* Schema fixup
* Add test showing read-own-writes logic
* Remove unneeded TODO
* Add missing constant
* Simplify flushing logic
* Stabilize VM tests
* Use correct sharding key
* Revert hooks changes
* site.ts
* Update snapshots
* fix file in test archive
* Update snapshots
* more renames
* plugin source file updated at
* fix test
* last renames
* add null
* blank
* remove what we don't need
* get test to pass
* use new posthog-js
* new version
Co-authored-by: mariusandra <mariusandra@users.noreply.github.com>
* Add an is_system flag to activity logs
* Allow writing activity logs from within plugin-server
* Make changes object non-required
* Render system user information
* Log when export finishes or fails in plugin activity log
* Update activityLogLogic.plugin.test.tsx
* support adding custom files
* compile web.ts
* add injected code into /decide
* transpile better
* add id and payload stub to injected code
* skip backend if web only
* more web source
* add payload to default demo
* values list and inline for
* revert
* simpler query
* reload plugin if source changed
* pass on config that has "web":true
* add a bootloader for scripts larger than 1kb
* rename payload to config
* access posthog also in injected script
* cleanup
* pass meta object to inject
* split web_js code
* List
* add a web token for plugin configs
* test for web.ts
* use WEB_APP_INJECTION env to enable/disable injection
* fix types
* remove instance setting
* add inject_web_apps on team
* track installed web apps on team
* refresh from db
* enabling web apps makes an extra query
* null true
* clean imports
* null check
* fix migration
* fix noise in tests
* test web app injection
* update Team without going through a signal loop
* update plugin test
* update plugin test
* typo
* Update snapshots
* typing
* inject only via external scripts
* enable injection in the app
* update posthog-js
* blank source files when updating
* url dataclass
* comment
* add and remove files
* not users
* only show button to add files if there are files to add
* update in bulk
* TYPE_CHECKING
* abstract transpilation just a bit
* feedback and bug
* we're down under
* errors if failed
* raise on 404
* clean up more
* refactor
* add updated_at to list
* no need to check code equality
* remove australia ip override
* test
* remove support for random files
Co-authored-by: mariusandra <mariusandra@users.noreply.github.com>
* Initial schema for new ingestion_warnings table
* Update and test ingestion_warnings API endpoint with new table
* Add captureIngestionWarning function
* Update variable name
* Reformat with black
* Remove ver
* Include partition
* Update test_schema snapshots
* Solve weird mypy error
* Update docs around function
* Experimental tracing support for plugin server
* Add tag to postgresTransaction
* Track event pipeline steps as separate spans
* Track kafka queueMessage?
* Tracing for processEvent, onEvent, onSnapshot
* plugin.runTask
* Move sentry code
* Make tracing rate configurable
This is currently one of our slowest queries. The query itself is fine
but the amount of data returned is causing issues.
The `error` column constitutes >80% of the data stored in the table, so by
removing it from the query we should see significant speedups.
pganalyze page: https://app.pganalyze.com/databases/-675880137/queries/5301003665?t=7d
* chore(plugin-server): Consume from buffer topic
* Refactor `posthog` extension for buffering
* Properly form `bufferEvent` and don't throw error
* Add E2E test
* Test buffer more end-to-end and properly
* Put buffer-enabled test in a separate file
* Update each-batch.test.ts
* Test that the event goes through the buffer topic
* Fix formatting
* Refactor out `spyOnKafka()`
* Ensure reliability batching-wise
* Send heartbeats every so often
* Make test less flaky
* Commit offsets if necessary before sleep too
* Update tests
* Use seek-based mechanism (with KafkaJS 2.0.2)
* Add comment to clarify seeking
* Update each-batch.test.ts
* Make minor improvements