| `PLUGIN_SERVER_MODE=ingestion` | This plugin server instance only runs ingestion (1) |
| `PLUGIN_SERVER_MODE=async` | This plugin server instance processes all async tasks (2-4). Note that async plugin tasks are triggered based on the ClickHouse events topic |
If `PLUGIN_SERVER_MODE` is not set, the plugin server will execute all of its tasks (1-4).
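
The mode can be read as selecting a subset of the numbered tasks. The following is only a minimal sketch of that idea, not the plugin server's actual code: `tasksForMode`, `parseMode`, and the use of plain task numbers are hypothetical, and rejecting unknown values is an assumption.

```typescript
// Hypothetical model of PLUGIN_SERVER_MODE; the real server may differ.
type PluginServerMode = 'ingestion' | 'async' | null

// Maps a mode to the numbered tasks (1-4) that the instance would run.
function tasksForMode(mode: PluginServerMode): number[] {
    switch (mode) {
        case 'ingestion':
            return [1] // ingestion only
        case 'async':
            return [2, 3, 4] // all async tasks
        default:
            return [1, 2, 3, 4] // mode not set: run everything
    }
}

// Turns the raw environment value into a mode.
// Assumption: unknown values are rejected rather than silently ignored.
function parseMode(raw: string | undefined): PluginServerMode {
    if (raw === undefined || raw === '') {
        return null
    }
    if (raw === 'ingestion' || raw === 'async') {
        return raw
    }
    throw new Error(`Unknown PLUGIN_SERVER_MODE: ${raw}`)
}
```

With this model, `tasksForMode(parseMode(undefined))` covers all four tasks, while the two explicit modes split them between instances.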
| KAFKA_TRUSTED_CERT_B64 | Kafka trusted CA in Base64 | `null` |
| KAFKA_PRODUCER_MAX_QUEUE_SIZE | Kafka producer batch max size before flushing | `20` |
| KAFKA_FLUSH_FREQUENCY_MS | Kafka producer batch max duration before flushing | `500` |
| KAFKA_MAX_MESSAGE_BATCH_SIZE | Kafka producer batch max size in bytes before flushing | `900000` |
| LOG_LEVEL | minimum log level | `'info'` |
| SENTRY_DSN | Sentry ingestion URL | `null` |
| DISABLE_MMDB | whether to disable MMDB IP location capabilities | `false` |
| INTERNAL_MMDB_SERVER_PORT | port of the internal server used for IP location (0 means random) | `0` |
| DISTINCT_ID_LRU_SIZE | size of persons distinct ID LRU cache | `10000` |
| PISCINA_USE_ATOMICS | corresponds to the piscina useAtomics config option (https://github.com/piscinajs/piscina#constructor-new-piscinaoptions) | `true` |
| PISCINA_ATOMICS_TIMEOUT | (advanced) corresponds to the length of time (in ms) a piscina worker should block for when looking for tasks - instances with high volumes (100+ events/sec) might benefit from setting this to a lower value | `5000` |
| HEALTHCHECK_MAX_STALE_SECONDS | maximum number of seconds the plugin server can go without ingesting events before the healthcheck fails | `7200` |
| KAFKA_PARTITIONS_CONSUMED_CONCURRENTLY | (advanced) how many Kafka partitions the plugin server should consume from concurrently | `1` |
| PLUGIN_SERVER_MODE | (advanced) see alternative modes section | `null` |
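
The three Kafka producer settings above each describe a limit that triggers a flush. As a rough sketch of how such limits might combine, the snippet below flushes as soon as any one of them is reached. The interface and function names are hypothetical, and both the "any limit" rule and the inclusive comparisons are assumptions rather than the producer's documented behavior.

```typescript
// Hypothetical container for the three flush-related settings.
interface ProducerBatchLimits {
    maxQueueSize: number // KAFKA_PRODUCER_MAX_QUEUE_SIZE
    flushFrequencyMs: number // KAFKA_FLUSH_FREQUENCY_MS
    maxBatchBytes: number // KAFKA_MAX_MESSAGE_BATCH_SIZE
}

// Defaults copied from the table above.
const defaultLimits: ProducerBatchLimits = {
    maxQueueSize: 20,
    flushFrequencyMs: 500,
    maxBatchBytes: 900000,
}

// Hypothetical snapshot of the batch that is currently being accumulated.
interface BatchState {
    messageCount: number
    totalBytes: number
    msSinceLastFlush: number
}

// Assumption: the batch is flushed as soon as any single limit is reached.
function shouldFlush(batch: BatchState, limits: ProducerBatchLimits = defaultLimits): boolean {
    return (
        batch.messageCount >= limits.maxQueueSize ||
        batch.totalBytes >= limits.maxBatchBytes ||
        batch.msSinceLastFlush >= limits.flushFrequencyMs
    )
}
```

Under these assumptions, a low-traffic instance flushes on the timer, while a busy one hits the count or byte limit first.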
Just bump up `version` in `package.json` on the main branch and the new version will be published automatically,
with a matching PR in the [main PostHog repo](https://github.com/posthog/posthog) created.
It's advised to use a `bump patch/minor/major` label on PRs - that way the above will be done automatically when the PR is merged.
Courtesy of GitHub Actions.
## Walkthrough
The story begins with `pluginServer.ts -> startPluginServer`, which is the main thread of the plugin server.
This main thread spawns `WORKER_CONCURRENCY` worker threads, managed using Piscina. Each worker thread runs `TASKS_PER_WORKER` tasks ([concurrentTasksPerWorker](https://github.com/piscinajs/piscina#constructor-new-piscinaoptions)).
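
To make the relationship between the two settings concrete, here is a small sketch. `concurrentTasksPerWorker`, `minThreads`, and `maxThreads` are real Piscina constructor options, but pinning both thread bounds to `WORKER_CONCURRENCY` is an assumption, and the helper names are hypothetical.

```typescript
// Hypothetical subset of the plugin server config.
interface ThreadConfig {
    WORKER_CONCURRENCY: number
    TASKS_PER_WORKER: number
}

// Upper bound on tasks in flight: threads multiplied by tasks per thread.
function maxConcurrentTasks(config: ThreadConfig): number {
    return config.WORKER_CONCURRENCY * config.TASKS_PER_WORKER
}

// Builds a plain options object using Piscina's option names.
// Assumption: the pool is fixed-size, so min and max threads are equal.
function toPiscinaOptions(config: ThreadConfig): {
    minThreads: number
    maxThreads: number
    concurrentTasksPerWorker: number
} {
    return {
        minThreads: config.WORKER_CONCURRENCY,
        maxThreads: config.WORKER_CONCURRENCY,
        concurrentTasksPerWorker: config.TASKS_PER_WORKER,
    }
}
```

For example, 4 worker threads with 10 tasks each would allow at most 40 tasks to be in flight at once.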
### Main thread
Let's talk about the main thread first. This has:
1. `pubSub` – Redis-powered pub-sub mechanism for reloading plugins whenever a message is published by the main PostHog app.
1. `hub` – Handler of connections to required DBs and queues (ClickHouse, Kafka, Postgres, Redis), holds loaded plugins.
    Created via `hub.ts -> createHub`. Every thread has its own instance.
1. `piscina` – This used to be a manager of tasks that were delegated to threads. It is now a shim over normal JS function calls that will be removed in the future.
1. `pluginScheduleControl` – Controller of scheduled jobs. Responsible for adding Piscina tasks for scheduled jobs when the time comes. The schedule information makes it into the controller when plugin VMs are created.
    Scheduled tasks are controlled with [Redlock](https://redis.io/topics/distlock) (Redis-based distributed lock), and run on only one plugin server instance in the entire cluster.
1. `jobQueueConsumer` – The internal job queue consumer. This enables retries and scheduling one-off jobs in the future. That is the difference between `pluginScheduleControl` and this internal `jobQueue`: `pluginScheduleControl` is triggered via `runEveryMinute` and `runEveryHour` tasks, whereas `jobQueueConsumer` deals with `meta.jobs.doX(event).runAt(new Date())`.
    Jobs are enqueued by `job-queue-manager.ts`, which is backed by Postgres-based [Graphile-worker](https://github.com/graphile/worker) (`graphile-queue.ts`).
1. `queue` – Event ingestion queue. This is a Celery (backed by Redis) or Kafka queue, depending on the setup (EE/Cloud is Kafka due to high volume). Events are consumed by this `queue` and sent off to the Piscina workers (`src/main/ingestion-queues/queue.ts -> ingestEvent`). Since all of the actual ingestion happens inside worker threads, you'll find the specific ingestion code there (`src/worker/ingestion/ingest-event.ts`). There the data is saved into Postgres (and ClickHouse via Kafka on EE/Cloud).
    It's also a good idea to see the producer side of this ingestion queue, which comes from `posthog/posthog/api/capture.py`. The plugin server gets the `process_event_with_plugins` Celery task from there, in the Postgres pipeline. The ClickHouse via Kafka pipeline gets the data by way of Kafka topic `events_plugin_ingestion`.
1. `mmdbServer` – TCP server, which works as an interface between the GeoIP MMDB data reader located in main thread memory and plugins run in worker threads of the same plugin server instance. This way the GeoIP reader is only loaded in one thread and can be used in all. Additionally, this mechanism ensures that `mmdbServer` is ready before ingestion is started (database downloaded from [http-mmdb](https://github.com/PostHog/http-mmdb) and read), and keeps the database up to date in the background.
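
The difference between recurring scheduled tasks and one-off jobs can be illustrated with an in-memory stand-in for the job queue. This is only a sketch: `MockJobQueue`, `job`, and `takeDue` are hypothetical names that mimic the shape of `meta.jobs.doX(event).runAt(new Date())`, and nothing here reflects how Graphile-worker actually stores or retries jobs.

```typescript
// Hypothetical record of a one-off job waiting to run.
interface EnqueuedJob {
    type: string
    payload: unknown
    runAt: Date
}

// In-memory stand-in for the job queue; not the real plugin server API.
class MockJobQueue {
    readonly jobs: EnqueuedJob[] = []

    // Mirrors the call shape meta.jobs.doX(payload).runAt(date).
    job(type: string, payload: unknown): { runAt: (date: Date) => void } {
        return {
            runAt: (date: Date): void => {
                this.jobs.push({ type, payload, runAt: date })
            },
        }
    }

    // Removes and returns every job whose run time is at or before `now`.
    takeDue(now: Date): EnqueuedJob[] {
        const due = this.jobs.filter((j) => j.runAt.getTime() <= now.getTime())
        for (const j of due) {
            this.jobs.splice(this.jobs.indexOf(j), 1)
        }
        return due
    }
}

// Two one-off jobs with different run times (milliseconds since epoch).
const jobQueue = new MockJobQueue()
jobQueue.job('doX', { event: 'first' }).runAt(new Date(1000))
jobQueue.job('doX', { event: 'second' }).runAt(new Date(5000))

// A consumer polling at t=2000 picks up only the first job.
const dueJobs = jobQueue.takeDue(new Date(2000))
```

A recurring task such as `runEveryMinute` has no per-call payload or run time. It is simply invoked on a fixed schedule, which is why it is handled by `pluginScheduleControl` rather than by the job queue.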
> An `organization_id` is tied to a _company_ and its _installed plugins_, a `team_id` is tied to a _project_ and its _plugin configs_ (enabled/disabled+extra config).