add env var for number of nginx unit worker processes
I suspect that with ASGI we'll be better off with 1 worker process instead
of 4 - I'd like us to be able to test this per deployment via an
env var
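A minimal sketch of how this could work, assuming a hypothetical `UNIT_WORKER_PROCESSES` variable read by a startup helper that writes the Unit config (the real variable name and config layout may differ):

```python
import json
import os

# Sketch: template the Nginx Unit config with a worker-process count taken
# from the environment, defaulting to the current value of 4.
worker_processes = int(os.environ.get("UNIT_WORKER_PROCESSES", "4"))

unit_config = {
    "applications": {
        "posthog": {
            "type": "python",
            "processes": worker_processes,  # 1 vs 4 is what we want to compare per deployment
            "module": "posthog.asgi",
        }
    }
}

# The official Unit image loads any JSON dropped into /docker-entrypoint.d/.
with open("/docker-entrypoint.d/unit.json", "w") as f:
    json.dump(unit_config, f)
```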
* Revert "revert(insights): HogQL calculation of saved legacy insights v3 (#21778)"
This reverts commit c0be1d1412.
* Move HogQL in insight serialization to its own flag
* fix(insights): HogQL calculation of saved legacy insights v2
This reverts commit a6314c6bb7.
* Only use cached results in `process_query` for insight serializer
* Fix type of results
* Rename `RecalculationMode` to `ExecutionMode` (see the sketch after this list)
* Fix typing more
* Properly support dashboard filters
* Hacky fix for schema.py
* Don't test legacy `generate_insight_cache_key` with `query`
* Fix importing & typing
* Fix typo
* Update test_query_runner.py
* Account for property filter groups in dashboard filters
* Do return stale result in CACHE_ONLY case
* Fix `execute_hogql_query` espionage
Wow, this was a pain to figure out. It only showed up in CI, because the trigger was `TestCohort::test_creating_update_and_calculating_with_new_cohort_query` running prior to `TestInsight::test_insight_refreshing_query` - had to use trial and error.
* Fix typing even more
* Don't require `pnpm` for `schema:build:python`
Matters in CI.
* Fix `schema:build:python`
* Fix sed usage
* Move `schema:build:python` to a bash script
* Validate cache properly
Clarifies the `cached_response.is_cached = True` situation.
* Fix Python formatting
* Update UI snapshots for `webkit` (2)
* Add test to ensure /query/ and /insights/ use the same cache
* Update mypy-baseline.txt
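The execution-mode plumbing above could look roughly like this sketch (the enum members and the `get_cached_response` helper are illustrative, not the exact PostHog code):

```python
from enum import Enum


class ExecutionMode(Enum):
    # Renamed from `RecalculationMode`: how a query runner may obtain results.
    CALCULATE_BLOCKING_ALWAYS = "force_blocking"
    RECENT_CACHE_CALCULATE_IF_STALE = "blocking"
    CACHE_ONLY_NEVER_CALCULATE = "cache_only"


def run(query_runner, execution_mode: ExecutionMode):
    cached = query_runner.get_cached_response()  # illustrative helper
    if execution_mode == ExecutionMode.CACHE_ONLY_NEVER_CALCULATE:
        # The insight serializer path: return the cached result even if it
        # is stale, and never trigger a recalculation.
        return cached
    if cached is not None and not cached.is_stale:
        cached.is_cached = True  # clarify that no fresh calculation happened
        return cached
    return query_runner.calculate()
```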
---------
Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
allow asgi/wsgi to be configurable by env var
This will let us roll out ASGI separately across services, as we had
issues with our recordings capture pods on ASGI
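A sketch of the toggle, assuming a hypothetical `SERVER_GATEWAY_INTERFACE` variable consulted from a gunicorn config file (actual names may differ):

```python
# gunicorn.config.py (sketch): choose the worker class from the environment
# so each service can opt into ASGI independently, or fall back to WSGI.
import os

if os.environ.get("SERVER_GATEWAY_INTERFACE", "wsgi") == "asgi":
    worker_class = "uvicorn.workers.UvicornWorker"
    wsgi_app = "posthog.asgi:application"
else:
    worker_class = "sync"
    wsgi_app = "posthog.wsgi:application"
```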
* chore: tune up the hobby deploy testing
* quick fix
* put everything in the class
* include release tag in name
* split this out into stages for GA
* test throwing ci off
* fix
* test destroy
* check env
* exit if things don't work out
* debug
* somewhat important to create the dns record here hah
* record name as well
* maybe?
* update user_data
* set dns ttl to 30 sec
* silly dns mistake
* correct DNS error
* make migration
* general flow
* abstract shared methods
* generate input
* remove postgres migration
* generate embedding strings
* remove random file
* Update query snapshots
* Update query snapshots
* feat: create periodic replay embedding
* first sketch of table
* batch and flush embeddings (see the sketch after this list)
* add default to timestamp generation
* fetch recordings query
* save first embeddings to CH
* dump session metadata into tokens
* fix lint
* brain dump to help the future traveller
* prom timing instead
* fix input concatenation
* add an e :/
* obey mypy
* some time limits to reduce what we query
* a little fiddling to get it to run locally
* paging and counting
* Update query snapshots
* Update query snapshots
* move the AI stuff to EE for now
* Update query snapshots
* kick off the task with interval from settings
* push embedding generation onto its own queue
* on a different queue
* EE to the max
* doh
* fix
* fangling
* Remove clashes so we can merge this into the other PR
* Remove clashes so we can merge this into the other PR
* start wiring up Celery task
* hmmm
* it's a chord
* wire up celery simple version
* rename
* why is worker failing
* Update .run/Celery.run.xml
* update embedding input to remove duplicates
* ttl on the table
* Revert "update embedding input to remove duplicates"
This reverts commit 9a09d9c9f0.
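A rough sketch of the batch-and-flush loop described above; the fetch wiring, the `session_replay_embeddings` table name, and its column layout are assumptions rather than the real schema:

```python
import datetime as dt

BATCH_SIZE = 100  # flush embeddings to ClickHouse in fixed-size batches


def embed_recordings(ch_client, openai_client, recordings) -> None:
    batch = []
    for recording in recordings:
        # Dump session metadata into the token stream the model embeds.
        text = " ".join(f"{k}={v}" for k, v in recording.metadata.items())
        response = openai_client.embeddings.create(
            model="text-embedding-3-small", input=text
        )
        batch.append(
            (recording.session_id, response.data[0].embedding, dt.datetime.utcnow())
        )
        if len(batch) >= BATCH_SIZE:
            _flush(ch_client, batch)
            batch = []
    if batch:
        _flush(ch_client, batch)


def _flush(ch_client, rows) -> None:
    # The table carries a TTL, so old embeddings age out automatically.
    ch_client.execute(
        "INSERT INTO session_replay_embeddings "
        "(session_id, embedding, generation_timestamp) VALUES",
        rows,
    )
```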
---------
Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Paul D'Ambra <paul@posthog.com>
* Add Celery queues env file with default queues (see the sketch after this list)
Reasoning:
We need to configure Celery workers in several places to consume
from a specific set of queues.
* Define some queues
* Upgrade pydantic and all related
* Upgrade mypy
* Add mypy-baseline
To update baseline when you fix something (only then!) use:
`[mypy cmd] | mypy-baseline sync`
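A sketch of the queue configuration, assuming a hypothetical `CELERY_WORKER_QUEUES` variable supplied by the env file (real names may differ):

```python
import os

from celery import Celery
from kombu import Queue

# Default queue set, overridable per deployment via the env file so every
# worker consumes from an explicitly configured set of queues.
queue_names = os.environ.get("CELERY_WORKER_QUEUES", "celery,email,stats").split(",")

app = Celery("posthog")
app.conf.task_default_queue = "celery"
app.conf.task_queues = [Queue(name) for name in queue_names]

# Workers are then started against the same variable, e.g.:
#   celery -A posthog worker -Q "$CELERY_WORKER_QUEUES"
```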
Add env var to skip migration checks
This will allow us to skip the checks in Kubernetes pods that use
initContainers, without affecting existing deploys or docker-compose
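A minimal sketch, assuming a hypothetical `SKIP_MIGRATION_CHECKS` flag (the real variable name isn't shown here):

```python
import os
import sys

from django.db import connection
from django.db.migrations.executor import MigrationExecutor


def check_migrations() -> None:
    # Kubernetes pods run migrations in an initContainer, so they can skip
    # this check; docker-compose and existing deploys keep the old behaviour.
    if os.environ.get("SKIP_MIGRATION_CHECKS", "false").lower() in ("true", "1"):
        return
    executor = MigrationExecutor(connection)
    if executor.migration_plan(executor.loader.graph.leaf_nodes()):
        sys.exit("Unapplied migrations detected; run them before starting.")
```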
Set keepalive to 60 on gunicorn
The default is 2 seconds, while the default for ALBs is 30 seconds.
This can cause a race condition where gunicorn closes the connection
just as the ALB sends a request, resulting in a 502.
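In gunicorn config terms the fix is a one-liner (sketch; the real entrypoint may pass this as a CLI flag instead):

```python
# gunicorn.config.py (sketch)
# The ALB's idle timeout defaults to 30s while gunicorn's `keepalive`
# defaults to 2s. Keeping gunicorn's value above the ALB's means the ALB,
# not gunicorn, closes idle connections, avoiding the 502-producing race.
keepalive = 60
```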
* Add docker image that uses nginx unit instead of gunicorn
🦄🔫
* Add unit build to CI
* Fix duplicate id
* try 3.11
* Only build for amd64
need Python 3.11 for the unit image on arm
Add kafka rack-aware injection on docker-server
Currently we only inject this in plugin-server (aka reading from Kafka).
Let's add it to docker-server as well (aka capture/writing to Kafka).
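A sketch of the producer-side injection, assuming librdkafka's `client.rack` property and a hypothetical `KAFKA_CLIENT_RACK` variable populated from the pod's availability zone:

```python
import os

from confluent_kafka import Producer

conf = {"bootstrap.servers": os.environ.get("KAFKA_HOSTS", "localhost:9092")}

# Mirror the rack awareness already injected in plugin-server: tell
# librdkafka which rack/AZ this client runs in.
rack = os.environ.get("KAFKA_CLIENT_RACK")
if rack:
    conf["client.rack"] = rack

producer = Producer(conf)
```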