mirror of https://github.com/PostHog/posthog.git synced 2024-11-22 08:40:03 +01:00
175 Commits

Author SHA1 Message Date
Harry Waye
9bf86629df
feat(async-migrations): add auto complete trivial migrations option (#11601)
* feat(async-migrations): add auto complete trivial migrations option

This change is to ensure that the `run_async_migrations --check` command
option is a light operation such that we can safely use this e.g. in an
init container to gate the starting of a K8s Pod.

Previous to this change, the command was also handling the
auto-completion of migrations that it deemed not to be required, via the
`is_required` migration function. Aside from this heaviness issue, it's
good to avoid having side effects from a check command.

One part of the code that I wasn't sure about is the version checking,
so any comments here would be much appreciated.

Note that `./bin/migrate` is the command we call from both ECS migration
task and the Helm chart migration job.

* update comment re versions

* wip

* wip

* wip

* update hobby

* rename to noop migrations
2022-09-01 14:43:09 +00:00
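The separation described in this commit message (a read-only check, with auto-completion of non-required migrations moved to its own step) could be sketched roughly as follows. Only `is_required` is named in the message; the `Migration` class and the `check` and `complete_noop_migrations` functions are hypothetical names for illustration, not the PostHog implementation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Migration:
    """Hypothetical stand-in for an async migration record."""
    name: str
    is_required: Callable[[], bool]
    completed: bool = False


def check(migrations: List[Migration]) -> bool:
    """Read-only check: True when every migration is completed. Mutates nothing."""
    return all(m.completed for m in migrations)


def complete_noop_migrations(migrations: List[Migration]) -> List[str]:
    """Separate, explicit step: mark migrations that are not required as completed."""
    completed_now = []
    for m in migrations:
        if not m.completed and not m.is_required():
            m.completed = True
            completed_now.append(m.name)
    return completed_now
```

Keeping `check` free of writes is what makes it cheap enough to run repeatedly, for example from an init container.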
Harry Waye
3813795d22
chore(gunicorn): increase thread count for workers (#11507)
* chore(gunicorn): increase thread count for workers

We currently run with 2 worker processes each with 4 threads. It seems
occasionally we see spikes in the number of pending requests within a
worker in the 20-25 region. I suspect this is due to 4 slow requests
blocking the thread pool.

I'd suggest that the majority of work is going to be IO bound, thus it's
hopefully going to not peg the CPU. If it does then it should end up
triggering the HPA and hopefully resolve itself :fingerscrossed:

There is however gzip running on events, which could be intensive
(suggest we offload this to a downstream at some point). If lots of
large requests come in this could be an issue. Some profiling here
wouldn't go amiss.

Note these are the same settings between web and event ingestion
workload. At some point we may want to split.

I've added a Dashboard for gunicorn worker stats
[here](https://github.com/PostHog/charts-clickhouse/pull/559) which we
can monitor to see the effects.

Aside: it would be wise to be able to specify these settings from the
chart itself such that we do not have to consider chart/posthog version
combos, and allow tweaking according to the deployment.

* reduce to 8 threads
2022-08-26 10:36:40 +00:00
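As a rough illustration of the settings discussed in this commit message, a gunicorn config file (gunicorn reads Python config modules) might look like the sketch below. The worker and thread counts come from the message; the `worker_class` value and the `max_concurrent_requests` helper are assumptions added for illustration and are not taken from the PostHog repository.

```python
# Hypothetical gunicorn config sketch; not the actual PostHog file.
# gunicorn picks up module-level names such as `workers` and `threads`.

workers = 2               # worker processes, per the commit message
threads = 8               # threads per worker ("reduce to 8 threads")
worker_class = "gthread"  # assumption: threaded worker class


def max_concurrent_requests(worker_count: int, thread_count: int) -> int:
    """Upper bound on requests being handled at once across all workers."""
    return worker_count * thread_count
```

With 2 workers and 8 threads at most 16 requests are in flight at once, compared with 8 under the earlier 4-thread setting, so four slow requests no longer exhaust a worker's thread pool.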
Marius Andra
7cf3f71215
feat(data-management): add custom events list (#11463)
* feat(data-management): add custom events list

* remove dead code

* fix test

* assert what matters

* this seems flakey, even locally, though the interface shows the right data locally... testing a timeout

* new script

* fix test

* remove frontend changes (PR incoming)

* describe meaning behind symbols
2022-08-25 11:00:34 +00:00
Harry Waye
5e0d8cb6dd
chore(plugin-server): do not use yarn to run prod plugin-server (#11434)
It looks like the plugin-server isn't shutting down cleanly, from
looking at the logs. They abruptly stop.

We have a trap to kill the yarn command on EXIT; however, yarn v1
doesn't propagate SIGTERM to subprocesses, hence node never receives it.

Separately it looks like the shutdown ends up being called multiple
times which results in a force shutdown. I'm not entirely sure what is
going on here but I'll leave that to another PR.
2022-08-23 14:03:13 +00:00
Harry Waye
8d38003aeb
fix(gunicorn): ensure gunicorn processes signals appropriately (#11417)
Prior to this the containing script was receiving the TERM signal, e.g.
from Kubernetes eviction. As a result, it was terminating the root
process without waiting on gunicorn.

We solve this by avoiding spawning a new process and rather have
gunicorn replace the current process.
2022-08-22 16:14:43 +00:00
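The fix itself lives in a shell script, where the usual idiom is `exec`. The same idea of replacing the current process instead of spawning a child can be sketched in Python with `os.execvp`. The application path `example.wsgi:application` and the function names are placeholders, not values from the PostHog repository.

```python
import os
from typing import List


def build_gunicorn_argv(bind: str = "0.0.0.0:8000") -> List[str]:
    """Build the gunicorn command line; the app path is a placeholder."""
    return ["gunicorn", "example.wsgi:application", "--bind", bind]


def main() -> None:
    argv = build_gunicorn_argv()
    # os.execvp replaces the current process image, so gunicorn keeps this
    # PID and receives SIGTERM directly. On success nothing after this runs.
    os.execvp(argv[0], argv)
```

Because no wrapper process remains, a TERM signal sent to the container's root process reaches gunicorn itself, which can then shut its workers down gracefully.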
Ellie Huxtable
912e258e12
chore: Do not set statsd prefix (#11391)
Do not set statsd prefix
2022-08-19 11:29:55 +00:00
Ellie Huxtable
7ef39f6402
chore: Enable statsd instrumenting for gunicorn (#11386)
* chore: Allow instrumentation of gunicorn with statsd (#11372)

* chore: Allow instrumentation of gunicorn with statsd

In order to ensure that gunicorn is performing optimally, it helps to
monitor it with statsd.

This change allows us to include the flags needed to send UDP packets to
a statsd instance.

Docs: https://docs.gunicorn.org/en/stable/instrumentation.html

* Update bin/docker-server

Co-authored-by: Harry Waye <harry@posthog.com>

* Update bin/docker-server

Co-authored-by: Harry Waye <harry@posthog.com>

Co-authored-by: Harry Waye <harry@posthog.com>

* Include the STATSD_PORT correctly

Co-authored-by: Harry Waye <harry@posthog.com>
2022-08-19 11:03:57 +01:00
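One way to express the statsd wiring in a gunicorn Python config file is sketched below. `statsd_host` is a gunicorn setting that takes a `host:port` address. The commit mentions `STATSD_PORT`; the `STATSD_HOST` variable name, the fallback to port 8125 (the conventional statsd port) and the `statsd_address` helper are assumptions for illustration.

```python
import os
from typing import Mapping, Optional


def statsd_address(env: Mapping[str, str]) -> Optional[str]:
    """Return "host:port" for statsd, or None when no host is configured."""
    host = env.get("STATSD_HOST")  # assumed variable name
    if not host:
        return None
    port = env.get("STATSD_PORT", "8125")  # 8125 is the conventional statsd port
    return f"{host}:{port}"


# gunicorn reads this module-level name; None leaves statsd reporting off.
statsd_host = statsd_address(os.environ)
```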
Harry Waye
eda2cd5ce7
revert: chore: Allow instrumentation of gunicorn with statsd (#11385)
Revert "chore: Allow instrumentation of gunicorn with statsd (#11372)"

This reverts commit 84ea166e9b.
2022-08-19 11:39:44 +03:00
Ellie Huxtable
84ea166e9b
chore: Allow instrumentation of gunicorn with statsd (#11372)
* chore: Allow instrumentation of gunicorn with statsd

In order to ensure that gunicorn is performing optimally, it helps to
monitor it with statsd.

This change allows us to include the flags needed to send UDP packets to
a statsd instance.

Docs: https://docs.gunicorn.org/en/stable/instrumentation.html

* Update bin/docker-server

Co-authored-by: Harry Waye <harry@posthog.com>

* Update bin/docker-server

Co-authored-by: Harry Waye <harry@posthog.com>

Co-authored-by: Harry Waye <harry@posthog.com>
2022-08-19 08:40:37 +01:00
Daniel
497f5f678c
fix: add persistent volumes to docker-compose-hobby (#11256)
* Add persistent volumes to docker-compose-hobby

Per the discussion in https://github.com/PostHog/posthog/issues/10792, implemented the "Kessel Fix" in less than a parsec.

* Add warning text to user prompts to avoid data loss

Following discussion with PH team, we wanted to give users the information needed to properly manage the data in their installation and avoid potential data loss.
2022-08-12 15:31:24 +01:00
Paul D'Ambra
0a6d99c0a6
feat: test a11y with Cypress (#11199)
* feat: test a11y with Cypress

* axe test more pages

* archive a11y violations on success too

* remove date from file path

* don't warn if no accessibility files to upload... they're not on all test jobs
2022-08-09 19:12:41 +01:00
Paul D'Ambra
721fd7cc4d
fix: correct order of test setup (#11198)
"deliberate" mistake missed in #11173
2022-08-09 11:20:02 +00:00
Paul D'Ambra
2268dd05e2
chore: skip cypress setup (#11173)
* chore: skip cypress setup

* turn options up to 11
2022-08-09 10:27:21 +01:00
Paul D'Ambra
afffa728a8
chore: keep cypress in dev dependencies (#11170)
* don't remove cypress after e2e tests

* run e2e test stages on the same ubuntu version
2022-08-05 14:12:30 +01:00
Harry Waye
a70b4b28c6
chore(web): add django-prometheus exposed on /_metrics (#11000)
* chore(web): add django-prometheus exposed on /_metrics

This exposes a number of metrics, see
97d5748664/documentation/exports.md
for details. It includes histogram of timings by viewname before and
after middleware.

I'm not particularly interested in these right now, but rather would
like to expose Kafka Producer metrics as per
https://github.com/PostHog/posthog/pull/10997

* Refactor to use gunicorn server hooks

* also add expose to dockerfile

* wip
2022-07-27 20:37:44 +01:00
James Greenhill
8e5d1da3aa
feat: Add GeoIP2 capability to Django app (for feature flags) (#10890)
* feat: add libmaxminddb0 as dependency. C library will speed things up significantly

* pin libmaxminddb to 1.5 for what's available from APK

* get geolite2 db during build

* add settings for geoip2 django contrib library

* black formatting

* consistently use share directory

* isort fixes

* remove GeoLite2-City.mmdb from git and add script to ./bin/start to download it if file does not exist

* remove GeoLite2-City.mmdb from git

* add doc for share directory explaining why it exists

* relative path for curl in build

* shared vs share consistency

* Update snapshots

* brotli decompress

* ..everywhere

Co-authored-by: Neil Kakkar <neilkakkar@gmail.com>
Co-authored-by: neilkakkar <neilkakkar@users.noreply.github.com>
2022-07-25 17:20:11 -07:00
Harry Waye
d7998cef30
Revert "chore(dev): use network mode host for docker-compose services (#10917)" (#10926)
This reverts commit 225a41db72.
2022-07-22 10:25:59 +01:00
Harry Waye
225a41db72
chore(dev): use network mode host for docker-compose services (#10917)
* chore(dev): use network mode host for docker-compose services

This removes the need to add kafka to /etc/hosts.

As far as I can tell this should be fine for people's local dev except
they will be required to reset and re-migrate ClickHouse tables as they
will be trying to pull from `kafka` instead of `localhost`.

* remove ports from redis

* Update a few more references
2022-07-21 15:29:31 +01:00
Harry Waye
40616f0d7c
chore(dev): clean up background jobs on EXIT and prop. exit code (#10916)
We were for instance calling trap at a point where it wouldn't get
called, and giving special status to some processes to run in the
foreground.

Instead we:

 1. wait for any process exit
 2. use its exit code for the calling process
 3. kill background processes on EXIT
2022-07-21 13:32:38 +00:00
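The three steps listed in this commit message are implemented in a shell script. A Python analogue using `subprocess` could look like the following sketch; the function name `run_until_first_exit` and the polling interval are assumptions for illustration.

```python
import subprocess
import time
from typing import List


def run_until_first_exit(commands: List[List[str]], poll_interval: float = 0.05) -> int:
    """Start all commands and return the exit code of whichever exits first."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    try:
        while True:
            for proc in procs:
                code = proc.poll()  # None while the process is still running
                if code is not None:
                    return code
            time.sleep(poll_interval)
    finally:
        # Counterpart of the EXIT trap: stop whatever is still running.
        for proc in procs:
            if proc.poll() is None:
                proc.terminate()
        for proc in procs:
            proc.wait()
```

A caller would pass the returned code to `sys.exit`, so that a failure in any one service becomes the exit status of the launcher.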
Guido Iaquinti
675abe75b6
chore: minor cleanup of bin folder (#10757) 2022-07-14 12:12:56 +02:00
Harry Waye
5f1ab5ff74
chore(dockerfile): make docker build multistage (#10488)
* chore(dockerfile): make docker build multistage

The built image is >4GB uncompressed atm, I'm pretty sure there is a lot
of cruft.

Plan is to split out the django, frontend, plugin-server builds and
hopefully get some gains in there to not include build deps.

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* fix dockerfile lints

* cache from branch

* add load :true

* Update production.Dockerfile

Co-authored-by: James Greenhill <fuziontech@gmail.com>

* Update production.Dockerfile

* update to use compressed size from remote repo

* tag with branch and sha

* add ref on pull_request events

* install python

* be a bit more lax with python version

* fix image size calc

* hardcode lower case image name

* use @

* only add sha on master branch, add master tag on master

* chore: use docker image in e2e tests

This is to try to add some guarantees to the docker image without having
to manually test it, so we can be a bit more aggressive with
improvements without, e.g. having to push to the playground or run
locally.

* wip

* add to summary

* wip

* chore: put cypress tests after docker build

I couldn't figure out a way to get workflow_run to run without merging
in, so I'm just putting after the build.

* wip

* wip

* wip

* remove quotes

* remove separate cypress install

* wip

* wip

* wip

* add gunicorn.config.py

* ci: run docker image build on master as well

This way we get the caching from the master build.

* wip

* wip

Co-authored-by: James Greenhill <fuziontech@gmail.com>
2022-06-27 18:12:32 +01:00
James Greenhill
8f01f4b36b
feat: add tag parameter to hobby deploy wizard (#10430)
* feat: add tag parameter to hobby deploy wizard

* remove the comment
2022-06-24 15:31:13 -07:00
Paul D'Ambra
686fd8fe55
fix: console recording console error (#10375)
* remove dev mode caching that hid error introduced in #10364

* add console-record mapping js file separately
2022-06-20 21:15:21 +02:00
Paul D'Ambra
ffbf1cd88c
fix: copy console-record.min.js mapping correctly (#10364) 2022-06-20 14:39:00 +02:00
Paul D'Ambra
79c960b223
chore: e2e test script yarn removes too many things (#10358) 2022-06-20 11:03:56 +02:00
Marcus Hyett (PostHog)
5f6331010e
chore(hobby-deploy): Advise not to use IP address (#10241)
Advise users not to use IP address when setting up TLS for their hobby instance.
2022-06-20 10:10:52 +02:00
Paul D'Ambra
d613f4bd06
chore: upgrade cypress to v10x (#9650)
* update cypress

* really click something that's actually there

* obey cypress and use done

* run cypress 9 in CI

* no need for before each when only one test

* no need to set window size to the default

* get tests passing file by file

* delay checking for a graph in a test

* be more specific cypress

* use cy command

* select text like a human

* silly cypress

* try and avoid cypress deciding that a visible field is not valid

* select delete button correctly

* find save button differently

* try and avoid not always typing the first character

* better trends selections

* use cy command to navigate

* continue trying to get tests to pass in CI

* another try at setting feature flag names in CI

* can CI find undo button without a wait?

* better assertion for cypress

* up to v10

* fix splitting specs with v10 path

* show cypress how to wait for the test to finish

* remove redundant file

* change return to satisfy new cypress

* move import
2022-06-09 11:14:21 +01:00
Marius Andra
15017ca714
chore(frontend): Fix all typescript errors (#10092)
* fix dayjs

* fix timeouts (we're not strictly speaking running in nodejs)

* export unexported type

* consolidate on a single FormInstance

* no need to rename

* fuse

* forminstance 2

* locationChanged

* BuiltLogic

* remove Type.ts exception

* fix duh

* lay off the bin/check-typescript/strict script

* don't think this is ever used or useful

* no real need to hide the output

* make typescript:check do what the name says

* we're already strict

Co-authored-by: Michael Matloka <dev@twixes.com>
2022-06-03 12:17:49 +02:00
Ben White
57874f9db2
feat(exports): Dashboard / Insight exporting (#9830)
* Adds chromium / selenium for image exporting
* Added uploading of downloads folder to artefacts
* Adds ExportButton to generate desired asset
2022-05-27 14:31:17 +02:00
Yakko Majuri
6e1f3362bc
fix: update broken plugin-server deployment script (#9999) 2022-05-26 10:01:55 +01:00
James Greenhill
8572bad0d2
fix: Move hobby to use latest until next release (#9928) 2022-05-23 16:14:16 -07:00
Guido Iaquinti
3f3f146b3e
chore(hobby deployments): various fixes (#9914)
* chore(hobby deployments): various fixes

* default do not check versions for current hobby release

Co-authored-by: James Greenhill <fuziontech@gmail.com>
2022-05-23 11:15:57 -07:00
Paul D'Ambra
49e3ceef5c
feat(object storage): add unused object storage (#9846)
* feat(object_storage): add unused object storage with health checks

* only prompt debug users if object storage not available at preflight

* safe plugin server health check for unused object storage

* explicit object storage settings

* explicit object storage settings

* explicit object storage settings

* downgrade pip tools

* without spaces?

* like this?

* without updating pip?

* remove object_storage from dev volumes

* named volume on hobby

* lazily init object storage

* simplify conditional check

* reproduced error locally

* reproduced error locally

* object_storage_endpoint not host and port

* log more when checking kafka and clickhouse

* don't filter docker output

* add kafka to hosts before starting stack?

* silly cloud tests (not my brain)
2022-05-20 09:56:50 +01:00
Michael Matloka
faf75ebb5e
refactor(ingestion): Make KAFKA_ENABLED true by default and set KAFKA_HOSTS default (#9844)
* refactor(ingestion): Make `KAFKA_ENABLED` true by default

* Sync `KAFKA_HOSTS` defaults too

* Update snapshots

* Update "kafka" to "kafka:9092"

* Revert "Update "kafka" to "kafka:9092""

This reverts commit d954ac6fa6.

* Update some tests

* Revert "Revert "Update "kafka" to "kafka:9092"""

This reverts commit 07edfa6c5e.

* Update test_0004_replicated_schema.ambr

* Remove `KAFKA_ENABLED` and `KAFKA_HOSTS` from places

Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
2022-05-19 19:18:15 +02:00
Michael Matloka
9a389847c2
refactor(plugin-server): Remove --idle mode (#9798)
* refactor(plugin-server): Remove `--idle` mode

* Support `PLUGIN_SERVER_IDLE` in `bin/start-worker`

* Don't `export` needlessly in npm script
2022-05-16 16:22:49 +00:00
Karl-Aksel Puulmann
14760e771a
fix(plugin-server): Remove heroku-specific code hacks (#9691)
Shameless lift from https://github.com/PostHog/posthog/pull/9288/ +
removing the other instance of the var being used
2022-05-10 09:06:34 +03:00
Joe Trollo
fb88c5a0aa
fix: propagate SIGTERM to plugin server (#9641) 2022-05-05 13:53:14 +00:00
0x1a8510f2
c776ee3583
refactor(hobby-deployment): More secure secret generation in deploy-hobby (#9485)
* More secure secret generation

Use a random source designed for secrets/crypto and apply a stronger hash function as MD5 is broken. This shouldn't have too much of an impact in this context, but better safe than sorry.

* Tune `head` params

Co-authored-by: Michael Matloka <dev@twixes.com>
2022-04-25 10:40:45 +00:00
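The commit changes a shell script and does not name the hash function it switched to. As a Python sketch of the same approach, the snippet below draws bytes from the operating system's cryptographic random source and hashes them with SHA-256; the choice of SHA-256, the byte count and the `generate_secret` name are assumptions.

```python
import hashlib
import secrets


def generate_secret(num_random_bytes: int = 32) -> str:
    """Return a hex secret derived from cryptographically secure random bytes."""
    random_bytes = secrets.token_bytes(num_random_bytes)  # OS CSPRNG
    return hashlib.sha256(random_bytes).hexdigest()
```

In Python the hashing step is optional, since `secrets.token_hex(32)` already yields a 64-character hex string from the same source. It is kept here only to mirror the random-source-plus-hash shape described in the message.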
Tim Glaser
e6333ca7d7
perf: Speed up backend tests locally (#9255)
* perf: Speed up backend tests locally

* fix
2022-04-15 11:43:05 +01:00
Paul D'Ambra
4519ffb295
chore(cypress): remove component tests (#9323)
* remove tests that have been off for a year

* remove component tests that are covered by main cypress tests

* remove a bunch of component based test setup and upgrade cypress

* get tests running but not all passing on Cypress 9

* don't upgrade yet

* don't upgrade yet
2022-04-02 17:35:14 +01:00
Michael Matloka
500d4623ba
refactor: Yeet PRIMARY_DB (#9017)
* refactor: Yeet `PRIMARY_DB`

* Remove `db_backend`

* Eliminate "Analytics database in use"

* Satisfy mypy
2022-03-21 13:15:50 +01:00
Tiina Turban
8ba6168933
feat(async-migrations): Hobby upgrade to check async migrations first (#8899) 2022-03-09 14:42:15 +01:00
Tiina Turban
72042ee844
fix(async-migrations): bin/migrate to check not run async migrations (#8872) 2022-03-09 13:55:23 +01:00
Karl-Aksel Puulmann
c8d6b2225f
feat(sharding): add command to sync tables onto new nodes (#8912)
* feat(sharding): add command to sync tables onto new nodes

clickhouse-operator only syncs some tables onto new nodes. This new
command ensures that when adding new shards, they are automatically
synced up on redeploying

Note that there might be timing concerns here as resharding on altinity
cloud does not redeploy automatically. In practice however what this
means is that new nodes just won't ingest any data until another deploy

* Add test to the new command

* Improve non-replicated test
2022-03-08 12:50:49 +02:00
Tiina Turban
de831d9930
fix(hobby-deploy): loading environment variables (#8908) 2022-03-07 14:44:24 -08:00
James Greenhill
a451d2f2ce
Hobby: Enable deploying hobby stack behind a firewall with no ACME TLS (#8687) 2022-02-18 10:28:28 -08:00
James Greenhill
24eb2666bb
hobby: Wait for ClickHouse and for Postgres before starting (#8686) 2022-02-18 10:27:45 -08:00
Tiina Turban
0fb19f87d7
Check for all necessary migrations completed before worker, plugins start (#8504) 2022-02-17 17:56:24 +01:00
Dawid Janik
6da21e2428
Increase request line limit for Gunicorn. (#8184) 2022-01-31 12:14:04 +02:00
Jesse Cooke
897ed833db
Make script executable again (#8327)
a71e899 removed the executable permission which broke AWS ECS deployment.
2022-01-28 09:15:18 +01:00