posthog

mirror of https://github.com/PostHog/posthog.git synced 2024-11-22 08:40:03 +01:00

Author	SHA1	Message	Date
Karl-Aksel Puulmann	55bf75a40a	Make KAFKA_EVENTS_PLUGIN_INGESTION_TOPIC configurable (#7349) This is for razorpay - they're planning to hook into this with spark	2021-11-25 11:47:20 +01:00
Karl-Aksel Puulmann	8ac9c590ec	Proposal: Use unique topic names for kafka in test (#6746) Without this, to run plugin-server tests you need to reset all containers every time since otherwise both test- and non-test clickhouse would attempt to read from the same topic.	2021-11-01 11:42:56 +02:00
James Greenhill	0937f0b0df	Add retries to kafka producer to mitigate event loss (#6638)	2021-10-25 22:40:32 -07:00
Karl-Aksel Puulmann	cef2af5e4c	Group analytics: Initial schema (#6462) * Add table for group_type_mapping * Remove materialized columns from events table schema These are not used and not needed w/ new mat columns work * WIP: Migration to add group analytics columns * Remove event table changes temporarily	2021-10-25 15:05:58 +03:00
Yakko Majuri	7b50b0e35f	Events dead letter queue CH table (#6193) * events dead letter queue CH table * format * update schemas * also store raw payload * better naming * make table name more clear * wip better testing * remove unused imports * remove kafka test * prevent non null test from running on CH migrations * add kafka testing * minor tests cleanup * test naive longer sleep * make test end-to-end * address review * update ttl, format * refactor delay func, address review	2021-10-07 08:30:13 +00:00
James Greenhill	d5fb987d53	Create Kafka consumer and write tests for consumer and producer (#6170) * Test Kafka * black format python * fix imports * add kafka and zk deps for testing * Include ZK and Kafka for all tests * fix signature for kafka helper * Connect to localhost for kafka * update kafka host for all test runs * Wrong env var for kafka * consolidate env vars for github actions * set the advertised hostname from the broker to localhost * add env var to docker-compose for kafka broker advert host * resort to what we do locally with /etc/hosts * Remove configs for kafka that won't be used	2021-10-01 09:43:50 +01:00
James Greenhill	22b574e50c	Don't provide key to kafka for now (broken partitioning) (#6127)	2021-09-27 19:22:54 +01:00
James Greenhill	ca2ae2a8ad	Default to None if there is no key, encode only if there is data (#6122)	2021-09-27 15:46:37 +01:00
James Greenhill	c2f7ba0d08	Test if key for kafka is none and set to empty string (#6121)	2021-09-27 14:56:49 +01:00
Yakko Majuri	dbd31b91ba	fix kafka partition key (#6117)	2021-09-27 14:35:34 +01:00
James Greenhill	2ae020d6ff	Partition `events_plugin_ingestion` by IP (#6091) * Partition by IP * use correct version of black... * fix kafka test * picky tests * use value vs data for test kafka	2021-09-27 14:08:00 +01:00
Michael Matloka	e96f95ef5a	Plugin log entries (#3482 ) * Add Postgres model PluginLogEntry * Add equivalent PluginLogEntry to Kafka+ClickHouse * Add migration * Add PluginLogEntry.Type.LOG & make PluginLogEntry.message a TextField * Update 0130_pluginlogentry.py * Add PluginLogEntry.instance_id * Update migration * Update migration * Add plugin log entries API * Test plugin log entries DB fetching * Add PluginLogs component prototype * Fix API * Improve PluginLogs component * Remove almost unused plugin Feedback button * Update migration * Fixed typing * Fix org permission error test asserts * Fix plugin log entry tests * Fix CH plugin log entry timestamp string * Update CH test_plugin_log_entry.py * Fix plugin log entry tests across PG/CH * Satisfy mypy * Add search and limit to plugin log entry API * Send team_id in plugin config API * Rework plugin logs UI * Add plugin config team ID in tests * Add plugin config team ID in tests actually * Fix code quality * Make logs plugin config-based * Fix CH queries * Fix typing * Improve UX and fix things * Polish plugin logs logic * Update migration * Add Celery task to delete old plugin logs * Fix UX bug with loading more plugin logs * Fix missing import * Remove OrganizationMemberPermissions message change * Make mypy happy * Add PluginLogEntry.is_system * Optimize CH plugin_log_entires PARTITION/ORDER * Increment migration * Adjust plugin logs drawer display * Fix plugin_log_factory_ch * Fix plugin_log_factory_ch fix * Replace PluginLogEntry.is_system with source * Adjust PluginLogEntrySerializer * Update CH fetch_plugin_log_entries * Make kea-typegen happy	2021-05-06 10:54:32 +03:00
James Greenhill	1849223296	Remove logging to WAL, no longer used and duplicate of events_plugin_ingestion (#4132) * Remove logging to WAL, no longer used and duplicate of events_plugin_ingestion * Simplify log_event Co-authored-by: Michael Matloka <dev@twixes.com>	2021-04-27 19:15:56 +00:00
Michael Matloka	1f3145128c	Enable PLUGIN_SERVER_INGESTION (#3107 ) * Enable PLUGIN_SERVER_INGESTION_HANDOFF = get_bool_from_env("PLUGIN_SERVER_INGESTION_HANDOFF * Don't set PLUGIN_SERVER_INGESTION_HANDOFF in worker * Add comments * Remove _HANDOFF from PLUGIN_SERVER_INGESTION * add stats counter for plugin server handoff, so we can verify events out and events in * add whitelisted posthog and kea organizations * disable ingestion this round --> first let's just check the plugin server can talk to kafka & clickhouse before sending real events to it * enable ingestion in docker-compose.ch.yml * eliminate bad merge * async action event matching when using postgres plugin server ingestion (#3182) * fix org * remove _HANDOFF from topic * add plugin_ to plugin server ingestion topic * update plugin server to 0.7.0 Co-authored-by: Marius Andra <marius.andra@gmail.com>	2021-02-04 16:17:24 +01:00
Michael Matloka	eaa169100a	Add handing off event ingestion to plugin server (#2898 ) * Add setting for handing off process_event_ee to plugin server * Add StatsD settings to KEYS * bin/plugin-server → start-plugin-server & docker-plugin-server * Simplify to only add docker-plugin-server * Bring back original comment * Turn down verbosity of plugin server install * Remove redundant if * Fix comment * Remove lone newline * Roll back unsafe script changes * Simplify dockerized plugins * Add some depends_on * Clarify HAND_OFF_INGESTION env var * Use posthog-plugin-server 1.0.0-alpha.1 * Enhance bin/plugin-server and rm bin/docker-plugin-server * Move around PLUGIN_SERVER_INGESTION_HANDOFF ifs * Use posthog-plugin-server@1.0.0-alpha.2 * Support kafka+ssl:// in plugin-server * Produce to topic events_ingestion_handoff for plugin server * Use posthog-plugin-server@1.0.0-alpha.3 * Don't import Kafka topics in FOSS * Use @posthog/plugin-server * Update yarn.lock * Add commands for external ClickHouse setup/teardown * Actually delete test CH teardown command * ClickhouseTestRunner.setup_test_environment() in setup_test_clickhouse * Rework test setup script to work with Postgres too * Restore master plugins dir for merge * Unset PLUGIN_SERVER_INGESTION_HANDOFF in docker-compose.ch.yml * Fix unimportant typo * Build log_event data dict only once * Make it clear in bin/plugin-server help that it's bin * Space space	2021-01-21 15:39:44 +01:00
Michael Matloka	7ba9f7de09	Plugin server ingestion base (#2732) * Add relevant settings to KEYS in bin/plugins-server * Log all EE events to events_handoff Kafka topic for plugin server * Clean up settings * Fix FOSS * Don't introduce KAFKA_EVENTS_HANDOFF * Add cosmetic newline * Add DEBUG WAL print()	2020-12-14 16:05:18 +01:00
James Greenhill	ed6eb5e796	Setup ecs configs for web, worker, migration tasks and services (#2458 ) * add worker to the ecs config and deploy * for testing * pull from this branch for testing * chain config renders * split out events pipe * Set is_heroku true because of heroku kafka * update /e/ service to run on port 8001 * add 8001 to the container definition as well * simplify * test migrating w/ ecs task using aws cli * split services * typo in task def * remove networkConfiguration from task definition * duplicate * task-def-web specific * update events service name * Handle base64 encoded kafka certs * if it's empty then try to set it for env vars * fix b64 decode call * cleanups * enable base64 encoding of keys for kafka * depend on kafka-helper for deps * reformat * sort imports * type fixes * it's late, I can't type. typos. * use get_bool_from_env * remove debug bits. Trigger on master/main * prettier my yaml * add notes about ref in GA * up cpu and memory	2020-12-03 15:51:37 -08:00
James Greenhill	39081364e6	Watch person and person_distinct_id tables for lag (#2360) * Watch person and person_distinct_id tables for lag * record row counts as well * add session_recording_events as well * gofmt	2020-11-12 19:09:40 -08:00
Paolo D'Amico	066721e3c1	Stability & dev experience improvements (#2152)	2020-11-02 14:55:20 +00:00
James Greenhill	b64673ca4e	wire up the length to the proto message (#2089) * wire up the length to the proto message * we are so deep into the proto weeds we are using proto private methods	2020-10-28 17:41:13 -07:00
James Greenhill	601696456f	Start with a new topic (#2088)	2020-10-28 17:12:58 -07:00
James Greenhill	01099a5ffd	Provide required proto message length for our clickhouse overlords (#2087)	2020-10-28 16:48:05 -07:00
James Greenhill	83b5273113	Protobufize events to protect from malformed JSON (#2085) * Protobuf all the things * oops * Protobufize events to protect from malformed JSON * format the generated files (will need to remember this for future) * format * clean up kafka produce serializer * fixes	2020-10-28 15:18:52 -07:00
Karl-Aksel Puulmann	e3bf0cb31d	Session recording on clickhouse, separate tables and retention cronjob (#2051 ) * Add scheduled task to wipe session recordings * Create a new table for session recording * Save snapshot events to different table * Use SessionRecordingEvent over Events everywhere We can remove a ton of cruft this way as well * Add missing signature * Extract util from models/event * Attempt to update ingest side of clickhouse session recording events Note that it's using main kafka topic - not sure if a good idea. * Get separate table in ch working for session recording events * WIP: query sessions * Make both session recording queries work * Make linter happy * Rebase migration * Make tests work * Apply a TTL to session recordings and other configuration: - toYYYYMMDD partitioning should be smoother with TTL setup - TTL achieves not needing to archive the data ourselves - index_granularity will enable smaller reads per session_id - ORDER BY clause is to make single session as well as time range query reasonable * Convert retention cronjob to new model * Add tests to process_event changes * Add test for ee_capture change * Fixup migration * Make clickhouse tests drop/create session recording tables * Make TTL not be there in tests Otherwise writes get eaten by it during tests when mocking time * Fix retention task Co-authored-by: Tim Glaser <tim@glsr.nl>	2020-10-28 21:22:16 +01:00
James Greenhill	7ab30a836c	Remove Omni-Person logic for ee (#1972) * Remove Omni-Person logic for ee * remove more omni person references	2020-10-21 14:06:45 -07:00
James Greenhill	b74d06a96a	Create a write ahead log for cloud event processing (#1962) * Create a write ahead log for cloud event processing * mypy fix * if we are on app (ee) don't log to postgres * don't disable writing to postgres	2020-10-21 20:35:07 +02:00

26 Commits