posthog

mirror of https://github.com/PostHog/posthog.git synced 2024-11-28 18:26:15 +01:00

Author	SHA1	Message	Date
Julian Bez	6c95fd18ba	chore(ruff): Add ruff rules for exception handling (#23251)	2024-06-27 12:39:21 +01:00
Ben White	33a0757a7c	fix: Loading of embedddings (#22260)	2024-05-20 16:26:22 +01:00
Julian Bez	9576fab1e4	chore: Add Pyupgrade rules (#21714) * Add Pyupgrade rules * Set correct Python version	2024-04-25 08:22:28 +01:00
David Newell	140dfbceda	fix: sparkline generation (#21274)	2024-04-02 16:17:16 +01:00
David Newell	4e22252235	chore: run clustering in background task (#21080)	2024-03-28 12:12:51 +00:00
David Newell	6b6d40a666	feat: sparkline errors (#21081) * feat: errors page playlist link * feat: create playlist from errors * remove sample data * update title * add frontend protection to samples * feat: sparkline errors	2024-03-22 10:20:37 +00:00
David Newell	21922ff9e3	feat: create playlists from errors (#21037)	2024-03-21 16:59:21 +00:00
David Newell	088399bbe0	feat: error clustering UI (#20958)	2024-03-19 16:02:07 +00:00
David Newell	8b92cd1c62	fix: picking embedding input samples (#20913)	2024-03-14 09:12:13 +00:00
David Newell	6b4a9f20b1	chore: include input in fetched data (#20908)	2024-03-13 16:52:55 +00:00
David Newell	dc0faa5a79	chore: add input to clickhouse rows (#20901)	2024-03-13 15:57:22 +00:00
David Newell	428c48084b	fix: error clustering data shape (#20859) * fix: error clustering data shape * use new input column * remove logger	2024-03-13 14:49:20 +00:00
David Newell	6251ed481f	feat: error clustering UI (#20823)	2024-03-11 18:00:35 +00:00
David Newell	a677a3fd64	fix: embeddings runner variable name (#20802)	2024-03-08 20:43:41 +00:00
David Newell	1d3c7417fb	fix: add labelnames to prometheus metrics (#20800)	2024-03-08 19:32:29 +00:00
David Newell	0c1c05e38c	feat: cluster errors (#20779)	2024-03-08 18:16:24 +00:00
David Newell	ea340fc765	feat: embed errors (#20752)	2024-03-08 18:12:06 +00:00
David Newell	6c5ad0c414	fix: return most not least similar recordings (#20693)	2024-03-05 16:29:05 +00:00
Paul D'Ambra	776e0e1c38	feat: fewer tokens sent to embed from urls (#20680) * feat: fewer tokens sent to embed from urls * need to stringify the input before logging it	2024-03-03 17:23:00 +00:00
Paul D'Ambra	0d476baeb6	fix: fewer loopy loops (#20678) * fix: fewer loopy loops * and add a rate limit to the queue * swallow open AI errors * count different failures differently	2024-03-03 13:59:31 +00:00
Paul D'Ambra	2085f68b05	fix: some embeddings faff (#20649) * meaningful histogram buckets for tracking tokens * we're not being clever about deduplicating, let's only ever look at the most recent * Select a random selection of eligible recordings	2024-02-29 22:33:54 +00:00
Paul D'Ambra	8d0efa1c5b	fix: query needs a max time range (#20626)	2024-02-29 10:17:58 +00:00
David Newell	a9496eace8	chore: count tokens before hitting OpenAI (#20621) * chore: count tokens before hitting OpenAI * log the offending input --------- Co-authored-by: Paul D'Ambra <paul@posthog.com>	2024-02-28 23:04:43 +00:00
Paul D'Ambra	5e89d9124a	chore: even more logging (#20612) * chore: even more embeddings logging * and more settings * fix * fix	2024-02-28 21:24:39 +00:00
David Newell	c09812e1b4	chore: add better logs to embeddings (#20582)	2024-02-27 17:36:58 +00:00
David Newell	125d4e8a3e	feat: embeddings similarity (#20268)	2024-02-21 13:34:11 +00:00
Paul D'Ambra	12b685a22d	chore: better buckets for timings (#20362)	2024-02-15 12:29:00 +00:00
Paul D'Ambra	3e23550b93	fix: don't load recordings we know we'll skip (#20360) * fix: don't load recordings we know we'll skip * fix	2024-02-15 12:18:45 +00:00
David Newell	4f6d9c8673	feat: generate recording text embeddings (#20046) * make migration * general flow * abstract shared methods * generate input * remove postgres migration * generate embedding strings * remove random file * Update query snapshots * Update query snapshots * feat: create periodic replay embedding * first sketch of table * batch and flush embeddings * add default to timestamp generation * fetch recordings query * save first embeddings to CH * dump session metadata into tokens * fix lint * brain dump to help th future traveller * prom timing instead * fix input concatenation * add an e :/ * obey mypy * some time limits to reduce what we query * a little fiddling to get it to run locally * paging and counting * Update query snapshots * Update query snapshots * move the AI stuff to EE for now * Update query snapshots * kick off the task with interval from settings * push embedding generation onto its own queue * on a different queue * EE to the max * doh * fix * fangling * Remove clashes so we can merge this into the other PR * Remove clashes so we can merge this into the other PR * start wiring up Celery task * hmmm * it's a chord * wire up celery simple version * rename * why is worker failing * Update .run/Celery.run.xml * update embedding input to remove duplicates * ttl on the table * Revert "update embedding input to remove duplicates" This reverts commit `9a09d9c9f0`. --------- Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Paul D'Ambra <paul@posthog.com>	2024-02-14 12:50:42 +00:00

29 Commits