dracoblue.net

Docker CLI with Apple Container

2026-03-24T13:57:00+00:00

Currently, I am running Rancher Desktop for my local Docker CLI needs. But I've found a viable alternative that I wanted to try.

It's called container and it is released by Apple at https://github.com/apple/container for Apple Silicon Mac devices.

Week 4: Stable Claims with BERTopic

2025-10-15T08:05:00+00:00

In Week 1 (extraction), Week 2 (embeddings + KMeans), and Week 3 (stable topics with BERTopic) I built the foundations. This week applies the same idea to claims — using BERTopic to cluster claim snippets and keep stable claim_ids via a registry + dim table.

This week we explore BERTopic + stable claim IDs:

Use pre-computed embeddings from BigQuery (same pipeline as before).
Fit/Load a BERTopic model (UMAP + HDBSCAN) in Python.
Assign internal cluster IDs per batch, then map them to stable claim_ids.
Persist to video_claims, claim_registry, and dim_claims tables for analysis.
Inspect behavior in Looker Studio and reflect on limitations.
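
As a rough illustration of the per-batch mapping step, the sketch below turns one batch of (snippet, internal cluster ID) pairs into rows that carry a stable claim_id. The function name map_batch, the row keys, and the example IDs are hypothetical and not taken from the post; the only library fact relied on is that HDBSCAN labels unclustered points with -1.

```python
from typing import Dict, List, Optional, Tuple

OUTLIER = -1  # HDBSCAN's label for points that belong to no cluster


def map_batch(
    assignments: List[Tuple[str, int]],
    stable_ids: Dict[int, str],
) -> List[Dict[str, Optional[str]]]:
    """Turn (snippet_id, internal_cluster_id) pairs into rows with a stable claim_id.

    `stable_ids` is a hypothetical lookup from this batch's internal cluster
    IDs to stable claim_ids, e.g. loaded from a registry table beforehand.
    """
    rows: List[Dict[str, Optional[str]]] = []
    for snippet_id, internal_id in assignments:
        if internal_id == OUTLIER:
            claim_id = None  # outliers get no claim_id
        else:
            # A KeyError here means the registry has no entry for this cluster yet.
            claim_id = stable_ids[internal_id]
        rows.append({"snippet_id": snippet_id, "claim_id": claim_id})
    return rows


# Made-up example batch
batch = [("s1", 0), ("s2", 3), ("s3", -1)]
rows = map_batch(batch, {0: "claim_0001", 3: "claim_0002"})
```

Raising on a missing registry entry instead of silently writing a null keeps gaps in the mapping visible before anything is persisted.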

Week 3: Stable Topics with BERTopic

2025-09-10T08:05:00+00:00

In Week 1 (extraction) and Week 2 (embeddings + KMeans in BigQuery ML) we laid the groundwork. This week I built a Python BERTopic stage whose IDs stay stable across runs by mapping BERTopic’s internal clusters to stable topic IDs in BigQuery. I use Google Gemini again to generate nice labels for the extracted topic clusters.

This week we explore BERTopic + stable topic IDs (via an ID registry):

Train a BERTopic model in Python (UMAP + HDBSCAN).
Map BERTopic’s internal clusters (model_version, internal_topic_id) to stable topic IDs.
Ensure topic IDs remain consistent across retraining (no more ID jumps).
Join human-readable labels and persist results into video_topics for analysis.
Inspect results in Looker Studio and reflect on limitations.
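
The ID registry idea can be sketched in a few lines of Python. This is a minimal in-memory stand-in, assuming a registry keyed by (model_version, internal_topic_id); the class name TopicIdRegistry, the resolve method, and the matches_stable_id parameter are hypothetical. How a retrained cluster is matched to an existing topic is left outside the sketch, and in the post the registry lives in BigQuery rather than in memory.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[str, int]  # (model_version, internal_topic_id)


class TopicIdRegistry:
    """In-memory stand-in for a registry table of stable topic IDs."""

    def __init__(self) -> None:
        self._stable_by_key: Dict[Key, int] = {}
        self._next_stable_id = 1

    def resolve(
        self,
        model_version: str,
        internal_topic_id: int,
        matches_stable_id: Optional[int] = None,
    ) -> int:
        """Return the stable ID for a cluster, registering it on first sight.

        `matches_stable_id` lets the caller link a cluster from a retrained
        model to an already known topic; the matching logic itself is not
        part of this sketch.
        """
        key = (model_version, internal_topic_id)
        if key in self._stable_by_key:
            return self._stable_by_key[key]
        if matches_stable_id is None:
            stable_id = self._next_stable_id  # allocate a fresh stable ID
            self._next_stable_id += 1
        elif 1 <= matches_stable_id < self._next_stable_id:
            stable_id = matches_stable_id  # reuse an existing stable ID
        else:
            raise ValueError(f"unknown stable ID: {matches_stable_id}")
        self._stable_by_key[key] = stable_id
        return stable_id


registry = TopicIdRegistry()
first = registry.resolve("v1", 0)
second = registry.resolve("v1", 1)
# After retraining, internal cluster 7 of model "v2" is linked to the first topic.
relinked = registry.resolve("v2", 7, matches_stable_id=first)
```

Because the key includes the model version, internal IDs can be renumbered freely by a retrained model without changing the stable IDs used downstream.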

Week 2: Embeddings & KMeans Clustering of Topics/Claims

2025-09-03T08:05:00+00:00

This post documents Week 2 of the TopicWatchdog project.
Last week we successfully extracted topics and claims from German political short videos and persisted them in BigQuery.
However, topics often appeared under slightly different names — making aggregation unreliable.

This week we explore embeddings + clustering:

Generate embeddings of canonical topics and claims with BigQuery ML.
Train a KMeans model on those embeddings to group semantically similar entries.
Assign clusters back to each topic/claim.
Inspect first results in Looker Studio and reflect on limitations.
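
To show what assigning clusters back means, here is a small plain-Python sketch of the assignment step of KMeans: each embedding goes to its nearest centroid. It is not the BigQuery ML setup from the post; the function names, the 2-D vectors, and the centroids are made-up toy values for illustration only.

```python
from typing import Dict, List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_clusters(
    embeddings: Dict[str, Sequence[float]],
    centroids: List[Sequence[float]],
) -> Dict[str, int]:
    """Assign every named embedding to the index of its nearest centroid."""
    assignments: Dict[str, int] = {}
    for name, vector in embeddings.items():
        distances = [squared_distance(vector, c) for c in centroids]
        assignments[name] = distances.index(min(distances))
    return assignments


# Toy 2-D "embeddings"; real embeddings have hundreds of dimensions.
toy_embeddings = {
    "Rente": [0.9, 0.1],
    "Rentenpolitik": [0.8, 0.2],
    "Klimaschutz": [0.1, 0.9],
}
toy_centroids = [[1.0, 0.0], [0.0, 1.0]]

clusters = assign_clusters(toy_embeddings, toy_centroids)
```

In this toy data the two differently named pension topics end up in the same cluster, which is the aggregation problem the clustering is meant to address.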

Kickoff (Week 1): Extracting Topics & Claims from German Politics Videos

2025-08-27T08:05:00+00:00

This post documents Week 1 of a research project I call TopicWatchdog: an end‑to‑end, reproducible pipeline that (a) collects German political short videos, (b) transcribes them, (c) extracts topics and claims with timestamps, and (d) persists everything in BigQuery for transparent, long‑term analysis.

The focus is on methods and reproducibility, not on polished production code. The snippets below are meant as scaffolding for guidance, but they already allow you to build a similar pipeline.

Show System Collections in Payload CMS

2025-01-11T23:36:00+00:00

When working with Payload CMS, I sometimes need to check what is in Payload's system collections.

There are, for example, payload-preferences and payload-migrations. Since 3.0 there are also payload-jobs for the neat queue system and payload-locked-documents for document locking.

Debugging Directus Serverside Extensions

2023-08-19T10:05:00+00:00

The Directus docs have good documentation on how to run a build of Directus in Docker. For developing and contributing to Directus itself, there is also good documentation on running Directus locally.

But if you want to develop a Directus endpoint extension locally, you might want to use the breakpoint feature of your IDE (e.g. VS Code) without having to run the entire Directus stack with pnpm in development mode.

VSCode 100% CPU Usage with WSL1

2023-02-22T23:45:00+00:00

When I use Windows for development in combination with WSL, it is usually a pretty good experience. As long as you don't mount a Windows directory into the Linux subsystem, but use the storage inside it directly, it's nice!

The official WSL extension usually integrates nicely with VS Code.

But today I experienced very bad performance of VS Code and the Windows system. A quick look at htop and Task Manager showed that all CPUs were at 100% usage. The process eating all the resources was called .vscode-server.

Make symfony dotenv use putenv

2023-01-16T08:00:01+00:00

The other day a friend of mine was using google-auth-library-php in combination with google-cloud-php-pubsub and wanted to set the path for the google credentials file via GOOGLE_APPLICATION_CREDENTIALS in .env.

We figured out that this no longer works in the newer version of dotenv: the default value for use_putenv changed to false (see commit in 4.3), and in Symfony 5 there was a breaking change that switched use_putenv from true to false (see the putenv deprecation pull request).

Compiling newrelic php agent on mac m1 arm64

2023-01-14T00:05:00+00:00

According to the arm64 New Relic PHP page, there is arm64 support for the PHP agent if it is running on Amazon Linux or CentOS Linux 8, but not for Apple or M1.

However, when I pulled the latest tar.gz (v10.4.0.316 as of 2023/01/13) at https://github.com/newrelic/newrelic-php-agent/archive/refs/tags/v10.4.0.316.tar.gz and tried to compile it, it nearly worked.

The first attempts at make agent fail because of a missing pcre_compile, aclocal, or glibtoolize. This can be fixed with brew install pcre for pcre_compile, brew install automake for aclocal, and brew install libtool for glibtoolize.

After that, running make agent finally ends with:

util_hash.c:198:5: error: unannotated fall-through between switch labels [-Werror,-Wimplicit-fallthrough]
    case 2:
    ^
util_hash.c:198:5: note: insert '__attribute__((fallthrough));' to silence this warning
    case 2:
    ^
    __attribute__((fallthrough));
util_hash.c:198:5: note: insert 'break;' to avoid fall-through
    case 2:
    ^
    break;