A blog by Alexander Doria (sometimes also known as Pierre-Carl Langlais).
I train LLMs at Pleias. I write mostly about LLM research, especially training data of all kinds (synthetic, open, raw, processed, distilled...), though preferably tasteful.
The domain name vintagedata.org was booked almost ten years ago, when I mostly cared about digital humanities. I feel it's not totally irrelevant to the LLM age: after all, I certainly hope to generate data of good vintage.