Documentation for the subs2vec repository

subs2vec is a set of word embeddings trained on large subtitle corpora in 50 languages, together with the code used to produce them, as published in Van Paridon & Thompson (2019).

The code is provided at, with instructions for use as command line tools included.

This page documents the full subs2vec module API in more detail, for anyone who wants to use or reuse the code.

