Documentation for the subs2vec repository

subs2vec is a set of word embeddings trained on large subtitle corpora in 50 languages, and the code accompanying this data, as published in Van Paridon & Thompson (2019).

The code is available at github.com/jvparidon/subs2vec, along with instructions for using it as a set of command-line tools.

This page documents the full subs2vec module API in more detail, for anyone who is interested in the code or wishes to use or reuse it.

Indices and tables