651 translations across English, French, Italian, Polish, and Spanish — with verified edition metadata, pre-computed embeddings, and pairwise similarity scores.
Targum is a corpus that prioritizes depth over linguistic breadth: it offers 2.4–5× more translations per language than any prior resource. Each translation is mapped to a canonical edition identifier with documented provenance, which enables both micro-level analysis of translation families and macro-level comparison across confessional traditions.
We welcome questions, corrections, and reports of missing translations. For data or parsing issues, please open an issue on GitHub. For copyright concerns, collaboration, or anything else, write to mrapacz@agh.edu.pl.