The Digital Corpus of Sanskrit (DCS) is a sandhi-split corpus of Sanskrit texts with full morphological and lexical analysis.

The DCS is designed for text-historical research in Sanskrit linguistics and philology. Users can search for lexical units (words) and their collocations in a corpus of about 4,800,000 manually tagged words in 650,000 text lines.

The DCS offers two main entry points for research:

  1. Words can be retrieved from the dictionary through a simple query or a dictionary page. For each lexical unit contained in the corpus, DCS provides the complete set of occurrences and a statistical evaluation based on historical principles.
  2. The text interface shows every text in the corpus along with its interlinear lexical and morphological analysis.

Large parts of the annotations are available for download on GitHub.