This replication package provides the resources needed to reproduce the dataset and the methodological approach described in the Paraly data paper. The dataset consists of three corpora (full texts and metadata) of French literature from the 18th, 19th, and 20th centuries. The texts contain annotated linguistic references to the concept of paralysis, both figurative and concrete. They originate from the “Les classiques de la littérature” collection maintained on Gallica, the digital library of the Bibliothèque nationale de France (BnF).

The replication package includes scripts and documentation for data collection, extraction, processing, annotation, and model training. It contains: scripts for data and metadata collection; the original OCR-ed texts with their metadata from Gallica; text excerpts containing the character sequence “paraly”, together with their manual annotations; annotation guidelines detailing the methodology used; a pre-trained multilabel classifier built on the annotated data with the Flair library; a graphical user interface application for automatic annotation; and code and workflows for processing text corpora.

With these resources, researchers can reproduce the dataset creation process, refine the annotation workflow, and extend the methodological approach to other literary corpora.
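
To illustrate what selecting excerpts by the character sequence “paraly” can look like, here is a minimal Python sketch. It is not the extraction script shipped in the package: the function name `extract_excerpts`, the `window` parameter, the character-based context window, and the sample sentence are all assumptions made for this example.

```python
import re


def extract_excerpts(text: str, needle: str = "paraly", window: int = 40) -> list[str]:
    """Return one excerpt per case-insensitive occurrence of `needle`.

    Hypothetical helper: each excerpt is the match plus up to `window`
    characters of context on either side, with outer whitespace stripped.
    """
    excerpts = []
    for match in re.finditer(re.escape(needle), text, flags=re.IGNORECASE):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        excerpts.append(text[start:end].strip())
    return excerpts


# Invented sample sentence, not taken from the corpora.
sample = "Il resta paralysé par la peur. La paralysie gagnait ses membres."
for excerpt in extract_excerpts(sample, window=10):
    print(excerpt)
```

Matching on the stem rather than on whole words catches inflected and derived forms such as “paralysé”, “paralysie”, or “paralytique” in a single pass. A real pipeline would more likely cut excerpts at sentence or paragraph boundaries than at a fixed character offset.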