Getting Started
This page offers a quick introduction to installing and using gen_surv.
Installation
The project is managed with Poetry. Clone the repository and install dependencies:
poetry install
This will create a virtual environment and install all required packages.
Basic Usage
Generate datasets directly in Python:
from gen_surv import export_dataset, generate
# Cox Proportional Hazards example
df = generate(
model="cphm",
n=100,
model_cens="uniform",
cens_par=1.0,
beta=0.5,
covariate_range=2.0,
)
# Save to RDS for use in R
export_dataset(df, "simulated_data.rds")
You can also generate data from the command line:
python -m gen_surv dataset aft_ln --n 100 > data.csv
For a full description of available models and parameters, see the API reference.
Building the Documentation
Documentation is written using Sphinx. To build the HTML pages locally run:
cd docs
make html
The generated files will be available under docs/build/html
.
Scikit-learn Integration
You can wrap the generator in a transformer compatible with scikit-learn:
from gen_surv import GenSurvDataGenerator
est = GenSurvDataGenerator("cphm", n=10, beta=0.5, covariate_range=1.0)
df = est.fit_transform()
Lifelines and scikit-survival
Datasets generated with gen_surv can be directly used with
lifelines. For
scikit-survival you can convert the
DataFrame using to_sksurv
:
Note
The to_sksurv
helper requires the optional dependency
scikit-survival
. Install it with poetry install --with dev
or
pip install scikit-survival
.
from gen_surv import to_sksurv
struct = to_sksurv(df)