Install trident via editable install. See instructions here
If you run the tile processing, change the dataloaders used by trident to use multiprocessing_context='fork'.
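One way to keep that setting in one place is to build the dataloader keyword arguments with a small helper. The sketch below is an illustration only, not trident's own code: the helper name fork_loader_kwargs is hypothetical, and you still need to locate where your installed copy of trident constructs its dataloaders. It relies on multiprocessing_context being a keyword argument of PyTorch's DataLoader, which accepts it only when num_workers is greater than 0.

```python
import multiprocessing as mp


def fork_loader_kwargs(num_workers):
    """Build DataLoader keyword arguments that request the 'fork' start method.

    Hypothetical helper for illustration; it is not part of trident.
    """
    if num_workers > 0 and "fork" not in mp.get_all_start_methods():
        # 'fork' is only offered on POSIX systems such as Linux and macOS.
        raise RuntimeError("The 'fork' start method is not available on this platform")

    kwargs = {"num_workers": num_workers}
    if num_workers > 0:
        # DataLoader rejects multiprocessing_context when num_workers is 0,
        # so the key is only added for multi-process loading.
        kwargs["multiprocessing_context"] = "fork"
    return kwargs
```

The returned dictionary can be unpacked into the call, for example DataLoader(dataset, **fork_loader_kwargs(4)).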
Install dependencies using reqs.txt
The install takes a while. Future versions will try to reduce the number of dependencies.
conda create -n conch python=3.10
cd ./causal_path
python3 -m pip install -r ./reqs.txt
Install this repo as an editable module:
python3 -m pip install -e .
Dataset Preparation: Tile Level Extraction of Foundation Model Features
All we need is a CSV file with a column named wsi that gives the path to each svs image, one slide per row.
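If your slides sit in a single folder, a CSV of this shape can be generated rather than written by hand. This is a minimal sketch under that assumption: the function name write_wsi_csv and the flat folder layout are hypothetical, and the only requirement taken from this README is a wsi column holding absolute paths to svs files.

```python
import csv
from pathlib import Path


def write_wsi_csv(slide_dir, csv_path):
    """Write a CSV with one 'wsi' column listing every .svs file in slide_dir.

    Returns the number of slides written. Hypothetical helper for illustration.
    """
    # resolve() turns each entry into an absolute path.
    slides = sorted(p.resolve() for p in Path(slide_dir).glob("*.svs"))
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["wsi"])  # header expected by the pipeline
        for slide in slides:
            writer.writerow([str(slide)])
    return len(slides)
```

The resulting file is what you would pass as csv_path.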
Fill out the details below.
csv_path: has a column wsi with absolute path to svs images
output_dir: Directory where the pipeline stores the extracted features, logs, geojson files, and segmentation masks
token_dir: Path to your Hugging Face token (a JSON file in the example command). This is needed to access the models
config_save_dir: We will fill this directory with config files for running each of the extraction steps
num_workers: DO NOT TOUCH. Keep the value 1 used in the example command
The processing pipeline is rather top heavy. We first create tissue segmentations for each slide image. This is the most time-consuming step.
Tile and feature extraction are rather quick. Note that tiles are stored as geojson. We do not save PNGs of the tile regions themselves.
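Because tiles are kept as geojson rather than images, you can inspect them with the standard json module. The sketch below assumes the tile file is a standard GeoJSON FeatureCollection whose features are Polygon geometries. That layout is an assumption, not something this README confirms, so check one of your own output files first. The function names are hypothetical.

```python
import json


def tile_bounding_boxes(geojson):
    """Return (min_x, min_y, max_x, max_y) for each Polygon feature.

    Assumes a GeoJSON FeatureCollection; other geometry types are skipped.
    """
    boxes = []
    for feature in geojson.get("features", []):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Polygon":
            continue
        # The first ring of a GeoJSON Polygon is its exterior boundary.
        outer_ring = geometry["coordinates"][0]
        xs = [point[0] for point in outer_ring]
        ys = [point[1] for point in outer_ring]
        boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes


def load_tile_boxes(path):
    """Read a geojson file from disk and return its tile bounding boxes."""
    with open(path) as handle:
        return tile_bounding_boxes(json.load(handle))
```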
Generate Processing Config Files
python3 -m causal_path.exp_setup.wsi_processing.make_proc_pipeline --csv_path /pathToYourFolders/data/csvs/colon_cancer_dev_set.csv \
--output_dir /pathToYourFolders/data/embeddings/crc_embeddings \
--wsi_dir /PathToDirectoryContainsWSI/Rawimages/ \
--num_workers 1 \
--token_dir "/pathToYourHGToken/hg_tok.json" \
--config_save_dir /pathToYourFolders/data/configs/embedding/crc
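Before launching the long-running steps, it can help to confirm that the config files were actually written. The sketch below checks only the three file names that appear in the commands of this README. The generator also produces other configs (for example for virchow_v2 and uni2), whose file names are not listed here and are therefore not checked. The function name missing_configs is hypothetical.

```python
from pathlib import Path

# File names taken from the commands in this README, relative to config_save_dir.
EXPECTED_CONFIGS = (
    "segmentation.json",
    "patch.json",
    "feat_extract/extract_conch_v1_feats.json",
)


def missing_configs(config_save_dir):
    """Return the expected config files that are absent from config_save_dir."""
    root = Path(config_save_dir)
    return [name for name in EXPECTED_CONFIGS if not (root / name).is_file()]
```

An empty list means all three files are present.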
Run Tissue Segmentation
python3 -m causal_path.tiling --config_path /pathToYourFolders/data/configs/embedding/crc/segmentation.json
Extract Tiles/Patches
python3 -m causal_path.tiling --config_path /pathToYourFolders/data/configs/embedding/crc/patch.json
Extract Feature Embeddings
Only conch_v1 is shown here, but the code also generates virchow_v2 and uni2 configurations. The same approach extends to other models supported by trident.
python3 -m causal_path.tiling --config_path /pathToYourFolders/data/configs/embedding/crc/feat_extract/extract_conch_v1_feats.json
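The three causal_path.tiling invocations (segmentation, patch, feature extraction) can also be chained from a short script so that a failure in one step stops the rest. This is a sketch of one possible wrapper, not part of the repo: the function names are hypothetical, and it assumes the config files use the names shown in the commands of this README.

```python
import subprocess
import sys
from pathlib import Path

# Config files in the order the steps must run, relative to config_save_dir.
STEP_CONFIGS = (
    "segmentation.json",
    "patch.json",
    "feat_extract/extract_conch_v1_feats.json",
)


def build_commands(config_save_dir, python=sys.executable):
    """Return one argument list per pipeline step, in execution order."""
    root = Path(config_save_dir)
    return [
        [python, "-m", "causal_path.tiling", "--config_path", str(root / name)]
        for name in STEP_CONFIGS
    ]


def run_pipeline(config_save_dir):
    """Run each step in turn; check=True raises on the first failing step."""
    for command in build_commands(config_save_dir):
        subprocess.run(command, check=True)
```

Calling run_pipeline("/pathToYourFolders/data/configs/embedding/crc") would then run the same three commands listed in this README.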