Quick Start
- Author: Rohit Goswami
1 Quick Start
1.1 Installation
1.1.1 From source (recommended)
git clone https://github.com/HaoZeke/rsx-rs.git
cd rsx-rs
cargo build --release
# Binary at target/release/rsx
1.1.2 From GitHub releases
Pre-built binaries are available for Linux (x86_64, aarch64), macOS (x86_64, arm64), and Windows from the Releases page.
1.1.3 Via pixi
cd rsx-rs
pixi run build
1.2 Example workflow
Given demultiplexed RAD-seq reads in reads/ and a population map:
# popmap.tsv
ind1 M
ind2 M
ind3 F
ind4 F
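The population map pairs each individual with a group label. A minimal Python sketch for writing and sanity-checking such a file is given here. It assumes the two columns are tab-separated with no header row, as in the original RADSex popmap format; the helper names `write_popmap` and `read_popmap` are hypothetical and are not part of rsx.

```python
import csv
import io
from collections import Counter


def write_popmap(handle, assignments):
    """Write (individual, group) pairs as tab-separated lines without a header."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for individual, group in assignments:
        writer.writerow([individual, group])


def read_popmap(handle):
    """Parse a popmap into {individual: group}, rejecting malformed or duplicate rows."""
    popmap = {}
    for line_no, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(f"line {line_no}: expected 2 columns, got {len(row)}")
        individual, group = row
        if individual in popmap:
            raise ValueError(f"line {line_no}: duplicate individual {individual!r}")
        popmap[individual] = group
    return popmap


# An in-memory buffer stands in for popmap.tsv so the sketch has no side effects.
buffer = io.StringIO()
write_popmap(buffer, [("ind1", "M"), ("ind2", "M"), ("ind3", "F"), ("ind4", "F")])
buffer.seek(0)
popmap = read_popmap(buffer)
group_sizes = Counter(popmap.values())
print(dict(group_sizes))
```

Checking group sizes before running the pipeline catches a common mistake: a popmap where one group is empty or an individual is listed twice.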
1.2.1 Step 1: Build markers table
rsx process -i reads/ -o markers.tsv -T 4 -d 5
1.2.2 Step 2: Check marker frequencies
rsx freq -t markers.tsv -o freq.tsv -d 5
1.2.3 Step 3: Compute sex-bias distribution
rsx distrib -t markers.tsv -p popmap.tsv -o distrib.tsv -d 5 -G M,F
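To make the roles of `-d 5` and `-G M,F` concrete, here is an illustrative Python sketch of the kind of tally a sex-bias distribution summarizes: for each marker, count how many individuals of each group carry it, then count markers per (males, females) pair. It assumes a marker counts as present in an individual when its depth is at least the `-d` threshold, as described for the original RADSex tool. The in-memory `depths` layout, the marker names, and the function name are hypothetical and do not reflect rsx's internals or file schema.

```python
from collections import Counter


def sex_distribution(depths, popmap, min_depth, groups=("M", "F")):
    """Count markers by (number present in groups[0], number present in groups[1])."""
    tally = Counter()
    for marker_depths in depths.values():
        present = [0, 0]
        for individual, depth in marker_depths.items():
            group = popmap.get(individual)
            # A marker is treated as present only at or above the depth threshold.
            if group in groups and depth >= min_depth:
                present[groups.index(group)] += 1
        tally[tuple(present)] += 1
    return tally


popmap = {"ind1": "M", "ind2": "M", "ind3": "F", "ind4": "F"}
# Hypothetical per-marker depths, keyed by individual.
depths = {
    "marker_a": {"ind1": 12, "ind2": 9, "ind3": 0, "ind4": 1},
    "marker_b": {"ind1": 7, "ind2": 6, "ind3": 8, "ind4": 5},
    "marker_c": {"ind1": 4, "ind2": 0, "ind3": 10, "ind4": 6},
}
tally = sex_distribution(depths, popmap, min_depth=5)
for (males, females), count in sorted(tally.items()):
    print(males, females, count, sep="\t")
```

In this toy input, `marker_a` is found only in males and `marker_c` only in females, because the depth of 4 in `ind1` falls below the threshold of 5.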
1.2.4 Step 4: Extract significant markers
rsx signif -t markers.tsv -p popmap.tsv -o signif.tsv -d 5 -G M,F
1.2.5 Step 5: Map to reference genome
rsx map -t markers.tsv -p popmap.tsv -g genome.fa -o aligned.tsv -d 5 -G M,F
1.2.6 Step 6: Merge multiple tables
rsx merge -o combined.tsv pop1_markers.tsv pop2_markers.tsv pop3_markers.tsv
The merge uses a bounded-memory external sort (~500 MB), so it can handle arbitrarily large datasets.
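The bounded-memory idea can be illustrated with a generic external merge sort: sort fixed-size chunks in memory, spill each one to a temporary file, then stream a k-way merge over the spilled runs. The Python sketch below shows that general technique only. The chunk size, function name, and line-based record format are assumptions, and it is not rsx's actual implementation.

```python
import heapq
import tempfile


def external_sort(lines, chunk_size):
    """Yield lines in sorted order while holding at most chunk_size lines in memory
    (plus one buffered line per spilled run during the merge).

    Input lines are assumed to contain no newline characters.
    """
    runs = []
    chunk = []

    def spill():
        # Sort the current chunk and write it to an anonymous temporary file.
        run = tempfile.TemporaryFile(mode="w+", encoding="utf-8")
        run.writelines(line + "\n" for line in sorted(chunk))
        run.seek(0)
        runs.append(run)
        chunk.clear()

    def read_run(run):
        # Stream one sorted run back, dropping only the newline added by spill().
        for line in run:
            yield line.rstrip("\n")

    for line in lines:
        chunk.append(line)
        if len(chunk) >= chunk_size:
            spill()
    if chunk:
        spill()

    try:
        # heapq.merge lazily merges already-sorted inputs one record at a time.
        yield from heapq.merge(*(read_run(run) for run in runs))
    finally:
        for run in runs:
            run.close()


# Hypothetical tab-separated records; chunk_size=2 forces three spilled runs.
records = ["GATTACA\t3", "ACGT\t7", "TTAGG\t1", "CCGA\t4", "AACC\t9"]
merged = list(external_sort(records, chunk_size=2))
print(merged == sorted(records))
```

Peak memory here depends on `chunk_size` and the number of runs, not on the total number of records, which is the property the bounded-memory claim relies on.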
1.2.7 Step 7: Streaming PCA
rsx pca -t combined.tsv -o pca_results/ -d 5 -r 10
This produces eigenvalues, loadings, and a summary in the output directory. For sex-linked markers, PC1 typically separates males from females.
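One way to run PCA over a table too large to hold in memory is to stream over markers and accumulate only an individuals-by-individuals Gram matrix, then extract components from that small matrix. The sketch below does this in plain Python, using power iteration to approximate PC1. It is an illustration under assumptions (presence/absence encoding, per-marker centering, hypothetical function names and toy data), not a description of how `rsx pca` is implemented.

```python
def accumulate_gram(marker_rows, n_individuals):
    """Stream marker rows and accumulate the n x n Gram matrix of centered rows."""
    gram = [[0.0] * n_individuals for _ in range(n_individuals)]
    for row in marker_rows:
        mean = sum(row) / n_individuals
        centered = [value - mean for value in row]
        for i in range(n_individuals):
            for j in range(n_individuals):
                gram[i][j] += centered[i] * centered[j]
    return gram


def leading_component(gram, iterations=100):
    """Approximate the dominant eigenvector of a symmetric matrix by power iteration."""
    n = len(gram)
    vector = [1.0 + i for i in range(n)]  # deterministic, non-uniform start
    for _ in range(iterations):
        product = [sum(gram[i][j] * vector[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in product) ** 0.5
        vector = [x / norm for x in product]
    return vector


def toy_markers():
    """Yield presence/absence rows for individuals ordered as M, M, F, F."""
    for _ in range(3):
        yield [1, 1, 0, 0]  # male-specific markers
    yield [1, 1, 1, 1]      # shared marker, contributes nothing after centering
    yield [1, 0, 1, 0]      # marker unrelated to sex


gram = accumulate_gram(toy_markers(), n_individuals=4)
pc1 = leading_component(gram)
print([round(score, 3) for score in pc1])
```

Only the 4 x 4 matrix is kept between rows, so memory does not grow with the number of markers. In the toy data the two males receive PC1 scores of one sign and the two females the opposite sign.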
1.3 Output format
All outputs are tab-separated, with an optional #source: comment line.
The format is identical to that of the original C++ RADSex tool, so existing
R scripts work without modification.
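Because the outputs are plain TSV with an optional leading comment, they can be read with standard tooling. A minimal Python sketch follows. It assumes comment lines start with `#` and that the first non-comment line is a header row. The column names and values in the sample text are invented for illustration and are not the actual rsx output schema.

```python
import csv
import io
import itertools


def open_table(handle):
    """Return (comments, reader): leading '#' lines and a DictReader over the rest."""
    comments = []
    first_data_line = []
    for line in handle:
        if line.startswith("#"):
            comments.append(line[1:].strip())
        else:
            first_data_line.append(line)
            break
    # Re-attach the header line consumed above, then keep streaming from the handle.
    reader = csv.DictReader(itertools.chain(first_data_line, handle), delimiter="\t")
    return comments, reader


# Hypothetical file contents; a real call would pass an open file instead.
sample = "#source: example\nmales\tfemales\tmarkers\n2\t0\t1\n2\t2\t1\n"
comments, reader = open_table(io.StringIO(sample))
rows = list(reader)
print(comments, rows[0])
```

The reader yields one row at a time, so large tables can be processed without loading them whole.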
1.4 Memory guarantees
All commands operate in bounded memory regardless of input size:
| Command | Memory |
|---|---|
| distrib, freq | O(n_individuals) |
| signif, subset | O(n_individuals) |
| map | O(genome_index) |
| depth (small) | O(n_markers * n_ind) |
| depth (> 2GB) | O(buffer_size) |
| merge | O(buffer_size) |
| pca | O(n_individuals^2) |
For 200 individuals and 75M markers, typical peak memory is under 500 MB (except for map, which also loads the minimap2 genome index).
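A back-of-the-envelope calculation shows why streaming matters at this scale. The Python sketch below compares a dense markers-by-individuals table with an individuals-by-individuals matrix for 200 individuals and 75M markers. The element sizes (2-byte depths, 8-byte floats) are assumptions chosen for illustration, not values taken from rsx.

```python
N_INDIVIDUALS = 200
N_MARKERS = 75_000_000
DEPTH_BYTES = 2  # assumed: one 16-bit depth per table cell
FLOAT_BYTES = 8  # assumed: one 64-bit float per matrix entry

# Holding every marker depth for every individual at once.
dense_table_bytes = N_MARKERS * N_INDIVIDUALS * DEPTH_BYTES
# Holding only a pairwise matrix over individuals.
pairwise_matrix_bytes = N_INDIVIDUALS ** 2 * FLOAT_BYTES

print(f"dense table:     {dense_table_bytes / 1e9:.1f} GB")
print(f"pairwise matrix: {pairwise_matrix_bytes / 1e3:.1f} kB")
```

Under these assumptions the dense table would need 30 GB, while the pairwise matrix needs 320 kB, which is why per-individual and per-pair accumulators fit comfortably in a sub-500 MB budget.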