Command-Line Usage
OmniGenBench provides a unified command-line interface (ogb) for fast inference and training.
AutoInfer
TF Binding Prediction
Biological Context: Predict binding sites for 919 transcription factors in plant promoter regions—critical for understanding gene regulation and designing synthetic promoters.
Task: Multi-label classification
Model: yangheng/ogb_tfb_finetuned
Single-sequence inference
ogb autoinfer \
  --model yangheng/ogb_tfb_finetuned \
  --sequence "ATCGATCGATCGATCGATCGATCGATCGATCG" \
  --output-file tfb_predictions.json
Batch inference
ogb autoinfer \
  --model yangheng/ogb_tfb_finetuned \
  --input-file sequences.json \
  --batch-size 64 \
  --output-file tfb_results.json
Example Input: sequences.json
Expected Output:
{
  "model": "yangheng/ogb_tfb_finetuned",
  "total_sequences": 3,
  "results": [
    {
      "sequence": "ATCGATCGATCGATCGATCGATCGATCGATCG",
      "metadata": {"index": 0},
      "predictions": [1, 0, 1, 0, ...],
      "probabilities": [0.92, 0.15, 0.88, ...]
    },
    ...
  ]
}
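As a rough sketch of how this output might be consumed, the Python snippet below reads a result object with the same fields (results, predictions, probabilities) and lists the label indices whose probability clears a threshold. The sample data, the helper name bound_labels, and the 0.5 threshold are illustrative assumptions and are not part of OmniGenBench. The mapping from label index to transcription factor name is not given here, so only indices are reported.

```python
import json

# Hypothetical sample mirroring the output structure shown above,
# shortened to four labels instead of 919.
SAMPLE = """
{
  "model": "yangheng/ogb_tfb_finetuned",
  "total_sequences": 1,
  "results": [
    {
      "sequence": "ATCGATCGATCGATCGATCGATCGATCGATCG",
      "metadata": {"index": 0},
      "predictions": [1, 0, 1, 0],
      "probabilities": [0.92, 0.15, 0.88, 0.31]
    }
  ]
}
"""


def bound_labels(result, threshold=0.5):
    """Return (label_index, probability) pairs at or above the threshold."""
    return [
        (i, p)
        for i, p in enumerate(result["probabilities"])
        if p >= threshold
    ]


# For a real run, load the file written by --output-file instead of SAMPLE.
data = json.loads(SAMPLE)
for result in data["results"]:
    hits = bound_labels(result)
    print(result["metadata"]["index"], hits)
```

Raising the threshold (for example bound_labels(result, threshold=0.9)) keeps only the most confident calls, which may be useful when 919 labels are scored per sequence.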
Please refer to the tutorial for more information about TF binding prediction.
Translation Efficiency Prediction
Biological Context: Predict whether mRNA 5' UTR sequences lead to high or low translation efficiency—essential for optimizing protein expression in biotechnology.
Task: Binary classification
Model: yangheng/ogb_te_finetuned
ogb autoinfer \
  --model yangheng/ogb_te_finetuned \
  --input-file utr_sequences.csv \
  --output-file te_predictions.json
Example Input: utr_sequences.csv
Expected Output:
{
  "results": [
    {
      "sequence": "ATCGATCGATCG",
      "metadata": {"gene_id": "gene_001", "description": "5' UTR optimized"},
      "predictions": 1,
      "probabilities": [0.077, 0.923]
    },
    ...
  ]
}
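The binary output can be turned into a short per-gene summary. The Python sketch below assumes that class 1 means high translation efficiency and that the probabilities list is indexed by class; neither is stated above, so check the model card before relying on it. The helper name summarize and the LABELS mapping are hypothetical.

```python
import json

# Assumed label mapping; the source does not state which class is "high".
LABELS = {0: "low", 1: "high"}

# Sample copied from the expected output above (single entry).
SAMPLE = """
{
  "results": [
    {
      "sequence": "ATCGATCGATCG",
      "metadata": {"gene_id": "gene_001", "description": "5' UTR optimized"},
      "predictions": 1,
      "probabilities": [0.077, 0.923]
    }
  ]
}
"""


def summarize(result):
    """Return (gene_id, label, confidence) for one result entry."""
    pred = result["predictions"]
    gene_id = result["metadata"].get("gene_id", "unknown")
    # Confidence is the probability assigned to the predicted class.
    return gene_id, LABELS[pred], result["probabilities"][pred]


rows = [summarize(r) for r in json.loads(SAMPLE)["results"]]
for gene_id, label, confidence in rows:
    print(f"{gene_id}\t{label}\t{confidence:.3f}")
```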
AutoTrain
You can use AutoTrain to fine-tune your own genomic foundation model with a single command.
Basic Fine-tuning
ogb autotrain \
  --dataset yangheng/tfb_promoters \
  --model zhihan1996/DNABERT-2-117M \
  --output-dir ./my_finetuned_model \
  --num-epochs 10 \
  --batch-size 32 \
  --learning-rate 5e-5
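When several runs differ only in a hyperparameter, it can be convenient to generate the command from a script. The Python sketch below builds the argument list using only the flags shown in the command above; the function name autotrain_command, the learning-rate values, and the per-run output directory names are hypothetical. It prints the commands rather than running them.

```python
import shlex


def autotrain_command(dataset, model, output_dir, num_epochs, batch_size, learning_rate):
    """Build the argv list for `ogb autotrain` using only the flags shown above."""
    return [
        "ogb", "autotrain",
        "--dataset", dataset,
        "--model", model,
        "--output-dir", output_dir,
        "--num-epochs", str(num_epochs),
        "--batch-size", str(batch_size),
        "--learning-rate", str(learning_rate),
    ]


# Hypothetical sweep values; one output directory per learning rate.
commands = [
    autotrain_command(
        dataset="yangheng/tfb_promoters",
        model="zhihan1996/DNABERT-2-117M",
        output_dir=f"./my_finetuned_model_lr{lr}",
        num_epochs=10,
        batch_size=32,
        learning_rate=lr,
    )
    for lr in ("1e-5", "5e-5")
]

for argv in commands:
    # Shell-quoted form for inspection; nothing is executed here.
    print(shlex.join(argv))
```

To actually launch a run, each list could be passed to subprocess.run(argv, check=True), assuming ogb is installed and on the PATH.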
Training with Dataset Configuration
Example dataset: my_dataset
Parameter Reference
For all available CLI options, please run:
A full parameter list is also available in the CLI Reference.