
Command-Line Usage

OmniGenBench provides a unified command-line interface (ogb) for fast inference and training.


AutoInfer

TF Binding Prediction

Biological Context: Predict binding sites for 919 transcription factors in plant promoter regions—critical for understanding gene regulation and designing synthetic promoters.

Task: Multi-label classification

Model: yangheng/ogb_tfb_finetuned

Single-sequence inference

ogb autoinfer \
  --model yangheng/ogb_tfb_finetuned \
  --sequence "ATCGATCGATCGATCGATCGATCGATCGATCG" \
  --output-file tfb_predictions.json

Batch inference

ogb autoinfer \
  --model yangheng/ogb_tfb_finetuned \
  --input-file sequences.json \
  --batch-size 64 \
  --output-file tfb_results.json

Example Input: sequences.json
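
The exact input schema is defined by the CLI (see ogb autoinfer --help); a minimal sketch of sequences.json, assuming a JSON list of sequence records that matches the indexed metadata in the output below:

[
  {"sequence": "ATCGATCGATCGATCGATCGATCGATCGATCG"},
  {"sequence": "GGCCATCGATCGATCGATCGATCGATCGATAA"},
  {"sequence": "TTAACGATCGATCGATCGATCGATCGATCGCC"}
]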

Expected Output:

{
  "model": "yangheng/ogb_tfb_finetuned",
  "total_sequences": 3,
  "results": [
    {
        "sequence": "ATCGATCGATCGATCGATCGATCGATCGATCG",
        "metadata": {"index": 0},
        "predictions": [1, 0, 1, 0, ...],
        "probabilities": [0.92, 0.15, 0.88, ...]
    },
    ...
  ]
}

Please refer to the tutorial for more information about TF binding prediction.


Translation Efficiency Prediction

Biological Context: Predict whether mRNA 5' UTR sequences lead to high or low translation efficiency—essential for optimizing protein expression in biotechnology.

Task: Binary classification

Model: yangheng/ogb_te_finetuned

ogb autoinfer \
  --model yangheng/ogb_te_finetuned \
  --input-file utr_sequences.csv \
  --output-file te_predictions.json

Example Input: utr_sequences.csv
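
The column names below are an assumption inferred from the metadata fields in the output that follows (sequence plus gene_id and description); check the tutorial for the exact format expected by the CLI:

sequence,gene_id,description
ATCGATCGATCG,gene_001,5' UTR optimized
...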

Expected Output:

{
  "results": [
    {
      "sequence": "ATCGATCGATCG",
      "metadata": {"gene_id": "gene_001", "description": "5' UTR optimized"},
      "predictions": 1,
      "probabilities": [0.077, 0.923]
    },
    ...
  ]
}

Please refer to the tutorial for more information about translation efficiency prediction.


AutoTrain

You can use AutoTrain to fine-tune your own genomic foundation model with a single command!

Basic Fine-tuning

ogb autotrain \
  --dataset yangheng/tfb_promoters \
  --model zhihan1996/DNABERT-2-117M \
  --output-dir ./my_finetuned_model \
  --num-epochs 10 \
  --batch-size 32 \
  --learning-rate 5e-5
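
After training completes, the checkpoint written to --output-dir can typically be passed back to autoinfer. This sketch assumes --model accepts a local checkpoint directory, as with Hugging Face-style model paths; the output file name is illustrative:

ogb autoinfer \
  --model ./my_finetuned_model \
  --sequence "ATCGATCGATCGATCGATCGATCGATCGATCG" \
  --output-file finetuned_predictions.json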

Training with Dataset Configuration

ogb autotrain \
  --dataset ./my_dataset \
  --model yangheng/OmniGenome-186M

Example dataset: my_dataset
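
The precise directory layout and configuration format are documented in the CLI Reference; a rough sketch, assuming the common convention of per-split data files plus a dataset config (the file names here are illustrative):

my_dataset/
  config.py     # hypothetical dataset/task configuration
  train.json    # training split
  valid.json    # validation split
  test.json     # test split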

Parameter Reference

For all available CLI options, please run:

ogb autoinfer --help
ogb autotrain --help

A full parameter list is also available in the CLI Reference.