Genomic Analysis
Decoding crop DNA to engineer climate-resilient seeds. Leveraging exascale computing to address the global food security challenges of 2026.
Scientific Foundation
Plant genomes can be far larger than the human genome: the wheat genome is roughly five times its size. Our HPC-driven analysis focuses on the transition from static sequencing to functional genomic simulation.
- Genome Assembly: Reconstructing complete sequences from billions of DNA fragments using de Bruijn graphs.
- Trait Mapping: Identifying genetic markers for heat, drought, and salt tolerance.
- Molecular Dynamics: Simulating protein folding under environmental stress in silico.
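To make the genome assembly step more concrete, here is a minimal sketch of de Bruijn graph assembly on a toy input. The function names (`build_de_bruijn`, `eulerian_path`, `assemble`) are hypothetical, and the approach assumes error-free reads whose (k-1)-mers are all distinct. Production assemblers add error correction, repeat resolution, and graph simplification that this sketch omits.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Nodes are (k-1)-mers; each distinct k-mer contributes one edge."""
    kmers = {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}
    graph = defaultdict(list)
    for kmer in sorted(kmers):
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Walk every edge exactly once (Hierholzer's algorithm)."""
    in_deg = defaultdict(int)
    for targets in graph.values():
        for target in targets:
            in_deg[target] += 1
    nodes = set(graph) | set(in_deg)
    # Start at the node with one more outgoing than incoming edge, if any.
    start = next(
        (n for n in sorted(nodes) if len(graph.get(n, [])) - in_deg[n] == 1),
        min(nodes),
    )
    remaining = {node: list(targets) for node, targets in graph.items()}
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if remaining.get(node):
            stack.append(remaining[node].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def assemble(reads, k):
    """Spell the sequence along the Eulerian path."""
    path = eulerian_path(build_de_bruijn(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])


if __name__ == "__main__":
    # Three overlapping toy reads (illustrative data, not real sequence).
    print(assemble(["ATGGCGT", "GGCGTGC", "CGTGCA"], k=4))
```

Choosing k is a trade-off: a larger k resolves more repeats but needs longer overlaps between reads, which is one reason real plant genomes with heavy repeat content are so demanding.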
// HPC_COMPLEXITY_INDEX
A single wheat genome (Triticum aestivum) requires over 150 GB of raw read data for 30x coverage. At roughly 16 Gb per genome, that is about 480 billion sequenced bases, which demands high I/O throughput and large memory capacity.
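The arithmetic behind a coverage target can be sketched in a few lines. The example assumes a genome size of roughly 16 billion base pairs and a hypothetical read length of 150 bp; both are illustrative inputs, and the function names are made up for this sketch. Actual file sizes depend on read headers, quality scores, and compression, so only the uncompressed sequence characters are estimated here.

```python
def total_bases(genome_size_bp, coverage):
    """Bases that must be sequenced to reach the target average coverage."""
    return genome_size_bp * coverage


def reads_needed(genome_size_bp, coverage, read_length_bp):
    """Number of reads of a given length, rounded up."""
    bases = total_bases(genome_size_bp, coverage)
    return -(-bases // read_length_bp)  # integer ceiling division


def uncompressed_sequence_gb(bases):
    """Size of the sequence characters alone at one byte per base (decimal GB)."""
    return bases / 1e9


if __name__ == "__main__":
    genome_bp = 16_000_000_000  # assumed approximate wheat genome size
    bases = total_bases(genome_bp, coverage=30)
    print(f"bases to sequence: {bases:,}")
    print(f"150 bp reads:      {reads_needed(genome_bp, 30, 150):,}")
    print(f"sequence only:     {uncompressed_sequence_gb(bases):.0f} GB")
```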
HPC Infrastructure Requirements
Memory & Compute
- High-RAM Nodes: 1 TB to 4 TB per node
- Processors: AMD EPYC 9004 Series
- GPU Power: NVIDIA H100 for basecalling
Storage & IO
- Filesystem: Lustre Parallel File System
- Throughput: 500 GB/s Burst Buffer
- Interconnect: 400 Gb/s InfiniBand NDR
Compute Environment
- Scheduler: SLURM Workload Manager
- Containers: Apptainer / Singularity
- Stack: CUDA 12.x / ROCm 6.0
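As an illustration of how the scheduler and container layers fit together, the sketch below renders a SLURM batch script that runs one containerized pipeline step. The helper `render_sbatch`, the image name `pipeline.sif`, and the script `run_alignment.sh` are hypothetical. The `#SBATCH` directives and the `apptainer exec --nv` flag are standard, but resource values and any partition or account settings depend on the site.

```python
def render_sbatch(job_name, image, command, cpus, mem_gb, gpus=0,
                  time_limit="24:00:00"):
    """Return the text of a single-node SLURM batch script."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        "#SBATCH --nodes=1",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={mem_gb}G",
        f"#SBATCH --time={time_limit}",
    ]
    if gpus > 0:
        # Request GPUs as a generic resource on the node.
        lines.append(f"#SBATCH --gres=gpu:{gpus}")
    # --nv exposes the host NVIDIA driver inside the container.
    nv_flag = " --nv" if gpus > 0 else ""
    lines.append("")
    lines.append(f"apptainer exec{nv_flag} {image} {command}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical high-memory alignment job: 64 cores, 1 TB of RAM.
    print(render_sbatch(
        job_name="wheat-align",
        image="pipeline.sif",
        command="bash run_alignment.sh",
        cpus=64,
        mem_gb=1024,
    ))
```

The rendered text can be written to a file and submitted with `sbatch`. A workflow manager such as Nextflow generates comparable job scripts itself, so a helper like this is mainly useful for one-off steps.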
The Bioinformatics Stack
| Category | Core Tool | Application |
|---|---|---|
| Workflow | Nextflow | Scalable pipeline management across HPC nodes. |
| Assembly | Flye / SPAdes | De novo assembly from long reads (Flye) and short reads (SPAdes). |
| Alignment | BWA-MEM | Mapping reads against a reference genome of 16 Gb or more. |
| Variant Calling | GATK | Identifying SNPs and InDels for climate traits. |
| Simulation | GROMACS | Molecular dynamics of plant proteins under environmental stress. |