Build sparse weight triplets and generate `.grom`, `.gid`, and `.sid` outputs from model weights and PLINK2 genotype inputs.
Usage
grom_impute(
  weights_table,
  grom_pfx,
  pgen_dir,
  snp_chunk = 1000L,
  sample_subset = integer(0),
  CHUNK = 1000L,
  exportChunk = 256L,
  meanimpute = TRUE,
  is_round = TRUE
)
Arguments
- weights_table
A `data.frame` or `data.table` of model weights.
- grom_pfx
Output prefix for the generated grom files.
- pgen_dir
Path to a directory containing PLINK2 `.pgen`, `.pvar`, and `.psam` files.
- snp_chunk
Integer chunk size used while streaming SNPs. Specifies how many genotype records are loaded at a time and can be used to control peak and average memory usage.
- sample_subset
Integer vector specifying which samples to load and use for imputation.
- CHUNK
Integer chunk size passed to the low-level imputation engine. Specifies the tile size used during computation and can help maintain efficient CPU-cache utilization.
- exportChunk
Integer chunk size used when exporting results. Specifies how many output columns are written at a time and can be used to control peak and average memory usage.
- meanimpute
Logical indicating whether missing dosages should be mean imputed.
- is_round
Logical indicating whether mean imputation should use rounded means.
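When choosing `snp_chunk`, a back-of-envelope estimate of the size of one genotype chunk can help. The sketch below is not part of gromtools: `chunk_bytes()` is a hypothetical helper, and it assumes that a chunk is held as a dense matrix of 8-byte doubles, which this documentation does not state. Treat the result as a rough guide only.

```r
# Hypothetical helper, not part of gromtools.
# Assumption: one chunk is a dense n_samples x snp_chunk matrix of doubles
# (8 bytes per value). The real engine may store dosages differently.
chunk_bytes <- function(n_samples, snp_chunk = 1000L, bytes_per_value = 8) {
  # Convert to numeric first to avoid integer overflow on large cohorts
  as.numeric(n_samples) * as.numeric(snp_chunk) * bytes_per_value
}

# Estimated size in MiB of one chunk for an illustrative cohort of 50,000 samples
chunk_mib <- chunk_bytes(n_samples = 50000, snp_chunk = 1000L) / 1024^2
print(chunk_mib)
```

Under this assumption, halving `snp_chunk` halves the estimate, so the argument gives a direct lever on the per-chunk footprint.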
Details
Chromosome-to-file mapping uses the `chromosomes` column in `weights_table`. The basenames of the `.pgen` and `.pvar` files in `pgen_dir` must contain the corresponding chromosome tag. For example:
weights_table$chromosomes:
chr1
chr2
chr10
Matching filenames in pgen_dir:
cohort.chr1.pgen
cohort.chr1.pvar
cohort.chr2.pgen
cohort.chr2.pvar
cohort.chr10.pgen
cohort.chr10.pvar
Matching is unique by chromosome tag, so `chr1` does not match `chr10`. Each chromosome tag must resolve to exactly one `.pgen` and one `.pvar` file. The `.psam` file is discovered separately and is not chromosome-tag matched.
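The matching rule above can be illustrated with a small base R sketch. This is not the package's actual implementation: `match_chr_files()` is a hypothetical helper, and treating the tag as a whole token (not adjacent to another letter or digit) is an assumption about how `chr1` is kept from matching `chr10`.

```r
# Hypothetical helper, not part of gromtools: resolve each chromosome tag to
# exactly one file with the given extension.
match_chr_files <- function(tags, files, ext = "pgen") {
  base <- basename(files)
  # Keep only files with the requested extension
  base <- base[grepl(paste0("\\.", ext, "$"), base)]
  vapply(tags, function(tag) {
    # Assumption: the tag must not touch another letter or digit, so "chr1"
    # does not hit "chr10". Tags are assumed to contain no regex metacharacters.
    pattern <- paste0("(^|[^[:alnum:]])", tag, "([^[:alnum:]]|$)")
    hits <- base[grepl(pattern, base)]
    if (length(hits) != 1L) {
      stop(sprintf("tag '%s' matched %d .%s file(s), expected 1",
                   tag, length(hits), ext))
    }
    hits
  }, character(1))
}

files <- c(
  "cohort.chr1.pgen", "cohort.chr1.pvar",
  "cohort.chr2.pgen", "cohort.chr2.pvar",
  "cohort.chr10.pgen", "cohort.chr10.pvar"
)
pgen_map <- match_chr_files(c("chr1", "chr2", "chr10"), files, ext = "pgen")
print(pgen_map)
```

A check like this can be run on `list.files(pgen_dir)` before a long job to catch a missing or ambiguous chromosome file early.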
To confirm the resolved mapping after a run, inspect the generated files `meta/chr_to_pgen_map.tsv` and `meta/chr_to_pvar_map.tsv` under the output directory. These record which chromosome tag was matched to which genotype file basename. On completion, `grom_impute()` also prints the output directory path in its console messages.
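A minimal sketch for loading both mapping files into R follows. `read_chr_maps()` is a hypothetical helper, not a gromtools function. The file names come from the paragraph above; the presence of a header row is an assumption, since the column layout is not documented here.

```r
# Hypothetical helper, not part of gromtools: read the two mapping files
# written under meta/ in the output directory.
read_chr_maps <- function(out_dir, header = TRUE) {
  paths <- file.path(out_dir, "meta",
                     c("chr_to_pgen_map.tsv", "chr_to_pvar_map.tsv"))
  names(paths) <- c("pgen", "pvar")
  missing <- paths[!file.exists(paths)]
  if (length(missing) > 0L) {
    stop("mapping file(s) not found: ", paste(missing, collapse = ", "))
  }
  # Assumption: the files are tab-separated with a header row;
  # pass header = FALSE if they are not.
  lapply(paths, read.delim, header = header, stringsAsFactors = FALSE)
}

# Usage, where out_dir is the directory reported by grom_impute() on completion:
# maps <- read_chr_maps(out_dir)
# maps$pgen
# maps$pvar
```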
Examples
library(gromtools)
pgen_dir <- system.file(
  "extdata",
  "synthetic_chromosomes",
  package = "gromtools"
)
db_directory <- system.file(
  "extdata",
  "synth_small_variant_weights_db",
  package = "gromtools"
)
model_weights_table <- read_db_dir(db_dir = db_directory)
out_dir <- file.path(tempdir(), "tmp_grom_run")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
grom_pfx <- file.path(out_dir, "synth_example")
grom_impute(
  weights_table = model_weights_table,
  grom_pfx = grom_pfx,
  pgen_dir = pgen_dir
)
#> All chromosome tags were uniquely mapped to a .pgen file.
#> ### impute_grom() started at: 2026-04-20 01:57:31
#> ### impute_grom() completed at: 2026-04-20 01:57:31
#> Results stored under directory /tmp/RtmpLdOp7N/tmp_grom_run