Io
btorch.io
I/O utilities for serializing and loading simulation data.
This module provides helpers for converting between PyTorch tensors, nested dictionaries, and persistent storage formats (xarray/Zarr). Key features include:
- Sparse array encoding for efficient spike storage
- Dimension-aware serialization with flexible (time, batch, neuron) grouping
- Automatic handling of partial recordings on neuron subsets
- Compression and chunking for large datasets
Main entry points
- memories_to_xarray: Convert nested dict to xr.Dataset
- xarray_to_memories: Restore nested dict from xr.Dataset
- save_memories_to_xarray: Save dict directly to Zarr
- load_memories_from_xarray: Load dict from Zarr store
Attributes
__all__ = ['dict_to_xarray', 'from_spike_sparse', 'load_dict_from_xarray', 'load_memories_from_xarray', 'memories_to_xarray', 'save_dict_to_xarray', 'save_memories_to_xarray', 'to_sparse_repr', 'xarray_to_dict', 'xarray_to_memories']
dict_to_xarray = memories_to_xarray
load_dict_from_xarray = load_memories_from_xarray
save_dict_to_xarray = save_memories_to_xarray
xarray_to_dict = xarray_to_memories
Functions
from_spike_sparse(ds, var_name, return_sparse_2d=False)
Reconstruct a dense or scipy sparse array from btorch sparse encoding.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ds | Dataset | Dataset containing the sparse-encoded variable. | required |
| var_name | str | Name of the sparse marker variable (the scalar with attrs). | required |
| return_sparse_2d | bool | If True and the original was 2D, return a scipy coo_array instead of dense numpy. | False |
Returns:
| Type | Description |
|---|---|
| tuple[ndarray \| coo_array, set[str]] | A tuple of (array, used_variable_names). The array is either dense numpy or scipy sparse (if 2D and requested). used_variable_names contains all dataset keys consumed during reconstruction. |
Shape semantics
- Output array has shape from original_shape attrs
- Dense output: numpy array of original dtype
- Sparse output (2D only): scipy.sparse.coo_array
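The reconstruction described above can be illustrated with a dependency-free sketch that rebuilds a dense nested list from per-dimension index lists, a value list, and an original shape. The name `coo_to_dense` and its argument layout are assumptions made for illustration; this is not btorch's storage layout or implementation.

```python
def coo_to_dense(indices, data, original_shape, fill=0):
    """Rebuild a dense nested list from COO-style indices and values.

    indices: one list per dimension, each of length nnz (hypothetical layout)
    data: list of the nnz stored values
    original_shape: tuple giving the full dense shape
    """
    def build(shape):
        # Recursively allocate a nested list filled with the default value.
        if len(shape) == 1:
            return [fill] * shape[0]
        return [build(shape[1:]) for _ in range(shape[0])]

    dense = build(tuple(original_shape))
    # zip(*indices) yields one coordinate tuple per stored entry.
    for coord, value in zip(zip(*indices), data):
        cell = dense
        for axis_index in coord[:-1]:
            cell = cell[axis_index]
        cell[coord[-1]] = value
    return dense


# Two spikes in a (time=3, neuron=4) raster, at (0, 1) and (2, 3).
raster = coo_to_dense([[0, 2], [1, 3]], [True, True], (3, 4), fill=False)
```

Entries that were never stored fall back to `fill`, which is why only the non-zero coordinates need to be kept.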
Source code in btorch/io/serialization.py
load_memories_from_xarray(path, dask=False, return_sparse_2d=False)
Load a nested dictionary from a Zarr store.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str \| Path | Path to the Zarr store. | required |
| dask | bool | If True, return Dask-backed arrays (lazy loading). If False, load into memory immediately. | False |
| return_sparse_2d | bool | If True, return 2D arrays as scipy sparse coo_array. | False |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | Nested dictionary with restored structure. |
Source code in btorch/io/serialization.py
memories_to_xarray(memories, dim_counts=None, dim_names=('time', 'batch', 'neuron'), neuron_ids=None, hint_field=None, partial_map=None, strict_dims=True, spike_suffix='spike', spike_dtype=bool, sparse_threshold=0.05, force_sparse=False)
Convert a nested dictionary of simulation results into an xr.Dataset.
This function flattens a nested dictionary (e.g., from a simulation run containing spike trains, voltages, and synaptic states) and converts it into an xarray Dataset with consistent dimension naming and optional sparse encoding for spike arrays.
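The flattening step can be sketched in plain Python as follows. The helper name `flatten` is hypothetical and the lists stand in for arrays or tensors; the sketch only shows how nested keys could become dot-separated variable names, not how btorch implements it.

```python
def flatten(nested, prefix=""):
    """Flatten a nested dict into a single-level dict with dot-separated keys."""
    flat = {}
    for key, value in nested.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into sub-dicts, carrying the accumulated key path.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


memories = {
    "neuron": {"v": [0.1, 0.2], "spike": [0, 1]},
    "synapse": {"psc": [0.0, 0.3]},
}
flat = flatten(memories)
# flat has the keys "neuron.v", "neuron.spike", and "synapse.psc"
```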
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| memories | dict[str, Any] | Nested dictionary of arrays/tensors. Keys become variable names (dot-separated for nested dicts). | required |
| dim_counts | Sequence[int] \| None | Number of dimensions per logical group (time, batch, neuron). If None, inferred from | None |
| dim_names | Sequence[str] | Logical group names for dimensions. | ('time', 'batch', 'neuron') |
| neuron_ids | Any \| None | Optional neuron identifiers for | None |
| hint_field | str \| None | Field name (flattened, dot-separated) to use as shape template for dimension inference. | None |
| partial_map | dict[str, Any] \| None | Dict of | None |
| strict_dims | bool | If True, enforce exact dimension structure match. If False, allow lower-rank arrays (e.g., parameters). | True |
| spike_suffix | str | Substring to identify spike arrays for sparse encoding. | 'spike' |
| spike_dtype | Any | Data type for spikes when converting dense to sparse. | bool |
| sparse_threshold | float | Sparsity ratio threshold for triggering sparse encoding (nnz / total_size < threshold). | 0.05 |
| force_sparse | bool \| Sequence[str] | If True, force sparse encoding for all spike arrays. Can also be a list of specific field names to force sparse. | False |
Returns:
| Type | Description |
|---|---|
| Dataset | xr.Dataset with all variables, coordinates, and sparse encodings. |
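How spike_suffix, sparse_threshold, and force_sparse might combine can be sketched as a small decision function. The function `should_encode_sparse` is hypothetical, and two points are assumptions rather than documented behavior: that names listed in force_sparse bypass the suffix check, and that arrays without the suffix are never sparse-encoded otherwise.

```python
def should_encode_sparse(name, values, spike_suffix="spike",
                         sparse_threshold=0.05, force_sparse=False):
    """Decide whether a flat list of values would be stored sparsely."""
    # Assumption: explicitly listed field names are always sparse-encoded.
    if not isinstance(force_sparse, bool) and name in force_sparse:
        return True
    # Assumption: only fields whose name contains the suffix are candidates.
    if spike_suffix not in name:
        return False
    if force_sparse is True:
        return True
    total = len(values)
    if total == 0:
        return False
    nnz = sum(1 for v in values if v)
    # Documented criterion: nnz / total_size < threshold.
    return nnz / total < sparse_threshold


quiet = [0] * 99 + [1]   # 1% non-zero
busy = [1, 0] * 50       # 50% non-zero
decisions = {
    "quiet spikes": should_encode_sparse("neuron.spike", quiet),
    "busy spikes": should_encode_sparse("neuron.spike", busy),
    "voltage": should_encode_sparse("neuron.v", quiet),
    "forced": should_encode_sparse("neuron.spike", busy, force_sparse=True),
}
```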
Example
    import torch
    from btorch.io import memories_to_xarray

    memories = {
        "spike": torch.randn(100, 32, 128) > 0,  # (T, B, N)
        "v": torch.randn(100, 32, 128),
    }
    ds = memories_to_xarray(memories, dim_counts=(1, 1, 1))
    ds  # Dataset with dims (time: 100, batch: 32, neuron: 128)
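To make the role of dim_counts concrete, the sketch below maps array axes to dimension names by group size. The helper `assign_dims` is hypothetical, and the `neuron_0`, `neuron_1` naming for groups spanning several axes is an assumption for illustration only; the names btorch actually assigns are not stated here.

```python
def assign_dims(shape, dim_counts=(1, 1, 1),
                dim_names=("time", "batch", "neuron")):
    """Map each axis of `shape` to a dimension name based on group sizes."""
    if sum(dim_counts) != len(shape):
        raise ValueError(
            f"dim_counts {dim_counts} does not match rank {len(shape)}"
        )
    dims = []
    for name, count in zip(dim_names, dim_counts):
        if count == 1:
            dims.append(name)
        else:
            # Hypothetical naming for groups that cover several axes.
            dims.extend(f"{name}_{i}" for i in range(count))
    return dict(zip(dims, shape))


simple = assign_dims((100, 32, 128))
grid = assign_dims((100, 32, 16, 8), dim_counts=(1, 1, 2))
```

With `dim_counts=(1, 1, 1)` this reproduces the (time, batch, neuron) layout from the example above; the second call treats the last two axes as a single neuron group.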
Source code in btorch/io/serialization.py
save_memories_to_xarray(data, path, dim_counts=None, dim_names=('time', 'batch', 'neuron'), neuron_ids=None, hint_field=None, partial_map=None, strict_dims=True, spike_suffix='spike', spike_dtype=bool, sparse_threshold=0.05, compression_level=5, chunks=None, overwrite=True)
Save a nested dictionary to a Zarr store via xarray.
Convenience wrapper that converts the dictionary to a Dataset and saves with compression and optional chunking.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | dict[str, Any] | Nested dictionary of arrays/tensors to save. | required |
| path | str \| Path | Path to the output Zarr store. | required |
| dim_counts | Sequence[int] \| None | Dimension counts per logical group (see | None |
| dim_names | Sequence[str] | Logical dimension names. | ('time', 'batch', 'neuron') |
| neuron_ids | Any \| None | Optional neuron identifiers. | None |
| hint_field | str \| None | Field to use for shape inference. | None |
| partial_map | dict[str, Any] \| None | Partial recording indices for subset fields. | None |
| strict_dims | bool | Enforce strict dimension matching. | True |
| spike_suffix | str | Substring identifying spike arrays. | 'spike' |
| spike_dtype | Any | Dtype for spike conversion. | bool |
| sparse_threshold | float | Sparsity threshold for sparse encoding. | 0.05 |
| compression_level | int | Zstd compression level (1-9, higher=smaller). | 5 |
| chunks | dict[str, int] \| None | Optional chunk sizes per dimension, e.g., | None |
| overwrite | bool | If True, overwrite existing store. If False, raise error if store exists. | True |
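One plausible way a per-dimension chunks dict translates into a per-variable chunk tuple is sketched below. The helper `resolve_chunks` is hypothetical, and both the clamping to the axis length and the example dicts are assumptions for illustration; how btorch applies chunks internally is not documented here.

```python
def resolve_chunks(dims, shape, chunks=None):
    """Return one chunk length per axis, clamped to the axis size."""
    chunks = chunks or {}
    # Dimensions missing from `chunks` fall back to a single full-size chunk.
    return tuple(
        min(chunks.get(dim, size), size) for dim, size in zip(dims, shape)
    )


dims = ("time", "batch", "neuron")
shape = (100, 32, 128)
by_time = resolve_chunks(dims, shape, {"time": 50})
oversized = resolve_chunks(dims, shape, {"time": 1000, "neuron": 64})
unchunked = resolve_chunks(dims, shape)
```

Chunking along time only, as in `by_time`, keeps each chunk a contiguous block of steps across all batches and neurons.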
Source code in btorch/io/serialization.py
to_sparse_repr(val, var_dims, var_name)
Convert a dense or sparse array to sparse COO representation for storage.
Supports arbitrary dtypes (float, int, bool). Only non-zero entries are stored. The returned dictionary contains index arrays per dimension and a data array, suitable for constructing an xr.Dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| val | ndarray \| spmatrix \| sparray | Array to encode. Can be dense numpy or scipy sparse. | required |
| var_dims | Sequence[str] | Physical dimension names for this variable (e.g., ["time", "batch", "neuron"]). | required |
| var_name | str | Base name for the variable (used to name output keys). | required |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | Dictionary mapping variable names to (dims, data) tuples or xr.DataArray coords. Keys include: |
Shape semantics
- Input array with shape (*var_dims) and nnz non-zeros
- Output index arrays: each has shape (nnz,)
- Output data array: shape (nnz,), dtype preserved from input
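The shape semantics above can be illustrated with a dependency-free COO encoder over nested lists. The name `dense_to_coo` and the choice to key index lists by dimension name are assumptions for illustration; the actual output keys of to_sparse_repr are not reproduced here.

```python
def dense_to_coo(dense, var_dims):
    """Encode a nested list as per-dimension index lists plus a value list."""
    indices = {dim: [] for dim in var_dims}
    data = []

    def walk(node, coord):
        if len(coord) == len(var_dims):
            if node:  # store only non-zero entries
                for dim, i in zip(var_dims, coord):
                    indices[dim].append(i)
                data.append(node)
            return
        for i, child in enumerate(node):
            walk(child, coord + (i,))

    walk(dense, ())
    return indices, data


raster = [
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 2],
]
indices, data = dense_to_coo(raster, ["time", "neuron"])
```

Each index list and the data list have length nnz (2 here), and the stored values keep their original type.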
Source code in btorch/io/serialization.py
xarray_to_memories(ds, return_sparse_2d=False)
Convert an xr.Dataset back to a nested dictionary.
Reconstructs the original nested dictionary structure from a Dataset
created by memories_to_xarray. Handles sparse-encoded variables
automatically.
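Restoring the nested structure from dot-separated variable names can be sketched as below. The helper `unflatten` is hypothetical and plain values stand in for arrays; it shows only the key-splitting idea, not btorch's implementation or its sparse handling.

```python
def unflatten(flat, sep="."):
    """Rebuild a nested dict from a flat dict with separator-joined keys."""
    nested = {}
    for name, value in flat.items():
        *parents, leaf = name.split(sep)
        node = nested
        for key in parents:
            # Create intermediate dicts on demand.
            node = node.setdefault(key, {})
        node[leaf] = value
    return nested


flat = {"neuron.v": [0.1, 0.2], "neuron.spike": [0, 1], "step": 3}
restored = unflatten(flat)
```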
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ds | Dataset | Dataset to convert (typically loaded from Zarr). | required |
| return_sparse_2d | bool | If True, return 2D arrays as scipy sparse coo_array instead of dense numpy. | False |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | Nested dictionary with restored variable names and structure. |
Example
    import xarray as xr
    from btorch.io import xarray_to_memories

    ds = xr.open_zarr("simulation.zarr")
    memories = xarray_to_memories(ds)
    memories["spike"].shape  # (T, B, N) or scipy sparse