Welcome to torchiva’s documentation!#
A package for blind source separation (BSS) and beamforming in PyTorch.
supports many BSS and beamforming methods
supports memory efficient gradient computation for training neural source models
supports batched computations
can run on GPU via PyTorch
Quick Start#
The package can be installed via pip:
pip install torchiva
Separation using Pre-trained Model#
We provide a pre-trained model in trained_models/tiss. You can easily try separation with the pre-trained model:
# Separation
python -m torchiva.separate INPUT OUTPUT
where INPUT is either a multichannel wav file or a folder containing multichannel wav files. If INPUT is a folder, all the files inside it are separated. The output is saved to OUTPUT.
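Since INPUT must point to multichannel wav files, it can help to check a file or folder before running the separation. The following is a minimal sketch using only the Python standard library. The helper name `list_multichannel_wavs` is hypothetical and not part of torchiva, and the stdlib `wave` module only reads PCM-encoded wav files, so other encodings would need a different reader.

```python
import wave
from pathlib import Path


def list_multichannel_wavs(path):
    """Return the wav files at `path` that have more than one channel.

    `path` may be a single wav file or a folder of wav files, mirroring
    the two forms of INPUT described above. This helper is illustrative
    only and is not part of torchiva.
    """
    path = Path(path)
    # A folder is scanned for *.wav files; a single file is checked directly.
    candidates = sorted(path.glob("*.wav")) if path.is_dir() else [path]
    multichannel = []
    for wav_path in candidates:
        with wave.open(str(wav_path), "rb") as f:
            if f.getnchannels() > 1:
                multichannel.append(wav_path)
    return multichannel
```

Calling `list_multichannel_wavs("my_recordings")` on a folder (the folder name is a placeholder) returns the sorted list of files with two or more channels.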
The model stored in trained_models/tiss is automatically downloaded to $HOME/.torchiva_models. The path or URL to the model can also be provided manually via the --model option.
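When the separation is launched from a Python script, the command line can be assembled programmatically. The sketch below only uses the arguments described above (INPUT, OUTPUT, and --model). The helper name `build_separate_command` is hypothetical, placing --model after the positional arguments is an assumption, and `DEFAULT_MODEL_DIR` simply mirrors the $HOME/.torchiva_models location mentioned in the text.

```python
import sys
from pathlib import Path

# Download location stated in the text above; Path.home() is assumed
# to correspond to $HOME.
DEFAULT_MODEL_DIR = Path.home() / ".torchiva_models"


def build_separate_command(input_path, output_path, model=None):
    """Build the argument list for `python -m torchiva.separate`.

    Illustrative helper, not part of torchiva.
    """
    cmd = [
        sys.executable,
        "-m",
        "torchiva.separate",
        str(input_path),
        str(output_path),
    ]
    if model is not None:
        # --model takes a path or a URL to the model.
        cmd += ["--model", str(model)]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` in an environment where torchiva is installed.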
The model was trained on the WSJ1-mix dataset with the same configuration as ./examples/configs/tiss.json.
Training#
We provide some simple training scripts. Training is supported for T-ISS, MWF, MVDR, and GEV:
cd examples
# install some modules necessary for training
pip install -r requirements.txt
# training
python train.py PATH_TO_CONFIG PATH_TO_DATASET
Note that our example scripts assume the WSJ1-mix dataset. If you want to use another dataset, please change the part of the script that loads the audio.
Test your trained model with checkpoint from epoch 128:
python ./test.py --dataset ../wsj1_6ch --n_fft 2048 --hop 512 --n_iter 40 --iss-hparams checkpoints/tiss_delay1tap5_2ch/lightning_logs/version_0/hparams.yaml --epoch 128 --test
Export the trained model for later use:
python ./export_model.py ../trained_models/tiss checkpoints/tiss_delay1tap5_2ch/lightning_logs/version_0 128 146 148 138 122 116 112 108 104 97
Run the example script using the exported model:
python ./example_dnn.py ../wsj1_6ch ../trained_models/tiss -m 2 -r 100
License#
2022 (c) Robin Scheibler, Kohei Saijo, LINE Corporation.
All of this code is released under the MIT License.