In this post, I will summarize the deliverables and the installation procedure.
Deliverables
The RHA (Red Hen Anonymizer) anonymizes faces in videos. You can either hide a face or replace it with another person’s face.
Dependencies
RHA requires Python 3. No specific version is preferred; any recent version should work.
The audio anonymizer requires sox:
pip install sox # do not use conda install
The face hider requires the opencv, ffmpeg-python and MTCNN packages, which can be installed with these commands:
conda install -c conda-forge opencv
conda install -c conda-forge mtcnn
pip install ffmpeg-python
The face swapper needs a few more installations:
conda install pytorch torchvision cudatoolkit=11.1 -c pytorch -c nvidia # any recent version should work
conda install -c conda-forge yacs
conda install -c conda-forge tqdm
conda install -c conda-forge matplotlib
pip install --upgrade tensorflow
pip install tensorboardX
pip install ffmpeg-python
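After running the installs above, a quick check can confirm that the packages are importable. This is a minimal sketch, not part of RHA. The import names are assumptions where they differ from the package names (opencv is imported as cv2, ffmpeg-python as ffmpeg, pytorch as torch), and the grouping simply mirrors the lists in this post.

```python
import importlib.util

# Import names are assumptions: some differ from the package names used above
# (opencv -> cv2, ffmpeg-python -> ffmpeg, pytorch -> torch).
RHA_MODULES = {
    "audio anonymizer": ["sox"],
    "face hider": ["cv2", "mtcnn", "ffmpeg"],
    "face swapper": ["torch", "torchvision", "yacs", "tqdm", "matplotlib",
                     "tensorflow", "tensorboardX", "ffmpeg"],
}


def find_missing(modules):
    """Return the subset of `modules` that cannot be located for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for component, modules in RHA_MODULES.items():
        missing = find_missing(modules)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{component}: {status}")
```

The check only locates each module without importing it, so it stays fast even for heavy packages such as tensorflow.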
Clone the repo
git clone https://github.com/yashkhasbage25/AnonymizingAudioVisualData.git --depth 1
# remove --depth 1 if you want to check how the code was gradually developed
If your machine has GPUs, do use them; otherwise the behaviour is unexpected. Correcting this would unnecessarily increase the number of manual changes needed.
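To see whether a GPU is actually visible before running anything, a check along these lines may help. It is only a sketch and covers the PyTorch side (used by the swapper); the function name cuda_device_count is hypothetical and not part of RHA.

```python
def cuda_device_count():
    """Return the number of CUDA devices PyTorch can see.

    Returns None if PyTorch is not installed, and 0 if it is installed
    but no usable CUDA device is available.
    """
    try:
        import torch  # imported lazily so the check also runs without PyTorch
    except ImportError:
        return None
    return torch.cuda.device_count() if torch.cuda.is_available() else 0


if __name__ == "__main__":
    count = cuda_device_count()
    if count is None:
        print("PyTorch is not installed")
    else:
        print(f"CUDA devices visible to PyTorch: {count}")
```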
Not everyone is allowed to run RHA
The FSGAN present in RHA can be misused for defamation or for creating DeepFakes. Hence, its usage is not open to the general public. For the same reason, the pretrained weights for FSGAN are not provided in this public repository.
If you want to get access to the pretrained weights of FSGAN directly from the FSGAN team, see the page https://github.com/YuvalNirkin/fsgan/wiki/Paper-Models-Inference and fill out their form. Once they know your purpose for using FSGAN, they will share a script, download_fsgan_models.py, with you. Place it at AnonymizingAudioVisualData/fsgan/download_fsgan_models.py and change its line
from fsgan.utils.utils import download_from_url
to
from utils.utils import download_from_url
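The one-line change above can also be applied with a short script. This is a sketch under the assumption that the shared script contains the import exactly as quoted in this post; patch_import is a hypothetical helper, not part of RHA or FSGAN.

```python
from pathlib import Path

OLD_IMPORT = "from fsgan.utils.utils import download_from_url"
NEW_IMPORT = "from utils.utils import download_from_url"


def patch_import(script_path):
    """Rewrite the import line in place; return True if the file was changed."""
    path = Path(script_path)
    text = path.read_text()
    if OLD_IMPORT not in text:
        # Already patched, or the script differs from what this post describes.
        return False
    path.write_text(text.replace(OLD_IMPORT, NEW_IMPORT))
    return True


if __name__ == "__main__":
    # Path from the instructions above, relative to the directory holding the clone.
    target = Path("AnonymizingAudioVisualData/fsgan/download_fsgan_models.py")
    if target.is_file():
        print("patched" if patch_import(target) else "nothing to change")
    else:
        print(f"{target} not found; place the script there first")
```

Running it a second time leaves the file untouched, so it is safe to repeat.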
The file download_fsgan_models.py can also be requested from the Red Hen mentors (specifically, Mark Turner, Francis Steen, Peter Uhrig, Karan Singla, Daniel Alcaraz). The same instructions as above then apply.
Then, run the script
python download_fsgan_models.py -m v2
This will download the pretrained models to the correct locations.
Downloading Face Detector Weights
Install gdown using
pip install gdown
and download the weights
gdown "https://drive.google.com/u/0/uc?id=1WeXlNYsM6dMP3xQQELI-4gxhwKUQxc3-" # quoted so the shell does not treat ? as a wildcard
Place the weights at face_detection_dsfd/weights/WIDERFace_DSFD_RES152.pth
Unlike the FSGAN weights, these weights are publicly available.
** RHA face-swapper cannot be used at all without this step **
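Since the swapper depends on this file, it can be worth verifying that the weights ended up in the right place. A minimal sketch, using the path given in this post; weights_ready is a hypothetical helper, not part of RHA.

```python
from pathlib import Path

# Location given in this post, relative to the repository root.
DSFD_WEIGHTS = Path("face_detection_dsfd/weights/WIDERFace_DSFD_RES152.pth")


def weights_ready(repo_root="."):
    """Return True if the DSFD weights file exists and is not empty."""
    path = Path(repo_root) / DSFD_WEIGHTS
    return path.is_file() and path.stat().st_size > 0


if __name__ == "__main__":
    if weights_ready():
        print("DSFD weights found")
    else:
        print(f"DSFD weights missing; expected at {DSFD_WEIGHTS}")
```

Run it from inside the cloned repo, or pass the repo path as repo_root.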
Running RHA
For CWRU HPC Users
There are some kernel incompatibilities with K40 GPUs. Hence, the constraint -C 'gpu2v100|gpu4v100' has to be added to select the GPU type.
srun -p gpu --gpus 1 --mem 8000 -C 'gpu2v100|gpu4v100' --pty bash
Enter the cloned repo
cd AnonymizingAudioVisualData
There you will find rha.py, a single script that runs both the hider and the swapper.
Swapper:
python rha.py --inpath <input_video_path> --facepath <path_to_face_image> --outpath <path_for_output_video> --pitch <pitch_change_value>
For the hider, omit the --facepath option from the above command.
facepath is the path to the imaginary or target face that should appear in the anonymized video. The face should be clearly visible. A 256x256 photo is usually recommended, but you can try other sizes too. Non-square (rectangular) photos are not allowed. The input video, however, can have any frame size.
The --pitch option is used for anonymizing the audio. The value provided changes the pitch by that amount. It has to be an integer, positive or negative. Values near zero hardly make any difference, and zero leaves the sound unchanged. Hence, if you do not want to change the sound, use --pitch 0.
Female voices typically have a higher pitch and male voices a lower pitch. Hence, use a positive value such as 3, 4 or 5 to make the voice more female-like, and a negative value such as -3, -4 or -5 to make it more male-like.
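To get a feel for what such values mean, the sketch below converts a shift into a frequency multiplier. It assumes the pitch value is measured in semitones, which this post does not state, so treat the numbers as illustrative only.

```python
def semitone_ratio(n_semitones):
    """Frequency multiplier for a shift of n equal-tempered semitones.

    Assumption: the --pitch value is in semitones (not confirmed by this post).
    """
    return 2.0 ** (n_semitones / 12.0)


if __name__ == "__main__":
    for n in (-4, 0, 4):
        print(f"--pitch {n:+d} -> frequencies scaled by {semitone_ratio(n):.3f}")
```

Under that assumption, a shift of 12 doubles every frequency (one octave up), while small values near zero change the voice only slightly.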
To run on the CPU, use the --cpu_only flag. This only works for the swapper. For the hider, whether the GPU or the CPU is used depends on whether the GPU or CPU version of tensorflow is installed.
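The options described above can also be assembled from Python, for example when processing many videos. This is a minimal sketch that assumes the flag names used in the demo commands of this post (--inpath, --outpath, --facepath, --pitch, --cpu_only); build_rha_command is a hypothetical helper, not part of RHA.

```python
import sys


def build_rha_command(inpath, outpath, pitch=0, facepath=None, cpu_only=False):
    """Build the argument list for rha.py; omit facepath to run the hider."""
    if not isinstance(pitch, int) or isinstance(pitch, bool):
        raise ValueError("pitch must be an integer")
    cmd = [sys.executable, "rha.py", "--inpath", inpath, "--outpath", outpath,
           "--pitch", str(pitch)]
    if facepath is not None:
        cmd += ["--facepath", facepath]  # giving a face selects the swapper
    if cpu_only:
        cmd.append("--cpu_only")  # per this post, only honoured by the swapper
    return cmd


# Example use from inside the cloned repo (not executed here):
#   import subprocess
#   subprocess.run(build_rha_command("in.mp4", "out.mp4", pitch=-4), check=True)
```

Returning a list rather than a single string avoids shell quoting problems with paths that contain spaces.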
Additionally, we recommend using our facebank to get random target faces: https://drive.google.com/drive/folders/1EGiVI3fMLwNiYG-Es-Sy5qqie9Co2eZI?usp=sharing
There are some more installations mentioned in https://github.com/YuvalNirkin/fsgan/wiki/Ubuntu-Installation-Guide . However, these are present in almost every modern Linux distribution, so I don’t think anybody will need to run the apt-get commands mentioned on that page.
If your machine has 4 GPUs and you want to use only GPUs 0 and 3 (indexing starts at 0), run
CUDA_VISIBLE_DEVICES=0,3 python rha.py <remaining_options>
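When rha.py is launched from another Python script, the same restriction can be passed through the child process environment. A small sketch; env_with_gpus is a hypothetical helper, not part of RHA.

```python
import os


def env_with_gpus(gpu_ids, base_env=None):
    """Return a copy of the environment with CUDA_VISIBLE_DEVICES set to gpu_ids."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


# Example use (not executed here):
#   import subprocess, sys
#   subprocess.run([sys.executable, "rha.py", ...], env=env_with_gpus([0, 3]), check=True)
```

The variable is set on a copy, so the current process and any other children keep seeing all GPUs.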
Demo
Download the video covid.mp4 and the face image random_face.jpg from https://drive.google.com/drive/folders/1y3kytBHZULQ2gLDL0xKBmq2oR4uPx0D4?usp=sharing . Place them alongside rha.py and run
Face Hider:
python rha.py --inpath covid.mp4 --outpath hider_output.mp4 --pitch -4
Face Swapper:
python rha.py --inpath covid.mp4 --facepath random_face.jpg --outpath swapper_output.mp4 --pitch -4
You can set the pitch as you like, but since the target face is male, we prefer a low-pitched voice. We encourage you to try out several target faces by downloading the facebank (https://drive.google.com/drive/folders/1EGiVI3fMLwNiYG-Es-Sy5qqie9Co2eZI?usp=sharing).