
AliaksandrSiarohin / first-order-model

This repository contains the source code for the paper First Order Motion Model for Image Animation.

15,004 stars
3,278 forks
317 issues
Jupyter Notebook · Python · Dockerfile

AI Architecture Analysis

This repository is indexed by RepoMind. By analyzing AliaksandrSiarohin/first-order-model in our AI interface, you can instantly generate complete architecture diagrams, visualize control flows, and perform automated security audits across the entire codebase.

Our Agentic Context Augmented Generation (Agentic CAG) engine loads full source files into context on-demand, avoiding the fragmentation of traditional RAG systems. Ask questions about the architecture, dependencies, or specific features to see it in action.

To optimize performance, source files are loaded only when you start an analysis.

Embed this Badge

Showcase RepoMind's analysis directly in your repository's README.

[![Analyzed by RepoMind](https://img.shields.io/badge/Analyzed%20by-RepoMind-4F46E5?style=for-the-badge)](https://repomind.in/repo/AliaksandrSiarohin/first-order-model)
Preview: Analyzed by RepoMind

Repository Overview (README excerpt)


!!! Check out our new paper and framework, improved for articulated objects.

First Order Motion Model for Image Animation

This repository contains the source code for the paper First Order Motion Model for Image Animation by Aliaksandr Siarohin, Stéphane Lathuilière, Sergey Tulyakov, Elisa Ricci and Nicu Sebe. Hugging Face Spaces

Example animations

The videos on the left show the driving videos. The first row on the right for each dataset shows the source videos. The bottom row contains the animated sequences, with motion transferred from the driving video and the object taken from the source image. We trained a separate network for each task: VoxCeleb Dataset, Fashion Dataset, MGIF Dataset.

Installation

We support . To install the dependencies run:

YAML configs

There are several configuration ( ) files, one for each . See to get a description of each parameter.

Pre-trained checkpoint

Checkpoints can be found under the following link: google-drive or yandex-disk.

Animation Demo

To run a demo, download a checkpoint and run the following command: The result will be stored in . The driving videos and source images should be cropped before they can be used in our method. To obtain semi-automatic crop suggestions you can use . It will generate ffmpeg commands for the crops. In order to use the script, the face-alignment library is needed:

Animation demo with Docker

If you are having trouble getting the demo to work because of library compatibility issues, and you're running Linux, you might try running it inside a Docker container, which gives you better control over the execution environment. Requirements: Docker 19.03+ and nvidia-docker installed and able to successfully run the usage tests. We'll first build the container. Now that the container is available locally, we can use it to run the demo.

Colab Demo

@graphemecluster prepared a GUI demo for Google Colab. It also works in Kaggle. For the source code, see . For the old demo, see .
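The crop-suggestion script mentioned above emits ffmpeg commands for cropping the driving videos and source images. As a rough sketch of the idea only — the helper name, box format, and output path below are hypothetical, not the repository's actual script:

```python
def ffmpeg_crop_command(video_path, x, y, size, out_path="crop.mp4"):
    """Build an ffmpeg command that crops a square region (e.g. around a
    detected face) and rescales it to 256x256.

    Hypothetical helper for illustration; the repository's crop script
    generates commands of this general shape using face-alignment boxes.
    """
    return (
        f"ffmpeg -i {video_path} "
        f'-filter:v "crop={size}:{size}:{x}:{y},scale=256:256" '
        f"{out_path}"
    )

# A 380x380 box whose top-left corner is at (120, 60):
print(ffmpeg_crop_command("driving.mp4", 120, 60, 380))
```

Running the printed command requires ffmpeg itself; the snippet only assembles the command string.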
Face-swap

It is possible to modify the method to perform face-swap using supervised segmentation masks. For both unsupervised and supervised video editing, such as face-swap, please refer to Motion Co-Segmentation.

Training

To train a model on a specific dataset run: The code will create a folder in the log directory (each run creates a new time-stamped directory). Checkpoints will be saved to this folder. To check the loss values during training see . You can also check the training-data reconstructions in the subfolder. By default the batch size is tuned to run on 2 or 4 Titan-X GPUs (apart from speed, it does not make much difference). You can change the batch size in the train_params of the corresponding file.

Evaluation on video reconstruction

To evaluate reconstruction performance run: You will need to specify the path to the checkpoint; the subfolder will be created in the checkpoint folder. The generated video will be stored in this folder; the generated videos will also be stored in the subfolder in lossless '.png' format for evaluation. Instructions for computing the metrics from the paper can be found at https://github.com/AliaksandrSiarohin/pose-evaluation.

Image animation

In order to animate videos run: You will need to specify the path to the checkpoint; the subfolder will be created in the same folder as the checkpoint. You can find the generated video there, and its lossless version in the subfolder. By default, videos from the test set will be randomly paired, but you can specify the "source,driving" pairs in the corresponding files. The path to this file should be specified in the pairs_list setting of the corresponding file.

There are 2 different ways of performing animation: using **absolute** keypoint locations or using **relative** keypoint locations.

1) Animation using absolute coordinates: the animation is performed using the absolute keypoint positions of the driving video and the appearance of the source image. In this way there are no specific requirements for the driving video and source appearance. However, this usually leads to poor performance, since irrelevant details such as shape are transferred. Check the animate parameters in to enable this mode.

2) Animation using relative coordinates: from the driving video we first estimate the relative movement of each keypoint, then add this movement to the absolute positions of the keypoints in the source image. These keypoints, together with the source image, are used for animation. This usually leads to better performance, but it requires that the object in the first frame of the driving video and in the source image have the same pose.

Datasets

1) **Bair**. This dataset can be directly downloaded.
2) **Mgif**. This dataset can be directly downloaded.
3) **Fashion**. Follow the instructions on dataset downloading from .
4) **Taichi**. Follow the instructions in data/taichi-loading or the instructions from https://github.com/AliaksandrSiarohin/video-preprocessing.
5) **Nemo**. Please follow the instructions on how to download the dataset. The dataset should then be preprocessed using scripts from https://github.com/AliaksandrSiarohin/video-preprocessing.
6) **VoxCeleb**. Please follow the instructions from https://github.com/AliaksandrSiarohin/video-preprocessing.

Training on your own dataset

1) Resize all the videos to the same size, e.g. 256x256. The videos can be in '.gif' or '.mp4' format, or a folder with images. We recommend the latter: for each video, make a separate folder with all the frames in '.png' format. This format is lossless and has better i/o performance.
2) Create a folder with 2 subfolders, and ; put the training videos in the and the testing videos in the .
3) Create a config ; in dataset_params specify the root dir, the . Also adjust the number of epochs in train_params.

Additional notes

Citation:
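Returning to the two animation modes described above, the core difference between absolute and relative coordinates can be sketched numerically. This is a simplified illustration with hypothetical names, not the repository's implementation (which also handles local affine transformations and scale adaptation):

```python
import numpy as np

def transfer_keypoints(kp_source, kp_driving, kp_driving_initial, relative=True):
    """Map driving-frame keypoints onto the source image.

    Absolute mode: use the driving keypoints as-is, so the driving
    object's shape leaks into the result.
    Relative mode: add only the driving motion (current frame minus the
    first frame) to the source keypoints, so shape comes from the source.

    Simplified sketch of the idea described in the README, not the
    repository's actual code.
    """
    if not relative:
        return kp_driving
    return kp_source + (kp_driving - kp_driving_initial)

kp_src = np.array([[0.2, 0.3]])     # keypoint in the source image
kp_drv0 = np.array([[0.5, 0.5]])    # same keypoint, driving video frame 0
kp_drv_t = np.array([[0.6, 0.45]])  # same keypoint, driving video frame t

print(transfer_keypoints(kp_src, kp_drv_t, kp_drv0, relative=True))
print(transfer_keypoints(kp_src, kp_drv_t, kp_drv0, relative=False))
```

The relative result keeps the source keypoint's position and applies only the displacement, which is why the modes behave so differently when source and driving objects have different shapes.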