Compare commits

main...fix-disco-xform-utils-import-error

No commits in common. 'main' and 'fix-disco-xform-utils-import-error' have entirely different histories.

  1. .gitignore (146 changed lines)
  2. Disco_Diffusion.ipynb (1577 changed lines)
  3. LICENSE (18 changed lines)
  4. README.md (57 changed lines)
  5. disco.py (1711 changed lines)
  6. disco_utils.py (24 changed lines)
  7. disco_xform_utils.py (31 changed lines)
  8. docker/README.md (47 changed lines)
  9. docker/main/Dockerfile (40 changed lines)
  10. docker/prep/Dockerfile (25 changed lines)

.gitignore (vendored, 146 changed lines)

@@ -1,146 +0,0 @@
# Disco-specific ignores
init_images/*
images_out/*
MiDaS/
models/
pretrained/*
settings.json
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/

Disco_Diffusion.ipynb (1577 changed lines)

File diff suppressed because it is too large.

LICENSE (18 changed lines)

@@ -50,7 +50,6 @@ Licensed under the MIT License
Copyright (c) 2021 Maxwell Ingham
Copyright (c) 2022 Adam Letts
Copyright (c) 2022 Alex Spirin
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
@@ -69,20 +68,3 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
--
flow-related - https://github.com/NVIDIA/flownet2-pytorch/blob/master/LICENSE
--
Copyright 2017 NVIDIA CORPORATION
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

README.md (57 changed lines)

@@ -6,16 +6,7 @@ A frankensteinian amalgamation of notebooks, models and techniques for the gener
[to be updated with further info soon]
## Contributing
This project uses a conversion tool to turn the Python files into notebooks for easier development, which means you do not have to touch the notebook directly to make changes to it.
The tool being used is called [Colab-Convert](https://github.com/MSFTserver/colab-convert):
- install using `pip install colab-convert`
- convert .py to .ipynb `colab-convert /path/to/file.py /path/to/file.ipynb`
- convert .ipynb to .py `colab-convert /path/to/file.ipynb /path/to/file.py`
## Changelog
@@ -35,12 +26,12 @@ the tool being used is called [Colab-Convert](https://github.com/MSFTserver/cola
* Fixed issue with NaNs resulting in black images, with massive help and testing from @Softology
* Perlin now changes properly within batches (not sure where this perlin_regen code came from originally, but thank you)
#### v4 Update: Jan 2022 - Somnai
#### v4 Update: Jan 2021 - Somnai
* Implemented Diffusion Zooming
* Added Chigozie keyframing
* Made a bunch of edits to processes
#### v4.1 Update: Jan 14th 2022 - Somnai
#### v4.1 Update: Jan 14th 2021 - Somnai
* Added video input mode
* Added license that somehow went missing
* Added improved prompt keyframing, fixed image_prompts and multiple prompts
@@ -73,37 +64,13 @@ the tool being used is called [Colab-Convert](https://github.com/MSFTserver/cola
* Remove Slip Models
* Update for crossplatform support
#### v5.2 Update: Apr 10th 2022 - nin_artificial / Tom Mason
* VR Mode
#### v5.3 Update: Jun 10th 2022 - nshepperd, huemin, cut_pow
* Horizontal and Vertical symmetry
* Addition of ViT-L/14@336px model (requires high VRAM)
#### v5.4 Update: Jun 14th 2022 - devdef / Alex Spirin, integrated into DD main by gandamu / Adam Letts
* Warp mode - for smooth/continuous video input results leveraging optical flow estimation and frame blending
* Custom models support
#### v5.5 Update: Jul 11th 2022 - Palmweaver / Chris Scalf, KaliYuga_ai, further integration by gandamu / Adam Letts
* OpenCLIP models integration
* Pixel Art Diffusion, Watercolor Diffusion, and Pulp SciFi Diffusion models
* cut_ic_pow scheduling
#### v5.6 Update: Jul 13th 2022 - Felipe3DArtist, integration by gandamu / Adam Letts
* Integrated portrait_generator_v001 - 512x512 diffusion model trained on faces - from Felipe3DArtist
## Notebook Provenance
Original notebook by Katherine Crowson (https://github.com/crowsonkb, https://twitter.com/RiversHaveWings). It uses either OpenAI's 256x256 unconditional ImageNet or Katherine Crowson's fine-tuned 512x512 diffusion model (https://github.com/openai/guided-diffusion), together with CLIP (https://github.com/openai/CLIP) to connect text prompts with images.
Modified by Daniel Russell (https://github.com/russelldc, https://twitter.com/danielrussruss) to include (hopefully) optimal params for quick generations in 15-100 timesteps rather than 1000, as well as more robust augmentations.
Further improvements from Dango233 and nshepperd helped improve the quality of diffusion in general, and especially so for shorter runs like this notebook aims to achieve.
Further improvements from Dango233 and nsheppard helped improve the quality of diffusion in general, and especially so for shorter runs like this notebook aims to achieve.
Vark added code to load in multiple Clip models at once, which all prompts are evaluated against, which may greatly improve accuracy.
@@ -116,19 +83,3 @@ Advanced DangoCutn Cutout method is also from Dango223.
Somnai (https://twitter.com/Somnai_dreams) added 2D Diffusion animation techniques, QoL improvements and various implementations of tech and techniques, mostly listed in the changelog below.
3D animation implementation added by Adam Letts (https://twitter.com/gandamu_ml) in collaboration with Somnai.
Turbo feature by Chris Allen (https://twitter.com/zippy731)
Improvements to ability to run on local systems, Windows support, and dependency installation by HostsServer (https://twitter.com/HostsServer)
VR Mode by Tom Mason (https://twitter.com/nin_artificial)
Horizontal and Vertical symmetry functionality by nshepperd. Symmetry transformation_steps by huemin (https://twitter.com/huemin_art). Symmetry integration into Disco Diffusion by Dmitrii Tochilkin (https://twitter.com/cut_pow).
Warp and custom model support by Alex Spirin (https://twitter.com/devdef).
Pixel Art Diffusion, Watercolor Diffusion, and Pulp SciFi Diffusion models from KaliYuga (https://twitter.com/KaliYuga_ai). Follow KaliYuga's Twitter for the latest models and for notebooks with specialized settings.
Integration of OpenCLIP models and initiation of integration of KaliYuga models by Palmweaver / Chris Scalf (https://twitter.com/ChrisScalf11)
Integrated portrait_generator_v001 from Felipe3DArtist (https://twitter.com/Felipe3DArtist)

disco.py (1711 changed lines)

File diff suppressed because it is too large.

disco_utils.py (24 changed lines)

@@ -1,24 +0,0 @@
import subprocess
from importlib import util as importlibutil

def module_exists(module_name):
    return importlibutil.find_spec(module_name)

def gitclone(url, targetdir=None):
    if targetdir:
        res = subprocess.run(['git', 'clone', url, targetdir], stdout=subprocess.PIPE).stdout.decode('utf-8')
    else:
        res = subprocess.run(['git', 'clone', url], stdout=subprocess.PIPE).stdout.decode('utf-8')
    print(res)

def pipi(modulestr):
    res = subprocess.run(['pip', 'install', modulestr], stdout=subprocess.PIPE).stdout.decode('utf-8')
    print(res)

def pipie(modulestr):
    res = subprocess.run(['git', 'install', '-e', modulestr], stdout=subprocess.PIPE).stdout.decode('utf-8')
    print(res)

def wget(url, outputdir):
    res = subprocess.run(['wget', url, '-P', f'{outputdir}'], stdout=subprocess.PIPE).stdout.decode('utf-8')
    print(res)
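
For context, a minimal sketch of how these helpers would typically be called from a setup script. The repository URL and package name below are illustrative assumptions, not taken from disco.py. Note that `pipie` as written shells out to `git install -e`, which is not a valid git subcommand; `pip install -e` appears to be the intent, so the sketch avoids it.

```python
# Hypothetical usage of the helpers above (illustrative only).
from disco_utils import module_exists, gitclone, pipi, wget

# Clone a dependency only if it is not already importable.
if not module_exists('clip'):
    gitclone('https://github.com/openai/CLIP')  # clones into ./CLIP

# Install a pip package by name.
pipi('ftfy')

# Download a model checkpoint into a local directory.
wget('https://download.pytorch.org/models/vgg16-397923af.pth', 'models')
```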

disco_xform_utils.py (31 changed lines)

@@ -12,10 +12,9 @@ except:
    sys.exit()
MAX_ADABINS_AREA = 500000
MIN_ADABINS_AREA = 448*448
@torch.no_grad()
def transform_image_3d(img_filepath, midas_model, midas_transform, device, rot_mat=torch.eye(3).unsqueeze(0), translate=(0.,0.,-0.04), near=2000, far=20000, fov_deg=60, padding_mode='border', sampling_mode='bicubic', midas_weight = 0.3,spherical=False):
def transform_image_3d(img_filepath, midas_model, midas_transform, device, rot_mat=torch.eye(3).unsqueeze(0), translate=(0.,0.,-0.04), near=2000, far=20000, fov_deg=60, padding_mode='border', sampling_mode='bicubic', midas_weight = 0.3):
    img_pil = Image.open(open(img_filepath, 'rb')).convert('RGB')
    w, h = img_pil.size
    image_tensor = torchvision.transforms.functional.to_tensor(img_pil).to(device)
@@ -28,23 +27,17 @@ def transform_image_3d(img_filepath, midas_model, midas_transform, device, rot_m
    predictions using nyu dataset
    """
    print("Running AdaBins depth estimation implementation...")
    infer_helper = InferenceHelper(dataset='nyu', device=device)
    infer_helper = InferenceHelper(dataset='nyu')
    image_pil_area = w*h
    if image_pil_area > MAX_ADABINS_AREA:
        scale = math.sqrt(MAX_ADABINS_AREA) / math.sqrt(image_pil_area)
        depth_input = img_pil.resize((int(w*scale), int(h*scale)), Image.LANCZOS) # LANCZOS is supposed to be good for downsampling.
    elif image_pil_area < MIN_ADABINS_AREA:
        scale = math.sqrt(MIN_ADABINS_AREA) / math.sqrt(image_pil_area)
        depth_input = img_pil.resize((int(w*scale), int(h*scale)), Image.BICUBIC)
    else:
        depth_input = img_pil
    try:
        _, adabins_depth = infer_helper.predict_pil(depth_input)
        if image_pil_area != MAX_ADABINS_AREA:
            adabins_depth = torchvision.transforms.functional.resize(torch.from_numpy(adabins_depth), image_tensor.shape[-2:], interpolation=torchvision.transforms.functional.InterpolationMode.BICUBIC).squeeze().to(device)
        else:
            adabins_depth = torch.from_numpy(adabins_depth).squeeze().to(device)
        adabins_depth = torchvision.transforms.functional.resize(torch.from_numpy(adabins_depth), image_tensor.shape[-2:], interpolation=torchvision.transforms.functional.InterpolationMode.BICUBIC).squeeze().to(device)
        adabins_depth_np = adabins_depth.cpu().numpy()
    except:
        pass
@@ -107,25 +100,9 @@ def transform_image_3d(img_filepath, midas_model, midas_transform, device, rot_m
    # coords_2d will have shape (N,H,W,2).. which is also what grid_sample needs.
    coords_2d = torch.nn.functional.affine_grid(identity_2d_batch, [1,1,h,w], align_corners=False)
    offset_coords_2d = coords_2d - torch.reshape(offset_xy, (h,w,2)).unsqueeze(0)
    if spherical:
        spherical_grid = get_spherical_projection(h, w, torch.tensor([0,0], device=device), -0.4,device=device)#align_corners=False
        stage_image = torch.nn.functional.grid_sample(image_tensor.add(1/512 - 0.0001).unsqueeze(0), offset_coords_2d, mode=sampling_mode, padding_mode=padding_mode, align_corners=True)
        new_image = torch.nn.functional.grid_sample(stage_image, spherical_grid,align_corners=True) #, mode=sampling_mode, padding_mode=padding_mode, align_corners=False)
    else:
        new_image = torch.nn.functional.grid_sample(image_tensor.add(1/512 - 0.0001).unsqueeze(0), offset_coords_2d, mode=sampling_mode, padding_mode=padding_mode, align_corners=False)
    new_image = torch.nn.functional.grid_sample(image_tensor.add(1/512 - 0.0001).unsqueeze(0), offset_coords_2d, mode=sampling_mode, padding_mode=padding_mode, align_corners=False)
    img_pil = torchvision.transforms.ToPILImage()(new_image.squeeze().clamp(0,1.))
    torch.cuda.empty_cache()
    return img_pil

def get_spherical_projection(H, W, center, magnitude,device):
    xx, yy = torch.linspace(-1, 1, W,dtype=torch.float32,device=device), torch.linspace(-1, 1, H,dtype=torch.float32,device=device)
    gridy, gridx = torch.meshgrid(yy, xx)
    grid = torch.stack([gridx, gridy], dim=-1)
    d = center - grid
    d_sum = torch.sqrt((d**2).sum(axis=-1))
    grid += d * d_sum.unsqueeze(-1) * magnitude
    return grid.unsqueeze(0)
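
As a worked illustration of the AdaBins sizing rule in the hunks above: before depth estimation, the input image is rescaled so its pixel area falls between MIN_ADABINS_AREA and MAX_ADABINS_AREA, using a square-root scale factor so that width and height change proportionally. A minimal sketch, with the thresholds taken from the file above and the 1280x720 example chosen purely for illustration:

```python
import math

# Thresholds as defined in disco_xform_utils.py above.
MAX_ADABINS_AREA = 500000
MIN_ADABINS_AREA = 448 * 448  # 200704

def adabins_input_size(w, h):
    """Return the (width, height) the depth input would be resized to."""
    area = w * h
    if area > MAX_ADABINS_AREA:
        scale = math.sqrt(MAX_ADABINS_AREA) / math.sqrt(area)   # downscaled (LANCZOS in the real code)
    elif area < MIN_ADABINS_AREA:
        scale = math.sqrt(MIN_ADABINS_AREA) / math.sqrt(area)   # upscaled (BICUBIC in the real code)
    else:
        return w, h
    return int(w * scale), int(h * scale)

# A 1280x720 frame is 921,600 px, above MAX_ADABINS_AREA, so it is downscaled:
print(adabins_input_size(1280, 720))  # -> (942, 530), roughly 499,000 px
```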

docker/README.md (47 changed lines)

@@ -1,47 +0,0 @@
# Docker
## Introduction
This is a Docker build file that preinstalls dependencies and packages, clones the required Git repos, and pre-caches the large model files needed by Disco Diffusion.
## TO-DO:
- Make container actually accept parameters on run. Right now you'll just be seeing lighthouses.
## Change Log
- `1.0`
  Initial build file created based on the DD 5.1 Git repo. This initial build is deliberately meant to work without touching any of the existing Python code. It handles some of the pre-setup tasks already done in the Python code, such as installing pip packages, cloning Git repos, and even pre-caching the model files for faster launch speed.
## Build the Prep Image
The prep image is broken out from the `main` folder's `Dockerfile` to help with long build context times (or wget download times after the initial build). This prep image build contains all the large model files required by Disco Diffusion.
From a terminal in the `docker/prep` directory, run:
```sh
docker build -t disco-diffusion-prep:5.1 .
```
## Build the Image
From a terminal in the `docker/main` directory, run:
```sh
docker build -t disco-diffusion:5.1 .
```
## Run as a Container
This example runs Disco Diffusion in a Docker container. It maps `images_out` and `init_images` to the container's working directory so they can be accessed by the host OS.
```sh
docker run --rm -it \
-v $(echo ~)/disco-diffusion/images_out:/workspace/code/images_out \
-v $(echo ~)/disco-diffusion/init_images:/workspace/code/init_images \
--gpus=all \
--name="disco-diffusion" --ipc=host \
--user $(id -u):$(id -g) \
disco-diffusion:5.1 python disco-diffusion/disco.py
```
## Passing Parameters
This will be added after conferring with repo authors.

docker/main/Dockerfile (40 changed lines)

@@ -1,40 +0,0 @@
# Model prep phase; also cuts down on build context wait time, since these model files
# are large and slow to copy...
FROM disco-diffusion-prep:5.1 AS modelprep
FROM nvcr.io/nvidia/pytorch:21.08-py3
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
# Install a few dependencies
RUN apt update
RUN DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt-get -y install -y tzdata imagemagick
# Create a disco user
RUN useradd -ms /bin/bash disco
USER disco
# Set up code directory
RUN mkdir code
WORKDIR /workspace/code
# Copy over models used
COPY --from=modelprep /scratch/models /workspace/code/models
COPY --from=modelprep /scratch/pretrained /workspace/code/pretrained
# Clone Git repositories
RUN git clone https://github.com/alembics/disco-diffusion.git && \
git clone https://github.com/openai/CLIP && \
git clone https://github.com/assafshocher/ResizeRight.git && \
git clone https://github.com/MSFTserver/pytorch3d-lite.git && \
git clone https://github.com/isl-org/MiDaS.git && \
git clone https://github.com/crowsonkb/guided-diffusion.git && \
git clone https://github.com/shariqfarooq123/AdaBins.git
# Install Python packages
RUN pip install imageio imageio-ffmpeg==0.4.4 pyspng==0.1.0 lpips datetime timm ipywidgets omegaconf>=2.0.0 pytorch-lightning>=1.0.8 torch-fidelity einops wandb pandas ftfy
# Precache other big files
COPY --chown=disco --from=modelprep /scratch/clip /home/disco/.cache/clip
COPY --chown=disco --from=modelprep /scratch/model-lpips/vgg16-397923af.pth /home/disco/.cache/torch/hub/checkpoints/vgg16-397923af.pth

docker/prep/Dockerfile (25 changed lines)

@@ -1,25 +0,0 @@
FROM nvcr.io/nvidia/pytorch:21.08-py3 AS prep
RUN mkdir -p /scratch/models && \
mkdir -p /scratch/models/superres && \
mkdir -p /scratch/models/slip && \
mkdir -p /scratch/model-lpips && \
mkdir -p /scratch/clip && \
mkdir -p /scratch/pretrained
RUN wget --progress=bar:force:noscroll -P /scratch/model-lpips https://download.pytorch.org/models/vgg16-397923af.pth
RUN wget --no-directories --progress=bar:force:noscroll -P /scratch/models https://github.com/intel-isl/DPT/releases/download/1_0/dpt_large-midas-2f21e586.pt
RUN wget --no-directories --progress=bar:force:noscroll -P /scratch/models https://v-diffusion.s3.us-west-2.amazonaws.com/512x512_diffusion_uncond_finetune_008100.pt
RUN wget --no-directories --progress=bar:force:noscroll -P /scratch/models https://openaipublic.blob.core.windows.net/diffusion/jul-2021/256x256_diffusion_uncond.pt
RUN wget --no-directories --progress=bar:force:noscroll -P /scratch/models https://v-diffusion.s3.us-west-2.amazonaws.com/secondary_model_imagenet_2.pth
RUN wget --no-directories --progress=bar:force:noscroll -P /scratch/pretrained https://cloudflare-ipfs.com/ipfs/Qmd2mMnDLWePKmgfS8m6ntAg4nhV5VkUyAydYBp8cWWeB7/AdaBins_nyu.pt
RUN wget --no-directories --progress=bar:force:noscroll -P /scratch/clip/ https://openaipublic.azureedge.net/clip/models/afeb0e10f9e5a86da6080e35cf09123aca3b358a0c3e3b6c78a7b63bc04b6762/RN50.pt
RUN wget --no-directories --progress=bar:force:noscroll -P /scratch/clip https://openaipublic.azureedge.net/clip/models/8fa8567bab74a42d41c5915025a8e4538c3bdbe8804a470a72f30b0d94fab599/RN101.pt
RUN wget --no-directories --progress=bar:force:noscroll -P /scratch/clip https://openaipublic.azureedge.net/clip/models/7e526bd135e493cef0776de27d5f42653e6b4c8bf9e0f653bb11773263205fdd/RN50x4.pt
RUN wget --no-directories --progress=bar:force:noscroll -P /scratch/clip https://openaipublic.azureedge.net/clip/models/52378b407f34354e150460fe41077663dd5b39c54cd0bfd2b27167a4a06ec9aa/RN50x16.pt
RUN wget --no-directories --progress=bar:force:noscroll -P /scratch/clip https://openaipublic.azureedge.net/clip/models/be1cfb55d75a9666199fb2206c106743da0f6468c9d327f3e0d0a543a9919d9c/RN50x64.pt
RUN wget --no-directories --progress=bar:force:noscroll -P /scratch/clip https://openaipublic.azureedge.net/clip/models/40d365715913c9da98579312b702a82c18be219cc2a73407c4526f58eba950af/ViT-B-32.pt
RUN wget --no-directories --progress=bar:force:noscroll -P /scratch/clip https://openaipublic.azureedge.net/clip/models/5806e77cd80f8b59890b7e101eabd078d9fb84e6937f9e85e4ecb61988df416f/ViT-B-16.pt
RUN wget --no-directories --progress=bar:force:noscroll -P /scratch/clip https://openaipublic.azureedge.net/clip/models/b8cca3fd41ae0c99ba7e8951adf17d267cdb84cd88be6f7c2e0eca1737a03836/ViT-L-14.pt