Commit c9c8be2 (parent c8dbe94), authored and committed by Nupur Kumari: customconcept101


customconcept101/README.md

# CustomConcept101

We release a dataset of 101 concepts, with 3-15 images per concept, for evaluating model customization methods.

<br>
<div>
<p align="center">
<img src='../assets/sample_images.png' align="center" width=800>
</p>
</div>

## Download dataset

```
git clone https://github.com/adobe-research/custom-diffusion.git
cd custom-diffusion/customconcept101
wget https://www.cs.cmu.edu/~custom-diffusion/assets/benchmark_dataset.zip
unzip benchmark_dataset.zip
```

## Evaluation

We provide a set of text prompts for each concept in the [prompts](prompts/) folder. The prompt file corresponding to each concept is listed in [dataset.json](dataset.json) and [dataset_multiconcept.json](dataset_multiconcept.json). The CLIP-feature-based image and text similarity can be calculated as:

```
python evaluate.py --sample_root {folder} --target_path {target-folder} --numgen {numgen}
```

* `sample_root`: the root directory of generated images. It should contain a subfolder `samples` with the generated images, as well as a `prompts.json` file mapping `{'imagename.stem': 'text prompt'}` for each image in the `samples` subfolder.
* `target_path`: path to the folder of target real images.
* `numgen`: number of images in the `sample_root/samples` folder.
* `outpkl`: the location to save evaluation results (default: `evaluation.pkl`).

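Per the options above, `evaluate.py` expects `sample_root` to contain a `samples` subfolder of generated images and a `prompts.json` file keyed by filename stem. A minimal sketch of preparing that layout; the `write_prompts_json` helper, the image stems, and the `<new1>` placeholder token are hypothetical illustrations, not part of the repository:

```
import json
import tempfile
from pathlib import Path

def write_prompts_json(sample_root, stem_to_prompt):
    """Create the layout evaluate.py expects: a `samples` subfolder for the
    generated images plus a prompts.json mapping filename stems to prompts."""
    root = Path(sample_root)
    (root / "samples").mkdir(parents=True, exist_ok=True)
    out = root / "prompts.json"
    out.write_text(json.dumps(stem_to_prompt, indent=2))
    return out

# Hypothetical run: samples/0000.png and samples/0001.png were generated
# from these two prompts.
root = tempfile.mkdtemp()
write_prompts_json(root, {
    "0000": "photo of a <new1> cat",
    "0001": "a <new1> cat wearing a hat",
})
```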
## Results

We compare our method (Custom Diffusion) with [DreamBooth](https://dreambooth.github.io) and [Textual Inversion](https://textual-inversion.github.io) on this dataset. We trained DreamBooth and Textual Inversion with the hyperparameters suggested in the respective papers. Both our method and DreamBooth are trained with generated images as regularization.

**Single concept**

<table>
<tr>
<td></td>
<td colspan=3 align=center> 200 DDPM </td>
<td colspan=3 align=center> 50 DDPM </td>
</tr>
<tr>
<td></td>
<td>Textual-alignment (CLIP)</td>
<td>Image-alignment (CLIP)</td>
<td>Image-alignment (DINO)</td>
<td>Textual-alignment (CLIP)</td>
<td>Image-alignment (CLIP)</td>
<td>Image-alignment (DINO)</td>
</tr>
<tr>
<td>Textual Inversion</td>
<td> 0.6126 </td>
<td> 0.7524 </td>
<td> 0.5111 </td>
<td> 0.6117 </td>
<td> 0.7530 </td>
<td> 0.5128 </td>
</tr>
<tr>
<td>DreamBooth</td>
<td> 0.7522 </td>
<td> 0.7520 </td>
<td> 0.5533 </td>
<td> 0.7514 </td>
<td> 0.7521 </td>
<td> 0.5541 </td>
</tr>
<tr>
<td>Custom Diffusion (Ours)</td>
<td> 0.7602 </td>
<td> 0.7440 </td>
<td> 0.5311 </td>
<td> 0.7583 </td>
<td> 0.7456 </td>
<td> 0.5335 </td>
</tr>
</table>

**Multiple concept**

<table>
<tr>
<td></td>
<td colspan=3 align=center> 200 DDPM </td>
<td colspan=3 align=center> 50 DDPM </td>
</tr>
<tr>
<td></td>
<td>Textual-alignment (CLIP)</td>
<td>Image-alignment (CLIP)</td>
<td>Image-alignment (DINO)</td>
<td>Textual-alignment (CLIP)</td>
<td>Image-alignment (CLIP)</td>
<td>Image-alignment (DINO)</td>
</tr>
<tr>
<td>DreamBooth</td>
<td> 0.7383 </td>
<td> 0.6625 </td>
<td> 0.3816 </td>
<td> 0.7366 </td>
<td> 0.6636 </td>
<td> 0.3849 </td>
</tr>
<tr>
<td>Custom Diffusion (Opt)</td>
<td> 0.7627 </td>
<td> 0.6577 </td>
<td> 0.3650 </td>
<td> 0.7599 </td>
<td> 0.6595 </td>
<td> 0.3684 </td>
</tr>
<tr>
<td>Custom Diffusion (Joint)</td>
<td> 0.7567 </td>
<td> 0.6680 </td>
<td> 0.3760 </td>
<td> 0.7534 </td>
<td> 0.6704 </td>
<td> 0.3799 </td>
</tr>
</table>

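The textual- and image-alignment numbers in the tables above are feature similarities (CLIP text/image features or DINO image features). As a minimal numpy sketch of the underlying metric, assuming it is the mean cosine similarity between paired embeddings; the `alignment_score` helper is an illustration, not the code in `evaluate.py`:

```
import numpy as np

def alignment_score(feats_a, feats_b):
    """Mean cosine similarity between two batches of feature vectors.

    feats_a, feats_b: arrays of shape (N, D), e.g. embeddings of N
    generated images and the N prompts (or target images) they pair with.
    """
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    return float(np.mean(np.sum(a * b, axis=1)))

# Identical directions score 1.0; orthogonal directions score 0.0.
x = np.array([[1.0, 0.0], [0.0, 2.0]])
print(alignment_score(x, x))  # → 1.0
```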
## License

Images taken from Unsplash are under the [Unsplash License](https://unsplash.com/license). Images captured by ourselves are released under the [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/deed.en) license. Flower category images are downloaded from Wikimedia/Flickr/Pixabay, and the links to the original images can be found [here](https://www.cs.cmu.edu/~custom-diffusion/assets/urls.txt).

## Acknowledgments

We are grateful to Sheng-Yu Wang, Songwei Ge, Daohan Lu, Ruihan Gao, Roni Shechtman, Avani Sethi, Yijia Wang, Shagun Uppal, and Zhizhuo Zhou for helping with the dataset collection, and Nick Kolkin for the feedback.
