# Guide to the PhotoMaker Model on Hugging Face

## Introduction

PhotoMaker takes one or a few face photos, together with a text prompt, and returns a customized photo or painting within seconds. No training is required. The model can also be adapted to any SDXL-based base model or used in conjunction with other LoRA modules.

## Realistic Results

![Realistic Result 1](https://cdn-uploads.huggingface.co/production/uploads/6285a9133ab6642179158944/BYBZNyfmN4jBKBxxt4uxz.jpeg)
![Realistic Result 2](https://cdn-uploads.huggingface.co/production/uploads/6285a9133ab6642179158944/9KYqoDxfbNVLzVKZzSzwo.jpeg)

## Stylization Results

![Stylization Result 1](https://cdn-uploads.huggingface.co/production/uploads/6285a9133ab6642179158944/du884lcjpqqjnJIxpATM2.jpeg)
![Stylization Result 2](https://cdn-uploads.huggingface.co/production/uploads/6285a9133ab6642179158944/-AC7Hr5YL4yW1zXGe_Izl.jpeg)

More results can be found on the [project page](https://photo-maker.github.io/).

## Model Details

The checkpoint contains two main parts, one for each key in the loaded state dict:

1. `id_encoder`: a fine-tuned OpenCLIP-ViT-H-14 plus a few fuse layers.
2. `lora_weights`: LoRA weights applied to all attention layers in the UNet, with the rank set to 64.
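To give a sense of what a rank of 64 means in practice, the sketch below counts the parameters that one LoRA pair adds next to a dense projection. It relies on the general LoRA formulation (a low-rank update built from two matrices of shapes `rank x d_in` and `d_out x rank`); the layer width of 1280 is an illustrative assumption and is not read from the PhotoMaker checkpoint, and both function names are hypothetical.

```python
def lora_param_count(d_in: int, d_out: int, rank: int = 64) -> int:
    """Parameters added by one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in the dense weight matrix the adapter sits beside."""
    return d_in * d_out


# Illustrative width only; not read from the PhotoMaker checkpoint.
d = 1280
added = lora_param_count(d, d)  # 64 * (1280 + 1280) = 163840
dense = full_param_count(d, d)  # 1280 * 1280 = 1638400
print(added, dense, added / dense)
```

Under these assumed dimensions the adapter adds about a tenth as many parameters as the projection it modifies, which is why LoRA weights can be shipped as a comparatively small add-on to an SDXL base model.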

## Usage

You can download the model directly from the `TencentARC/PhotoMaker` repository on the Hugging Face Hub, or fetch it from a Python script:

```python
from huggingface_hub import hf_hub_download

# Downloads the checkpoint (or reuses a cached copy) and returns its local path.
photomaker_ckpt = hf_hub_download(repo_id="TencentARC/PhotoMaker", filename="photomaker-v1.bin", repo_type="model")
```

Then follow the instructions in the PhotoMaker [GitHub repository](https://github.com/TencentARC/PhotoMaker).
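Before wiring the checkpoint into a pipeline, it can be useful to confirm that the downloaded file has the two top-level keys described under Model Details. The following is a minimal sketch, not part of the PhotoMaker codebase: `summarize_checkpoint` and `load_checkpoint` are hypothetical helper names, and the sketch assumes that each top-level key maps to a nested dictionary of parameters and that the file loads with PyTorch's `torch.load`.

```python
from typing import Any, Dict, Mapping

# The two top-level keys named under Model Details.
EXPECTED_KEYS = ("id_encoder", "lora_weights")


def summarize_checkpoint(state: Mapping[str, Any]) -> Dict[str, int]:
    """Count the entries stored under each expected top-level key."""
    missing = [key for key in EXPECTED_KEYS if key not in state]
    if missing:
        raise KeyError(f"checkpoint is missing expected keys: {missing}")
    return {key: len(state[key]) for key in EXPECTED_KEYS}


def load_checkpoint(path: str) -> Mapping[str, Any]:
    """Load the checkpoint onto the CPU. Requires PyTorch, imported lazily."""
    import torch

    return torch.load(path, map_location="cpu")
```

With PyTorch installed, `summarize_checkpoint(load_checkpoint(photomaker_ckpt))` would report how many entries sit under each key, and it raises a `KeyError` if the file does not have the expected layout.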

## Limitations

- The model's customization performance degrades on Asian male faces.
- The model still struggles with accurately rendering human hands.

## Bias

While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases.

## Citation

**BibTeX:**
```bibtex
@article{li2023photomaker,
  title={PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding},
  author={Li, Zhen and Cao, Mingdeng and Wang, Xintao and Qi, Zhongang and Cheng, Ming-Ming and Shan, Ying},
  journal={arXiv preprint arXiv:2312.04461},
  year={2023}
}
```

This guide provides an overview of the PhotoMaker model hosted on Hugging Face and its features. For general information about the Hugging Face Hub, refer to the [official documentation](https://huggingface.co/docs/hub/).




