Image manipulation has long been central to creative work, letting designers, artists, and hobbyists bring their ideas to life. AI tools such as Instruct Pix2Pix extend this by letting you edit an image with a plain-text instruction. Instruct Pix2Pix is a diffusion model built on Stable Diffusion: it takes an input image and an edit instruction, and produces a modified version of that image.

This guide provides a step-by-step walkthrough to set up and use Instruct Pix2Pix on an Ubuntu GPU server.

Prerequisites

Before we dive into the setup, ensure that your server meets the following requirements:

  • An Ubuntu 22.04 cloud GPU server with at least 20 GB of VRAM.
  • The CUDA Toolkit and cuDNN installed.
  • Root or sudo privileges.

To confirm CUDA support, run:

nvidia-smi

Ensure the output lists your GPU and available VRAM.
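If you prefer to check the VRAM requirement from a script, the following is a minimal sketch. It assumes nvidia-smi is on the PATH and uses its CSV query mode; the helper names parse_gpu_memory and query_gpus are illustrative, and the 20,000 MiB threshold is an approximation of the 20 GB requirement above, since cards usually report slightly less than their nominal size.

```python
import shutil
import subprocess

# Approximation of the guide's 20 GB requirement, in MiB (assumption)
MIN_VRAM_MIB = 20_000


def parse_gpu_memory(csv_text):
    """Parse 'name, memory_mib' lines into (name, memory_mib) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        # Split on the last comma so a comma inside the GPU name is preserved
        name, memory = line.rsplit(",", 1)
        gpus.append((name.strip(), int(memory.strip())))
    return gpus


def query_gpus():
    """Return [(name, total_mib), ...], or [] if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        )
    except subprocess.CalledProcessError:
        return []
    return parse_gpu_memory(result.stdout)


for name, total_mib in query_gpus():
    status = "OK" if total_mib >= MIN_VRAM_MIB else "below the 20 GB requirement"
    print(f"{name}: {total_mib} MiB ({status})")
```

If the loop prints nothing, nvidia-smi was not found or could not talk to the driver, which usually means the NVIDIA driver needs attention before continuing.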

Step 1: Installing Python Dependencies

We’ll start with the Python tooling. Install Python 3 and pip if they are not already present on your system.

apt install python3 python3-pip

Next, install PyTorch with CUDA support using pip.

pip3 install torch --index-url https://download.pytorch.org/whl/cu124
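After the install, it is worth confirming that PyTorch can actually see the GPU. The sketch below is one way to do that; the function name cuda_report is hypothetical, and it assumes the first device (index 0) is the one you intend to use.

```python
import importlib.util


def cuda_report():
    """Return a one-line description of PyTorch's view of the GPU."""
    # Check for the package first so the function also works before installation
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"
    import torch

    if not torch.cuda.is_available():
        return "PyTorch is installed but cannot see a CUDA device"
    # Device index 0 is an assumption; adjust on multi-GPU servers
    props = torch.cuda.get_device_properties(0)
    total_gib = props.total_memory / 1024 ** 3
    return f"PyTorch {torch.__version__} sees {props.name} with {total_gib:.1f} GiB"


print(cuda_report())
```

You can run this with python3 in the terminal, or later in a notebook cell. If it reports that no CUDA device is visible, the installed wheel may not match your driver's CUDA version.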

Step 2: Setting Up Jupyter Notebook

Jupyter Notebook provides an interactive environment to work with Instruct Pix2Pix.

You can install it using the following command.

pip3 install notebook

Next, run the Jupyter notebook.

jupyter notebook --ip=your-server-ip --allow-root

You will see the following output.

[I 2024-12-17 03:32:14.456 ServerApp] Jupyter Server 2.14.2 is running at:
[I 2024-12-17 03:32:14.456 ServerApp] http://your-server-ip:8888/tree?token=6b926e2169755fafadf0282b219f775b978c9f5e009bbe16
[I 2024-12-17 03:32:14.456 ServerApp]     http://127.0.0.1:8888/tree?token=6b926e2169755fafadf0282b219f775b978c9f5e009bbe16
[I 2024-12-17 03:32:14.456 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 2024-12-17 03:32:14.459 ServerApp] No web browser found: Error('could not locate runnable browser').
[C 2024-12-17 03:32:14.459 ServerApp] 
    
    To access the server, open this file in a browser:
        file:///root/.local/share/jupyter/runtime/jpserver-2789-open.html
    Or copy and paste one of these URLs:
        http://your-server-ip:8888/tree?token=6b926e2169755fafadf0282b219f775b978c9f5e009bbe16
        http://127.0.0.1:8888/tree?token=6b926e2169755fafadf0282b219f775b978c9f5e009bbe16

Note down the Jupyter notebook URL in the above output.

Step 3: Access Jupyter Notebook

In this section, we will open the notebook and run the model through the web-based interface.

1. Access Jupyter Notebook

Open the URL provided in the output (e.g., http://your-server-ip:8888/tree?token=6b926e2169755fafadf0282b219f775b978c9f5e009bbe16) in a web browser to access the Jupyter Notebook interface.

Click on File => New => Notebook to create a new Notebook. You will be asked to select a kernel.

Select a Python kernel and click on Select. You will see the new Notebook page.

2. Install Required Dependencies

In the new Notebook window, install the required model libraries.

!pip3 install diffusers accelerate safetensors transformers pillow

Click on run or press Control + Enter to install the above libraries.

Upgrade Jupyter Notebook and the ipywidgets package.

!pip3 install --upgrade jupyter ipywidgets
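To confirm that the libraries were installed into the environment the kernel is using, you can list their versions from a notebook cell. This is a small sketch using the standard library's importlib.metadata; the helper name installed_versions is illustrative, and the package list simply mirrors the install command above.

```python
from importlib import metadata

# Distribution names taken from the pip install command in this step
REQUIRED = ["diffusers", "accelerate", "safetensors", "transformers", "pillow"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

If a package shows as not installed even though pip reported success, the notebook kernel is probably running a different Python interpreter than the pip3 you used.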

3. Import Necessary Libraries

In the Notebook window, add the code below to import the necessary libraries.

from PIL import Image, ImageOps
import requests
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline, EulerAncestralDiscreteScheduler

4. Download an Image

Add this function to download and prepare an image:

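def download_image(url):
    # Fail early on HTTP errors instead of handing an error page to PIL
    response = requests.get(url, stream=True, timeout=30)
    response.raise_for_status()
    image = Image.open(response.raw)
    # Apply the EXIF orientation, then convert to 3-channel RGB
    image = ImageOps.exif_transpose(image)
    image = image.convert("RGB")
    return image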

# Example
url = "https://i.postimg.cc/gkGpRjsp/image1.png"
image = download_image(url)

Note: Replace https://i.postimg.cc/gkGpRjsp/image1.png with the URL of your own image.
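Very large input images take more VRAM and more time to edit, so you may want to downscale them before passing them to the model. The following is a sketch under stated assumptions: the 512-pixel cap and the rounding to multiples of 8 (the downsampling factor of Stable Diffusion's VAE) are illustrative defaults rather than values from this guide, and fit_size and resize_for_editing are hypothetical helper names.

```python
def fit_size(width, height, max_side=512, multiple=8):
    """Shrink (width, height) so the longer side is at most max_side,
    then round each side down to a multiple of `multiple`. Never upscales."""
    longest = max(width, height)
    if longest > max_side:
        # Integer arithmetic avoids floating-point rounding surprises
        width = width * max_side // longest
        height = height * max_side // longest
    new_width = max(multiple, width // multiple * multiple)
    new_height = max(multiple, height // multiple * multiple)
    return new_width, new_height


def resize_for_editing(image, max_side=512):
    """Return a resized copy of a PIL image using fit_size."""
    from PIL import Image  # imported lazily so fit_size works without Pillow

    return image.resize(fit_size(*image.size, max_side=max_side), Image.LANCZOS)


print(fit_size(1024, 768))
```

With the image from the previous cell, you would call image = resize_for_editing(image) before running the pipeline. Raising max_side keeps more detail at the cost of memory and speed.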

5. Load the Instruct Pix2Pix Model

Define and configure the model pipeline:

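model_id = "timbrooks/instruct-pix2pix"
# Load the weights in half precision; safety_checker=None disables the built-in content filter
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(model_id, torch_dtype=torch.float16, safety_checker=None)
pipe.to("cuda")
# Swap the default scheduler for the Euler ancestral scheduler
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)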
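The configuration above assumes a CUDA GPU is present. If you want the same notebook to degrade gracefully on a machine without one, a possible variation is sketched below. The helpers select_runtime and load_pipeline are hypothetical names, and the CPU fallback with float32 is an assumption for illustration; expect it to be far slower than the GPU path.

```python
def select_runtime(cuda_available):
    """Choose (device, dtype_name): half precision on GPU, full precision on CPU."""
    if cuda_available:
        return "cuda", "float16"
    return "cpu", "float32"


def load_pipeline(model_id="timbrooks/instruct-pix2pix"):
    """Load the Instruct Pix2Pix pipeline on the best available device."""
    # Heavy imports stay inside the function so select_runtime is usable on its own
    import torch
    from diffusers import (
        EulerAncestralDiscreteScheduler,
        StableDiffusionInstructPix2PixPipeline,
    )

    device, dtype_name = select_runtime(torch.cuda.is_available())
    pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
        model_id, torch_dtype=getattr(torch, dtype_name), safety_checker=None
    )
    pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
    return pipe.to(device)


print(select_runtime(True))
```

Calling pipe = load_pipeline() would then replace the four configuration lines above. The first call downloads the model weights, so it can take a few minutes.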

6. Running the Pipeline

Once the model is loaded, you can use it to edit images based on text prompts:

prompt = "Make it red and blue"
images = pipe(prompt, image=image, num_inference_steps=10, image_guidance_scale=1).images

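# Save the result and display it inline in the notebook
# (images[0].show() needs a desktop image viewer, which a headless server lacks)
from IPython.display import display
images[0].save("output.png")
display(images[0])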

Note: Replace the prompt with your desired modification description. For example, “Turn it into a sketch” or “Add a sunset background.”
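Besides the prompt, two pipeline parameters strongly affect the result: image_guidance_scale controls how closely the output stays to the input image, and guidance_scale controls how strongly it follows the text instruction. A simple way to find good values is to try a small grid. The sketch below assumes the pipe and image objects from the earlier cells; build_sweep and run_sweep are hypothetical helper names, and the specific scale values and step count are illustrative starting points only.

```python
from itertools import product


def build_sweep(image_scales=(1.0, 1.5, 2.0), text_scales=(5.0, 7.5)):
    """Return one keyword-argument dict per combination of the two scales."""
    return [
        {"image_guidance_scale": image_scale, "guidance_scale": text_scale}
        for image_scale, text_scale in product(image_scales, text_scales)
    ]


def run_sweep(pipe, image, prompt, settings, steps=20):
    """Run the pipeline once per setting; return (settings, edited_image) pairs."""
    results = []
    for kwargs in settings:
        edited = pipe(
            prompt, image=image, num_inference_steps=steps, **kwargs
        ).images[0]
        results.append((kwargs, edited))
    return results


for setting in build_sweep():
    print(setting)
```

In the notebook you would call results = run_sweep(pipe, image, "Make it red and blue", build_sweep()) and then save or display each returned image to compare them side by side. If the edit is too weak, lowering image_guidance_scale or raising guidance_scale is usually the first thing to try.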

After adding all the code to the notebook, click Run or press Control + Enter to execute it. The generated image appears in the notebook window.

7. Verifying GPU Utilization

Confirm that the GPU is actually being used by running the following command in the notebook window:

!nvidia-smi

The output should show the GPU memory being used by Python processes.
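You can also read memory figures from inside the Python process, which shows how much of that usage belongs to PyTorch itself. This is a minimal sketch using PyTorch's CUDA memory counters; the function name gpu_memory_summary is hypothetical, and the figures cover only the current process on the current device.

```python
import importlib.util


def gpu_memory_summary():
    """Return PyTorch's GPU memory counters in MiB, or None without CUDA."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    if not torch.cuda.is_available():
        return None
    mib = 1024 ** 2
    return {
        # Memory currently held by live tensors
        "allocated_mib": torch.cuda.memory_allocated() / mib,
        # Highest value of allocated memory since the process started
        "peak_allocated_mib": torch.cuda.max_memory_allocated() / mib,
        # Memory held by PyTorch's caching allocator, including free blocks
        "reserved_mib": torch.cuda.memory_reserved() / mib,
    }


print(gpu_memory_summary())
```

The peak figure is the most useful one for sizing a server: run an edit first, then check it to see how much headroom remains.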

Conclusion

Instruct Pix2Pix enables flexible and precise image editing using simple text prompts. With an Ubuntu GPU server, you can leverage this powerful AI tool for creative projects or experimentation. Follow this guide to set up your environment and start exploring the possibilities of AI-driven image manipulation.