This project implements Neural Style Transfer (NST), a deep learning technique that applies the artistic style of one image (style image) to another (content image) while preserving the original structure.
The model is based on VGG19 and uses a combination of content loss and style loss to optimize the generated image.
| Initial Content Image | Initial Style Image |
|---|---|
| ![]() | ![]() |

| Step 0 (Initial Random Noise) | Step 7000 (Final Stylized Image) |
|---|---|
| ![]() | ![]() |
The left image is the initial random noise; the right image is the final stylized version after 7000 iterations.
NST works by optimizing a target image to minimize two types of losses:
- **Content loss** ensures that the main structure of the content image is retained.
- **Style loss** ensures that the textures and patterns of the style image are transferred.
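Concretely, as implemented below, both losses are mean squared errors: the content loss compares feature maps $F$ (target) and $P$ (content) directly, while the style loss compares Gram matrices $G$ (target) and $S$ (style) at a layer with $C$ channels and $H \times W$ spatial size:

$$
\mathcal{L}_{\text{content}} = \operatorname{mean}\big((F - P)^2\big),
\qquad
\mathcal{L}_{\text{style}} = \frac{\operatorname{mean}\big((G - S)^2\big)}{C\,H\,W},
\qquad
\mathcal{L}_{\text{total}} = \alpha\,\mathcal{L}_{\text{content}} + \beta\,\mathcal{L}_{\text{style}}
$$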
1. **Preprocess images:** resize and normalize both content and style images.
2. **Feature extraction:** use VGG19 convolutional layers to extract feature maps.
3. **Optimize target image:** start from random noise and optimize it using content and style losses.
4. **Save progress:** the output image improves gradually over multiple steps.
To run the project, install the required Python libraries:
```
pip install torch torchvision numpy tqdm pillow opencv-python
```

The project is organized as follows:

```
NST_Project
├── images                  # Folder for input images
│   ├── content.jpg         # Your content image
│   ├── style.jpg           # Your style image
│   ├── result_0.png        # First output (random noise)
│   └── result_7000.png     # Final stylized image
├── nst.py                  # Main Python script
└── README.md               # This README file
```
Before applying NST, we crop both images to the same 512×512 size:

```python
from PIL import Image
from pathlib import Path

def crop_it(img):
    """Center-crop an image to 512x512 and save it next to the original."""
    input_path = Path(img)
    image = Image.open(input_path)
    width, height = image.size

    # Coordinates of a centered 512x512 box
    left = (width - 512) // 2
    top = (height - 512) // 2
    right = left + 512
    bottom = top + 512

    cropped_img = image.crop((left, top, right, bottom))

    # Save as e.g. style_crop.jpg alongside the original file
    modified_path = input_path.with_name(f"{input_path.stem}_crop{input_path.suffix}")
    cropped_img.save(modified_path)
    cropped_img.show()
    return modified_path

crop_it('path/to/style.jpg')
crop_it('path/to/content.jpg')
```
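Note that `crop_it` assumes both image dimensions are at least 512 px. A minimal guard for smaller images, assuming upscaling the short side first is acceptable (`safe_crop` is a hypothetical helper, not part of the original script):

```python
def safe_crop(img_path, size=512):
    image = Image.open(img_path)
    # Upscale the short side first if the image is smaller than the crop box
    if min(image.size) < size:
        scale = size / min(image.size)
        image = image.resize((round(image.width * scale), round(image.height * scale)))
    left = (image.width - size) // 2
    top = (image.height - size) // 2
    return image.crop((left, top, left + size, top + size))
```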
We load a pre-trained VGG19 model but remove the fully connected layers, keeping only the convolutional feature extractor:

```python
import torch
import torch.nn as nn
import torchvision.models as models

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

class VGG(nn.Module):
    def __init__(self):
        super().__init__()
        # Indices of the conv layers whose outputs we keep
        # (conv1_1, conv2_1, conv3_1, conv4_1, conv5_1)
        self.select_features = ['0', '5', '10', '19', '28']
        self.vgg = models.vgg19(pretrained=True).features
        # Freeze the VGG weights; only the target image is optimized
        for param in self.vgg.parameters():
            param.requires_grad_(False)

    def forward(self, output):
        features = []
        for name, layer in self.vgg._modules.items():
            output = layer(output)
            if name in self.select_features:
                features.append(output)
        return features

# Load the model in evaluation mode
vgg = VGG().to(device).eval()
```

The content loss is the mean squared error between the feature maps of the target and content images:

```python
def get_content_loss(target_vec, content_vec):
    return torch.mean((target_vec - content_vec) ** 2)
```

The style loss compares Gram matrices, which capture correlations between feature channels rather than spatial layout:

```python
def gram_matrix(feat, c, h, w):
    # Flatten spatial dimensions to (c, h*w) and compute channel correlations
    feat = feat.view(c, h * w)
    return torch.mm(feat, feat.t())

def get_style_loss(target, style):
    _, c, h, w = target.size()
    G = gram_matrix(target, c, h, w)
    S = gram_matrix(style, c, h, w)
    # Normalize by feature-map size so larger layers don't dominate
    return torch.mean((G - S) ** 2) / (c * h * w)
```
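As a quick sanity check, the Gram matrix is square in the channel dimension regardless of spatial size (a hypothetical snippet, not part of the original script):

```python
feat = torch.randn(1, 64, 128, 128)   # (batch, channels, height, width)
G = gram_matrix(feat, 64, 128, 128)
print(G.shape)                        # torch.Size([64, 64])
```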
Next we load the images. The loader resizes to 512×512 and normalizes with ImageNet statistics, matching the preprocessing VGG19 was trained with:

```python
from torchvision import transforms
from PIL import Image

loader = transforms.Compose([
    transforms.Resize((512, 512)),
    transforms.ToTensor(),
    # ImageNet mean and std
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

def load_img(path):
    img = Image.open(path).convert('RGB')  # ensure 3 channels
    img = loader(img).unsqueeze(0)         # add batch dimension
    return img.to(device)

content_img = load_img('path/to/content_crop.jpg')
style_img = load_img('path/to/style_crop.jpg')
```

The target image starts as random noise and is the only tensor the optimizer updates, so it needs `requires_grad=True`:

```python
target_img = torch.randn_like(content_img, device=device, requires_grad=True)
```
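A common variant initializes the target from the content image instead of noise, which typically converges faster (an alternative, not what this project does):

```python
# Alternative: start from a copy of the content image instead of noise
target_img = content_img.clone().requires_grad_(True)
```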
Finally, we optimize the target image. Reference features for the content and style images are extracted once and detached, since gradients only need to flow into `target_img`:

```python
import torch.optim as optim
from tqdm import tqdm
from torchvision.utils import save_image  # needed for saving intermediate results

optimizer = optim.Adam([target_img], lr=0.01)
alpha = 50  # Content weight
beta = 50   # Style weight

# Extract reference features once; detach them from the graph
content_feature = [f.detach() for f in vgg(content_img)]
style_feature = [f.detach() for f in vgg(style_img)]

steps = 10000
for step in tqdm(range(steps)):
    target_feature = vgg(target_img)
    content_loss = sum(get_content_loss(t, c) for t, c in zip(target_feature, content_feature))
    style_loss = sum(get_style_loss(t, s) for t, s in zip(target_feature, style_feature))
    total_loss = alpha * content_loss + beta * style_loss

    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()

    if step % 500 == 0:
        save_image(target_img, f'result_{step}.png')
        print(f"Step {step}: Loss = {total_loss.item():.4f}")
```
References:

- Original NST Paper (Gatys et al., 2016)
- Fast Style Transfer (Johnson et al., 2016)
- PyTorch NST Tutorial



