A
PROJECT REPORT
ON
“Pixel Generator”
SUBMITTED BY -
Abhay Suresh Badhe
Seat No: 10801
UNDER THE GUIDANCE OF
Mr. Sushilkumar S. Kulkarni
SAVITRIBAI PHULE PUNE UNIVERSITY (SPPU)
MASTER OF COMPUTER APPLICATIONS
DR. D. Y. PATIL UNITECH SOCIETY’S
DR. D. Y. PATIL INSTITUTE OF MANAGEMENT AND RESEARCH, PIMPRI,
PUNE-18
2023-2024
Dr. D. Y. Patil Unitech Society’s
Dr. D.Y. PATIL INSTITUTE OF MANAGEMENT & RESEARCH
Sant Tukaram Nagar, Pimpri, Pune-411018
Recognized by the Savitribai Phule Pune University (SPPU-IMMP014220) AISHE CODE C-42109
Recipient of the "Best College Award" of Savitribai Phule Pune University
Accredited by NAAC with "A++" Grade (CGPA 3.52), MBA & MCA Programmes Accredited by NBA
CERTIFICATE
This is to certify that Abhay Badhe has successfully completed the project on “Pixel
Generator” in partial fulfilment of his Master of Computer Applications (MCA-II,
Sem-III) under the curriculum of Savitribai Phule Pune University, Pune, for the
academic year 2023-24.
Mr. Sushilkumar S. Kulkarni, Project Guide
Dr. Shikha Dubey, H.O.D. MCA
Dr. Vishal Wadajkar, Director
Internal Examiner: Signature, Name, Date
External Examiner: Signature, Name, Date
Acknowledgement
The success and final outcome of the “Pixel Generator” project required a
lot of guidance and assistance from many people, and we are extremely
privileged to have received it throughout the project. All that we
have achieved is due to such supervision and assistance, and we would not
forget to thank them.
We sincerely thank the Director, Dr. Meghana Bhilare, the Associate
Director, Dr. Vishal Wadajkar, the HOD, Dr. Shikha Dubey, and our project
guide, Prof. Sushilkumar Kulkarni, for giving us the opportunity to do the
project work and for the support and guidance that enabled us to complete
it.
We are extremely thankful to them for providing such support and
guidance despite their busy schedules managing the corporate affairs.
We owe deep gratitude to our project guide for taking a keen interest in
our project work and guiding us all along, until the completion of the project,
by providing all the information necessary for developing a good system.
We are thankful and fortunate to have received constant encouragement,
support, and guidance from all the teaching staff, which helped us
complete our project work successfully. We would also like to extend our sincere regards
to all the laboratory staff for their timely support.
Table of Contents
Sr no. Title
1 Introduction to Proposed System
1) Problem Definition
2) System Overview
3) Project Functionalities with Module Specification
4) Operating Environment (H/W & S/W Requirement Specification)
2 Overview of the Proposed System
1) Proposed System
2) Objectives of the System
3) Feasibility Study
4) User Requirement Specification
3 System Analysis & Design
1) Data Flow Diagram / Context Level Diagram
2) Class Diagram / ERD
3) Activity Diagram
4) Data Dictionary with Table Specification
5) Use Case Diagram
4 User Manual
1) Operational Instructions
2) Input/Output Screens
3) Reports
5 System Limitation
6 Future Enhancement and Conclusion
7 Bibliography and Glossary (Definitions, Acronyms and Abbreviations used in the Proposed System)
1) Introduction
1.1) Problem Definition
Modern image generation tools often require extensive manual tweaking,
complex installations, and online access. Many users seek offline, user-friendly
solutions with minimal system requirements to generate high-quality images
without relying on cloud services.
1.2) System Overview
Pixel Generator is an offline, open-source image generation software designed
for simplicity and efficiency. Built on Stable Diffusion XL, it eliminates the need
for extensive prompt engineering and operates with minimal clicks. It supports
various image manipulation tasks and advanced features for optimized results.
1.3) Project Functionalities and Module specifications
The system consists of the following key components:
- Image Generation: High-quality text-to-image transformation.
- Image Editing: In-painting, out-painting, and face-swapping.
- Advanced Sampling: Features like style adjustments, prompt expansion, and
negative prompt refinement.
- Platform-Specific Customization: Presets for anime, realistic, and general
themes.
- Multi-Prompt and Weighting Support: Allows flexible prompt inputs and
weighting adjustments for refined outputs.
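As an illustration of the multi-prompt and weighting component, the sketch below splits a prompt into weighted phrases. The "(phrase:weight)" syntax and the helper name parse_weighted_prompt are assumptions made for this example; the report does not specify the syntax Pixel Generator actually accepts.

```python
import re
from typing import List, Tuple

# Hypothetical syntax: "(phrase:1.3)" assigns weight 1.3 to the phrase.
# Plain phrases without a weight default to 1.0.
_WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")


def parse_weighted_prompt(prompt: str) -> List[Tuple[str, float]]:
    """Split a comma-separated prompt into (phrase, weight) pairs."""
    parts: List[Tuple[str, float]] = []
    for chunk in prompt.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue  # ignore empty segments such as a trailing comma
        match = _WEIGHTED.fullmatch(chunk)
        if match:
            parts.append((match.group(1).strip(), float(match.group(2))))
        else:
            parts.append((chunk, 1.0))
    return parts


if __name__ == "__main__":
    print(parse_weighted_prompt("a serene beach, (sunset:1.3), (clouds:0.8)"))
```

Keeping the weights as plain numbers next to each phrase makes it easy to scale the influence of individual phrases before they are passed to the model.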
1.4) Operating Environment
The website will be hosted on servers running the Flask framework. For now, it
will use a local MySQL database connection.
Hardware Requirements:
1. Minimum configuration:
GPU: Nvidia with 4GB VRAM (e.g., RTX 20XX series).
CPU: Multi-core processor, 2.0 GHz or higher.
RAM: 8GB system memory.
STORAGE: 40GB of disk space.
2. Recommended configuration:
GPU: Nvidia RTX 30XX or higher with 6GB+ VRAM.
STORAGE: 100GB SSD for faster performance.
RAM: 16GB or more.
3. CPU-Only use:
High-performance CPU (8+ cores) and 32GB RAM, though
processing will be slower.
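The three configurations above can be summarized as a small decision function. This is a minimal sketch: the function name classify_hardware and the tier labels are hypothetical, and the check deliberately ignores GPU vendor and storage, using only the VRAM, RAM, and core counts listed in this section.

```python
def classify_hardware(vram_gb: float, ram_gb: float, cpu_cores: int) -> str:
    """Map a machine's specs onto the tiers listed in this section."""
    # Recommended: 6GB+ VRAM and 16GB or more of RAM.
    if vram_gb >= 6 and ram_gb >= 16:
        return "recommended"
    # Minimum: 4GB VRAM and 8GB of RAM.
    if vram_gb >= 4 and ram_gb >= 8:
        return "minimum"
    # CPU-only: 8+ cores and 32GB of RAM (slower processing).
    if cpu_cores >= 8 and ram_gb >= 32:
        return "cpu-only"
    return "unsupported"


if __name__ == "__main__":
    print(classify_hardware(vram_gb=8, ram_gb=16, cpu_cores=8))
    print(classify_hardware(vram_gb=0, ram_gb=32, cpu_cores=8))
```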
Software Requirements:
1. Operating System:
Windows: 10 or later (64-bit).
Linux: Modern distributions (e.g., Ubuntu 20.04+).
macOS: Catalina or newer.
2. Python and libraries:
Python: Version 3.10 (mandatory).
Libraries: PyTorch, Gradio
3. Additional software:
Internet for initial downloads.
Updated Nvidia drivers (version 531 or later recommended).
4. Browser:
Any modern browser for UI access.
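A pre-flight check for two of the requirements above (Python 3.10 and 40GB of free disk space) could look like the following sketch. The function names are hypothetical and are not part of Pixel Generator; only standard-library calls are used.

```python
import shutil
import sys

REQUIRED_PYTHON = (3, 10)  # Python version stated in the software requirements
MIN_FREE_DISK_GB = 40      # storage figure from the minimum configuration


def python_version_ok(version=sys.version_info) -> bool:
    """Return True if the major and minor version match the required one."""
    return tuple(version[:2]) == REQUIRED_PYTHON


def free_disk_gb(path: str = ".") -> float:
    """Return the free space, in GiB, on the drive that holds `path`."""
    return shutil.disk_usage(path).free / 1024**3


def disk_space_ok(free_gb: float) -> bool:
    """Return True if the free space meets the minimum requirement."""
    return free_gb >= MIN_FREE_DISK_GB


if __name__ == "__main__":
    print("Python 3.10:", python_version_ok())
    print("Enough disk space:", disk_space_ok(free_disk_gb()))
```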
2) Proposed System
2.1) Proposed System
The proposed system, Pixel Generator, is an offline image generation software
designed to simplify the creation of high-quality visuals. Built on the Stable
Diffusion XL framework, it removes the complexities of traditional tools by
offering a streamlined, user-friendly experience. Once the models have been
downloaded on the first run, Pixel Generator operates entirely offline, ensuring
user privacy and independence from cloud services, which makes it
ideal for artists, designers, and enthusiasts seeking an open-source alternative.
Installation requires fewer than three clicks, which minimizes technical
barriers and makes it accessible to users of all skill levels.
2.2) Objectives of System
The primary objectives of Pixel Generator are simplicity, accessibility, and
performance optimization. It enables users to generate stunning images
effortlessly, even with short prompts, while delivering consistent high-quality
results across various tasks. The system is optimized for mid-range hardware,
ensuring smooth operation without compromising output quality.
2.3) Feasibility study
Technical Feasibility:
Pixel Generator leverages the Stable Diffusion XL architecture, a proven and
reliable model for high-quality image generation. The system is designed to run
efficiently on widely available mid-range hardware, such as Nvidia GPUs with a
minimum of 4GB VRAM or CPUs with sufficient RAM for processing. By
adopting an offline operational mode, Pixel Generator eliminates reliance on
cloud-based resources, ensuring a secure and independent user experience.
Operational Feasibility:
Pixel Generator is simple to operate: installation takes fewer than three
clicks, the interface runs in any modern browser, and short prompts are enough
to produce high-quality results. Because the system is optimized for mid-range
hardware, users can run it smoothly on their existing machines without
compromising output quality.
Economic Feasibility:
Pixel Generator is entirely free and open-source, eliminating the need for costly
subscriptions or commercial licenses. This makes it accessible to a broad
audience, including students, hobbyists, and professionals. The minimal
hardware requirements ensure that users do not need to invest in high-end
computing systems, significantly lowering the cost of adoption.
2.4) User Requirement Specification
Simplicity and accessibility:
Enable anyone, regardless of technical expertise, to generate stunning
images with minimal effort.
Simplify prompt usage so even short phrases produce visually appealing
results.
Performance optimization:
Ensure smooth operation on mid-range hardware by optimizing memory
usage and computational efficiency.
Maintain consistent image quality across various tasks, such as in-painting,
out-painting, and face-swapping.
Privacy and independence:
Offer a fully offline experience, protecting user data and ensuring
operations are not reliant on external servers.
Customizability:
Provide advanced options for users who wish to experiment with styles,
sampling parameters, and model presets.
Support multiple model configurations, including anime, realistic, and
general themes.
3) System Analysis & Design
3.1) Data flow diagram
3.2) Class Diagram
3.3) Activity Diagram
3.4) Data Dictionary with Table Specification
3.5) Use Case Diagram
4) User Manual
4.1) Installation:
1. Windows:
Run run.bat for general usage, run_anime.bat for anime-style images, or
run_realistic.bat for realistic images.
Models and dependencies will automatically download on the first run.
2. Linux:
Set up the environment using Conda or Python's virtual environment.
Run python entry_with_update.py to start the application.
3. macOS:
Follow the instructions for installing PyTorch with MPS support.
Run python entry_with_update.py to launch Pixel Generator.
Launching Pixel Generator:
The application opens in your default browser, displaying the user interface.
If necessary, access the interface from another device by passing --share or
--listen as a command-line argument.
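The sketch below shows one way the two launch flags could be parsed with Python's argparse module. The flag names come from this manual, but their exact behavior here is an assumption: --share is treated as a request for a public link, and a bare --listen as binding to all network interfaces. The helper names build_parser and describe_launch are hypothetical.

```python
import argparse
from typing import Dict, List, Union


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Launch options (illustrative sketch)")
    # Assumed meaning: request a temporary public link to the interface.
    parser.add_argument("--share", action="store_true")
    # Assumed meaning: address to bind to. A bare --listen binds to all
    # interfaces; omitting the flag keeps the UI on the local machine only.
    parser.add_argument("--listen", nargs="?", const="0.0.0.0", default="127.0.0.1")
    return parser


def describe_launch(argv: List[str]) -> Dict[str, Union[str, bool]]:
    """Parse the arguments and summarize how the UI would be exposed."""
    args = build_parser().parse_args(argv)
    return {"host": args.listen, "public_link": args.share}


if __name__ == "__main__":
    print(describe_launch(["--listen"]))
```

With these assumptions, running with --listen makes the interface reachable from other devices on the same network, while --share is intended for access from outside it.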
Basic Usage:
1. Enter a prompt describing the desired image (e.g., "A serene beach at
sunset").
2. Adjust settings such as style (Anime, Realistic) or resolution if needed.
3. Click the "Generate" button to produce the image.
4. Save the generated image to your desired location.
Advanced Features:
Use the Negative Prompt field to exclude specific elements from the image.
Modify sampling parameters like sharpness and contrast under the
"Advanced" section.
Switch presets (e.g., Anime or Realistic) for different visual styles.
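The inputs described under Basic Usage and Advanced Features can be pictured as a single request object. This is a sketch only: the class GenerationRequest, its field names, and the 1024x1024 default resolution are assumptions made for illustration, while the preset names are the ones used in this report.

```python
from dataclasses import dataclass

PRESETS = ("general", "anime", "realistic")  # preset names taken from this report


@dataclass
class GenerationRequest:
    """Hypothetical container for the settings a user enters in the UI."""
    prompt: str
    negative_prompt: str = ""  # elements to exclude from the image
    preset: str = "general"
    width: int = 1024          # assumed default resolution
    height: int = 1024

    def validate(self) -> None:
        """Raise ValueError if any setting is unusable."""
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.preset not in PRESETS:
            raise ValueError(f"unknown preset: {self.preset!r}")
        if self.width <= 0 or self.height <= 0:
            raise ValueError("resolution must be positive")


if __name__ == "__main__":
    request = GenerationRequest(
        prompt="A serene beach at sunset",
        negative_prompt="people, text",
        preset="realistic",
    )
    request.validate()
    print(request)
```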
4.2) Input/Output Screens
Home Page
Prompt input and image generation
Generated image
Input Image
Enhance Image
Advanced
4.3) Report
Image Metadata: Includes details like the prompt used, resolution, style settings,
and generation time.
Export Options: Save images along with their metadata as a compressed file or
in a report-friendly format (e.g., PDF).
Usage Tracking: Optionally track the number of images generated and
frequently used settings for analysis.
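The export and usage-tracking ideas above could be sketched as follows, using only the Python standard library. The function names, the JSON metadata layout, and the "style" key are assumptions for illustration; the report does not describe how Pixel Generator stores this information, and PDF export is not covered here.

```python
import json
import zipfile
from collections import Counter
from pathlib import Path


def export_with_metadata(image_path, metadata: dict, archive_path) -> Path:
    """Write the image and a JSON metadata file into one compressed archive."""
    image_path, archive_path = Path(image_path), Path(archive_path)
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.write(image_path, arcname=image_path.name)
        # The metadata file shares the image's base name, e.g. beach.json.
        archive.writestr(image_path.stem + ".json", json.dumps(metadata, indent=2))
    return archive_path


def most_used_styles(records, top_n: int = 3):
    """Count how often each style appears in a list of metadata records."""
    return Counter(record["style"] for record in records).most_common(top_n)


if __name__ == "__main__":
    history = [{"style": "Anime"}, {"style": "Anime"}, {"style": "Realistic"}]
    print(most_used_styles(history))
```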
5) System Limitations
• Scope Limitations: Pixel Generator focuses primarily on
high-quality image generation and editing, with no support for batch processing
or direct integration with third-party tools. It also lacks smaller, lightweight
models optimized for lower-end devices.
• Technical Limitations: Pixel Generator requires at least 4GB of VRAM for
GPU operation and runs more slowly on CPUs or AMD GPUs. It
also has high storage demands, requiring at least 40GB of free disk space.
Additionally, the software is not updated with new model architectures beyond
Stable Diffusion XL.
It is important for the project team to identify and address these limitations to ensure
successful implementation.
6) Future Enhancement and conclusion
Integration of newer model architectures.
Enhanced compatibility for non-Nvidia GPUs.
Extended functionality for batch processing and additional output formats.
Conclusion
Pixel Generator stands out as a powerful, user-friendly solution for offline image
generation, offering high-quality results with minimal effort. Its open-source
nature encourages community contributions and potential enhancements.
7) Bibliography and Glossary
Bibliography
Stable Diffusion XL Documentation: Stable Diffusion
Glossary
SDXL: Stable Diffusion XL, the underlying model architecture for Pixel
Generator.
In-painting: Modifying specific areas of an image.
Out-painting: Extending an image beyond its original boundaries.
Negative Prompt: Input for excluding specific elements in the generated output.