
Abstract:

The five-dollar model is a lightweight text-to-image generative architecture that generates low-dimensional images from an encoded text prompt. This model can successfully generate accurate and aesthetically pleasing content in low-dimensional domains with limited amounts of training data. Despite the small size of both the model and the datasets, the generated images still maintain the encoded semantic meaning of the textual prompt. We apply this model to three small datasets: pixel art video game maps, video game sprite images, and down-scaled emoji images, and introduce novel augmentation strategies to improve the model's performance on these limited datasets. We evaluate our model's performance using cosine similarity scores between text-image pairs generated by the CLIP ViT-B/32 model.
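The CLIP-based evaluation described above reduces to a cosine similarity between a text embedding and an image embedding. A minimal sketch of that metric, using toy 4-dimensional vectors as stand-ins for the 512-dimensional outputs of the actual CLIP ViT-B/32 encoders (the vector values here are illustrative, not real embeddings):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: dot product of the vectors divided by
    the product of their L2 norms. Ranges in [-1, 1]; higher means
    the embeddings point in more similar directions."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical stand-ins for CLIP text and image embeddings.
text_emb = np.array([0.1, 0.9, 0.2, 0.4])
image_emb = np.array([0.15, 0.85, 0.25, 0.35])

score = cosine_similarity(text_emb, image_emb)
```

In the real pipeline, `text_emb` and `image_emb` would come from the text and image encoders of CLIP ViT-B/32, and a higher score indicates the generated image better preserves the prompt's semantics.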

Introduction:

Recent developments in text-to-image generation in the AI community have spurred a renaissance of large generative models. Both open- and closed-source models, such as DALL-E and Stable Diffusion, have demonstrated their versatility and range in image generation, from hyperrealistic photos to abstract artworks. However, to generate images of this quality, these large models require large amounts of training data, both captions and paired images, large amounts of compute and training time, and large amounts of storage. Though these models boast incredible textual understanding and image fidelity, they cannot be used in every domain. For more specific tasks, such as creating assets like sprites, textures, or levels for video games, these models require extensive prompt engineering or fine-tuning to meet domain-specific constraints.

The Five-Dollar Model: Generating Game Maps and Sprites from Sentence Embeddings