Playing with AI Image Generation

Jean-Samuel Chenard

Electronic Architecture | AIoT Sensors | Microsystems

Published Aug 11, 2024

Those who know me will attest that I keep toying around with technology all the time. Every so often I revisit a topic to see the improvements, and I circle back over and over. Drones, CNC machines, AI code, microcontrollers, RF, etc. It's a never-ending game of catching up... With all the progress, it's fascinating.

A few days ago, I saw a few posts on LinkedIn that left me perplexed. I have toyed with image generators like Stable Diffusion and Midjourney, and their outputs have a hint of AI lingering when they generate human faces and people. Sometimes the people have a glitch (they used to have 6 fingers or an extra armlet, for example). Often they are 'too perfect' or cartoonlike, like a doll or a photoshopped ad for a perfume.

I often browse CIVITAI for many examples that showcase a diversity of styles and models but many seem to target games or stories.

The post I saw on LinkedIn was a generated image of someone giving a talk in front of a TEDx background. It looked so realistic... They mentioned a model called FLUX. I had to try it out.

Note: my way of trying those models (image generators and many others) is via replicate.com. That site takes care of the hassle of hosting the models on fast machines with expensive GPUs. For a couple of dollars (for hundreds of generations), it is a very affordable way to try those models. I think that some of the models can also be run on AWS Bedrock, but I stuck with Replicate as it has a very good interface, showcases examples of the models, and is quite easy to use and explore.

Generate Persons

At some point in the future, we'll have to put updated photos on our website. I figured that we would eventually get some photo shoots done of the team and some of our work. It's always a bit delicate because either the photo could end up with confidential information on it or there will be an oddball mess in the background (especially if that shot includes my office...).

I figured I would try for fun to generate a person with a Motsai branding as a starter. I gave a prompt based on the example (a cake with some text written with the frosting like the hamburger image for this article) and modified it to describe a technical person in a lab.

After a few tries, I got some guy that could easily look like one of my colleagues (if we had branded shirts...). I assure you, this person is not real.

Then I asked myself if I could put a friend's face on the same generated content and re-generate the image with that. Below is the result:

For those who know Gilles Prepoint - Liberating Greatness, you will find some level of similarity. It's not by chance, because I put his face over the picture before re-generating it with AI. It's not exactly him, but not too far off either.

You will also notice that the image got re-generated but the FLUX model inpainted the seams quite well. I have no Photoshop skills, so it's the direct algorithm output. I find that the models on Replicate are more willing to manipulate photos. Dall-E doesn't really want to regenerate photos with identifiable individuals. It's in their terms and conditions and I can certainly understand why they want to stay away from that...

Could it generate athletes with a name on their bib?

One thing that is difficult to generate is an action shot with an athlete, as it needs a lot of control over the scenery and most AI image generators totally mess up the text. Here is how FLUX handles the integration of logos and text.

The MOTSAI bib came out well (the mapping is good and seems to follow the shirt's curve). Some logos popped into existence (sorry Nike, if you don't like that I will remove that image). His hands are good. Maybe he would need a bit of work on the sweat, as he's quite ahead of the others but his skin looks very dry... Also, his smart watch has some weird kink.

I had to try with a few tweaks in the prompts and add more diversity.

Hitting the generate button a few times will produce a wide variety of images, and many look quite realistic. There are a few settings that control quality. I found I preferred having many options to pick from first, then refining quality later.

I used a combination of 'cheap' image generations via flux-schnell (generate many candidates) and refined them with flux-dev (using the rough image from the quick algorithm as an input to the more elaborate one and asking it to follow the same prompt).
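For readers who prefer scripting this over clicking through the web interface, here is a minimal sketch of the draft-then-refine loop described above. It is not the exact setup I used. The model identifiers and the input names (`num_outputs`, `image`, `prompt_strength`) are assumptions based on the Replicate model pages and may change, the 0.6 default is arbitrary, and `pick` is a hypothetical stand-in for choosing the best candidate by eye. The `run` argument is any function with the same call shape as `replicate.run`.

```python
from typing import Any, Callable, Dict

# Model identifiers as I understand them on Replicate (assumption; check the site).
SCHNELL = "black-forest-labs/flux-schnell"  # fast and cheap: rough candidates
DEV = "black-forest-labs/flux-dev"          # slower: refinement pass

# Anything callable like replicate.run(model, input={...}).
Runner = Callable[..., Any]


def draft_input(prompt: str, num_outputs: int = 4) -> Dict[str, Any]:
    """Input for the fast model: several rough candidates in one call."""
    return {"prompt": prompt, "num_outputs": num_outputs}


def refine_input(prompt: str, image: str, prompt_strength: float = 0.6) -> Dict[str, Any]:
    """Input for the slower model: same prompt, rough image as the starting point.

    prompt_strength is assumed to range from 0 to 1, where higher values let
    the model move further away from the input image.
    """
    if not 0.0 <= prompt_strength <= 1.0:
        raise ValueError("prompt_strength must be between 0 and 1")
    return {"prompt": prompt, "image": image, "prompt_strength": prompt_strength}


def draft_then_refine(run: Runner, prompt: str, pick: int = 0, num_outputs: int = 4) -> Any:
    """Generate rough candidates, then refine the chosen one with the same prompt."""
    candidates = list(run(SCHNELL, input=draft_input(prompt, num_outputs)))
    # str() turns either a plain URL or a file-like output object into a URL string
    # (assumption about how the client represents outputs).
    chosen = str(candidates[pick])
    return run(DEV, input=refine_input(prompt, chosen))
```

With the `replicate` Python package installed and an API token configured, something like `draft_then_refine(replicate.run, "a technician in an electronics lab", pick=1)` should be close to the workflow above. In practice I looked at all the candidates before deciding which one to refine.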

Trials with futuristic images

In the spirit of the current Olympics, I generated sport images but instead of athletes, I wanted robots. With all the AI developments, we may end up with some robot Olympics and we'll just watch them outpace each other... Boston Dynamics vs Tesla for the gold...

The above figure is what I can call a cute robot runner. But could he be made to swim? I wonder if the AI would take that same design of robot and dunk it in water. That would not be too good for his joints and exposed screws and vents...

Interestingly the new robot that got generated looks more adapted to water. It has some kind of neoprene soft sections that would keep water out of the joints. Not sure about the red dot on the forehead, maybe that is some distance sensor. Good protection for the cameras via the goggles and no exposed screws that could rust. Overall I am surprised how the context changed the robot and there's some 'design intelligence' to it.

Sport Car Racing team? Why not!

What if we wanted to go big and sponsor a race team? That is a lot of money that I feel we could better invest in more lab tools and employee bonuses... Let's see if the AI can help get us a branded car on the cheap.

In this example, I asked a few things in the prompt. One was to have the Motsai name on the hood. That went well and the text follows the curvature well. I also wanted the number 77 on the car and that worked out OK. I asked for my dog's name (Mochi) to replace Porsche as this was generated on the car by default. The AI blended MOCHI and PORSCHE as MOTSCHI which is quite clever... Also the logos around the car don't look like they exist, so we are less likely to receive a cease and desist if we were to publish that picture.

Amazing progress

I am really impressed by how fast image generation has been progressing in the past year. Early generations lacked context and had odd bits that seemed wrong. With the rapid advances in image generation tools and multilingual voice generation, it starts to feel like those models have an internal understanding of the context of what they are generating.

Look at the shadows, lighting, focus, muscles (for athletes) and facial expressions. Look also at the way clothing wraps around bodies: text overlays, logos and numbers all appear in the right place.

It is easy to dismiss portions of the images because of some glitch or artefact, but those models are prompt in, image out, so they have very little context to work with. They can generate a cake, a race car, a hamburger / cheeseburger... Impressive!

I also found that FLUX has some level of bias, which should be compensated for by explicitly asking for diversity in ethnicity and gender.
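One simple way to ask for diversity explicitly is to generate a batch of prompt variants instead of a single prompt. The sketch below is only an illustration: the `diversify` helper, the wording it appends and the descriptor lists are all hypothetical, and which phrasing FLUX responds to best would need experimenting.

```python
from itertools import product
from typing import List, Sequence


def diversify(base_prompt: str, genders: Sequence[str], ethnicities: Sequence[str]) -> List[str]:
    """Return one prompt per (gender, ethnicity) pair, stated explicitly."""
    return [
        f"{base_prompt}, featuring one {ethnicity} {gender}"
        for gender, ethnicity in product(genders, ethnicities)
    ]


# Illustrative descriptor lists only; adapt them to the people you want represented.
prompts = diversify(
    "a sprinter crossing the finish line wearing a bib that says MOTSAI",
    genders=["woman", "man"],
    ethnicities=["Black", "East Asian", "South Asian", "white"],
)
for p in prompts:
    print(p)
```

Each variant can then be sent to the image generator as its own request, so the final set of images does not depend on the model's default choices.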

The poutine test... if you read this far!

Tuba Tufail

Exploring FLUX for AI image generation sounds like a game-changer for visual content creation. The realism it brings could redefine branding and design across industries. Exciting possibilities ahead!

Parkha Ali

Freelance Community Builder | PR words | Content writer

Jean-Samuel Exploring the capabilities of FLUX for AI-driven image generation opens up exciting possibilities, especially with the level of realism it brings to branding and design. It's fascinating to see how these tools can transform creative processes across various industries.

Gilles Prepoint - Liberating Greatness

I work with CEOs who are committed to creating environments where energy, creativity, and meaningful achievement flow daily

This is awesome Jean-Samuel! I absolutely love my new shirt, it almost makes me look skinny! You are truly a life engineer; the way you delve into topics to extract the very best is nothing short of inspiring. Working at Motsai Research Inc. or Soralink alongside you and your team must be a blessing for everyone involved. Thanks for your Leadership !
