Sunday, April 19, 2026

Agentic Projects - The Infinite Book

I want to talk about a project I started and abandoned a few years back. It was an early attempt at building an application that used generative AI for both text and image elements. I didn't want to write a children's book, I wanted to write thousands of children's books. Anyway, the project failed, but outlined here are some of the steps I took and some of the lessons I learned.

Early Concept

One of my first stabs at agentic content creation, making stuff with AI as we called it then, was in September of 2024. I had played with image generation before that, in the age of DALL-E 2 and 3, but of course it was terrible. After playing with Stable Diffusion and ChatGPT-4, ideas started forming around content creation, trying to make something real. I had an 8 year old at the time, she was just growing out of young kids' books, but I thought I could make her an infinite book: she would be able to turn a page, say what she wanted to have happen next, and see it come to life. This turned out to be too ambitious for the time, but something still came out of it. Let me show you the never published app, Stories Together.

Stories Together - iOS Dashboard

Above you can see the Dashboard view of the iOS app Stories Together, with many of the generated test books on display. I never did get the layout right, but you can see the cover page for each book. Selecting a cover opens the book to pages like this:

A Reasonably Good Page

The idea was simple: I would use ChatGPT (via API) to generate the text of the pages, then use ChatGPT to generate a prompt for Stable Diffusion, and then generate the image for each page. The first versions of this failed pretty badly. The story text was often way too long, the prompts for generating the images were too long, and almost all of the stories were the same. Plus a lot of other stumbling blocks. It was time to start learning how to use AI to generate stuff that was not terrible.

The first improvement I made was to generate things in pieces, instead of all at once. For example, I would generate the characters for the story first, before getting into the story at all. "Please generate a character for a children's book, if the character is an animal, make it humanoid. Please describe the character in some detail", something like that as a starting prompt, and I would generate 2-4 of those as the main characters. Then for each character: "Please describe this character visually, describe their clothes, their shoes and anything on their head." This would produce a piece of text I could use to anchor the character. The stuff about the shoes and their head became critical later, as these descriptions would force Stable Diffusion to draw the character's feet and head, ensuring a picture of the entire character, not a cropped one. This type of tweaking early in the process was required to get any kind of consistency in later generation steps, text or image.
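The staged character flow above can be sketched in a few lines of Python. This is an illustration, not the actual app code: the `llm` callback, the exact prompt wording, and the cast size are all my own stand-ins.

```python
def character_prompt() -> str:
    # First stage: ask for a character before any story exists.
    return ("Please generate a character for a children's book. "
            "If the character is an animal, make it humanoid. "
            "Please describe the character in some detail.")

def visual_prompt(character: str) -> str:
    # Second stage: mentioning shoes and headwear nudges the image
    # model to draw the whole figure instead of cropping it.
    return (f"Please describe this character visually: {character}\n"
            "Describe their clothes, their shoes, "
            "and anything on their head.")

def generate_cast(llm, size=3):
    """Build the main cast one piece at a time.

    `llm` is any callable that takes a prompt string and returns
    the model's text response (a stand-in for the chat API).
    """
    cast = []
    for _ in range(size):
        character = llm(character_prompt())
        visual = llm(visual_prompt(character))
        cast.append({"description": character, "visual": visual})
    return cast
```

The point of the split is that each call has one small job, and the visual description comes back as a reusable chunk of text that later steps can quote verbatim.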

After generating the character text chunks (name, personality, visual description, etc.), I then used ChatGPT to generate a story idea, an ethos for the story as a whole, again, something to anchor the coming page text generations. Once the characters and story text were generated, I could combine the visual descriptions of the characters with the text of the page and generate a prompt for Stable Diffusion. Since I included details about the characters, there was a chance that I would get an identifiable character in each image. For you see, if you ask Stable Diffusion for an image of a cat named Fluffy driving a car, and an image of a cat named Fluffy buying bread, you will get two very different cats driving the car and buying the bread, even if you clamp down on the style of the image.
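Assembling the image prompt then mostly amounts to prepending the anchored visual descriptions to the page text. A minimal sketch, where the function name, style string, and example character are mine, not taken from the original app:

```python
def page_image_prompt(page_text, character_visuals,
                      style="children's book watercolor"):
    # Quote each character's anchored visual description verbatim,
    # so every page's prompt carries the same cat/raccoon/bear.
    anchors = " ".join(character_visuals)
    return f"{style}. {anchors} Scene: {page_text}"

prompt = page_image_prompt(
    "Fluffy the cat drives to the bakery to buy bread.",
    ["Fluffy is a grey cat in a red raincoat, yellow boots, "
     "and a small blue cap."],
)
```

Repeating the same anchor text on every page helps, but as the pictures below show, it only sometimes beats the consistency problem.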



In the above pictures you can see the heroes of the story, a raccoon, a tortoise, and a bear, doing something with balloons. Believe it or not, in these two pictures they are meant to be the same raccoon, tortoise, and bear. In some cases it worked, but it often failed with the tools I had on hand. This was just emerging in the world of AI as the Consistent Character problem, so I was hardly the first to run into this issue. There were a host of complex, error prone, and expensive solutions on the market, but this issue was dragging the project down. Though it was not my only issue.

Monetization

I had hoped, when the project started, to be able to make a little something by selling this app on the Apple iOS App Store. I was early to market, so I thought I could beat the rush and try to position myself as a name in agentic content creation, at least in the app space. Big dreams. It became very obvious early in the project that this was going to be a problem. I needed to run AI products, Stable Diffusion and ChatGPT, during app development, and it was a noticeable expense. It didn't break the bank, but an evening of coding was costing me 10 bucks in API access. How was I going to scale this up to a meaningful number of users, and how could I afford it?

My initial thought for the app was for it to be a simple, honest, one time purchase. Maybe ten bucks, let the kid go crazy creating stories in the back of the car. I did some quick back of the napkin math, and I realized it would cost me something like 45¢ per book! A bored kid would blow through 10 bucks in tokens in a few hours. I really didn't want to do in-app purchases for 'story tokens' or some such, though for most businesses that would be the right move, simply expose your downstream costs to the buyer. But I was not going to do that to parents, have their kid trying to get mom to buy another 10 books for 5 bucks or whatever. I needed a different solution.
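For the curious, the napkin math looked roughly like this. The per-image price, token counts, and page count below are reconstructions chosen to land near the 45¢ figure, not the actual 2024 rates:

```python
PAGES = 12                   # assumed pages per book
IMAGE_COST = 0.03            # assumed $ per Stable Diffusion image via API
TOKENS_PER_PAGE = 3000       # assumed prompt+completion tokens across staged calls
PRICE_PER_1K_TOKENS = 0.0025 # assumed blended $ per 1K tokens

def cost_per_book(pages=PAGES):
    # Images dominate; the text side is comparatively cheap.
    image_total = pages * IMAGE_COST
    text_total = pages * TOKENS_PER_PAGE / 1000 * PRICE_PER_1K_TOKENS
    return image_total + text_total
```

At these assumed rates a $10 purchase buys a bit over twenty books, which a determined kid in the back seat could easily burn through on one road trip.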

I explored the costs of self hosting. Stable Diffusion (and many, many other) image generation libraries can be run on local hardware, some with surprisingly modest graphics cards. I could buy a couple of decent graphics cards and get the first version out the door. There was always AWS if I needed to scale up quickly. Then came the question of text generation, and I admit this surprised me a lot: there was no way I was going to be able to host an LLM. There were options for running LLMs locally at the time of course, but they were super dumb. I even tried to run some on the iPad, it was never going to work.

As I worked on the project, I would twist it one way or another, trying to cope with the unresolved monetization problem. What if I used a less expensive model? What if we don't give the user any choices? What if I fake new book generation by simply re-using old books? Can I 'cache' a story or an image with an index of embeddings? It was starting to fall apart.
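The embedding-cache idea never shipped, but the lookup would have been something like this sketch: embed the kid's request, compare it against previously generated stories, and reuse the closest match above a similarity threshold. The class name and the 0.92 cutoff are hypothetical.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

class StoryCache:
    """Reuse an already-generated story when a new request is close enough."""

    def __init__(self, threshold=0.92):
        self.threshold = threshold
        self.entries = []  # list of (embedding, story) pairs

    def add(self, embedding, story):
        self.entries.append((embedding, story))

    def lookup(self, embedding):
        # Return the most similar cached story, or None to signal
        # that a fresh (and costly) generation is needed.
        best, best_sim = None, self.threshold
        for emb, story in self.entries:
            sim = cosine(embedding, emb)
            if sim >= best_sim:
                best, best_sim = story, sim
        return best
```

The catch, of course, is that serving a cached story to a kid who asked for something new quietly breaks the "infinite book" promise, which is part of why the idea felt like the project falling apart.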

End of a Project

Between not being able to get dynamic character consistency right and not being able to come up with a realistic monetization path, I knew this project was simply too early, at least for my skill set and budget. It was hard to say goodbye to this one. By the end of the project, I had generated hundreds of characters and read many, many stories about them. Some were pretty good, better than they should have been. There was this weird experience of running the code, generating a story or a character or a page, realizing I needed to tweak the prompt, and also realizing that if I hit 'run' again in Xcode, this image or character would simply vanish. It was a little haunting.

I am writing this a year and a half after I put work into this project, and I know I made the right decision. I think the app could technically work today, by using 18 month old models at a deep discount, but it would look like slop now. The newer models have raised the bar so much, and people are expecting more and more. Ironically, to produce this app with the quality today's audience would expect is still too expensive, so the app is still too early in some ways. One of the many contradictions in this early age of AI.





What I Learned

The most valuable part of this project was learning the basics of working with generative AI. I didn't have these words for it then, but I was starting to learn that the trick to generative AI is to provide guidelines for it to work with. This started to form when I broke the story generation into steps: first the characters, then their personalities, the theme of the story, the summary of the story, and then the text for each page. By the last step, the LLM has a bunch of great context to work from and is focused on the task at hand. My tooling for accomplishing this has evolved, but the core idea seems to be holding: if you want to play chess with an LLM, give it a chessboard.











Wednesday, February 10, 2010

JavaFX Presentation

Last night I presented JavaFX to the Rochester Java Users Group (RJUG.org). It was a very receptive bunch of people, no experience with JavaFX but a lot of interest. That makes a great combination!

I put the presentation together with Google Docs, so here is the presentation without sound...



Here is a list of resources to help people get going with JavaFX:

The main JavaFX site.

JFXtras is a collection of open source libraries for JavaFX and an excellent mailing list.

Until I get my source code uploaded to JFXtras, the source from the presentation and my book can be found here: JavaFX Special Effects: Taking Java™ RIA to the Extreme with Animation, Multimedia, and Game Elements.

Last, Jim Weaver's blog is a good source for JavaFX news.

Looks like I will be presenting again in April on GWT and Google App Engine. Fun!

Monday, December 07, 2009

JavaONE 2010 call for papers?

I was hoping to submit some work from my new book as a talk at JavaONE 2010. Then I realized that the call for papers was around this time last year. A quick Google search makes it look like the call for papers went out on Dec 4th of last year.


It is Dec 7th today and there is no word yet that JavaONE 2010 will even be happening. Well, let's hope we hear about it soon, I would really like to go to another JavaONE... Though the last DEVOXX looked like it was the cat's meow. Maybe I should make that my annual conference?

Update
It looks like JavaONE 2010 will be held Sept 19-23. The call for papers should be announced any day now.

Friday, November 06, 2009

JavaFX Production Suite - More Suggestions

Introduction

I was recently working with the JavaFX Production Suite and have another suggestion for making it better. The first thing I want to talk about is the coordinates assigned to Nodes when they are exported. Or the lack of coordinates, to be precise. The second issue is the inability to clone items within fxz content.

Coordinate Issue

Let's take a look at the example Illustrator project in the following picture.

We see a red rectangle and a blue circle, both items positioned at some location from Illustrator's origin. When this is exported to JavaFX, the upper left corner becomes the origin for all of the generated nodes. So, in our example the upper left corner of the rectangle is the origin. This sort of makes sense, since positioning each node from some other arbitrary position would lead to other difficulties. But what concerns me is how the nodes are positioned. Let's take a look at what this looks like in JavaFX.

Group {
    content: [
        SVGPath {
            fill: Color.rgb(0x0,0xae,0xef)
            stroke: null
            content: "M143.90,110.81 C143.90,133.63 125.40,152.13 102.59,152.13 C79.77,152.13 61.27,133.63 61.27,110.81 C61.27,87.99 79.77,69.49 102.59,69.49 C125.40,69.49 143.90,87.99 143.90,110.81 Z "
        },
        Rectangle {
            fill: Color.rgb(0xbf,0x1e,0x2d)
            stroke: null
            x: 0.0
            y: 0.0
            width: 171.0
            height: 57.0
        },
    ]
}


As we can see, the Rectangle is at point (0,0), but what coordinate is the Circle at? The Circle is drawn as a series of strokes relative to the origin. This works great for static content, but for things that move in the scene, this is sort of a pain. It means that some nodes in the scene are offset from their origin while other nodes added to the scene might not be. This means you can't query the position of all nodes in the same way.

This problem is in part due to the incoherent way JavaFX deals with coordinates, but I'll get into that in another post. What I want to show you here is a way to take a node within a node tree created from fxz content and normalize its position. The following code snippet does just this.

public function offsetFromZero(node:Node):Group {
    var xOffset = node.boundsInParent.minX + node.boundsInParent.width/2.0;
    var yOffset = node.boundsInParent.minY + node.boundsInParent.height/2.0;

    var parent = node.parent as Group;
    var index = Sequences.indexOf(parent.content, node);

    removeFromParent(node);

    node.translateX = -xOffset;
    node.translateY = -yOffset;

    var group = Group {
        translateX: xOffset;
        translateY: yOffset;
        content: node;
    }
    insert group before parent.content[index];

    return group;
}

The node passed in is located in some node tree; its location relative to its parent is recorded based on its bounds. Its index in the content of its parent is recorded as well. The node is then removed from its parent and translated to the origin of its parent. The node is then wrapped in a new Group, and the Group is then translated by the original offset of the node. Lastly, the group is inserted back into the parent's content at the index that the original node held.

In this way the visual location of the node is not changed, but you now have a node which reports its translateX and translateY relative to the origin of the entire fxz node. This allows the application to use a single API for working with the location of nodes. The disadvantage is that it adds another node to the scene, which might contribute to a performance problem.

Cloning

It seems to me that the work flow from Illustrator to Netbeans is good, but could be better. Besides the naming issues I pointed out here, there are some other particulars that need addressing. There are basically two choices when it comes to organizing your Illustrator files. The first is to just create a single file with all of your assets in it, then rely on your JavaFX code to position and hide nodes that are not used at a particular time. The other option is to break out all of your content into separate files and let your application do the work of positioning everything. There are disadvantages and advantages to both strategies.

Putting all of your assets into one file makes it simple to use Illustrator to do your layout. This is really important, since it is much faster to lay out stuff in Illustrator, and the end result looks and works better. But it makes things difficult for dynamic content. Say you are creating a game with power-ups in it. If the number of power-ups on the screen at a given time is dynamic, there is no easy way of creating new ones, as the content in the fxz file contains some fixed number of power-up nodes, usually one. The only way to create more is to create an entire new fxz object and pull out the power-up, probably throwing away the rest of the objects. This is inefficient, and creates weird patterns in your code.

When using multiple files you lose the ability to do a lot of the layout in Illustrator, and the content is also harder to maintain. But this strategy makes it very simple to create multiple instances of nodes; you simply create a new fxz node and add it to your scene.

I think a partial solution to this would be if JavaFX supported an easy way to clone a node. This would allow you to create a single Illustrator file and the application could just clone items as needed. Being able to clone nodes would be handy even without the Illustrator to JavaFX work flow. I think this is a missing feature and would love to see it included.

Friday, September 18, 2009

Wish list for converting Adobe Illustrator Files to JavaFX

When developing JavaFX applications, much of the graphics work can be done in Adobe Illustrator and exported to a format friendly to JavaFX. This is done by using the JavaFX 1.2 Production Suite. Recently a friend at Sun pointed me to a link which shows which features in Adobe's tools are supported.

It is nice to have some clarification about which features are supported and which are not. Until I looked at this document I was pretty frustrated with the guesswork involved in getting content to export in a reasonable way. But now I can just look up what works. Excellent.

Naming Nodes

This brings me to another struggle I have with the production suite, which involves how nodes are named. Currently you can name a node, say "jfx:ball", and when you export your file for JavaFX, one of the nodes that gets exported has its id set to "ball". This allows you to pull out particular nodes in the scene. This works great for a lot of use cases; it allows your application setup code to find nodes and perform operations on them, like animating them or making them a button.

The trouble comes when you start to use Illustrator or Photoshop to create more complex scenes, with more complex interactions between the nodes. For example, let's say you are implementing a game like Super Mario Brothers. It makes a lot of sense to use Illustrator to put each level together; it allows rapid placement of content and lots of fine tuning. The trouble is how to identify which layers in the Adobe file should be bricks, or coin bricks, or even goombas.

Right now you could go through and name each brick, but you have to give them all unique names, you have to say "jfx:brick1", "jfx:brick2", etc. This is error prone and a lot of work. Ideally you would want to be able to just create a brick layer, and then just copy it and place it without thinking about it.

So if we consider a developer/designer work flow, I, the developer, would create a file containing one of each actor in the scene. An actor is something my code has to know about. The designer can then create a new scene by copying the basic building blocks I provided. They can also add other content which are not actors; this would be backgrounds and other decorations. Interested designers could also look at my example content and create their own. Maybe on level 6 the coin bricks have a different look; they would be able to adapt the naming system to create the actors they require.

In this way, there is a streamlined designer/developer work flow. This is really important to creating excellent applications, because less time spent making it just work means more time spent on making it cool, hence cooler applications :)

Extension to the Naming System
To enable this work flow I suggest changing how the naming system works for the production suite. We should be able to specify a class for a given node. I think it could work like this. First, start with how it works now:

jfx:NODE_ID

This simply says: set this node's id to NODE_ID. This is perfect for singletons in the scene. But now, if you want to specify a node to be a particular class, you would say:

jfx:[com.mygame.Brick]

Where Brick is a mixin class defined someplace else in your project. The production tool would then create a new class, which would look something like:

class Group_Brick extends Group, Brick {
    //no additional implementation.
}

Then each time the export tool finds a layer which should be a node of type Brick, it uses the class Group_Brick instead of Group to represent that node. If the node which is to be a brick would normally be of a type other than Group, say SVGPath, then another class like the following could be created.

class SVGPath_Brick extends SVGPath, Brick {
    //no additional implementation.
}

In this case, maybe some Bricks are SVGPaths and some are Groups. But your code might not care, since both SVGPath and Group are Nodes.

If you wanted a node to be of two classes, you would name the layer:

jfx:[com.mygame.Brick,com.mygame.HasCoin]

This would cause the production suite to produce a class like the following:

class Group_Brick_HasCoin extends Group, Brick, HasCoin {
    //no additional implementation.
}

In this way the node can be two things. Lastly, suppose you wanted a node to have its id set and be of a particular class, you would then name the node like this:

jfx:startspot[com.mygame.Brick]

This would create a node which was whatever type the layer should be (Group, SVGPath, Circle, etc) and also of type Brick, and have the id property set to 'startspot'.

Of course some tools on the JavaFX side must be created. It would be nice to say 'give me all nodes in this tree which are of a particular type', something like the following.

function getAll(clazz:Class, group:Group):Node[];

I am sure the above method can be implemented with little trouble.

Drawback
One drawback to this approach is that the exported fxz files would not be self-sufficient; they could not be rendered unless classes like Brick and HasCoin are present. But this is an extension to the current system, so people would not be required to use it. The preview tool in Illustrator could also just provide empty implementations for all unknown classes, which would allow it to at least draw the content in a static way.
Another drawback is that there might be some complexity when creating nodes with multiple classes, naming conflicts and that sort of thing. But even if only one class could be specified, inheritance would take care of most of the use cases. In the above example, Brick and HasCoin are separate classes. It might be possible to simply create the class Brick and the class CoinBrick, and have CoinBrick extend Brick. That being said, I think it best to allow nodes to have many, many classes, as it cuts down on the complexity of the inheritance model.

Conclusion
In short, this modification would allow us to specify that a layer in an Adobe file is of a particular type as well as have a name. This would turn Illustrator and Photoshop into very powerful tools for creating complex content in a JavaFX application.

Monday, August 17, 2009

Java Store Affiliate Program - Sales As A Service

I always wanted to write and sell my own software. Even though I have a successful career as a Java developer I am sick of making money for other people. Over the last 4 or 5 years the cost of running a one or two man software company has plummeted. This has a lot to do with the increase in productivity modern developer tools provide as well as the large number of internet services which solve real business problems. For example, by using Google Apps the cost of email, basic web site hosting and a calendar is the cost of registering your domain, 11 bucks, or whatever it is.

Another service provided at very low cost is distribution and billing, as can be seen with iTunes and hopefully the Java Store. However, this model is not perfect, as many high quality apps never make any real money for their developers. This is because many developers simply lack the money or the "know how" to promote their application. I, for example, lack both the money and the "know how". Consider my two part time attempts to create an internet business, QuackDuck and ClayWare Games: you have never heard of them, and this is because they have never been promoted properly. But I think I know a solution to this.

I recently started messing with the Amazon affiliate program, a program which rewards people for driving sales on Amazon.com. For each sale, you earn between 4-15% of the sale. Not bad if you can get people to the store for a reasonable amount of money. I found I did not have the time or the money to figure out how to do this well, but it made me realize that Amazon is consuming a sales service in much the same way as I use Google for an email service. Much to my surprise, I was the one providing the service to Amazon.

So what I am suggesting is an affiliate program for the Java Store, but not just an affiliate program like the one for Amazon. I want an affiliate program where the developers can set terms for potential affiliates, receive feedback, and generally provide sales as a service. Here are the features I would like to see:

Negotiate Percentage: The developers should have the power to set the percentage that the sales people receive. This should be more complex than just selecting a percentage number; it should include sliding scales for volume as well as a place where sales folks can make offers, even make guarantees on the number of units sold. Maybe some rates could be auctioned off.

Feedback from Sales: The sales people are going to know what customers want out of the next release or the next app. This should also include a bug report system, no software is perfect. There should be a way for this information to get back to the developers.

Media Files: Developers should provide media files, images, sounds, fonts, etc, that are used in the app so sales people can create ads or web pages with them. This could be as simple as a zip file to download.

Basically I want to see a store where professional sales people can find products that they think are worth something and sell those products. I want a store where I can post a good application and know there are people looking for good apps, not just for themselves, but because they want to help sell them. It seems to me that sales is the only thing missing for me to be successful as an independent developer.

Tuesday, June 16, 2009

Inkscape and JavaFX - Almost there

I was recently on a panel at JavaONE and a developer asked about support for open source tools and JavaFX. He was interested in exporting JavaFX content from Inkscape and possibly GIMP. Of course this is a perfectly reasonable idea. At the time, no one on the panel had a good answer for him. I did a little looking after the conference and found that Inkscape allows content to be saved in a format that JavaFX can work with, but with a number of limitations. Here are the details:

Requirements
I am on a Mac, so the instructions are Mac specific, but the basic steps should help anyone interested.

1) Install Macports
While Inkscape has an OS X installer, we require the devel version of Inkscape, and MacPorts makes it easy to install the devel version. Here is a link for MacPorts.

2) Install Inkscape
Open up a terminal and install Inkscape with macports.
sudo /opt/local/bin/port install inkscape-devel

(My installation of MacPorts is at /opt/local/bin, yours might be different.)

3) Launch Inkscape
On my machine Inkscape can be launched from the terminal by typing:
/opt/local/bin/inkscape

4) Install Netbeans
In case you are not set up for JavaFX development, you can download and install Netbeans with the JavaFX plugin here.

Creating Content
Once Inkscape is open draw some basics shapes. The following screen shot shows some sample content which uses a few different types of objects and gradients.



Go ahead and save the content in Inkscape's flavor of SVG. Inkscape crashed a few times on me, so it is a good idea to save your work in a format Inkscape is going to be happy with. Also save a copy of your content as a *.fx file. This file contains a JavaFX class which can be used in your JavaFX application. Here is a screen shot of my content.fx file open in Netbeans.
Look at all the errors!
[Update: Check the comments for an update on the status of the exported JavaFX code]

It seems that Inkscape is exporting a version of JavaFX which is no longer supported. It looks like some pre 1.0 version to me. Anyway, here are some basic tips to get the class to compile against JavaFX 1.2:
  1. Remove all bad import statements.
  2. Add the import statement 'import javafx.scene.shape.*;'
  3. Rename your class to start with a capital letter and make sure it is in the package you want it in.
  4. Remove the keyword 'private' from any method that uses it.
  5. Change CurveTo to CubicCurveTo.
  6. Remove the Frame at the end of the file.
  7. Add the keyword 'override' to the function create().
  8. Change the property 'transform' to 'transforms' for the class Group.
Once the class compiles, create a script to display the content; mine looks like this:
Stage {
    title: "Application title"
    width: 640
    height: 480
    scene: Scene{ content: [Content{}] }
}

And here is a screen shot:


That's not even close!

Looks like a number of features are yet to be implemented: the stroke paint was not honored, the radial gradient is missing, and the strokeWidth did not get set.

But in all honesty, this is a really good start, and it is excellent to see JavaFX included in Inkscape at all. If there are any developers who are familiar with JavaFX and the Inkscape code base, this would be a really nice feature to have improved.