Finally a project came along that provided a nice excuse to take that PolySmooth filament out of its shiny metal foil pouch.
Recently got one of those really nice Lego Floral kits, and they had taken over the real flower vase at home. So we figured they needed their own vase. After a while searching, I found a very reasonable looking Milk Jug:
A few tweaks in Blender to make the bottom flat and thicken the walls to 0.8mm for printing. I didn’t go with Vase mode since the handle makes the model unsuitable for it.
5 hours later, plus some processing to remove supports, and it’s done. I ended up really liking the shimmery surface, so I didn’t use alcohol to polish it, but I’ll try a second print with single 0.6mm walls to try to get a smooth and hopefully fully transparent finish.
This is a project that has been going on for some time. I figured it was time to learn Blender since I no longer have access to a 3D Studio Max license. What better way than by designing and printing all the parts of a functional little robot mascot!
This is the reference image I used. Basically tried to replicate the body shape and moving parts. The front face/mask was somehow really tricky to dial in to get the same feeling.
After designing the parts, I printed them on my Cetus 3D. This was lots of fun and I learned a lot about the limits of FFF printing.
I ran into problems with the hinges. These being the first 3D printed hinges I’ve ever created (and really just pushing through with Blender), I made a lot of mistakes, so I ended up having to use metal rods for some bits and more glue than I originally intended.
This stayed in a drawer for a while until I figured out how I was going to wire all the internals. I used the following:
Raspberry Pi Zero W
TFT LCD
Adafruit mono audio amp
PWM expansion board
3 servos
3 RGB LEDs
USB Microphone
I’ve got some code up and running already to create some rudimentary facial expressions, servo motor control and RGB colors. I’ve also been experimenting with a few things to get voice recognition working.
So, it’s been a long time since this project was started. Life has a strange way of getting in the way of hobbies, and sometimes we just can’t get enough time to finish things off. Thankfully, I had a chance a couple of weeks ago to finish off the last pieces I wanted to do, and finally we have a real 3D rendered object on the screen, with real faces and even a simple texture. Last time, this is what the architecture of the GPU looked like:
The rasterizer and shader were still not written or tested, and there was a camera space transform block missing in the pipeline as well. To speed things up a bit, I decided to drop the camera space transform block entirely. To be fair, that can be done in software and bundled into the world space transform + rotation block. That leaves two more blocks: the rasterizer and the shader.
For the rasterization process I tried several things, most of which ended up being bloated and/or not appropriate for implementation in an FPGA. I settled on simply finding out whether we were on the right side of the equations of three lines (the sides of the triangle). One of the rules I had when I started this project was that I would not look up other architectures or publicly available papers that would skew my design. It turns out that what I chose is pretty much what Juan Pineda proposed in 1988 in his paper “A Parallel Algorithm for Polygon Rasterization”.
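As a rough software illustration of that "right side of three lines" test (not the HLS code itself), here is a minimal sketch. The `Point2D` type and the function names are made up for this example, and it assumes the vertices are ordered so that the inside of the triangle falls on the positive side of every edge:

```cpp
#include <cstdint>

// Hypothetical 2D integer point in screen space (illustrative name).
struct Point2D {
    int32_t x;
    int32_t y;
};

// Edge function for the directed line a -> b, evaluated at p.
// The sign says which side of the line p is on; zero means p is on the line.
int64_t edge_function(const Point2D& a, const Point2D& b, const Point2D& p) {
    return static_cast<int64_t>(b.x - a.x) * (p.y - a.y)
         - static_cast<int64_t>(b.y - a.y) * (p.x - a.x);
}

// A pixel is covered when it is strictly on the positive side of all three edges.
// Assumption: the vertex order puts the interior on the positive side.
bool inside_triangle(const Point2D& v1, const Point2D& v2, const Point2D& v3,
                     const Point2D& p) {
    return edge_function(v1, v2, p) > 0 &&
           edge_function(v2, v3, p) > 0 &&
           edge_function(v3, v1, p) > 0;
}
```

Each edge function is linear in x and y, so moving one pixel to the right or one line down changes its value by a constant. That is what lets the two multiplications per edge be replaced by running additions.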
The awesome thing about this algorithm is that, once optimized and after solving some setup equations, the complexity boils down to three sums per pixel, plus another three sums per line:
// Scan through bounding rectangle
for (ap_uint<10> y = ymin; y < ymax; y++) {
    int cx1 = cy1;
    int cx2 = cy2;
    int cx3 = cy3;
    bool done = false;
    for (ap_uint<10> x = xmin; x < xmax; x++) {
#pragma HLS PIPELINE II = 2
        color_map_t a, b, c;
        barycentric(x, y, p1.x, p2.x, p3.x, p1.y, p2.y, p3.y, &a, &b, &c);
        if (cx1 > 0 && cx2 > 0 && cx3 > 0) {
            output_pixel.depth = 1;
            output_pixel.u = a*u_a + b*u_b + c*u_c;
            output_pixel.v = a*v_a + b*v_b + c*v_c;
            output_pixel.x = x;
            output_pixel.y = y;
            raw_pixel_out << output_pixel;
        }
        cx1 -= dy12;
        cx2 -= dy23;
        cx3 -= dy31;
    }
    cy1 += dx12;
    cy2 += dx23;
    cy3 += dx31;
}
The code shown above is the HLS code at the heart of the rasterizer. As pixels are produced, their barycentric coordinates are also generated. These are used to interpolate the texture coordinates that the shader module consumes. The shader itself has very simple code. It essentially just looks up the right address of the texture in DDR memory (using nearest neighbor interpolation) and uses that as the final color.
Nothing fancy, but looking up data in DDR involves an AXI4-Full master interface, so that ends up consuming a fair amount of logic anyway.
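To give an idea of what that lookup amounts to, here is a small sketch of the nearest neighbor address calculation in plain C++. The `Texture` struct, the row-major layout and the use of floating-point coordinates normalized to [0, 1] are all assumptions for illustration; the real block works on the types in the HLS code and reads through the AXI master.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical texture description; field names and row-major layout are assumptions.
struct Texture {
    uint32_t width;
    uint32_t height;
    uint32_t bytes_per_texel;
};

// Map a normalized coordinate in [0, 1] to the nearest texel index in [0, size - 1].
uint32_t nearest_texel(float coord, uint32_t size) {
    if (coord < 0.0f) coord = 0.0f;  // clamp out-of-range coordinates
    if (coord > 1.0f) coord = 1.0f;
    uint32_t idx = static_cast<uint32_t>(coord * static_cast<float>(size));
    return idx >= size ? size - 1 : idx;  // coord == 1.0 lands on the last texel
}

// Byte offset of the texel from the start of the texture in memory.
size_t texel_offset(const Texture& tex, float u, float v) {
    uint32_t tx = nearest_texel(u, tex.width);
    uint32_t ty = nearest_texel(v, tex.height);
    return (static_cast<size_t>(ty) * tex.width + tx) * tex.bytes_per_texel;
}
```

Adding this offset to the base address of the texture gives the address to read. No neighboring texels are fetched or blended, which keeps it to a single memory read per pixel.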
I wrote a little code to convert a bitmap to a C array and loaded it on the ARM processor. The processor initializes the array in DDR during boot up so things *should* just work as long as the texture coordinates are input correctly. The block diagram now looks like this:
Sorry for the long delay between posts. Being a hobby project, this usually takes low priority when other things are urgent. And actually, there have been many advances lately but I haven’t had the chance to post them. Let’s go through these in order.
Although we have a fair point cloud rendering in place it really is not that useful unless we can move and rotate objects. For that purpose we have the object transform block highlighted below in red:
The idea is that we can have a list of object specific parameters in a BRAM that the block will use to alter the vertex stream:
number of vertices in the object
position
rotation
The block will take in vertices from the vertex pump, rotate them around the object’s axis and then add the object’s position as an offset to each vertex. The position part is actually quite simple since we only have to do an addition on each axis. The rotation part, well… not so much.
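For reference, the arithmetic behind both steps can be sketched in plain floating-point C++. This is only a software model: the `ObjectParams` layout, the use of Euler angles in radians and the X, then Y, then Z rotation order are assumptions made for the example, not necessarily what the hardware block does.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Hypothetical per-object record mirroring the parameters listed above.
struct ObjectParams {
    unsigned vertex_count;
    Vec3 position;
    Vec3 rotation;  // angles in radians about X, Y and Z (assumption)
};

// Standard rotations about each axis.
Vec3 rotate_x(Vec3 v, float a) {
    float c = std::cos(a), s = std::sin(a);
    return {v.x, v.y * c - v.z * s, v.y * s + v.z * c};
}

Vec3 rotate_y(Vec3 v, float a) {
    float c = std::cos(a), s = std::sin(a);
    return {v.x * c + v.z * s, v.y, -v.x * s + v.z * c};
}

Vec3 rotate_z(Vec3 v, float a) {
    float c = std::cos(a), s = std::sin(a);
    return {v.x * c - v.y * s, v.x * s + v.y * c, v.z};
}

// Rotate around the object's own origin first, then add the position offset.
Vec3 transform_vertex(Vec3 v, const ObjectParams& obj) {
    Vec3 r = rotate_z(rotate_y(rotate_x(v, obj.rotation.x), obj.rotation.y),
                      obj.rotation.z);
    return {r.x + obj.position.x, r.y + obj.position.y, r.z + obj.position.z};
}
```

The translation is three additions, while each axis rotation needs a sine, a cosine and four multiplications, which is where the cost shows up once this has to be turned into logic.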
After successfully running the simulation, it’s time to see how the rendering works on the real HW. And as always, SW needs to be written to get the HW to know what to do. In this case, I will be loading the vertex data of a cube into memory to see it transformed, and then experiment with the same teapot data that I used in the C# version of the app.
So, this is something I did a long time ago. I’m probably looking at this with nostalgia goggles, so bear with me. I had seen lots of people do Touhou characters in Minecraft, but always in 2D!! Though awesome in itself, the whole point of Minecraft is to do things in 3D, so I got to work on voxelizing a 3D model of Flandre Scarlet that I had lying around. After importing the voxels into Minecraft I then replaced the blocks (one by one, inside the game) with the right colored blocks. Took forever, but definitely worth it. Here it is:
In order to see if the design is working before committing to a full build in the FPGA, I wanted to simulate it to see if it could render just a few pixels and return sensible pixel locations. There are of course lots of different complicated ways of doing this, some quite elaborate, but I just wanted to functionally verify the design in the shortest amount of time (this is a hobby project after all). So, I opted to use Xilinx BFMs. These are IP cores that can generate different kinds of traffic on AXI buses. Here’s the testbench that I created:
Well, due to being very busy at work I hadn’t had a chance to actually post progress on the project, but we most definitely have progress! If you have been following these posts, you can see that last time we sketched out the overall architecture of the video card.
In order to render point clouds we only need the three blocks that are highlighted. Basically, a mechanism for pulling in raw vertexes from memory, a block that can transform the 3d points to a 2d screen space, and a block that can take those points and draw them on a frame buffer. So, here they are!
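Before looking at the hardware blocks, here is a small software model of what those three stages do together, written as a sketch in plain C++. The simple pinhole projection with a `focal` scale factor, the `FrameBuffer` struct and the function names are assumptions for illustration only, not the actual design.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical frame buffer: row-major, one 32-bit color per pixel.
struct FrameBuffer {
    int width;
    int height;
    std::vector<uint32_t> pixels;
    FrameBuffer(int w, int h)
        : width(w), height(h), pixels(static_cast<size_t>(w) * h, 0) {}
};

// Project a 3D point to 2D screen space with a perspective divide (assumed model).
// Returns false when the point is behind the camera or falls off the screen.
bool project(const Vec3& v, float focal, const FrameBuffer& fb, int& sx, int& sy) {
    if (v.z <= 0.0f) return false;
    float px = focal * v.x / v.z + fb.width * 0.5f;   // center the origin on screen
    float py = fb.height * 0.5f - focal * v.y / v.z;  // screen y grows downwards
    if (px < 0.0f || py < 0.0f || px >= fb.width || py >= fb.height) return false;
    sx = static_cast<int>(px);
    sy = static_cast<int>(py);
    return true;
}

// Pull each vertex, project it, and plot one pixel per visible vertex.
int draw_point_cloud(const std::vector<Vec3>& vertices, float focal,
                     uint32_t color, FrameBuffer& fb) {
    int drawn = 0;
    for (const Vec3& v : vertices) {
        int sx = 0, sy = 0;
        if (project(v, focal, fb, sx, sy)) {
            fb.pixels[static_cast<size_t>(sy) * fb.width + sx] = color;
            ++drawn;
        }
    }
    return drawn;
}
```

The vector of vertices stands in for the block that pulls raw vertices from memory, `project` for the 3D to 2D transform, and the write into `pixels` for the block that draws on the frame buffer.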
So, the time has finally arrived. Time to tackle the GPU in HW! So, a quick disclaimer: since this is a hobby project I will use HLS to quickly iterate designs and reach a functional RTL. All of the blocks will be designed considering that they are meant for RTL and (given enough time) could be replaced by hand coded VHDL/Verilog without too much hassle. This is the architecture that I am envisioning:
So, last week we got a basic projection algorithm in place. We “rendered” the vertices of the cube into a bitmap, but we barely got to see it working. We definitely need something more complicated to see it operating. One option is to just try to hard code a larger list of vertices that describe a more complex object, but doing that by hand is definitely cumbersome, inexact and prone to errors. Instead, I decided to rely on the vast world wide web and find several 3D objects that I could use. It turns out that there are millions of such objects on many, many websites, but they come in different formats. After some hunting, I settled on using the .obj format. These are the reasons:
Plain ASCII format: Can’t beat this when it comes to ease of parsing