# What is the most efficient way to draw lots of meshes?

edited December 2014 Posts: 11

I am currently working on an RTS game and I want to draw at least a few hundred figures on the screen: trees, terrain and units. At the moment I am using "for" loops, but when I draw a hundred trees the frame rate drops to 30 fps. Does anyone know a better way to draw a bunch of meshes at once?

• Posts: 791

Keep the number of meshes down to a minimum by using a sprite sheet approach. Here's a demo which draws 2000 objects as a single mesh. Each object is one of four possible images (the four corners of the Codea icon)

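``````
-- spritemesh

-- Use this function to perform your initial setup
function setup()
    displayMode(FULLSCREEN)
    -- Assumption: the sprite sheet is the Codea icon from the Cargo Bot pack
    img=readImage("Cargo Bot:Codea Icon")
    m=mesh()
    m.texture=img
    obj={}
    for i=1,2000 do
        table.insert(obj,{
            x=math.random(WIDTH),y=math.random(HEIGHT),a=math.random(360),
            size=10+math.random(30),spin=-5+math.random(100)/10,
            xspd=-3+math.random(7),yspd=-3+math.random(7),
            xcoord=(math.random(2)-1)/2,ycoord=(math.random(2)-1)/2})
    end
end

-- This function gets called once every frame
function draw()
    -- Rebuild the mesh from scratch each frame
    m:clear()
    -- This sets a dark background color
    background(40, 40, 50)
    for i,s in pairs(obj) do
        -- Add one rect per object; addRect returns the index of the new rect
        local id=m:addRect(s.x,s.y,s.size,s.size,math.rad(s.a))
        -- Use one quarter of the texture (one corner of the icon) for this rect
        m:setRectTex(id,s.xcoord,s.ycoord,0.5,0.5)
        s.a = s.a + s.spin
        s.x = s.x + s.xspd
        s.y = s.y + s.yspd
        -- Wrap around the screen edges
        if s.x>WIDTH then s.x=0 end
        if s.x<0 then s.x=WIDTH end
        if s.y>HEIGHT then s.y=0 end
        if s.y<0 then s.y=HEIGHT end
    end

    -- All 2000 rects are drawn with a single call
    m:draw()

end

``````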

Don't know what the FPS is, but I believe there is a bug in the current release of Codea which may slow things down.

• Posts: 688

Sprite sheets (up to 2048 pixels square) are definitely the way to go: texture changes are about the slowest thing you can do in OpenGL.

• Posts: 8,737

@West I added frame rate code to your code and on my iPad Air, 2,000 meshes ran at 28. I upped it to 5,000 and that ran at 11. I upped it again to 10,000 and it ran at 5. Curious to see how it runs when we get the next version that fixes the speed drop.

• Posts: 5,396

@dave1707 - that speed is linear: it says Codea can draw this sprite on your iPad about 55,000 times a second.

On my iPad3, I get an FPS of 17 for 2000 meshes (NB I am using a beta with the intended speed fix, which may give me an advantage).
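The "about 55,000 times a second" figure can be checked by multiplying each object count by its measured frame rate. A minimal sketch in plain Lua, using only the numbers quoted above (the names `samples` and `throughput` are made up for illustration):

```lua
-- Frame-rate measurements quoted in the thread: objects drawn and FPS
samples = {
    {count = 2000,  fps = 28},
    {count = 5000,  fps = 11},
    {count = 10000, fps = 5},
}

-- Rects drawn per second = rects per frame * frames per second
function throughput(sample)
    return sample.count * sample.fps
end

for _, s in ipairs(samples) do
    print(s.count, s.fps, throughput(s))
end
```

The three products come out between 50,000 and 56,000, which is where the rough 55,000 figure comes from.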

• Posts: 791

@dave1707, @Ignatz thanks for the info. You both talk about 2000 meshes - I thought it was 2000 objects/Rects (4000 triangles) in a single mesh but maybe my terminology is off.

• Posts: 5,396

@West - no, you're right, it's 2000 objects

• Posts: 11

Thank you everybody. This will help my game along quite a bit.

• Posts: 11

@West Is it possible to move a single figure using the translate function, without moving all of them?

• Posts: 791

The easiest way to do it with the above code would be something like:

``````
obj[17].x=obj[17].x+1

``````

which would move the 17th object to the right.

Do this outside the loop and you'll probably want to remove the other movement by deleting the following lines

``````
s.a = s.a + s.spin
s.x = s.x + s.xspd
s.y = s.y + s.yspd
``````

Not got my iPad at the moment so can't check.

• Posts: 688

@West - sorry to come late to the party... but... looking at your demo above, I think it would be a lot faster if you create the mesh once in the setup function instead of recreating it every frame, and then store the id returned from mesh:addRect in a table so that you can update the individual rectangles each frame instead. (In fact you probably don't even need to do that if all you're doing is adding rectangles: just use the required index in the mesh:setRect() call.)

Also replace the

``````
for i,s in pairs(obj) do
``````

with

``````
local s
for i=1,2000 do
    s = obj[i]
    ...
    ...
    m:setRect(i,...)
end
``````

Otherwise you'll have the overhead of 2000 function calls as Lua calls the pairs iterator, which is also quite slow.

If you're worried about adding and deleting objects on the fly during your game, just add enough rects to the mesh at the start for your worst case scenario and then use the rects as a pool and just set unused ones to 1x1 pixels and move them off screen.

That way your frame rate should be consistent regardless of how many objects you have moving around.
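The pooling idea above could look roughly like this. This is only a sketch: `FakeMesh` is a stand-in so the code runs outside Codea (in a real project you would pass a `mesh()` and rely on its `addRect`/`setRect` methods), and `RectPool` with its `acquire`/`release` methods is a hypothetical helper, not part of Codea.

```lua
-- FakeMesh is a stand-in so this sketch runs outside Codea
FakeMesh = {}
FakeMesh.__index = FakeMesh

function FakeMesh.new()
    return setmetatable({rects = {}}, FakeMesh)
end

function FakeMesh:addRect(x, y, w, h)
    self.rects[#self.rects + 1] = {x = x, y = y, w = w, h = h}
    return #self.rects            -- index of the new rect
end

function FakeMesh:setRect(i, x, y, w, h)
    local r = self.rects[i]
    r.x, r.y, r.w, r.h = x, y, w, h
end

-- Hypothetical pool: every rect is created once, 1x1 and off screen
RectPool = {}
RectPool.__index = RectPool

function RectPool.new(m, capacity)
    local pool = setmetatable({mesh = m, free = {}}, RectPool)
    for i = 1, capacity do
        pool.free[i] = m:addRect(-100, -100, 1, 1)
    end
    return pool
end

-- Take a rect from the pool; returns nil when the pool is exhausted
function RectPool:acquire(x, y, w, h)
    local id = table.remove(self.free)
    if id then self.mesh:setRect(id, x, y, w, h) end
    return id
end

-- Hand a rect back: shrink it and move it off screen again
function RectPool:release(id)
    self.mesh:setRect(id, -100, -100, 1, 1)
    self.free[#self.free + 1] = id
end
```

The mesh never grows or shrinks after `RectPool.new`, so the number of rects sent to the GPU stays fixed at the worst-case capacity.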

• Posts: 791

Hi @TechDojo - thanks for the pointers - will try it later. The example was butchered from a previous test of sprites vs meshes I had kicking about

• Posts: 11

@TechDojo Could you please post your code, where you can specify what models are moving?

• Posts: 688

@Holger_gott - I'll see if I can dig some out later, but in the meantime I'll make the changes to @West's code above, although I'm not able to test it... Here goes...

``````
-- spritemesh

-- Use this function to perform your initial setup
local numObjs = 2000
local obj
local objIDs = {}

function setup()
displayMode(FULLSCREEN)
m=mesh()
m.texture=img
obj={}
local id
for i=1,numObjs do
obj[i] = {x=math.random(WIDTH),y=math.random(HEIGHT),a=math.random(360),size=10+math.random(30),spin=-5+math.random(100)/10,xspd=-3+math.random(7),yspd=-3+math.random(7),xcoord=(math.random(2)-1)/2,ycoord=(math.random(2)-1)/2})

m:setRectTex(id,s.xcoord,s.ycoord,0.5,0.5)
objIDs[i] = id   -- not sure if this is required
end
end

-- This function gets called once every frame
function draw()
-- This sets a dark background color
background(40, 40, 50)
local s
for i=1,numObjs do
s = obj[i]

s.a = s.a + s.spin
s.x = s.x + s.xspd
s.y = s.y + s.yspd
if s.x>WIDTH then s.x=0 end
if s.x<0 then s.x=WIDTH end
if s.y>HEIGHT then s.y=0 end
if s.y<0 then s.y=HEIGHT end

m:setRect(objIDs[i],s.x,s.y,s.size,s.size)
-- m:setRect(i,s.x,s.y,s.size,s.size)    -- this *may* also work, not sure ????
end

m:draw()

end
``````

This is the basic idea. Be interesting to see what kind of speed difference this makes especially as the number of objects ramps up. @dave1707 any chance of you putting your FPS code in this and posting some stats?

• Posts: 8,737

@TechDojo Your version needs some work to get it to run.

• Posts: 791

Here's a working version

``````
-- spritemesh

-- Use this function to perform your initial setup
local numObjs = 2000
local obj
local objIDs = {}

function setup()
    displayMode(FULLSCREEN)
    -- Assumption: the sprite sheet is the Codea icon from the Cargo Bot pack
    img=readImage("Cargo Bot:Codea Icon")
    m=mesh()
    m.texture=img
    obj={}
    local id
    for i=1,numObjs do
        obj[i] = {
            x=math.random(WIDTH),y=math.random(HEIGHT),a=math.random(360),
            size=10+math.random(30),spin=-5+math.random(100)/10,
            xspd=-3+math.random(7),yspd=-3+math.random(7),
            xcoord=(math.random(2)-1)/2,ycoord=(math.random(2)-1)/2}

        -- Create the rect once; addRect returns its index in the mesh
        id = m:addRect(obj[i].x,obj[i].y,obj[i].size,obj[i].size,math.rad(obj[i].a))
        m:setRectTex(id,obj[i].xcoord,obj[i].ycoord,0.5,0.5)
        objIDs[i] = id   -- not sure if this is required
    end
end

-- This function gets called once every frame
function draw()
    -- This sets a dark background color
    background(40, 40, 50)

    for i=1,numObjs do
        local s
        s = obj[i]

        s.a = s.a + s.spin
        s.x = s.x + s.xspd
        s.y = s.y + s.yspd
        if s.x>WIDTH then s.x=0 end
        if s.x<0 then s.x=WIDTH end
        if s.y>HEIGHT then s.y=0 end
        if s.y<0 then s.y=HEIGHT end

        -- Update the existing rect in place instead of rebuilding the mesh
        m:setRect(objIDs[i],s.x,s.y,s.size,s.size,math.rad(s.a))
    end

    m:draw()

end

``````
• Posts: 8,737

I added my frame rate code and got the same values. 2,000 was 28, 5,000 was 11, 10,000 was 5. I tried 55,000 as @Ignatz suggested above and the frame rate was 1.

• Posts: 688

@dave1707, @West - thanks for fixing the code. I wasn't near my iPad, so I was coding blind. Personally, though, I'd still move the 'local s' outside of the for loop.

To be honest, I'm surprised at the speed timings; I'm assuming you're using the new fixed beta. So recreating a mesh every frame takes the same time as updating each rect?? When I get five minutes I'm going to try and create a kind of profiling framework so we can time these functions to get a better understanding of what's happening.

• edited December 2014 Posts: 152

Hi @TechDojo,

We wrote a profiler for Codea. I can't get on to GitHub at the moment to create a gist, so the code is included below...

It works on an 'object' level, e.g. a table that contains functions, or a class instance...

Basically it swaps out every function it finds into a wrapper that does timing and counts of calls and calculates averages etc. It's a little clunky as it allows a maximum of 10 parameters per function, could probably do some cleverer arg unpacking...

Usage is as follows, when game is running (from the console), or in code if you like:

``````
startProfiling(obj, delay)
stopProfiling()
``````

If you add a delay time in seconds then profiling will automatically stop and halt the game and report to the console...

Here are the helper functions, class definition is below:

``````
-- ------------------
-- Profiler Functions
-- ------------------
local profiler = nil
function startProfiling(obj, delay)
    if (profiler) then
        profiler:stop()
    end
    profiler = Profiler4Codea(obj)
    profiler:start()
    if(delay) then
        tween.delay(delay, function()
            stopProfiling()
            error("Stopping game for profiler results")
        end)
    end
end

function stopProfiling()
    if (profiler) then
        profiler:stop()
        local report
        print("TOTAL\r\n")
        report = profiler:report(Profiler4Codea.TotalTime)
        print(report)
        print("AVERAGE\r\n")
        report = profiler:report(Profiler4Codea.AvgTime)
        print(report)
        print("#INVOKED\r\n")
        report = profiler:report(Profiler4Codea.TimesInvoked)
        print(report)
    end
end
``````

Class definition:

``````
Profiler4Codea = class()

Profiler4Codea.TimesInvoked = "timesInvoked"
Profiler4Codea.TotalTime = "totalTime"
Profiler4Codea.AvgTime = "avgTime"

Profiler4Codea.Descending = "descending"
Profiler4Codea.Ascending = "ascending"

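local table_insert = table.insert

-- Registry of named profilers, keyed by name; used by globalStop and globalReport
local globalMetaData = {}

function Profiler4Codea:init(obj, name)

    assert(obj, "No object supplied")

    self.obj = obj
    if (type(obj) ~= "table") then
        error("Profiler4Codea:init: obj must be table or class: " .. tostring(self.obj))
    end
    -- For a class instance, wrap the functions held in its metatable
    local metaTable = getmetatable(self.obj)
    if (metaTable) then
        self.obj = metaTable
    end
    -- Per-function timing records, filled in by start()
    self.metaData = {}
    if (name) then
        if (not globalMetaData[name]) then
            globalMetaData[name] = {}
        end
        table_insert(globalMetaData[name], self)
    end
end

function Profiler4Codea:start()

    self.clockTime = os.clock()

    for name, member in pairs(self.obj) do
        local mType = type(member)
        if (mType == "function" and name ~= "init" and name ~= "draw") then
            if (not self.metaData[name]) then
                self.metaData[name] = {
                    origFunction = member,
                    totalTime = 0,
                    timesInvoked = 0,
                    func = name
                }
            end
            -- Swap in a wrapper that times each call to the original function
            self.obj[name] = function(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10)
                local elapsedTime = os.clock()
                local r1, r2, r3, r4, r5, r6, r7, r8, r9, r10 =
                    self.metaData[name].origFunction(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10)
                self.metaData[name].timesInvoked = self.metaData[name].timesInvoked + 1
                self.metaData[name].totalTime = self.metaData[name].totalTime +
                    (os.clock() - elapsedTime)
                return r1, r2, r3, r4, r5, r6, r7, r8, r9, r10
            end
        end
    end
end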

function Profiler4Codea:stop()
    -- Restore function pointers
    for name, meta in pairs(self.metaData) do
        self.obj[name] = meta.origFunction
    end
    return self:report()
end

function Profiler4Codea:report(sortKey, ascDesc, stringify)

    if (not self.clockTime) then
        return "No data"
    end

    sortKey = sortKey or Profiler4Codea.TimesInvoked
    ascDesc = ascDesc or Profiler4Codea.Descending
    if (stringify == nil) then stringify = true end

    local data = {}
    for name, meta in pairs(self.metaData) do
        -- Calculate average time so we can sort by if required
        if (meta.timesInvoked > 0) then
            meta.avgTime = meta.totalTime / meta.timesInvoked
        else
            meta.avgTime = 0
        end
        table_insert(data, meta)
    end

    if (ascDesc == Profiler4Codea.Descending) then
        table.sort(data, function(a, b) return b[sortKey] < a[sortKey] end)
    elseif (ascDesc == Profiler4Codea.Ascending) then
        table.sort(data, function(a, b) return a[sortKey] < b[sortKey] end)
    else
        error("Unknown sort order")
    end

    if (stringify) then
        local sb = {}
        table_insert(sb, "Profiler4Codea: Sample time: ")
        table_insert(sb, os.clock() - self.clockTime)
        table_insert(sb, "\r\n")

        table_insert(sb, "Sort key: ")
        table_insert(sb, sortKey)
        table_insert(sb, "\r\n")

        table_insert(sb, "Asc/Desc: ")
        table_insert(sb, ascDesc)
        table_insert(sb, "\r\n")

        -- One line per function: [name,totalTime,timesInvoked,avgTime]
        for k = 1, #data do
            local meta = data[k]
            if (meta.timesInvoked > 0) then
                table_insert(sb, "[")
                table_insert(sb, meta.func)
                table_insert(sb, ",")
                table_insert(sb, tostring(meta.totalTime))
                table_insert(sb, ",")
                table_insert(sb, string.format("%d", meta.timesInvoked))
                table_insert(sb, ",")
                table_insert(sb, string.format("%.6f", meta.avgTime))
                table_insert(sb, "]\r\n")
            end
        end

        return table.concat(sb)
    end
    return data
end

function Profiler4Codea.globalStop()

    for name, profilerList in pairs(globalMetaData) do
        for k = 1, #profilerList do
            local profiler = profilerList[k]
            profiler:stop()
        end
    end
end

function Profiler4Codea.globalReport(sortKey, ascDesc)
    -- Iterate over all object types
    local result = {}
    local instances = {}
    for name, profilerList in pairs(globalMetaData) do
        -- Iterate over each instance
        instances[name] = #profilerList
        for k = 1, #profilerList do
            local profiler = profilerList[k]
            -- Get data for instance
            local data = profiler:report(sortKey, ascDesc, false)
            -- Iterate over instance functions
            for j = 1, #data do
                local meta = data[j]

                if (not result[name .. "." .. meta.func]) then
                    result[name .. "." .. meta.func] = {
                        totalTime = 0,
                        timesInvoked = 0,
                        avgTime = 0
                    }
                end
                result[name .. "." .. meta.func].totalTime =
                    result[name .. "." .. meta.func].totalTime + meta.totalTime
                result[name .. "." .. meta.func].timesInvoked =
                    result[name .. "." .. meta.func].timesInvoked + meta.timesInvoked
            end
        end
    end
    -- Calculate average
    local final = {}
    for k, data in pairs(result) do
        if (data.timesInvoked > 0) then
            data.avgTime = data.totalTime / data.timesInvoked
            data.id = k
            table_insert(final, data)
        end
    end

    sortKey = sortKey or Profiler4Codea.TotalTime
    ascDesc = ascDesc or Profiler4Codea.Descending

    if (ascDesc == Profiler4Codea.Descending) then
        table.sort(final, function(a, b) return b[sortKey] < a[sortKey] end)
    elseif (ascDesc == Profiler4Codea.Ascending) then
        table.sort(final, function(a, b) return a[sortKey] < b[sortKey] end)
    else
        error("Unknown sort order")
    end

    local sb = {}
    table_insert(sb, "Profiler4Codea: ")
    table_insert(sb, "\r\n")

    for name, count in pairs(instances) do
        table_insert(sb, "Class: ")
        table_insert(sb, tostring(count))
        table_insert(sb, "\r\n")
    end

    table_insert(sb, "Sort key: ")
    table_insert(sb, sortKey)
    table_insert(sb, "\r\n")

    table_insert(sb, "Asc/Desc: ")
    table_insert(sb, ascDesc)
    table_insert(sb, "\r\n")

    -- One line per function: id,totalTime,timesInvoked,avgTime
    for k = 1, #final do
        local meta = final[k]
        if (meta.timesInvoked > 0) then
            table_insert(sb, meta.id)
            table_insert(sb, ",")
            table_insert(sb, tostring(meta.totalTime))
            table_insert(sb, ",")
            table_insert(sb, string.format("%d", meta.timesInvoked))
            table_insert(sb, ",")
            table_insert(sb, string.format("%.6f", meta.avgTime))
            table_insert(sb, "\r\n")
        end
    end
    return table.concat(sb)
end
``````
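On the "cleverer arg unpacking" point: Lua varargs could remove the 10-parameter limit. The following is only a sketch of that idea in plain Lua, not the profiler's actual code; `wrapWithTiming` and the `stats` table layout are hypothetical names chosen to mirror the `totalTime` and `timesInvoked` fields used above.

```lua
-- Records one call in stats, then passes every result through unchanged
local function record(stats, startTime, ...)
    stats.timesInvoked = stats.timesInvoked + 1
    stats.totalTime = stats.totalTime + (os.clock() - startTime)
    return ...
end

-- Hypothetical helper: returns a wrapper around fn that accepts any number
-- of arguments and returns any number of results
function wrapWithTiming(fn, stats)
    return function(...)
        local startTime = os.clock()
        return record(stats, startTime, fn(...))
    end
end
```

Because the results travel through `record` as varargs, trailing `nil` arguments and return values are preserved without needing `unpack`.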
• Posts: 688

@brooksie and this is why I love this forum! Thanks

• Posts: 509

I got the same averages as @dave1707 on my iPad 4 using the latest beta.

However, I decided to look a little closer at the numbers being produced and discovered that the variance in the framerate is quite large. The framerate ranges from about half the average to about twice. This results in extremely choppy animation.

If your mesh rectangles are following a deterministic path (as they are in @West's code), it is far, far more efficient to use a shader to update their trajectory. Then you need to pass in a load of initial data but at each draw cycle you only pass in the elapsed time. The shader then computes the updated position. Moreover, as this is happening on the GPU, it can be done in parallel rather than in a single thread on the CPU. Not only is this far, far faster it also results in much smoother animation.

For example, using my explosion shader (which does the same: moves and rotates rectangles), I get a framerate of 30 with 27,000 rectangles. At 55,000 rectangles, my framerate is 20. At 110,000 the framerate is 11. As with the code here, once it starts going down then it is inversely proportional to the number of rectangles but the number of rectangles needed before it starts going down is far higher.

(Incidentally, @Ignatz's terminology is incorrect. A linear relationship is described by an equation of the form `y = m x + c` and this is not. Rather, it is inversely proportional in that the number of rectangles times the framerate is roughly constant. You could say that the relationship between the number of rectangles and the time taken for each frame to render is linear but "time taken" is the reciprocal of the framerate which was the quantity being discussed.)

So if you can, shift the mesh's movement into a shader. An explanation of my explosion shader can be found at http://loopspace.mathforge.org/HowDidIDoThat/Codea/Shaders/.
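To illustrate what "deterministic path" means here, the sketch below computes a rect's state purely from its initial data and the elapsed time, with nothing carried over between frames. It is written in plain Lua for illustration only; the real work would be done in the vertex shader described at the link. `SCREEN_W`, `SCREEN_H`, `stateAt` and the field names are assumptions, and the modulo wrap is a simplification of the edge handling in @West's code.

```lua
-- Hypothetical screen size; in Codea these would be WIDTH and HEIGHT
SCREEN_W, SCREEN_H = 1024, 768

-- State of one rect at elapsed time t, computed from its initial data only.
-- No per-frame state is stored, so the same formula can run on the GPU
-- for every rect in parallel.
function stateAt(init, t)
    local x = (init.x0 + init.vx * t) % SCREEN_W    -- wrap horizontally
    local y = (init.y0 + init.vy * t) % SCREEN_H    -- wrap vertically
    local angle = (init.a0 + init.spin * t) % 360   -- degrees
    return x, y, angle
end
```

With this formulation the only value that changes from frame to frame is `t`, which matches the description of passing in just the elapsed time at each draw cycle.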

• Posts: 152

Hi @LoopSpace,

Regarding:

> ...it is far, far more efficient to use a shader to update their trajectory. Then you need to pass in a load of initial data but at each draw cycle you only pass in the elapsed time. The shader then computes the updated position.

How do you pass arbitrary data in? Is it just a set of numeric variables, or do you 'spoof' up a texture and read results out of eg rgba values?

Just curious...or does the shader now support table/array data?

@Brookesi

• Posts: 509

@Brookesi Take a look at the link I posted. That contains the details of how to pass this information through.

• Posts: 8,737

@TechDojo @LoopSpace I'm still running the slow version of Codea. I expect the fixed version will result in a speed increase of about 3 times. I tried doing a more instant timing and found that the time varied a lot from frame to frame. That's why I used an average time over the whole run, it's easier to get a fixed value. I use the total number of draw cycles divided by the total of DeltaTime.
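A whole-run average like the one described (total draw cycles divided by total DeltaTime) could be kept with two running totals. This is a sketch, not @dave1707's actual frame rate code; `FrameStats` is a hypothetical name, and in Codea you would call `FrameStats.record(DeltaTime)` once per `draw()`.

```lua
-- Running totals for a whole-run average
FrameStats = {frames = 0, elapsed = 0}

-- Call once per frame with that frame's duration in seconds
function FrameStats.record(dt)
    FrameStats.frames = FrameStats.frames + 1
    FrameStats.elapsed = FrameStats.elapsed + dt
end

-- Average FPS = total draw cycles / total elapsed time
function FrameStats.average()
    if FrameStats.elapsed == 0 then return 0 end
    return FrameStats.frames / FrameStats.elapsed
end
```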

• Posts: 688

Hmm - as a lot of those triangles would be passing over each other as they move, I wonder if the GPU is doing some clever stuff ignoring overdrawn pixels, so that the actual draw time could fluctuate. Alternatively, it might be down to the rotation (matrix calc) taking effect. It might be interesting to stop the rotation and just use a fixed angle to see if that makes a difference.

@dave1707 - I actually really noticed the slowdown for the first time yesterday, I was playing with an old fractal landscape demo, where most of the code is done in setup to actually generate the mesh and then each frame it simply repositions the camera. What I noticed is that the initial startup time took a lot longer (so much I'd initially thought the demo had crashed on the new build) but when actually running the difference was minimal (if anything). So I guess any timings on processor intensive operations should be ignored until the new version is released.

@LoopSpace - thanks for sharing the shader code, I'm still trying to get my head around your perspective-correct shader.

• edited December 2014 Posts: 509

@dave1707 I ran West's code on my iPad and got the same figures as you did. So the speed-up from passing to shaders is entirely down to passing to shaders and not to being on different betas. Also, while average fps gives a reasonable overview, looking at variation can be important too. I looked at the time taken per frame and saw that it jumped a lot, so looked at the minimum and maximum time over the last ten frames as well. That's where I saw that it varies from half to double the average.
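Tracking the minimum and maximum frame time over the last ten frames could be done with a small ring buffer. A minimal sketch in plain Lua; the function names are hypothetical, and in Codea the value pushed each frame would be `DeltaTime`.

```lua
-- Hypothetical helper: remembers the last `size` frame times
function newFrameWindow(size)
    return {size = size, times = {}, next = 1}
end

function pushFrameTime(win, dt)
    win.times[win.next] = dt             -- overwrite the oldest slot
    win.next = win.next % win.size + 1   -- advance, wrapping after `size`
end

-- Returns the shortest and longest frame time currently in the window
-- (math.huge, 0 if nothing has been recorded yet)
function frameTimeRange(win)
    local lo, hi = math.huge, 0
    for _, dt in ipairs(win.times) do
        if dt < lo then lo = dt end
        if dt > hi then hi = dt end
    end
    return lo, hi
end
```

Comparing `lo` and `hi` against the average frame time is one way to see the "half to double" spread mentioned above.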

• edited December 2014 Posts: 5,396

@TechDojo - In my 3D work I've noticed (the obvious) that the more pixels need colouring, the slower it is, so I wondered if that had an effect here because the screen is so crowded.

I tried simply restricting the images to 1/4 of the screen, so 3/4 was blank, and it had no effect on speed whatsoever, which surprised me, because another thing I learned from 3D is that OpenGL is extremely efficient at culling unseen vertices, and when you restrict screen space, that should mean fewer visible vertices.

• Posts: 688

@Ignatz - From my readings of OpenGL I think it can detect (possibly through the use of a z buffer) if the pixels have already been drawn and then not draw them again - I remember something about rendering semi-transparent triangles and making sure that they are drawn in the correct Z order. This would obviously be beneficial if the triangles were pre-sorted in Z.

I worked for a company many years ago that created an arcade board that was very good at rasterising spans of pixels for objects across each scanline, ensuring that there was no overdraw. It was very fast and particularly good at scaling sprites (à la Afterburner and Outrun), but semi-transparency was a real issue then (I don't think it was supported, but then it was 1993).