AS3.0 array vs. list speed test.

  • 2,574 Views
  • 17 Replies
New Topic Respond to this Topic
Fickludd
Fickludd
  • Member since: Feb. 28, 2004
  • Offline.
Forum Stats
Member
Level 15
Blank Slate
AS3.0 array vs. list speed test. Posted May. 13th, 2008 @ 03:31 PM Reply

This is the result of this thread about particles.

So the question was which data structure is faster in AS3 when iterating through the entire set, the array or the list. Many, including me, where of the opinion that an array should be faster on account of the horrible system cache behaviour of lists, even though information theoretically they both have linear iteration time. The other opinion was that the ability to type values in lists would make them faster.

The test:
I started with two setups, timing first the precaching of the sets, and then iterating through them and getting the one number stored.

Those two sets where:
<ul>
<li>An array with objects
I chose to have an array with objects since that seems like the most useful and usual way to have things if I want to store more than a single value in each object.
The objects:

package {
    public class SpeedTestObj {
        public var nbr:Number;
    }
}

</li> <li>A list with objects

package {
	public class SpeedTestObjList {
		public var nbr:Number;
		public var next:SpeedTestObjList = null;
	}
}

</li>
</ul>

With these I set up the follow testing code:

import flash.utils.getTimer;
var timer:uint;
var nbrOfObjects:uint = 1000000;
var i:int;
var s:Number;

timer = getTimer();
var objArr:Array = new Array(nbrOfObjects);
var obj:SpeedTestObj;
for (i=0; i<nbrOfObjects; i++) {
	obj = new SpeedTestObj();
	obj.nbr = i;
	objArr[i] = obj;
}
trace("creating "+i+" object array: "+(getTimer()-timer));

timer = getTimer();
var objList:SpeedTestObjList = new SpeedTestObjList();
var first:SpeedTestObjList = objList;
first.nbr = 0;
for (i=1; i<nbrOfObjects; i++, objList = objList.next) {
	objList.next = new SpeedTestObjList();
	objList.next.nbr = i;
}
trace("creating "+i+" object list: "+(getTimer()-timer));

timer = getTimer();
for (i=0; i<nbrOfObjects; i++) {
	s = objArr[i].nbr;
}
trace("getting Number from "+i+" objects in array: "+(getTimer()-timer));

timer = getTimer();
objList = first;
while (objList != null) {
	s = objList.nbr;
	objList = objList.next;
}
trace("getting Number from "+i+" objects in list: "+(getTimer()-timer));

The result? (times in ms)

creating 1000000 object array: 520
creating 1000000 object list: 472
getting Number from 1000000 objects in array: 129
getting Number from 1000000 objects in list: 16

Earth shattering victory for the list! There's no arguing against those numbers.

But wait a moment, the list is coded for this exact application while the array is übergeneric?
New hypothesis: The adding of objects to a list is actually the adding of pointers to other places in memory where the objects are, which then gives as bad space locality to the array as to the list, but with the overhead of the extra array look up this goes slower.
But what if but the numbers directly into the array? Numbers aren't pointers and should therefore allow better cache performance through space locality.

Testing:

timer = getTimer();
var nbrArr:Array = new Array(nbrOfObjects);
for (i=0; i<nbrOfObjects; i++) {
	nbrArr[i] = i;
}
trace("creating "+i+" Numbers in array: "+(getTimer()-timer));

timer = getTimer();
for (i=0; i<nbrOfObjects; i++) {
	s = nbrArr[i];
}
trace("getting Number from "+i+" Numbers in array: "+(getTimer()-timer));

Result: (in ms)

creating 1000000 Numbers in array: 67
getting Number from 1000000 Numbers in array: 36

The creation time is no suprise much lower since there is no need to find new memory space for a million objects.
But the getting time is still double the time of the list! Either the type checking cause by the array being generic is really heavy (on a relative scale of course), or the array implementation in the AVM2 is *strange*, maybe using some kind of number hashing?

Anyway, the list seems to walk victorious through my best tries of arrayiness, which is definitly something to remember when writing the next heavy AS3 application.

"But I want spacial locality!", you scream!

I actually came up with a way of further improving my test example:

// in separate file
package {
	public class DualSpeedTestObjList {
		public var nbr:Number;
		public var nbr2:Number;
		public var next:DualSpeedTestObjList = null;
	}
}

// in .fla 
timer = getTimer();
var objList2:DualSpeedTestObjList = new DualSpeedTestObjList();
var first2:DualSpeedTestObjList = objList2;
first2.nbr = 0;
first2.nbr2 = 1;
for (i=2; i<nbrOfObjects; i++, objList2 = objList2.next) {
	objList2.next = new DualSpeedTestObjList();
	objList2.next.nbr = i;
	objList2.next.nbr2 = ++i;
}
trace("creating "+i+" dualobject list: "+(getTimer()-timer));

timer = getTimer();
objList2 = first2;
while (objList2 != null) {
	s = objList2.nbr;
	s = objList2.nbr2;
	objList2 = objList2.next;
}
trace("getting Number from "+i+" dualobjects in list: "+(getTimer()-timer));

Results: (ms of course)

creating 1000000 dualobject list: 234
getting Number from 1000000 dualobjects in list: 10

Nuff ranting. Optimization is fun. There's always something more :D.

mariomaster123
mariomaster123
  • Member since: Apr. 27, 2008
  • Offline.
Forum Stats
Member
Level 11
Blank Slate
Response to AS3.0 array vs. list speed test. Posted May. 13th, 2008 @ 03:33 PM Reply

ZOMG
ALL THAT CODE=SCARYZ!!!!!!!!!!!!!!!!!

:( I'm not sure what the problem is... I'll keep inversitgating though!


Told ya'll we shoulda practiced...

Fickludd
Fickludd
  • Member since: Feb. 28, 2004
  • Offline.
Forum Stats
Member
Level 15
Blank Slate
Response to AS3.0 array vs. list speed test. Posted May. 13th, 2008 @ 03:41 PM Reply

At 5/13/08 03:33 PM, mariomaster123 wrote: ZOMG
ALL THAT CODE=SCARYZ!!!!!!!!!!!!!!!!!

Lol, be afraid!

( I'm not sure what the problem is... I'll keep inversitgating though!

Uhm, it's not a problem but more of a test report... but do tell me when you find anything =P

Fickludd
Fickludd
  • Member since: Feb. 28, 2004
  • Offline.
Forum Stats
Member
Level 15
Blank Slate
Response to AS3.0 array vs. list speed test. Posted May. 13th, 2008 @ 03:43 PM Reply

Oh, and sorry all for the crappy formating! There's no preview or editing possibilities...

dELtaluca
dELtaluca
  • Member since: Apr. 16, 2004
  • Offline.
Forum Stats
Member
Level 20
Blank Slate
Response to AS3.0 array vs. list speed test. Posted May. 13th, 2008 @ 04:47 PM Reply

Ofcourse, if AS had pointers then the execution time for sequentially accessing objects from an array would be much faster than a linked list, and if AS had typed arrays (like it will) and they were normal arrays rather than AS's retarded ones, then arrays would just generally be faster :P

in these tests that is, linked lists will always be faster generally for insertion and deletion when random access isn't an issue

using ShamelessPlug; NapePhysicsEngine.advertise();

BBS Signature
Fickludd
Fickludd
  • Member since: Feb. 28, 2004
  • Offline.
Forum Stats
Member
Level 15
Blank Slate
Response to AS3.0 array vs. list speed test. Posted May. 13th, 2008 @ 05:01 PM Reply

At 5/13/08 04:36 PM, LCurtis wrote: Generally I dont access 1000000 objects at any given time (or anywhere close to that) but still something to keep in the back of my mind I guess.

Agreed. I tried to come up with something plausible, but couldn't... Still, 125 ms = 8 fps, which is without doing any real computing OR rendering. Also that's on a dual core 2 gHz machine and you probably want your games to run smooth on lesser comps. But no, it will probably never matter which way you do.

I was really hoping Adobe had someone figured out how to make arrays act faster than lists but oh well.

Ya, me to.

At 5/13/08 03:43 PM, Fickludd wrote: editing possibilities...
Copy my sig pic!

Just did!

DougyTheFreshmaker
DougyTheFreshmaker
  • Member since: Jul. 30, 2007
  • Offline.
Forum Stats
Member
Level 02
Blank Slate
Response to AS3.0 array vs. list speed test. Posted May. 13th, 2008 @ 05:53 PM Reply

God dammit you kids are gonna kill me...

I made changes and got the array times to be equal/faster/much faster for different situations. It ended up being a bunch of code and benchmarks, probably beyond the character limit offered here. I may blog about it or some shit, but it boils down to inconsistencies in your tests (you need to be extra careful in these micro-benchmarks) as well as one important debatable point, which I had forgotten about since I messed with this stuff.

First, your common subexpression elimination is what was dragging down the array's creation time for me. If, instead of factoring out the array dereferences, you assign values to the array directly (ex arrObj[i] = new Obj(); arrObj[i].nbr = i or whatever) then the array's creation time sinks--in my case, it looked to be faster nearly always, by a small margin. You don't use this sort of CSE with the list's creation, so we won't use it here either.

Second, and this is really the crux of the matter and that debatable point I was speaking of, you're not comparing the same thing. You're comparing an array of objects to a list of nodes which directly contain the values you want. An array of numbers not wrapped in objects, for instance, out performed the list by an order of magnitude when I tried it. But, like you said, you'd generally have an array of objects, not intrinsic values.

If you create an array of objects and compare it to a list of *objects*, you'll find that your times are much more equal. When you're talking about things like the boxing/unboxing arrays go through, this cost is so tiny that every little thing counts. An extra dereference in one place will often nearly double execution times. So accessing a property of an object in each of these containers for iteration is like this:

some_array[index].property; i++; // Array version

some_node.object.property; // List version
some_node = some_node.next;

So the array has two dereferences, an unboxing, and an increment. The list has three dereferences and an assignment, plus coherency issues--and the question is which operations add up to more execution time. I've found them both to be roughly the same.

To further illustrate how the unboxing is significant but not THAT significant, you can change your array assignment by removing the indexing operation of the array and, instead, just use a variable with the "any" type for assignment:

var some_elm:* = objArr[1];
for (i=0; i<nbrOfObjects; ++i)
	s = some_elm.nbr;

The access time levels out with the list (iirc, which I may not) simply because you don't have to do that indexing operation, but still have to do the conversion, which also implies that the conversion is about as costly as an add operation (but don't quote me on that :p)

So anyway, the point is that, ALL THINGS EQUAL, an array and a list are roughly equal for traversal--but remember there are other important considerations to take into account when you're picking a datastructure beyond this traversal stuff...

The catch is that an argument can be easily made for a setup such as the one above, where the only way to store references to objects in an array is to subject it to the ugly conversions, where as you can store an object's properties directly in the nodes in a list and, if you're careful, see some improvements.

Of course you'd have to make a bunch of lists, which can get annoying and take time. The two structures have many more fundamental differences to consider when picking one for implementation, etc.

On top of all of that, traversal times like this aren't going to be the issue for the vast majority of cases. For instance, I had mentioned to Kajenx in his previous thread that the splicing/unshifting was likely the source of his performance issues, and that pooling or swap-and-pop would take care of those, but I think he opted for a list instead, and now everyone is on this 'lists are faster' bandwagon, which isn't strictly true.

I guess the main reason this gets my goad is because I think it derives from the haXe guy's post on it, as well as his FastList class. I haven't looked, but I'm sure that FastList takes an object as a template parameter, and thus falls into the case I mentioned above--a list of objects, which isn't faster as far as I can tell. The entire Flash community has this annoying habit of looking at one benchmark produced by one guy at one time, immediately accepting it and spreading around whatever was said without question.

creating 1000000 object array: 2583
creating 1000000 object list: 3470
getting Number from 1000000 objects in array: 308
getting Number from 1000000 objects in list: 299

Array constructs 0.887 seconds faster, and executes 9ms slower on my old, slow machine, for an entire traversal.

package {
	import flash.display.Sprite;
	import flash.utils.getTimer;
	
	public class FickluddsListTest extends Sprite
	{
		
		private const nbrOfObjects:int = 1000000;
		private const TEST_RUNS:int = 5;
		
		public function FickluddsListTest():void { 
				for (var i:int = 0; i < TEST_RUNS; i++) do_test(); 
			}
		
		private function do_test():void
		{
			var timer:int;
			var i:int;
			var s:Number;
			
			timer = getTimer();
			var objArr:Array = new Array(nbrOfObjects);
			for (i=0; i<nbrOfObjects; i++) {
				objArr[i] = new SpeedTestObj();
				objArr[i].nbr = i;
			}
			trace("creating " + i + " object array: " + (getTimer() - timer));

			timer = getTimer();
			var objList:SpeedTestObjList = new SpeedTestObjList();
			var first:SpeedTestObjList = objList;
			first.nbr = new SpeedTestObj();
			first.nbr.nbr = 0;
			for (i = 1; i < nbrOfObjects; i++, objList = objList.next) {
				objList.next = new SpeedTestObjList();
				objList.next.nbr = new SpeedTestObj();
				objList.next.nbr.nbr = i;
			}
			trace("creating " + i + " object list: " + (getTimer() - timer));
			
			timer = getTimer();
			for (i=0; i<nbrOfObjects; i++) {
				s = objArr[i].nbr;
			}
			trace("getting Number from " + i + " objects in array: " + (getTimer() - timer));

			timer = getTimer();
			objList = first;
			while (objList != null) {
				s = objList.nbr.nbr;
				objList = objList.next;
			}
			trace("getting Number from " + i + " objects in list: " + (getTimer() - timer));
		}
	}
}

class SpeedTestObjList {
	public var nbr:SpeedTestObj; // Doesn't this make so much more sense?  It is a "SpeedTestObjList", after all
	public var next:SpeedTestObjList = null;
}


class SpeedTestObj {
    public var nbr:Number;
}

We should take care not to make the intellect our god; it has, of course, powerful muscles, but no personality.
Freshmaking
Brainscrape

BBS Signature
Kajenx
Kajenx
  • Member since: Dec. 1, 2006
  • Offline.
Forum Stats
Member
Level 17
Artist
Response to AS3.0 array vs. list speed test. Posted May. 14th, 2008 @ 02:14 AM Reply

@Dougy: I tried both an array and a list on my particle system, and I found the list to be 4 - 5 times faster. GustTheASGuy explained why rather well in that thread. I'm not sure if this is the case in all situations, but for anything like a particle system with hundreds of thousands of particles or an AI system where you need to have 100 units checking eachother's stats, I think the list will be superior, because you just need to iterate through them and not access a given point. If Adobe introduced typed arrays, though, then the logic supporting the list collapses and they'll probably give equal results.


BBS Signature
Kajenx
Kajenx
  • Member since: Dec. 1, 2006
  • Offline.
Forum Stats
Member
Level 17
Artist
Response to AS3.0 array vs. list speed test. Posted May. 14th, 2008 @ 02:17 AM Reply

Opps, sorry, I didn't read your whole post. Could you explain the swap and pop method and I'll try it out to see if it's faster?


BBS Signature
Kajenx
Kajenx
  • Member since: Dec. 1, 2006
  • Offline.
Forum Stats
Member
Level 17
Artist
Response to AS3.0 array vs. list speed test. Posted May. 14th, 2008 @ 02:21 AM Reply

WHOO triple post!

I just remembered why I didn't look up the swap and pop. I got rid of the shifting and splicing and pre-generated everything. That DID speed up the array, but when I tried the list it was even faster.


BBS Signature
DougyTheFreshmaker
DougyTheFreshmaker
  • Member since: Jul. 30, 2007
  • Offline.
Forum Stats
Member
Level 02
Blank Slate
Response to AS3.0 array vs. list speed test. Posted May. 14th, 2008 @ 03:48 AM Reply

I understand the scale of the particle system you're talking about... but the difference in benchmarks for a *million* elements was a handful of milliseconds. Not having a list of objects will have more significant gains (and having property containing nodes), as Fick demonstrated... but I just can't see those gains having such a huge impact when you also factor the particle repositioning, rendering, blurring etc... the cost of those functions should really eclipse the cost of traversal for either structure if they're used appropriately. You could probably add a few more particles using the list, but I doubt you'd see "10 times more" or anything stupid like that...

Preallocating a bunch of objects is along the lines of the pooling method I was talking about, but there are a lot of other considerations for going that route. It'll certainly be faster that constantly allocating nodes like you might with a linked list--but then again you can pool list nodes as well...

Anyway, I know I've explained this before, probably several times, but I think it was many moons ago. The swap-and-pop idiom is a trick that helps remove most of the cost from the usually expensive remove-in-middle-of-an-array operation. It really wouldn't be a complete solution for your problem, but it'd be a start. The pooling and ensuring your array container is sized appropriately are much more ideal.

At any rate, and this also applies to pooling, arrays by definition are stored in a contiguous block of memory. Most languages these days, AS3 included, will abstract their most basic array structure away from the memory underpinnings that it's actually based on (arrays can be declared in straight asm, iirc). There are a host of issues to consider with this, but we'll focus on the splicing since you specifically asked about that one.

Let's say you have an array with 100 elements and you decide you want to 'splice' element 50 out. Array[50] is gone, so it should now contain whatever value is in Array[51]. After than, Array[51] will need whatever value was in Array[52], etc. You'll essentially have to shift every single element after the spliced element down by one. After you do this to all of the element after your splice point, you'll have this 'hole' stuck at the end of your array. At this point it's easy to just shrink the amount of memory allocated for the array by freeing up only that last block.

If the order of your array doesn't matter, such as in your particle engine, you can alternatively just swap the value at Array[50] with the value at Array[99] directly. Now you have the element you want to delete at the end of your array, and can just call Array.pop to free up the memory on the end of the array without that ugly bubbling process, which can be VERY expensive.

I had done a test on this once... not sure if I still have it, but on some over-sized array I remember getting times of something like 300ms for random splicing on 25% of the array, and around 1ms runtimes for the swap-and-pop method. Decent improvement, no?

You have to be careful for other reasons as well--namely that arrays outgrowing their .length is major trouble, even if AS3 handles it gracefully.

The list is probably just plain better for this example, but not by any huge amount given both versions coded 'appropriately'. Take the splicing of the array, for example. Sure, we know that splicing is going to be a game killer for a big array, but even if it didn't the function call itself (just the fact that you had to call a function) is significant versus inline code for these tiny benchmarking purposes.

Blah blah blah... I've just got this pet-peeve for when people replace a poor implementation of technique A with a decent implementation of technique B and then say technique B is always better than A, even an explanation for why B is better is supplied. I'd be interested in seeing exactly how much of a difference the list made in a particle engine given two acceptable implementations, but I'm already aware that a shoddy array implementation will drastically underperform a decent list implementation :p No offense intended or anything, of course--it sounds like your later attempts at the array version were probably much more reasonable


We should take care not to make the intellect our god; it has, of course, powerful muscles, but no personality.
Freshmaking
Brainscrape

BBS Signature
Kajenx
Kajenx
  • Member since: Dec. 1, 2006
  • Offline.
Forum Stats
Member
Level 17
Artist
Response to AS3.0 array vs. list speed test. Posted May. 14th, 2008 @ 04:05 AM Reply

Haha, none taken. I basically had a crash course in arrays these last few days. I look at my first post in the particle thread and it's seems pretty obvious where to improve it now. :D


BBS Signature
GustTheASGuy
GustTheASGuy
  • Member since: Nov. 2, 2005
  • Offline.
Forum Stats
Member
Level 08
Blank Slate
Response to AS3.0 array vs. list speed test. Posted May. 14th, 2008 @ 08:20 AM Reply

Suppose AS could use a vector implementation?

As for list iteration, uh well what the fuck I don't get it. http://lists.motion-twin.com/pipermail/h axe/2008-May/016733.html
In this the array lagged horribly for me. You can try it if you have haXe.


BBS Signature
Fickludd
Fickludd
  • Member since: Feb. 28, 2004
  • Offline.
Forum Stats
Member
Level 15
Blank Slate
Response to AS3.0 array vs. list speed test. Posted May. 14th, 2008 @ 09:07 AM Reply

I think we all agree that in the end the choise between a list and an array is not a speed issue, but a question of what you need to do with it. There are in almost every case other stuff that stands for 99% of the cycles.

Still I find it fashinating that the list with numbers (ie. the specialized list I used for testing) was faster to iterate than the array with only Numbers.

Does anyone by the way know how AS arrays are implemented? I couldn't find anything on the net explaning it... I mean since they're obviously not pure vectors, do they use some kind of combined list and indexing? Or are they vectors of 4 byte chunks containing pointers or some number implementation?

haven't got hAxe btw...

GustTheASGuy
GustTheASGuy
  • Member since: Nov. 2, 2005
  • Offline.
Forum Stats
Member
Level 08
Blank Slate
Response to AS3.0 array vs. list speed test. Posted May. 14th, 2008 @ 09:44 AM Reply

You can check the results anyway. Pretty random though.

It looks like a pointer vector (naturally they have a wrapper variant thing for variable values) plus a hash table for when you assign an illegal index. Didn't check if it's less retarded in AS3.


BBS Signature
DougyTheFreshmaker
DougyTheFreshmaker
  • Member since: Jul. 30, 2007
  • Offline.
Forum Stats
Member
Level 02
Blank Slate
Response to AS3.0 array vs. list speed test. Posted May. 14th, 2008 @ 03:29 PM Reply

Still I find it fashinating that the list with numbers (ie. the specialized list I used for testing) was faster to iterate than the array with only Numbers.

This wasn't the case at all for me *shrug*

Does anyone by the way know how AS arrays are implemented? I couldn't find anything on the net explaning it... I mean since they're obviously not pure vectors, do they use some kind of combined list and indexing? Or are they vectors of 4 byte chunks containing pointers or some number implementation?

My best guess is that they're based on an actual array implementation--seems like if they weren't, they'd be called something else, right? I don't actually know for sure how they're implemented by AVM2, and I've looked :p You could probably keep the nature of arrays in mind and run a bunch of these "big loop benchmarks" to see if you get the results you expect, if you wanted to. Wouldn't give a definitive answer, but could provide grounds for a guess one way or the other.


We should take care not to make the intellect our god; it has, of course, powerful muscles, but no personality.
Freshmaking
Brainscrape

BBS Signature
DougyTheFreshmaker
DougyTheFreshmaker
  • Member since: Jul. 30, 2007
  • Offline.
Forum Stats
Member
Level 02
Blank Slate
Response to AS3.0 array vs. list speed test. Posted May. 14th, 2008 @ 07:57 PM Reply

I found a much less retarded answer on Adobe LiveDocs under serialization for communication between languages, here.

Arrays are dense as long as they've got only numeric indices with no gaps, otherwise they're sparse. I'm not sure what underlying structure the sparse array relies on--maybe a hash like Gust says, maybe a list, dunno. But there you go.


We should take care not to make the intellect our god; it has, of course, powerful muscles, but no personality.
Freshmaking
Brainscrape

BBS Signature
Fickludd
Fickludd
  • Member since: Feb. 28, 2004
  • Offline.
Forum Stats
Member
Level 15
Blank Slate
Response to AS3.0 array vs. list speed test. Posted May. 15th, 2008 @ 03:18 AM Reply

At 5/14/08 07:57 PM, DougyTheFreshmaker wrote: I found a much less retarded answer on Adobe LiveDocs under serialization for communication between languages, here.

Thanks! I'm still not sure how they work, but diving deeper into this seems kinda pointless. When I wan't better control over my arrays I'll change to some other language (until AS comes with typed arrays).