Not logged in, Join Here! or Log In Below:  
 
News Articles Search    
 

 Home / General Programming / std::vector performance (quick question) Account Manager
 
Archive Notice: This thread is old and no longer active. It is here for reference purposes. This thread was created on an older version of the flipcode forums, before the site closed in 2005. Please keep that in mind as you view this thread, as many of the topics and opinions may be outdated.
 
Scoot

February 23, 2005, 09:54 AM

Up until now i've been lazy and simply used the [] operator for accessing data from an STL vector.

99% of the time i'm accessing the data linearly from start to end.

Would it be faster to use the iterator access method instead for accessing data from a vector, and any idea how much faster?

Thanks in advance
-Scoot

 
Rui Martins

February 23, 2005, 10:21 AM

You could try make a quick test instead of a quick question 8)

Now seriously, other guys values won't give any real info, since you may have a different implementation of the lib, and only you know exactly the specific case/conditions you want to test/compare.

 
Scoot

February 23, 2005, 10:29 AM

Well, I did warn you I was lazy ;)

What's your personal experience?

 
Victor Widell

February 23, 2005, 09:25 PM

It shouldn't be any difference. If there is any, the [] method might be faster since the compiler might unroll loops etc.

 
Gerald Knizia (cgk)

February 24, 2005, 02:49 AM

I disagree. The iterator approach should be faster on many implementations, since these don't actually store the length of a vector but calculate it from the begin() and end() values (which they store directly, for example in Dinkumware). Often it is, from the C++ standpoint the compile could know, not sure that these values are unchanged during the loop body. Therefore i also seriously doubt any loop unrolling (how should that work with dynmic arrays anyway?)

Beside that, accessing vector elemens via iterator should usually be faster than doing it via [], because the iterators _are_ just pointers in most implementions for the Release-Mode builds, while the [] function first hast to _calculate_ the pointer.
It is possible that this calculation of the vector element address is executed _each time_ vec is accessed in the loop, because it might be possible that that the start address of the vector array is changed externally during the loop body. Beside the (usually neglectable) added cost of the [] approach, it might also be less obvious what the coder is doing. If possible iterators should be used. Once typeof() extensions to C++ compilers are commonly aviable, it would also be simpler to use iterators, because that would allow one to write a _for_each macro that is worth carrying that name.

 
zdax

February 24, 2005, 03:15 AM

Iterator method is indeed a *lot* of faster. I tested with VC6 (SP5), P4 2,8GHz and with 5 million iterations, iterator method was about 20 (!) times faster.

  1.  
  2. QueryPerformanceCounter(&start1);
  3. for(int j=0;j<COUNT;j++)
  4. {
  5.         number = nums[j];
  6. }
  7. QueryPerformanceCounter(&stop1);
  8.  
  9. QueryPerformanceCounter(&start2);
  10. std::vector<int>::iterator it=nums.begin();
  11. for(int k=0;k<COUNT;k++)
  12. {
  13.         number = (*it);it++;
  14. }
  15. QueryPerformanceCounter(&stop2);
  16. double time1 = (double)(stop1.QuadPart - start1.QuadPart) / freq.QuadPart;
  17. double time2 = (double)(stop2.QuadPart - start2.QuadPart) / freq.QuadPart;
  18.  
  19.  

time 1:0.306409
time 2:0.0160359

 
Ono-Sendai

February 24, 2005, 04:23 AM

ZDax's results were a bit surprising, so I tried the test myself.
I think what you're doing ZDax is compiling the test in debug mode, which is disabling function inlining, so the [] operator is actually making a function call to std::vector::operator[].
If you compile the test in release mode (with function inlining), the difference goes away.

 
Victor Widell

February 24, 2005, 04:36 AM

"while the [] function first hast to _calculate_ the pointer"

Not necesarily. The pointer to the beginning of the memmory block is obviously stored. And that should be the only thing needed to access memmory like an ordinary C-style array.


"The iterator approach should be faster on many implementations, since these don't actually store the length of a vector"

There is no bounds checking on a vector... Why would length (size?) of the vector be used at all?




"with 5 million iterations, iterator method was about 20 (!) times faster."

Of course the compiler won't unroll a loop of 5 million iterations. It would add over 10 MB to the executable, totally trashing any caching etc.

 
Ono-Sendai

February 24, 2005, 04:47 AM

  1.  
  2.         const int VECSIZE = 50000000;
  3.         std::vector<int> vec(VECSIZE);
  4.         int num;
  5.  
  6.                   //------------------------------------------------------------------------
  7.         //get the data all nice and cached up
  8.         //------------------------------------------------------------------------
  9.         {
  10.         for(int i=0; i<VECSIZE; ++i)
  11.                 num += vec[i];
  12.         }
  13.  
  14.         Timer timer;
  15.  
  16.         //------------------------------------------------------------------------
  17.         //test 1
  18.         //------------------------------------------------------------------------
  19.         for(std::vector<int>::iterator it = vec.begin(); it != vec.end(); ++it)
  20.                 num += (*it);
  21.        
  22.         double time = timer.getSecondsElapsed();
  23.  
  24.         std::cout << "iterator: " << time << std::endl;
  25.  
  26.         timer.reset();
  27.         //------------------------------------------------------------------------
  28.         //test 2
  29.         //------------------------------------------------------------------------
  30.         for(int i=0; i<VECSIZE; ++i)
  31.                 num += vec[i];
  32.        
  33.  
  34.         time = timer.getSecondsElapsed();
  35.         std::cout << "operator[]: " << time << std::endl;
  36.  
  37.  
  38.  



Output:
C:programmingtestvectest>vectest
iterator: 0.193529
operator[]: 0.181977

C:programmingtestvectest>vectest
iterator: 0.193592
operator[]: 0.180703

C:programmingtestvectest>vectest
iterator: 0.19378
operator[]: 0.179197

etc..
So the operator[] code seems slightly faster

 
Nils Pipenbrinck

February 24, 2005, 05:11 AM

I just compiled the two functions and took a look at the assembly output:

Here's the code:

  1.  
  2.  
  3. const int VECSIZE = 50000000;
  4. std::vector<int> vec(VECSIZE);
  5.  
  6. int blah1 (void)
  7. {
  8.   int num = 0;
  9.   for(int i=0; i<VECSIZE; ++i) num += vec[i];
  10.   return num;
  11. }
  12.  
  13. int blah2 (void)
  14. {
  15.   int num=0;
  16.   for(std::vector<int>::iterator it = vec.begin(); it != vec.end(); ++it)
  17.     num += (*it);
  18.   return num;
  19. }
  20.  
  21.  
  22. blah1 PROC
  23.         mov     ecx, DWORD PTR vector.begin
  24.         xor     eax, eax
  25.         add     ecx, 8
  26.         mov     edx, 10000000                          
  27. $Label1:
  28.         add     eax, DWORD PTR [ecx-8]
  29.         add     eax, DWORD PTR [ecx-4]
  30.         add     eax, DWORD PTR [ecx]
  31.         add     eax, DWORD PTR [ecx+4]
  32.         add     eax, DWORD PTR [ecx+8]
  33.         add     ecx, 20                                
  34.         dec     edx
  35.         jne     $Label1
  36.         ret     0
  37. blah1 ENDP
  38.  
  39.  
  40. blah2 PROC
  41.         mov     ecx, DWORD PTR vector.begin
  42.         mov     edx, DWORD PTR vector.end
  43.         xor     eax, eax
  44.         cmp     ecx, edx
  45.         je      $End
  46.         push    esi
  47. $Label2:
  48.         mov     esi, DWORD PTR [ecx]
  49.         add     ecx, 4
  50.         add     eax, esi
  51.         cmp     ecx, edx
  52.         jne     $Label2
  53.         pop     esi
  54. $End:
  55.         ret     0
  56. blah2   ENDP
  57.  


Imho it does not matter how you iterate through the vector. The two assembly codes just show for-loops. The first one using an index, the second one using a pointer (ain't it nice how the compiler unrolled the function using operator[] ?)

The runtime performance in a real application should be almost identical. The timing differences measured are most likely just second level cache effects.

 
kiddygames

February 24, 2005, 05:35 AM

Quite strange as we can get different results. Here is my test with 3 different loops this time, compiled in release mode for vc 6:

  1.  
  2.  
  3. #include "windows.h"
  4. #include <vector>
  5.  
  6. #define COUNT 50000000
  7.  
  8. int main(int argc, char* argv[])
  9. {
  10.  
  11.   LARGE_INTEGER freq;
  12.   QueryPerformanceFrequency(&freq);
  13.  
  14.   std::vector<int> nums;
  15.   nums.push_back(1);
  16.   nums.push_back(2);
  17.   nums.push_back(3);
  18.   nums.push_back(4);
  19.   nums.push_back(5);
  20.   nums.push_back(6);
  21.   nums.push_back(7);
  22.   nums.push_back(8);
  23.  
  24.   int number=0;
  25.   LARGE_INTEGER start1,stop1,start2,stop2,start3,stop3;
  26.         QueryPerformanceCounter(&start1);
  27.  
  28.   int j,i;
  29.  
  30.   for(j=0;j<COUNT/8;++j)
  31.   {    
  32.     for(i=0;i<8;++i)
  33.     {
  34.       number += nums[i];
  35.     }
  36.   }
  37.  
  38.   QueryPerformanceCounter(&stop1);
  39.  
  40.   std::vector<int>::iterator it;
  41.  
  42.   QueryPerformanceCounter(&start2);
  43.  
  44.   for(j=0;j<COUNT/8;++j)
  45.   {    
  46.     for(it=nums.begin();it<nums.end();++it)
  47.     {
  48.       number += (*it);
  49.     }
  50.   }
  51.  
  52.   QueryPerformanceCounter(&stop2);
  53.   it=nums.begin();
  54.   QueryPerformanceCounter(&start3);
  55.  
  56.   for(j=0;j<COUNT/8;++j)
  57.   {    
  58.     for(i=0;i<8;++i)
  59.     {
  60.       number += (*it);(++it);
  61.     }
  62.     it=nums.begin();
  63.   }
  64.  
  65.   QueryPerformanceCounter(&stop3);
  66.  
  67.   double time1 = (double)(stop1.QuadPart - start1.QuadPart) / freq.QuadPart;
  68.   double time2 = (double)(stop2.QuadPart - start2.QuadPart) / freq.QuadPart;
  69.   double time3 = (double)(stop3.QuadPart - start3.QuadPart) / freq.QuadPart;
  70.  
  71.   printf(" %f n",time1);
  72.   printf(" %f n",time2);
  73.   printf(" %f n",time3);
  74.  
  75.         return 0;
  76. }
  77.  
  78.  
  79.  
  80.  


and the results:

time1 (number += nums;) : 0.083509 time2 (number += (*it);) : 0.078845 time3 (number += (*it);(++it);) : 0.084383 So for me the "for" loop directly using an iterator is faster, but not a great optimisation anyway...

 
Rui Martins

February 24, 2005, 05:44 AM

8 Samples is not nearly enough !

You are mixing the test with the external for loop overhead.

You have to use a much larger vector, for the result to have any practical meaning.
At least 1 thousand items or so, but you might use something less, since 1 thousand may trash your cache introducing another side effect in the equations.

The best way to do it, is to have another external cycle that will run this test, but will give you the number of elements to ahve in vector, and the number of test to perform.
Then you can plot a graph with the several sizes and you will notice an abrupt shape change in the graph when you start trashing your cache.

It would be a nice graph.

 
Scoot

February 24, 2005, 05:45 AM

First of all thanks for your interest :)

When (taking Rui's advice) I got off my ass and did my own testing, my suspicions were confirmed:

In debug mode, iterators are significantly slower than using the [] operator, probably due to lots of memory checking and function sandwiching.

In release mode, you can expect a significant improvement using iterators when stepping sequentially from beginning to end.

It may only be 1/2 m/s, but in a realtime game app this is a huge improvement when accessing large vector arrays many times per second (e.g. scene-graphs or poly-caches).

Thanks again for all of your input :)
-Scoot

 
Rasmus Christian Kaae

February 24, 2005, 08:26 AM

However, using the mytype::iterator and mytype::const_iterator is prefered since then you may change from f.ex. vector to list, map, multimap, etc. without too much code revision.

 
Gerald Knizia (cgk)

February 24, 2005, 01:28 PM

Your code is flawed because you use a constant vector size - a very unrealisitc scenario. Try replacing the
for(int i=0; i

 
Victor Widell

February 24, 2005, 07:06 PM

Bottomline:
iterator or [] doesn't matter. Use whatever you feel like, but if you are unshure, use the iterator so you can switch container easily.

 
This thread contains 16 messages.
 
 
Hosting by Solid Eight Studios, maker of PhotoTangler Collage Maker.