See what's going on with flipcode!




This section of the archives stores flipcode's complete Developer Toolbox collection, featuring a variety of mini-articles and source code contributions from our readers.

 

  Optimizing Strings with Pascal-Strings
  Submitted by



Just recently one of my class-mates pointed me to an article where the performance difference between c-type strings and pascal-types strings was discussed. The main point in this discussion is, that pascal-strings store their length at the beginning of the string instead of the 0-sign at the end of the string. This means that if you do a strcat() a c-type string has to search from the beginning of the string until he finds the end of the string and appends the new one at the newly found position. In contrast a pascal string knows always where he's got to append the new string! This can cause a big performance difference if you have to use many strcat() or similiar functions very often. With this in mind I tried to measure the performance-difference and was really surprised by the result!! However pascal-type strings have a big disadvantage: they are bound to maximum size. Originally this maximum was a size of 256 chars (one byte). However you can use a short without losing much mem. With all this in mind I started yesterday to code a first test-example of an optimized string class which uses the advantages of both string types. What I mail you is the result of this work. Trying it out with a ton of strcats this class is even without optimization able to outperform the Microsoft std::string. I think if I'd test compares or right-trims the same would result. I think an optimized pascal-string class or my own combination of both classes could easily speed up apps which use string-functions heavily. I post you this code because:
  • I'd like to share this knowledge with everyone who doesn't knows this already.
  • I'd like to hear comments about my solution, possible optimizations or just experiences from the people who already studied this difference and (maybe) have their own solutions.


  • And finally if someone wants to use my code in his/her app there I just want to mention the following:
  • I'd appreciate credits or drop me an-email. I'd really like to know if someone can use this.
  • Don't blame me if there are bugs, mem-leaks or something else! I never tested this code fully, it's just here to give everyone an idea of the whole problem.
  • If there are questions write to coberholzer@gmx.ch and I'll try to answer them.
  • Regards, Chris

    Currently browsing [pascal_string.zip] (7,936 bytes) - [optstring_test/main.cpp] - (2,057 bytes)

    #include <stdio.h>
    #include <string>
    #include <windows.h>
    #include "optimized_string/string.h"

    void main() { optimized_string::String String("blabla"); optimized_string::String Other1("blabla"); optimized_string::String Other2("blabla1"); if (String == "blabla") printf("OK!\n"); if (!(String == "blabla1")) printf("OK!\n"); if (String == Other1) printf("OK!\n"); if (!(String == Other2)) printf("OK!\n");

    String = " blabla\n\n\n"; printf("%s\n", String.GetCString()); String.LTrim(); printf("%s\n", String.GetCString()); String.RTrim(); printf("%s\n", String.GetCString());

    LONGLONG Start; LONGLONG Stop; LONGLONG Delta; LONGLONG Frequency;

    double dDelta; double dTime; double dFrequency;

    printf("-- optimized string --\n"); QueryPerformanceCounter((LARGE_INTEGER*)&Start);

    for (int i = 0; i < 2; i++) { optimized_string::String str("blabla"); optimized_string::String strCompare("blablaaaaaa");

    for (int i = 0; i < 60000; i++) { str.Append("a");

    if (str.Compare(strCompare)) printf("Equals\n");; } }

    QueryPerformanceCounter((LARGE_INTEGER*)&Stop); QueryPerformanceFrequency((LARGE_INTEGER*)&Frequency);

    Delta = Stop - Start; dDelta = (double)Delta; dFrequency = (double)Frequency; dTime = dDelta / dFrequency;

    printf("optimized string: %f ms\n", dTime * 1000);

    printf("-- std::string --\n"); QueryPerformanceCounter((LARGE_INTEGER*)&Start);

    std::string strHallo; std::string strHallo2("1234567890 1234567890"); for (int i = 0; i < 2; i++) { std::string str("blabla"); std::string strCompare("blablaaaaaa");

    for (int i = 0; i < 60000; i++) { str.append("a");

    if (str == strCompare) printf("Equals\n");; } }

    QueryPerformanceCounter((LARGE_INTEGER*)&Stop); QueryPerformanceFrequency((LARGE_INTEGER*)&Frequency);

    Delta = Stop - Start; dDelta = (double)Delta; dFrequency = (double)Frequency; dTime = dDelta / dFrequency;

    printf("std::string: %f ms\n", dTime * 1000); }

    Currently browsing [pascal_string.zip] (7,936 bytes) - [optstring_test/optimized_string/cstring.cpp] - (3,709 bytes)

    #include "cstring.h"

    namespace optimized_string {

    CString::CString() : m_szPtr(NULL), m_BufferLength(0) { this->_Allocate(1); this->m_szPtr[0] = 0; }

    CString::CString(const CString& Other) : m_szPtr(NULL), m_BufferLength(0) { this->Set(Other); }

    CString::CString(const char* szString, const size_t& Length) : m_szPtr(NULL), m_BufferLength(0) { this->Set(szString, Length); }

    CString::~CString() { if (this->m_szPtr) { delete[] this->m_szPtr; } }

    void CString::Reset() { if (this->m_szPtr) { delete[] this->m_szPtr; this->m_szPtr = NULL; } }

    void CString::Set(const CString& Other) { size_t Len = Other.Len(); this->Reset(); this->_Allocate(Len + 1);

    memcpy(this->m_szPtr, Other.m_szPtr, sizeof(char) * (Len + 1)); }

    void CString::Set(const char* szString, const size_t& Len) { this->Reset(); this->_Allocate(Len + 1);

    memcpy(this->m_szPtr, szString, sizeof(char) * (Len + 1)); }

    void CString::Append(const CString& Other, const size_t& Length) { size_t CurLength = this->Len(); this->_CheckGrow(Length, CurLength);

    // append memcpy(&this->m_szPtr[CurLength], Other.m_szPtr, sizeof(char) * Length); this->m_szPtr[CurLength + Length] = 0; }

    void CString::Append(const char* szString, const size_t& Length) { size_t CurLength = this->Len(); this->_CheckGrow(Length, CurLength);

    // append memcpy(&this->m_szPtr[CurLength], szString, sizeof(char) * Length); this->m_szPtr[CurLength + Length] = 0; }

    const bool CString::Compare(const CString& Other) const { return (strcmp(this->m_szPtr, Other.m_szPtr) == 0); }

    const bool CString::Compare(const char* szString, const size_t& Length) const { if (memcmp(this->m_szPtr, szString, Length) == 0) { if (this->m_szPtr[Length] == 0) return true; }

    return false; }

    void CString::LTrim() { size_t NumToMove = 0; while (this->m_szPtr[NumToMove] == ' ' || this->m_szPtr[NumToMove] == '\t' || this->m_szPtr[NumToMove] == '\r' || this->m_szPtr[NumToMove] == '\n') { NumToMove++; }

    size_t Len = (this->Len() + 1) - NumToMove; memmove(this->m_szPtr, &this->m_szPtr[NumToMove], sizeof(char) * Len); }

    void CString::RTrim() { size_t CurrentPos = this->Len(); while (this->m_szPtr[CurrentPos] == ' ' || this->m_szPtr[CurrentPos] == '\t' || this->m_szPtr[CurrentPos] == '\r' || this->m_szPtr[CurrentPos] == '\n') { CurrentPos--; }

    this->m_szPtr[CurrentPos + 1] = 0; }

    size_t CString::Len() { return strlen(this->m_szPtr); }

    const size_t CString::Len() const { return strlen(this->m_szPtr); }

    const char* CString::GetCString() { return this->m_szPtr; }

    const char* CString::GetCString() const { return this->m_szPtr; }

    void CString::_Allocate(const size_t& Length) { size_t AllocSize = DEFAULT_ALLOC_SIZE; if (Length > DEFAULT_ALLOC_SIZE) { AllocSize = Length; }

    this->m_szPtr = new char[AllocSize]; this->m_BufferLength = AllocSize; }

    void CString::_CheckGrow(const size_t& Length, const size_t& CurrentLength) { // correct 0-char size_t CorrectLength = CurrentLength + 1;

    // check to grow if ((CorrectLength + Length) > this->m_BufferLength) { size_t SizeDiff = (CorrectLength + Length) - this->m_BufferLength; size_t GrowSize = DEFAULT_GROW_SIZE;

    if (SizeDiff > DEFAULT_GROW_SIZE) { GrowSize = SizeDiff; }

    size_t NewSize = this->m_BufferLength + GrowSize; char* szNew = new char[NewSize]; memcpy(szNew, this->m_szPtr, sizeof(char) * CorrectLength); delete this->m_szPtr; this->m_szPtr = szNew; this->m_BufferLength = NewSize; } }

    };

    Currently browsing [pascal_string.zip] (7,936 bytes) - [optstring_test/optimized_string/cstring.h] - (1,119 bytes)

    #ifndef CSTRING_H
    #define CSTRING_H

    #include <memory.h> #include <string.h>

    namespace optimized_string {

    class CString { public: enum { DEFAULT_ALLOC_SIZE = 128, DEFAULT_GROW_SIZE = 128, };

    public: // constructor / destructor CString(); CString(const CString& Other); CString(const char* szString, const size_t& Length); ~CString();

    // set / reset void Reset(); void Set(const CString& Other); void Set(const char* szString, const size_t& Len);

    // basic string functionality void Append(const CString& Other, const size_t& Length); void Append(const char* szString, const size_t& Length); const bool Compare(const CString& Other) const; const bool Compare(const char* szString, const size_t& Length) const; void LTrim(); void RTrim(); size_t Len(); const size_t Len() const; const char* GetCString(); const char* GetCString() const;

    protected: void _Allocate(const size_t& Length); void _CheckGrow(const size_t& Length, const size_t& CurrentLength);

    protected: char* m_szPtr; size_t m_BufferLength; };

    };

    #endif // CSTRING_H

    Currently browsing [pascal_string.zip] (7,936 bytes) - [optstring_test/optimized_string/pascalstring.cpp] - (4,938 bytes)

    
    #include "pascalstring.h"

    namespace optimized_string {

    PascalString::PascalString() : m_usLength(0), m_szString(NULL), m_usBufferLength(0) { }

    PascalString::PascalString(const PascalString& strOther) : m_usLength(0), m_szString(NULL), m_usBufferLength(0) { this->Set(strOther); }

    PascalString::PascalString(const char* szString, const size_t& Length) : m_usLength(0), m_szString(NULL), m_usBufferLength(0) { const unsigned short usLength = static_cast<const unsigned short>(Length); this->Set(szString, usLength); }

    PascalString::~PascalString() { if (this->m_szString) { delete[] this->m_szString; } }

    void PascalString::Reset() { if (this->m_usLength > 0) { if (this->m_szString) { delete[] this->m_szString; this->m_szString = 0; }

    this->m_usLength = 0; } }

    void PascalString::Set(const PascalString& Other) { this->Reset(); this->_Allocate(Other.m_usLength);

    this->m_usLength = Other.m_usLength; memcpy(this->m_szString, &Other.m_szString, sizeof(char) * this->m_usLength); }

    void PascalString::Set(const char* szString, const unsigned short usLength) { this->Reset(); this->_Allocate(usLength);

    this->m_usLength = usLength; memcpy(this->m_szString, szString, sizeof(char) * this->m_usLength); }

    void PascalString::Append(const PascalString& Other, const size_t& Length) { unsigned short usLength = static_cast<unsigned short>(Length); this->_CheckGrow(usLength);

    // append memcpy(&this->m_szString[this->m_usLength], Other.m_szString, sizeof(char) * usLength); this->m_usLength += usLength; }

    void PascalString::Append(const char* szString, const size_t& Length) { unsigned short usLength = static_cast<unsigned short>(Length); this->_CheckGrow(usLength);

    // append memcpy(&this->m_szString[this->m_usLength], szString, sizeof(char) * usLength); this->m_usLength += usLength; }

    const bool PascalString::Compare(const PascalString& Other) const { if (this->m_usLength == Other.m_usLength) { if (memcmp(this->m_szString, Other.m_szString, this->m_usLength) == 0) return true; }

    return false; }

    const bool PascalString::Compare(const char* szString, const size_t& Length) const { unsigned short usLength = static_cast<unsigned short>(Length); if (usLength == this->m_usLength) { if (memcmp(this->m_szString, szString, Length) == 0) return true; }

    return false; }

    void PascalString::LTrim() { unsigned short usNumToMove = 0; while (this->m_szString[usNumToMove] == ' ' || this->m_szString[usNumToMove] == '\t' || this->m_szString[usNumToMove] == '\r' || this->m_szString[usNumToMove] == '\n' || this->m_szString[usNumToMove] == '\0') { usNumToMove++; }

    this->m_usLength -= usNumToMove; memmove(this->m_szString, &this->m_szString[usNumToMove], sizeof(char) * this->m_usLength); }

    void PascalString::RTrim() { unsigned short usCurrentPos = this->m_usLength - 1; while (this->m_szString[usCurrentPos] == ' ' || this->m_szString[usCurrentPos] == '\t' || this->m_szString[usCurrentPos] == '\r' || this->m_szString[usCurrentPos] == '\n' || this->m_szString[usCurrentPos] == '\0') { usCurrentPos--; }

    this->m_usLength = usCurrentPos + 1; }

    size_t PascalString::Len() { return this->m_usLength; }

    const size_t PascalString::Len() const { return this->m_usLength; }

    char* PascalString::GetString() { return this->m_szString; }

    const char* PascalString::GetString() const { return this->m_szString; }

    const char* PascalString::GetCString() { this->Append("\0", 1); return this->m_szString; }

    void PascalString::_Allocate(unsigned short usLength) { unsigned short usAllocSize = DEFAULT_ALLOC_SIZE; if (usLength > DEFAULT_ALLOC_SIZE) { usAllocSize = usLength; }

    this->m_szString = new char[usAllocSize]; this->m_usBufferLength = usAllocSize; }

    void PascalString::_CheckGrow(unsigned short usLength) { if ((this->m_usLength + usLength) > this->m_usBufferLength) { unsigned short usSizeDiff; usSizeDiff = (this->m_usLength + usLength) - this->m_usBufferLength; unsigned short usGrowSize = DEFAULT_GROW_SIZE;

    if (usSizeDiff > DEFAULT_GROW_SIZE) { usGrowSize = usSizeDiff; }

    if (this->m_usBufferLength >= 60000) this->m_usBufferLength = this->m_usBufferLength;

    unsigned int uiNewSize = this->m_usBufferLength + usGrowSize; unsigned short usNewSize = static_cast<unsigned short>(uiNewSize); if (uiNewSize > PascalString::MAX_LENGTH) usNewSize = PascalString::MAX_LENGTH;

    char* szNew = new char[usNewSize]; memcpy(szNew, this->m_szString, sizeof(char) * this->m_usLength); delete[] this->m_szString; this->m_szString = szNew; this->m_usBufferLength = usNewSize; } }

    };

    Currently browsing [pascal_string.zip] (7,936 bytes) - [optstring_test/optimized_string/pascalstring.h] - (1,256 bytes)

    #ifndef PASCALSTRING_H
    #define PASCALSTRING_H

    #include <memory.h> #include <string.h>

    namespace optimized_string {

    class PascalString { public: enum { DEFAULT_ALLOC_SIZE = 128, DEFAULT_GROW_SIZE = 128, MAX_LENGTH = 65535, };

    public: // constructor / destructor PascalString(); PascalString(const PascalString& strOther); PascalString(const char* szString, const size_t& Length); ~PascalString();

    // set / reset void Reset(); void Set(const PascalString& Other); void Set(const char* szString, const unsigned short usLength);

    // basic string functionality void Append(const PascalString& Other, const size_t& Length); void Append(const char* szString, const size_t& Length); const bool Compare(const PascalString& Other) const; const bool Compare(const char* szString, const size_t& Length) const; void LTrim(); void RTrim(); size_t Len(); const size_t Len() const; char* GetString(); const char* GetString() const; const char* GetCString();

    protected: void _Allocate(unsigned short usLength); void _CheckGrow(unsigned short usLength);

    protected: unsigned short m_usLength; char* m_szString; unsigned short m_usBufferLength; };

    };

    #endif // PASCALSTRING_H

    Currently browsing [pascal_string.zip] (7,936 bytes) - [optstring_test/optimized_string/string.cpp] - (7,453 bytes)

    #include "string.h"

    namespace optimized_string {

    String::String() { this->m_pString = new PascalString(); this->m_Type = PASCAL_STRING; }

    String::String(const String& Other) : m_pString(NULL) { this->Set(Other); }

    String::String(const char* szString) : m_pString(NULL) { this->Set(szString); }

    String::~String() { if (this->m_pString) { delete this->m_pString; } }

    void String::Reset() { if (this->m_pString) { delete this->m_pString; this->m_pString = NULL; } }

    void String::Set(const String& Other) { this->Reset();

    // CONSTRUCTION SET if (Other.m_Type == PASCAL_STRING) { const PascalString* pOther; pOther = reinterpret_cast<const PascalString*>(Other.m_pString); this->m_pString = new PascalString(*pOther); this->m_Type = PASCAL_STRING; } else { const CString* pOther; pOther = reinterpret_cast<const CString*>(Other.m_pString); this->m_pString = new CString(*pOther); this->m_Type = C_STRING; } }

    void String::Set(const char* szString) { this->Reset();

    size_t Size = strlen(szString); if (Size > PascalString::MAX_LENGTH) { // MAKE C-String this->m_pString = new CString(szString, Size); this->m_Type = C_STRING; } else { this->m_pString = new PascalString(szString, static_cast<unsigned short>(Size)); this->m_Type = PASCAL_STRING; } }

    void String::Append(const String& Other) { size_t Len = Other.Len();

    if (this->m_Type == PASCAL_STRING) { PascalString* pThis = reinterpret_cast<PascalString*>(this->m_pString); if ((Len + pThis->Len()) <= PascalString::MAX_LENGTH) { if (Other.m_Type == PASCAL_STRING) { const PascalString* pOther; pOther = reinterpret_cast<const PascalString*>(Other.m_pString); pThis->Append(*pOther, Len); } else { const CString* pOther; pOther = reinterpret_cast<const CString*>(Other.m_pString); pThis->Append(pOther->GetCString(), Len); } } else { // CONVERSION const char* szOtherString = NULL; if (Other.m_Type == PASCAL_STRING) { const PascalString* pOther; pOther = reinterpret_cast<const PascalString*>(Other.m_pString); szOtherString = pOther->GetString(); } else { const CString* pOther; pOther = reinterpret_cast<const CString*>(Other.m_pString); szOtherString = pOther->GetCString(); }

    size_t NewLen = pThis->Len() + Len; char* szNewString = new char[NewLen + 1]; memcpy(szNewString, pThis->GetString(), sizeof(char) * pThis->Len()); memcpy(&szNewString[pThis->Len()], szOtherString, sizeof(char) * Len); szNewString[NewLen] = 0;

    this->Reset(); this->m_pString = new CString(szNewString, NewLen); this->m_Type = C_STRING; } } else { CString* pThis = reinterpret_cast<CString*>(this->m_pString); if (Other.m_Type == PASCAL_STRING) { const PascalString* pOther; pOther = reinterpret_cast<const PascalString*>(Other.m_pString); pThis->Append(pOther->GetString(), Len); } else { const CString* pOther; pOther = reinterpret_cast<const CString*>(Other.m_pString); pThis->Append(*pOther, Len); } } }

    void String::Append(const char* szString) { size_t Len = strlen(szString);

    if (this->m_Type == PASCAL_STRING) { PascalString* pThis = reinterpret_cast<PascalString*>(this->m_pString); if ((Len + pThis->Len()) <= PascalString::MAX_LENGTH) { pThis->Append(szString, Len); } else { // CONVERSION size_t NewLen = pThis->Len() + Len; char* szNewString = new char[NewLen + 1]; memcpy(szNewString, pThis->GetString(), sizeof(char) * pThis->Len()); memcpy(&szNewString[pThis->Len()], szString, sizeof(char) * Len); szNewString[NewLen] = 0;

    this->Reset(); this->m_pString = new CString(szNewString, NewLen); this->m_Type = C_STRING; } } else { CString* pThis = reinterpret_cast<CString*>(this->m_pString); pThis->Append(szString, Len); } }

    const bool String::Compare(const String& Other) const { if (!this->m_pString) { return false; } else if (this->m_Type == PASCAL_STRING) { if (Other.m_Type != PASCAL_STRING) { return false; } else { const PascalString* pThis; pThis = reinterpret_cast<const PascalString*>(this->m_pString); const PascalString* pOther; pOther = reinterpret_cast<const PascalString*>(Other.m_pString); return pThis->Compare(*pOther); } } else if (this->m_Type == C_STRING) { const CString* pThis; pThis = reinterpret_cast<const CString*>(this->m_pString); const CString* pOther; pOther = reinterpret_cast<const CString*>(Other.m_pString); return pThis->Compare(*pOther); }

    return false; }

    const bool String::Compare(const char* szString) const { if (!this->m_pString) { return false; } else if (this->m_Type == PASCAL_STRING) { size_t Len = strlen(szString); if (Len < PascalString::MAX_LENGTH) { const PascalString* pThis; pThis = reinterpret_cast<const PascalString*>(this->m_pString); return pThis->Compare(szString, Len); } else { return false; } } else if (this->m_Type == C_STRING) { size_t Len = strlen(szString); if (Len < PascalString::MAX_LENGTH) { return false; } else { const CString* pThis; pThis = reinterpret_cast<const CString*>(this->m_pString); return pThis->Compare(szString, Len); } }

    return false; }

    void String::LTrim() { if (!this->m_pString) return;

    if (this->m_Type == PASCAL_STRING) { PascalString* pThis = reinterpret_cast<PascalString*>(this->m_pString); pThis->LTrim(); } else if (this->m_Type == C_STRING) { CString* pThis = reinterpret_cast<CString*>(this->m_pString); pThis->LTrim(); } }

    void String::RTrim() { if (!this->m_pString) return;

    if (this->m_Type == PASCAL_STRING) { PascalString* pThis = reinterpret_cast<PascalString*>(this->m_pString); pThis->RTrim(); } else if (this->m_Type == C_STRING) { CString* pThis = reinterpret_cast<CString*>(this->m_pString); pThis->RTrim(); } }

    size_t String::Len() { if (!this->m_pString) return 0;

    if (this->m_Type == PASCAL_STRING) { PascalString* pThis = reinterpret_cast<PascalString*>(this->m_pString); return pThis->Len(); } else if (this->m_Type == C_STRING) { CString* pThis = reinterpret_cast<CString*>(this->m_pString); return pThis->Len(); }

    return 0; }

    const size_t String::Len() const { if (!this->m_pString) return 0;

    if (this->m_Type == PASCAL_STRING) { const PascalString* pThis; pThis = reinterpret_cast<const PascalString*>(this->m_pString); return pThis->Len(); } else if (this->m_Type == C_STRING) { const CString* pThis; pThis = reinterpret_cast<const CString*>(this->m_pString); return pThis->Len(); }

    return 0; }

    const char* String::GetCString() { if (!this->m_pString) return NULL;

    if (this->m_Type == PASCAL_STRING) { PascalString* pThis = reinterpret_cast<PascalString*>(this->m_pString); return pThis->GetCString(); } else if (this->m_Type == C_STRING) { CString* pThis = reinterpret_cast<CString*>(this->m_pString); return pThis->GetCString(); }

    return NULL; }

    };

    Currently browsing [pascal_string.zip] (7,936 bytes) - [optstring_test/optimized_string/string.h] - (1,536 bytes)

    #ifndef STRING_H
    #define STRING_H

    #include <string.h> #include "pascalstring.h" #include "cstring.h"

    namespace optimized_string {

    class String { public: enum TYPE { PASCAL_STRING, C_STRING, };

    public: // constructor / destructor String(); String(const String& Other); String (const char* szString); ~String();

    // set / reset void Reset(); void Set(const String& Other); void Set(const char* szString);

    // basic string functionality void Append(const String& Other); void Append(const char* szString); const bool Compare(const String& Other) const; const bool Compare(const char* szString) const; void LTrim(); void RTrim(); size_t Len(); const size_t Len() const; const char* GetCString();

    // operators inline String& operator = (const String& Other) { this->Set(Other); return (*this); } inline String& operator = (const char* szString) { this->Set(szString); return (*this); } inline String& operator + (const String& Other) { this->Append(Other); return (*this); } inline String& operator + (const char* szString) { this->Append(szString); return (*this); } inline const bool operator == (const String& Other) { return this->Compare(Other); } inline const bool operator == (const char* szString) { return this->Compare(szString); } inline operator const char* () { return this->GetCString(); }

    protected: void* m_pString; TYPE m_Type; };

    };

    #endif // STRING_H

    The zip file viewer built into the Developer Toolbox made use of the zlib library, as well as the zlibdll source additions.

     

    Copyright 1999-2008 (C) FLIPCODE.COM and/or the original content author(s). All rights reserved.
    Please read our Terms, Conditions, and Privacy information.