
MBCS/Unicode enabled C++ string class.

TCHAR is the generic-text mapping data type. It is a Microsoft-specific extension and is not ANSI-compatible. I have used this extension to create a small prototype string class that builds in both MBCS and Unicode configurations. The class provides the basic functionality required of a minimal string class. I have named it "CStringUNI".
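To make the mapping concrete: in <tchar.h>, TCHAR resolves to wchar_t when _UNICODE is defined and to char otherwise, and the _T() macro and routines such as _tcslen switch between their wide and narrow forms in the same way. The sketch below is a simplified, portable imitation of that idea, not the real header. The names DEMO_UNICODE, DemoChar, DEMO_T, demo_strlen and greeting_length are hypothetical stand-ins invented for this illustration.

```cpp
// Simplified imitation of the generic-text mapping idea behind <tchar.h>.
// DEMO_UNICODE, DemoChar, DEMO_T and demo_strlen are hypothetical stand-ins
// for _UNICODE, TCHAR, _T and _tcslen; none of them exists in a real header.
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

#define DEMO_UNICODE 1   // remove this line to select the narrow-character build

#ifdef DEMO_UNICODE
typedef wchar_t DemoChar;                 // wide characters
#define DEMO_T(x) L##x                    // "abc" becomes L"abc"
inline std::size_t demo_strlen(const DemoChar* s) { return std::wcslen(s); }
#else
typedef char DemoChar;                    // narrow characters
#define DEMO_T(x) x                       // "abc" stays "abc"
inline std::size_t demo_strlen(const DemoChar* s) { return std::strlen(s); }
#endif

// The same source line compiles unchanged in both configurations.
inline std::size_t greeting_length()
{
    const DemoChar* text = DEMO_T("hello");
    assert(text != nullptr);
    return demo_strlen(text);
}
```

Because only the typedef and the macros change, code written against the generic names does not need to be edited when the build configuration switches.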
// Header File
#pragma once

#include <tchar.h> // TCHAR, _T() and the _tcs* routines

const int INIT_ALLOC_SIZE = 10;

class CStringUNI
{
private:
 TCHAR *m_szBuffer;
 TCHAR *AllocateMemory(size_t size);

public:
 CStringUNI();
 CStringUNI(const CStringUNI&);
 CStringUNI(const TCHAR*);
 CStringUNI& operator=(const TCHAR*);
 CStringUNI& operator=(const CStringUNI&);
 CStringUNI& operator+=(const CStringUNI&);
 
 // Access Operator
 TCHAR operator[](const size_t n)const;

 TCHAR *GetBuffer() const;

 bool operator==(const TCHAR*) const;
 bool operator==(const CStringUNI&) const;
 virtual ~CStringUNI();

 friend CStringUNI operator+(const CStringUNI& lhs, const CStringUNI& rhs);
};
// CPP file, the actual implementation file.

#include "StdAfx.h"
#include "StringUNI.h"
#include <string.h>

TCHAR *CStringUNI::AllocateMemory(size_t size)
{
 m_szBuffer = new TCHAR[size];

 return m_szBuffer;
}

CStringUNI::CStringUNI()
{
 m_szBuffer = NULL;
}

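CStringUNI::CStringUNI(const CStringUNI& source)
{
 // A default-constructed source has no buffer; copy that state as-is.
 if(source.m_szBuffer == NULL)
 {
  m_szBuffer = NULL;
  return;
 }

 const size_t nSizeSource = _tcslen(source.m_szBuffer);
 AllocateMemory(nSizeSource + 1);
 _tcscpy(m_szBuffer, source.m_szBuffer); // also writes the terminating null
}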

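CStringUNI::CStringUNI(const TCHAR* source)
{
 const size_t nSourceLen = _tcslen(source);
 AllocateMemory(nSourceLen + 1);
 // _tcscpy copies the characters and the terminating null, so no memset is
 // needed (memset counts bytes, not TCHARs, which matters in Unicode builds).
 _tcscpy(m_szBuffer, source);
}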

CStringUNI& CStringUNI::operator=(const TCHAR* source)
{
 // Copy first and release afterwards, so that assigning from our own
 // buffer (for example s = s) does not read freed memory.
 const size_t nSourceLen = _tcslen(source);
 TCHAR *pNewBuffer = new TCHAR[nSourceLen + 1];
 _tcscpy(pNewBuffer, source);

 delete[] m_szBuffer; // delete[] on NULL is a no-op
 m_szBuffer = pNewBuffer;

 return *this;
}

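CStringUNI& CStringUNI::operator=(const CStringUNI& source)
{
 // A default-constructed source has no buffer; treat it as an empty string.
 return operator=(source.m_szBuffer != NULL ? source.m_szBuffer : _T(""));
}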

CStringUNI& CStringUNI::operator+=(const CStringUNI& source)
{
 // Treat a missing buffer on either side as an empty string.
 const TCHAR *pLeft = (m_szBuffer != NULL) ? m_szBuffer : _T("");
 const TCHAR *pRight = (source.m_szBuffer != NULL) ? source.m_szBuffer : _T("");

 TCHAR *pTempChar = new TCHAR[_tcslen(pLeft) + _tcslen(pRight) + 1];
 _tcscpy(pTempChar, pLeft);
 _tcscat(pTempChar, pRight);

 delete[] m_szBuffer; // safe even when m_szBuffer is NULL
 m_szBuffer = pTempChar;

 return *this;
}

CStringUNI operator+(const CStringUNI& lhs, const CStringUNI& rhs)
{
 return CStringUNI(lhs) += rhs;
}

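bool CStringUNI::operator==(const TCHAR* source) const
{
 // A default-constructed object compares as an empty string.
 return (!_tcscmp(m_szBuffer != NULL ? m_szBuffer : _T(""), source));
}

bool CStringUNI::operator==(const CStringUNI& source) const
{
 return operator==(source.m_szBuffer != NULL ? source.m_szBuffer : _T(""));
}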

TCHAR CStringUNI::operator[](const size_t n)const
{
 return m_szBuffer[n];
}

TCHAR * CStringUNI::GetBuffer()const
{
 return m_szBuffer;
}

CStringUNI::~CStringUNI(void)
{
 if(m_szBuffer != NULL)
 {
  delete [] m_szBuffer;
  m_szBuffer = NULL;
 }
}
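An assignment operator for a class like this has to cope with the source aliasing the object's own buffer, for example s = s. One common way to handle that is to build the new copy first and release the old buffer only afterwards. The sketch below illustrates the ordering in portable C++, assuming wchar_t in place of TCHAR so that it compiles outside Windows. MiniString, Duplicate and c_str are hypothetical names used only for this illustration and are not part of CStringUNI.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// MiniString is a hypothetical, portable stand-in for CStringUNI that uses
// wchar_t directly instead of TCHAR.
class MiniString
{
    wchar_t* m_buf;

    // Returns a freshly allocated copy of src (an empty string if src is null).
    static wchar_t* Duplicate(const wchar_t* src)
    {
        if (src == nullptr) src = L"";
        const std::size_t len = std::wcslen(src);
        wchar_t* copy = new wchar_t[len + 1];
        std::wmemcpy(copy, src, len + 1);   // copies the terminator too
        return copy;
    }

public:
    MiniString(const wchar_t* src = L"") : m_buf(Duplicate(src)) {}
    MiniString(const MiniString& other) : m_buf(Duplicate(other.m_buf)) {}
    ~MiniString() { delete[] m_buf; }

    MiniString& operator=(const wchar_t* src)
    {
        wchar_t* fresh = Duplicate(src);  // copy while the old buffer is still valid
        delete[] m_buf;                   // only now release the old buffer
        m_buf = fresh;
        return *this;
    }

    MiniString& operator=(const MiniString& other)
    {
        return operator=(other.m_buf);
    }

    wchar_t operator[](std::size_t n) const
    {
        assert(n <= std::wcslen(m_buf));  // index must lie inside the buffer
        return m_buf[n];
    }

    const wchar_t* c_str() const { return m_buf; }
};
```

With this ordering, a statement such as s = s.c_str() reads the old characters before they are freed, so no separate self-assignment check is required.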
The following lines demonstrate the usage of this class.
// effString.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include "StringUNI.h"
#include <locale.h>
#include <windows.h>

int _tmain(int argc, _TCHAR* argv[])
{
 
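 // _T() and _tprintf keep the sample buildable in both MBCS and Unicode
 // configurations; L"..." literals only match TCHAR in Unicode builds.
 _tsetlocale(LC_ALL, _T("Japanese"));
 CStringUNI objString(_T("日本語がわかりません"));

 CStringUNI objString1(_T("日本語でなんと言いますか"));

 objString += objString1;
 
 _tprintf(_T("%s\n"), objString.GetBuffer());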
 return 0;
}

In the lines above, I create two objects of type CStringUNI, concatenate the second onto the first, and print the concatenated string to the console.
When this program is run after setting the system locale to Japanese, the console shows the two strings joined together.
