Coding Guidelines/String Functions: Difference between revisions

From wiki.zmanda.com
== In general ==

Amanda uses GLib to manage strings. You should, in particular, see these pages:

* [http://wiki.zmanda.com/glib-docs/glib/glib-String-Utility-Functions.html string utility functions];
* [http://wiki.zmanda.com/glib-docs/glib/glib-Strings.html the GString data type];
* [http://wiki.zmanda.com/glib-docs/glib/glib-Pointer-Arrays.html the GPtrArray data type].

== String utility functions: dynamically allocated strings ==

GLib's string utilities contain a lot of very convenient functions to dynamically allocate strings. Examples:
 
<pre>
/* Duplicate a plain string */
char *greeting = g_strdup("Hello world");
char *copy = g_strdup(othervariablehere);

/* Allocate a directly formatted string */
char *formatted = g_strdup_printf("Hello %s", world);

/* Concatenate a NULL-terminated list of strings together */
char *joined = g_strconcat(string1, string2, string3, NULL);
</pre>

Each of these functions returns a newly allocated string. Finally, use <tt>g_free()</tt> to free the strings.
 
== String utility functions: splitting and joining ==
 
Think before splitting a string by hand with <tt>strchr()</tt> or <tt>strstr()</tt>. GLib has <tt>g_strsplit()</tt>, which takes the string, the separator, and a maximum number of pieces (0 for no limit):

<pre>
/* Split a string against a separator */
gchar **strings = g_strsplit(thebigstring, "\n", 0);
/* Work with strings, and, when done: */
g_strfreev(strings);
</pre>
 
The returned pointer array is guaranteed to be terminated with a NULL pointer. So, to walk the resulting array:

<pre>
gchar **strings = g_strsplit(...);
gchar **ptr;

for (ptr = strings; *ptr; ptr++) {
    /* do something with *ptr here */
}
</pre>
 
One thing, though: if the string you are trying to split ends with the separator, then the last element before the NULL will be an empty string. So, test for it if you don't want to process it, or remove the trailing separator from the string before splitting.
 
To join, you have the choice between <tt>g_strjoin()</tt> and <tt>g_strjoinv()</tt>. The first takes a NULL-terminated list of arguments, while the second takes a NULL-terminated string array. Both return a newly allocated string:

<pre>
char *joined = g_strjoin(" ", str1, str2, NULL);

gchar **strings = ...;
char *joinedv = g_strjoinv(" ", strings);
</pre>
 
== Other string utility functions ==
 
GLib has "locale-safe" string functions, in the sense that they are guaranteed to behave the same whatever your locale is set to. This is particularly useful when classifying characters: the result of <tt>isalpha</tt>, for instance, varies depending on the locale, whereas GLib's <tt>g_ascii_isalpha</tt> always behaves the same.
 
Also, have you ever been bitten by the behavior of the <tt>strcmp</tt> family of functions, which return 0 when the strings match? Consider using <tt>g_str_equal</tt>, which returns TRUE when the two strings are equal.
 
== GString ==
 
GString is a growable string buffer that you can append to, prepend to, and so on. Its usage is quite simple, but you must decide what to do with the object when you are finished with it. Either you want to get the stored string back, or you don't:
 
<pre>
GString *strbuf = g_string_new(NULL); /* This creates an empty string */
g_string_append(strbuf, "some string");
g_string_append_printf(strbuf, "Hello %s", "Mars");
g_string_append_c(strbuf, '\n');
/* g_string_prepend(), etc etc - see the link above */

/*
 * When finished, do ONE of the following, never both.
 *
 * Either get the string stored in it while discarding the GString
 * (release p with g_free() later):
 */
char *p = g_string_free(strbuf, FALSE);
/*
 * Or completely scratch the buffer, including its contents:
 */
g_string_free(strbuf, TRUE);
</pre>
 
== GPtrArray ==
 
While this data type can store any kind of pointer, it is hugely convenient for building separator-based strings. For instance, here is how to build a space-separated string from the command line the program received:
 
<pre>
#include <stdlib.h>
#include <glib.h>
#include <glib/gprintf.h> /* for g_fprintf() */

int main(int argc, char **argv)
{
    int i;
    GPtrArray *array = g_ptr_array_new();
    gchar **strings;
    char *result;

    /* Borrow the argv pointers: the array does not own these strings */
    for (i = 0; i < argc; i++)
        g_ptr_array_add(array, argv[i]);

    /* NEVER FORGET THAT! g_strjoinv() needs the terminating NULL */
    g_ptr_array_add(array, NULL);

    /* Discard the GPtrArray wrapper but keep its pointer data */
    strings = (gchar **)g_ptr_array_free(array, FALSE);

    result = g_strjoinv(" ", strings);

    /*
     * BEWARE here: if your GPtrArray contains dynamically allocated strings
     * that it owns, then you should use g_strfreev(). BUT if it contains only
     * strings owned elsewhere (statically allocated, or argv as here), you
     * MUST use g_free().
     */

    g_free(strings);

    g_fprintf(stderr, "My command line was: \"%s\"\n", result);

    g_free(result);

    exit(0);
}
</pre>
 
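ENSURE that your array contains either only statically allocated (or otherwise borrowed) strings or only dynamically allocated strings that it owns, but NEVER both mixed! With a mix, neither <tt>g_free()</tt> nor <tt>g_strfreev()</tt> cleans up correctly.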
 
Here is another example which duplicates all of its inputs. Look at the differences:
 
<pre>
#include <stdlib.h>
#include <glib.h>
#include <glib/gprintf.h> /* for g_fprintf() */

int main(int argc, char **argv)
{
    int i;
    GPtrArray *array = g_ptr_array_new();
    gchar **strings;
    char *result;

    /* Each element is a fresh copy owned by the array */
    for (i = 0; i < argc; i++)
        g_ptr_array_add(array, g_strdup(argv[i]));

    /* NEVER FORGET THAT! g_strjoinv() needs the terminating NULL */
    g_ptr_array_add(array, NULL);

    strings = (gchar **)g_ptr_array_free(array, FALSE);

    result = g_strjoinv(" ", strings);

    /*
     * Note: g_strfreev(), not g_free()! The copies must be freed too.
     */

    g_strfreev(strings);

    g_fprintf(stderr, "My command line was: \"%s\"\n", result);

    g_free(result);

    exit(0);
}
</pre>

Latest revision as of 18:10, 5 June 2011
