I have been debugging a piece of legacy code running on an XScale (ARMv5TE) system with Linux that crashes reproducibly.
I debugged it with gdb and set MALLOC_CHECK_ to 1. It's a lot of code, so here are just some snippets.
We have this structure:
typedef struct {
    ...clipped..
    char **data_column_list;
    /** data column count */
    int data_column_cnt;
    ...clipped
} csv_t;
We initialize the columns in a function, putting them in a variable "columns"
/* Allocating memory for pointer to every register id */
columns = (char **) malloc(column_cnt * sizeof(char *));
column_cnt = 0;
/* loop over all sensors */
for(i=0; i<cfg.sen_cnt; i++) {
    /* loop over all registers */
    for(j=0; j<cfg.sen_list[i]->data_cnt; j++) {
        /* Storing all the pointers to id */
        columns[column_cnt++] = cfg.sen_list[i]->data_list[j]->id;
    }
}
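The part that computes column_cnt before the malloc is clipped from the snippet above. To make the pattern concrete, here is a minimal self-contained sketch of how I understand it, not the real code: the types reg_t, sensor_t and config_t and the function build_columns are hypothetical, reduced stand-ins. It counts the registers in a first pass, allocates the pointer array, and then stores borrowed pointers to the ids.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-ins for the real config types. */
typedef struct { char *id; } reg_t;
typedef struct { reg_t **data_list; int data_cnt; } sensor_t;
typedef struct { sensor_t **sen_list; int sen_cnt; } config_t;

/* Builds an array of borrowed pointers to every register id.
 * The caller owns the array itself, but not the strings.
 * Returns NULL (with *out_cnt == 0) if there is nothing to list
 * or if the allocation fails. */
char **build_columns(const config_t *cfg, int *out_cnt)
{
    int i, j, total = 0, n = 0;
    char **columns;

    assert(cfg != NULL && out_cnt != NULL);
    *out_cnt = 0;

    /* First pass: count all registers so the array is sized correctly. */
    for (i = 0; i < cfg->sen_cnt; i++)
        total += cfg->sen_list[i]->data_cnt;
    if (total <= 0)
        return NULL;

    columns = malloc((size_t)total * sizeof *columns);
    if (columns == NULL)
        return NULL;

    /* Second pass: store the pointers to the ids. */
    for (i = 0; i < cfg->sen_cnt; i++)
        for (j = 0; j < cfg->sen_list[i]->data_cnt; j++)
            columns[n++] = cfg->sen_list[i]->data_list[j]->id;

    *out_cnt = n;
    return columns;
}
```

The array returned here only needs a single free(); the id strings belong to the config structures and are freed with them.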
In another function, what happens is this:
/* free the previous list */
csv_free(lc_csv);
lc_csv->data_column_list = columns;
lc_csv->data_column_cnt = column_cnt;
csv_free being:
void csv_free(csv_t *csv) {
    if(csv->data_column_cnt > 0)
        free(csv->data_column_list);
    csv->data_column_cnt = 0;
}
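For comparison, here is a more defensive variant I am considering. It is only a sketch under assumptions: csv_t is reduced to the two fields shown above, csv_free_safe is a hypothetical name, and data_column_list is assumed to always be either NULL or a pointer returned by malloc (so the struct must be zero-initialized before first use). It does not rely on data_column_cnt to decide whether to free, and it clears the pointer so that a second call is harmless.

```c
#include <assert.h>
#include <stdlib.h>

/* Reduced version of csv_t: only the two fields from the snippet above. */
typedef struct {
    char **data_column_list;
    int data_column_cnt;
} csv_t;

/* Frees only the pointer array, never the strings it points to.
 * free(NULL) is a no-op, so no count check is needed. */
void csv_free_safe(csv_t *csv)
{
    assert(csv != NULL);
    free(csv->data_column_list);
    csv->data_column_list = NULL; /* makes a repeated call harmless */
    csv->data_column_cnt = 0;
}
```

This would not by itself explain why glibc rejects the pointer, but it rules out a double free through this function.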
Now, there is another function that builds the whole "cfg" (config) structure containing these ids.
From the code above: cfg.sen_list[i]->data_list[j]->id, where cfg is a struct, sen_list is an array of pointers to structs, and data_list is an array of pointers to other structs, each of which contains a string "id".
When the program receives SIGUSR1, the config is updated: all of these data_list and sen_list structs are freed, and then new ones are generated.
Then the first function generates new columns of ids and puts them into the csv structure, but the old list is freed first.
That's where it crashes: in csv_free.
*** glibc detected *** /root/elv: free(): invalid pointer: 0x0001ae88 ***
My understanding is this: you have an array of pointers. Besides freeing what the individual pointers point to, you also have to free the array itself, i.e. the pointer to the block of pointers.
Or, in code terms, the situation above should be analogous to:
char **ar = malloc(n * sizeof(char *));
char *xn = malloc(10 * sizeof(char)); // repeated for each of the n strings
...
ar[i] = xn; // for each i from 0 to n-1
...do stuff...
free(xn); // repeated for each of the n strings
free(ar);
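Spelled out as compilable code, this is the ownership model I have in mind. It is only a sketch: make_strings and free_strings are hypothetical helper names, and in my real code the strings are freed by the config code, not together with the array.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Allocates an array of n pointers, each pointing to its own
 * 10-byte string buffer. Returns NULL on allocation failure. */
char **make_strings(size_t n)
{
    size_t i;
    char **ar = malloc(n * sizeof *ar);

    if (ar == NULL)
        return NULL;
    for (i = 0; i < n; i++) {
        ar[i] = malloc(10);
        if (ar[i] == NULL) {
            /* Undo the partial allocation before giving up. */
            while (i > 0)
                free(ar[--i]);
            free(ar);
            return NULL;
        }
        strcpy(ar[i], "id");
    }
    return ar;
}

/* Frees every string first, then the array that held the pointers. */
void free_strings(char **ar, size_t n)
{
    size_t i;

    assert(ar != NULL || n == 0);
    if (ar == NULL)
        return;
    for (i = 0; i < n; i++)
        free(ar[i]);
    free(ar);
}
```

Freeing the strings never touches the block that ar points to, so free(ar) should remain valid whether or not the strings are still alive.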
When the structs containing the id strings are freed, my pointer array still holds the (now invalid) pointers, not null pointers:
(gdb) p csv
$40 = {sysid = 222, ip = '' <repeats 49 times>,
module = "elv_v2", '' <repeats 14 times>, format_type = 1, msg_id = 0,
data_column_list = 0x1ae88, data_column_cnt = 10, pub_int = 30,
line_cnt = 0, pub_seq = -1, format = 0x18260}
(gdb) p csv.data_column_list[0]
$41 = 0x1b378 "0"
But I get the above error message (or SIGABRT without MALLOC_CHECK_).
I don't understand this at all. I have to free this array of pointers, or it will become a memory leak. I could not find any other call to free before that one. I don't know why csv.data_column_list is considered an invalid pointer.
Valgrind is unfortunately not available on ARMv5TE :(
I have been debugging this for hours and hours and would be grateful for any help.
Thank you very much,
Cheers,
Ben
UPDATE:
I'm wondering if it could be connected to some "scope" issue. There is almost identical code in another application, which works. The function that crashes, "csv_free", is used by both programs (statically linked). The only difference is that the struct containing the pointer to be freed is declared and defined normally in the working program, whereas in the crashing program it is declared as extern and defined in a file other than main.c.
Calling "free" manually in main.c works, while calling "csv_free" crashes. Riddle me this...
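One check I can think of for this theory is to compare how each translation unit sees the struct. If main.c and the file containing csv_free were compiled against different definitions of csv_t (different headers, packing, or #ifdefs), the offset of data_column_list would differ and csv_free would read the wrong bytes. Below is a sketch with a reduced csv_t and a hypothetical helper csv_layout_describe; the idea is to compile a copy into each file against that file's own headers and compare the strings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Reduced stand-in; the real check must use the real csv_t
 * from whatever header each file includes. */
typedef struct {
    int sysid;
    char **data_column_list;
    int data_column_cnt;
} csv_t;

/* Writes the struct layout as seen by this translation unit into buf.
 * Returns the number of characters snprintf would have written. */
int csv_layout_describe(char *buf, size_t len, const char *who)
{
    assert(buf != NULL && who != NULL);
    return snprintf(buf, len, "%s: sizeof=%lu list@%lu cnt@%lu",
                    who,
                    (unsigned long)sizeof(csv_t),
                    (unsigned long)offsetof(csv_t, data_column_list),
                    (unsigned long)offsetof(csv_t, data_column_cnt));
}
```

If the two strings differ apart from the file name, the problem is a mismatched struct definition rather than the free itself.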