Tuesday 24 March 2009

Awk Scripting, combining columns

If you have multiple files named filename.<number>, e.g. filename.0, filename.1, filename.2, filename.3 etc., then you can average selected columns across those files using awk as below:

You need to know how many files you have (here num_files=16) and the total number of columns per file (here tot_num_cols=13). Here I wanted to average columns 1 and 4. The paste command joins the files side by side, line by line, so column k of file j (counting j from 0) becomes field k + tot_num_cols*j of the pasted line. Piping that to awk, the script writes the averages of columns 1 and 4 across the files to a file called anewfile.txt:

paste filename.* | awk 'BEGIN{num_col=4;num_files=16;timecol=1;tot_num_cols=13} {c=0; for (j=0;j<num_files;j++){c+=$(timecol+tot_num_cols*j)}; s=0; for (i=0;i<num_files;i++){s+=$(num_col+tot_num_cols*i)}; printf("%20.12f %20.12f \n",c/num_files,s/num_files)}' > anewfile.txt
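To check the field arithmetic before running it on real data, here is a minimal sketch on made-up input. The file names demo.0 and demo.1, their values, the scratch directory and the shorter printf format are all assumptions for illustration: two files with three columns each, averaging columns 1 and 3.

```shell
# Work in a scratch directory so nothing real is overwritten (illustrative only).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two made-up files, three columns each.
printf '1 10 100\n2 20 200\n' > demo.0
printf '3 30 300\n4 40 400\n' > demo.1

# Column k of file j (j counted from 0) is field k + tot_num_cols*j of the pasted line.
paste demo.0 demo.1 | awk 'BEGIN{num_col=3; num_files=2; timecol=1; tot_num_cols=3}
{
  c=0; s=0
  for (j=0; j<num_files; j++) {
    c += $(timecol + tot_num_cols*j)   # running sum of column 1 across files
    s += $(num_col + tot_num_cols*j)   # running sum of column 3 across files
  }
  printf("%.1f %.1f\n", c/num_files, s/num_files)
}' > demo_avg.txt

cat demo_avg.txt
```

With this input the two output lines are "2.0 200.0" and "3.0 300.0", i.e. the row-by-row means of columns 1 and 3 over the two files.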

1 comment:

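  1. Also, to calculate the error associated with the averages you have just output, you can use awk again in a similar way. Beware: here I create an intermediate file called afile. To find the error of column 4 across the files, using anewfile.txt from above, use the commands below. Because anewfile.txt is pasted first and contributes two columns, the fields from the data files are shifted by 2, and av_col=2 picks out the average of column 4. The value printed is the root-mean-square deviation from that average, sqrt(c/num_files).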

    paste anewfile.txt filename.* | awk 'BEGIN{av_col=2;num_col=4;num_files=16;tot_num_cols=13} {c=0; for (j=0;j<num_files;j++){c+=($(num_col + tot_num_cols*j + 2)-$(av_col))^2}; printf("%20.12f \n",sqrt(c/num_files))}' > afile
    paste anewfile.txt afile > file_with_error.txt
    rm afile
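The same steps can be tried on made-up data first. This is only a sketch under assumed inputs: the names demo.0, demo.1 and demo_avg.txt, their values, and the scratch directory are hypothetical, with two files of three columns each and the error taken on column 3.

```shell
# Work in a scratch directory so nothing real is overwritten (illustrative only).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Made-up data: two files, three columns each.
printf '1 10 100\n2 20 200\n' > demo.0
printf '3 30 300\n4 40 400\n' > demo.1
# Averages of columns 1 and 3, as the averaging step would have written them.
printf '2 200\n3 300\n' > demo_avg.txt

# demo_avg.txt is pasted first and adds 2 fields, so data columns are shifted by 2.
paste demo_avg.txt demo.0 demo.1 | awk 'BEGIN{av_col=2; num_col=3; num_files=2; tot_num_cols=3}
{
  c=0
  for (j=0; j<num_files; j++) {
    d = $(num_col + tot_num_cols*j + 2) - $(av_col)   # deviation from the average
    c += d*d
  }
  printf("%.1f\n", sqrt(c/num_files))   # root-mean-square deviation
}' > demo_err.txt

# Put averages and error side by side, then drop the intermediate file.
paste demo_avg.txt demo_err.txt > demo_with_error.txt
rm demo_err.txt
cat demo_with_error.txt
```

Here each row of column 3 deviates from its average by -100 and +100, so the error column comes out as 100.0 on both lines.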
