How to awk every nth line starting from different lines each iteration
I would like to use awk to print every nth line of a file, starting at line 0. Then, after awk has gone through the whole file, print every nth line starting at line 1, then every nth line starting at line 2, and so on, finishing with every nth line starting at line n-1. My sad attempt so far:
    #!/bin/bash
    rm *.sad *.sadd *.out
    #create loop index
    for i in $(seq 20 1 36); do
        listm+=($i)
    done
    #create input file
    for j in "${listm[@]}"; do
        if [ $j -eq 20 ]; then
            awk 'NR % 20 == 0' vel_vmdout > atomvel.dat
            awk '{print $2,$3,$4}' atomvel.dat > velocity.dat
        else
            awk 'NR % 20 == 1' vel_vmdout > $j.sad
            egrep -v "^[[:space:]]*$|^#" $j.sad > $j.sadd
            awk '{print $2, $3, $4}' $j.sadd > $j.out
            paste velocity.dat $j.out > taste
        fi
    done
Let me try to clarify by providing the input and what the output should look like. The input is an XYZ file from an MD simulation, consisting of frames of the atoms' xyz coordinates.
Input:
http://i61.tinypic.com/2l8hz79.jpg (image: input)
This image shows the first snapshot and part of the second snapshot. Because these are snapshots, the ordering of the atoms does not change. Thus, I am trying to print the xyz coordinates from each snapshot for each specific atom in its own columns, as shown below. This would make a file consisting of 3N columns, where N is the number of atoms.
Output:
http://i60.tinypic.com/i3v1ax.png (image: output)
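As you can see, each atom's coordinates are in their own columns, and the total file is an n x 3N array, with one row per snapshot. The bash script is me trying to do this, but only for the first 2 atoms. I wanted to print every nth line (the coordinates of the nth atom) to the output. I appreciate your patience, all.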
Generating sample data
This step should not be necessary; the question should have included usable sample data and the required output for that sample data.
At one level, this won't help you because you don't have my random number generator program, but the script below shows how I generated the data that follows, and illustrates the lengths it might be necessary to go to when the question doesn't supply readable data. The generated data looks similar to the data in the question (at least superficially):
    18
     generated VMD in absentia
     C   0.979485 -6.665347 0.575383
     C   1.191999 -3.002386 2.859484
     C   3.151517 -5.610077 0.429413
     C   3.439828 -6.454984 1.319724
     C   3.726201 -0.123038 2.096854
     C   1.363325 -3.031238 0.016019
     C   6.090283 -3.915340 2.396358
     C   0.407755 -7.957784 -0.846842
     C   0.203074 -0.796428 2.659573
     O   2.600610 -2.259674 -0.260378
     O   4.773839 -6.765097 0.588508
     H   2.743424 -2.890016 2.906452
     H   2.810233 -6.641054 -0.797672
     H   6.854169 -3.191721 -0.925670
     O   2.914233 -1.060001 0.776983
     H   3.803923 -1.497032 2.908799
     H   5.669443 -7.227666 -0.647552
     H   0.092455 -5.850637 2.959987
    18
     generated VMD in absentia
     C   6.042840 -7.254720 2.093573
     C   2.551942 -6.044322 2.061072
     C   3.523150 -6.167163 2.451689
     C   5.197316 -3.429866 -0.412062
     C   2.548777 -6.422851 1.282846
     C   3.775197 -2.012031 1.377440
     C   3.405112 -3.206415 -0.879886
     C   1.448359 -5.419629 0.467291
     C   3.661964 -2.789234 2.644294
     O   4.214854 -2.439574 -0.951704
     O   5.297609 -2.320418 2.709898
     H   2.653940 -4.431080 -0.511743
     H   5.040635 -0.676199 -0.590970
     H   1.546725 -1.294582 2.562937
     O   4.231461 -7.180908 1.629901
     H   3.297836 -1.557133 -0.133280
     H   3.442481 -4.489962 2.111930
     H   1.423611 -7.982655 0.715618
    18
     generated VMD in absentia
     C   1.432495 -7.686243 2.525734
     C   5.038409 -4.976270 2.826846
     C   6.184137 -7.303094 2.711561
     C   3.208125 -0.606556 1.978725
     C   2.171859 -6.792060 0.678988
     C   6.521124 -5.622797 -0.773797
     C   1.725619 -5.768633 -0.223397
     C   3.602427 -2.325680 1.762008
     C   1.937521 -1.686895 1.743159
     O   0.745526 -0.114246 -0.949490
     O   4.754360 -6.531145 1.998913
     H   1.114732 -1.158810 1.486939
     H   6.410490 -5.411647 0.062737
     H   4.164330 -6.743763 1.802804
     O   2.587841 -3.979700 2.609748
     H   2.192073 -2.815376 -0.809569
     H   5.501795 -2.326438 1.325829
     H   3.285032 -1.212541 1.284453
    18
     generated VMD in absentia
     C   3.564424 -3.117406 -0.032879
     C   2.894745 -0.632591 0.532311
     C   3.384916 -5.383135 1.179585
     C   0.793488 -0.894539 -0.886891
     C   1.348785 -6.501867 1.648604
     C   2.189941 -2.438067 0.616090
     C   2.043378 -4.966472 0.691603
     C   3.124161 -5.792896 0.545362
     C   5.741472 -0.640590 2.825374
     O   0.300550 -7.149663 0.942726
     O   1.344387 -0.121382 2.169401
     H   4.963296 -0.964665 -0.230523
     H   6.651423 -4.905053 2.509626
     H   5.059694 -6.166516 0.102255
     O   5.046864 -3.288883 0.853948
     H   2.389007 -3.057664 1.806301
     H   2.365876 -0.956860 1.458959
     H   2.892502 -0.097422 -0.531714
The script used was:
    random -n $((4 * 18)) -t '%8:6[0:7]f %8:6[-8:0]f %8:6[-1:3]f' |
    awk 'BEGIN { n = split("CCCCCCCCCOOHHHOHHH", atoms, ""); atoms[0] = atoms[n] }
         NR % n == 1 { print n; print " generated VMD in absentia" }
         { print "", atoms[NR%18], " ", $0 }'
The -n option to random says how many rows to generate; I chose 72. The -t option is a template, and the notation %8:6[0:7]f means use a %8.6f format to print uniformly distributed random numbers between 0 and 7. The awk script takes the data generated and interpolates the noise (the number of atoms and a variant on the 'generated VMD' line), tagging the lines with the appropriate atomic symbol.
Processing sample data
Given that data, we need to munge it into the required output. This script more or less does the job. There are endless ways it should be improved, of course, such as taking file names from command line arguments, using temporary file names instead of fixed names, cleaning up the intermediate files, handling different compounds and different atoms (nitrogen, phosphorus, etc), and so on. However, you should be able to adapt it reasonably easily.
    input="data"
    output="output"
    n=$(sed 1q "$input")
    n2=$(($n+2))
    for ((i = 3; i <= n2; i++))
    do
        colno=$(printf "%.2d" $(($i-2)))
        awk -v n=$n2 -v r=$i \
        '   BEGIN { name["C"] = "carbon"; name["H"] = "hydrogen"; name["O"] = "oxygen"; r0 = r % n }
            NR > 2 && NR <= r { count[$1]++; }
            NR == r { printf "%-32.32s\n", name[$1] " " count[$1]; }
            NR % n == r0 { xyz = sprintf("%s %s %s", $2, $3, $4); printf "%-32.32s\n", xyz }
        ' "$input" > "column.$colno"
    done
    paste -d ' ' column.* > "$output"
The first 4 lines set the control parameters, reading the number of atoms per frame from the first line of the input file and adding 2 for the header lines to get the number of lines per frame. The for loop iterates over the offsets 3 to $n2 inclusive (skipping the 2 header lines), and runs the awk script. That encodes the atom types (BEGIN), determines which atom it is processing this time (NR > 2 && NR <= r, and NR == r), and arranges to print the triplets of data relevant to that atom. The formatting is organized so that the column headings and the actual xyz-triplets are uniformly spaced. These are written to a file column.$colno. When all's done, the column.* files are pasted to generate a single output file, which looks like this:
    carbon 1                         carbon 2                         carbon 3                         carbon 4                         carbon 5                         carbon 6                         carbon 7                         carbon 8                         carbon 9                         oxygen 1                         oxygen 2                         hydrogen 1                       hydrogen 2                       hydrogen 3                       oxygen 3                         hydrogen 4                       hydrogen 5                       hydrogen 6
    0.979485 -6.665347 0.575383      1.191999 -3.002386 2.859484      3.151517 -5.610077 0.429413      3.439828 -6.454984 1.319724      3.726201 -0.123038 2.096854      1.363325 -3.031238 0.016019      6.090283 -3.915340 2.396358      0.407755 -7.957784 -0.846842     0.203074 -0.796428 2.659573      2.600610 -2.259674 -0.260378     4.773839 -6.765097 0.588508      2.743424 -2.890016 2.906452      2.810233 -6.641054 -0.797672     6.854169 -3.191721 -0.925670     2.914233 -1.060001 0.776983      3.803923 -1.497032 2.908799      5.669443 -7.227666 -0.647552     0.092455 -5.850637 2.959987
    6.042840 -7.254720 2.093573      2.551942 -6.044322 2.061072      3.523150 -6.167163 2.451689      5.197316 -3.429866 -0.412062     2.548777 -6.422851 1.282846      3.775197 -2.012031 1.377440      3.405112 -3.206415 -0.879886     1.448359 -5.419629 0.467291      3.661964 -2.789234 2.644294      4.214854 -2.439574 -0.951704     5.297609 -2.320418 2.709898      2.653940 -4.431080 -0.511743     5.040635 -0.676199 -0.590970     1.546725 -1.294582 2.562937      4.231461 -7.180908 1.629901      3.297836 -1.557133 -0.133280     3.442481 -4.489962 2.111930      1.423611 -7.982655 0.715618
    1.432495 -7.686243 2.525734      5.038409 -4.976270 2.826846      6.184137 -7.303094 2.711561      3.208125 -0.606556 1.978725      2.171859 -6.792060 0.678988      6.521124 -5.622797 -0.773797     1.725619 -5.768633 -0.223397     3.602427 -2.325680 1.762008      1.937521 -1.686895 1.743159      0.745526 -0.114246 -0.949490     4.754360 -6.531145 1.998913      1.114732 -1.158810 1.486939      6.410490 -5.411647 0.062737      4.164330 -6.743763 1.802804      2.587841 -3.979700 2.609748      2.192073 -2.815376 -0.809569     5.501795 -2.326438 1.325829      3.285032 -1.212541 1.284453
    3.564424 -3.117406 -0.032879     2.894745 -0.632591 0.532311      3.384916 -5.383135 1.179585      0.793488 -0.894539 -0.886891     1.348785 -6.501867 1.648604      2.189941 -2.438067 0.616090      2.043378 -4.966472 0.691603      3.124161 -5.792896 0.545362      5.741472 -0.640590 2.825374      0.300550 -7.149663 0.942726      1.344387 -0.121382 2.169401      4.963296 -0.964665 -0.230523     6.651423 -4.905053 2.509626      5.059694 -6.166516 0.102255      5.046864 -3.288883 0.853948      2.389007 -3.057664 1.806301      2.365876 -0.956860 1.458959      2.892502 -0.097422 -0.531714
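As one possible way to act on the improvements listed earlier, the following sketch builds the same table in a single pass over the input, without the intermediate column.* files. It is an alternative, not the script above. The function name xyz_to_columns is hypothetical, and the sketch assumes every frame has the same number of atoms in the same order, as the question states.

```shell
# Hypothetical single-pass variant: xyz_to_columns FILE
# Prints one heading row, then one row of xyz triplets per frame.
xyz_to_columns() {
    awk '
    BEGIN { name["C"] = "carbon"; name["H"] = "hydrogen"; name["O"] = "oxygen" }
    NR == 1 { n = $1; period = n + 2 }       # atoms per frame; lines per frame
    {
        pos = (NR - 1) % period + 1          # 1-based position within the frame
        if (pos <= 2) next                   # skip the two header lines
        if (NR <= period) {                  # first frame only: build the headings
            count[$1]++
            cell = sprintf("%-32.32s", name[$1] " " count[$1])
            head = (head == "" ? cell : head " " cell)
        }
        cell = sprintf("%-32.32s", $2 " " $3 " " $4)
        row = (row == "" ? cell : row " " cell)
        if (pos == period) {                 # last atom of the frame: emit the row
            if (NR == period) print head
            print row
            row = ""
        }
    }' "$1"
}

# Usage (assumed file names): xyz_to_columns data > output
```

Only the current frame is held in memory, so the approach should scale to long trajectories; the trade-off is that the awk program is a little denser than the one-column-per-run version.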
Your task is to understand why the various bits of the awk script inside the for loop are present. For example, why is r0 needed? (Hint: experiment without the r0 calculation, and use r in its place.)
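If you want to check your reasoning about r0, this small sketch lists which line numbers each form of the test selects over two frames. The helper names lines_with_r and lines_with_r0 are made up for illustration; they loop over line numbers directly rather than reading a file.

```shell
# lines_with_r N R: line numbers 1..2N for which (line % N == R)
lines_with_r() {
    awk -v n="$1" -v r="$2" 'BEGIN {
        for (nr = 1; nr <= 2 * n; nr++) if (nr % n == r) print nr
    }'
}

# lines_with_r0 N R: the same, but comparing against r0 = R % N
lines_with_r0() {
    awk -v n="$1" -v r="$2" 'BEGIN {
        r0 = r % n
        for (nr = 1; nr <= 2 * n; nr++) if (nr % n == r0) print nr
    }'
}

lines_with_r  20 3     # prints 3 and 23
lines_with_r0 20 3     # prints 3 and 23
lines_with_r  20 20    # prints nothing: a remainder modulo 20 is never 20
lines_with_r0 20 20    # prints 20 and 40
```

The two forms agree for every offset except the last one in the frame, which is where the column for the final atom would otherwise contain only its heading.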