csv - Python: reading CSV files recursively in a directory tree and appending two of the columns to a data frame
I have a root directory that contains hundreds of sub-folders. I want to read the CSV file in each sub-folder; the file names are all the same, study.csv.
After reading the CSV files, I want to create a data frame that stores part of the data from them. The new data frame should contain 3 columns: one newly created column that marks the CSV file ID, and two columns taken from the CSV file's own columns.
For example, the structure of the original CSV file is:

    row1....
    row2....
    row3....
    row4: column1 column2 column3 column4 column5
    row5: 1 2 3 4 5
    row6: 2 4 2 1 10
    row7: 3 8 9 11 23
    ...
The expected data frame I want:

    new column  column3  column4
    1           3        4
    1           2        1
    1           2        1
    1           9        11
So I want to read each CSV file starting from row 4 and add a new column to the data frame whose value is the same for all rows that come from the same CSV file. You can regard the new column as the CSV file ID.
I found that os.walk can help me traverse the directory tree, but how can I read two specific columns from each CSV file while creating the new ID column accordingly?
To iterate over each CSV file in the root directory (including sub-folders), iterate on os.walk(), check that each file has the .csv file extension, and pass the full file path to process_file():
    import os

    for root, dirs, files in os.walk(root_dir):
        for fi in files:
            if fi.split(".")[-1] == 'csv':
                # os.path.join inserts the path separator that root + fi would omit
                process_file(os.path.join(root, fi))
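As a self-contained illustration of the traversal step, here is a minimal sketch that collects the paths of all CSV files under a root directory. The name find_csv_files is a hypothetical helper, not part of the original answer, and os.path.splitext is swapped in for splitting on "." so that a file named just "csv" with no extension is not matched.

```python
import os

def find_csv_files(root_dir):
    """Return a sorted list of paths to every .csv file under root_dir."""
    matches = []
    for root, dirs, files in os.walk(root_dir):
        for name in files:
            # splitext returns (stem, extension), e.g. ("study", ".csv")
            if os.path.splitext(name)[1].lower() == ".csv":
                matches.append(os.path.join(root, name))
    return sorted(matches)
```

Collecting the paths first keeps the traversal separate from the per-file processing, which makes each part easier to check on its own.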
Load each line of the CSV file into a list. You can then separate the values in each line with string.split(). Each value can then be referenced by its row number and column number: csv_file[row_num][col_num]
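The row and column indexing can be sketched as follows, using data shaped like the question's example. This assumes the files are comma-separated, as the answer's code does (the question's sample shows spaces), and it swaps in the standard csv module for string.split() because that also copes with quoted fields. The sample text is for illustration only.

```python
import csv
import io

# Sample text shaped like the question's file: three preamble rows, then a title row.
sample = (
    "row1\n"
    "row2\n"
    "row3\n"
    "column1,column2,column3,column4,column5\n"
    "1,2,3,4,5\n"
    "2,4,2,1,10\n"
    "3,8,9,11,23\n"
)

# Each row becomes a list of strings, so csv_file is a list of lists.
csv_file = list(csv.reader(io.StringIO(sample)))

title = csv_file[3][2]        # row index 3 is the title row, column index 2 is column3
first_value = csv_file[4][2]  # first data row, same column
print(title, first_value)     # prints: column3 3
```

Note that every value is read as a string; convert with int() or float() where numbers are needed.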
To process a single file, you can iterate using the values of row_num and col_num that you want:
    import os

    def process_file(filename):
        title_line = 3  # indexing starts at 0, so this is 1 less than 4
        cols_to_keep = [0, 2, 3]

        # load the entire csv file into a list (fine as long as the files are not massive)
        with open(filename) as f:
            f_lines = [line.strip().split(",") for line in f]  # split each line into values

        # open in append mode so rows written for earlier files are kept
        with open("out.csv", "a") as out_file:
            if os.stat("out.csv").st_size == 0:  # if the output file is empty, add the title line
                out_file.write(",".join(f_lines[title_line][i] for i in cols_to_keep) + "\n")
            for line in f_lines[title_line + 1:]:  # for each line after the title line
                new_line = []
                for col_index in cols_to_keep:
                    new_line.append(line[col_index])
                out_file.write(",".join(new_line) + "\n")
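The question also asks for a new column that identifies the source file, which process_file() does not add. Here is a minimal sketch of one way to do that, assuming comma-separated files with the title on row 4, and assuming the ID is simply a counter that increases by one per file (the question does not say how IDs should be assigned). The names collect_rows and file_id are hypothetical.

```python
import csv
import os

def collect_rows(root_dir, title_line=3, cols_to_keep=(2, 3)):
    """Return rows of [file_id, column3, column4] from every .csv under root_dir."""
    rows = []
    file_id = 0
    for root, dirs, files in os.walk(root_dir):
        dirs.sort()  # visit sub-folders in a stable order so IDs are reproducible
        for name in sorted(files):
            if os.path.splitext(name)[1].lower() != ".csv":
                continue
            file_id += 1  # assumption: IDs are 1, 2, 3, ... in visiting order
            with open(os.path.join(root, name), newline="") as f:
                lines = list(csv.reader(f))
            for line in lines[title_line + 1:]:  # data rows follow the title row
                if len(line) <= max(cols_to_keep):
                    continue  # skip blank or short lines
                rows.append([file_id] + [line[i] for i in cols_to_keep])
    return rows
```

If a data frame is needed, the returned list can be passed to pandas.DataFrame(rows, columns=["file_id", "column3", "column4"]). The column values are strings unless converted.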