python - My for-loop isn't working as intended
I want to fill in list_of_occurences with the correct item from the list grundformen.
My for-loop doesn't work as intended, though. It doesn't restart at the beginning; it goes through the rows in reader only once, and therefore it won't fill the list completely.
This is what it prints (you can see that part is missing, because it doesn't start searching from the beginning of the list):
    # list_of_occurences (1 line, wrapped for easier reading)
    [['nn', 1328, ('ziel',)], ['art', 771, ('der',)], ['$.', 732, ('_',)], ['vvfin', 682, ('schlagen',)],
     ['pper', 592, ('sie',)], ['$,', 561, ('_',)], ['adv', 525, ('so',)], ['appr', 507, ('in',)],
     ['ne', 433, ('johanna',)], ['$(', 363, ('_',)], ['vafin', 334, ('haben',)], ['adja', 307, ('tragisch',)],
     ['adjd', 278, ('recht',)], ['kon', 228, ('doch',)], ['vvpp', 194, ('reichen',)], ['vvinf', 161, ('stören',)],
     ['kous', 151, ('während',)], ['pposat', 120, ('ihr',)], ['ptkvz', 104, ('weiter',)], ['prf', 98, ('sich',)],
     ['apprart', 90, ('zu',)], ['ptkneg', 87, ('nicht',)], ['vmfin', 76, ('sollen',)], ['piat', 66, ('kein',)],
     ['pis', 65, ('etwas',)], ['ptkzu', 52, ('zu',)], ['prels', 51, ('wer',)], ['proav', 42, ('dabei',)],
     ['pds', 38, ('jener',)], ['pdat', 37, ('dieser',)], ['pwav', 30, ('wie',)], ['pws', 26, ('was',)],
     ['card', 24, ('drei',)], ['kokom', 21, ('wie',)], ['vainf', 18, ('werden',)], ['koui', 15, ('um',)],
     ['vminf', 10, ('können',)], ['vvizu', 10, ('aufklären',)], ['vapp', 10], ['ptka', 6],
     ['ptkant', 6], ['pwat', 4], ['vvimp', 4], ['prelat', 4],
     ['apzr', 3], ['appo', 2], ['fm', 1]]

    # grundformen (1 line, wrapped for easier reading)
    ['ziel', 'der', '_', 'schlagen', 'sie', '_', 'so', 'in', 'johanna', '_',
     'haben', 'tragisch', 'recht', 'doch', 'reichen', 'stören', 'während', 'ihr', 'weiter', 'sich',
     'zu', 'nicht', 'sollen', 'kein', 'etwas', 'zu', 'wer', 'dabei', 'jener', 'dieser',
     'wie', 'was', 'drei', 'wie', 'werden', 'um', 'können', 'aufklären']
    import collections
    import csv

    occurences = collections.Counter()
    with open("material-2.csv", mode='r', newline='', encoding="utf-8") as material:
        reader = csv.reader(material, delimiter='\t', quotechar="\t")
        for line in reader:
            if line:
                occurences[line[5]] += 1
            else:
                pass

    list_of_occurences = [list(elem) for elem in occurences.most_common()]

    grundformen = []
    with open('material-2.csv', mode='r', newline='', encoding="utf-8") as material:
        reader = csv.reader(material, delimiter='\t', quotechar="\t")
        for elem in list_of_occurences:
            for row in reader:
                if row != [] and row[5] == elem[0]:
                    grundformen.append(row[2])
                    break

    iterator = 0
    for elem in grundformen:
        list_of_occurences[iterator].insert(2, elem)
        iterator = iterator + 1
        pass

    print(list_of_occurences)
    print(grundformen)
Whole input file: https://www.dropbox.com/sh/xyktjk4ycm8x6v0/aacou438_eewx-zymbybiqp_a/material-2.csv?dl=0
Part of the input file:
    1 als als _ _ kous _ _ 6 6 cp cp _ _
    2 es es _ _ pper _ 3|nom|sg|neut 6 6 sb sb _ _
    3 zu zu _ _ ptka _ _ 4 4 mo mo _ _
    4 schneien schneien _ _ adjd _ comp|dat|sg|fem 5 5 mo mo _ _
    5 aufgehört aufhören _ _ vvpp _ psp 6 6 oc oc _ _
    6 hatte haben _ _ vafin _ 3|sg|past|ind 8 8 mo mo _ _
    7 , _ _ _ $, _ _ 8 8 punc punc _ _
    8 verließ verlassen _ _ vvfin _ 3|sg|past|ind 0 0 root root _ _
    9 johanna johanna _ _ ne _ nom|sg|masc 8 8 sb sb _ _
    10 von von _ _ appr _ _ 5 5 sbp sbp _ _
    11 rotenhoff rotenhoff _ _ ne _ dat|sg|neut 10 10 nk nk _ _
    12 , _ _ _ $, _ _ 8 8 punc punc _ _
    13 ohne ohne _ _ koui _ _ 18 18 cp cp _ _
    14 ein ein _ _ art _ nom|sg|neut 16 16 nk nk _ _
    15 rechtes recht _ _ adja _ pos|nom|sg|neut 16 16 nk nk _ _
    16 ziel ziel _ _ nn _ nom|sg|neut 18 18 oa oa _ _
    17 zu zu _ _ ptkzu _ _ 18 18 pm pm _ _
    18 haben haben _ _ vainf _ inf 8 8 mo mo _ _
    19 , _ _ _ $, _ _ 18 18 punc punc _ _
    20 das der _ _ art _ nom|sg|neut 21 21 nk nk _ _
    21 gutshaus gutshaus _ _ nn _ nom|sg|neut 16 16 app app _ _
    22 . _ _ _ $. _ _ 8 8 punc punc _ _
How can I change the loop so that it fills in everything?
You had an issue with how you were reading in the csv data. Here the data is read into a list, which can be gone through again in a second loop instead of reopening the file object. You also don't need to loop through the csv data twice:
    import csv
    import collections

    occurences = collections.Counter()
    grundformen = collections.defaultdict(list)

    with open("material-2.csv", mode='r', newline='', encoding="utf-8") as material:
        # Read all non-empty rows into a list so they can be iterated more than once.
        reader = [ln for ln in csv.reader(material, delimiter='\t', quotechar="\t") if ln]
        for line in reader:
            occurences[line[5]] += 1
            grundformen[line[5]].append(line[2])

    list_of_occurences = list(map(list, occurences.most_common()))
    for elem in list_of_occurences:
        # Append the first base form seen for this tag.
        elem.append(grundformen[elem[0]][0])

    print(occurences)
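By making a list out of the csv data, you are able to call the break statement and still start fresh at the head of the list in the next loop. When you loop over the csv.reader iterator instead, calling break does not rewind it: the next loop starts where the previous one left off, until the data is exhausted.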
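The difference can be seen in a small, self-contained sketch. The three-row sample and the helper name first_fields_twice are made up for illustration and stand in for material-2.csv; quoting is switched off with csv.QUOTE_NONE here rather than reusing the tab as quotechar, which is an assumption about what is intended.

```python
import csv
import io

# Hypothetical three-row, tab-separated sample standing in for material-2.csv.
SAMPLE = "1\tals\tals\n2\tes\tes\n3\tzu\tzu\n"


def first_fields_twice(rows):
    """Loop over rows twice, breaking out of the first loop after one row."""
    first_pass = []
    for row in rows:
        first_pass.append(row[1])
        break  # stop after the first row
    second_pass = [row[1] for row in rows]
    return first_pass, second_pass


# A csv.reader is a one-shot iterator: the second loop resumes after the break.
reader = csv.reader(io.StringIO(SAMPLE), delimiter="\t", quoting=csv.QUOTE_NONE)
from_iterator = first_fields_twice(reader)

# A list can be iterated from the start as often as needed.
rows = list(csv.reader(io.StringIO(SAMPLE), delimiter="\t", quoting=csv.QUOTE_NONE))
from_list = first_fields_twice(rows)

print(from_iterator)  # (['als'], ['es', 'zu'])
print(from_list)      # (['als'], ['als', 'es', 'zu'])
```

With the reader, the first row is consumed by the first loop and never seen again, which is why the nested loops in the question run out of rows before every tag has been matched.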