Python - Writing only one row to CSV after website scraping
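I am trying to extract a list of golf courses in the USA through this link. I need to extract the name of each golf course, its address, and its phone number. My script is supposed to extract all the data from the website, but it looks like it only prints one row in the CSV file. I noticed that when I print the "name" field, it prints only once despite the find_all function. I need all the data, not just one field, from the multiple links on the website.
How do I go about fixing my script so that it prints all the needed data to a CSV file?
Here is my script: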
    import csv
    import requests
    from bs4 import BeautifulSoup

    courses_list = []
    for i in range(1):
        url = "http://www.thegolfcourses.net/page/1?ls&location=california&orderby=title&radius=6750#038;location=california&orderby=title&radius=6750"  # .format(i)
        r = requests.get(url)
        soup = BeautifulSoup(r.content)

        g_data2 = soup.find_all("div", {"class": "list"})

        for item in g_data2:
            try:
                name = item.contents[7].find_all("a", {"class": "entry-title"})[0].text
                print name
            except:
                name = ''
            try:
                phone = item.contents[7].find_all("p", {"class": "listing-phone"})[0].text
            except:
                phone = ''
            try:
                address = item.contents[7].find_all("p", {"class": "listing-address"})[0].text
            except:
                address = ''

            course = [name, phone, address]
            courses_list.append(course)

    with open('pgn_final.csv', 'a') as file:
        writer = csv.writer(file)
        for row in courses_list:
            writer.writerow([s.encode("utf-8") for s in row])
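Here is a neater implementation of your code. It uses the urllib2 library instead of requests and imports the older BeautifulSoup package, though bs4 works the same way. Instead of indexing into the children of one container, it collects each field across the whole page and zips the three lists together.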
    import csv
    import urllib2
    from BeautifulSoup import *

    url = "http://www.thegolfcourses.net/page/1?ls&location=california&orderby=title&radius=6750#038;location=california&orderby=title&radius=6750"  # .format(i)
    r = urllib2.urlopen(url).read()
    soup = BeautifulSoup(r)

    # Start with a header row.
    courses_list = []
    courses_list.append(("course name", "phone number", "address"))

    # Collect every matching element on the page, one list per field.
    names = soup.findAll('h2', attrs={'class': 'entry-title'})
    phones = soup.findAll('p', attrs={'class': 'listing-phone'})
    address = soup.findAll('p', attrs={'class': 'listing-address'})

    # Pair the fields up by position to build one row per course.
    for na, ph, add in zip(names, phones, address):
        courses_list.append((na.text, ph.text, add.text))

    with open('pgn_final.csv', 'a') as file:
        writer = csv.writer(file)
        for row in courses_list:
            writer.writerow([s.encode("utf-8") for s in row])
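One caveat with the zip approach, offered as a rough sketch rather than as part of the answer above: zip pairs items by position and stops at the shortest list, so if any listing on the page is missing a phone number, every later row can end up with the wrong phone and the last rows are dropped. The example below uses made-up course data (the names, phone numbers, and addresses are hypothetical, and so is the helper rows_to_csv) and assumes Python 3, where the csv module takes text directly and no encode call is needed.

```python
import csv
import io

# Hypothetical scraped values: the second course has no phone number,
# so the page-wide phone list is one item shorter than the others.
names = ["Course A", "Course B", "Course C"]
phones = ["555-0101", "555-0103"]
addresses = ["1 Fairway Dr", "2 Bunker Rd", "3 Green Ln"]

# zip() pairs by position and stops at the shortest list, so Course B
# receives Course C's phone number and Course C is dropped entirely.
zipped_rows = list(zip(names, phones, addresses))

# Collecting the fields per listing keeps them aligned: a missing field
# becomes an empty string instead of shifting the rows after it.
listings = [
    {"name": "Course A", "phone": "555-0101", "address": "1 Fairway Dr"},
    {"name": "Course B", "address": "2 Bunker Rd"},
    {"name": "Course C", "phone": "555-0103", "address": "3 Green Ln"},
]
rows = [
    (d.get("name", ""), d.get("phone", ""), d.get("address", ""))
    for d in listings
]


def rows_to_csv(rows):
    """Return the rows as CSV text, preceded by a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(("course name", "phone number", "address"))
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(zipped_rows)
    print(rows_to_csv(rows))
```

With BeautifulSoup, the per-listing version would mean looping over each listing's own container element and looking up the three fields inside it, falling back to an empty string when one is absent. Which element wraps a single listing depends on the page's markup, so that part is left out here.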