csv - Repeating data after web scraping using Python and Beautiful Soup 4

- March 15, 2015

I am trying to scrape data from the Garmin site for golf. I want the name of each golf course and its address after running the script. I have noticed that the code repeats the first page's data over and over again. I also noticed that the page numbers on the website do not start at 1 but at 10 for the second page. How do I go about extracting the data from the website and getting all of it, instead of a repeat of the first page?

import csv
import codecs
import requests
from bs4 import BeautifulSoup

courses_list = []
for i in range(10):
    # i is passed straight into the per_page query parameter
    url = "http://sites.garmin.com/clsearch/courses?browse=1&country=us&lang=en&per_page={}".format(i)
    r = requests.get(url)
    soup = BeautifulSoup(r.content)
    g_data2 = soup.find_all("div", {"class": "result"})

    for item in g_data2:
        # fall back to an empty string when a field is missing
        try:
            name = item.contents[3].find_all("div", {"class": "name"})[0].text
            print name
        except:
            name = ''
        try:
            address = item.contents[3].find_all("div", {"class": "location"})[0].text
        except:
            address = ''

        course = [name, address]
        courses_list.append(course)

# append every collected row to the CSV file
with open('g_final.csv', 'a') as file:
    writer = csv.writer(file)
    for row in courses_list:
        writer.writerow([s.encode("utf-8") for s in row])

You have already discovered the problem yourself: the per_page value in the URL is not a simple page counter.

So change

url = "http://...?browse=1&country=us&lang=en&per_page={}".format(i)

to

url = "http://...?browse=1&country=us&lang=en&per_page={}".format(i*20)
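To see what the change does to the requests, here is a minimal sketch that only builds the URLs, without fetching anything. The names `BASE_URL` and `page_urls` are made up for illustration, and the step of 20 is taken from the answer above; the site may use a different step, so check it in the browser first.

```python
# Hypothetical helper: builds the list of result-page URLs.
# BASE_URL is the URL from the question with a placeholder for per_page.
BASE_URL = ("http://sites.garmin.com/clsearch/courses"
            "?browse=1&country=us&lang=en&per_page={}")


def page_urls(num_pages, step=20):
    """Return one URL per results page, advancing per_page by `step`.

    step=20 is an assumption taken from the answer (i*20).
    """
    return [BASE_URL.format(i * step) for i in range(num_pages)]


# URLs produced by the original loop: per_page = 0, 1, 2, ...
original = [BASE_URL.format(i) for i in range(3)]

# URLs produced after the fix: per_page = 0, 20, 40, ...
fixed = page_urls(3)

for url in fixed:
    print(url)
```

The original loop asked for per_page=0, 1, 2 and so on, which, going by the question, kept returning the first page. With the multiplier, each request moves on by a full page of results.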
