ymann commented Mar 17, 2018

  • Updated scraper to scrape homepage instead of different pages for each hall
  • Added option to request multiple halls
  • Removed broken code

penn/laundry.py Outdated
detailed = []

rows = soup.find_all('tr')
for row in rows:
    cols = row.find_all('td')
    if len(cols) > 1:
    if len(cols) == 1 and len(cols[0].find_all('center')) == 1 and len(cols[0].find_all('center')[0].find_all('h2')) == 1: # Title element

Split this across two lines; enforce the 80-character limit for style purposes.
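One possible way to get under the limit is to pull the title-row check into a small helper. This is only a sketch: `is_title_row` is a hypothetical name that does not appear in the PR, and it assumes `cols` is the list returned by `row.find_all('td')`.

```python
def is_title_row(cols):
    """Return True if the row is a single <td> holding one <center><h2>.

    Hypothetical helper (not from the PR); it mirrors the long
    one-line condition in the diff, split into short steps.
    """
    if len(cols) != 1:
        return False
    centers = cols[0].find_all('center')
    if len(centers) != 1:
        return False
    return len(centers[0].find_all('h2')) == 1
```

The loop body could then read `if is_title_row(cols):`, which also avoids calling `find_all('center')` twice on the same cell.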

penn/laundry.py Outdated
detailed = []

rows = soup.find_all('tr')
for row in rows:
    cols = row.find_all('td')
    if len(cols) > 1:
    if len(cols) == 1 and len(cols[0].find_all('center')) == 1 and len(cols[0].find_all('center')[0].find_all('h2')) == 1: # Title element
        if(cols[0].find_all('center')[0].find_all('h2')[0].find_all('a')[0].getText() == hall): # Check if found correct hall

if-else clause can be shortened: found_hall =
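The suggestion above is cut off, so the exact replacement the reviewer had in mind is not known. As a general illustration only, assuming the if-else does nothing more than set `found_hall` to `True` or `False`, the comparison can be assigned directly. Both function names below are hypothetical.

```python
def found_hall_verbose(title_text, hall):
    # Long form: branch only to pick a boolean.
    if title_text == hall:
        found_hall = True
    else:
        found_hall = False
    return found_hall


def found_hall_short(title_text, hall):
    # Short form: the comparison already evaluates to a boolean.
    found_hall = title_text == hall
    return found_hall
```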

esqu1 left a comment

Make some style changes to comply with the 80-character limit.

ezwang commented Jun 10, 2018

The endpoint is still slow even though the data is now retrieved only once. Retrieving multiple halls is currently O(n^2), but could be refactored to be O(n). Since HTML parsing can be a relatively expensive operation, this might be why the endpoint is still slow.
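A rough sketch of the O(n) idea: walk the parsed rows once and collect every requested hall in the same pass, instead of rescanning the rows for each hall. The row representation here is an assumption made for illustration. `collect_halls` and the `("title", ...)` / `("machine", ...)` tuples are hypothetical stand-ins for whatever the scraper extracts from each `tr`.

```python
def collect_halls(rows, halls):
    """Group machine rows under their hall title in one pass over `rows`.

    `rows` is a hypothetical, already-parsed sequence of
    ("title", hall_name) and ("machine", details) tuples in page order.
    """
    wanted = set(halls)  # constant-time membership checks
    results = {hall: [] for hall in wanted}
    current = None  # requested hall whose section we are in, if any
    for kind, value in rows:
        if kind == "title":
            # Entering a new section; track it only if it was requested.
            current = value if value in wanted else None
        elif kind == "machine" and current is not None:
            results[current].append(value)
    return results
```

With this shape the page is parsed once and each row is visited once, regardless of how many halls are requested.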
