D>
04:15:33 WARNING scrapelib: got HTTPSConnectionPool(host='www.ourcommons.ca', port=443): Read timed out. (read timeout=60) sleeping for 10 seconds before retry
04:23:15 WARNING scrapelib: got HTTPSConnectionPool(host='www.ourcommons.ca', port=443): Read timed out. (read timeout=60) sleeping for 10 seconds before retry
04:30:53 WARNING pupa: validation of Membership 434d65ce-0f7b-11f0-b7f4-1a512ca27df6 failed: 2 validation errors:
Value '--' for field '' does not match regular expression '\A1 \d{3} \d{3}-\d{4}(?: x\d+)?\Z'
Value 'Telephone: --' for field '' does not match regular expression '\A1 \d{3} \d{3}-\d{4}(?: x\d+)?\Z'
| ca | | 2025-04-02 04:30:53 |
Value 'Telephone: --' for field '' does not match regular expression '\A1 \d{3} \d{3}-\d{4}(?: x\d+)?\Z'
Traceback (most recent call last):
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 175, in validate
validator.validate(self.as_dict(), schema)
File "/app/.heroku/python/lib/python3.9/site-packages/validictory/validator.py", line 616, in validate
raise MultipleValidationError(self._errors)
validictory.validator.MultipleValidationError: 2 validation errors:
Value '--' for field '' does not match regular expression '\A1 \d{3} \d{3}-\d{4}(?: x\d+)?\Z'
Value 'Telephone: --' for field '' does not match regular expression '\A1 \d{3} \d{3}-\d{4}(?: x\d+)?\Z'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/app/reports/utils.py", line 73, in scrape_people
report.report = subcommand.handle(args, other)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
return self.do_handle(args, other, juris)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
report['scrape'] = self.do_scrape(juris, args, scrapers)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
report[scraper_name] = scraper.do_scrape(**scrape_args)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 104, in do_scrape
self.save_object(obj)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 93, in save_object
self.save_object(obj)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 89, in save_object
raise ve
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 85, in save_object
obj.validate()
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 177, in validate
raise ScrapeValueError('validation of {} {} failed: {}'.format(
pupa.exceptions.ScrapeValueError: validation of Membership 434d65ce-0f7b-11f0-b7f4-1a512ca27df6 failed: 2 validation errors:
Value '--' for field '' does not match regular expression '\A1 \d{3} \d{3}-\d{4}(?: x\d+)?\Z'
Value 'Telephone: --' for field '' does not match regular expression '\A1 \d{3} \d{3}-\d{4}(?: x\d+)?\Z'
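The two failures above come from literal placeholder strings (`--`, `Telephone: --`) reaching a field whose schema expects `1 NNN NNN-NNNN` with an optional `xNNN` extension. A minimal pre-validation filter could drop placeholders and normalize the rest — this is a hypothetical sketch (helper name and behaviour assumed, not taken from the scraper), using the exact pattern quoted in the error:

```python
import re

# Pattern copied from the validation error above.
PHONE_RE = re.compile(r"\A1 \d{3} \d{3}-\d{4}(?: x\d+)?\Z")

def clean_phone(raw):
    """Hypothetical helper: return a schema-valid phone string, or None for
    placeholders such as '--' that the source page uses for missing numbers."""
    value = raw.replace("Telephone:", "").strip()
    if not value or set(value) <= {"-"}:
        return None  # placeholder, not a phone number
    # Split off an extension like 'x12' before stripping non-digits.
    ext = None
    m = re.search(r"x\s*(\d+)\s*\Z", value)
    if m:
        ext = m.group(1)
        value = value[: m.start()]
    digits = re.sub(r"\D", "", value)
    if len(digits) == 10:  # assume a North American number without country code
        digits = "1" + digits
    if len(digits) != 11 or not digits.startswith("1"):
        return None
    formatted = f"1 {digits[1:4]} {digits[4:7]}-{digits[7:]}"
    if ext:
        formatted += f" x{ext}"
    return formatted if PHONE_RE.match(formatted) else None
```

Returning `None` for placeholders lets the scraper simply skip adding the contact detail instead of failing the whole Membership object.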
|
C | ca_ab | 2025-04-02 04:34:36 | 2025-04-02 04:34:37 | |
D>
04:46:07 WARNING pupa: validation of CanadianPerson 6447e96e-0f7d-11f0-b7f4-1a512ca27df6 failed: 1 validation errors:
Value 'Information site' for field '<obj>.name' does not match regular expression 'regex.Regex('\\A(?!(?:Chair|Commissioner|Conseiller|Councillor|Deputy|Dr|M|Maire|Mayor|Miss|Mme|Mr|Mrs|Ms|Regional|Warden)\\b)(?:(?:(?:\\p{Lu}\\.)+|\\p{Lu}+|(?:Jr|Rev|Sr|St)\\.|da|de|den|der|la|van|von|[("](?:\\p{Lu}+|\\p{Lu}\\p{Ll}*(?:-\\p{Lu}\\p{Ll}*)*)[)"]|(?:D\'|d\'|De|de|Des|Di|Du|L\'|La|Le|Mac|Mc|O\'|San|St\\.|Van|Vander?|van|vanden)?\\p{Lu}\\p{Ll}+|\\p{Lu}\\p{Ll}+Anne?|Marie\\p{Lu}\\p{Ll}+|[ᐁᐃᐄᐅᐆᐊᐋᐯᐱᐲᐳᐴᐸᐹᑉᑊᑌᑎᑏᑐᑑᑕᑖᑦᑫᑭᑮᑯᑰᑲᑳᒃᒉᒋᒌᒍᒎᒐᒑᒡᒣᒥᒦᒧᒨᒪᒫᒻᓀᓂᓃᓄᓅᓇᓈᓐᓓᓕᓖᓗᓘᓚᓛᓪᓭᓯᓰᓱᓲᓴᓵᔅᔦᔨᔩᔪᔫᔭᔮᔾᕂᕆᕇᕈᕉᕋᕌᕐᕓᕕᕖᕗᕘᕙᕚᕝᕴᕵᕶᕷᕸᕹᕺᕻᕼᕿᖀᖁᖂᖃᖄᖅᖏᖐᖑᖒᖓᖔᖕᖖᖠᖡᖢᖣᖤᖥᖦᖨᖩᖪᖫᖬᖭᖮᖯᙯᙰᙱᙲᙳᙴᙵᙶ\U00011ab0\U00011ab1\U00011ab2\U00011ab3\U00011ab4\U00011ab5\U00011ab6\U00011ab7\U00011ab8\U00011ab9\U00011aba\U00011abb]+|Á\'a:líya|A\'aliya|Ch\'ng|Prud\'homme|Qwulti\'stunaat|Ya\'ara|D!ONNE|ChiefCalf|IsaBelle)(?:\'|-| - | ))+(?:(?:\\p{Lu}\\.)+|\\p{Lu}+|(?:Jr|Rev|Sr|St)\\.|da|de|den|der|la|van|von|[("](?:\\p{Lu}+|\\p{Lu}\\p{Ll}*(?:-\\p{Lu}\\p{Ll}*)*)[)"]|(?:D\'|d\'|De|de|Des|Di|Du|L\'|La|Le|Mac|Mc|O\'|San|St\\.|Van|Vander?|van|vanden)?\\p{Lu}\\p{Ll}+|\\p{Lu}\\p{Ll}+Anne?|Marie\\p{Lu}\\p{Ll}+|[ᐁᐃᐄᐅᐆᐊᐋᐯᐱᐲᐳᐴᐸᐹᑉᑊᑌᑎᑏᑐᑑᑕᑖᑦᑫᑭᑮᑯᑰᑲᑳᒃᒉᒋᒌᒍᒎᒐᒑᒡᒣᒥᒦᒧᒨᒪᒫᒻᓀᓂᓃᓄᓅᓇᓈᓐᓓᓕᓖᓗᓘᓚᓛᓪᓭᓯᓰᓱᓲᓴᓵᔅᔦᔨᔩᔪᔫᔭᔮᔾᕂᕆᕇᕈᕉᕋᕌᕐᕓᕕᕖᕗᕘᕙᕚᕝᕴᕵᕶᕷᕸᕹᕺᕻᕼᕿᖀᖁᖂᖃᖄᖅᖏᖐᖑᖒᖓᖔᖕᖖᖠᖡᖢᖣᖤᖥᖦᖨᖩᖪᖫᖬᖭᖮᖯᙯᙰᙱᙲᙳᙴᙵᙶ\U00011ab0\U00011ab1\U00011ab2\U00011ab3\U00011ab4\U00011ab5\U00011ab6\U00011ab7\U00011ab8\U00011ab9\U00011aba\U00011abb]+|Á\'a:líya|A\'aliya|Ch\'ng|Prud\'homme|Qwulti\'stunaat|Ya\'ara|D!ONNE|ChiefCalf|IsaBelle)\\Z', flags=regex.V0)'
| ca_ab_calgary | | 2025-04-02 04:46:07 |
Value 'Information site' for field '<obj>.name' does not match regular expression 'regex.Regex('\\A(?!(?:Chair|Commissioner|…
Traceback (most recent call last):
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 175, in validate
validator.validate(self.as_dict(), schema)
File "/app/.heroku/python/lib/python3.9/site-packages/validictory/validator.py", line 616, in validate
raise MultipleValidationError(self._errors)
validictory.validator.MultipleValidationError: 1 validation errors:
Value 'Information site' for field '<obj>.name' does not match regular expression 'regex.Regex('\\A(?!(?:Chair|Commissioner|Conseiller|Councillor|Deputy|Dr|M|Maire|Mayor|Miss|Mme|Mr|Mrs|Ms|Regional|Warden)\\b)(?:(?:(?:\\p{Lu}\\.)+|\\p{Lu}+|(?:Jr|Rev|Sr|St)\\.|da|de|den|der|la|van|von|[("](?:\\p{Lu}+|\\p{Lu}\\p{Ll}*(?:-\\p{Lu}\\p{Ll}*)*)[)"]|(?:D\'|d\'|De|de|Des|Di|Du|L\'|La|Le|Mac|Mc|O\'|San|St\\.|Van|Vander?|van|vanden)?\\p{Lu}\\p{Ll}+|\\p{Lu}\\p{Ll}+Anne?|Marie\\p{Lu}\\p{Ll}+|[ᐁᐃᐄᐅᐆᐊᐋᐯᐱᐲᐳᐴᐸᐹᑉᑊᑌᑎᑏᑐᑑᑕᑖᑦᑫᑭᑮᑯᑰᑲᑳᒃᒉᒋᒌᒍᒎᒐᒑᒡᒣᒥᒦᒧᒨᒪᒫᒻᓀᓂᓃᓄᓅᓇᓈᓐᓓᓕᓖᓗᓘᓚᓛᓪᓭᓯᓰᓱᓲᓴᓵᔅᔦᔨᔩᔪᔫᔭᔮᔾᕂᕆᕇᕈᕉᕋᕌᕐᕓᕕᕖᕗᕘᕙᕚᕝᕴᕵᕶᕷᕸᕹᕺᕻᕼᕿᖀᖁᖂᖃᖄᖅᖏᖐᖑᖒᖓᖔᖕᖖᖠᖡᖢᖣᖤᖥᖦᖨᖩᖪᖫᖬᖭᖮᖯᙯᙰᙱᙲᙳᙴᙵᙶ\U00011ab0\U00011ab1\U00011ab2\U00011ab3\U00011ab4\U00011ab5\U00011ab6\U00011ab7\U00011ab8\U00011ab9\U00011aba\U00011abb]+|Á\'a:líya|A\'aliya|Ch\'ng|Prud\'homme|Qwulti\'stunaat|Ya\'ara|D!ONNE|ChiefCalf|IsaBelle)(?:\'|-| - | ))+(?:(?:\\p{Lu}\\.)+|\\p{Lu}+|(?:Jr|Rev|Sr|St)\\.|da|de|den|der|la|van|von|[("](?:\\p{Lu}+|\\p{Lu}\\p{Ll}*(?:-\\p{Lu}\\p{Ll}*)*)[)"]|(?:D\'|d\'|De|de|Des|Di|Du|L\'|La|Le|Mac|Mc|O\'|San|St\\.|Van|Vander?|van|vanden)?\\p{Lu}\\p{Ll}+|\\p{Lu}\\p{Ll}+Anne?|Marie\\p{Lu}\\p{Ll}+|[ᐁᐃᐄᐅᐆᐊᐋᐯᐱᐲᐳᐴᐸᐹᑉᑊᑌᑎᑏᑐᑑᑕᑖᑦᑫᑭᑮᑯᑰᑲᑳᒃᒉᒋᒌᒍᒎᒐᒑᒡᒣᒥᒦᒧᒨᒪᒫᒻᓀᓂᓃᓄᓅᓇᓈᓐᓓᓕᓖᓗᓘᓚᓛᓪᓭᓯᓰᓱᓲᓴᓵᔅᔦᔨᔩᔪᔫᔭᔮᔾᕂᕆᕇᕈᕉᕋᕌᕐᕓᕕᕖᕗᕘᕙᕚᕝᕴᕵᕶᕷᕸᕹᕺᕻᕼᕿᖀᖁᖂᖃᖄᖅᖏᖐᖑᖒᖓᖔᖕᖖᖠᖡᖢᖣᖤᖥᖦᖨᖩᖪᖫᖬᖭᖮᖯᙯᙰᙱᙲᙳᙴᙵᙶ\U00011ab0\U00011ab1\U00011ab2\U00011ab3\U00011ab4\U00011ab5\U00011ab6\U00011ab7\U00011ab8\U00011ab9\U00011aba\U00011abb]+|Á\'a:líya|A\'aliya|Ch\'ng|Prud\'homme|Qwulti\'stunaat|Ya\'ara|D!ONNE|ChiefCalf|IsaBelle)\\Z', flags=regex.V0)'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/app/reports/utils.py", line 73, in scrape_people
report.report = subcommand.handle(args, other)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
return self.do_handle(args, other, juris)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
report['scrape'] = self.do_scrape(juris, args, scrapers)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
report[scraper_name] = scraper.do_scrape(**scrape_args)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 104, in do_scrape
self.save_object(obj)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 89, in save_object
raise ve
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 85, in save_object
obj.validate()
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 177, in validate
raise ScrapeValueError('validation of {} {} failed: {}'.format(
pupa.exceptions.ScrapeValueError: validation of CanadianPerson 6447e96e-0f7d-11f0-b7f4-1a512ca27df6 failed: 1 validation errors:
Value 'Information site' for field '<obj>.name' does not match regular expression 'regex.Regex('\\A(?!(?:Chair|Commissioner|Conseiller|Councillor|Deputy|Dr|M|Maire|Mayor|Miss|Mme|Mr|Mrs|Ms|Regional|Warden)\\b)(?:(?:(?:\\p{Lu}\\.)+|\\p{Lu}+|(?:Jr|Rev|Sr|St)\\.|da|de|den|der|la|van|von|[("](?:\\p{Lu}+|\\p{Lu}\\p{Ll}*(?:-\\p{Lu}\\p{Ll}*)*)[)"]|(?:D\'|d\'|De|de|Des|Di|Du|L\'|La|Le|Mac|Mc|O\'|San|St\\.|Van|Vander?|van|vanden)?\\p{Lu}\\p{Ll}+|\\p{Lu}\\p{Ll}+Anne?|Marie\\p{Lu}\\p{Ll}+|[ᐁᐃᐄᐅᐆᐊᐋᐯᐱᐲᐳᐴᐸᐹᑉᑊᑌᑎᑏᑐᑑᑕᑖᑦᑫᑭᑮᑯᑰᑲᑳᒃᒉᒋᒌᒍᒎᒐᒑᒡᒣᒥᒦᒧᒨᒪᒫᒻᓀᓂᓃᓄᓅᓇᓈᓐᓓᓕᓖᓗᓘᓚᓛᓪᓭᓯᓰᓱᓲᓴᓵᔅᔦᔨᔩᔪᔫᔭᔮᔾᕂᕆᕇᕈᕉᕋᕌᕐᕓᕕᕖᕗᕘᕙᕚᕝᕴᕵᕶᕷᕸᕹᕺᕻᕼᕿᖀᖁᖂᖃᖄᖅᖏᖐᖑᖒᖓᖔᖕᖖᖠᖡᖢᖣᖤᖥᖦᖨᖩᖪᖫᖬᖭᖮᖯᙯᙰᙱᙲᙳᙴᙵᙶ\U00011ab0\U00011ab1\U00011ab2\U00011ab3\U00011ab4\U00011ab5\U00011ab6\U00011ab7\U00011ab8\U00011ab9\U00011aba\U00011abb]+|Á\'a:líya|A\'aliya|Ch\'ng|Prud\'homme|Qwulti\'stunaat|Ya\'ara|D!ONNE|ChiefCalf|IsaBelle)(?:\'|-| - | ))+(?:(?:\\p{Lu}\\.)+|\\p{Lu}+|(?:Jr|Rev|Sr|St)\\.|da|de|den|der|la|van|von|[("](?:\\p{Lu}+|\\p{Lu}\\p{Ll}*(?:-\\p{Lu}\\p{Ll}*)*)[)"]|(?:D\'|d\'|De|de|Des|Di|Du|L\'|La|Le|Mac|Mc|O\'|San|St\\.|Van|Vander?|van|vanden)?\\p{Lu}\\p{Ll}+|\\p{Lu}\\p{Ll}+Anne?|Marie\\p{Lu}\\p{Ll}+|[ᐁᐃᐄᐅᐆᐊᐋᐯᐱᐲᐳᐴᐸᐹᑉᑊᑌᑎᑏᑐᑑᑕᑖᑦᑫᑭᑮᑯᑰᑲᑳᒃᒉᒋᒌᒍᒎᒐᒑᒡᒣᒥᒦᒧᒨᒪᒫᒻᓀᓂᓃᓄᓅᓇᓈᓐᓓᓕᓖᓗᓘᓚᓛᓪᓭᓯᓰᓱᓲᓴᓵᔅᔦᔨᔩᔪᔫᔭᔮᔾᕂᕆᕇᕈᕉᕋᕌᕐᕓᕕᕖᕗᕘᕙᕚᕝᕴᕵᕶᕷᕸᕹᕺᕻᕼᕿᖀᖁᖂᖃᖄᖅᖏᖐᖑᖒᖓᖔᖕᖖᖠᖡᖢᖣᖤᖥᖦᖨᖩᖪᖫᖬᖭᖮᖯᙯᙰᙱᙲᙳᙴᙵᙶ\U00011ab0\U00011ab1\U00011ab2\U00011ab3\U00011ab4\U00011ab5\U00011ab6\U00011ab7\U00011ab8\U00011ab9\U00011aba\U00011abb]+|Á\'a:líya|A\'aliya|Ch\'ng|Prud\'homme|Qwulti\'stunaat|Ya\'ara|D!ONNE|ChiefCalf|IsaBelle)\\Z', flags=regex.V0)'
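'Information site' is a link label from the council page, not a person's name, so it predictably fails the name pattern. A small guard could skip such rows before a CanadianPerson is even built — a hypothetical sketch (the label list is illustrative, not from the scraper):

```python
# Labels that show up in council listings but are not people's names.
# Illustrative only; a real scraper would curate its own set.
NON_NAME_LABELS = {"information site", "vacant", "contact us"}

def is_person_name(text):
    """Hypothetical pre-validation check: reject blank strings and known
    non-name labels before creating a person record."""
    text = text.strip()
    return bool(text) and text.lower() not in NON_NAME_LABELS
```

The same guard would also catch the blank-name failure reported for ca_ns_cape_breton further down.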
|
C | ca_ab_edmonton | 2025-04-02 04:38:42 | 2025-04-02 04:38:42 | |
C | ca_ab_grande_prairie | 2025-04-02 04:35:46 | 2025-04-02 04:35:46 | |
D> | ca_ab_grande_prairie_county_no_1 | | 2025-04-02 04:01:57 |
IndexError: list index out of range
Traceback (most recent call last):
File "/app/reports/utils.py", line 73, in scrape_people
report.report = subcommand.handle(args, other)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
return self.do_handle(args, other, juris)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
report['scrape'] = self.do_scrape(juris, args, scrapers)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
report[scraper_name] = scraper.do_scrape(**scrape_args)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 99, in do_scrape
for obj in self.scrape(**kwargs) or []:
File "/app/scrapers/ca_ab_grande_prairie_county_no_1/people.py", line 17, in scrape
name = councillor.xpath('.//div[@class="lb-imageBox_header {headColor}"]')[0].text_content()
IndexError: list index out of range
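The XPath at people.py line 17 still contains the literal substring `{headColor}`, which suggests a format placeholder was never substituted (or the site's class names changed), so the selector matches nothing and `[0]` raises. A defensive lookup would fail more gracefully — sketched here as a hypothetical `first_text` helper over any lxml-style element:

```python
def first_text(node, xpath_expr, default=None):
    """Return the text of the first XPath match, or `default` when the
    selector matches nothing (instead of raising IndexError on [0])."""
    matches = node.xpath(xpath_expr)
    if not matches:
        return default
    return matches[0].text_content().strip()
```

With a `contains(@class, "lb-imageBox_header")` selector instead of the exact-match one, the lookup would also survive cosmetic class-name changes.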
|
D> | ca_ab_lethbridge | | 2025-04-02 04:41:17 |
IndexError: list index out of range
Traceback (most recent call last):
File "/app/reports/utils.py", line 73, in scrape_people
report.report = subcommand.handle(args, other)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
return self.do_handle(args, other, juris)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
report['scrape'] = self.do_scrape(juris, args, scrapers)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
report[scraper_name] = scraper.do_scrape(**scrape_args)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 99, in do_scrape
for obj in self.scrape(**kwargs) or []:
File "/app/scrapers/ca_ab_lethbridge/people.py", line 38, in scrape
yield self.scrape_mayor()
File "/app/scrapers/ca_ab_lethbridge/people.py", line 12, in scrape_mayor
name = " ".join([paragraph[0], paragraph[1]])
IndexError: list index out of range
|
C | ca_ab_strathcona_county | 2025-04-02 04:42:41 | 2025-04-02 04:42:41 | |
C | ca_ab_wood_buffalo | 2025-04-02 04:42:46 | 2025-04-02 04:42:46 | |
C | ca_bc | 2025-04-02 04:34:47 | 2025-04-02 04:34:47 | |
C | ca_bc_abbotsford | 2025-04-02 04:31:30 | 2025-04-02 04:31:30 | |
C | ca_bc_burnaby | 2025-04-02 05:43:46 | 2025-04-02 05:43:46 | |
C | ca_bc_coquitlam | 2025-04-02 04:44:42 | 2025-04-02 04:44:42 | |
C | ca_bc_kelowna | 2025-04-02 04:49:46 | 2025-04-02 04:49:46 | |
C | ca_bc_langley | 2025-04-02 04:41:53 | 2025-04-02 04:41:53 | |
D>
05:42:12 WARNING scrapelib: sleeping for 10 seconds before retry
05:42:22 WARNING scrapelib: sleeping for 20 seconds before retry
05:42:42 WARNING scrapelib: sleeping for 40 seconds before retry
| ca_bc_langley_city | | 2025-04-02 05:43:23 |
scrapelib.HTTPError: 403 while retrieving https://www.langleycity.ca/cityhall/city-council/council-members
Traceback (most recent call last):
File "/app/reports/utils.py", line 73, in scrape_people
report.report = subcommand.handle(args, other)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
return self.do_handle(args, other, juris)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
report['scrape'] = self.do_scrape(juris, args, scrapers)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
report[scraper_name] = scraper.do_scrape(**scrape_args)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 99, in do_scrape
for obj in self.scrape(**kwargs) or []:
File "/app/scrapers/ca_bc_langley_city/people.py", line 11, in scrape
page = self.lxmlize(COUNCIL_PAGE)
File "/app/scrapers/utils.py", line 217, in lxmlize
response = self.get(url, cookies=cookies, verify=verify)
File "/app/scrapers/utils.py", line 198, in get
return super().get(*args, verify=kwargs.pop("verify", SSL_VERIFY), **kwargs)
File "/app/.heroku/python/lib/python3.9/site-packages/requests/sessions.py", line 602, in get
return self.request("GET", url, **kwargs)
File "/app/.heroku/python/lib/python3.9/site-packages/scrapelib/__init__.py", line 602, in request
raise HTTPError(resp)
scrapelib.HTTPError: 403 while retrieving https://www.langleycity.ca/cityhall/city-council/council-members
|
C | ca_bc_new_westminster | 2025-04-02 04:52:23 | 2025-04-02 04:52:23 | |
C | ca_bc_richmond | 2025-04-02 04:46:02 | 2025-04-02 04:46:02 | |
C | ca_bc_saanich | 2025-04-02 04:10:20 | 2025-04-02 04:10:20 | |
C | ca_bc_surrey | 2025-04-02 04:31:45 | 2025-04-02 04:31:45 | |
C | ca_bc_vancouver | 2025-04-02 04:37:35 | 2025-04-02 04:37:35 | |
C | ca_bc_victoria | 2025-04-02 04:51:07 | 2025-04-02 04:51:07 | |
C
04:59:32 WARNING scrapelib: got HTTPConnectionPool(host='shaun_chen', port=80): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f7815a16070>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 10 seconds before retry
04:59:42 WARNING scrapelib: got HTTPConnectionPool(host='shaun_chen', port=80): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f7815be8b20>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 20 seconds before retry
05:00:02 WARNING scrapelib: got HTTPConnectionPool(host='shaun_chen', port=80): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f7815b8d970>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 40 seconds before retry
05:00:42 WARNING ca_candidates.people: HTTPConnectionPool(host='shaun_chen', port=80): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f7816258640>: Failed to establish a new connection: [Errno -2] Name or service not known')) (http://@Shaun_Chen)
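The unresolvable host `shaun_chen` comes from the value `http://@Shaun_Chen`: a social-media handle was recorded where a URL was expected, so DNS resolution fails on every retry. A heuristic filter for such values, applied before any request is made, could avoid the wasted backoff cycle — a hypothetical sketch, not part of the scraper:

```python
import re

def looks_like_social_handle(url):
    """Heuristic (assumed, not from the scraper): treat '@handle' values,
    with or without a scheme prefix, as social-media handles rather than
    fetchable URLs."""
    return bool(re.match(r"(?:https?://)?@\w+\Z", url.strip()))
```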
05:11:17 ERROR ca_candidates.people:
05:13:28 WARNING scrapelib: got HTTPSConnectionPool(host='nosca.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f7815a3d880>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 10 seconds before retry
05:13:38 WARNING scrapelib: got HTTPSConnectionPool(host='nosca.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f78154d7b50>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 20 seconds before retry
05:13:58 WARNING scrapelib: got HTTPSConnectionPool(host='nosca.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f781555f0a0>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 40 seconds before retry
05:14:38 ERROR ca_candidates.people:
05:14:49 ERROR ca_candidates.people:
05:15:24 WARNING scrapelib: got HTTPSConnectionPool(host='yorksimcoe.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f7816531e50>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 10 seconds before retry
05:15:34 WARNING scrapelib: got HTTPSConnectionPool(host='yorksimcoe.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f781641ed30>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 20 seconds before retry
05:15:54 WARNING scrapelib: got HTTPSConnectionPool(host='yorksimcoe.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f7815a4e8e0>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 40 seconds before retry
05:16:34 ERROR ca_candidates.people:
05:16:46 WARNING scrapelib: sleeping for 10 seconds before retry
05:16:56 WARNING scrapelib: sleeping for 20 seconds before retry
05:17:16 WARNING scrapelib: sleeping for 40 seconds before retry
05:17:56 ERROR ca_candidates.people:
05:17:58 WARNING scrapelib: got HTTPSConnectionPool(host='www.claudedusseault.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f781643f790>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 10 seconds before retry
05:18:08 WARNING scrapelib: got HTTPSConnectionPool(host='www.claudedusseault.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f78165229a0>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 20 seconds before retry
05:18:28 WARNING scrapelib: got HTTPSConnectionPool(host='www.claudedusseault.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f78164c37c0>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 40 seconds before retry
05:19:08 ERROR ca_candidates.people:
05:19:31 WARNING scrapelib: got HTTPSConnectionPool(host='klc.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f78164c33a0>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 10 seconds before retry
05:19:41 WARNING scrapelib: got HTTPSConnectionPool(host='klc.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f7815b1e5b0>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 20 seconds before retry
05:20:01 WARNING scrapelib: got HTTPSConnectionPool(host='klc.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f7815de7460>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 40 seconds before retry
05:20:41 ERROR ca_candidates.people:
05:21:23 WARNING urllib3.connection: Certificate did not match expected hostname: www.ericlefebvre.ca. Certificate: {'subject': ((('commonName', '*.namespro.ca'),),), 'issuer': ((('countryName', 'GB'),), (('stateOrProvinceName', 'Greater Manchester'),), (('localityName', 'Salford'),), (('organizationName', 'Sectigo Limited'),), (('commonName', 'Sectigo RSA Domain Validation Secure Server CA'),)), 'version': 3, 'serialNumber': '579103DDBCC2CB6ED6431111A3112BDB', 'notBefore': 'Sep 7 00:00:00 2024 GMT', 'notAfter': 'Oct 8 23:59:59 2025 GMT', 'subjectAltName': (('DNS', '*.namespro.ca'), ('DNS', 'namespro.ca')), 'OCSP': ('http://ocsp.sectigo.com',), 'caIssuers': ('http://crt.sectigo.com/SectigoRSADomainValidationSecureServerCA.crt',)}
05:21:23 ERROR ca_candidates.people:
05:21:39 WARNING scrapelib: got HTTPSConnectionPool(host='dauphinswanriverneepawa.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f78159f8f10>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 10 seconds before retry
05:21:49 WARNING scrapelib: got HTTPSConnectionPool(host='dauphinswanriverneepawa.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f78166d7c10>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 20 seconds before retry
05:22:09 WARNING scrapelib: got HTTPSConnectionPool(host='dauphinswanriverneepawa.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f7815b1e2e0>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 40 seconds before retry
05:22:49 ERROR ca_candidates.people:
05:22:55 WARNING scrapelib: got HTTPSConnectionPool(host='kenora.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f78159fac10>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 10 seconds before retry
05:23:05 WARNING scrapelib: got HTTPSConnectionPool(host='kenora.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f78155983a0>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 20 seconds before retry
05:23:25 WARNING scrapelib: got HTTPSConnectionPool(host='kenora.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f78162f0f10>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 40 seconds before retry
05:24:05 ERROR ca_candidates.people:
05:24:13 WARNING scrapelib: got HTTPSConnectionPool(host='kootenaycolumbia.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f7814540d30>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 10 seconds before retry
05:24:23 WARNING scrapelib: got HTTPSConnectionPool(host='kootenaycolumbia.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f7814540820>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 20 seconds before retry
05:24:43 WARNING scrapelib: got HTTPSConnectionPool(host='kootenaycolumbia.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f7815a16d30>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 40 seconds before retry
05:25:23 ERROR ca_candidates.people:
05:25:50 WARNING scrapelib: got HTTPSConnectionPool(host='vancouvergranville.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f78147799d0>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 10 seconds before retry
05:26:00 WARNING scrapelib: got HTTPSConnectionPool(host='vancouvergranville.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f78165937c0>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 20 seconds before retry
05:26:20 WARNING scrapelib: got HTTPSConnectionPool(host='vancouvergranville.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f7815af4fd0>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 40 seconds before retry
05:27:00 ERROR ca_candidates.people:
05:29:05 WARNING scrapelib: got HTTPSConnectionPool(host='www.bgosconservativeeda.com', port=443): Max retries exceeded with url: / (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f781589e6a0>, 'Connection to www.bgosconservativeeda.com timed out. (connect timeout=60)')) sleeping for 10 seconds before retry
05:31:16 WARNING scrapelib: got HTTPSConnectionPool(host='www.bgosconservativeeda.com', port=443): Max retries exceeded with url: / (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f7815b2cbe0>, 'Connection to www.bgosconservativeeda.com timed out. (connect timeout=60)')) sleeping for 20 seconds before retry
05:33:36 WARNING scrapelib: got HTTPSConnectionPool(host='www.bgosconservativeeda.com', port=443): Max retries exceeded with url: / (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f7814516b80>, 'Connection to www.bgosconservativeeda.com timed out. (connect timeout=60)')) sleeping for 40 seconds before retry
05:36:16 ERROR ca_candidates.people:
05:36:38 WARNING scrapelib: got No connection adapters were found for "'https://kitchenersouthhespeler.conservativeeda.ca/'" sleeping for 10 seconds before retry
05:36:48 WARNING scrapelib: got No connection adapters were found for "'https://kitchenersouthhespeler.conservativeeda.ca/'" sleeping for 20 seconds before retry
05:37:08 WARNING scrapelib: got No connection adapters were found for "'https://kitchenersouthhespeler.conservativeeda.ca/'" sleeping for 40 seconds before retry
05:37:48 ERROR ca_candidates.people:
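The `No connection adapters were found` retries above are not a network problem: the URL being fetched still carries literal quote characters (`"'https://…'"`), suggesting a `repr()` of the string was stored instead of the string itself, which hides the `https://` scheme from requests' adapter lookup. Stripping stray surrounding quotes first would fix it — a hypothetical normalization step:

```python
def normalize_url(raw):
    """Hypothetical cleanup: remove surrounding quote characters that would
    hide the URL scheme from requests' connection-adapter lookup."""
    return raw.strip().strip("'\"")
```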
05:38:01 WARNING scrapelib: got HTTPSConnectionPool(host='langleyaldergrove.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f781589e3d0>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 10 seconds before retry
05:38:11 WARNING scrapelib: got HTTPSConnectionPool(host='langleyaldergrove.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f78145e7d90>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 20 seconds before retry
05:38:31 WARNING scrapelib: got HTTPSConnectionPool(host='langleyaldergrove.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f7814731820>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 40 seconds before retry
05:39:11 ERROR ca_candidates.people:
05:39:13 WARNING scrapelib: got HTTPSConnectionPool(host='missionmatsquifrasercanyon.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f78147a1d90>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 10 seconds before retry
05:39:23 WARNING scrapelib: got HTTPSConnectionPool(host='missionmatsquifrasercanyon.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f781471dcd0>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 20 seconds before retry
05:39:43 WARNING scrapelib: got HTTPSConnectionPool(host='missionmatsquifrasercanyon.conservativeeda.ca', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f78146b8610>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 40 seconds before retry
05:40:23 ERROR ca_candidates.people:
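The retry delays repeated throughout this run (10s, 20s, 40s) are scrapelib's doubling backoff between attempts. As a sketch of that schedule (parameter names assumed, not scrapelib's own):

```python
def backoff_delays(initial=10, factor=2, retries=3):
    """Yield the sleep-before-retry delays seen in the log: 10, 20, 40 seconds."""
    delay = initial
    for _ in range(retries):
        yield delay
        delay *= factor
```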
| ca_candidates | 2025-04-02 05:41:56 | 2025-04-02 05:41:59 | |
C | ca_mb | 2025-04-02 04:01:39 | 2025-04-02 04:01:40 | |
C | ca_mb_winnipeg | 2025-04-02 04:36:25 | 2025-04-02 04:36:25 | |
D> | ca_nb | | 2025-04-02 04:46:46 |
IndexError: list index out of range
Traceback (most recent call last):
File "/app/reports/utils.py", line 73, in scrape_people
report.report = subcommand.handle(args, other)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
return self.do_handle(args, other, juris)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
report['scrape'] = self.do_scrape(juris, args, scrapers)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
report[scraper_name] = scraper.do_scrape(**scrape_args)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 99, in do_scrape
for obj in self.scrape(**kwargs) or []:
File "/app/scrapers/ca_nb/people.py", line 16, in scrape
address = node.xpath('//td[contains(text(),"Address")]/parent::tr//td[2]')[0]
IndexError: list index out of range
|
C | ca_nb_fredericton | 2025-04-02 04:41:40 | 2025-04-02 04:41:40 | |
C | ca_nb_moncton | 2025-04-02 04:41:13 | 2025-04-02 04:41:13 | |
C | ca_nb_saint_john | 2025-04-02 04:37:31 | 2025-04-02 04:37:31 | |
C | ca_nl | 2025-04-02 04:42:16 | 2025-04-02 04:42:16 | |
C | ca_nl_st_john_s | 2025-04-02 04:38:23 | 2025-04-02 04:38:23 | |
C | ca_ns | 2025-04-02 04:32:53 | 2025-04-02 04:32:53 | |
D>
04:48:06 WARNING pupa: validation of CanadianPerson ab15db8a-0f7d-11f0-b7f4-1a512ca27df6 failed: 2 validation errors:
Value '' for field '<obj>.name' does not match regular expression 'regex.Regex('\\A(?!(?:Chair|Commissioner|Conseiller|Councillor|Deputy|Dr|M|Maire|Mayor|Miss|Mme|Mr|Mrs|Ms|Regional|Warden)\\b)(?:(?:(?:\\p{Lu}\\.)+|\\p{Lu}+|(?:Jr|Rev|Sr|St)\\.|da|de|den|der|la|van|von|[("](?:\\p{Lu}+|\\p{Lu}\\p{Ll}*(?:-\\p{Lu}\\p{Ll}*)*)[)"]|(?:D\'|d\'|De|de|Des|Di|Du|L\'|La|Le|Mac|Mc|O\'|San|St\\.|Van|Vander?|van|vanden)?\\p{Lu}\\p{Ll}+|\\p{Lu}\\p{Ll}+Anne?|Marie\\p{Lu}\\p{Ll}+|[ᐁᐃᐄᐅᐆᐊᐋᐯᐱᐲᐳᐴᐸᐹᑉᑊᑌᑎᑏᑐᑑᑕᑖᑦᑫᑭᑮᑯᑰᑲᑳᒃᒉᒋᒌᒍᒎᒐᒑᒡᒣᒥᒦᒧᒨᒪᒫᒻᓀᓂᓃᓄᓅᓇᓈᓐᓓᓕᓖᓗᓘᓚᓛᓪᓭᓯᓰᓱᓲᓴᓵᔅᔦᔨᔩᔪᔫᔭᔮᔾᕂᕆᕇᕈᕉᕋᕌᕐᕓᕕᕖᕗᕘᕙᕚᕝᕴᕵᕶᕷᕸᕹᕺᕻᕼᕿᖀᖁᖂᖃᖄᖅᖏᖐᖑᖒᖓᖔᖕᖖᖠᖡᖢᖣᖤᖥᖦᖨᖩᖪᖫᖬᖭᖮᖯᙯᙰᙱᙲᙳᙴᙵᙶ\U00011ab0\U00011ab1\U00011ab2\U00011ab3\U00011ab4\U00011ab5\U00011ab6\U00011ab7\U00011ab8\U00011ab9\U00011aba\U00011abb]+|Á\'a:líya|A\'aliya|Ch\'ng|Prud\'homme|Qwulti\'stunaat|Ya\'ara|D!ONNE|ChiefCalf|IsaBelle)(?:\'|-| - | ))+(?:(?:\\p{Lu}\\.)+|\\p{Lu}+|(?:Jr|Rev|Sr|St)\\.|da|de|den|der|la|van|von|[("](?:\\p{Lu}+|\\p{Lu}\\p{Ll}*(?:-\\p{Lu}\\p{Ll}*)*)[)"]|(?:D\'|d\'|De|de|Des|Di|Du|L\'|La|Le|Mac|Mc|O\'|San|St\\.|Van|Vander?|van|vanden)?\\p{Lu}\\p{Ll}+|\\p{Lu}\\p{Ll}+Anne?|Marie\\p{Lu}\\p{Ll}+|[ᐁᐃᐄᐅᐆᐊᐋᐯᐱᐲᐳᐴᐸᐹᑉᑊᑌᑎᑏᑐᑑᑕᑖᑦᑫᑭᑮᑯᑰᑲᑳᒃᒉᒋᒌᒍᒎᒐᒑᒡᒣᒥᒦᒧᒨᒪᒫᒻᓀᓂᓃᓄᓅᓇᓈᓐᓓᓕᓖᓗᓘᓚᓛᓪᓭᓯᓰᓱᓲᓴᓵᔅᔦᔨᔩᔪᔫᔭᔮᔾᕂᕆᕇᕈᕉᕋᕌᕐᕓᕕᕖᕗᕘᕙᕚᕝᕴᕵᕶᕷᕸᕹᕺᕻᕼᕿᖀᖁᖂᖃᖄᖅᖏᖐᖑᖒᖓᖔᖕᖖᖠᖡᖢᖣᖤᖥᖦᖨᖩᖪᖫᖬᖭᖮᖯᙯᙰᙱᙲᙳᙴᙵᙶ\U00011ab0\U00011ab1\U00011ab2\U00011ab3\U00011ab4\U00011ab5\U00011ab6\U00011ab7\U00011ab8\U00011ab9\U00011aba\U00011abb]+|Á\'a:líya|A\'aliya|Ch\'ng|Prud\'homme|Qwulti\'stunaat|Ya\'ara|D!ONNE|ChiefCalf|IsaBelle)\\Z', flags=regex.V0)'
Value '' for field '<obj>.name' cannot be blank'
| ca_ns_cape_breton | | 2025-04-02 04:48:06 | Value '' for field '<obj>.name' cannot be blank'
Traceback (most recent call last):
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 175, in validate
    validator.validate(self.as_dict(), schema)
  File "/app/.heroku/python/lib/python3.9/site-packages/validictory/validator.py", line 616, in validate
    raise MultipleValidationError(self._errors)
validictory.validator.MultipleValidationError: 2 validation errors:
Value '' for field '<obj>.name' does not match regular expression 'regex.Regex('\\A(?!(?:Chair|Commissioner|Conseiller|Councillor|Deputy|Dr|M|Maire|Mayor|Miss|Mme|Mr|Mrs|Ms|Regional|Warden)\\b)(?:(?:(?:\\p{Lu}\\.)+|\\p{Lu}+|(?:Jr|Rev|Sr|St)\\.|da|de|den|der|la|van|von|[("](?:\\p{Lu}+|\\p{Lu}\\p{Ll}*(?:-\\p{Lu}\\p{Ll}*)*)[)"]|(?:D\'|d\'|De|de|Des|Di|Du|L\'|La|Le|Mac|Mc|O\'|San|St\\.|Van|Vander?|van|vanden)?\\p{Lu}\\p{Ll}+|\\p{Lu}\\p{Ll}+Anne?|Marie\\p{Lu}\\p{Ll}+|[ᐁᐃᐄᐅᐆᐊᐋᐯᐱᐲᐳᐴᐸᐹᑉᑊᑌᑎᑏᑐᑑᑕᑖᑦᑫᑭᑮᑯᑰᑲᑳᒃᒉᒋᒌᒍᒎᒐᒑᒡᒣᒥᒦᒧᒨᒪᒫᒻᓀᓂᓃᓄᓅᓇᓈᓐᓓᓕᓖᓗᓘᓚᓛᓪᓭᓯᓰᓱᓲᓴᓵᔅᔦᔨᔩᔪᔫᔭᔮᔾᕂᕆᕇᕈᕉᕋᕌᕐᕓᕕᕖᕗᕘᕙᕚᕝᕴᕵᕶᕷᕸᕹᕺᕻᕼᕿᖀᖁᖂᖃᖄᖅᖏᖐᖑᖒᖓᖔᖕᖖᖠᖡᖢᖣᖤᖥᖦᖨᖩᖪᖫᖬᖭᖮᖯᙯᙰᙱᙲᙳᙴᙵᙶ\U00011ab0\U00011ab1\U00011ab2\U00011ab3\U00011ab4\U00011ab5\U00011ab6\U00011ab7\U00011ab8\U00011ab9\U00011aba\U00011abb]+|Á\'a:líya|A\'aliya|Ch\'ng|Prud\'homme|Qwulti\'stunaat|Ya\'ara|D!ONNE|ChiefCalf|IsaBelle)(?:\'|-| - | ))+(?:(?:\\p{Lu}\\.)+|\\p{Lu}+|(?:Jr|Rev|Sr|St)\\.|da|de|den|der|la|van|von|[("](?:\\p{Lu}+|\\p{Lu}\\p{Ll}*(?:-\\p{Lu}\\p{Ll}*)*)[)"]|(?:D\'|d\'|De|de|Des|Di|Du|L\'|La|Le|Mac|Mc|O\'|San|St\\.|Van|Vander?|van|vanden)?\\p{Lu}\\p{Ll}+|\\p{Lu}\\p{Ll}+Anne?|Marie\\p{Lu}\\p{Ll}+|[ᐁᐃᐄᐅᐆᐊᐋᐯᐱᐲᐳᐴᐸᐹᑉᑊᑌᑎᑏᑐᑑᑕᑖᑦᑫᑭᑮᑯᑰᑲᑳᒃᒉᒋᒌᒍᒎᒐᒑᒡᒣᒥᒦᒧᒨᒪᒫᒻᓀᓂᓃᓄᓅᓇᓈᓐᓓᓕᓖᓗᓘᓚᓛᓪᓭᓯᓰᓱᓲᓴᓵᔅᔦᔨᔩᔪᔫᔭᔮᔾᕂᕆᕇᕈᕉᕋᕌᕐᕓᕕᕖᕗᕘᕙᕚᕝᕴᕵᕶᕷᕸᕹᕺᕻᕼᕿᖀᖁᖂᖃᖄᖅᖏᖐᖑᖒᖓᖔᖕᖖᖠᖡᖢᖣᖤᖥᖦᖨᖩᖪᖫᖬᖭᖮᖯᙯᙰᙱᙲᙳᙴᙵᙶ\U00011ab0\U00011ab1\U00011ab2\U00011ab3\U00011ab4\U00011ab5\U00011ab6\U00011ab7\U00011ab8\U00011ab9\U00011aba\U00011abb]+|Á\'a:líya|A\'aliya|Ch\'ng|Prud\'homme|Qwulti\'stunaat|Ya\'ara|D!ONNE|ChiefCalf|IsaBelle)\\Z', flags=regex.V0)'
Value '' for field '<obj>.name' cannot be blank'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/app/reports/utils.py", line 73, in scrape_people
    report.report = subcommand.handle(args, other)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
    return self.do_handle(args, other, juris)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
    report['scrape'] = self.do_scrape(juris, args, scrapers)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
    report[scraper_name] = scraper.do_scrape(**scrape_args)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 104, in do_scrape
    self.save_object(obj)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 89, in save_object
    raise ve
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 85, in save_object
    obj.validate()
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 177, in validate
    raise ScrapeValueError('validation of {} {} failed: {}'.format(
pupa.exceptions.ScrapeValueError: validation of CanadianPerson ab15db8a-0f7d-11f0-b7f4-1a512ca27df6 failed: 2 validation errors:
Value '' for field '<obj>.name' does not match regular expression 'regex.Regex('\\A(?!(?:Chair|Commissioner|Conseiller|Councillor|Deputy|Dr|M|Maire|Mayor|Miss|Mme|Mr|Mrs|Ms|Regional|Warden)\\b)(?:(?:(?:\\p{Lu}\\.)+|\\p{Lu}+|(?:Jr|Rev|Sr|St)\\.|da|de|den|der|la|van|von|[("](?:\\p{Lu}+|\\p{Lu}\\p{Ll}*(?:-\\p{Lu}\\p{Ll}*)*)[)"]|(?:D\'|d\'|De|de|Des|Di|Du|L\'|La|Le|Mac|Mc|O\'|San|St\\.|Van|Vander?|van|vanden)?\\p{Lu}\\p{Ll}+|\\p{Lu}\\p{Ll}+Anne?|Marie\\p{Lu}\\p{Ll}+|[ᐁᐃᐄᐅᐆᐊᐋᐯᐱᐲᐳᐴᐸᐹᑉᑊᑌᑎᑏᑐᑑᑕᑖᑦᑫᑭᑮᑯᑰᑲᑳᒃᒉᒋᒌᒍᒎᒐᒑᒡᒣᒥᒦᒧᒨᒪᒫᒻᓀᓂᓃᓄᓅᓇᓈᓐᓓᓕᓖᓗᓘᓚᓛᓪᓭᓯᓰᓱᓲᓴᓵᔅᔦᔨᔩᔪᔫᔭᔮᔾᕂᕆᕇᕈᕉᕋᕌᕐᕓᕕᕖᕗᕘᕙᕚᕝᕴᕵᕶᕷᕸᕹᕺᕻᕼᕿᖀᖁᖂᖃᖄᖅᖏᖐᖑᖒᖓᖔᖕᖖᖠᖡᖢᖣᖤᖥᖦᖨᖩᖪᖫᖬᖭᖮᖯᙯᙰᙱᙲᙳᙴᙵᙶ\U00011ab0\U00011ab1\U00011ab2\U00011ab3\U00011ab4\U00011ab5\U00011ab6\U00011ab7\U00011ab8\U00011ab9\U00011aba\U00011abb]+|Á\'a:líya|A\'aliya|Ch\'ng|Prud\'homme|Qwulti\'stunaat|Ya\'ara|D!ONNE|ChiefCalf|IsaBelle)(?:\'|-| - | ))+(?:(?:\\p{Lu}\\.)+|\\p{Lu}+|(?:Jr|Rev|Sr|St)\\.|da|de|den|der|la|van|von|[("](?:\\p{Lu}+|\\p{Lu}\\p{Ll}*(?:-\\p{Lu}\\p{Ll}*)*)[)"]|(?:D\'|d\'|De|de|Des|Di|Du|L\'|La|Le|Mac|Mc|O\'|San|St\\.|Van|Vander?|van|vanden)?\\p{Lu}\\p{Ll}+|\\p{Lu}\\p{Ll}+Anne?|Marie\\p{Lu}\\p{Ll}+|[ᐁᐃᐄᐅᐆᐊᐋᐯᐱᐲᐳᐴᐸᐹᑉᑊᑌᑎᑏᑐᑑᑕᑖᑦᑫᑭᑮᑯᑰᑲᑳᒃᒉᒋᒌᒍᒎᒐᒑᒡᒣᒥᒦᒧᒨᒪᒫᒻᓀᓂᓃᓄᓅᓇᓈᓐᓓᓕᓖᓗᓘᓚᓛᓪᓭᓯᓰᓱᓲᓴᓵᔅᔦᔨᔩᔪᔫᔭᔮᔾᕂᕆᕇᕈᕉᕋᕌᕐᕓᕕᕖᕗᕘᕙᕚᕝᕴᕵᕶᕷᕸᕹᕺᕻᕼᕿᖀᖁᖂᖃᖄᖅᖏᖐᖑᖒᖓᖔᖕᖖᖠᖡᖢᖣᖤᖥᖦᖨᖩᖪᖫᖬᖭᖮᖯᙯᙰᙱᙲᙳᙴᙵᙶ\U00011ab0\U00011ab1\U00011ab2\U00011ab3\U00011ab4\U00011ab5\U00011ab6\U00011ab7\U00011ab8\U00011ab9\U00011aba\U00011abb]+|Á\'a:líya|A\'aliya|Ch\'ng|Prud\'homme|Qwulti\'stunaat|Ya\'ara|D!ONNE|ChiefCalf|IsaBelle)\\Z', flags=regex.V0)'
Value '' for field '<obj>.name' cannot be blank'
|
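The two cape_breton errors above are a blank `name` reaching pupa's schema validation. A minimal guard, sketched here with hypothetical helper names (this is not the actual scraper code), would drop unusable name fragments before a CanadianPerson is ever built:

```python
# Hedged sketch: normalize scraped name fragments and drop blanks before
# constructing a person object. clean_name/usable_names are illustrative
# helpers, not part of the pupa or scrapers codebase.

def clean_name(raw):
    """Collapse whitespace in a scraped fragment; return None if nothing is left."""
    if raw is None:
        return None
    name = " ".join(str(raw).split())
    return name or None

def usable_names(fragments):
    """Keep only fragments that survive cleaning."""
    return [n for n in (clean_name(f) for f in fragments) if n]
```

Skipping (and logging) blank rows turns a hard ScrapeValueError abort into a per-row warning, at the cost of silently missing a councillor when the page layout changes.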
C | ca_ns_halifax | 2025-04-02 04:53:00 | 2025-04-02 04:53:01
C | ca_nt | 2025-04-02 04:31:22 | 2025-04-02 04:31:22
A | ca_nu | 2025-04-02 04:35:37 | 2025-04-02 04:35:37
C | ca_on | 2025-04-02 04:55:21 | 2025-04-02 04:55:21
C | ca_on_ajax | 2025-04-02 04:36:43 | 2025-04-02 04:36:43
C | ca_on_belleville | 2025-04-02 04:35:42 | 2025-04-02 04:35:42
C | ca_on_brampton | 2025-04-02 04:45:47 | 2025-04-02 04:45:47
C | ca_on_brantford | 2025-04-02 04:51:41 | 2025-04-02 04:51:41
C | ca_on_burlington | 2025-04-02 04:45:38 | 2025-04-02 04:45:38
C | ca_on_caledon | 2025-04-02 04:40:45 | 2025-04-02 04:40:46
C | ca_on_cambridge | 2025-04-02 04:46:26 | 2025-04-02 04:46:26
C | ca_on_chatham_kent | 2025-04-02 04:51:37 | 2025-04-02 04:51:37
C | ca_on_clarington | 2025-04-02 04:35:57 | 2025-04-02 04:35:57
C | ca_on_fort_erie | 2025-04-02 04:36:39 | 2025-04-02 04:36:39
C | ca_on_georgina | 2025-04-02 04:02:30 | 2025-04-02 04:02:30
C | ca_on_greater_sudbury | 2025-04-02 04:45:42 | 2025-04-02 04:45:42
C | ca_on_grimsby | 2025-04-02 04:52:15 | 2025-04-02 04:52:15
C | ca_on_guelph | 2025-04-02 04:51:44 | 2025-04-02 04:51:44
D> | ca_on_haldimand_county | | 2025-04-02 04:37:39 | AssertionError: No councillors found
Traceback (most recent call last):
  File "/app/reports/utils.py", line 73, in scrape_people
    report.report = subcommand.handle(args, other)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
    return self.do_handle(args, other, juris)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
    report['scrape'] = self.do_scrape(juris, args, scrapers)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
    report[scraper_name] = scraper.do_scrape(**scrape_args)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 99, in do_scrape
    for obj in self.scrape(**kwargs) or []:
  File "/app/scrapers/ca_on_haldimand_county/people.py", line 12, in scrape
    assert len(councillors), "No councillors found"
AssertionError: No councillors found
|
C | ca_on_hamilton | 2025-04-02 04:47:49 | 2025-04-02 04:47:49
C | ca_on_huron | 2025-04-02 04:44:47 | 2025-04-02 04:44:47
C | ca_on_kawartha_lakes | 2025-04-02 04:36:10 | 2025-04-02 04:36:10
C | ca_on_king | 2025-04-02 04:41:10 | 2025-04-02 04:41:10
C | ca_on_kingston | 2025-04-02 04:45:28 | 2025-04-02 04:45:28
C | ca_on_kitchener | 2025-04-02 04:51:53 | 2025-04-02 04:51:53
D> | ca_on_lambton | | 2025-04-02 04:38:10 | IndexError: list index out of range
Traceback (most recent call last):
  File "/app/reports/utils.py", line 73, in scrape_people
    report.report = subcommand.handle(args, other)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
    return self.do_handle(args, other, juris)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
    report['scrape'] = self.do_scrape(juris, args, scrapers)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
    report[scraper_name] = scraper.do_scrape(**scrape_args)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 99, in do_scrape
    for obj in self.scrape(**kwargs) or []:
  File "/app/scrapers/ca_on_lambton/people.py", line 15, in scrape
    text = councillor.xpath(".//h3/text()")[0]
IndexError: list index out of range
|
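Several of the `IndexError: list index out of range` failures in this run (ca_on_lambton above, ca_on_markham, ca_qc_senneville) come from unchecked `[0]` indexing on `xpath()` results, which return an empty list when the markup changes. A small helper, sketched here over plain lists (the name `first` is illustrative, not an existing utility in these scrapers), makes the failure mode explicit:

```python
# Hedged sketch: lxml's .xpath() returns a list, and indexing [0] on an
# empty result raises IndexError. first() returns a default instead, so
# the scraper can skip or log the element rather than crash.

def first(results, default=None):
    """Return the first XPath match, or `default` when there is none."""
    return results[0] if results else default
```

For example, `text = first(councillor.xpath(".//h3/text()"))` followed by a skip when `text is None` would fail one councillor instead of aborting the whole jurisdiction.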
C | ca_on_lasalle | 2025-04-02 04:49:27 | 2025-04-02 04:49:27
C | ca_on_lincoln | 2025-04-02 04:41:57 | 2025-04-02 04:41:57
C | ca_on_london | 2025-04-02 04:43:08 | 2025-04-02 04:43:08
D> | ca_on_markham | | 2025-04-02 04:36:31 | IndexError: list index out of range
Traceback (most recent call last):
  File "/app/reports/utils.py", line 73, in scrape_people
    report.report = subcommand.handle(args, other)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
    return self.do_handle(args, other, juris)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
    report['scrape'] = self.do_scrape(juris, args, scrapers)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
    report[scraper_name] = scraper.do_scrape(**scrape_args)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 101, in do_scrape
    for iterobj in obj:
  File "/app/scrapers/ca_on_markham/people.py", line 87, in scrape_mayor
    name = page.xpath(
IndexError: list index out of range
|
D> | ca_on_milton | | 2025-04-02 07:26:09 | AssertionError: No councillors found
Traceback (most recent call last):
  File "/app/reports/utils.py", line 73, in scrape_people
    report.report = subcommand.handle(args, other)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
    return self.do_handle(args, other, juris)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
    report['scrape'] = self.do_scrape(juris, args, scrapers)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
    report[scraper_name] = scraper.do_scrape(**scrape_args)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 99, in do_scrape
    for obj in self.scrape(**kwargs) or []:
  File "/app/scrapers/ca_on_milton/people.py", line 19, in scrape
    assert len(councillors), "No councillors found"
AssertionError: No councillors found
|
C | ca_on_mississauga | 2025-04-02 04:49:43 | 2025-04-02 04:49:43
C | ca_on_newmarket | 2025-04-02 04:45:22 | 2025-04-02 04:45:22
C | ca_on_niagara | 2025-04-02 04:36:04 | 2025-04-02 04:36:04
C | ca_on_niagara_on_the_lake | 2025-04-02 04:01:49 | 2025-04-02 04:01:50
C | ca_on_north_dumfries | 2025-04-02 04:49:20 | 2025-04-02 04:49:20
C | ca_on_oakville | 2025-04-02 04:49:16 | 2025-04-02 04:49:16
C | ca_on_oshawa | 2025-04-02 04:36:34 | 2025-04-02 04:36:34
C | ca_on_ottawa | 2025-04-02 04:37:44 | 2025-04-02 04:37:44
D>
04:43:11 WARNING scrapelib: sleeping for 10 seconds before retry
04:43:21 WARNING scrapelib: sleeping for 20 seconds before retry
04:43:41 WARNING scrapelib: sleeping for 40 seconds before retry
| ca_on_peel | | 2025-04-02 04:44:21 | scrapelib.HTTPError: 500 while retrieving https://services6.arcgis.com/ONZht79c8QWuX759/arcgis/rest/services/Peel_Ward_Bound…
Traceback (most recent call last):
  File "/app/reports/utils.py", line 73, in scrape_people
    report.report = subcommand.handle(args, other)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
    return self.do_handle(args, other, juris)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
    report['scrape'] = self.do_scrape(juris, args, scrapers)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
    report[scraper_name] = scraper.do_scrape(**scrape_args)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 99, in do_scrape
    for obj in self.scrape(**kwargs) or []:
  File "/app/scrapers/utils.py", line 405, in scrape
    reader = self.csv_reader(
  File "/app/scrapers/utils.py", line 251, in csv_reader
    response = self.get(url, **kwargs)
  File "/app/scrapers/utils.py", line 198, in get
    return super().get(*args, verify=kwargs.pop("verify", SSL_VERIFY), **kwargs)
  File "/app/.heroku/python/lib/python3.9/site-packages/requests/sessions.py", line 602, in get
    return self.request("GET", url, **kwargs)
  File "/app/.heroku/python/lib/python3.9/site-packages/scrapelib/__init__.py", line 602, in request
    raise HTTPError(resp)
scrapelib.HTTPError: 500 while retrieving https://services6.arcgis.com/ONZht79c8QWuX759/arcgis/rest/services/Peel_Ward_Boundary/FeatureServer/replicafilescache/Peel_Ward_Boundary_-3456469171846657907.csv
|
D> | ca_on_pickering | | 2025-04-02 04:00:29 | AssertionError: No councillors found
Traceback (most recent call last):
  File "/app/reports/utils.py", line 73, in scrape_people
    report.report = subcommand.handle(args, other)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
    return self.do_handle(args, other, juris)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
    report['scrape'] = self.do_scrape(juris, args, scrapers)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
    report[scraper_name] = scraper.do_scrape(**scrape_args)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 99, in do_scrape
    for obj in self.scrape(**kwargs) or []:
  File "/app/scrapers/ca_on_pickering/people.py", line 17, in scrape
    assert len(councillors), "No councillors found"
AssertionError: No councillors found
|
C | ca_on_richmond_hill | 2025-04-02 04:45:00 | 2025-04-02 04:45:01
C | ca_on_sault_ste_marie | 2025-04-02 04:52:07 | 2025-04-02 04:52:07
C | ca_on_st_catharines | 2025-04-02 04:10:25 | 2025-04-02 04:10:25
C | ca_on_thunder_bay | 2025-04-02 04:43:03 | 2025-04-02 04:43:03
C | ca_on_toronto | 2025-04-02 04:33:01 | 2025-04-02 04:33:01
C | ca_on_uxbridge | 2025-04-02 04:48:10 | 2025-04-02 04:48:10
C | ca_on_vaughan | 2025-04-02 04:47:29 | 2025-04-02 04:47:29
C | ca_on_waterloo | 2025-04-02 04:45:25 | 2025-04-02 04:45:25
C | ca_on_waterloo_region | 2025-04-02 04:41:05 | 2025-04-02 04:41:05
C | ca_on_welland | 2025-04-02 04:53:05 | 2025-04-02 04:53:05
C | ca_on_wellesley | 2025-04-02 04:46:23 | 2025-04-02 04:46:23
C | ca_on_whitby | 2025-04-02 04:46:11 | 2025-04-02 04:46:11
C | ca_on_whitchurch_stouffville | 2025-04-02 04:46:18 | 2025-04-02 04:46:19
C | ca_on_wilmot | 2025-04-02 04:52:26 | 2025-04-02 04:52:26
D> | ca_on_windsor | | 2025-04-02 04:30:57 | IndexError: list index out of range
Traceback (most recent call last):
  File "/app/reports/utils.py", line 73, in scrape_people
    report.report = subcommand.handle(args, other)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
    return self.do_handle(args, other, juris)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
    report['scrape'] = self.do_scrape(juris, args, scrapers)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
    report[scraper_name] = scraper.do_scrape(**scrape_args)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 99, in do_scrape
    for obj in self.scrape(**kwargs) or []:
  File "/app/scrapers/ca_on_windsor/people.py", line 13, in scrape
    data = json.loads(self.get(data_url).text.split(" = ")[1])
IndexError: list index out of range
|
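The ca_on_windsor failure above splits a JavaScript assignment on the literal string `" = "`, which breaks as soon as the page drops or changes the spacing. A more tolerant extraction, sketched here with an illustrative function name (the real data URL and payload are not reproduced), partitions on the first `=` instead:

```python
# Hedged sketch: pull the JSON value out of a "var data = {...};" style
# JavaScript assignment without depending on exact spacing around "=".
import json

def json_after_assignment(text):
    """Return the parsed JSON value following the first '=' in `text`."""
    _, sep, value = text.partition("=")
    if not sep:
        raise ValueError("no assignment found in page text")
    return json.loads(value.strip().rstrip(";"))
```

This still fails loudly when no assignment is present, but no longer depends on the site's whitespace conventions.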
D> | ca_on_woolwich | | 2025-04-02 04:52:10 | scrapelib.HTTPError: 404 while retrieving https://www.woolwich.ca/en/council/council.asp
Traceback (most recent call last):
  File "/app/reports/utils.py", line 73, in scrape_people
    report.report = subcommand.handle(args, other)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
    return self.do_handle(args, other, juris)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
    report['scrape'] = self.do_scrape(juris, args, scrapers)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
    report[scraper_name] = scraper.do_scrape(**scrape_args)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 99, in do_scrape
    for obj in self.scrape(**kwargs) or []:
  File "/app/scrapers/ca_on_woolwich/people.py", line 13, in scrape
    page = self.lxmlize(COUNCIL_PAGE)
  File "/app/scrapers/utils.py", line 217, in lxmlize
    response = self.get(url, cookies=cookies, verify=verify)
  File "/app/scrapers/utils.py", line 198, in get
    return super().get(*args, verify=kwargs.pop("verify", SSL_VERIFY), **kwargs)
  File "/app/.heroku/python/lib/python3.9/site-packages/requests/sessions.py", line 602, in get
    return self.request("GET", url, **kwargs)
  File "/app/.heroku/python/lib/python3.9/site-packages/scrapelib/__init__.py", line 602, in request
    raise HTTPError(resp)
scrapelib.HTTPError: 404 while retrieving https://www.woolwich.ca/en/council/council.asp
|
C | ca_pe | 2025-04-02 04:37:18 | 2025-04-02 04:37:18
C | ca_pe_charlottetown | 2025-04-02 04:42:03 | 2025-04-02 04:42:03
C | ca_pe_stratford | 2025-04-02 04:51:49 | 2025-04-02 04:51:49
C | ca_pe_summerside | 2025-04-02 04:02:17 | 2025-04-02 04:02:18
C | ca_qc | 2025-04-02 04:09:02 | 2025-04-02 04:09:03
C | ca_qc_beaconsfield | 2025-04-02 04:41:22 | 2025-04-02 04:41:22
D>
04:49:23 WARNING pupa: validation of CanadianPerson d93529bc-0f7d-11f0-b7f4-1a512ca27df6 failed: 1 validation errors:
Value 'None' for field '<obj>.name' does not match regular expression 'regex.Regex('\\A(?!(?:Chair|Commissioner|Conseiller|Councillor|Deputy|Dr|M|Maire|Mayor|Miss|Mme|Mr|Mrs|Ms|Regional|Warden)\\b)(?:(?:(?:\\p{Lu}\\.)+|\\p{Lu}+|(?:Jr|Rev|Sr|St)\\.|da|de|den|der|la|van|von|[("](?:\\p{Lu}+|\\p{Lu}\\p{Ll}*(?:-\\p{Lu}\\p{Ll}*)*)[)"]|(?:D\'|d\'|De|de|Des|Di|Du|L\'|La|Le|Mac|Mc|O\'|San|St\\.|Van|Vander?|van|vanden)?\\p{Lu}\\p{Ll}+|\\p{Lu}\\p{Ll}+Anne?|Marie\\p{Lu}\\p{Ll}+|[ᐁᐃᐄᐅᐆᐊᐋᐯᐱᐲᐳᐴᐸᐹᑉᑊᑌᑎᑏᑐᑑᑕᑖᑦᑫᑭᑮᑯᑰᑲᑳᒃᒉᒋᒌᒍᒎᒐᒑᒡᒣᒥᒦᒧᒨᒪᒫᒻᓀᓂᓃᓄᓅᓇᓈᓐᓓᓕᓖᓗᓘᓚᓛᓪᓭᓯᓰᓱᓲᓴᓵᔅᔦᔨᔩᔪᔫᔭᔮᔾᕂᕆᕇᕈᕉᕋᕌᕐᕓᕕᕖᕗᕘᕙᕚᕝᕴᕵᕶᕷᕸᕹᕺᕻᕼᕿᖀᖁᖂᖃᖄᖅᖏᖐᖑᖒᖓᖔᖕᖖᖠᖡᖢᖣᖤᖥᖦᖨᖩᖪᖫᖬᖭᖮᖯᙯᙰᙱᙲᙳᙴᙵᙶ\U00011ab0\U00011ab1\U00011ab2\U00011ab3\U00011ab4\U00011ab5\U00011ab6\U00011ab7\U00011ab8\U00011ab9\U00011aba\U00011abb]+|Á\'a:líya|A\'aliya|Ch\'ng|Prud\'homme|Qwulti\'stunaat|Ya\'ara|D!ONNE|ChiefCalf|IsaBelle)(?:\'|-| - | ))+(?:(?:\\p{Lu}\\.)+|\\p{Lu}+|(?:Jr|Rev|Sr|St)\\.|da|de|den|der|la|van|von|[("](?:\\p{Lu}+|\\p{Lu}\\p{Ll}*(?:-\\p{Lu}\\p{Ll}*)*)[)"]|(?:D\'|d\'|De|de|Des|Di|Du|L\'|La|Le|Mac|Mc|O\'|San|St\\.|Van|Vander?|van|vanden)?\\p{Lu}\\p{Ll}+|\\p{Lu}\\p{Ll}+Anne?|Marie\\p{Lu}\\p{Ll}+|[ᐁᐃᐄᐅᐆᐊᐋᐯᐱᐲᐳᐴᐸᐹᑉᑊᑌᑎᑏᑐᑑᑕᑖᑦᑫᑭᑮᑯᑰᑲᑳᒃᒉᒋᒌᒍᒎᒐᒑᒡᒣᒥᒦᒧᒨᒪᒫᒻᓀᓂᓃᓄᓅᓇᓈᓐᓓᓕᓖᓗᓘᓚᓛᓪᓭᓯᓰᓱᓲᓴᓵᔅᔦᔨᔩᔪᔫᔭᔮᔾᕂᕆᕇᕈᕉᕋᕌᕐᕓᕕᕖᕗᕘᕙᕚᕝᕴᕵᕶᕷᕸᕹᕺᕻᕼᕿᖀᖁᖂᖃᖄᖅᖏᖐᖑᖒᖓᖔᖕᖖᖠᖡᖢᖣᖤᖥᖦᖨᖩᖪᖫᖬᖭᖮᖯᙯᙰᙱᙲᙳᙴᙵᙶ\U00011ab0\U00011ab1\U00011ab2\U00011ab3\U00011ab4\U00011ab5\U00011ab6\U00011ab7\U00011ab8\U00011ab9\U00011aba\U00011abb]+|Á\'a:líya|A\'aliya|Ch\'ng|Prud\'homme|Qwulti\'stunaat|Ya\'ara|D!ONNE|ChiefCalf|IsaBelle)\\Z', flags=regex.V0)'
| ca_qc_brossard | | 2025-04-02 04:49:23 | Value 'None' for field '<obj>.name' does not match regular expression 'regex.Regex('\\A(?!(?:Chair|Commissioner|Conseiller|C…
Traceback (most recent call last):
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 175, in validate
    validator.validate(self.as_dict(), schema)
  File "/app/.heroku/python/lib/python3.9/site-packages/validictory/validator.py", line 616, in validate
    raise MultipleValidationError(self._errors)
validictory.validator.MultipleValidationError: 1 validation errors:
Value 'None' for field '<obj>.name' does not match regular expression 'regex.Regex('\\A(?!(?:Chair|Commissioner|Conseiller|Councillor|Deputy|Dr|M|Maire|Mayor|Miss|Mme|Mr|Mrs|Ms|Regional|Warden)\\b)(?:(?:(?:\\p{Lu}\\.)+|\\p{Lu}+|(?:Jr|Rev|Sr|St)\\.|da|de|den|der|la|van|von|[("](?:\\p{Lu}+|\\p{Lu}\\p{Ll}*(?:-\\p{Lu}\\p{Ll}*)*)[)"]|(?:D\'|d\'|De|de|Des|Di|Du|L\'|La|Le|Mac|Mc|O\'|San|St\\.|Van|Vander?|van|vanden)?\\p{Lu}\\p{Ll}+|\\p{Lu}\\p{Ll}+Anne?|Marie\\p{Lu}\\p{Ll}+|[ᐁᐃᐄᐅᐆᐊᐋᐯᐱᐲᐳᐴᐸᐹᑉᑊᑌᑎᑏᑐᑑᑕᑖᑦᑫᑭᑮᑯᑰᑲᑳᒃᒉᒋᒌᒍᒎᒐᒑᒡᒣᒥᒦᒧᒨᒪᒫᒻᓀᓂᓃᓄᓅᓇᓈᓐᓓᓕᓖᓗᓘᓚᓛᓪᓭᓯᓰᓱᓲᓴᓵᔅᔦᔨᔩᔪᔫᔭᔮᔾᕂᕆᕇᕈᕉᕋᕌᕐᕓᕕᕖᕗᕘᕙᕚᕝᕴᕵᕶᕷᕸᕹᕺᕻᕼᕿᖀᖁᖂᖃᖄᖅᖏᖐᖑᖒᖓᖔᖕᖖᖠᖡᖢᖣᖤᖥᖦᖨᖩᖪᖫᖬᖭᖮᖯᙯᙰᙱᙲᙳᙴᙵᙶ\U00011ab0\U00011ab1\U00011ab2\U00011ab3\U00011ab4\U00011ab5\U00011ab6\U00011ab7\U00011ab8\U00011ab9\U00011aba\U00011abb]+|Á\'a:líya|A\'aliya|Ch\'ng|Prud\'homme|Qwulti\'stunaat|Ya\'ara|D!ONNE|ChiefCalf|IsaBelle)(?:\'|-| - | ))+(?:(?:\\p{Lu}\\.)+|\\p{Lu}+|(?:Jr|Rev|Sr|St)\\.|da|de|den|der|la|van|von|[("](?:\\p{Lu}+|\\p{Lu}\\p{Ll}*(?:-\\p{Lu}\\p{Ll}*)*)[)"]|(?:D\'|d\'|De|de|Des|Di|Du|L\'|La|Le|Mac|Mc|O\'|San|St\\.|Van|Vander?|van|vanden)?\\p{Lu}\\p{Ll}+|\\p{Lu}\\p{Ll}+Anne?|Marie\\p{Lu}\\p{Ll}+|[ᐁᐃᐄᐅᐆᐊᐋᐯᐱᐲᐳᐴᐸᐹᑉᑊᑌᑎᑏᑐᑑᑕᑖᑦᑫᑭᑮᑯᑰᑲᑳᒃᒉᒋᒌᒍᒎᒐᒑᒡᒣᒥᒦᒧᒨᒪᒫᒻᓀᓂᓃᓄᓅᓇᓈᓐᓓᓕᓖᓗᓘᓚᓛᓪᓭᓯᓰᓱᓲᓴᓵᔅᔦᔨᔩᔪᔫᔭᔮᔾᕂᕆᕇᕈᕉᕋᕌᕐᕓᕕᕖᕗᕘᕙᕚᕝᕴᕵᕶᕷᕸᕹᕺᕻᕼᕿᖀᖁᖂᖃᖄᖅᖏᖐᖑᖒᖓᖔᖕᖖᖠᖡᖢᖣᖤᖥᖦᖨᖩᖪᖫᖬᖭᖮᖯᙯᙰᙱᙲᙳᙴᙵᙶ\U00011ab0\U00011ab1\U00011ab2\U00011ab3\U00011ab4\U00011ab5\U00011ab6\U00011ab7\U00011ab8\U00011ab9\U00011aba\U00011abb]+|Á\'a:líya|A\'aliya|Ch\'ng|Prud\'homme|Qwulti\'stunaat|Ya\'ara|D!ONNE|ChiefCalf|IsaBelle)\\Z', flags=regex.V0)'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/app/reports/utils.py", line 73, in scrape_people
    report.report = subcommand.handle(args, other)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
    return self.do_handle(args, other, juris)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
    report['scrape'] = self.do_scrape(juris, args, scrapers)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
    report[scraper_name] = scraper.do_scrape(**scrape_args)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 104, in do_scrape
    self.save_object(obj)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 89, in save_object
    raise ve
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 85, in save_object
    obj.validate()
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 177, in validate
    raise ScrapeValueError('validation of {} {} failed: {}'.format(
pupa.exceptions.ScrapeValueError: validation of CanadianPerson d93529bc-0f7d-11f0-b7f4-1a512ca27df6 failed: 1 validation errors:
Value 'None' for field '<obj>.name' does not match regular expression 'regex.Regex('\\A(?!(?:Chair|Commissioner|Conseiller|Councillor|Deputy|Dr|M|Maire|Mayor|Miss|Mme|Mr|Mrs|Ms|Regional|Warden)\\b)(?:(?:(?:\\p{Lu}\\.)+|\\p{Lu}+|(?:Jr|Rev|Sr|St)\\.|da|de|den|der|la|van|von|[("](?:\\p{Lu}+|\\p{Lu}\\p{Ll}*(?:-\\p{Lu}\\p{Ll}*)*)[)"]|(?:D\'|d\'|De|de|Des|Di|Du|L\'|La|Le|Mac|Mc|O\'|San|St\\.|Van|Vander?|van|vanden)?\\p{Lu}\\p{Ll}+|\\p{Lu}\\p{Ll}+Anne?|Marie\\p{Lu}\\p{Ll}+|[ᐁᐃᐄᐅᐆᐊᐋᐯᐱᐲᐳᐴᐸᐹᑉᑊᑌᑎᑏᑐᑑᑕᑖᑦᑫᑭᑮᑯᑰᑲᑳᒃᒉᒋᒌᒍᒎᒐᒑᒡᒣᒥᒦᒧᒨᒪᒫᒻᓀᓂᓃᓄᓅᓇᓈᓐᓓᓕᓖᓗᓘᓚᓛᓪᓭᓯᓰᓱᓲᓴᓵᔅᔦᔨᔩᔪᔫᔭᔮᔾᕂᕆᕇᕈᕉᕋᕌᕐᕓᕕᕖᕗᕘᕙᕚᕝᕴᕵᕶᕷᕸᕹᕺᕻᕼᕿᖀᖁᖂᖃᖄᖅᖏᖐᖑᖒᖓᖔᖕᖖᖠᖡᖢᖣᖤᖥᖦᖨᖩᖪᖫᖬᖭᖮᖯᙯᙰᙱᙲᙳᙴᙵᙶ\U00011ab0\U00011ab1\U00011ab2\U00011ab3\U00011ab4\U00011ab5\U00011ab6\U00011ab7\U00011ab8\U00011ab9\U00011aba\U00011abb]+|Á\'a:líya|A\'aliya|Ch\'ng|Prud\'homme|Qwulti\'stunaat|Ya\'ara|D!ONNE|ChiefCalf|IsaBelle)(?:\'|-| - | ))+(?:(?:\\p{Lu}\\.)+|\\p{Lu}+|(?:Jr|Rev|Sr|St)\\.|da|de|den|der|la|van|von|[("](?:\\p{Lu}+|\\p{Lu}\\p{Ll}*(?:-\\p{Lu}\\p{Ll}*)*)[)"]|(?:D\'|d\'|De|de|Des|Di|Du|L\'|La|Le|Mac|Mc|O\'|San|St\\.|Van|Vander?|van|vanden)?\\p{Lu}\\p{Ll}+|\\p{Lu}\\p{Ll}+Anne?|Marie\\p{Lu}\\p{Ll}+|[ᐁᐃᐄᐅᐆᐊᐋᐯᐱᐲᐳᐴᐸᐹᑉᑊᑌᑎᑏᑐᑑᑕᑖᑦᑫᑭᑮᑯᑰᑲᑳᒃᒉᒋᒌᒍᒎᒐᒑᒡᒣᒥᒦᒧᒨᒪᒫᒻᓀᓂᓃᓄᓅᓇᓈᓐᓓᓕᓖᓗᓘᓚᓛᓪᓭᓯᓰᓱᓲᓴᓵᔅᔦᔨᔩᔪᔫᔭᔮᔾᕂᕆᕇᕈᕉᕋᕌᕐᕓᕕᕖᕗᕘᕙᕚᕝᕴᕵᕶᕷᕸᕹᕺᕻᕼᕿᖀᖁᖂᖃᖄᖅᖏᖐᖑᖒᖓᖔᖕᖖᖠᖡᖢᖣᖤᖥᖦᖨᖩᖪᖫᖬᖭᖮᖯᙯᙰᙱᙲᙳᙴᙵᙶ\U00011ab0\U00011ab1\U00011ab2\U00011ab3\U00011ab4\U00011ab5\U00011ab6\U00011ab7\U00011ab8\U00011ab9\U00011aba\U00011abb]+|Á\'a:líya|A\'aliya|Ch\'ng|Prud\'homme|Qwulti\'stunaat|Ya\'ara|D!ONNE|ChiefCalf|IsaBelle)\\Z', flags=regex.V0)'
|
C | ca_qc_cote_saint_luc | 2025-04-02 04:36:47 | 2025-04-02 04:36:47
C | ca_qc_dollard_des_ormeaux | 2025-04-02 04:38:46 | 2025-04-02 04:38:46
C | ca_qc_dorval | 2025-04-02 04:44:28 | 2025-04-02 04:44:28
C | ca_qc_gatineau | 2025-04-02 04:35:11 | 2025-04-02 04:35:11
C | ca_qc_kirkland | 2025-04-02 04:42:09 | 2025-04-02 04:42:10
C | ca_qc_laval | 2025-04-02 05:42:08 | 2025-04-02 05:42:08
C | ca_qc_levis | 2025-04-02 04:37:26 | 2025-04-02 04:37:27
C | ca_qc_longueuil | 2025-04-02 04:49:06 | 2025-04-02 04:49:06
D>
04:49:50 WARNING scrapelib: sleeping for 10 seconds before retry
04:50:01 WARNING scrapelib: sleeping for 20 seconds before retry
04:50:21 WARNING scrapelib: sleeping for 40 seconds before retry
| ca_qc_mercier | | 2025-04-02 04:51:02 | scrapelib.HTTPError: 403 while retrieving https://www.ville.mercier.qc.ca/affaires-municipales/conseil-municipal/membres-du-…
Traceback (most recent call last):
  File "/app/reports/utils.py", line 73, in scrape_people
    report.report = subcommand.handle(args, other)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
    return self.do_handle(args, other, juris)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
    report['scrape'] = self.do_scrape(juris, args, scrapers)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
    report[scraper_name] = scraper.do_scrape(**scrape_args)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 99, in do_scrape
    for obj in self.scrape(**kwargs) or []:
  File "/app/scrapers/ca_qc_mercier/people.py", line 9, in scrape
    page = self.lxmlize(COUNCIL_PAGE)
  File "/app/scrapers/utils.py", line 217, in lxmlize
    response = self.get(url, cookies=cookies, verify=verify)
  File "/app/scrapers/utils.py", line 198, in get
    return super().get(*args, verify=kwargs.pop("verify", SSL_VERIFY), **kwargs)
  File "/app/.heroku/python/lib/python3.9/site-packages/requests/sessions.py", line 602, in get
    return self.request("GET", url, **kwargs)
  File "/app/.heroku/python/lib/python3.9/site-packages/scrapelib/__init__.py", line 602, in request
    raise HTTPError(resp)
scrapelib.HTTPError: 403 while retrieving https://www.ville.mercier.qc.ca/affaires-municipales/conseil-municipal/membres-du-conseil/
|
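The 403s in this run (ca_qc_mercier above, ca_qc_terrebonne below) are typical of hosts that reject the default python-requests User-Agent. Sending a browser-like header is a common first thing to try; whether these particular sites accept it is untested here, and the constant below is only an example value:

```python
# Hedged sketch using only the standard library: build a GET request
# that presents a browser-like User-Agent instead of Python's default.
import urllib.request

BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36"  # example value

def build_request(url, user_agent=BROWSER_UA):
    """Return a urllib Request carrying a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})
```

In these scrapers the equivalent change would go through scrapelib's session headers rather than urllib, but the idea is the same.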
C | ca_qc_montreal | 2025-04-02 04:36:19 | 2025-04-02 04:36:19
C | ca_qc_montreal_est | 2025-04-02 04:31:26 | 2025-04-02 04:31:26
C | ca_qc_pointe_claire | 2025-04-02 04:35:51 | 2025-04-02 04:35:51
C | ca_qc_quebec | 2025-04-02 04:31:49 | 2025-04-02 04:31:49
C | ca_qc_saguenay | 2025-04-02 05:43:53 | 2025-04-02 05:43:53
C | ca_qc_sainte_anne_de_bellevue | 2025-04-02 04:55:25 | 2025-04-02 04:55:25
C | ca_qc_saint_jean_sur_richelieu | 2025-04-02 04:38:06 | 2025-04-02 04:38:06
D> | ca_qc_saint_jerome | | 2025-04-02 04:32:57 | AssertionError: No councillors found
Traceback (most recent call last):
  File "/app/reports/utils.py", line 73, in scrape_people
    report.report = subcommand.handle(args, other)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
    return self.do_handle(args, other, juris)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
    report['scrape'] = self.do_scrape(juris, args, scrapers)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
    report[scraper_name] = scraper.do_scrape(**scrape_args)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 99, in do_scrape
    for obj in self.scrape(**kwargs) or []:
  File "/app/scrapers/ca_qc_saint_jerome/people.py", line 11, in scrape
    assert len(councillors), "No councillors found"
AssertionError: No councillors found
|
D> | ca_qc_senneville | | 2025-04-02 04:49:12 | IndexError: list index out of range
Traceback (most recent call last):
  File "/app/reports/utils.py", line 73, in scrape_people
    report.report = subcommand.handle(args, other)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
    return self.do_handle(args, other, juris)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
    report['scrape'] = self.do_scrape(juris, args, scrapers)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
    report[scraper_name] = scraper.do_scrape(**scrape_args)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 99, in do_scrape
    for obj in self.scrape(**kwargs) or []:
  File "/app/scrapers/ca_qc_senneville/people.py", line 25, in scrape
    image = councillor.xpath(".//img/@src")[0]
IndexError: list index out of range
|
C | ca_qc_sherbrooke | 2025-04-02 04:40:22 | 2025-04-02 04:40:23
D>
04:09:06 WARNING scrapelib: sleeping for 10 seconds before retry
04:09:16 WARNING scrapelib: sleeping for 20 seconds before retry
04:09:36 WARNING scrapelib: sleeping for 40 seconds before retry
| ca_qc_terrebonne | | 2025-04-02 04:10:16 | scrapelib.HTTPError: 403 while retrieving https://terrebonne.ca/membres-du-conseil-municipal/
Traceback (most recent call last):
  File "/app/reports/utils.py", line 73, in scrape_people
    report.report = subcommand.handle(args, other)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
    return self.do_handle(args, other, juris)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
    report['scrape'] = self.do_scrape(juris, args, scrapers)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
    report[scraper_name] = scraper.do_scrape(**scrape_args)
  File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 99, in do_scrape
    for obj in self.scrape(**kwargs) or []:
  File "/app/scrapers/ca_qc_terrebonne/people.py", line 9, in scrape
    page = self.lxmlize(COUNCIL_PAGE, "utf-8")
  File "/app/scrapers/utils.py", line 217, in lxmlize
    response = self.get(url, cookies=cookies, verify=verify)
  File "/app/scrapers/utils.py", line 198, in get
    return super().get(*args, verify=kwargs.pop("verify", SSL_VERIFY), **kwargs)
  File "/app/.heroku/python/lib/python3.9/site-packages/requests/sessions.py", line 602, in get
    return self.request("GET", url, **kwargs)
  File "/app/.heroku/python/lib/python3.9/site-packages/scrapelib/__init__.py", line 602, in request
    raise HTTPError(resp)
scrapelib.HTTPError: 403 while retrieving https://terrebonne.ca/membres-du-conseil-municipal/
|
C | ca_qc_trois_rivieres | 2025-04-02 04:42:36 | 2025-04-02 04:42:36
C | ca_qc_westmount | 2025-04-02 04:51:12 | 2025-04-02 04:51:12
C | ca_sk | 2025-04-02 05:45:03 | 2025-04-02 05:45:03
C | ca_sk_regina | 2025-04-02 04:45:18 | 2025-04-02 04:45:18
D>
04:38:49 WARNING scrapelib: got HTTPSConnectionPool(host='saskatoonopendataconfig.blob.core.windows.net', port=443): Max retries exceeded with url: /converteddata/MayorAndCityCouncilContactInformation.csv (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f7815bcd2e0>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 10 seconds before retry
04:38:59 WARNING scrapelib: got HTTPSConnectionPool(host='saskatoonopendataconfig.blob.core.windows.net', port=443): Max retries exceeded with url: /converteddata/MayorAndCityCouncilContactInformation.csv (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f7815a16bb0>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 20 seconds before retry
04:39:19 WARNING scrapelib: got HTTPSConnectionPool(host='saskatoonopendataconfig.blob.core.windows.net', port=443): Max retries exceeded with url: /converteddata/MayorAndCityCouncilContactInformation.csv (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f78159f2940>: Failed to establish a new connection: [Errno -2] Name or service not known')) sleeping for 40 seconds before retry
| ca_sk_saskatoon | | 2025-04-02 04:39:59 | requests.exceptions.ConnectionError: HTTPSConnectionPool(host='saskatoonopendataconfig.blob.core.windows.net', port=443): Ma…
Traceback (most recent call last):
File "/app/.heroku/python/lib/python3.9/site-packages/urllib3/connection.py", line 174, in _new_conn
conn = connection.create_connection(
File "/app/.heroku/python/lib/python3.9/site-packages/urllib3/util/connection.py", line 72, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
File "/app/.heroku/python/lib/python3.9/socket.py", line 966, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/app/.heroku/python/lib/python3.9/site-packages/urllib3/connectionpool.py", line 716, in urlopen
httplib_response = self._make_request(
File "/app/.heroku/python/lib/python3.9/site-packages/urllib3/connectionpool.py", line 404, in _make_request
self._validate_conn(conn)
File "/app/.heroku/python/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1061, in _validate_conn
conn.connect()
File "/app/.heroku/python/lib/python3.9/site-packages/urllib3/connection.py", line 363, in connect
self.sock = conn = self._new_conn()
File "/app/.heroku/python/lib/python3.9/site-packages/urllib3/connection.py", line 186, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f78159f2a90>: Failed to establish a new connection: [Errno -2] Name or service not known
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/app/.heroku/python/lib/python3.9/site-packages/requests/adapters.py", line 667, in send
resp = conn.urlopen(
File "/app/.heroku/python/lib/python3.9/site-packages/urllib3/connectionpool.py", line 802, in urlopen
retries = retries.increment(
File "/app/.heroku/python/lib/python3.9/site-packages/urllib3/util/retry.py", line 594, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='saskatoonopendataconfig.blob.core.windows.net', port=443): Max retries exceeded with url: /converteddata/MayorAndCityCouncilContactInformation.csv (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f78159f2a90>: Failed to establish a new connection: [Errno -2] Name or service not known'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/app/reports/utils.py", line 73, in scrape_people
report.report = subcommand.handle(args, other)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
return self.do_handle(args, other, juris)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
report['scrape'] = self.do_scrape(juris, args, scrapers)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
report[scraper_name] = scraper.do_scrape(**scrape_args)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 99, in do_scrape
for obj in self.scrape(**kwargs) or []:
File "/app/scrapers/utils.py", line 405, in scrape
reader = self.csv_reader(
File "/app/scrapers/utils.py", line 251, in csv_reader
response = self.get(url, **kwargs)
File "/app/scrapers/utils.py", line 198, in get
return super().get(*args, verify=kwargs.pop("verify", SSL_VERIFY), **kwargs)
File "/app/.heroku/python/lib/python3.9/site-packages/requests/sessions.py", line 602, in get
return self.request("GET", url, **kwargs)
File "/app/.heroku/python/lib/python3.9/site-packages/scrapelib/__init__.py", line 579, in request
resp = super().request(
File "/app/.heroku/python/lib/python3.9/site-packages/scrapelib/__init__.py", line 404, in request
resp = super().request(
File "/app/.heroku/python/lib/python3.9/site-packages/scrapelib/__init__.py", line 232, in request
return super().request(
File "/app/.heroku/python/lib/python3.9/site-packages/scrapelib/__init__.py", line 175, in request
raise exception_raised
File "/app/.heroku/python/lib/python3.9/site-packages/scrapelib/__init__.py", line 122, in request
resp = super().request(
File "/app/.heroku/python/lib/python3.9/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/app/.heroku/python/lib/python3.9/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/app/.heroku/python/lib/python3.9/site-packages/requests/adapters.py", line 700, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='saskatoonopendataconfig.blob.core.windows.net', port=443): Max retries exceeded with url: /converteddata/MayorAndCityCouncilContactInformation.csv (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f78159f2a90>: Failed to establish a new connection: [Errno -2] Name or service not known'))
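The root cause in the chained traceback above is `socket.gaierror: [Errno -2] Name or service not known`: the blob-storage hostname no longer resolves, so every retry fails before a connection is even attempted. A pre-flight check like the following (a sketch; `resolves` is a hypothetical helper, not part of the scrapers' utils) can separate a dead DNS name from a transient network fault:

```python
import socket

def resolves(host, port=443):
    # getaddrinfo raises socket.gaierror for an unresolvable name --
    # the same underlying error that requests wraps in ConnectionError.
    try:
        socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
        return True
    except socket.gaierror:
        return False
```

If a name like `saskatoonopendataconfig.blob.core.windows.net` returns False here, that points at a removed dataset or renamed storage account rather than something a retry loop can fix.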
|
D> | ca_yt | | 2025-04-02 04:52:18 | requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://yukonassembly.ca/mlas
Traceback (most recent call last):
File "/app/reports/utils.py", line 73, in scrape_people
report.report = subcommand.handle(args, other)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 260, in handle
return self.do_handle(args, other, juris)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 305, in do_handle
report['scrape'] = self.do_scrape(juris, args, scrapers)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/cli/commands/update.py", line 173, in do_scrape
report[scraper_name] = scraper.do_scrape(**scrape_args)
File "/app/.heroku/python/lib/python3.9/site-packages/pupa/scrape/base.py", line 99, in do_scrape
for obj in self.scrape(**kwargs) or []:
File "/app/scrapers/ca_yt/people.py", line 13, in scrape
page = self.cloudscrape(COUNCIL_PAGE)
File "/app/scrapers/utils.py", line 205, in cloudscrape
response.raise_for_status()
File "/app/.heroku/python/lib/python3.9/site-packages/requests/models.py", line 1024, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://yukonassembly.ca/mlas
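Note that the two 403 failures in this report surface through different paths: the ca_qc_terrebonne scraper gets a `scrapelib.HTTPError` raised by scrapelib's own retry loop, while `cloudscrape()` here calls `raise_for_status()`, yielding a plain `requests.exceptions.HTTPError`. The ca_yt message can be reproduced offline from a hand-built `Response` (a sketch for illustration only):

```python
import requests

# Build a minimal 403 response by hand; raise_for_status() then formats
# the same message seen in the ca_yt traceback above.
resp = requests.models.Response()
resp.status_code = 403
resp.reason = "Forbidden"
resp.url = "https://yukonassembly.ca/mlas"

try:
    resp.raise_for_status()
except requests.exceptions.HTTPError as exc:
    print(exc)  # → 403 Client Error: Forbidden for url: https://yukonassembly.ca/mlas
```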
|