
urllib vs urllib2 in Python - fetch the content of 404 or raise exception?


A small snippet of code using Python's urllib and urllib2 modules to fetch a page.

The difference I've just encountered is that urllib will return the content of a page even if the server responds with 404 (Not Found), while urllib2 will raise an exception.

urllib

examples/python/try_urllib.py

from __future__ import print_function
import urllib, sys


def fetch():
    if len(sys.argv) != 2:
        print("Usage: {} URL".format(sys.argv[0]))
        return
    url = sys.argv[1]
    f = urllib.urlopen(url)
    html = f.read()
    print(html)

fetch()

Running python try_urllib.py https://www.python.org/xyz will print a big HTML page even though that URL does not exist: urllib happily returns the content of the server's 404 error page, giving no indication that anything went wrong.

urllib2

examples/python/try_urllib2.py

from __future__ import print_function
import urllib2, sys


def fetch():
    if len(sys.argv) != 2:
        print("Usage: {} URL".format(sys.argv[0]))
        return
    url = sys.argv[1]
    try:
        f = urllib2.urlopen(url)
        html = f.read()
        print(html)
    except urllib2.HTTPError as e:
        print(e)

fetch()

Running python try_urllib2.py https://www.python.org/xyz will print

HTTP Error 404: OK
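In Python 3 the two modules were merged into urllib.request, which follows the urllib2 behavior: a 404 raises urllib.error.HTTPError. If you want the old urllib behavior and need the body of the error page anyway, you can read it from the exception object itself. The following is a minimal sketch demonstrating both points; the local http.server serving a hand-made 404 page is only there so the example is self-contained and does not depend on the network:

```python
import http.server
import threading
import urllib.request
import urllib.error

class NotFoundHandler(http.server.BaseHTTPRequestHandler):
    """A toy handler that answers every GET with a custom 404 page."""
    def do_GET(self):
        body = b"<html>custom 404 page</html>"
        self.send_response(404)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # silence per-request logging

# Start the test server on a random free port in a background thread.
server = http.server.HTTPServer(("127.0.0.1", 0), NotFoundHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:{}/xyz".format(server.server_address[1])

try:
    urllib.request.urlopen(url)
except urllib.error.HTTPError as e:
    error_message = str(e)          # like urllib2: the exception carries code and reason
    error_page = e.read().decode()  # like old urllib: the error page's content
finally:
    server.shutdown()

print(error_message)
print(error_page)
```

This prints HTTP Error 404: Not Found followed by the content of the error page, so you get both the exception-based error handling of urllib2 and access to the body that Python 2's urllib used to return.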
