2013-05-13, 08:36
As in every year, the German documentation for the TeX Live distribution is on my agenda. To check the more than 100 web links in the document, I wrote a small Python script that does the job fairly well.
import re
import urllib2

# Read the whole LaTeX source into memory
filehandle = open("texlive-de-new.tex")
text = filehandle.read()
filehandle.close()

# regexp from http://www.noah.org/wiki/RegEx_Python
m = re.findall(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', text)

i = 0
for item in m:
    i = i + 1
    # The trailing comma keeps the result on the same line as the URL
    print i, '\t', item, '\t',
    try:
        response = urllib2.urlopen(item)
    except urllib2.HTTPError, e:
        # The server answered with an error status, e.g. 404
        print e.code
    except urllib2.URLError, u:
        # No response at all, e.g. unknown host
        print u.args
    print "\n"
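The script above is written for Python 2; `urllib2` no longer exists in Python 3. For readers on Python 3, the same idea can be sketched roughly as follows. This is a port under assumptions, not the script from this post: the names `extract_urls`, `check_url` and `report`, the `opener` parameter, the 10-second timeout and the removal of repeated URLs are all illustrative additions.

```python
import re
import urllib.error
import urllib.request

# Same pattern as in the script above (from http://www.noah.org/wiki/RegEx_Python)
URL_PATTERN = re.compile(
    r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
)


def extract_urls(text):
    """Return the URLs found in text, in order of first appearance, without repeats."""
    seen = set()
    urls = []
    for url in URL_PATTERN.findall(text):
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def check_url(url, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code, or the error reason if no response arrived.

    The opener parameter is an assumption of this sketch; it only exists
    so that the function can be tried out without network access.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as e:
        # The server answered with an error status, e.g. 404
        return e.code
    except urllib.error.URLError as u:
        # No response at all, e.g. unknown host
        return u.reason


def report(path):
    """Print one tab-separated line per link: counter, URL, result."""
    with open(path, encoding="utf-8") as filehandle:
        text = filehandle.read()
    for i, url in enumerate(extract_urls(text), start=1):
        print(i, url, check_url(url), sep="\t")
```

Calling `report("texlive-de-new.tex")` would then print one line per distinct link. One thing to keep in mind with this pattern: inside `\url{...}` the match stops at the closing brace, but the character classes also accept `.`, `,` and `)`, so a URL followed directly by punctuation in running text may pick up a trailing character and be reported as broken.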
Uwe Ziegenhagen likes LaTeX and Python, sometimes even combined.