2019-11-24, 21:20
The following Python script makes it easy to download all PDFs linked from a web page.
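import urllib.request
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

url = 'http://irgendeineurl.de'
r = requests.get(url)
data = r.text
# Name the parser explicitly instead of letting bs4 pick one
soup = BeautifulSoup(data, 'html.parser')
for link in soup.find_all('a'):
    href = link.get('href')
    # Skip anchors without an href attribute (link.get returns None)
    if href and href.endswith('.pdf'):
        # Resolve relative links against the page URL
        pdf_url = urljoin(url, href)
        # Save under the file's base name in the current directory
        urllib.request.urlretrieve(pdf_url, href.rsplit('/', 1)[-1])
        print(pdf_url)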
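If installing requests and BeautifulSoup is not an option, the link-collection step can also be sketched with the standard library alone. The class `PdfLinkParser`, the function `find_pdf_links` and the sample HTML are hypothetical names made up for this illustration; they are not part of the script above. This variant swaps BeautifulSoup for `html.parser.HTMLParser`, resolves relative links with `urljoin`, and compares only the path part of each URL, so a link such as `b.PDF?v=2` is still recognised.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PdfLinkParser(HTMLParser):
    """Collects absolute URLs of all <a href="..."> targets ending in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        href = dict(attrs).get('href')
        if not href:
            # Anchor without a usable href, e.g. <a name="top">
            return
        absolute = urljoin(self.base_url, href)
        # Look only at the path so a query string does not hide the suffix
        if urlparse(absolute).path.lower().endswith('.pdf'):
            self.pdf_urls.append(absolute)


def find_pdf_links(html, base_url):
    """Return the absolute URLs of all PDF links found in an HTML string."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.pdf_urls


# Hypothetical sample page, used only to show the behaviour
sample = (
    '<a href="docs/a.pdf">A</a> <a name="top">x</a> '
    '<a href="/b.PDF?v=2">B</a> <a href="c.html">C</a>'
)
print(find_pdf_links(sample, 'http://irgendeineurl.de/seite/'))
```

Each URL in the returned list could then be passed to `urllib.request.urlretrieve`, as in the script above. Keeping the link extraction in its own function means it can be tried on a saved HTML string without touching the network.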
Uwe Ziegenhagen likes LaTeX and Python, sometimes even combined.