Download documents from Scribd in pdf format
Scribd-dl uses Selenium and headless Chrome to take high-resolution screenshots of the document pages, then merges them into a single PDF file.
$ scribd-dl (https://www.)scribd.com/(doc|document|presentation)/(document_id)/* [-p PAGES] [-v]
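The URL shape in the usage line can be checked before calling the tool. The following is a minimal sketch, not scribd-dl's own validation: the pattern `SCRIBD_URL` and the helper `parse_scribd_url` are hypothetical names, and the document id is assumed to be numeric, as it is in the examples in this README.

```python
import re

# Hypothetical pattern derived from the usage line; not taken from scribd-dl's source.
# Assumption: the document id consists of digits only.
SCRIBD_URL = re.compile(
    r"^(?:https://www\.)?scribd\.com/"
    r"(?P<kind>doc|document|presentation)/"
    r"(?P<doc_id>\d+)(?:/.*)?$"
)


def parse_scribd_url(url):
    """Return (kind, document_id) for a matching URL, or None otherwise."""
    match = SCRIBD_URL.match(url)
    if match is None:
        return None
    return match.group("kind"), match.group("doc_id")


print(parse_scribd_url("https://www.scribd.com/document/90403141/Social-Media-Strategy"))
# ('document', '90403141')
print(parse_scribd_url("scribd.com/document/351688288"))
# ('document', '351688288')
print(parse_scribd_url("https://example.com/document/1"))
# None
```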
Examples
$ scribd-dl https://www.scribd.com/document/90403141/Social-Media-Strategy
$ scribd-dl scribd.com/document/351688288 scribd.com/document/90403141 -p 1-3
$ scribd-dl https://www.scribd.com/document/352366744 --pages 10-16
$ scribd-dl scribd.com/document/351688288 -p 20 --verbose
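The `-p` / `--pages` values above take the form `10-16` or `20`. As an illustration of how such a value could be interpreted, here is a small sketch. The helper `parse_pages` is hypothetical, and treating a single number as a one-page range is an assumption; scribd-dl itself may interpret it differently.

```python
def parse_pages(spec):
    """Turn '10-16' or '20' into an inclusive (first, last) tuple of page numbers."""
    first, sep, last = spec.partition("-")
    start = int(first)
    # Assumption: a bare number such as '20' selects only that page.
    end = int(last) if sep else start
    if start < 1 or end < start:
        raise ValueError("invalid page range: {!r}".format(spec))
    return start, end


print(parse_pages("1-3"))    # (1, 3)
print(parse_pages("10-16"))  # (10, 16)
print(parse_pages("20"))     # (20, 20)
```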
You can embed scribd-dl in your own code using a context manager, like this:
import scribd_dl

options = {
    'pages': '1-3',
    'log-level': '2'  # info
}

with scribd_dl.ScribdDL(options) as session:
    session.download([
        'https://www.scribd.com/document/352366744/',
        'https://www.scribd.com/document/351688288/'
    ])
Use a different page range for each document:
import scribd_dl

with scribd_dl.ScribdDL() as session:
    session.download('https://www.scribd.com/document/352366744/', pages='1-3')
    session.download('https://www.scribd.com/document/351688288/', pages='3-5')
    for title in session.doc_titles:
        print(title)
Clone it
$ git clone https://github.com/giannisterzopoulos/scribd-dl.git
$ cd scribd-dl
$ pip install .
or install from PyPI
$ pip install scribd-dl
Chromedriver is required for scribd-dl to work. See all available chromedriver downloads here.
Put the chromedriver executable in the assets folder or in a directory on your system PATH.
Tested to work with chromedriver v2.37 and Chrome v65.0.
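To confirm that chromedriver is discoverable in one of the two locations mentioned above, a quick check along these lines may help. This is only a sketch: `find_chromedriver` is a hypothetical helper, and the `assets` folder is assumed to be relative to the current working directory, which may not match where scribd-dl actually looks.

```python
import os
import shutil


def find_chromedriver(assets_dir="assets"):
    """Return a path to chromedriver from assets_dir or PATH, or None if absent."""
    # Windows builds ship as chromedriver.exe; other platforms use no extension.
    name = "chromedriver.exe" if os.name == "nt" else "chromedriver"
    candidate = os.path.join(assets_dir, name)
    if os.path.isfile(candidate):
        return os.path.abspath(candidate)
    # shutil.which searches the directories listed in the PATH environment variable.
    return shutil.which("chromedriver")


path = find_chromedriver()
print(path if path else "chromedriver not found in assets/ or on PATH")
```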
Scribd-dl supports Python 3.4-3.6