Amazon Kindle

Intro

Amazon doesn't provide any API to interact with highlights. Instead, it provides a website, https://read.amazon.co.uk/notebook, where you can see them. If you search around, you will find several tools that parse that page and export the data in different formats.

I tried a few of them and they didn't work for me, because I wasn't able to log into my account in order to parse the page. I suppose that since those packages were developed, Amazon has tightened its security considerably, and nowadays it takes more effort to get past it. All the solutions I found are built on a "mechanize" package (the Ruby version in particular, though I also tried my own implementation with python-requests), i.e. not a real browser.

The error message on the login page just adds confusion:

Enter a valid email or mobile number

even though I'm 100% sure that the credentials are correct.

After that I tried headless Chrome, but even when I emulate JS events such as keyup or mouse.click, Amazon somehow detects that something is wrong and asks me to complete a captcha. I didn't have the patience to explore ways to fool that protection, so I chose a semi-automatic approach:

  • I log in manually, solve the captcha if required, and leave the session open.
  • After that, a script can parse the page through CDP (the Chrome DevTools Protocol).

Compile

The script is written in Go. Set your $GOPATH to the root of the repository:

source ./activate

I use dep to manage dependencies. Install it globally and then run:

make install

To compile:

make build

To run:

make kindle

This command builds and starts the required container and then tries to parse the page. If authorisation is required, the script stops. In that case, open http://localhost:9222/ (for local usage), log in manually and re-run the script.
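
As an illustration of the "stop if authorisation is required" check, one simple heuristic is to look at the URL the browser ended up on. The sketch below is an assumption about how such a check could work, not a description of what the script does; the `/ap/signin` path and the `needsLogin` helper are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// needsLogin reports whether the browser landed on a sign-in page instead of
// the notebook. The "/ap/signin" path prefix is an assumption.
func needsLogin(current string) bool {
	u, err := url.Parse(current)
	if err != nil {
		return true // an unparsable URL is treated as "not on the notebook"
	}
	return strings.HasPrefix(u.Path, "/ap/signin")
}

func main() {
	pages := []string{
		"https://read.amazon.co.uk/notebook",
		"https://www.amazon.co.uk/ap/signin?example=1",
	}
	for _, page := range pages {
		if needsLogin(page) {
			fmt.Println(page, "-> log in manually at http://localhost:9222/ and re-run")
		} else {
			fmt.Println(page, "-> ready to parse")
		}
	}
}
```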

During parsing, the script tries to send the highlights it finds to the API_ENTRYPOINT specified in the docker-compose.yml file. You can override it on the command line like this:

docker-compose run -e API_ENTRYPOINT=https://my.custom.com/api/highlights/ kindle

TODO

  • Make the Kindle parser a bit more flexible (debug levels, environment variables).