Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enh: refactoring repo organization #121

Merged
merged 16 commits into from
Mar 9, 2021

Conversation

gabrielleberanger
Copy link
Contributor

Issue

This PR addresses Issue #70.

Description

The purpose of this PR is to implement the following repo organization:

- nck/
-- readers/
--- <source>/
---- reader.py
---- cli.py
---- config.py
---- helper.py

For now, I implemented it for adobe and adobe_2_0 readers.
Before I replicate this structure to other readers, could you please confirm that this structure seems relevant to you?
Thank you!

@bibimorlet
Copy link
Contributor

What is our final decision about the _client files or a standard client entrypoint like an api_client file?
For now Adobe, CM, SA360 have client handlers

@gabrielleberanger
Copy link
Contributor Author

What is our final decision about the _client files or a standard client entrypoint like an api_client file?
For now Adobe, CM, SA360 have client handlers

I believe that our target is still to have a unique client class.... I see that we have a generic APIClient class, which is actually not generic (?) as it is used on Yandex readers only.

In the meantime, should I add a client.py file under each nck/readers/<source>/ (for Adobe, CM, SA360 and Yandex)?

@bibimorlet
Copy link
Contributor

Yes meanwhile, I think we can add the client files. The purpose is in fact to create a generic client entrypoint like api_client that is only used by Yandex fr now. This was a feature developped by @benoitgoujon.

@gabrielleberanger
Copy link
Contributor Author

Yes meanwhile, I think we can add the client files. The purpose is in fact to create a generic client entrypoint like api_client that is only used by Yandex fr now. This was a feature developped by @benoitgoujon.

I see ! So I should probably keep api_client.py in a high-level directory (to show that it is generic and not linked to a specific reader).

@benoitgoujon
Copy link
Contributor

Thank you @gabrielleberanger, this is way cleaner.

Regarding the decision about API clients. Indeed I had started something more generic for the Yandex reader, but this may not be possible for each reader and this is a long term vision. But one intermediary step could be to group clients from the same together. It would make sense, especially for Google readers. What do you think?

@gabrielleberanger gabrielleberanger changed the title [WIP] Enh: refactoring repo organization Enh: refactoring repo organization Mar 4, 2021
@bibimorlet
Copy link
Contributor

Thank you @gabrielleberanger, this is way cleaner.

Regarding the decision about API clients. Indeed I had started something more generic for the Yandex reader, but this may not be possible for each reader and this is a long term vision. But one intermediary step could be to group clients from the same together. It would make sense, especially for Google readers. What do you think?

yes indeed, it won't be possible for every readers.
I suggest that this PR handles the file reorganization and we will group clients in another one

@benoitgoujon
Copy link
Contributor

Yes of course! Let's keep our PR short 😜

@gabrielleberanger
Copy link
Contributor Author

gabrielleberanger commented Mar 4, 2021

Thank you for your clarification @benoitgoujon ! I also had some doubts on the fact that we would be able to standardize all clients into a single one, so grouping similar ones seems more realistic as preliminary step. I will detail this on Issue #69 😊

For now, I implemented the following structure:

  • Clients shared by several sources appear under the nck/clients/ directory.
  • Source-level clients appear under nck/readers/<source>/client.py modules.

Here is a zoom on the newly implemented nck/clients/ directory:

nck/clients/
- adobe_analytics/
-- client.py # Former nck/clients/adobe_client.py. This client was not moved to `nck/readers/<source>/` as it is shared by the two Adobe Analytics readers.
- api/
-- client.py # Former nck/clients/api_client.py
-- helper.py # Former nck/helpers/api_client_helper.py
- google/
-- client.py # Former nck/helpers/google_base.py

Does it make sense to you?
Also, with the new organization, I could completely remove our former nck/helpers/ directory.

Next steps

As this PR implements a lot of changes, I would like to run test commands for each source/destination to make sure that it will not break anything in production when it will be released. Using "SG" credentials, I can test:

  • Readers: Adobe Analytics 1.4, Adobe Analytics 2.0, Confluence, Facebook, Google Ads, Google Analytics, DBM, DCM, SA360, Search Console, The Trade Desk and Twitter
  • Writers: Google Cloud Storage

@tom-grivaud, if you are using sources/destinations that are not in this list, could you please provide me sample commands, so that I can make sure that everything goes well for "Si" also? Thank you! 😊

It will probably help you "trust" my changes more, so that you won't have to review the 180 changed files one by one 😛

@benoitgoujon
Copy link
Contributor

Yes, would make perfect sense!

Yes, we need to be careful on that before merging 👍

@gabrielleberanger
Copy link
Contributor Author

I launched test commands on all the sources/destinations @tom-grivaud and I are using, and it works fine 👌
So I will adapt the documentation, and we'll be good for merging.

README.md Outdated
@@ -1,90 +1,65 @@
# Nautilus Connectors Kit

**NCK is a Command-Line Interface (CLI), allowing you to easily request, stream and store raw reports, from the API source to the destination of your choice.**
**NCK is an ET(L) tool specialized in API data ingestion. It is accessible through a Command-Line Interface. The application allows you to easily extract, stream and load data (with minimum transformations), from the API source to the destination of your choice.**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really the "L" that is optional?

Suggested change
**NCK is an ET(L) tool specialized in API data ingestion. It is accessible through a Command-Line Interface. The application allows you to easily extract, stream and load data (with minimum transformations), from the API source to the destination of your choice.**
**NCK is an E(T)L tool specialized in API data ingestion. It is accessible through a Command-Line Interface. The application allows you to easily extract, stream and load data (with minimum transformations), from the API source to the destination of your choice.**

Copy link
Contributor Author

@gabrielleberanger gabrielleberanger Mar 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, thank you for spotting this!

@@ -181,21 +181,44 @@ How to develop a new reader

To create a new reader, you should:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the purpose of this PR but the "Getting Started" is oriented towards contributions and not towards our end users. I'm wondering if it is a good thing. Maybe we should put this section in a "Contributing" section and create a real "Getting started" guide with info on installation and 1 or 2 simple commands.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I'll create an issue for this matter.

@gabrielleberanger
Copy link
Contributor Author

I am good for merging, if you're fine with the changes 😊

Just a precision: in the end, I chose to keep all clients in the nck/clients folder (I found it disturbing to have some clients in the nck/clients folder, and others in the nck/readers/<source>/client.py modules).

@gabrielleberanger gabrielleberanger merged commit 13ce44d into dev Mar 9, 2021
@gabrielleberanger gabrielleberanger deleted the enh/refactoring_architecture branch March 9, 2021 08:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants