feat: gotocompany migration (#1)
lavkesh authored Mar 14, 2023
1 parent e4e5977 commit 50da101
Showing 379 changed files with 1,880 additions and 1,927 deletions.
8 changes: 0 additions & 8 deletions .github/ISSUE_TEMPLATE/config.yml

This file was deleted.

6 changes: 3 additions & 3 deletions .github/workflows/docs.yml
@@ -22,12 +22,12 @@ jobs:
        run: cd docs && yarn build
      - name: Deploy docs
        env:
-         GIT_USER: ravisuhag
+         GIT_USER: anshuman-gojek
          GIT_PASS: ${{ secrets.DOCU_RS_TOKEN }}
          DEPLOYMENT_BRANCH: gh-pages
          CURRENT_BRANCH: master
        working-directory: docs
        run: |
-         git config --global user.email "suhag.ravi@gmail.com"
-         git config --global user.name "ravisuhag"
+         git config --global user.email "anshuman.srivastava@gojek.com"
+         git config --global user.name "anshuman-gojek"
          yarn deploy
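The deploy job assumes the Docusaurus site under `docs/` still builds after the rename; a quick local check under that assumption (`yarn build` is the step CI runs, `yarn install` is the usual prerequisite):

```bash
# Mirror the CI build step locally before pushing workflow changes.
cd docs
yarn install
yarn build
```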
4 changes: 2 additions & 2 deletions .github/workflows/package.yml
@@ -45,5 +45,5 @@ jobs:
          context: .
          push: true
          tags: |
-           odpf/firehose:latest
-           odpf/firehose:${{ steps.get_version.outputs.version-without-v }}
+           gotocompany/firehose:latest
+           gotocompany/firehose:${{ steps.get_version.outputs.version-without-v }}
2 changes: 1 addition & 1 deletion Dockerfile
@@ -10,4 +10,4 @@ COPY --from=GRADLE_BUILD ./jolokia-jvm-agent.jar /opt/firehose
COPY --from=GRADLE_BUILD ./src/main/resources/log4j.xml /opt/firehose/etc/log4j.xml
COPY --from=GRADLE_BUILD ./src/main/resources/logback.xml /opt/firehose/etc/logback.xml
WORKDIR /opt/firehose
- CMD ["java", "-cp", "bin/*:/work-dir/*", "io.odpf.firehose.launch.Main", "-server", "-Dlogback.configurationFile=etc/firehose/logback.xml", "-Xloggc:/var/log/firehose"]
+ CMD ["java", "-cp", "bin/*:/work-dir/*", "com.gotocompany.firehose.launch.Main", "-server", "-Dlogback.configurationFile=etc/firehose/logback.xml", "-Xloggc:/var/log/firehose"]
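A quick way to verify the relocated entrypoint after this change (the `firehose-local` tag is illustrative, not part of this commit):

```bash
# Build the image and confirm CMD now points at the com.gotocompany main class.
docker build -t firehose-local .
docker inspect --format '{{.Config.Cmd}}' firehose-local
```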
20 changes: 10 additions & 10 deletions README.md
@@ -1,9 +1,9 @@
# Firehose

- ![build workflow](https://github.com/odpf/firehose/actions/workflows/build.yml/badge.svg)
- ![package workflow](https://github.com/odpf/firehose/actions/workflows/package.yml/badge.svg)
+ ![build workflow](https://github.com/goto/firehose/actions/workflows/build.yml/badge.svg)
+ ![package workflow](https://github.com/goto/firehose/actions/workflows/package.yml/badge.svg)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg?logo=apache)](LICENSE)
- [![Version](https://img.shields.io/github/v/release/odpf/firehose?logo=semantic-release)](Version)
+ [![Version](https://img.shields.io/github/v/release/goto/firehose?logo=semantic-release)](Version)

Firehose is a cloud native service for delivering real-time streaming data to destinations such as service endpoints (HTTP or GRPC) & managed databases (Postgres, InfluxDB, Redis, Elasticsearch, Prometheus and MongoDB). With Firehose, you don't need to write applications or manage resources. It can be scaled up to match the throughput of your data. If your data is present in Kafka, Firehose delivers it to the destination(SINK) that you specified.

@@ -47,28 +47,28 @@ Explore the following resources to get started with Firehose:

## Run with Docker

Use the docker hub to download firehose [docker image](https://hub.docker.com/r/odpf/firehose/). You need to have docker installed in your system.
Use the docker hub to download firehose [docker image](https://hub.docker.com/r/gotocompany/firehose/). You need to have docker installed in your system.

```
# Download docker image from docker hub
$ docker pull odpf/firehose
$ docker pull gotocompany/firehose
# Run the following docker command for a simple log sink.
$ docker run -e SOURCE_KAFKA_BROKERS=127.0.0.1:6667 -e SOURCE_KAFKA_CONSUMER_GROUP_ID=kafka-consumer-group-id -e SOURCE_KAFKA_TOPIC=sample-topic -e SINK_TYPE=log -e SOURCE_KAFKA_CONSUMER_CONFIG_AUTO_OFFSET_RESET=latest -e INPUT_SCHEMA_PROTO_CLASS=com.github.firehose.sampleLogProto.SampleLogMessage -e SCHEMA_REGISTRY_STENCIL_ENABLE=true -e SCHEMA_REGISTRY_STENCIL_URLS=http://localhost:9000/artifactory/proto-descriptors/latest odpf/firehose:latest
$ docker run -e SOURCE_KAFKA_BROKERS=127.0.0.1:6667 -e SOURCE_KAFKA_CONSUMER_GROUP_ID=kafka-consumer-group-id -e SOURCE_KAFKA_TOPIC=sample-topic -e SINK_TYPE=log -e SOURCE_KAFKA_CONSUMER_CONFIG_AUTO_OFFSET_RESET=latest -e INPUT_SCHEMA_PROTO_CLASS=com.github.firehose.sampleLogProto.SampleLogMessage -e SCHEMA_REGISTRY_STENCIL_ENABLE=true -e SCHEMA_REGISTRY_STENCIL_URLS=http://localhost:9000/artifactory/proto-descriptors/latest/gotocompany/firehose:latest
```

**Note:** Make sure your protos (.jar file) are located in `work-dir`, this is required for Filter functionality to work.
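A sketch of that setup (the host path `./protos` is illustrative; the env values are taken from the docker example above):

```bash
# Mount a local directory holding the proto .jar at /work-dir; the image's CMD
# already includes /work-dir/* on the classpath, so the Filter can load it.
docker run \
  -v "$(pwd)/protos:/work-dir" \
  -e SOURCE_KAFKA_BROKERS=127.0.0.1:6667 \
  -e SOURCE_KAFKA_CONSUMER_GROUP_ID=kafka-consumer-group-id \
  -e SOURCE_KAFKA_TOPIC=sample-topic \
  -e SINK_TYPE=log \
  gotocompany/firehose:latest
```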

## Run with Kubernetes

- - Create a firehose deployment using the helm chart available [here](https://github.com/odpf/charts/tree/main/stable/firehose)
+ - Create a firehose deployment using the helm chart available [here](https://github.com/goto/charts/tree/main/stable/firehose)
- Deployment also includes telegraf container which pushes stats metrics

## Running locally

```sh
# Clone the repo
- $ git clone https://github.com/odpf/firehose.git
+ $ git clone https://github.com/goto/firehose.git

# Build the jar
$ ./gradlew clean build
@@ -101,11 +101,11 @@ Development of Firehose happens in the open on GitHub, and we are grateful to th

Read our [contributing guide](docs/docs/contribute/contribution.md) to learn about our development process, how to propose bugfixes and improvements, and how to build and test your changes to Firehose.

- To help you get your feet wet and get you familiar with our contribution process, we have a list of [good first issues](https://github.com/odpf/firehose/labels/good%20first%20issue) that contain bugs which have a relatively limited scope. This is a great place to get started.
+ To help you get your feet wet and get you familiar with our contribution process, we have a list of [good first issues](https://github.com/goto/firehose/labels/good%20first%20issue) that contain bugs which have a relatively limited scope. This is a great place to get started.

## Credits

- This project exists thanks to all the [contributors](https://github.com/odpf/firehose/graphs/contributors).
+ This project exists thanks to all the [contributors](https://github.com/goto/firehose/graphs/contributors).

## License

16 changes: 8 additions & 8 deletions build.gradle
@@ -32,8 +32,8 @@ lombok {
    sha256 = ""
}

- group 'io.odpf'
- version '0.7.4'
+ group 'com.gotocompany'
+ version '0.8.0'

def projName = "firehose"

@@ -54,7 +54,7 @@ private Properties loadEnv() {
    properties
}

- def mainClassName = "io.odpf.firehose.launch.Main"
+ def mainClassName = "com.gotocompany.firehose.launch.Main"

dependencies {
implementation group: 'com.google.protobuf', name: 'protobuf-java', version: '3.1.0'
@@ -71,7 +71,7 @@ dependencies {
    implementation group: 'org.apache.commons', name: 'commons-jexl', version: '2.1'
    implementation group: 'org.apache.commons', name: 'commons-lang3', version: '3.5'
    implementation group: 'com.google.code.gson', name: 'gson', version: '2.7'
-   implementation group: 'io.odpf', name: 'stencil', version: '0.2.1' exclude group: 'org.slf4j'
+   implementation group: 'com.gotocompany', name: 'stencil', version: '0.4.0' exclude group: 'org.slf4j'
implementation group: 'software.amazon.awssdk', name: 's3', version: '2.17.129'
implementation group: 'org.influxdb', name: 'influxdb-java', version: '2.5'
implementation group: 'com.jayway.jsonpath', name: 'json-path', version: '2.4.0'
@@ -101,7 +101,7 @@ dependencies {
    implementation 'com.google.cloud:google-cloud-storage:1.114.0'
    implementation 'com.google.cloud:google-cloud-bigquery:1.115.0'
    implementation 'org.apache.logging.log4j:log4j-core:2.17.1'
-   implementation group: 'io.odpf', name: 'depot', version: '0.3.8'
+   implementation group: 'com.gotocompany', name: 'depot', version: '0.4.0'
    implementation group: 'com.networknt', name: 'json-schema-validator', version: '1.0.59' exclude group: 'org.slf4j'

testImplementation group: 'junit', name: 'junit', version: '4.11'
@@ -146,7 +146,7 @@ test {
        events "passed", "skipped", "failed"
    }
    useJUnit {
-       excludeCategories 'io.odpf.firehose.test.categories.IntegrationTest'
+       excludeCategories 'com.gotocompany.firehose.test.categories.IntegrationTest'
    }
    doLast {
        delete "$projectDir/src/test/resources/__files"
@@ -158,7 +158,7 @@ clean {
}
jar {
    manifest {
-       attributes 'Main-Class': 'io.odpf.firehose.launch.Main'
+       attributes 'Main-Class': 'com.gotocompany.firehose.launch.Main'
        duplicatesStrategy = 'exclude'
        zip64 = true
    }
@@ -181,7 +181,7 @@ publishing {
    repositories {
        maven {
            name = "GitHubPackages"
-           url = "https://maven.pkg.github.com/odpf/firehose"
+           url = "https://maven.pkg.github.com/goto/firehose"
            credentials {
                username = System.getenv("GITHUB_ACTOR")
                password = System.getenv("GITHUB_TOKEN")
2 changes: 1 addition & 1 deletion docs/docs/concepts/architecture.md
@@ -40,7 +40,7 @@ _**Sink**_
- All the existing sink types follow the same contract/lifecycle defined in `AbstractSink.java`. It consists of two stages:
  - Prepare: Transformation over-filtered messages’ list to prepare the sink-specific insert/update client requests.
  - Execute: Requests created in the Prepare stage are executed at this step and a list of failed messages is returned \(if any\) for retry.
- - Underlying implementation of AbstractSink can use implementation present in [depot](https://github.com/odpf/depot).
+ - Underlying implementation of AbstractSink can use implementation present in [depot](https://github.com/goto/depot).
- If the batch has any failures, Firehose will retry to push the failed messages to the sink

_**SinkPool**_
6 changes: 3 additions & 3 deletions docs/docs/concepts/monitoring.md
@@ -71,11 +71,11 @@ Lastly, set up Telegraf to send metrics to InfluxDB, following the corresponding

#### Firehose deployed on Kubernetes _\*\*_

- 1. Follow[ this guide](https://github.com/odpf/charts/tree/main/stable/firehose#readme) for deploying Firehose on a Kubernetes cluster using a Helm chart.
- 2. Configure the following parameters in the default [values.yaml](https://github.com/odpf/charts/blob/main/stable/firehose/values.yaml) file and run -
+ 1. Follow[ this guide](https://github.com/goto/charts/tree/main/stable/firehose#readme) for deploying Firehose on a Kubernetes cluster using a Helm chart.
+ 2. Configure the following parameters in the default [values.yaml](https://github.com/goto/charts/blob/main/stable/firehose/values.yaml) file and run -

```text
- $ helm install my-release -f values.yaml odpf/firehose
+ $ helm install my-release -f values.yaml goto/firehose
```

| Key | Type | Default | Description |
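If the `goto` chart repository is not yet registered locally, the install would be preceded by something like this (the repository URL is an assumption, not confirmed by this diff):

```bash
# Register the chart repo, then install with the customized values file.
helm repo add goto https://goto.github.io/charts
helm repo update
helm install my-release -f values.yaml goto/firehose
```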
2 changes: 1 addition & 1 deletion docs/docs/concepts/overview.md
@@ -10,7 +10,7 @@ scale. This section explains the overall architecture of Firehose and describes

## [Monitoring Firehose with exposed metrics](monitoring.md)

Always know what’s going on with your deployment with
- built-in [monitoring](https://github.com/odpf/firehose/blob/main/docs/assets/firehose-grafana-dashboard.json) of
+ built-in [monitoring](https://github.com/goto/firehose/blob/main/docs/assets/firehose-grafana-dashboard.json) of
throughput, response times, errors and more. This section contains guides, best practices and advises related to
managing Firehose in production.

10 changes: 5 additions & 5 deletions docs/docs/contribute/contribution.md
@@ -3,8 +3,8 @@
The following is a set of guidelines for contributing to Firehose. These are mostly guidelines, not rules. Use your best judgment, and feel free to propose changes to this document in a pull request. Here are some important resources:

- The [Concepts](../guides/create_firehose.md) section will explain to you about Firehose architecture,
- - Our [roadmap](https://github.com/odpf/firehose/blob/main/docs/roadmap.md) is the 10k foot view of where we're going, and
- - Github [issues](https://github.com/odpf/firehose/issues) track the ongoing and reported issues.
+ - Our [roadmap](https://github.com/goto/firehose/blob/main/docs/roadmap.md) is the 10k foot view of where we're going, and
+ - Github [issues](https://github.com/goto/firehose/issues) track the ongoing and reported issues.

Development of Firehose happens in the open on GitHub, and we are grateful to the community for contributing bug fixes and improvements. Read below to learn how you can take part in improving Firehose.

@@ -23,14 +23,14 @@ The following parts are open for contribution:
- Provide suggestions to make the user experience better
- Provide suggestions to Improve the documentation

- To help you get your feet wet and get you familiar with our contribution process, we have a list of [good first issues](https://github.com/odpf/firehose/labels/good%20first%20issue) that contain bugs that have a relatively limited scope. This is a great place to get started.
+ To help you get your feet wet and get you familiar with our contribution process, we have a list of [good first issues](https://github.com/goto/firehose/labels/good%20first%20issue) that contain bugs that have a relatively limited scope. This is a great place to get started.

## How can I contribute?

We use RFCs and GitHub issues to communicate ideas.

- You can report a bug or suggest a feature enhancement or can just ask questions. Reach out on Github discussions for this purpose.
- - You are also welcome to add a new common sink in [depot](https://github.com/odpf/depot), improve monitoring and logging and improve code quality.
+ - You are also welcome to add a new common sink in [depot](https://github.com/goto/depot), improve monitoring and logging and improve code quality.
- You can help with documenting new features or improve existing documentation.
- You can also review and accept other contributions if you are a maintainer.

@@ -53,4 +53,4 @@ Please follow these practices for your change to get merged fast and smoothly:
- If you are introducing a completely new feature or making any major changes to an existing one, we recommend starting with an RFC and get consensus on the basic design first.
- Make sure your local build is running with all the tests and checkstyle passing.
- If your change is related to user-facing protocols/configurations, you need to make the corresponding change in the documentation as well.
- - Docs live in the code repo under [`docs`](https://github.com/odpf/firehose/tree/7d0df99962507e6ad2147837c4536f36d52d5a48/docs/docs/README.md) so that changes to that can be done in the same PR as changes to the code.
+ - Docs live in the code repo under [`docs`](https://github.com/goto/firehose/tree/main/docs/docs/README.md) so that changes to that can be done in the same PR as changes to the code.
8 changes: 4 additions & 4 deletions docs/docs/contribute/development.md
@@ -39,7 +39,7 @@ Configuration parameter variables of each sink can be found in the [Configuratio

When `INPUT_SCHEMA_DATA_TYPE is set to protobuf`, firehose uses Stencil Server as its Schema Registry for hosting Protobuf descriptors. The environment variable `SCHEMA_REGISTRY_STENCIL_ENABLE` must be set to `true` . Stencil server URL must be specified in the variable `SCHEMA_REGISTRY_STENCIL_URLS` . The Proto Descriptor Set file of the Kafka messages must be uploaded to the Stencil server.

- Refer [this guide](https://github.com/odpf/stencil/tree/master/server#readme) on how to set up and configure the Stencil server, and how to generate and upload Proto descriptor set file to the server.
+ Refer [this guide](https://github.com/goto/stencil/tree/master/server#readme) on how to set up and configure the Stencil server, and how to generate and upload Proto descriptor set file to the server.

### Monitoring

@@ -56,7 +56,7 @@ Firehose sends critical metrics via StatsD client. Refer the[ Monitoring](../con

```bash
# Clone the repo
- $ git clone https://github.com/odpf/firehose.git
+ $ git clone https://github.com/goto/firehose.git

# Build the jar
$ ./gradlew clean build
@@ -72,7 +72,7 @@ Set the generic variables in the local.properties file.
KAFKA_RECORD_PARSER_MODE = message
SINK_TYPE = log
INPUT_SCHEMA_DATA_TYPE=protobuf
- INPUT_SCHEMA_PROTO_CLASS = io.odpf.firehose.consumer.TestMessage
+ INPUT_SCHEMA_PROTO_CLASS = com.gotocompany.firehose.consumer.TestMessage
```
Set the variables which specify the kafka server, topic name, and group-id of the kafka consumer - the standard values are used here.
```text
@@ -82,7 +82,7 @@ SOURCE_KAFKA_CONSUMER_GROUP_ID = sample-group-id
```
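With those properties in place, a local consumer can be started through Gradle (a sketch; `runConsumer` is the task Firehose's build defines for this, so double-check it in your checkout):

```bash
# Run the consumer with the environment configured in local.properties.
./gradlew runConsumer
```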

### Stencil Workaround
- Firehose uses [Stencil](https://github.com/odpf/stencil) as the schema-registry which enables dynamic proto schemas. For the sake of this
+ Firehose uses [Stencil](https://github.com/goto/stencil) as the schema-registry which enables dynamic proto schemas. For the sake of this
quick-setup guide, we can work our way around Stencil setup by setting up a simple local HTTP server which can provide the static descriptor for TestMessage schema.
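A minimal sketch of that workaround (file names and port are illustrative; the docs' full walkthrough is collapsed in this diff):

```bash
# Produce a self-contained descriptor set for TestMessage and serve it over HTTP.
protoc --descriptor_set_out=TestMessage.desc --include_imports TestMessage.proto
python3 -m http.server 8000
# Point the consumer at it:
#   SCHEMA_REGISTRY_STENCIL_ENABLE = true
#   SCHEMA_REGISTRY_STENCIL_URLS = http://localhost:8000/TestMessage.desc
```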


10 changes: 5 additions & 5 deletions docs/docs/guides/create_firehose.md
@@ -21,9 +21,9 @@ INPUT_SCHEMA_PROTO_CLASS=com.tests.TestMessage

Sample output of a Firehose log sink:

```text
- 2021-03-29T08:43:05,998Z [pool-2-thread-1] INFO i.o.firehose.Consumer- Execution successful for 1 records
- 2021-03-29T08:43:06,246Z [pool-2-thread-1] INFO i.o.firehose.Consumer - Pulled 1 messages
- 2021-03-29T08:43:06,246Z [pool-2-thread-1] INFO io.odpf.firehose.sink.log.LogSink -
+ 2021-03-29T08:43:05,998Z [pool-2-thread-1] INFO com.gotocompany.firehose.Consumer- Execution successful for 1 records
+ 2021-03-29T08:43:06,246Z [pool-2-thread-1] INFO com.gotocompany.firehose.Consumer - Pulled 1 messages
+ 2021-03-29T08:43:06,246Z [pool-2-thread-1] INFO com.gotocompany.firehose.sink.log.LogSink -
================= DATA =======================
sample_field: 81179979
sample_field_2: 9897987987
@@ -133,11 +133,11 @@ _**Note:**_ [_**DATABASE**_](../sinks/influxdb-sink.md#sink_influx_db_name) _**a
- For INPUT_SCHEMA_DATA_TYPE = protobuf, this sink will generate bigquery schema from protobuf message schema and update bigquery table with the latest generated schema.
- The protobuf message of a `google.protobuf.Timestamp` field might be needed when table partitioning is enabled.
- For INPUT_SCHEMA_DATA_TYPE = json, this sink will generate bigquery schema by infering incoming json. In future we will add support for json schema as well coming from stencil.
- - The timestamp column is needed incase of partition table. It can be generated at the time of ingestion by setting the config. Please refer to config `SINK_BIGQUERY_ADD_EVENT_TIMESTAMP_ENABLE` in [depot bigquery sink config section](https://github.com/odpf/depot/blob/main/docs/reference/configuration/bigquery-sink.md#sink_bigquery_add_event_timestamp_enable)
+ - The timestamp column is needed incase of partition table. It can be generated at the time of ingestion by setting the config. Please refer to config `SINK_BIGQUERY_ADD_EVENT_TIMESTAMP_ENABLE` in [depot bigquery sink config section](https://github.com/goto/depot/blob/main/docs/reference/configuration/bigquery-sink.md#sink_bigquery_add_event_timestamp_enable)
- Google cloud credential with some bigquery permission is required to run this sink.
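An illustrative environment for the partitioned-table case above (only `SINK_BIGQUERY_ADD_EVENT_TIMESTAMP_ENABLE` is named by this doc; check the linked depot reference for the full set of keys):

```bash
# Hypothetical minimal config for a BigQuery sink with an ingestion-time timestamp column.
SINK_TYPE=bigquery
SINK_BIGQUERY_ADD_EVENT_TIMESTAMP_ENABLE=true
```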

## Create a Bigtable sink

- - it requires the following environment [variables](https://github.com/odpf/depot/blob/main/docs/reference/configuration/bigtable.md) ,which are required by ODPF Depot library, to be set along with the generic firehose variables.
+ - it requires the following environment [variables](https://github.com/goto/depot/blob/main/docs/reference/configuration/bigtable.md) ,which are required by Depot library, to be set along with the generic firehose variables.

If you'd like to connect to a sink which is not yet supported, you can create a new sink by following the [contribution guidelines](../contribute/contribution.md)