This repository was archived by the owner on Sep 3, 2022. It is now read-only.

Updated README, project metadata, and consistent file headers
nikhilk committed Sep 19, 2015
1 parent 70c73a3 commit 494be12
Showing 129 changed files with 941 additions and 923 deletions.
2 changes: 1 addition & 1 deletion AUTHORS
@@ -1,4 +1,4 @@
# This is the official list of Data Studio authors for copyright purposes.
# This is the official list of Google Cloud DataLab authors for copyright purposes.
# This file is distinct from the CONTRIBUTORS files.
# See the latter for an explanation.

14 changes: 7 additions & 7 deletions CONTRIBUTORS
@@ -1,13 +1,13 @@
# People who have agreed to one of the CLAs and can contribute patches.
# The AUTHORS file lists the copyright holders; this file
# lists people.
# The AUTHORS file lists the copyright holders; this file lists individuals.
#
# https://developers.google.com/open-source/cla/individual
# https://developers.google.com/open-source/cla/corporate
#

Dinesh Kulkarni <[email protected]>
Drew Bryant <[email protected]>
Graham Wheeler <[email protected]>
Nikhil Kothari <[email protected]>

Google Cloud DataLab Team
- Nikhil Kothari <[email protected]>
- Drew Bryant <[email protected]>
- Dinesh Kulkarni <[email protected]>
- Graham Wheeler <[email protected]>
- Bradley Jiang <[email protected]>
95 changes: 86 additions & 9 deletions README.md
@@ -1,14 +1,91 @@
# Google Cloud DataLab

Google Cloud DataLab brings interactive tools for big data scenarios on Google
Cloud Platform.
[Google Cloud DataLab](https://cloud.google.com/datalab) provides a productive, interactive, and
integrated tool to explore, visualize, analyze and transform data, bringing together the power of
python, SQL, and the [Google Cloud Platform](https://cloud.google.com) with services such as
[BigQuery](https://cloud.google.com/bigquery) and [Storage](https://cloud.google.com/storage)
to extract insights and harness the value of your data.

DataLab enables developers and data scientists to easily and efficiently
explore, transform, analyze and visualize their data and develop, test and
deploy data processing pipelines to run on the cloud. It caters to scenarios
ranging from ad-hoc and exploratory data analysis to development.
DataLab builds on the interactive notebooks, and the foundation of [Jupyter](http://jupyter.org)
(formerly IPython) to enable developers, data scientists and data analysts to easily work with
their data in exploratory scenarios and extends that metaphor to developing and deploying
data processing pipelines.

In its current form, DataLab enables using services such as Google BigQuery,
and using the combination of SQL, python (and libraries such as pandas and
matplotlib).
DataLab deeply integrates into Google Cloud Platform. It provides a secure environment for all the
members of your cloud project to effortlessly access all data and resources available to the
project, and manage and share notebooks within the project using the associated git repository.

You can see an example of the notebooks by browsing through the
[samples and documentation](https://github.com/GoogleCloudPlatform/datalab/tree/master/content/datalab/notebooks),
which are themselves written in the form of notebooks.


## Getting Started

DataLab is packaged as a docker container, and contains the DataLab experience, along with
Jupyter/IPython, and a variety of python libraries such as numpy, pandas, scikit-learn and
matplotlib, in a ready-to-use form.

The simplest way to start using DataLab is on Google Cloud Platform. Head over to the
[Google Cloud DataLab](https://datalab.cloud.google.com) site to deploy your own instance.

You can also run the docker container locally, as described in the
[wiki](https://github.com/googlecloudplatform/datalab/wiki/Getting-Started).


## Contacting Us

Please submit questions on using DataLab at
[StackOverflow](http://stackoverflow.com/questions/tagged/google-cloud-datalab) using the tag
`google-cloud-datalab`.

For any product issues, you can either submit issues here, or you can submit feedback using the
feedback link within the product.


## Developing DataLab

### Contributing

Contributions are welcome! Please see our [roadmap](https://github.com/GoogleCloudPlatform/datalab/wiki/Roadmap)
page. Please check the page on [contributing](https://github.com/GoogleCloudPlatform/datalab/wiki/Contributing)
for more details.

You can always contribute even without code submissions by submitting issues and suggestions to
help improve DataLab and building and sharing samples and being a member of the community.

### Building and Running

The [wiki](https://github.com/googlecloudplatform/datalab/wiki/Development-Environment) describes
the process of setting up a local development environment, as well as the steps to build and run,
and the developer workflow.

### Navigating the Repository

This is a quick description of the repository structure to help understand and
discover the relevant pieces.

All source code corresponding to product functionality that is built exists
within `/sources`. The following is a list of the individual components:

* `/sources/lib` - set of python libraries used to implement APIs to access Google
Cloud Platform services, and implement the DataLab interactive experience.
- api: Google Cloud Platform APIs (currently: BigQuery and Cloud Storage).
- datalab: interactive notebook experience to plug into Jupyter and IPython.

* `/sources/web` - the DataLab web server. This is implemented in node.js and
serves the DataLab front-end experience - both content and APIs, as well as backend
infrastructure such as notebook source control.
Some of the requests are proxied to the Jupyter notebook server, which manages notebooks and
associated kernel sessions.

* `/sources/tools` - miscellaneous other supporting tools.

Source code builds into the /build directory, and the generated build outputs are
consumed when building the DataLab docker container.

The build outputs are packaged in the form of a docker container.

* `/containers/datalab` - the only container for now. This is the container that is used as the
DataLab AppEngine module.

46 changes: 0 additions & 46 deletions REPO.md

This file was deleted.

2 changes: 2 additions & 0 deletions containers/README.md
@@ -0,0 +1,2 @@
This directory contains the definitions for the docker containers associated with the
Google Cloud DataLab product.
2 changes: 1 addition & 1 deletion containers/datalab/build.sh
@@ -1,5 +1,5 @@
#!/bin/sh
# Copyright 2014 Google Inc. All rights reserved.
# Copyright 2015 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
4 changes: 2 additions & 2 deletions containers/datalab/config/ipython.py
@@ -1,4 +1,4 @@
# Copyright 2014 Google Inc. All rights reserved.
# Copyright 2015 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.

"""Customized IPython configuration for Google Cloud DataLab."""
"""IPython configuration for Google Cloud DataLab."""

c = get_config()

6 changes: 3 additions & 3 deletions containers/datalab/content/run-cloud.sh
@@ -1,5 +1,5 @@
#!/bin/sh
# Copyright 2014 Google Inc. All rights reserved.
# Copyright 2015 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -13,8 +13,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.

# This script serves as the entrypoint for locally running the DataLab
# docker container in a VM on the cloud.
# Entrypoint script for running the container as an AppEngine module
# within Google Cloud Platform.

export DATALAB_ENV=cloud

5 changes: 2 additions & 3 deletions containers/datalab/content/run-local.sh
@@ -1,5 +1,5 @@
#!/bin/sh
# Copyright 2014 Google Inc. All rights reserved.
# Copyright 2015 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -13,8 +13,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.

# This script serves as the entrypoint for locally running the DataLab
# docker container, i.e. outside a VM on the cloud.
# Runs the docker container locally.

export DATALAB_ENV=local
export METADATA_HOST=localhost
4 changes: 3 additions & 1 deletion containers/datalab/content/setup-env.sh
@@ -1,5 +1,5 @@
#!/bin/sh
# Copyright 2014 Google Inc. All rights reserved.
# Copyright 2015 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -13,6 +13,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.

# Sets up various environment variables within the docker container.

export DATALAB_USER=`gcloud -q config list --format yaml | grep account | awk -F" " '{print $2}'`
export DATALAB_PROJECT_ID=`gcloud -q config list --format yaml | grep project | awk -F" " '{print $2}'`
@@ -22,3 +23,4 @@ fi
if [ -z $DATALAB_INSTANCE_NAME ]; then
export DATALAB_INSTANCE_NAME=$GAE_MODULE_VERSION
fi
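
The two `export` lines in this hunk pull the account and project out of the YAML printed by `gcloud config list`, and the `if` block falls back to `GAE_MODULE_VERSION` for the instance name. A minimal sketch of the same extraction and fallback, written so it can be exercised without gcloud, might look like the following. `yaml_value` and `sample_config` are hypothetical names, and the sample text is made up for illustration rather than copied from real gcloud output.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first "key: value" line on stdin.
yaml_value() {
  # $1 is the key to look for; indentation before the key is allowed.
  awk -v key="$1" '
    {
      line = $0
      sub(/^[ \t]+/, "", line)            # drop indentation
      prefix = key ":"
      if (index(line, prefix) == 1) {     # line starts with "key:"
        value = substr(line, length(prefix) + 1)
        sub(/^[ \t]+/, "", value)         # drop the space after the colon
        print value
        exit
      }
    }'
}

# Made-up sample standing in for the YAML that the script greps (assumption).
sample_config='core:
  account: someone@example.com
  project: my-project'

DATALAB_USER=$(printf '%s\n' "$sample_config" | yaml_value account)
DATALAB_PROJECT_ID=$(printf '%s\n' "$sample_config" | yaml_value project)

# Use GAE_MODULE_VERSION only when no instance name is set, mirroring the
# `if [ -z ... ]` block in the script.
: "${DATALAB_INSTANCE_NAME:=${GAE_MODULE_VERSION:-}}"
export DATALAB_USER DATALAB_PROJECT_ID DATALAB_INSTANCE_NAME
```

Matching on an exact `key:` prefix avoids picking up unrelated lines that merely contain the word, which a plain `grep account` could do.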

7 changes: 4 additions & 3 deletions containers/datalab/content/setup-repo.sh
@@ -1,5 +1,5 @@
#!/bin/sh
# Copyright 2014 Google Inc. All rights reserved.
# Copyright 2015 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -13,8 +13,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.

# This script sets up cloud repository, including creating master branch,
# datalab branch, and named instance branch if they do not exist.
# Sets up the git repository and workspace used within the container. This
# also creates the branches (master, datalab, and datalab_instance) as needed.

git config --global user.email $DATALAB_USER
git config --global credential.helper gcloud.sh
@@ -61,3 +61,4 @@ create_branch ( ) {
create_branch "master"
create_branch "datalab"
create_branch "datalab_$DATALAB_INSTANCE_NAME"
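
The three calls in this hunk create the `master`, `datalab`, and per-instance branches as needed. The body of `create_branch` is not shown in this diff, so the following is only a sketch of how a "create if missing" check could be written against a local repository. `ensure_branch` is a hypothetical name, not the script's actual function.

```shell
#!/bin/sh
# Hypothetical stand-in for create_branch: create a local branch only if it
# does not exist yet. Must be run inside a git repository that already has
# at least one commit.
ensure_branch() {
  branch="$1"
  if git show-ref --verify --quiet "refs/heads/$branch"; then
    echo "exists: $branch"
  else
    git branch "$branch" && echo "created: $branch"
  fi
}
```

Calling `ensure_branch "datalab_$DATALAB_INSTANCE_NAME"` twice would create the branch on the first call and report it as existing on the second, so the function is safe to run on every container start.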

4 changes: 2 additions & 2 deletions containers/datalab/logs.sh
@@ -1,5 +1,5 @@
#!/bin/sh
# Copyright 2014 Google Inc. All rights reserved.
# Copyright 2015 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -13,7 +13,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.

# This script shows logs generated from running the docker container locally.
# Shows formatted logs generated by running the docker container locally.

LOGFILE=$HOME/datalab/log/custom_logs/app.log

4 changes: 2 additions & 2 deletions containers/datalab/release.sh
@@ -1,5 +1,5 @@
#!/bin/sh
# Copyright 2014 Google Inc. All rights reserved.
# Copyright 2015 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -13,7 +13,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.

# Publishes the built docker image to the registry
# Releases the built docker image to the registry with the latest tag.

# Grant read permissions to all users on all objects added in the GCS bucket
# that holds docker image files by ACLing the bucket and setting the default
10 changes: 5 additions & 5 deletions containers/datalab/run.sh
@@ -1,5 +1,5 @@
#!/bin/sh
# Copyright 2014 Google Inc. All rights reserved.
# Copyright 2015 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -13,10 +13,10 @@
# See the License for the specific language governing permissions and
# limitations under the License.

# This script allows running the docker container locally.
# Passing the 'shell' flag causes the docker container to break into a
# command prompt, rather than run the node.js server, which is useful
# for tinkering within the container before manually starting the server.
# Runs the docker container locally.
# Passing in 'shell' flag causes the docker container to break into a
# command prompt, which is useful for tinkering within the container before
# manually starting the server.

# In local mode the container picks up local notebooks, so it can be used
# to work on files saved on the file system.
4 changes: 2 additions & 2 deletions containers/datalab/stage.sh
@@ -1,5 +1,5 @@
#!/bin/sh
# Copyright 2014 Google Inc. All rights reserved.
# Copyright 2015 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -13,7 +13,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.

# Publishes the built docker image to the registry
# Publishes the built docker image to the registry for testing purposes.

if [ "$1" == "" ]; then
TAG=$USER
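
The visible part of this hunk defaults the tag to the current user when no argument is given; the rest of `stage.sh` is collapsed in this view. A sketch of the same defaulting using POSIX parameter expansion is shown here. `resolve_tag` is a hypothetical helper and does not reproduce the script. Note that `==` inside `[ ]` is a bash extension, while `=` is the portable spelling under `#!/bin/sh`.

```shell
#!/bin/sh
# Hypothetical helper: print the first argument, or $USER when the argument
# is missing or empty.
resolve_tag() {
  printf '%s\n' "${1:-$USER}"
}

TAG=$(resolve_tag "$@")
```

With `USER=alice`, `resolve_tag` and `resolve_tag ""` both yield `alice`, while `resolve_tag v1` yields `v1`.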
15 changes: 15 additions & 0 deletions content/publish.sh
@@ -1,4 +1,19 @@
#!/bin/sh
# Copyright 2015 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Publishes content files to Google Cloud Storage as publicly accessible content.

# Synchronize content from the repository content directory to the
# cloud within gs://cloud-datalab/content
2 changes: 2 additions & 0 deletions externs/README.md
@@ -0,0 +1,2 @@
This directory contains imports (such as TypeScript declarations), used for building DataLab
component sources.
2 changes: 1 addition & 1 deletion externs/ts/node/bunyan.d.ts
@@ -1,5 +1,5 @@
/*
* Copyright 2014 Google Inc. All rights reserved.
* Copyright 2015 Google Inc. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
* in compliance with the License. You may obtain a copy of the License at
3 changes: 1 addition & 2 deletions externs/ts/node/mkdirp.d.ts
@@ -1,5 +1,5 @@
/*
* Copyright 2014 Google Inc. All rights reserved.
* Copyright 2015 Google Inc. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
* in compliance with the License. You may obtain a copy of the License at
@@ -12,7 +12,6 @@
* the License.
*/


/**
* Type definitions for mkdirp node module v 0.5.0
*/