Latest Version Changes #235

Open — wants to merge 3 commits into base: master
12 changes: 6 additions & 6 deletions Dockerfile.data-model
@@ -2,7 +2,8 @@ FROM centos:7
MAINTAINER Saleem Ansari <[email protected]>

RUN yum install -y epel-release && \
    yum install -y python-pip python-devel gcc && \
    yum install -y python34-pip python34-devel gcc && \
    yum install -y git && \
    yum clean all

# --------------------------------------------------------------------------------------------------------------
@@ -21,15 +22,14 @@ RUN yum install -y epel-release && \

# Note: cron daemon ( crond ) will be invoked from within entry point
# --------------------------------------------------------------------------------------------------------------
RUN pip3 install git+https://[email protected]/fabric8-analytics/fabric8-analytics-version-comparator.git
RUN pip3 install git+https://github.com/fabric8-analytics/fabric8-analytics-utils.git
Member:
Wouldn't it be better to install specific versions of these dependencies?

Member Author:
Yes, I will make those changes in the final commit. It's still a WIP since the tests are not passing.
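For reference, pinning a pip VCS dependency to a specific commit just appends `@<ref>` to the URL; the hashes below are the ones this PR's requirements.txt ends up using:

```text
git+https://github.com/fabric8-analytics/fabric8-analytics-version-comparator.git@2e7eddc
git+https://github.com/fabric8-analytics/fabric8-analytics-utils.git@d7aaccf
```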


# install python packages
COPY ./requirements.txt /
RUN pip install -r requirements.txt && rm requirements.txt

COPY ./ /tmp/f8a_data_model/
COPY ./src /src
RUN cd /tmp/f8a_data_model && pip3 install .

ADD scripts/entrypoint.sh /bin/entrypoint.sh
ADD populate_schema.py /populate_schema.py

ENTRYPOINT ["/bin/entrypoint.sh"]

8 changes: 4 additions & 4 deletions Dockerfile.data-model.rhel
@@ -18,12 +18,12 @@ LABEL author "Devtools <[email protected]>"

# Note: cron daemon ( crond ) will be invoked from within entry point
# --------------------------------------------------------------------------------------------------------------
RUN pip3 install git+https://[email protected]/fabric8-analytics/fabric8-analytics-version-comparator.git
RUN pip3 install git+https://github.com/fabric8-analytics/fabric8-analytics-utils.git@latest
Member:

@latest?


# install python packages
COPY ./requirements.txt /
RUN pip install -r requirements.txt && rm requirements.txt

COPY ./ /tmp/f8a_data_model/
COPY ./src /src
RUN cd /tmp/f8a_data_model && pip3 install .

ADD scripts/entrypoint.sh /bin/entrypoint.sh
ADD populate_schema.py /populate_schema.py
4 changes: 2 additions & 2 deletions cico_setup.sh
@@ -31,8 +31,8 @@ docker_login() {
prep() {
    yum -y update
    yum -y install docker git which epel-release python-virtualenv postgresql
    yum -y install python-pip
    pip install docker-compose
    yum -y install python34-pip python34-devel
    pip3 install docker-compose
    systemctl start docker
}

2 changes: 1 addition & 1 deletion populate_schema.py
@@ -1,4 +1,4 @@
#!/usr/bin/env python
#!/usr/bin/env python3
"""Populate graph schema."""

import logging
3 changes: 2 additions & 1 deletion requirements.txt
@@ -25,7 +25,6 @@ flake8==3.6.0 # via flake8-polyfill
flask-cors==3.0.7
flask==1.0.2
funcsigs==1.0.2 # via mock, pytest
futures==3.2.0 # via s3transfer
gevent==1.4.0
greenlet==0.4.15 # via gevent
gunicorn==19.9.0
@@ -61,3 +60,5 @@ sqlalchemy==1.2.15
urllib3==1.24.1 # via botocore, minio, requests
uuid==1.30
werkzeug==0.14.1 # via flask, pytest-flask
git+https://[email protected]/fabric8-analytics/fabric8-analytics-version-comparator.git@2e7eddc
git+https://github.com/fabric8-analytics/fabric8-analytics-utils.git@d7aaccf
20 changes: 9 additions & 11 deletions runtests.sh
@@ -44,7 +44,7 @@ function start_services {
function setup_virtualenv {
    echo "Create Virtualenv for Python deps ..."

    virtualenv --python /usr/bin/python2.7 env-test
    virtualenv -p python3 venv && source venv/bin/activate

    if [ $? -ne 0 ]
    then
@@ -53,21 +53,19 @@ function setup_virtualenv {
    fi
    printf "%sPython virtual environment initialized%s\n" "${YELLOW}" "${NORMAL}"

    source env-test/bin/activate

    pip install -U pip
    pip install -r requirements.txt
    pip3 install -r requirements.txt

    # Install profiling module
    pip install pytest-profiling
    pip3 install pytest-profiling

    # Install pytest-coverage module
    pip install pytest-cov
    pip3 install pytest-cov
}

function destroy_virtualenv {
    echo "Remove Virtualenv ..."
    rm -rf env-test/
    rm -rf venv/
}

echo JAVA_OPTIONS value: "$JAVA_OPTIONS"
@@ -76,9 +74,9 @@ start_services

setup_virtualenv

source env-test/bin/activate
source venv/bin/activate

PYTHONPATH=$(pwd)/src
PYTHONPATH=$(pwd)
export PYTHONPATH

export BAYESIAN_PGBOUNCER_SERVICE_HOST="localhost"
@@ -99,9 +97,9 @@ echo "*** Unit tests ***"
echo "*****************************************"
echo "Check for sanity of the connections..."

if python sanitycheck.py
if python3 sanitycheck.py
then
    python populate_schema.py
    python3 populate_schema.py
    py.test --cov=src/ --cov-report term-missing --cov-fail-under=$COVERAGE_THRESHOLD -vv -s test/
    codecov --token=3c1d9638-afb6-40e6-85eb-3fb193000d4b
else
4 changes: 2 additions & 2 deletions sanitycheck.py
@@ -1,10 +1,10 @@
"""Sanity check of the graph DB REST API."""

from graph_manager import BayesianGraph
from src.graph_manager import BayesianGraph
import time
import sys
import logging
import config
from src import config

logging.basicConfig()
logger = logging.getLogger(config.APP_NAME)
2 changes: 1 addition & 1 deletion scripts/data_importer_crontab
@@ -1,2 +1,2 @@
# 0 */6 * * * root . /root/project_env.sh; export PYTHONPATH=/src; python /src/data_importer.py -s S3
# 0 */6 * * * root . /root/project_env.sh; export PYTHONPATH=/src; python3 /src/data_importer.py -s S3
# An empty line is required at the end of this file for a valid cron file.
2 changes: 1 addition & 1 deletion scripts/entrypoint.sh
@@ -8,7 +8,7 @@ do
done

if [ ! -z "$SKIP_SCHEMA" ]; then
    python populate_schema.py
    python3 populate_schema.py
fi

# Start data model service with time out
59 changes: 59 additions & 0 deletions setup.py
@@ -0,0 +1,59 @@
#!/usr/bin/env python3

# Copyright © 2019 Red Hat Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Author: Yusuf Zainee <[email protected]>
#

"""Project setup file for fabric8 analytics notifications project."""

from setuptools import setup, find_packages


def get_requirements():
    """Parse all packages mentioned in the 'requirements.txt' file."""
    with open('requirements.txt') as fd:
        lines = fd.read().splitlines()
        reqs, dep_links = [], []
        for line in lines:
            if line.startswith('git+'):
                dep_links.append(line)
            else:
                reqs.append(line)
        return reqs, dep_links


# pip doesn't install from dependency links by default,
# so one should install dependencies by
# `pip install -r requirements.txt`, not by `pip install .`
# See https://github.com/pypa/pip/issues/2023
reqs, dep_links = get_requirements()
setup(
    name='fabric8-analytics-data-model',
    version='0.1',
    scripts=[
    ],
    packages=find_packages(exclude=['tests', 'tests.*']),
    install_requires=reqs,
    dependency_links=dep_links,
    include_package_data=True,
    author='Yusuf Zainee',
    author_email='[email protected]',
    description='data importer for fabric8 analytics',
    license='ASL 2.0',
    keywords='fabric8-analytics-data-model',
    url=('https://github.com/fabric8-analytics/'
         'fabric8-analytics-data-model')
)
6 changes: 3 additions & 3 deletions src/cve.py
@@ -1,9 +1,9 @@
"""This module encapsulates CVE related queries."""

import logging
from graph_populator import GraphPopulator
from graph_manager import BayesianGraph
from utils import get_timestamp, call_gremlin
from src.graph_populator import GraphPopulator
from src.graph_manager import BayesianGraph
from src.utils import get_timestamp, call_gremlin

logger = logging.getLogger(__name__)

21 changes: 15 additions & 6 deletions src/data_importer.py
@@ -1,16 +1,16 @@
"""Module with functions to fetch data from the S3 data source."""

from graph_populator import GraphPopulator
from src.graph_populator import GraphPopulator
import logging
import config
import traceback
from src import config
import json
import requests
from data_source.s3_data_source import S3DataSource
from src.data_source.s3_data_source import S3DataSource

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.orm.exc import NoResultFound
from f8a_utils.versions import get_latest_versions_for_ep

logger = logging.getLogger(config.APP_NAME)

@@ -51,8 +51,8 @@ def _other_key_info(data_source, other_keys, bucket_name=None):

def _get_exception_msg(prefix, e):
    msg = prefix + ": " + str(e)
    logger.error(msg)
    tb = traceback.format_exc()
    # logger.error(msg)
    tb = logging.exception(msg)
    logger.error("Traceback for latest failure in import call: %s" % tb)
    return msg
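One caveat with the rewritten `_get_exception_msg`: `logging.exception()` logs the traceback but returns `None`, so `tb` no longer holds the traceback text (which `traceback.format_exc()` did return). A small sketch of the difference:

```python
import logging
import traceback

logging.basicConfig()


def compare_traceback_capture():
    try:
        raise ValueError("boom")
    except ValueError:
        ret = logging.exception("import failed")  # logs the traceback, returns None
        tb = traceback.format_exc()               # returns the traceback as a string
        return ret, tb


ret, tb = compare_traceback_capture()
# ret is None; tb contains the "ValueError: boom" traceback text
```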

@@ -80,11 +80,20 @@ def _import_keys_from_s3_http(data_source, epv_list):
                   'version': pkg_version,
                   'source_repo': pkg_source}

            latest_version = get_latest_versions_for_ep(pkg_ecosystem, pkg_name)
            latest_epv_list = [{
                'ecosystem': pkg_ecosystem,
                'name': pkg_name,
                'version': latest_version
            }]
            create_graph_nodes(latest_epv_list)

            try:
                # Check other Version level information and add it to common object
                if len(contents.get('ver_list_keys')) > 0:
                    first_key = contents['ver_key_prefix'] + '.json'
                    first_obj = _first_key_info(data_source, first_key, config.AWS_EPV_BUCKET)
                    first_obj['latest_version'] = latest_version
                    obj.update(first_obj)
                    ver_obj = _other_key_info(data_source, contents.get('ver_list_keys'),
                                              config.AWS_EPV_BUCKET)
4 changes: 2 additions & 2 deletions src/data_source/s3_data_source.py
@@ -1,10 +1,10 @@
"""Data source that returns data read from the AWS S3 database."""

from data_source.abstract_data_source import AbstractDataSource
from src.data_source.abstract_data_source import AbstractDataSource
import botocore
import boto3
import json
import config
from src import config


class S3DataSource(AbstractDataSource):
2 changes: 1 addition & 1 deletion src/graph_manager.py
@@ -1,6 +1,6 @@
"""Template for a singleton object which will have reference to Graph object."""

import config
from src import config
import json
import requests
import os
17 changes: 12 additions & 5 deletions src/graph_populator.py
@@ -5,9 +5,10 @@
import time
from dateutil.parser import parse as parse_datetime
from six import string_types
import config
from utils import get_current_version
from src import config
from src.utils import get_current_version
from datetime import datetime
from f8a_utils.versions import get_latest_versions_for_ep

logger = logging.getLogger(config.APP_NAME)

@@ -22,6 +23,9 @@ def construct_graph_nodes(cls, epv):
        pkg_name = epv.get('name')
        version = epv.get('version')
        source_repo = epv.get('source_repo', '')
        latest_version = epv.get('latest_version', '')
        if not latest_version:
            latest_version = get_latest_versions_for_ep(ecosystem, pkg_name)
        if ecosystem and pkg_name and version:
            # Query to Create Package Node
            # TODO: refactor into the separate module
                "property('{ecosystem}_pkg_count',1)).iterate();" \
                "graph.addVertex('ecosystem', '{ecosystem}', " \
                "'name', '{pkg_name}', 'vertex_label', 'Package');}};" \
                "pkg.property('latest_version', '{latest_version}');" \
                "pkg.property('last_updated', {last_updated});".format(
                    ecosystem=ecosystem, pkg_name=pkg_name,
                    ecosystem=ecosystem, latest_version=latest_version, pkg_name=pkg_name,
                    last_updated=str(time.time())
                )
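The `latest_version` fallback in `construct_graph_nodes` can be sketched in isolation (the stub below stands in for `f8a_utils.versions.get_latest_versions_for_ep`, and the wrapper function name is mine):

```python
def get_latest_versions_for_ep(ecosystem, pkg_name):
    """Stub for f8a_utils.versions.get_latest_versions_for_ep."""
    return "9.9.9"


def resolve_latest_version(epv):
    """Prefer the value already carried on the EPV dict, else look it up."""
    latest_version = epv.get('latest_version', '')
    if not latest_version:
        latest_version = get_latest_versions_for_ep(
            epv.get('ecosystem'), epv.get('name'))
    return latest_version


resolve_latest_version({'ecosystem': 'npm', 'name': 'lodash'})
# -> "9.9.9" (looked up via the helper)
resolve_latest_version({'ecosystem': 'npm', 'name': 'lodash',
                        'latest_version': '4.17.11'})
# -> "4.17.11" (the value on the dict wins)
```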

@@ -487,6 +492,7 @@ def create_query_string(cls, input_json):
            str_gremlin += str_gremlin_version
        if not prp_package:
            # TODO: refactor into the separate module
            latest_version = get_latest_versions_for_ep(ecosystem, pkg_name)
Member:
Hmm, why do we call get_latest_versions_for_ep in 3 different places in this PR?

Member:
This is my biggest issue with this PR :)

Member Author:
The reason is that different APIs trigger different functional flows in which gremlin queries are generated for the package node; that accounts for two of the call sites. The third one, in construct_graph_nodes, I added so that even when we create dummy nodes (during CVE handling, for example) the package node still gets updated to the latest version. It is an extra check to make sure our package node is always up to date.
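One way to reduce the cost of the repeated call sites without restructuring the flows would be to memoize the lookup. A sketch, assuming the result is stable per (ecosystem, name) within a single run; the stub stands in for the f8a_utils helper, and the wrapper name is mine:

```python
from functools import lru_cache

calls = []


def get_latest_versions_for_ep(ecosystem, pkg_name):
    """Stub for f8a_utils.versions.get_latest_versions_for_ep;
    records invocations so the caching effect is visible."""
    calls.append((ecosystem, pkg_name))
    return "1.2.3"


@lru_cache(maxsize=1024)
def latest_version_cached(ecosystem, pkg_name):
    """Memoized wrapper: repeated call sites hit the backend only once."""
    return get_latest_versions_for_ep(ecosystem, pkg_name)


latest_version_cached('maven', 'io.vertx:vertx-core')
latest_version_cached('maven', 'io.vertx:vertx-core')
# the underlying helper ran only once for this (ecosystem, name) pair
```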

            str_gremlin += "pkg = g.V().has('ecosystem','{ecosystem}')." \
                "has('name', '{pkg_name}').tryNext().orElseGet{{" \
                "g.V().has('vertex_label','Count').choose(has('" \
@@ -496,9 +502,10 @@
                "'{ecosystem}_pkg_count',1)).iterate();graph.addVertex(" \
                "'ecosystem', '{ecosystem}', 'name', '{pkg_name}', " \
                "'vertex_label', 'Package');}};" \
                "pkg.property('latest_version', '{latest_version}');" \
                "pkg.property('last_updated', {last_updated});".format(
                    ecosystem=ecosystem, pkg_name=pkg_name,
                    last_updated=str(time.time())
                    ecosystem=ecosystem, latest_version=latest_version,
                    pkg_name=pkg_name, last_updated=str(time.time())
                )
            # TODO: refactor into the separate module
            str_gremlin += "edge_c = g.V().has('pecosystem','{ecosystem}').has('pname'," \
12 changes: 6 additions & 6 deletions src/rest_api.py
@@ -6,12 +6,12 @@
from flask_cors import CORS
import json
import sys
import data_importer
from graph_manager import BayesianGraph
from graph_populator import GraphPopulator
from cve import CVEPut, CVEDelete, CVEGet, CVEDBVersion
from src import data_importer
from src.graph_manager import BayesianGraph
from src.graph_populator import GraphPopulator
from src.cve import CVEPut, CVEDelete, CVEGet, CVEDBVersion
from raven.contrib.flask import Sentry
import config
from src import config as config
from werkzeug.contrib.fixers import ProxyFix
import logging
from flask import Blueprint, current_app
@@ -313,7 +313,7 @@ def cvedb_version_put():
def create_app():
    """Create Flask app object."""
    new_app = Flask(config.APP_NAME)
    new_app.config.from_object('config')
    new_app.config.from_object('src.config')
    CORS(new_app)
    new_app.register_blueprint(api_v1)
    return new_app