A report of all the work done in Google Summer of Code 23

its-sushant/GSoC-23

Google Summer of Code 2023

Adding CycloneDX Report Generation to FOSSology

PROJECT OVERVIEW

FOSSology is a well-known open-source license scanning toolkit used to analyze and manage the licenses associated with software components in a codebase. Previously, FOSSology generated license compliance reports in SPDX (Software Package Data Exchange) format. However, there was a need to extend its capabilities by adding support for generating reports in CycloneDX format.

CycloneDX is an emerging industry standard for representing software bill of materials (SBOM) information, which includes details about software components, their dependencies, and associated licenses. The goal of this project was to enhance FOSSology's functionality by implementing a custom agent capable of generating CycloneDX reports, thereby enabling users to obtain SBOM information in a widely recognized and interoperable format.

The successful completion of this project not only added a new feature to FOSSology but also demonstrated the contributor's expertise in open-source development, license compliance, and software engineering. The newly added CycloneDX report generation capability enhances FOSSology's usefulness to the software development community, facilitating better license management and compliance efforts.

CONTRIBUTIONS

Custom CycloneDX Report Generation in FOSSology

Step 1: Retrieve Data from FOSSology DAO Layer

FOSSology accesses its database through a DAO (Data Access Object) layer. The agent's first task is to retrieve the required scan data from the FOSSology database through this layer. This data is then used to generate the CycloneDX report.

Step 2: Create a CycloneDX Agent

The agent acts as a bridge between FOSSology and the report generator. It connects to FOSSology's database layer, pulls the relevant information, and passes it to the report generator, which creates the CycloneDX report.

Step 3: Extract Relevant Information

Using the CycloneDX agent, we collected the license and copyright findings for each scanned file, limiting the extraction to the information the CycloneDX report needs.

Step 4: Generate the CycloneDX Report

Once the required information was gathered, we generated the CycloneDX report. The report follows the CycloneDX specification, which keeps its structure consistent and makes it interoperable with other tools that consume CycloneDX.

Report Structure

The CycloneDX report generated by FOSSology contains the following information:

  • bomFormat : Format of the SBOM (Software Bill of Materials), in this case, "CycloneDX".
  • $schema : The URL of the CycloneDX JSON schema that the report conforms to.
  • specVersion : The version of the CycloneDX specification being used.
  • version : The version of this BOM document (starts at 1 and is incremented when the BOM is revised).
  • serialNumber : A unique identifier for the report.
  • metadata : Metadata associated with the report, including timestamp and tools used.
    • timestamp : The date and time when the report was generated.
    • tools : Information about the tools used to generate the report.
      • vendor : The vendor of the tool (e.g., "FOSSology").
      • name : The name of the tool (e.g., "FOSSology").
      • version : The version of the tool (e.g., "4.0.0.402-rc1").
    • component : Details about the scanned component (e.g., software library).
      • type : The type of component (e.g., "library").
      • name : The name of the component (e.g., "vlc-master.zip").
      • mime-type : The MIME type of the component (e.g., "application/zip").
      • bom-ref : A reference ID for the component within the SBOM.
      • scope : The scope of the component (e.g., "required").
      • hashes : Hashes (checksums) of the component's content for integrity verification.
        • alg : The algorithm used for hashing (e.g., "SHA-1").
        • content : The hash value itself.
      • licenses : Licensing information associated with the component.
        • license : A specific license, given by its identifier and URL (e.g., "GPL-2.0-or-later").
        • expression : A license expression or reference (e.g., "LicenseRef-fossology-WebM").
  • components : A list of individual components found in the codebase.
    • type : The type of component (e.g., "file").
    • name : The name of the component (e.g., "cdx.py").
    • mime-type : The MIME type of the component (e.g., "text/plain").
    • bom-ref : A reference ID for the component within the SBOM.
    • scope : The scope of the component (e.g., "required").
    • hashes : Hashes (checksums) of the component's content for integrity verification.
      • alg : The algorithm used for hashing (e.g., "SHA-1").
      • content : The hash value itself.
    • licenses : Licensing information associated with the component.
      • license : A specific license, given by its identifier and URL (e.g., "GPL-2.0-or-later").
      • expression : A license expression or reference (e.g., "LicenseRef-fossology-WebM").
    • copyright : Copyright information associated with the component.
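The field layout above can be sketched in a few lines of Python. This is a minimal illustration only, not the FOSSology agent itself: the helper names (`hash_entries`, `make_component`, `make_bom`) and their parameters are hypothetical, the serial number is a random UUID, and the license entries carry only an `id`.

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone


def hash_entries(content):
    """Build the "hashes" list for a component from its raw bytes (upper-case hex)."""
    algorithms = {"SHA-1": hashlib.sha1, "MD5": hashlib.md5, "SHA-256": hashlib.sha256}
    return [
        {"alg": alg, "content": fn(content).hexdigest().upper()}
        for alg, fn in algorithms.items()
    ]


def make_component(name, mime_type, bom_ref, content, license_ids,
                   copyright_text=None, kind="file"):
    """Assemble one component entry with the fields listed above (hypothetical helper)."""
    component = {
        "type": kind,
        "name": name,
        "mime-type": mime_type,
        "bom-ref": bom_ref,
        "scope": "required",
        "hashes": hash_entries(content),
        "licenses": [{"license": {"id": lic}} for lic in license_ids],
    }
    if copyright_text:
        component["copyright"] = copyright_text
    return component


def make_bom(root_component, file_components, tool_version):
    """Wrap the components in the top-level BOM structure (hypothetical helper)."""
    return {
        "bomFormat": "CycloneDX",
        "$schema": "http://cyclonedx.org/schema/bom-1.4.schema.json",
        "specVersion": "1.4",
        "version": 1,
        # Assumption: a random UUID is good enough for the serial number here.
        "serialNumber": f"urn:uuid:{uuid.uuid4()}",
        "metadata": {
            "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
            "tools": [
                {"vendor": "FOSSology", "name": "FOSSology", "version": tool_version}
            ],
            "component": root_component,
        },
        "components": file_components,
    }
```

Calling `json.dumps(make_bom(root, files, tool_version), indent=4)` on components built this way yields a document with the same overall shape as the structure described above.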

Part of a sample report generated by FOSSology is shown below:

{
    "bomFormat": "CycloneDX",
    "schema": "http://cyclonedx.org/schema/bom-1.4.schema.json",
    "specVersion": "1.4",
    "version": 1,
    "serialNumber": "urn:uuid:58d75e04-2878-11ee-85a2-af752f88397f",
    "metadata": {
        "timestamp": "2023-07-22T15:43:04+05:30",
        "tools": [
            {
                "vendor": "FOSSology",
                "name": "FOSSology",
                "version": "4.0.0.402-rc1"
            }
        ],
        "component": {
            "type": "library",
            "name": "vlc-master.zip",
            "mime-type": "application/zip",
            "bom-ref": "10",
            "scope": "required",
            "hashes": [
                {
                    "alg": "SHA-1",
                    "content": "618A15A0F4BE8D46DCB9F427CA92295B8882F2E5"
                },
                {
                    "alg": "MD5",
                    "content": "C8667B9AA727C63978756AA045910887"
                },
                {
                    "alg": "SHA-256",
                    "content": "A82853551ED08E58F46B6F1FDC5DE1BD19E3A826D3F8DB696C8AB7DB8E88716B"
                }
            ],
            "licenses": [
                {
                    "license": {
                        "id": "GPL-2.0-or-later",
                        "url": "https://www.gnu.org/licenses/old-licenses/gpl-2.0-standalone.html"
                    }
                },
                {
                    "license": {
                        "id": "LGPL-2.1-or-later",
                        "url": "https://www.gnu.org/licenses/old-licenses/lgpl-2.1-standalone.html"
                    }
                }
            ]
        }
    },
    "components": [
        {
            "type": "file",
            "name": "vlc-master.zip/vlc-master/COPYING.LIB",
            "mime-type": "text/plain",
            "bom-ref": "10-93924",
            "scope": "required",
            "hashes": [
                {
                    "alg": "SHA-1",
                    "content": "01A6B4BF79ACA9B556822601186AFAB86E8C4FBF"
                },
                {
                    "alg": "MD5",
                    "content": "4FBD65380CDD255951079008B364516C"
                },
                {
                    "alg": "SHA-256",
                    "content": "DC626520DCD53A22F727AF3EE42C770E56C97A64FE3ADB063799D8AB032FE551"
                }
            ],
            "licenses": [
                {
                    "license": {
                        "id": "LGPL-2.1-or-later",
                        "url": "https://www.gnu.org/licenses/old-licenses/lgpl-2.1-standalone.html"
                    }
                }
            ],
            "copyright": "Copyright (C) 1991, 1999 Free Software Foundation, Inc. 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed."
        }
    ]
}
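A report like the one above is plain JSON, so other tools can consume it directly. The following sketch counts how often each license appears across the file-level components. The function name `license_summary` and the tiny inline sample are illustrative assumptions, not part of FOSSology.

```python
import json
from collections import Counter


def license_summary(bom):
    """Count license identifiers (or expressions) across all entries in "components"."""
    counts = Counter()
    for component in bom.get("components", []):
        for entry in component.get("licenses", []):
            if "license" in entry:
                lic = entry["license"]
                counts[lic.get("id") or lic.get("name", "unknown")] += 1
            elif "expression" in entry:
                counts[entry["expression"]] += 1
    return counts


# Tiny made-up report with the same shape as the sample above.
sample = """
{
    "bomFormat": "CycloneDX",
    "components": [
        {"name": "a.c", "licenses": [{"license": {"id": "GPL-2.0-or-later"}}]},
        {"name": "b.c", "licenses": [{"license": {"id": "GPL-2.0-or-later"}},
                                     {"expression": "LicenseRef-fossology-WebM"}]}
    ]
}
"""

summary = license_summary(json.loads(sample))
for license_id, count in summary.most_common():
    print(f"{license_id}: {count}")
```

For a real report, replace the inline sample with the contents of the downloaded file, for example via `json.load(open(path))`.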

(OPTIONAL) Enhancing ScanCode Agent in FOSSology

Introduction

The existing ScanCode agent in FOSSology invoked the ScanCode command-line interface (CLI) once for each individual file. This added substantial time overhead, because ScanCode was bootstrapped again for every file. To address this, the agent was reworked to use the ScanCode API and scan all files in a single run, avoiding the redundant invocations and improving overall efficiency.

The Strategic Solution

The key to improving the ScanCode agent's performance was moving from per-file CLI invocations to a single, API-driven process. This involved the following steps:

1. File Location Gathering
  • The FOSSology ScanCode agent collected the location of every file to be scanned.
  • The file locations were written to a temporary text file for further processing.
2. Python Script Integration
  • A dedicated Python script was written to drive the scanning process through the ScanCode API.
  • The path to the temporary text file containing the file locations was passed to this script.
3. Concurrent Scanning Script
  • The script was designed to handle the file scanning concurrently.
  • It iterated over each file location listed in the text file.
  • For every file location:
    • The ScanCode API was invoked to scan the file.
    • The results were appended to a single JSON file.
4. Database Integration
  • After the scanning script finished:
    • The scan data was read from the generated JSON file.
    • The ScanCode agent stored this data in the FOSSology database.
5. Streamlined Cleanup
  • Finally, the temporary files were cleaned up:
    • Both the temporary text file and the generated JSON file were deleted.
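The steps above can be sketched as a small Python script. This is a simplified illustration under stated assumptions, not the script merged into FOSSology: the function names, the thread pool, and the output layout are hypothetical. `scancode.api.get_licenses` and `scancode.api.get_copyrights` are the ScanCode toolkit's per-file API functions, and they are imported lazily so the rest of the sketch runs without ScanCode installed.

```python
import json
import os
from concurrent.futures import ThreadPoolExecutor


def scan_with_scancode(location):
    """Scan one file through the ScanCode Python API (needs scancode-toolkit)."""
    from scancode import api  # lazy import: only required when actually scanning
    return {
        "file": location,
        "licenses": api.get_licenses(location),
        "copyrights": api.get_copyrights(location),
    }


def read_locations(list_path):
    """Step 1 output: one file location per line, blank lines ignored."""
    with open(list_path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def scan_all(list_path, output_path, scan_one=scan_with_scancode, workers=4):
    """Steps 2 and 3: scan every listed file and write all results to one JSON file."""
    locations = read_locations(list_path)
    # Assumption: a thread pool stands in for whatever concurrency the real script uses.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(scan_one, locations))  # keeps input order
    with open(output_path, "w", encoding="utf-8") as handle:
        json.dump(results, handle)
    return results


def cleanup(*paths):
    """Step 5: delete the temporary list file and the JSON result file."""
    for path in paths:
        if os.path.exists(path):
            os.remove(path)
```

In this sketch the agent would call `scan_all(list_path, output_path)`, read the JSON file to store the findings in the database (step 4), and then call `cleanup(list_path, output_path)`. Passing a different `scan_one` function makes it possible to exercise the flow without ScanCode.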

Advantages

This rework of the ScanCode agent's workflow brought several advantages:

  • A significant reduction in the time overhead of bootstrapping ScanCode for each individual file.
  • Better use of the ScanCode toolkit within FOSSology.
  • Aggregation of the scanning results into a single structured JSON file, ready to be loaded into the FOSSology database.
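The first advantage can be made concrete with a simple cost model: if bootstrapping ScanCode takes a fixed time and each scan takes a fixed time, the per-file CLI approach pays the bootstrap cost once per file, while the batched approach pays it once in total. The function names and the numbers below are hypothetical and chosen only to illustrate the arithmetic. They are not measurements from FOSSology.

```python
def per_file_cli_cost(n_files, bootstrap_s, scan_s):
    """Total time when ScanCode is started once per file."""
    return n_files * (bootstrap_s + scan_s)


def batched_api_cost(n_files, bootstrap_s, scan_s):
    """Total time when ScanCode is started once and then scans every file."""
    return bootstrap_s + n_files * scan_s


# Hypothetical figures, for illustration only.
n_files, bootstrap_s, scan_s = 1000, 2.0, 0.1
old = per_file_cli_cost(n_files, bootstrap_s, scan_s)
new = batched_api_cost(n_files, bootstrap_s, scan_s)
print(f"per-file CLI: {old:.1f} s, batched API: {new:.1f} s, saved: {old - new:.1f} s")
```

Under this model the saving is always `(n_files - 1) * bootstrap_s`, so it grows linearly with the number of files in the upload.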

WPRS

MAJOR PULL REQUESTS

👨🏻‍🏫 DELIVERABLES

| Tasks | Status | Links |
| --- | --- | --- |
| CycloneDX agent | Successfully implemented agent | Agent |
| Scancode agent improvement | Done | PR |

REACH OUT TO ME!
