Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should data collections throw if data files are invalid? #1736

Closed
caendesilva opened this issue Jun 24, 2024 · 8 comments
Closed

Should data collections throw if data files are invalid? #1736

caendesilva opened this issue Jun 24, 2024 · 8 comments

Comments

@caendesilva
Copy link
Member

Exploring ways to improve the file-based data collections feature for 2.x. One of these changes is to throw exceptions if data is invalid.

For example: Currently, a JSON file with bad syntax is returned as null, while YAML collections will throw an error, which is inconsistent. The main goal with HydePHP is to have great developer experience (DX), so I'm thinking if introducing exceptions for all invalid data types is the right approach.

My main advantage is direct and clear feedback on what has gone wrong, however a drawback is being interrupted when developing a data file.

@caendesilva
Copy link
Member Author

caendesilva commented Jun 24, 2024

GPT-4o response on the matter

Introducing error handling by throwing exceptions for invalid data in the HydePHP's file-based collections feature could indeed improve the developer experience (DX). However, it's essential to balance robustness with usability. Here are some considerations and potential improvements:

Considerations for Throwing Exceptions

  1. Immediate Feedback:

    • Throwing exceptions for invalid data (e.g., malformed JSON or YAML) provides immediate feedback to developers, helping them quickly identify and correct errors.
  2. Improved Debugging:

    • Exceptions can include detailed error messages and stack traces, making it easier for developers to pinpoint the exact issue and location.
  3. Consistency and Reliability:

    • Ensuring data validity across collections can lead to more consistent and reliable application behavior.

Potential Downsides

  1. Interruptions in Development Flow:

    • Exceptions will stop execution, which might be disruptive if developers frequently work with incomplete or evolving data files.
  2. Learning Curve:

    • Developers new to HydePHP or static site generation might find the strict error handling overwhelming initially.

Suggested Improvements

To enhance the file-based collections feature while maintaining great DX, consider implementing a hybrid approach that offers both strict error handling and flexibility. Here are some specific suggestions:

  1. Configurable Error Handling:

    • Allow developers to configure how errors are handled through a setting in the HydePHP configuration file. Options could include:
      • Strict Mode: Throws exceptions for any invalid data.
      • Warning Mode: Logs errors and continues execution, returning null for invalid files.
      • Silent Mode: Continues execution without logging, returning null for invalid files.

    Example configuration:

    // config/hyde.php
    'data_collection_error_handling' => 'strict', // Options: 'strict', 'warning', 'silent'
  2. Detailed Error Messages:

    • When throwing exceptions, provide detailed messages including the file path, type of error, and potential fixes. This can aid in quicker resolution.

    Example:

    throw new \Exception("Invalid JSON syntax in file 'resources/collections/testimonials/1.json': [specific error details]");
  3. Logging:

    • Implement a logging system that records errors in a dedicated log file or system. This can be particularly useful in warning mode to keep track of issues without interrupting execution.

    Example:

    \Log::warning("Invalid JSON syntax in file 'resources/collections/testimonials/1.json': [specific error details]");
  4. Validation Tools:

    • Provide command-line tools or a web interface for validating collections. This can be part of the development workflow to catch errors before deployment.

    Example CLI command:

    php hyde validate:collections
  5. Documentation Updates:

    • Update the documentation to reflect the new error handling configuration options and provide guidelines on best practices for managing data files.

Example Usage and Error Handling

Here's how the usage might look with configurable error handling:

use Hyde\Support\DataCollection;

// In strict mode, this will throw an exception if there's an invalid JSON file
try {
    $collection = DataCollection::json('testimonials');
} catch (\Exception $e) {
    echo "Error: " . $e->getMessage();
}

// In warning or silent mode, it will log the error (if configured) and continue
$collection = DataCollection::json('testimonials');
// Handle null entries if necessary
foreach ($collection as $file => $data) {
    if ($data === null) {
        echo "Warning: Invalid data in file $file\n";
    } else {
        // Process valid data
    }
}

Conclusion

By providing configurable error handling options, HydePHP can cater to both novice and experienced developers, offering immediate feedback during development while maintaining flexibility. This approach ensures a balance between robustness and usability, enhancing the overall developer experience.

Final Recommendation

Based on your goals of clear feedback and maintaining simplicity, I recommend proceeding with throwing exceptions for invalid data, accompanied by logging detailed error messages. This approach will provide the immediate and clear feedback you desire without introducing significant complexity.

@caendesilva
Copy link
Member Author

In response to GPT:

We could add a build warning, and I have considered previously adding a "strict" mode to control these things (though this is not the place I would first implement this), and we could add it to site validations.

But given that there is an inconsistency we should fix (json_decode does not throw, Yaml::parse does, so if we decide against throwing we should update Yaml ones to return null to match), I think throwing will result in the best DX and upgrade path all things considered.

@caendesilva
Copy link
Member Author

Additionally, consider the impact of a syntax error in a data file, worst case empty data is shipped to production, or the developer is confused. We solve both of this by letting them know that something is wrong.

@caendesilva
Copy link
Member Author

I'm considering if empty files should be syntax errors, json_validate considers them to be, and they otherwise return null which would require a ?? [] fallback, which I'm not sure we want to support = encourage given the reasoning behind this feature.

@caendesilva
Copy link
Member Author

caendesilva commented Jun 25, 2024

Note that the Symfony YAML parser throws ParseException if the Yaml is invalid, so we want to match this with Json.

We may want to add a custom throw if the file is empty, to match json_validate

@caendesilva
Copy link
Member Author

GPT 4-o response for "Should an empty file be considered invalid?"

Deciding whether an empty file should be considered invalid depends on the context and intended usage within HydePHP. Here are some considerations to help you determine the best approach:

Considerations

  1. Use Case and Context:

    • If an empty file indicates missing or incomplete data that should be populated, treating it as invalid makes sense.
    • If an empty file is a placeholder or an intentional absence of data, it might not be appropriate to consider it invalid.
  2. Developer Experience:

    • Treating empty files as invalid provides clear feedback that data is missing, prompting developers to address the issue.
    • However, this might interrupt the development flow if developers frequently use empty files as placeholders during initial development stages.
  3. Consistency:

    • Consistently treating empty files as invalid can enforce data completeness and integrity across the application.
  4. Flexibility:

    • Allowing empty files can provide flexibility during the development process, especially in early stages when data might not be fully available.

Recommendations

Given the goals of providing clear feedback and maintaining simplicity, here are two approaches you can consider:

1. Treat Empty Files as Invalid

Pros:

  • Ensures data completeness.
  • Provides clear feedback to developers about missing data.

Cons:

  • Can interrupt development flow if empty files are used as placeholders.

Implementation:

class DataCollection
{
    protected $logger;

    public function __construct(LoggerInterface $logger)
    {
        $this->logger = $logger;
    }

    public static function json(string $name, bool $asArray = false)
    {
        try {
            $filePath = "resources/collections/{$name}.json";
            $content = file_get_contents($filePath);

            // Check if the file is empty
            if (empty($content)) {
                throw new InvalidDataException('File is empty');
            }

            // Load and parse JSON file
            $data = json_decode($content, $asArray);
            if (json_last_error() !== JSON_ERROR_NONE) {
                throw new InvalidDataException('Invalid JSON syntax');
            }
            return $data;
        } catch (\Exception $e) {
            // Log detailed error message
            $this->logger->error("Error loading JSON collection '{$name}': " . $e->getMessage());
            // Throw exception for immediate feedback
            throw $e;
        }
    }
}

2. Allow Empty Files with Warnings

Pros:

  • Provides flexibility during development.
  • Allows developers to use empty files as placeholders without interruptions.

Cons:

  • May lead to incomplete data being overlooked if warnings are ignored.

Implementation:

class DataCollection
{
    protected $logger;

    public function __construct(LoggerInterface $logger)
    {
        $this->logger = $logger;
    }

    public static function json(string $name, bool $asArray = false)
    {
        try {
            $filePath = "resources/collections/{$name}.json";
            $content = file_get_contents($filePath);

            // Check if the file is empty and log a warning
            if (empty($content)) {
                $this->logger->warning("Warning: File '{$filePath}' is empty.");
                return null;
            }

            // Load and parse JSON file
            $data = json_decode($content, $asArray);
            if (json_last_error() !== JSON_ERROR_NONE) {
                throw new InvalidDataException('Invalid JSON syntax');
            }
            return $data;
        } catch (\Exception $e) {
            // Log detailed error message
            $this->logger->error("Error loading JSON collection '{$name}': " . $e->getMessage());
            // Throw exception for immediate feedback
            throw $e;
        }
    }
}

Final Recommendation

Given the importance of clear feedback and the potential interruptions during development, a hybrid approach might be most beneficial. Treat empty files as invalid by default, but consider logging a warning instead of throwing an exception initially. This provides flexibility while still notifying developers of potential issues.

This balanced approach ensures that data integrity is maintained without significantly disrupting the development workflow. You can gather feedback from developers and adjust the approach as needed based on their experiences.

caendesilva added a commit that referenced this issue Jun 25, 2024
This better matches the JSON exception throwing. See #1736
@caendesilva
Copy link
Member Author

caendesilva commented Jun 25, 2024

Should we throw if the wrong collection type is called?

For example, if "foo" only has ".md" files, but DataCollection::yaml('foo') is called?

Edit: Actually just having it be empty is prob sufficient here

@caendesilva
Copy link
Member Author

Fixed in #1732

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant