Upload target filesystem, Override File Naming #142

alexfornuto · 2019-12-06T16:26:23Z

Thanks to #139 I can now handle the JSON and HTML reports locally without having to upload the results to an LHCI server (thanks for that!).

In order to programmatically handle the files generated, I'd like to be able to define a consistent pattern for the files. For example:

IF I run lhci collect --url=https://pantheon.io/docs --url=https://some.staging.environment.io/docs,
AND I define some extra parameter like --format=$URL-$PASS_NUM (just spit-balling on this)
THEN the contents of .lighthouseci would look like:

.lighthouseci
|- lhr-https://pantheon.io/docs-1.html
|- lhr-https://pantheon.io/docs-1.json
|- lhr-https://pantheon.io/docs-2.html
|- lhr-https://pantheon.io/docs-2.json
|- lhr-https://pantheon.io/docs-3.html
|- lhr-https://pantheon.io/docs-3.json
|- lhr-https://some.staging.environment.io/docs-1.html
|- lhr-https://some.staging.environment.io/docs-1.json
|- lhr-https://some.staging.environment.io/docs-2.html
|- lhr-https://some.staging.environment.io/docs-2.json
|- lhr-https://some.staging.environment.io/docs-3.html
|- lhr-https://some.staging.environment.io/docs-3.json

Not that the filename needs to contain the full URL; this is just an example that uses a parameter already defined in another flag. Anything consistently predictable would suit my needs.

jzabala · 2019-12-06T17:48:03Z

Hey @alexfornuto is the incremental number sequence on the files something that you are looking for too?

In your example I see you used a pattern: {url}-{sequential-number}, but a result from the lib looks more like this:

Or based on the image the only thing you want to change is the prefix lhr-?

alexfornuto · 2019-12-06T17:56:33Z

Sorry for not being more precise. The numbers in my example correlate to the three passes I've configured on my end. Since I know that there are three passes, those numbers are predictable, whereas the current timestamp format is not.

I don't care about the lhr prefix, and my example includes it. The main point here is that the timestamp data can't be predicted, and I'd like to override it.

patrickhulce · 2019-12-06T20:12:12Z

@alexfornuto we really don't want to encourage folks to read directly from the .lighthouseci folder because it's an internal implementation detail we don't intend on treating as a public API.

If this is something you'd really like to see, I think we treat this as a feature request for upload --target=filesystem --report-name-pattern="%url%-%date%.%extension%" or something

jzabala · 2019-12-06T20:21:15Z

> @alexfornuto we really don't want to encourage folks to read directly from the .lighthouseci folder because it's an internal implementation detail we don't intend on treating as a public API.
>
> If this is something you'd really like to see, I think we treat this as a feature request for upload --target=filesystem --report-name-pattern="%url%-%date%.%extension%" or something

@patrickhulce that --target=filesystem option would probably be a better solution than writing the HTML files to the .lighthouseci folder, as was introduced in #139. Writing the HTML files there, I suppose, gives the impression that you can use the contents of the folder.

jzabala · 2019-12-06T20:22:19Z

> Sorry for not being more precise. The numbers in my example correlate to the three passes I've configured on my end. Since I know that there are three passes, those numbers are predictable, whereas the current timestamp format is not.
>
> I don't care about the lhr prefix, and my example includes it. The main point here is that the timestamp data can't be predicted, and I'd like to override it.

Thanks for the detailed response @alexfornuto 🙂

patrickhulce · 2019-12-06T20:29:09Z

Haha, it most certainly is the better option and sounds like something we'll eventually need to tackle. I was looking for a quick way to unblock you @alexfornuto, but I suppose if you give a mouse a cookie... 😉

For now I view this as pretty low priority because all of the information is already there to be able to rename the files yourself (by parsing JSON), so a naming pattern is more convenience than new power (unlike #139).

alexfornuto · 2019-12-06T20:37:26Z

Welp, the can of worms has been opened now. Sorry!

> ... all of the information is already there to be able to rename the files yourself (by parsing JSON)

Forgive my ignorance. I know that this must be the case, but I don't know how. If you could give me a brief example of how one would do this, I could probably unblock myself from there.

patrickhulce · 2019-12-06T21:05:52Z

Ah gotcha, no worries! Each of the .json files is a Lighthouse Result (LHR, for short), so they're structured like any regular Lighthouse result.

You can pluck .finalUrl or .fetchTime out of lhr-12345.json and use it to decide how to name lhr-12345.html :)
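For anyone following along, a minimal sketch of that idea: derive a predictable base name from the two LHR fields mentioned above. The helper name `reportBaseName` and the slug scheme are assumptions made up for illustration; only `finalUrl` and `fetchTime` come from the Lighthouse result itself.

```javascript
// Hypothetical helper: build a predictable base name from a parsed LHR object.
// The naming scheme (slug + timestamp digits) is an assumption, not an LHCI convention.
function reportBaseName(lhr) {
  const url = new URL(lhr.finalUrl);
  // Collapse every run of non-alphanumeric characters into one underscore.
  const slug = (url.host + url.pathname)
    .replace(/[^a-zA-Z0-9]+/g, '_')
    .replace(/^_+|_+$/g, '');
  // fetchTime is an ISO 8601 string; keep the first 14 digits (YYYYMMDDhhmmss).
  const stamp = lhr.fetchTime.replace(/\D/g, '').slice(0, 14);
  return `${slug}-${stamp}`;
}

const sampleLhr = {
  finalUrl: 'https://pantheon.io/docs',
  fetchTime: '2019-12-06T21:05:52.000Z',
};
console.log(reportBaseName(sampleLhr)); // pantheon_io_docs-20191206210552
```

The same base name can then be used for both the .json and the .html copy of a report.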

alexfornuto · 2019-12-06T21:42:32Z

OK, thanks! That definitely helps. But in order to build a script that plucked data out of those files, I'd need to know what they would be named, right? It seems like a bit of a catch-22 to me.

paulirish · 2019-12-06T21:48:22Z

start by reading all *.json files in the folder?
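A rough sketch of that first step in Node, with no glob library needed for a flat folder. The function names are hypothetical, and the `lhr-` prefix filter is an assumption based on the file names shown earlier in this thread; drop it if your files are named differently.

```javascript
const fs = require('fs');
const path = require('path');

// List the LHR JSON files in `dir`, sorted by name, as absolute paths.
// Filtering on the "lhr-" prefix skips any other JSON files stored alongside them.
function listLhrJsonFiles(dir) {
  return fs
    .readdirSync(dir)
    .filter((name) => name.startsWith('lhr-') && path.extname(name) === '.json')
    .sort()
    .map((name) => path.resolve(dir, name));
}

// The HTML report shares the JSON file's base name (lhr-12345.json -> lhr-12345.html).
function htmlSibling(jsonPath) {
  return jsonPath.replace(/\.json$/, '.html');
}
```

Calling `listLhrJsonFiles('.lighthouseci')` right after `lhci collect` gives the list of files to parse, so the names do not need to be known in advance.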

alexfornuto · 2019-12-06T21:49:48Z

@paulirish indeed - I suppose this is where my own limitations as a technical writer forced to dev his own platform are catching up to me. I'll go away now and see if I can't figure out globbing filenames in JS.

jzabala · 2019-12-07T01:27:37Z

hey @patrickhulce, just to clarify this issue:

- Add a filesystem choice for the target option.
- Add a --report-name-pattern option, required when target=filesystem.
- The --report-name-pattern is a string in which the following placeholders are replaced:
  - %url%: URL or file name (for a static dir)
  - %date%: timestamp
  - %index%: the index number
- Existing files will be overwritten.

Questions:

Will --report-name-pattern include the full path or only the filename and another option will be created to provide the dir name?
In case the dir doesn't exist do we create it?

patrickhulce · 2019-12-07T03:24:24Z

These questions raise good points @jzabala in that I don't think we should pursue this right now until we have more than one potential consumer to iron out what the API should look like :)

jzabala · 2019-12-07T17:59:02Z

> These questions raise good points @jzabala in that I don't think we should pursue this right now until we have more than one potential consumer to iron out what the API should look like :)

Awesome @patrickhulce. If this issue gets a couple more +1s, I wouldn't mind returning to the discussion about the API design and implementing it 👍🏼

muratgozel · 2019-12-19T13:55:11Z

I think this is the most important issue in this library. I would like to automate reading the results easily.

The only reasonable approach that comes to my mind right now is

To read the contents of the directory .lighthouseci,
Sort *.json files by creation date,
Get the first *.json file content,
Read scores.

That's too much and it actually would be another automation script. There may be a map between input URLs and output. The library may continue to create files as it wants in .lighthouseci folder but may give us a map file between URLs and scores of the last run.
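The four steps listed above fit in a short function. This is only a sketch: `newestReportScores` is a made-up name, and it sorts by modification time rather than creation time, on the assumption that the report files are written once and never edited.

```javascript
const fs = require('fs');
const path = require('path');

// Read the most recently modified .json file in `dir` and return its category scores.
// Returns null when the folder contains no .json files.
function newestReportScores(dir) {
  const newest = fs
    .readdirSync(dir)
    .filter((name) => name.endsWith('.json'))
    .map((name) => {
      const file = path.join(dir, name);
      return {file, mtimeMs: fs.statSync(file).mtimeMs};
    })
    .sort((a, b) => b.mtimeMs - a.mtimeMs)[0];
  if (!newest) return null;

  const lhr = JSON.parse(fs.readFileSync(newest.file, 'utf8'));
  const scores = {};
  // Each LHR category carries a score on a 0-1 scale.
  for (const [id, category] of Object.entries(lhr.categories || {})) {
    scores[id] = category.score;
  }
  return {file: newest.file, url: lhr.finalUrl, scores};
}
```

Note that this returns a single run; with several URLs or several passes per URL you would still need to group the files by `finalUrl`.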

patrickhulce · 2019-12-19T15:26:41Z

> To read the contents of the directory .lighthouseci,
> Sort *.json files by creation date,
> Get the first *.json file content,
> Read scores.

Can I ask what you're doing with the scores? There might be a higher level function we can automate within lighthouse-ci, and I'm not sure that a filesystem dump will help you here. You'll still have to read the directory to get the list of reports, do some parsing to find the one you're interested in, and then read the scores from the JSON whether this feature is implemented or not.

> may give us a map file between URLs and scores of the last run.

I'm not sure I understand what you mean by this. Do you just want a summary JSON file that looks like

{
  "https://example.com/page1": [
    {"performance": 90, "pwa": 32},
    {"performance": 91, "pwa": 32},
    {"performance": 89, "pwa": 32}
  ],
  "https://example.com/page2": [
    // ...
  ]
}
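A summary of that shape can already be assembled from the parsed LHR files. The sketch below assumes `lhrs` is an array of parsed Lighthouse results; `summarize` is a hypothetical helper, and scores are converted from the LHR's 0-1 scale to whole numbers out of 100 to match the example.

```javascript
// Group category scores by final URL, one entry per run.
function summarize(lhrs) {
  const summary = {};
  for (const lhr of lhrs) {
    const scores = {};
    for (const [id, category] of Object.entries(lhr.categories)) {
      // A category score can be null when Lighthouse could not compute it.
      scores[id] = category.score === null ? null : Math.round(category.score * 100);
    }
    if (!summary[lhr.finalUrl]) summary[lhr.finalUrl] = [];
    summary[lhr.finalUrl].push(scores);
  }
  return summary;
}
```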

muratgozel · 2019-12-19T16:31:27Z

I have a system that emails the client a changelog for each deploy. When I saw this tool, I started thinking about attaching the performance, accessibility, and other scores to the bottom of that email.

The folder .lighthouseci is like a database for this tool I understand but then we need an API to access certain properties inside it.

The JSON summary you wrote works for me but one still may need to access more properties.

The way I'm going to use it will probably be like: (after parsing, inside an email)

Performance Scores Report

Homepage
Performance: 91  |  Accessibility: 92  |  Best Practices: 93  |  SEO: 94

Product Pages
Performance: 91  |  Accessibility: 92  |  Best Practices: 93  |  SEO: 94

It just occurred to me that we could specify an identifier with the run, such as a version number (1.3.44) or some other identifier, and then access the results in JSON format:

lhci autorun --id=1.3.44
# later on
lhci report --id=1.3.44
# and we have a JSON file: lhci-1.3.44.json :)

This would be better than the previous mapping idea.

patrickhulce · 2019-12-19T18:29:26Z

> The folder .lighthouseci is like a database for this tool I understand but then we need an API to access certain properties inside it.

Ah, there's a slight disconnect here, .lighthouseci is the temporary storage folder for the reports from the most recent run until they get uploaded. It is cleared out every time you invoke lhci collect unless you use the --additive flag. It's just meant to store results for a single build, not a long-term database of reports.

As for your use case, I'm not sure there's much in a --target=filesystem that will significantly change the amount of work you need to do for that email. No matter what, we'd be saving each full report in its own file, so it'd still require inspecting the directory and/or parsing JSON to determine which report is the "homepage"/"product page"/etc that you need to link to. At best, I suppose a manifest replaces the fs.readdirSync('./dir') call with a require('./dir/manifest.json')? Perhaps you have a proposed data format for the manifest that would make your case easier, yet still generally applicable?

muratgozel · 2019-12-19T21:41:44Z

Yes. A manifest would save us from detecting the report filenames, and it would be more sustainable. I'm not much interested in the --target=filesystem option, actually. But the manifest idea can resolve both of our demands, I think.

pahan35 · 2020-01-29T15:22:55Z

@patrickhulce could you please specify the expected API so that we can help you with it?

patrickhulce · 2020-01-29T16:08:06Z

I'm currently thinking that the usage looks something like this...

lhci upload --target=filesystem --output-dir=./path/to/dir --report-filename-pattern="%%ORIGIN%%-%%PATHNAME%%-%%DATE%%.report.%%EXTENSION%%"

and it would output the median LHR reports into ./path/to/dir like

lhci-manifest.json
localhost_8000-path_to_page-2020_01_30_15_12_12.report.html
localhost_8000-path_to_page-2020_01_30_15_12_12.report.json
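To make the proposed naming concrete, here is a sketch of how such a pattern could be expanded. It illustrates the proposal in this comment and is not LHCI's implementation; `expandPattern`, the sanitising rule, and the use of UTC for the date are all assumptions.

```javascript
// Expand %%TOKEN%% placeholders for one report. Hypothetical helper for illustration only.
function expandPattern(pattern, {url, date, extension}) {
  const parsed = new URL(url);
  // Replace runs of non-alphanumeric characters with "_" and trim the ends.
  const safe = (value) => value.replace(/[^a-zA-Z0-9]+/g, '_').replace(/^_+|_+$/g, '');
  const pad = (n) => String(n).padStart(2, '0');
  const stamp = [
    date.getUTCFullYear(),
    pad(date.getUTCMonth() + 1),
    pad(date.getUTCDate()),
    pad(date.getUTCHours()),
    pad(date.getUTCMinutes()),
    pad(date.getUTCSeconds()),
  ].join('_');
  const values = {
    ORIGIN: safe(parsed.host),
    PATHNAME: safe(parsed.pathname),
    DATE: stamp,
    EXTENSION: extension,
  };
  // Unknown tokens are left untouched so typos in the pattern are easy to spot.
  return pattern.replace(/%%([A-Z_]+)%%/g, (match, token) =>
    Object.prototype.hasOwnProperty.call(values, token) ? values[token] : match
  );
}

console.log(
  expandPattern('%%ORIGIN%%-%%PATHNAME%%-%%DATE%%.report.%%EXTENSION%%', {
    url: 'http://localhost:8000/path/to/page',
    date: new Date(Date.UTC(2020, 0, 30, 15, 12, 12)),
    extension: 'html',
  })
); // localhost_8000-path_to_page-2020_01_30_15_12_12.report.html
```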

contents of lhci-manifest.json would be

[
  {
    "url": "http://localhost:8080/path/to/page",
    "representative": true,
    "extension": "html",
    "filePath": "./localhost_8000-path_to_page-2020_01_30_15_12_12.report.html",
    "summary": {"performance": 0.52, "accessibility": 0.79, "seo": 1, "best-practices": 0.23}
  },
  // ...
]

eventually support options like --include-all-reports which would add %%INDEX%% and copy the non-median reports too.

Any thoughts on this API folks?

pahan35 · 2020-01-29T16:26:53Z

What if we remove extension from the manifest and instead store paths to both the HTML report and the JSON data?

Also, with --include-all-reports I'd like to see those reports inside a property of the representative record, like

[
  {
    "url": "http://localhost:8080/path/to/page",
    "representative": true,
    "jsonPath": "./localhost_8000-path_to_page-2020_01_30_15_12_12.report.json",
    "reportPath": "./localhost_8000-path_to_page-2020_01_30_15_12_12.report.html",
    "summary": {"performance": 0.52, "accessibility": 0.79, "seo": 1, "best-practices": 0.23},
    // this is here only if --include-all-reports
    "reports": [
      {
        "jsonPath": "path/to/report.json",
        "reportPath": "path/to/report.html",
        "summary": {"performance": 0.52, "accessibility": 0.79, "seo": 1, "best-practices": 0.23}
      }
    ]
  },
  // ...
]

Also, it would be nice to be able to use %%BRANCH_NAME%% inside the file pattern, so that it's clear which branch each report came from when sharing it outside of a PR context (discussing it in a meeting, etc.).

staceytay · 2020-05-12T09:54:12Z

Are there any updates on this?

I'm working on running Lighthouse on each app release using https://github.com/treosh/lighthouse-ci-action. I would like to have the contents of lhci-manifest.json as output for that GH Action (as suggested by @patrickhulce in treosh/lighthouse-ci-action#41 (comment)) so that I can then send out the representative scores on each run. It seems that this issue is blocking treosh/lighthouse-ci-action#41.

I don't mind creating a PR for this too, but I'm not very familiar with this project and might need some guidance if that's ok 🙂

patrickhulce · 2020-05-12T13:38:43Z

Thanks for the offer @staceytay! This is a slightly larger unscoped effort that has some interplay with breaking changes in #119 though, so I'm not sure it's a great candidate for external PR help at the moment.

FWIW, all of the data is available for the GH action to build this output themselves. If we just agree on an interface for the manifest then there's no need for this to block that issue.

How does everyone feel about...

interface EntrySummary {
  performance: number // all category scores on 0-1 scale
  accessibility: number
  'best-practices': number
  seo: number
  pwa: number
}

interface Entry {
  url: string // finalUrl of the run
  isRepresentativeRun: boolean // whether it was the median run for the URL
  jsonPath: string
  htmlPath: string
  summary: EntrySummary
}

// Manifest contains an array of entries, 1 for each run. 
// Let's say all are included by default, skipping over the `--include-all-reports` bit.
type Manifest = Array<Entry>

// Alternative, we group the entries by url
type Manifest = Record<string, Array<Entry>>
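As a rough illustration of how a consumer might use the array form, the sketch below picks the representative run for each URL and flattens its summary. It assumes a manifest that follows the `Entry` interface proposed above; `representativeScores` is a hypothetical helper.

```javascript
// Keep only the representative (median) run per URL and flatten its summary scores.
function representativeScores(manifest) {
  return manifest
    .filter((entry) => entry.isRepresentativeRun)
    .map((entry) => ({url: entry.url, htmlPath: entry.htmlPath, ...entry.summary}));
}

const sampleManifest = [
  {url: 'https://example.com/', isRepresentativeRun: false, jsonPath: './a.json', htmlPath: './a.html',
    summary: {performance: 0.5, accessibility: 0.79, 'best-practices': 0.23, seo: 1, pwa: 0.3}},
  {url: 'https://example.com/', isRepresentativeRun: true, jsonPath: './b.json', htmlPath: './b.html',
    summary: {performance: 0.52, accessibility: 0.79, 'best-practices': 0.23, seo: 1, pwa: 0.3}},
];
console.log(representativeScores(sampleManifest)[0].performance); // 0.52
```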

staceytay · 2020-05-13T08:03:57Z

@patrickhulce sure thing, thanks!

At least for the use case I'm thinking of, this interface works with either Manifest shape.

alekseykulikov · 2020-05-25T16:42:18Z

@patrickhulce +1 for the format as an array.

lighthouse-ci-action@v3 sets content from links.json and assertion-results.json as output to improve composition with other tools (treosh/lighthouse-ci-action@b2e4263).

It would be great to support something like reports.json on the LHCI level so that we could expose it directly.

It's possible to build this data manually, but it may differ from the future LHCI output, and the logic of isRepresentativeRun depends on internals and is not available in the .lighthouseci folder.

patrickhulce · 2020-05-25T17:53:32Z

> +1 for the format as an array.

Interesting, I would've expected a vote in favor of the map to do things like output.reports['https://example.com'][0].summary.performance without any .find. I'm not even sure how that works in actions output though tbh

We can do the array, SGTM :)
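For consumers who do prefer the map form, converting from the array is a few lines. This sketch assumes entries shaped like the `Entry` interface proposed earlier in the thread; `groupByUrl` is a made-up helper name.

```javascript
// Turn the array-shaped manifest into a map keyed by URL.
function groupByUrl(manifest) {
  const byUrl = {};
  for (const entry of manifest) {
    if (!byUrl[entry.url]) byUrl[entry.url] = [];
    byUrl[entry.url].push(entry);
  }
  return byUrl;
}

const entries = [
  {url: 'https://example.com', isRepresentativeRun: true, summary: {performance: 0.9}},
  {url: 'https://example.com/about', isRepresentativeRun: true, summary: {performance: 0.7}},
];

// Array form: a lookup needs .find
console.log(entries.find((e) => e.url === 'https://example.com').summary.performance); // 0.9
// Map form: direct access by URL
console.log(groupByUrl(entries)['https://example.com'][0].summary.performance); // 0.9
```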

alexfornuto · 2020-05-27T21:15:55Z

Unfortunately, #328 doesn't solve one of the main parts of this feature request: dictating the file naming. My job building CI processes to format this data is made harder by not having predictable and repeatable names for the data files.

But thanks for the improvement!

EDIT:
P.S.: Unless manifest.json solves that. Is that what your example JSON is showing?

patrickhulce · 2020-05-27T21:20:44Z

> dictating the file naming

That's what reportFilenamePattern is for.

--reportFilenamePattern [filesystem only] The pattern to use for naming Lighthouse reports.

manifest.json also lists all the reports and paths to them. I'm not sure how it can get any more configurable and predictable 😅

alexfornuto · 2020-05-27T21:33:16Z

Oh, excellent! --reportFilenamePattern wasn't in the PR description, and I hadn't yet dived in to the files changed.

👍

patrickhulce · 2020-05-27T21:33:20Z

Quick poll on the expected behavior if manifest.json already exists in the target folder, should LHCI...

🚀 overwrite it
😕 throw an error and bail

alexfornuto · 2020-05-27T21:37:08Z

What about:
🎶 each object includes a date key, and subsequent writes to manifest.json amended
?

patrickhulce · 2020-05-27T22:17:32Z

> each object includes a date key, and subsequent writes to manifest.json amended

I don't want to assume that manifest.json is owned by us or encourage the idea that you're keeping some sort of database of all runs on local filesystem. If that's the direction I think we should throw :)

alexfornuto · 2020-05-28T14:53:08Z

> I don't want to ... encourage the idea that you're keeping some sort of database of all runs on local filesystem. If that's the direction I think we should throw :)

For more context: I'm not keeping any data longer than a CI job. I know that this is against Lighthouse best practice or whatever, but I need to run the test a few times to pull out outliers and then average scores. The tests can be wildly unpredictable and inaccurate sometimes. I'm trying to mitigate that within the confines of a CI job reporting the effects of a code/content change from a single PR.

Having said all that, I've voted for 🚀 as my second choice.
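For context, the "pull out outliers, then average" step described above could look something like the sketch below. It is one simple interpretation (drop the single lowest and highest score), not how Lighthouse CI selects its representative run, and `trimmedMean` is a hypothetical name.

```javascript
// Drop the lowest and highest score, then average what remains.
function trimmedMean(scores) {
  if (scores.length < 3) {
    throw new Error('need at least three scores to trim both ends');
  }
  // Copy before sorting so the caller's array is not mutated.
  const sorted = [...scores].sort((a, b) => a - b);
  const kept = sorted.slice(1, -1);
  return kept.reduce((sum, value) => sum + value, 0) / kept.length;
}

console.log(trimmedMean([90, 20, 80, 85, 95])); // 85
```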

patrickhulce · 2020-05-28T15:26:12Z

Gotcha thanks for the extra context!

> I need to run the test a few times to pull out outliers and then average scores

I'm very interested to hear why increasing --numberOfRuns doesn't satisfy your use case. What you're describing is exactly what Lighthouse CI is trying to do for you already, so if there are missing features there we want to plug them. Best to discuss on a separate issue though.

alexfornuto · 2020-05-28T15:42:27Z

> I'm very interested to hear why increasing --numberOfRuns doesn't satisfy your use case.

I'll be honest, I haven't looked at my CI testing since December, when I made this issue. When I can look at it again, I'll look into --numberOfRuns and make another issue if need be. Thanks again!

patrickhulce added enhancement New feature or request P3 labels Dec 6, 2019

patrickhulce changed the title ~~Feature Request - Override Default File Naming~~ Upload target filesystem, Override Default File Naming Dec 6, 2019

patrickhulce changed the title ~~Upload target filesystem, Override Default File Naming~~ Upload target filesystem, Override File Naming Dec 6, 2019

patrickhulce added the feedback wanted label Dec 7, 2019

patrickhulce mentioned this issue Dec 26, 2019

Can I specify path for local reports? #163

Closed

pahan35 mentioned this issue Jan 29, 2020

Allow collect representative results on filesystem #199

Closed

patrickhulce added P2 and removed P3 labels Jan 29, 2020

patrickhulce mentioned this issue May 2, 2020

Make LH Scores available as output parameters treosh/lighthouse-ci-action#41

Open

patrickhulce mentioned this issue May 24, 2020

Feature: Enable different data storage flows #324

Open

patrickhulce mentioned this issue May 27, 2020

feat(cli): add filesystem upload target #328

Merged

patrickhulce closed this as completed in #328 May 27, 2020

patrickhulce mentioned this issue May 28, 2020

fix(cli): throw on existing manifest.json #330

Closed

ashishkshrivastava mentioned this issue Dec 4, 2020

Median report not generating in case of target as filesystem #503

Closed
