Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: update 'how' and 'glossary' #533

Merged
merged 2 commits into from
Mar 22, 2022

Conversation

umarcor
Copy link
Contributor

@umarcor umarcor commented Mar 17, 2022

No description provided.

umarcor added 2 commits March 22, 2022 14:59
Signed-off-by: Unai Martinez-Corral <[email protected]>
Signed-off-by: Unai Martinez-Corral <[email protected]>
@umarcor umarcor force-pushed the umarcor/docs/update branch from ddaf182 to c978d02 Compare March 22, 2022 14:00
@umarcor umarcor marked this pull request as ready for review March 22, 2022 18:02
@mgielda
Copy link
Member

mgielda commented Mar 22, 2022

LGTM, great work!

@kgugala kgugala merged commit f4b1cc3 into chipsalliance:main Mar 22, 2022
@kgugala kgugala deleted the umarcor/docs/update branch March 22, 2022 20:32
Copy link
Collaborator

@kboronski-ant kboronski-ant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to leave a post-merge review on this one, since I believe there are still some things to be improved within glossary.

  1. Step - It's an alternative name to stage from flow definitions, right? I have nothing against that name, it might be even better than stage, but I think we should use a single name for that to avoid confusion, so either we keep it as stage or change the python code and flow definitions to refer to it as step. Also, I'm not sure what do you mean by saying that Steps might wrap a single or multiple tools. In f4pga (sfbuild) step (stage) is just an instance of a module.
  2. Target - What's there is correct but it would be helpful to specify that a target is in fact just a symbolic name given to an artifact(s) produced by some step, so there's a clear relationship between steps and targets. The point is also a bit misleading in a sense that it suggests that targets are used only for leaf steps.
  3. Resolution - Personally I'd try using a bit less technical term than topological sorting. Also, you put a link to a wikipedia page for "hash", but left this one with no explanation. Regarding the part "f4pga checks the existence of the assets" - maybe change "existence" to "availability". I know it's bit less specific, but that's the point. An asset might exist but be not up-to-date for example. It's only a minor detail tho and if you feel like it's easier to understand the paragraph with "existence", then leave it as it is.
  4. Module - the name is a bit unfortunate. While the module loader part of f4pga indeed uses the python modules mechanism under the hood in order to load up extra code at runtime, f4pga modules are in fact just classes that share a common interface. This isn't really explained here.
  5. Dependency - tools are not dependencies. I'm not sure what's the difference between an artifact and a file. Dependencies are always defined as a single file (eg. net) or a set (list) of files (eg. sources) and have a symbolic name. I guess an extra term that would unify target and dependency could be a good idea, because they are essentially the same thing, just presented in a different context. So far I've been using "artifact" as such term.
  6. Configuration - it would be nice to mention that configuration is basically a mechanism for overriding dependencies/values defined in flow definitions.
  7. Definition - might be worth noting that definitions may come with some default values. Also I've used the term "(platform) flow definition" before to make it more specific and avoid confusion with other terms that have "definition" in them, such as "architecture definition". I'm not sure if this will really be a problem tho.
  8. Missing entries - since we have an entry for a dependency, we may also want to have one for a value, since the play similar roles, yet there are some nuances that make them different (ie.: values are typically not file paths, aren't checked for presence on persistent storage, allow more complex structures (eg. dicts) and are not considered to be targets).

There's a bit of a mishmash of terms in there. Some of the terms are related strictly to f4pga (sfbuild), while others relate to other things. The inclusion of Tool is especially confusing. It mentions a python interface, but this interface as far as I get it has nothing to do with f4pga (sfbuild) and describes wrappers from #534, right? The proper python interface for f4pga (sfbuild) is Module, but Tool gets mentioned in a couple of paragraphs specific to f4pga (sfbuild). I think it would be good to separate terms that are universal. like CLI or Constraints, from the ones that are tool-specific, like Configuration or Module.

The inclusion of those little graphics is really nice and I appreciate how they show the structure of flow and step. I think at some point we should make a bigger illustration that shows a resolved graph for some example flow including values that come from flow definition as well as config, but that's really an extra effort on top of an already big one taken for documenting that tool.

@umarcor
Copy link
Contributor Author

umarcor commented Mar 24, 2022

I'm going to leave a post-merge review on this one, since I believe there are still some things to be improved within glossary.

Thanks ❤️

  1. Step - It's an alternative name to stage from flow definitions, right? I have nothing against that name, it might be even better than stage, but I think we should use a single name for that to avoid confusion, so either we keep it as stage or change the python code and flow definitions to refer to it as step.

Essentially, Step and Stage are interchangeable in a linear flow. In that sense, I don't have any strong preference, just pick one and stick with it.
However, when talking about a graph such as the one shown in Flow, the Stage might have a different meaning:

image

See, for example https://docs.gitlab.com/ee/ci/pipelines/:

Pipelines comprise:

  • Jobs, which define what to do. For example, jobs that compile or test code.
  • Stages, which define when to run the jobs. For example, stages that run tests after stages that compile the code.

Note that the terminology of GitLab and GitHub is slightly different:

F4PGA GitHub GitLab
Flow Workflow Pipeline
- (implicit) Stage
- Job Job
Step Step Script

Naturally, we are talking about very different use cases (CI vs a CLI tool), but the fundamental concepts about dealing with a flow are the same.
In fact, I did not add the "stage" or "job" terms in the glossary, in order to stay away from the CI terminology.

What are your thoughts from the point of view of sfbuild's current and future features? Are "stages" supported explicitly or implicitly?

Also, I'm not sure what do you mean by saying that Steps might wrap a single or multiple tools. In f4pga (sfbuild) step (stage) is just an instance of a module.

The assumption is that Tool is a 1-1 match to an executable binary or a shared library. Then, Step might involve a single call through say subprocess.check_call, or it might involve more than one.

From a technical point of view, a step is a function call. Different projects have various plumbings around (it might be a class, an inherited class, plain functions...), but in all cases there is one specific function call (higher or lower in the abstraction) related to "executing one step". That's a natural outcome, since most FPGA tooling wrappers are focused on interacting with CLIs from Python (https://edaa-org.github.io/workflows/index.html).

In many cases, one step involves a single execution of a CLI tool. However, some tools require more than one CLI call to achieve a task. For instance, it might be necessary to run a setup/configuration command before actually running the tool. When running simulations, it is common to first analyse and elaborate the sources and then execute them. That can be done through up to a dozen commands, depending on the tool and size of the project.

From a codebase complexity and maintenance burden point of view, there are cases which favour each CLI call being an independent Step; however, in other cases it is overkill. It mostly comes down to the reusability of the individual calls across "steps"/"jobs".

  1. Target - What's there is correct but it would be helpful to specify that a target is in fact just a symbolic name given to an artifact(s) produced by some step, so there's a clear relationship between steps and targets. The point is also a bit misleading in a sense that it suggests that targets are used only for leaf steps.

First off, I agree that the wording about leaf steps is misleading. That was a poor choice on my side. Actually, in dbhi/run I used the following syntax, which allowed executing any node (step) in the graph as a target:

Returns an ordered list of tasks/jobs required to execute the given target NODE. The > target can be any leaf or mid vertex. The optional argument FILTER allows to > filter the list to include only a subset of the tasks in the subgraphs corresponding > to the node. It can be either of:

  • >FNODE jobs that allow build FNODE.
  • FNODE> jobs that depend on FNODE.
  • >FNODE> jobs that allow to build FNODE and those that depend on it.

With regard to the relationship between target, step and artifact, I tried to keep it vague and not include the term artifact in the definition of target. The reason is that artifacts are explicitly related to some step. At the same time, it is true that whatever target the user asks for, it will affect which steps to execute and (implicitly) which artifacts are available. However, I feel we do not need to constraint here what the API looks like (whether the CLI argument is an artifact name, or it's an step name, or it's one of them and the other given as an option...). That's something we might want to include/add through a note/hint as part of #530.

Overall, I think the important idea about target is: the user asks what to do in a run (based on the multiple possibilities available for a given project + platform).

  1. Resolution - Personally I'd try using a bit less technical term than topological sorting. Also, you put a link to a wikipedia page for "hash", but left this one with no explanation. Regarding the part "f4pga checks the existence of the assets" - maybe change "existence" to "availability". I know it's bit less specific, but that's the point. An asset might exist but be not up-to-date for example. It's only a minor detail tho and if you feel like it's easier to understand the paragraph with "existence", then leave it as it is.

Please, propose a modification of this term. When writing I was not really sure about the level of "resolution". Maybe "availability and state" is more accurate?

  1. Module - the name is a bit unfortunate. While the module loader part of f4pga indeed uses the python modules mechanism under the hood in order to load up extra code at runtime, f4pga modules are in fact just classes that share a common interface. This isn't really explained here.

The important underlying idea here is that Python is an interpreted language and we expect users to extend the f4pga utilities without being forced to upstream their changes before they can use their custom functionality.

Apart from f4pga loading some python modules and modules being classes inheriting a common interface, other projects might wrap f4pga as one step of their larger flows. In fact, projects wrapping f4pga might extend it without loading python modules explicitly, but rather importing them. Therefore, I tried to avoid these details/variations, which might change as f4pga evolves.
Nonetheless, we might want to include/add it through a note/hint as part of #530.

  1. Dependency - tools are not dependencies. I'm not sure what's the difference between an artifact and a file. Dependencies are always defined as a single file (eg. net) or a set (list) of files (eg. sources) and have a symbolic name.

By definition, if a certain step is wrapping a tool, it depends on the tool being available in order to execute it. Whether this is explicit or implicit, or whether f4pga actually does anything to check it are different questions. Tool dependencies can be implicit and it can be assumed that they are available. However, when using e.g. containers, checking whether the corresponding container (and version) is available might be handled similarly to checking whether a "file dependency" exists.

With regard to the difference between artifact and files, an artifact is "some data/information" regardless on the representation, while a file is an entry in the filesystem typically with a "known" format. Particularly, when dealing with very verbose tools, logs might be piped through stdout and post-processed online to extract the meaningful data. Saving the full log to a file might be optional. Saving the meaningful data to a file might also be optional, since it can be passed to a later step as a Python object (a reference in memory).

I guess an extra term that would unify target and dependency could be a good idea, because they are essentially the same thing, just presented in a different context. So far I've been using "artifact" as such term.

I'm not sure I understand. Conceptually, target would be closer to artifact than dependency, isn't it?

  1. Configuration - it would be nice to mention that configuration is basically a mechanism for overriding dependencies/values defined in flow definitions.

I find this to be a (nice) implementation detail, more related to the "platform definition" and "project definition". In this term, I wanted to explain the general idea of "given a set of parameters that can be tweaked, a configuration is one specific combination of values". Whether default values exist (which is the case in sfbuild) or the specific procedure (CLI, config file, environment variables) might change/evolve.

FTR, this term is brought from VUnit. There, multiple sets of top-level generic values can be generated and each combination is named a configuration. Similarly, in the VHDL language, a configuration specifies the values to be used for certain parametrized features in the design.

  1. Definition - might be worth noting that definitions may come with some default values.

Agree, as part of #530.

Also I've used the term "(platform) flow definition" before to make it more specific and avoid confusion with other terms that have "definition" in them, such as "architecture definition". I'm not sure if this will really be a problem tho.

Actually, "architecture definition" should be added here, because it's another usage of definition. The glossary serves for disambiguation as well.

  1. Missing entries - since we have an entry for a dependency, we may also want to have one for a value, since the play similar roles, yet there are some nuances that make them different (ie.: values are typically not file paths, aren't checked for presence on persistent storage, allow more complex structures (eg. dicts) and are not considered to be targets).

I had overlooked value! By looking at https://f4pga--530.org.readthedocs.build/en/530/f4pga/Usage.html#built-in-values, python3 is what I'd consider "tool dependency" (related to the point above). With regard to others, I guess those are directory dependencies. Conceptually, files, directories and tools are "things that need to exist in the filesystem", the difference is which functions are used for checking/creating/removing them.

IMHO, whether to do checks or handle them raw is an orthogonal feature. Similarly, whether a dependency is global, shared or local is another orthogonal characteristic. So, from a technical point of view:

  • Dependency type: file, directory, binary, shared lib, container, ...
  • Dependency scope: step, shared, global.
  • Dependency checks: assuming that a type is a class, and all the dependency classes inherit from some common interface, the checks are functions in the common interface.

Please, let me know if I'm misunderstanding the meaning/usage of value in sfbuild.

There's a bit of a mishmash of terms in there. Some of the terms are related strictly to f4pga (sfbuild), while others relate to other things. The inclusion of Tool is especially confusing. It mentions a python interface, but this interface as far as I get it has nothing to do with f4pga (sfbuild) and describes wrappers from #534, right? The proper python interface for f4pga (sfbuild) is Module, but Tool gets mentioned in a couple of paragraphs specific to f4pga (sfbuild).

In sfbuild, Module == Tool (Interface). For instance, in common_modules/fasm.py the "Python Interface" is the set of functions of class FasmModule: map_io and execute. And the "CLI Tool" (which is called through a subprocess) is genfasm. Similarly, in common_modules/synth.py, the "Python Interface" is yosys_setup_tcl_env, yosys_synth, yosys_conv, etc. and the "CLI Tool" is yosys. Actually, the execute function is an example of "steps might wrap a single or multiple tools" (above). More precisely, "a Python Interface might wrap multiple CLI tools", since Module == Interface == Stage == Step.

As far as I understand, in sfbuild there is no conceptual distinction between a Tool Interface and an Step/Stage Interface (both being Python interfaces). The purpose is not to provide a Python API for VPR, Yosys, FASM, etc. per se, but the intended API is to provide higher abstraction commands, such as synth, place, route, etc. Hence, the fact that synth wraps yosys only and fasm wraps genfasm only is "incidental". In the future, synth might wrap "other synthesis tools". Looking into interchange, synthesizing with Vivado and doing P&R with VPR might be possible. In that context, the distinction between Tool and Step will be more obvious:

  • The Vivado Tool Interface might provide Python access to various features (i.e. RapidWright?), including synthesis, PnR, simulation, etc.
  • The Yosys Tool Interface might provide Python access to various Yosys features.
  • The Synth Step might consume (import) both of them, depending on the target in a given run.

Therefore, regardless of sfbuild having separated files for each tool, or gathering them by functionality, the distinction between Tool and Step persists: tools are defined externally and we can only interact with them through the existing APIs, while steps are "virtual" tools that we can shape freely (including making them "polymorphic").

I think it would be good to separate terms that are universal. like CLI or Constraints, from the ones that are tool-specific, like Configuration or Module.

We can look into making sections, however, that somehow defeats the purpose of the glossary (which is ordered alphabetically). The intended usage is the other way: in the documentation of sfbuild/f4pga (#530) we should use :term: (in rst) or {term} (in md) to create explicit references to the glossary. Overall, users are not expected to read the glossary from the beginning to the end. They are expected to reach there looking for some specific term they found somewhere.

The inclusion of those little graphics is really nice and I appreciate how they show the structure of flow and step. I think at some point we should make a bigger illustration that shows a resolved graph for some example flow including values that come from flow definition as well as config, but that's really an extra effort on top of an already big one taken for documenting that tool.

For some more context about dbhi/run, I did it as part of dbhi.github.io/concept/run. That is a graphviz dot diagram (you can turn the "Show code" switch on) defining multiple dependencies, steps and artifacts. Then, in the dropdown list, you can select one specific target, and it will filter the nodes. Each of those graphs was generated with run (using the syntax explained above).

run uses gonum to handle the graphs and the import/export from/to graphviz. However, there are similar libraries in Python, such as NetworkX. Therefore, I believe once sfbuild is stabilized it should be relatively straightforward to export the flows (including the resolutions, i.e. state) into a dot/SVG that users can include in the docs and/or browse interactively. For instance, the diagrams in https://hdl.github.io/containers/dev/Graphs.html are generated from dot files: https://github.com/hdl/containers/blob/main/doc/dev/Graphs.rst.

On the other hand, we might want to include some diagrams somewhere between https://f4pga.readthedocs.io/en/latest/how.html and hdl/awesome#98, which shows the main flows in F4PGA: HDL (Verilog, SV, VHDL,...) to AMD/Xilinx 7 Series bitstream and HDL to QuickLogic bitstream.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants