Commit

Merge pull request #3548 from vespa-engine/kkraune/auto-fixes
Kkraune/auto fixes
kkraune authored Dec 20, 2024
2 parents 0a46fea + 5dcd926 commit a3aaaa9
Showing 82 changed files with 324 additions and 337 deletions.
4 changes: 2 additions & 2 deletions en/access-logging.html
Original file line number Diff line number Diff line change
@@ -39,8 +39,8 @@ <h2 id="access-log-format">Vespa Access Log Format</h2>
<tr><td>port</td><td>number</td><td>The IP port number of the interface on which the request was received</td><td>yes</td></tr>
<tr><td>remoteaddr</td><td>string</td><td>The IP address of the <a href="#logging-remote-address-port">remote client</a> if specified in HTTP header</td><td>no</td></tr>
<tr><td>remoteport</td><td>string</td><td>The port used from the <a href="#logging-remote-address-port">remote client</a> if specified in HTTP header </td><td>no</td></tr>
<tr><td>peeraddr</td><td>string</td><td>Address of immediate client making request if different than <em>remoteaddr</em></td><td>no</td></tr>
<tr><td>peerport</td><td>string</td><td>Port used by immediate client making request if different than <em>remoteport</em></td><td>no</td></tr>
<tr><td>peeraddr</td><td>string</td><td>Address of immediate client making request if different from <em>remoteaddr</em></td><td>no</td></tr>
<tr><td>peerport</td><td>string</td><td>Port used by immediate client making request if different from <em>remoteport</em></td><td>no</td></tr>
<tr><td>user-principal</td><td>string</td><td>The name of the authenticated user (java.security.Principal.getName()) if principal is set</td><td>no</td></tr>
<tr><td>ssl-principal</td><td>string</td><td>The name of the x500 principal if client is authenticated through SSL/TLS</td><td>no</td></tr>
<tr><td>search</td><td>object</td><td>Object holding search specific fields</td><td>no</td></tr>
2 changes: 1 addition & 1 deletion en/attributes.html
@@ -334,7 +334,7 @@ <h2 id="attribute-memory-usage">Attribute memory usage</h2>
and up to 2 billion documents per node is supported.
</p>
<p>
<strong>Pro tip:</strong> The proton <em>/state/v1/</em> interface can be explored for attribute memory usage.
<strong>Pro-tip:</strong> The proton <em>/state/v1/</em> interface can be explored for attribute memory usage.
This is an undocumented debug-interface, subject to change at any moment - example:
<em>http://localhost:19110/state/v1/custom/component/documentdb/music/subdb/ready/attribute/artist</em>
</p>
4 changes: 2 additions & 2 deletions en/binarizing-vectors.md
@@ -84,7 +84,7 @@ any number > 0 will map to 1 - this threshold is configurable.
* `tensor<float>(x[8])` is 8 x sizeof(float) = 8 x 32 bits = 256 bits = 32 bytes
* `tensor<int8>(x[1])` is 1 x sizeof(int8) = 1 x 8 bits = 8 bits = 1 byte

In other words, a compression factor of 32, which is expected, mapping a 32 bit float into 1 bit.
In other words, a compression factor of 32, which is expected, mapping a 32-bit float into 1 bit.

As memory usage often is the cost driver for applications, this has huge potential.
However, there is a loss of precision, so the tradeoff must be evaluated.
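The sizes quoted in this hunk can be checked with a short sketch. It assumes Python's standard `struct` module for measuring byte sizes, a threshold of 0 as in the text, and most-significant-bit-first packing. The helper name `binarize` and the bit order are illustrative assumptions, not taken from the Vespa documentation.

```python
import struct


def binarize(values, threshold=0.0):
    """Pack floats into signed int8 values, 8 bits per byte.

    Assumption: the first value lands in the most significant bit.
    """
    if len(values) % 8 != 0:
        raise ValueError("length must be a multiple of 8")
    packed = []
    for i in range(0, len(values), 8):
        byte = 0
        for v in values[i:i + 8]:
            # Any value above the threshold maps to 1, everything else to 0.
            byte = (byte << 1) | (1 if v > threshold else 0)
        # Reinterpret the unsigned byte as a signed int8.
        packed.append(byte - 256 if byte > 127 else byte)
    return packed


floats = [0.5, -1.0, 2.0, 0.0, 3.5, -0.1, 0.0, 1.0]
float_bytes = len(struct.pack(f"{len(floats)}f", *floats))  # 8 x 4 bytes
bits = binarize(floats)
bit_bytes = len(struct.pack(f"{len(bits)}b", *bits))  # 1 x 1 byte
print(float_bytes, bit_bytes, float_bytes // bit_bytes)
```

With these eight sample values, the 32-byte float vector packs into a single signed byte, matching the compression factor of 32 stated above.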
@@ -432,7 +432,7 @@ rank-profile app_ranking_bin_full {
Notes:
* The first-phase ranking is as the binarized query above.
* The second-phase ranking is using the full-precision query vector query(q)
with a bit-precision vector casted to float for type match.
with a bit-precision vector cast to float for type match.
* Both query vectors must be supplied in the query.

Note the differences when using full values in the query tensor, see the relevance score for the results:
2 changes: 1 addition & 1 deletion en/build-install-vespa.html
@@ -6,7 +6,7 @@
<p>
To develop with Vespa, follow the
<a href="https://github.com/vespa-engine/vespa#building">guide</a>
to setup a development environment on AlmaLinux 8 using Docker.
to set up a development environment on AlmaLinux 8 using Docker.
</p><p>
Build Vespa Java artifacts with Java &gt;= 17 and Maven &gt;= 3.6.3.
Once built, Vespa Java artifacts are ready to be used and one can build a Vespa application
9 changes: 4 additions & 5 deletions en/components/bundles.html
@@ -145,7 +145,7 @@ <h2 id="depending-on-non-osgi-ready-libraries">Depending on non-OSGi ready libra
</p>
<p>
Although this approach works for most non-OSGi libraries, it only works for
libraries where the jar file is <em>self contained</em>. If, on the other hand, the
libraries where the jar file is <em>self-contained</em>. If, on the other hand, the
library depends on other installed files, it must be treated as if it was a
<a href="#depending-on-JNI-libraries">JNI library</a>.
</p>
@@ -330,8 +330,7 @@ <h3 id="including-third-party-libraries">Including third-party libraries</h3>
</p>
<p>
If the external dependency is packaged as an OSGi bundle, it can be deployed
as-is by setting the scope to
<em>provided</em>:
as-is by setting the scope to <em>provided</em>:
</p>
<pre>
&lt;dependency&gt;
@@ -505,7 +504,7 @@ <h3 id="configuring-the-bundle-plugin">Configuring the Bundle-Plugin</h3>
<h3 id="bundle-plugin-troubleshooting">Bundle Plugin Troubleshooting</h3>
<!-- ToDo: Consider moving this to the troubleshooting section -->
<p>
A package <em>p</em> is imported if all of this holds:
A package <em>p</em> is imported if all of this hold:
</p>
<ol>
<li>Using a class in <em>p</em> directly (i.e. not with reflection) in the
@@ -609,7 +608,7 @@ <h3 id="could-not-load-class">Could not load class</h3>
If a component is added to services.xml, and its class cannot be found in the declared bundle,
the container will fail to start. For example:
</p>
{% highlight xml %}
<pre>{% highlight xml %}
<component id="com.example.MissingClass" bundle="my-bundle" />
{% endhighlight %}</pre>
<p>
2 changes: 1 addition & 1 deletion en/concrete-documents.html
@@ -169,7 +169,7 @@ <h2 id="factory-and-copy-constructor">Factory and copy constructor</h2>
Book book = new Book(bookGeneric, bookGeneric.getId());
</pre>
<p>
All of the accessor and mutator methods on <code>Document</code> will work as expected on concrete types.
All the accessor and mutator methods on <code>Document</code> will work as expected on concrete types.
Note that <code>getFieldValue()</code> will <em>generate</em> an
ad-hoc <code>FieldValue</code> <em>every time</em>,
since concrete types don't use them to store data.
6 changes: 3 additions & 3 deletions en/configuring-components.html
@@ -210,13 +210,13 @@ <h2 id="adding-files-to-the-component-configuration">Adding files to the compone
The <code>myFile()</code> and <code>myModel()</code> getter returns a <code>java.nio.Path</code> object,
while the <code>myUrl()</code> getter returns a <code>java.io.File</code> object.
The container framework guarantees that these files are fully present at the given location before the component
constructor is invoked so they can always be accessed right away.
constructor is invoked, so they can always be accessed right away.
</p>
<p>
When the client asks for config that uses the <code>url</code> or <code>model</code> config
type with an URL, the content will be downloaded and cached on the nodes that need it. If
type with a URL, the content will be downloaded and cached on the nodes that need it. If
you want to change the content, the application package needs to be updated with a new URL
for the changed content and the application <a href="application-packages.html">deployed</a>,
otherwise the cached content will still be used. This avoids unintended changes to the
application without any change to it if the content of an URL changes.
application without any change to it if the content of a URL changes.
</p>
2 changes: 1 addition & 1 deletion en/content/buckets.html
@@ -12,7 +12,7 @@
</p><p>
Documents have string identifiers that maps to a 58 bit numeric location.
A bucket is defined as all the documents that shares a given amount
of least significant bits within the location.
of the least significant bits within the location.
The amount of bits used controls how many buckets will exist.
For instance, if a bucket contains all documents whose 8 LSB bits is 0x01,
the bucket can be split in two by using the 9th bit in the location to split them.
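A minimal sketch of the splitting rule described in this hunk, assuming the location is available as a plain integer. The function name `bucket_key` and the sample locations are hypothetical and chosen only for illustration.

```python
def bucket_key(location: int, used_bits: int) -> int:
    """Return the used_bits least significant bits of a document location."""
    return location & ((1 << used_bits) - 1)


# Two hypothetical locations whose 8 least significant bits are both 0x01.
loc_a = 0x001
loc_b = 0x101

# With 8 bits in use, both locations fall into the same bucket.
same_bucket = bucket_key(loc_a, 8) == bucket_key(loc_b, 8)
# Taking the 9th bit into account separates them into two buckets.
split_apart = bucket_key(loc_a, 9) != bucket_key(loc_b, 9)
print(same_bucket, split_apart)
```

Each additional bit doubles the number of possible buckets, which is why the number of bits in use controls how many buckets exist.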
2 changes: 1 addition & 1 deletion en/content/idealstate.html
@@ -224,7 +224,7 @@ <h3 id="distribution-skew">Distribution skew</h3>
</p><p>
The distribution to distributors are done to create an even distribution between the nodes.
The distributors are free to split the buckets further if the backend wants buckets to contain less data.
They can not use less buckets than are needed for distribution though.
They can not use fewer buckets than are needed for distribution though.
By using a minimum amount of buckets for distribution,
the distributors have more freedom to control sizes of buckets.
</p>
2 changes: 1 addition & 1 deletion en/contributing.md
@@ -10,7 +10,7 @@ This documents tells you what you need to know to contribute.

## Open development
All work on Vespa happens directly on GitHub,
using the [Github flow model](https://docs.github.com/en/get-started/quickstart/github-flow).
using the [GitHub flow model](https://docs.github.com/en/get-started/quickstart/github-flow).
We release the master branch a few times a week, and you should expect it to almost always work.
In addition to the [public Screwdriver build](https://cd.screwdriver.cd/pipelines/6386)
we have a large acceptance and performance test suite which
2 changes: 1 addition & 1 deletion en/contributing/cloudconfig-model-plugins.html
@@ -113,7 +113,7 @@ <h3 id="example">Example</h3>
<li>A search+content cluster storing and searching a partition</li>
</ul>
<p>
In addition it sets up the wiring between these clusters,
In addition, it sets up the wiring between these clusters,
the matching configuration of content and query processing and more.
</p><p>
Going from the simple user-facing config shown above to this complete system specification
4 changes: 2 additions & 2 deletions en/contributing/configapi-dev-cpp.html
@@ -49,7 +49,7 @@ <h2 id="subscribing-and-getting-config">Subscribing and Getting Config</h2>
To use the API, you must include the header of the generated classes
as well as the header shown in the example below.
The config API resides in the <code>config</code> namespace:
<pre>{% highlight xml %}
<pre>{% highlight cpp %}
#include <config/config.h>

using namespace config;
@@ -176,7 +176,7 @@ <h2 id="unit-testing">Unit Testing</h2>
{% include note.html content="When using builders for unit testing,
there is an underlying assumption that the configured application have subscribed to all configs
before the builders are mutated.
Otherwise, the application may try to retrieve a inconsistent configuration.
Otherwise, the application may try to retrieve an inconsistent configuration.
In general, try to design the application so that one can verify configuration changes in tests."%}


2 changes: 1 addition & 1 deletion en/contributing/configapi-dev-java.html
@@ -58,7 +58,7 @@ <h2 id="subscribing-and-getting-config">Subscribing and getting config</h2>
A <code>ConfigSubscriber</code> is capable of subscribing to one or more configs.
The example shown here uses simplified error handling:
</p>
<pre>{% highlight xml %}
<pre>{% highlight java %}
ConfigSubscriber subscriber = new ConfigSubscriber();
ConfigHandle<MotdConfig> handle = subscriber.subscribe(MotdConfig.class, "motdserver2/0");
if (!subscriber.nextConfig()) throw new RuntimeException("Config timed out.");
2 changes: 1 addition & 1 deletion en/contributing/configapi-dev.html
@@ -430,7 +430,7 @@ <h3 id="guidelines-tips">Advice on Config Modelling</h3>
by ensuring that all of the configs comes from the same config reload.
</p><p>
<strong>Tip: </strong>
Setup your entire <em>tree</em> of configs in one thread to ensure consistency,
Set up your entire <em>tree</em> of configs in one thread to ensure consistency,
and configure your system once all of the configs have arrived.
This also maps best to the ConfigSubscriber, since it is not thread safe.
</p>
7 changes: 3 additions & 4 deletions en/cpu-support.html
@@ -24,12 +24,11 @@
</p>

<p><strong>To start a Vespa Docker container using this image:</strong></p>
<div class="pre-parent">
<button class="d-icon d-duplicate pre-copy-button" onclick="copyPreContent(this)"></button>
<div class="pre-parent">
<button class="d-icon d-duplicate pre-copy-button" onclick="copyPreContent(this)"></button>
<pre data-test="exec">
$ docker run --detach --name vespa --hostname vespa-container \
--publish 8080:8080 --publish 19071:19071 \
vespaengine/vespa-generic-intel-x86_64
</pre>
</div>
</p>
</div>
2 changes: 1 addition & 1 deletion en/document-processing.html
@@ -135,7 +135,7 @@ <h2 id="document-processors">Document Processors</h2>
<p>
The <code>process()</code> method should loop through all
document operations in <code>Processing.getDocumentOperations()</code>, do
whatever it sees fit to them, and return a Progress:
whatever it sees fit to them, and return a <code>Progress</code>:
</p>
<pre>{% highlight java %}
public Progress process(Processing processing) {
2 changes: 1 addition & 1 deletion en/document-summaries.html
@@ -247,7 +247,7 @@ <h2 id="performance">Performance</h2>
and finally serialization to JSON (default rendering) + rendering and network.</li>
</ul>
<p>
The work, and thus latency increases with more <a href="reference/query-api-reference.html#hits">hits</a>.
The work, and thus latency, increases with more <a href="reference/query-api-reference.html#hits">hits</a>.
Use <a href="query-api.html#query-tracing">query tracing</a> to analyze performance.
</p>
<p>
2 changes: 1 addition & 1 deletion en/document-v1-api-guide.html
@@ -325,7 +325,7 @@ <h3 id="visiting-throughput">Visiting throughput</h3>

<h2 id="getting-started">Getting started</h2>
<p>
Pro tip: It is easy to generate a <code>/document/v1</code> request by using the <a href="vespa-cli.html">Vespa CLI</a>,
Pro-tip: It is easy to generate a <code>/document/v1</code> request by using the <a href="vespa-cli.html">Vespa CLI</a>,
with the <code>-v</code> option to output a generated <code>/document/v1</code> request - example:
</p>
<pre>
2 changes: 2 additions & 0 deletions en/documents.html
@@ -110,11 +110,13 @@ <h3 id="docid-in-results">Document IDs in search results</h3>
It is therefore recommended to put your own unique identifier
(usually the "user-specified-identifier" above) in a document field,
typically named "myid" or "shortid" or similar:
</p>
<pre>
field shortid type string {
indexing: attribute | summary
}
</pre>
<p>
This enables using a
<a href="document-summaries.html">document-summary</a> with only
in-memory fields while still getting the identifier you actually
25 changes: 13 additions & 12 deletions en/embedding.html
@@ -65,7 +65,7 @@ <h2 id="embedding-a-query-text">Embedding a query text</h2>
<p>Both single and double quotes are permitted, and if you have only configured a single embedder,
you can skip the embedder id argument and the quotes.</p>

<p>The text argument can be supplied by a referenced parameter instead, using the <code>@parameter</code> syntax:
<p>The text argument can be supplied by a referenced parameter instead, using the <code>@parameter</code> syntax:</p>
<pre>{% highlight json %}
{
"yql": "select * from doc where {targetHits:10}nearestNeighbor(embedding_field, query_embedding)",
@@ -74,8 +74,6 @@ <h2 id="embedding-a-query-text">Embedding a query text</h2>
}
{% endhighlight %}</pre>

</p>

<p>Remember that regardless of whether you are using embedders, input tensors
must always be <a href="reference/schema-reference.html#inputs">defined in the schema's rank-profile</a>.</p>

@@ -425,7 +423,8 @@ <h3 id="splade-embedder">SPLADE embedder</h3>
</p>
<p>
The embedder destination tensor is defined in the <a href="schemas.html">schema</a>.
The following demonstrates how to use the SPLADE embedder in the document schema to <a href="#embedding-a-document-field">embed a document field</a>.</p>
The following demonstrates how to use the SPLADE embedder in the document schema to
<a href="#embedding-a-document-field">embed a document field</a>.
</p>

<pre>
@@ -459,7 +458,7 @@ <h3 id="splade-embedder">SPLADE embedder</h3>

<h4 id="splade-ranking">SPLADE ranking</h4>
<p>
See the splade <a href="https://github.com/vespa-engine/sample-apps/tree/master/splade">splade</a> sample application for how to use SPLADE in ranking,
See the <a href="https://github.com/vespa-engine/sample-apps/tree/master/splade">splade</a> sample application for how to use SPLADE in ranking,
including also how to use the SPLADE embedder with an array of strings (representing chunks).
</p>

@@ -660,12 +659,12 @@ <h3 id="model-download-failure">Model download failure</h3>
</p>

<p>This will also be visible in the Vespa status output as the container will not listen to its port:</p>
<pre>
<pre>
vespa status -t http://127.0.0.1:8080
Container at http://127.0.0.1:8080 is not ready: unhealthy container at http://127.0.0.1:8080/status.html: Get "http://127.0.0.1:8080/status.html": EOF
Error: services not ready: http://127.0.0.1:8080
</pre>
</p>
</pre>


<h3 id="tensor-shape-mismatch">Tensor shape mismatch</h3>
<p>
@@ -688,8 +687,9 @@ <h3 id="tensor-shape-mismatch">Tensor shape mismatch</h3>
</p>

<h3 id="input-names">Input names</h3>
<p>The native embedder implementations expect that the ONNX model accepts certain input names. If the names are incorrect, it will cause the Vespa container service to not start and
you will see an error message in the vespa log like:</p>
<p>The native embedder implementations expect that the ONNX model accepts certain input names.
If the names are incorrect, it will cause the Vespa container service to not start,
and you will see an error message in the vespa log like:</p>
<pre>
WARNING container Container.com.yahoo.container.di.Container
Caused by: java.lang.IllegalArgumentException: Model does not contain required input: 'input_ids'. Model contains: my_input
@@ -710,8 +710,9 @@ <h3 id="input-names">Input names</h3>
{% endhighlight %}</pre>

<h3 id="output-names">Output names</h3>
<p>The native embedder implementations expect that the ONNX model produces certain output names. It will cause the Vespa stateless container service to not start and
you will see an error message in the vespa log like:</p>
<p>The native embedder implementations expect that the ONNX model produces certain output names.
It will cause the Vespa stateless container service to not start,
and you will see an error message in the vespa log like:</p>
<pre>
Model does not contain required output: 'test'. Model contains: last_hidden_state
</pre>
2 changes: 1 addition & 1 deletion en/features.html
@@ -88,7 +88,7 @@ <h3 id="ranking-and-inference">Ranking and inference</h3>

<ul>
<li>All results are ranked using a configured ranking function, selected in the query.
<li>A ranking function may be any mathematical function over scalars or tensors (multi-dimensional arrays).
<li>A ranking function may be any mathematical function over scalars or tensors (multidimensional arrays).
<li>Scalar functions include an "if" function to express business logic and decision trees.
<li>Tensor functions include a powerful set of primitives and composite functions which
allows expression of advanced machine-learned ranking functions such as e.g. deep neural nets.
2 changes: 1 addition & 1 deletion en/geo-search.html
@@ -281,7 +281,7 @@ <h3 id="query-syntax">Query Syntax</h3>
This document does not describe how to write a searcher plugin for the Container,
refer to the <a href="searcher-development.html">container documentation</a>.
However, let us review the syntax expected by <em>DistanceToPath</em>.
As noted in the the <a href="reference/rank-features.html#distanceToPath(name).distance">
As noted in the <a href="reference/rank-features.html#distanceToPath(name).distance">
rank features reference</a>,
the path is supplied as a query parameter by name of the feature and the <code>path</code> keyword:
</p>