Improves user documentation intro (#1376)
Closes #1369 

### Changes
- Adds improved getting started steps and intro contact information to
the User Guide homepage
- Adds a small section about the execution minutes graph for orgs with a
quota set
- Moves existing signup content to a dedicated signup page
- Changes admonitions from using em dashes to using colons.
- Em dashes are great and I love em.... But sometimes I love them a
little _too_ much and they were a bad fit here.
- Fixes user guide homepage link
- Fixes `ReplayWeb.page` and `ArchiveWeb.page` names
- Fixes broken links (would be good to have a CI system for this I
think)

---------
Co-authored-by: Emma Segal-Grossman <[email protected]>
Co-authored-by: Tessa Walsh <[email protected]>
Co-authored-by: Ilya Kreymer <[email protected]>
Shrinks99 authored Nov 16, 2023
1 parent b23eed5 commit ae8804d
Showing 9 changed files with 52 additions and 28 deletions.
12 changes: 6 additions & 6 deletions docs/develop/docs.md
Original file line number Diff line number Diff line change
@@ -110,20 +110,20 @@ There are a lot of different options provided by Material for MkDocs — So many
???+ Note
    The default call-out, used to highlight something if there isn't a more relevant one — should generally be expanded by default but can be collapsible by the user if the note is long.

!!! Tip "Tip May have a title stating the tip or best practice"
!!! Tip "Tip: May have a title stating the tip or best practice"
Used to highlight a point that is useful for everyone to understand about the documented subject — should be expanded and kept brief.

???+ Info "Info Must have a title describing the context under which this information is useful"
???+ Info "Info: Must have a title describing the context under which this information is useful"
    Used to deliver context-based content such as things that are dependent on operating system or environment — should be collapsed by default.

???+ Example "Example Must have a title describing the content"
???+ Example "Example: Must have a title describing the content"
Used to deliver additional information about a feature that could be useful in a _specific circumstance_ or that might not otherwise be considered — should be collapsed by default.

???+ Question "Question Must have a title phrased in the form of a question"
???+ Question "Question: Must have a title phrased in the form of a question"
Used to answer frequently asked questions about the documented subject — should be collapsed by default.

!!! Warning "Warning Must have a title stating the warning"
!!! Warning "Warning: Must have a title stating the warning"
Used to deliver important information — should always be expanded.

!!! Danger "Danger Must have a title stating the warning"
!!! Danger "Danger: Must have a title stating the warning"
Used to deliver information about serious unrecoverable actions such as deleting large amounts of data or resetting things — should always be expanded.
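
As a quick reference, the two admonition markers used above follow Material for MkDocs syntax — `!!!` renders an always-expanded block, while `???+` renders a collapsible block that starts expanded (`???` alone starts collapsed). A minimal sketch with illustrative titles:

```markdown
!!! Warning "Warning: Deleting a crawl cannot be undone"
    Always-expanded admonition; the reader cannot collapse it.

???+ Info "Info: Only applies when self-hosting"
    Collapsible admonition, expanded by default.
```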
6 changes: 3 additions & 3 deletions docs/user-guide/archived-items.md
@@ -1,6 +1,6 @@
# Archived Items

Archived Items consist of one or more WACZ files created by a Crawl Workflow, or uploaded to Browsertrix. They can be individually replayed, or combind with other Archived Items in a a [Collection](collections.md). The Archived Items page lists all items in the organization.
Archived Items consist of one or more WACZ files created by a Crawl Workflow, or uploaded to Browsertrix. They can be individually replayed, or combined with other Archived Items in a [Collection](collections.md). The Archived Items page lists all items in the organization.

## Uploading Web Archives

Expand All @@ -24,11 +24,11 @@ For more details on navigating web archives within ReplayWeb.page, see the [Repl

### Files

The Fies tab lists the individually downloadable WACZ files that make up the Archived Item as well as their file sizes.
The Files tab lists the individually downloadable WACZ files that make up the Archived Item as well as their file sizes.

### Error Logs

The Error Logs tab displays a list of errors encountered durring crawling. Clicking an errors in the list will reveal additional information.
The Error Logs tab displays a list of errors encountered during crawling. Clicking an error in the list will reveal additional information.

All log entries that were recorded during the creation of the Archived Item can be downloaded in JSONL format by pressing the _Download Logs_ button.

4 changes: 2 additions & 2 deletions docs/user-guide/browser-profiles.md
@@ -1,8 +1,8 @@
# Browser Profiles

Browser profiles are saved instances of a web browsing session that can be reused to crawl websites as they were configued, with any cookies or saved login sessions. Using a pre-configured profile also means that content that can only be viewed by logged in users can be archived, without archiving the actual login credentials.
Browser profiles are saved instances of a web browsing session that can be reused to crawl websites as they were configured, with any cookies or saved login sessions. Using a pre-configured profile also means that content that can only be viewed by logged in users can be archived, without archiving the actual login credentials.

!!! tip "Best practice Create and use web archiving-specific accounts for crawling with browser profiles"
!!! tip "Best practice: Create and use web archiving-specific accounts for crawling with browser profiles"

For the following reasons, we recommend creating dedicated accounts for archiving anything that is locked behind login credentials but otherwise public, especially on social media platforms.

6 changes: 3 additions & 3 deletions docs/user-guide/collections.md
@@ -2,8 +2,8 @@

Collections are the primary way of organizing and combining archived items into groups for presentation.

!!! tip "Tip Combining items from multiple sources"
If the crawler has not captured every resource or interaction on a webpage, the [ArchiveWebpage browser extension](https://archiveweb.page/) can be used to manually capture missing content and upload it directly to your org.
!!! tip "Tip: Combining items from multiple sources"
If the crawler has not captured every resource or interaction on a webpage, the [ArchiveWeb.page browser extension](https://archiveweb.page/) can be used to manually capture missing content and upload it directly to your org.

After adding the crawl and the upload to a collection, the content from both will become available in the replay viewer.

Expand All @@ -19,4 +19,4 @@ Collections are private by default, but can be made public by marking them as sh

After a collection has been made public, it can be shared with others using the public URL available in the share collection dialogue. The collection can also be embedded into other websites using the provided embed code. Unsharing the collection will break any previously shared links.

For further resources on embedding archived web content into your own website, see the [ReplayWebpage docs page on embedding](https://replayweb.page/docs/embedding).
For further resources on embedding archived web content into your own website, see the [ReplayWeb.page docs page on embedding](https://replayweb.page/docs/embedding).
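
The "provided embed code" mentioned above is generally a small web-component snippet. A minimal sketch based on the ReplayWeb.page embedding docs — the script URL follows the published npm package, and the WACZ and page URLs are placeholders:

```html
<!-- Load the ReplayWeb.page UI script, which defines the <replay-web-page> web component -->
<script src="https://cdn.jsdelivr.net/npm/replaywebpage/ui.js"></script>

<!-- source points at the WACZ file; url selects which archived page to display -->
<replay-web-page
  source="https://example.com/my-archive.wacz"
  url="https://example.com/">
</replay-web-page>
```

See the linked ReplayWeb.page embedding docs for the full set of supported attributes.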
30 changes: 21 additions & 9 deletions docs/user-guide/index.md
@@ -1,17 +1,29 @@
# Getting Started
# Browsertrix User Guide

## Signup
Welcome to the Browsertrix User Guide. This page covers the basics of using Browsertrix, Webrecorder's high-fidelity web archiving system.

### Invite Link
## Getting Started

If you have been sent an [invite](org-settings#members), enter a password and name to create a new account. Your account will be added to the organization you were invited to by an organization admin.
To get started crawling with Browsertrix:

### Open Registration

If the server has enabled signups and you have been given a registration link, enter your email address, password, and name to create a new account. Your account will be added to the server's default organization.
1. Create an account and join an Organization [as described here](signup).
2. After being redirected to the organization's [Overview page](overview), click the _Create New_ button in the top right and select _[Crawl Workflow](crawl-workflows)_ to begin configuring your first crawl!
3. For a simple crawl, choose the _Seeded Crawl_ option, and enter a page URL in the _Crawl Start URL_ field. By default, the crawler will archive all pages under the starting path.
4. Next, click _Review & Save_, and ensure the _Run on Save_ option is selected. Then click _Save Workflow_.
5. Wait a moment for the crawler to start and watch as it archives the website!

---

## Start Crawling!
After running your first crawl, check out the following to learn more about Browsertrix's features:

- A detailed list of [crawl workflow setup](workflow-setup) options.
- Adding [exclusions](workflow-setup/#exclusions) to limit your crawl's scope and evading crawler traps by [editing exclusion rules while crawling](crawl-workflows/#live-exclusion-editing).
- Best practices for crawling with [browser profiles](browser-profiles) to capture content only available when logged in to a website.
- Managing archived items, including [uploading previously archived content](archived-items/#uploading-web-archives).
- Organizing and combining archived items with [collections](collections) for sharing and export.
- If you're an admin: [Inviting collaborators to your org](org-settings/#members).


### Have more questions?

A [Crawl Workflow](crawl-workflows) must be created in order to crawl websites automatically. A detailed list of all available workflow configuration options can be found on the [Crawl Workflow Setup](workflow-setup) page.
While our aim is to create intuitive interfaces, sometimes the complexities of web archiving require a little more explanation. If there's something that you found especially confusing or frustrating, [please get in touch](mailto:[email protected])!
6 changes: 4 additions & 2 deletions docs/user-guide/overview.md
@@ -1,4 +1,4 @@
# Overview
# Org Overview

The overview page delivers key statistics about the organization's resource usage. It also lets users create crawl workflows, upload archived items, and create collections and browser profiles through the _Create New ..._ button.

Expand All @@ -12,7 +12,9 @@ For all organizations the storage panel displays the total number of archived it

## Crawling

The crawling panel lists the amount of currently running and waiting crawls as well as the number of total pages captured.
For organizations with a set execution minute limit, the crawling panel displays a graph of how much execution time has been used and how much is currently remaining. Monthly execution time limits reset on the first of each month at 12:00 AM GMT.

The crawling panel also lists the number of currently running and waiting crawls, as well as the total number of pages captured.

## Collections

9 changes: 9 additions & 0 deletions docs/user-guide/signup.md
@@ -0,0 +1,9 @@
# Signup

## Invite Link

If you have been sent an [invite](../org-settings/#members), enter a name and password to create a new account. Your account will be added to the organization you were invited to by an organization admin.

## Open Registration

If the server has enabled signups and you have been given a registration link, enter your email address, name, and password to create a new account. Your account will be added to the server's default organization.
4 changes: 2 additions & 2 deletions docs/user-guide/workflow-setup.md
@@ -27,7 +27,7 @@ It is also available under the _Additional URLs_ section for Seeded Crawls where
When enabled, the crawler will visit all the links it finds within each page defined in the _List of URLs_ field.

??? example "Crawling tags & search queries with URL List crawls"
This setting can be useful for crawling the content of specific tags or searh queries. Specify the tag or search query URL(s) in the _List of URLs_ field, e.g: `https://example.com/search?q=tag`, and enable _Include Any Linked Page_ to crawl all the content present on that search query page.
    This setting can be useful for crawling the content of specific tags or search queries. Specify the tag or search query URL(s) in the _List of URLs_ field, e.g. `https://example.com/search?q=tag`, and enable _Include Any Linked Page_ to crawl all the content present on that search query page.

### Fail Crawl on Failed URL

@@ -205,7 +205,7 @@ Leave optional notes about the workflow's configuration.

### Tags

Apply tags to the workflow. Tags applied to the workflow will propigate to every crawl created with it at the time of crawl creation.
Apply tags to the workflow. Tags applied to the workflow will propagate to every crawl created with it at the time of crawl creation.

### Collection Auto-Add

3 changes: 2 additions & 1 deletion mkdocs.yml
Expand Up @@ -57,12 +57,13 @@ nav:
- develop/frontend-dev.md
- develop/docs.md
- User Guide:
- user-guide/overview.md
- user-guide/index.md
- user-guide/signup.md
- Crawling:
- user-guide/crawl-workflows.md
- user-guide/workflow-setup.md
- user-guide/browser-profiles.md
- user-guide/overview.md
- user-guide/archived-items.md
- user-guide/collections.md
- user-guide/org-settings.md
