Isolate databases/queries from one-another #235

Merged
merged 29 commits into from
Dec 28, 2023
29 commits
5eb2038
A version that won't build
marcua Nov 21, 2023
7b1efe3
Uncomment
marcua Nov 21, 2023
591f7cd
Newline
marcua Nov 21, 2023
f1e9f40
Newline
marcua Nov 21, 2023
1231f24
Compiles
marcua Nov 22, 2023
5225ec8
No need for only temp
marcua Nov 22, 2023
4b86391
Compiles AND runs
marcua Nov 25, 2023
c2382a1
Move toward nsjail
marcua Dec 5, 2023
e38f428
Make room for nsjail, but still as a noop
marcua Dec 9, 2023
7d236ca
Bring in #234
marcua Dec 9, 2023
a273155
Works end-to-end (need to implement 'touch' for new DBs)
marcua Dec 22, 2023
b03b696
Create DB file in create_database
marcua Dec 22, 2023
24d839d
Move isolated runner into original crate as second binary, dynamicall…
marcua Dec 23, 2023
8eae1f3
Remove hosted_db_runner
marcua Dec 23, 2023
ef0aeba
Move nsjail builder to scripts dir
marcua Dec 23, 2023
647929f
Resolve conflicts
marcua Dec 23, 2023
9159053
fmt
marcua Dec 23, 2023
0ca9cbe
tokio typo
marcua Dec 23, 2023
665b6d0
New AybError variants
marcua Dec 23, 2023
c52aad1
Code review part 1
marcua Dec 25, 2023
a222227
Update docs, remove binary, add nsjail build step
marcua Dec 25, 2023
5ac62f1
Testing docs and fmt
marcua Dec 25, 2023
6822d7d
Fix build command
marcua Dec 25, 2023
db13cb8
nsjail requirements
marcua Dec 25, 2023
59f4974
More nsjail requirements
marcua Dec 25, 2023
ce97699
Docs cleanup
marcua Dec 25, 2023
a37f9ab
Clippy and code review
marcua Dec 25, 2023
3d07ada
Warn if not fully isolated
marcua Dec 26, 2023
ea30777
Clean up for clarity
marcua Dec 27, 2023
4 changes: 4 additions & 0 deletions .github/workflows/tests.yml
@@ -37,5 +37,9 @@ jobs:
run: cargo fmt --check
- name: Ensure clippy finds no issues
run: cargo clippy
- name: Install nsjail requirements
run: sudo apt-get install -y libprotobuf-dev protobuf-compiler libnl-route-3-dev
- name: Build nsjail
run: scripts/build_nsjail.sh && mv nsjail tests/
- name: Run tests
run: cargo test --verbose
1 change: 1 addition & 0 deletions .gitignore
@@ -21,3 +21,4 @@ tests/ayb_data_postgres
tests/ayb_data_sqlite
tests/smtp_data_10025
tests/smtp_data_10026
tests/nsjail
13 changes: 11 additions & 2 deletions Cargo.toml
@@ -6,6 +6,7 @@ description = "ayb makes it easy to create, host, and share embedded databases l
homepage = "https://github.com/marcua/ayb"
documentation = "https://github.com/marcua/ayb#readme"
license = "Apache-2.0"
default-run = "ayb"

[dependencies]
actix-web = { version = "4.4.0" }
@@ -19,14 +20,14 @@ fernet = { version = "0.2.1" }
lettre = { version = "0.10.4", features = ["tokio1-native-tls"] }
quoted_printable = { version = "0.5.0" }
reqwest = { version = "0.11.22", features = ["json"] }
-rusqlite = { version = "0.27.0", features = ["bundled"] }
+rusqlite = { version = "0.27.0", features = ["bundled", "limits"] }
regex = { version = "1.10.2"}
serde = { version = "1.0", features = ["derive"] }
serde_json = { version = "1.0.108" }
serde_repr = { version = "0.1.17" }
sqlx = { version = "0.6.3", features = ["runtime-actix-native-tls", "postgres", "sqlite"] }
toml = { version = "0.8.8" }
-tokio = { version = "1.35.1", features = ["macros", "rt"] }
+tokio = { version = "1.35.1", features = ["macros", "process", "rt"] }
prefixed-api-key = { version = "0.1.0", features = ["sha2"]}
prettytable-rs = { version = "0.10.0"}
urlencoding = { version = "2.1.3" }
@@ -36,3 +37,11 @@ url = { version = "2.5.0", features = ["serde"] }
[dev-dependencies]
assert_cmd = "2.0"
assert-json-diff = "2.0.2"

[[bin]]
name = "ayb"
path = "src/bin/ayb.rs"

[[bin]]
name = "ayb_isolated_runner"
path = "src/bin/ayb_isolated_runner.rs"
52 changes: 52 additions & 0 deletions README.md
@@ -146,6 +146,58 @@ $ curl -w "\n" -X POST http://127.0.0.1:5433/v1/marcua/test.sqlite/query -H "aut
{"fields":["name","score"],"rows":[["PostgreSQL","10"],["SQLite","9"],["DuckDB","9"]]}
```

### Isolation
`ayb` allows multiple users to run queries against databases that are
stored on the same machine. Isolation prevents one user from accessing
another user's data and lets you restrict the resources any one user
can consume.

By default, `ayb` sets the
[SQLITE_DBCONFIG_DEFENSIVE](https://www.sqlite.org/c3ref/c_dbconfig_defensive.html)
flag and sets
[SQLITE_LIMIT_ATTACHED](https://www.sqlite.org/c3ref/c_limit_attached.html#sqlitelimitattached)
to `0` to prevent users from corrupting the database or attaching
other databases on the filesystem.

For further isolation, `ayb` uses [nsjail](https://nsjail.dev/) to
isolate each query's filesystem access and resources. When this form
of isolation is enabled, `ayb` starts a new `nsjail`-managed process
to execute the query against the database. We have not yet benchmarked
the performance overhead of this approach.

To enable isolation, you must first build `nsjail`, which you can do
through [scripts/build_nsjail.sh](scripts/build_nsjail.sh). Note that
`nsjail` depends on a few other packages: this repository's CI
installs `libprotobuf-dev`, `protobuf-compiler`, and
`libnl-route-3-dev` via `apt-get` before building. If you run into
issues building it, see its
[Dockerfile](https://github.com/google/nsjail/blob/master/Dockerfile)
for the full set of requirements.

Once you have a path to the
`nsjail` binary, add the following to your `ayb.toml`:

```toml
[isolation]
nsjail_path = "path/to/nsjail"
```

## Testing
`ayb` is largely tested through [end-to-end
tests](tests/e2e.rs) that mimic as realistic an environment as
possible. Individual modules may also provide more specific unit
tests. To run the tests, type:

```bash
cargo test --verbose
```

Because the tests cover [isolation](#isolation), an `nsjail` binary is
required for running the end-to-end tests. To build and place `nsjail`
in the appropriate directory, run:

```bash
scripts/build_nsjail.sh && mv nsjail tests/
```

## FAQ

### Who is `ayb` for?
8 changes: 8 additions & 0 deletions scripts/build_nsjail.sh
@@ -0,0 +1,8 @@
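#!/usr/bin/env bash
# Clone and build nsjail, leaving the binary in the current directory.
# Abort on the first failure so a failed clone or build never reaches
# the move and cleanup steps.
set -euo pipefail

git clone https://github.com/google/nsjail.git nsjail-checkout
cd nsjail-checkout
make
mv nsjail ..
cd ..
rm -rf nsjail-checkout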
File renamed without changes.
23 changes: 23 additions & 0 deletions src/bin/ayb_isolated_runner.rs
@@ -0,0 +1,23 @@
use ayb::hosted_db::sqlite::query_sqlite;
use std::env;
use std::path::PathBuf;

/// This binary runs a query against a database and prints the
/// result to stdout as a JSON-serialized QueryResult (errors are
/// serialized to stderr instead). To run it, you would type:
/// $ ayb_isolated_runner database.sqlite SELECT xyz FROM ...
///
/// This command is meant to be run inside a sandbox that isolates
/// parallel invocations of the command from accessing each
/// other's data, memory, and resources. That sandbox can be found
/// in src/hosted_db/sandbox.rs.
fn main() -> Result<(), serde_json::Error> {
let args: Vec<String> = env::args().collect();
let db_file = &args[1];
let query = (args[2..]).to_vec();
let result = query_sqlite(&PathBuf::from(db_file), &query.join(" "));
match result {
Ok(result) => println!("{}", serde_json::to_string(&result)?),
Err(error) => eprintln!("{}", serde_json::to_string(&error)?),
}
Ok(())
}
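
As written, the runner indexes `args[1]` and `args[2..]` directly, so invoking it with too few arguments panics. A minimal sketch of a more defensive split is below. `parse_runner_args` is a hypothetical helper that is not part of this PR, and it reports problems as a plain `String` rather than the PR's `AybError`:

```rust
use std::path::PathBuf;

/// Hypothetical helper (not part of the PR): split the runner's
/// command-line arguments into a database path and a query string.
fn parse_runner_args(args: &[String]) -> Result<(PathBuf, String), String> {
    // args[0] is the program name, args[1] the database file, and
    // everything after that is joined back into one query string.
    if args.len() < 3 {
        return Err(format!(
            "usage: {} <database> <query...>",
            args.first()
                .map(String::as_str)
                .unwrap_or("ayb_isolated_runner")
        ));
    }
    let db_path = PathBuf::from(&args[1]);
    let query = args[2..].join(" ");
    Ok((db_path, query))
}
```

Joining the trailing arguments with a single space mirrors what `main` does above, so a query passed as one argument (as the sandbox does) and a query split across several shell words both end up as one string.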
15 changes: 11 additions & 4 deletions src/hosted_db.rs
@@ -1,9 +1,11 @@
pub mod paths;
-mod sqlite;
+mod sandbox;
+pub mod sqlite;

use crate::ayb_db::models::DBType;
use crate::error::AybError;
-use crate::hosted_db::sqlite::run_sqlite_query;
+use crate::hosted_db::sqlite::potentially_isolated_sqlite_query;
+use crate::http::structs::AybConfigIsolation;
use prettytable::{format, Cell, Row, Table};
use serde::{Deserialize, Serialize};
use std::path::PathBuf;
@@ -53,9 +55,14 @@ impl QueryResult {
}
}

-pub fn run_query(path: &PathBuf, query: &str, db_type: &DBType) -> Result<QueryResult, AybError> {
+pub async fn run_query(
+path: &PathBuf,
+query: &str,
+db_type: &DBType,
+isolation: &Option<AybConfigIsolation>,
+) -> Result<QueryResult, AybError> {
match db_type {
-DBType::Sqlite => Ok(run_sqlite_query(path, query)?),
+DBType::Sqlite => Ok(potentially_isolated_sqlite_query(path, query, isolation).await?),
_ => Err(AybError::Other {
message: "Unsupported DB type".to_string(),
}),
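
The body of `potentially_isolated_sqlite_query` is not shown in this part of the diff. Based on the README text, one plausible shape for its decision is sketched below: run in-process when no `[isolation]` section is configured, and go through `nsjail` otherwise. `IsolationConfig`, `ExecutionPlan`, and `plan_execution` are hypothetical stand-ins for illustration and are not names from the PR:

```rust
/// Stand-in for the PR's `AybConfigIsolation`. The single field mirrors
/// the `nsjail_path` key documented in the README (an assumption about
/// the real struct's layout).
#[derive(Debug, Clone, PartialEq)]
struct IsolationConfig {
    nsjail_path: String,
}

/// Hypothetical description of how a query would be executed.
#[derive(Debug, PartialEq)]
enum ExecutionPlan {
    /// Run in-process, relying only on the SQLite-level protections.
    InProcess,
    /// Spawn the given nsjail binary around `ayb_isolated_runner`.
    Sandboxed { nsjail_path: String },
}

/// Pick an execution strategy from the optional isolation config.
fn plan_execution(isolation: &Option<IsolationConfig>) -> ExecutionPlan {
    match isolation {
        Some(config) => ExecutionPlan::Sandboxed {
            nsjail_path: config.nsjail_path.clone(),
        },
        None => ExecutionPlan::InProcess,
    }
}
```

Keeping the decision in one small function like this would make the unisolated path easy to spot and to log, which fits the "Warn if not fully isolated" commit, though the PR's actual structure may differ.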
17 changes: 13 additions & 4 deletions src/hosted_db/paths.rs
@@ -6,13 +6,22 @@ pub fn database_path(
entity_slug: &str,
database_slug: &str,
data_path: &str,
create_database: bool,
) -> Result<PathBuf, AybError> {
let mut path: PathBuf = [data_path, entity_slug].iter().collect();
-if let Err(e) = fs::create_dir_all(&path) {
-return Err(AybError::Other {
-message: format!("Unable to create entity path for {}: {}", entity_slug, e),
-});
+if create_database {
+if let Err(e) = fs::create_dir_all(&path) {
+return Err(AybError::Other {
+message: format!("Unable to create entity path for {}: {}", entity_slug, e),
+});
+}
+}

path.push(database_slug);

if create_database && !path.exists() {
fs::File::create(path.clone())?;
}

Ok(path)
}
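
The new `create_database` flag separates lookups from creation: a lookup no longer creates directories as a side effect, while creation also "touches" an empty database file. The file has to exist up front because `sandbox.rs` calls `canonicalize` on the path, and `std::fs::canonicalize` fails for paths that do not exist. A simplified, standalone sketch of the same behavior follows. `database_path_sketch` is a hypothetical name, and it returns `io::Error` instead of the PR's `AybError`:

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Simplified sketch of `database_path`: build
/// `<data_path>/<entity_slug>/<database_slug>` and only touch the
/// filesystem when `create_database` is true.
fn database_path_sketch(
    entity_slug: &str,
    database_slug: &str,
    data_path: &str,
    create_database: bool,
) -> io::Result<PathBuf> {
    let mut path: PathBuf = [data_path, entity_slug].iter().collect();
    if create_database {
        // Make sure the per-entity directory exists.
        fs::create_dir_all(&path)?;
    }
    path.push(database_slug);
    if create_database && !path.exists() {
        // Create an empty file so the path can later be canonicalized
        // and bind-mounted into the sandbox.
        fs::File::create(&path)?;
    }
    Ok(path)
}
```

Calling it with `create_database = false` returns the same path without creating anything, so a query against a nonexistent database fails later instead of silently leaving an empty directory behind.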
146 changes: 146 additions & 0 deletions src/hosted_db/sandbox.rs
@@ -0,0 +1,146 @@
/* Retrieved and modified from
https://raw.githubusercontent.com/Defelo/sandkasten/83f629175d02ebc70fbb16b8b9e05663ea67ccc7/src/sandbox.rs
On December 6, 2023.
Original license:

MIT License

Copyright (c) 2023 Defelo

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
*/

use crate::error::AybError;
use serde::{Deserialize, Serialize};
use std::env::current_exe;
use std::fs::canonicalize;
use std::{
path::{Path, PathBuf},
process::Stdio,
};
use tokio::io::{AsyncReadExt, BufReader};

#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
pub struct RunResult {
/// The exit code of the process.
pub status: i32,
/// The stdout output the process produced.
pub stdout: String,
/// The stderr output the process produced.
pub stderr: String,
}

pub async fn run_in_sandbox(
nsjail: &Path,
db_path: &PathBuf,
query: &str,
) -> Result<RunResult, AybError> {
let mut cmd = tokio::process::Command::new(nsjail);

cmd.arg("--really_quiet") // log fatal messages only
.arg("--iface_no_lo")
.args(["--mode", "o"]) // run once
.args(["--hostname", "ayb"])
.args(["--bindmount_ro", "/lib:/lib"])
.args(["--bindmount_ro", "/lib64:/lib64"])
.args(["--bindmount_ro", "/usr:/usr"]);

// Set resource limits for the process. In the future, we will
// allow entities to control the resources they dedicate to
// different databases/queries.
cmd.args(["--mount", "none:/tmp:tmpfs:size=100000000"]) // ~95 MB tmpfs
.args(["--max_cpus", "1"]) // One CPU
.args(["--rlimit_as", "64"]) // 64 MB memory limit
.args(["--time_limit", "10"]) // 10 second maximum run
.args(["--rlimit_fsize", "75"]) // 75 MB file size limit
.args(["--rlimit_nofile", "10"]) // 10 files maximum
.args(["--rlimit_nproc", "2"]); // 2 processes maximum

// Generate a /local/path/to/file:/tmp/file mapping.
let absolute_db_path = canonicalize(db_path)?;
let db_file_name = absolute_db_path
.file_name()
.ok_or(AybError::Other {
message: format!(
"Could not parse file name from path: {}",
absolute_db_path.display()
),
})?
.to_str()
.ok_or(AybError::Other {
message: format!(
"Could not convert path to string: {}",
absolute_db_path.display()
),
})?;
let tmp_db_path = Path::new("/tmp").join(db_file_name);
let db_file_mapping = format!("{}:{}", absolute_db_path.display(), tmp_db_path.display());
cmd.args(["--bindmount", &db_file_mapping]);

// Generate a /local/path/to/ayb_isolated_runner:/tmp/ayb_isolated_runner mapping.
// We assume `ayb` and `ayb_isolated_runner` will always be in the same directory,
// so we see what the path to the current `ayb` executable is to build the path.
let ayb_path = current_exe()?;
let isolated_runner_path = ayb_path
.parent()
.ok_or(AybError::Other {
message: format!(
"Unable to find parent directory of ayb from {}",
ayb_path.display()
),
})?
.join("ayb_isolated_runner");
cmd.args([
"--bindmount_ro",
&format!(
"{}:/tmp/ayb_isolated_runner",
isolated_runner_path.display()
),
]);

let mut child = cmd
.arg("--")
.arg("/tmp/ayb_isolated_runner")
.arg(tmp_db_path)
.arg(query)
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.spawn()?;

let mut stdout_reader = BufReader::new(child.stdout.take().unwrap());
let mut stderr_reader = BufReader::new(child.stderr.take().unwrap());

// Drain stdout and stderr while waiting for the process to exit.
// Reading only after the wait can stall once the child fills a pipe
// buffer, because nothing is consuming its output.
let mut stdout = Vec::new();
let mut stderr = Vec::new();
let (status, stdout_read, stderr_read) = tokio::join!(
child.wait(),
stdout_reader.read_to_end(&mut stdout),
stderr_reader.read_to_end(&mut stderr),
);
let status = status?;
stdout_read?;
stderr_read?;
let stdout = String::from_utf8_lossy(&stdout).into_owned();
let stderr = String::from_utf8_lossy(&stderr).into_owned();

Ok(RunResult {
status: status.code().ok_or(AybError::Other {
message: "Process exited with signal".to_string(),
})?,
stdout,
stderr,
})
}
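
The database bind mount in `run_in_sandbox` maps an absolute host path to a file of the same name under the sandbox's `/tmp`. The sketch below isolates that mapping as a pure function so it can be checked without spawning `nsjail`. `tmp_bind_mapping` is a hypothetical helper that is not part of the PR, and it returns `Option` where the PR returns `AybError`:

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: given an absolute host path, return the path
/// the file will have inside the sandbox and the `host:sandbox`
/// string to pass to nsjail's `--bindmount` flag.
fn tmp_bind_mapping(absolute_db_path: &Path) -> Option<(PathBuf, String)> {
    // `file_name` is None for paths like `/`, and `to_str` is None for
    // names that are not valid UTF-8; both cases yield None here.
    let file_name = absolute_db_path.file_name()?.to_str()?;
    let sandbox_path = Path::new("/tmp").join(file_name);
    let mapping = format!(
        "{}:{}",
        absolute_db_path.display(),
        sandbox_path.display()
    );
    Some((sandbox_path, mapping))
}
```

Because only the file name survives the mapping, the sandboxed process never learns the host directory layout (for example, which entity owns the database).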