-
Notifications
You must be signed in to change notification settings - Fork 10
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
28 changed files
with
914 additions
and
736 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
name: Formatter CI | ||
|
||
on: | ||
push: | ||
branches: | ||
- main | ||
pull_request: | ||
types: [opened, reopened, synchronize] | ||
|
||
jobs: | ||
formatter: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v3 | ||
- uses: psf/black@stable |
This file was deleted.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,9 @@ | ||
{ | ||
"editor.formatOnSave": true, | ||
"editor.formatOnSaveMode": "file", | ||
"files.trimTrailingWhitespace": true | ||
"editor.formatOnSave": true, | ||
"editor.formatOnSaveMode": "file", | ||
"files.trimTrailingWhitespace": true, | ||
"python.formatting.provider": "none", | ||
"[python]": { | ||
"editor.defaultFormatter": "ms-python.black-formatter" | ||
} | ||
} |
This file was deleted.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,124 +1,197 @@ | ||
# CamTools | ||
|
||
Tools for handling pinhole camera parameters and plotting cameras. | ||
Camtools: camera tools for computer vision. Useful for plotting, converting, | ||
projecting, and ray casting with camera parameters. | ||
|
||
## Install | ||
<a href="https://github.com/yxlao/camtools/actions/workflows/formatter.yml"> | ||
<img src="https://github.com/yxlao/camtools/actions/workflows/formatter.yml/badge.svg" alt="Formatter"> | ||
</a> | ||
|
||
```bash | ||
# For development. | ||
pip install -e . | ||
|
||
# For install. | ||
pip install . | ||
``` | ||
|
||
<!-- ```bash | ||
mkdir build && cd build | ||
cmake .. | ||
make pip-install -j$(nproc) | ||
python -c "import camtools as ct; print(ct.__version__)" | ||
``` --> | ||
|
||
## Matrix conventions | ||
|
||
``` | ||
K : (3, 3) # Camera intrinsic matrix. | ||
# [[fx, s, cx], | ||
# [ 0, fy, cy], | ||
# [ 0, 0, 1]] | ||
# x: goes from top-left to top-right. | ||
# y: goes from top-left to bottom-left. | ||
R : (3, 3) # Rotation matrix. | ||
Rc : (3, 3) # Rc = R.T = R.inv(). | ||
t : (3,) # Translation. | ||
T : (4, 4) # Extrinsic matrix with (0, 0, 0, 1) row below. | ||
# T = [R | t | ||
# 0 | 1] | ||
# T projects world space coordinate to the camera space | ||
# (a.k.a view space or eye space). The camera center in world | ||
# space projected by T becomes [0, 0, 0, 1]^T, i.e. the camera | ||
# space has its origin at the camera center. | ||
# T @ [[|], = [[0], | ||
# [C], [0], | ||
# [|], [0], | ||
# [1]] [1]] | ||
P : (3, 4) # World-to-pixel projection matrix. P = K @ [R | t] = K @ T[:3, :]. | ||
W2P : (4, 4) # World-to-pixel projection matrix. It is P with (0, 0, 0, 1) | ||
# row below. When using W2P @ point_homo, the last | ||
# element is always 1, thus it is ignored. | ||
pose : (4, 4) # Camera pose. pose = T.inv(). pose[:3, :3] = R.T = Rc. pose[:3, 3] = C. | ||
C : (3,) # Camera center. | ||
``` | ||
|
||
## Coordinate conventions | ||
|
||
### 3D to 2D projection | ||
|
||
Project 3D point `[X, Y, Z, 1]` to 2D `[x, y, 1]` pixel, e.g. with | ||
`pixels = ct.project.points_to_pixel(points, K, T)`. | ||
|
||
```python | ||
# 0 -------> 1 (x) | ||
# | | ||
# | | ||
# v (y) | ||
|
||
cols = pixels[:, 0] # cols, width, x, top-left to top-right | ||
rows = pixels[:, 1] # rows, height, y, top-left to bottom-left | ||
cols = np.round(cols).astype(np.int32) | ||
rows = np.round(rows).astype(np.int32) | ||
cols[cols >= width] = width - 1 | ||
cols[cols < 0] = 0 | ||
rows[rows >= height] = height - 1 | ||
rows[rows < 0] = 0 | ||
``` | ||
|
||
It can be confusing to use `x, y, u, v`. Prefer `row` and `col`. | ||
|
||
|
||
### UV coordinates | ||
|
||
```python | ||
# OpenGL convention: | ||
# 1 (v) | ||
# ^ | ||
# | | ||
# | | ||
# 0 -------> 1 (u) | ||
|
||
# The following conversion accounts for pixel size | ||
us = 1 / width * (0.5 + cols) | ||
vs = 1 / height * (0.5 + (height - rows - 1)) | ||
``` | ||
## Installation | ||
|
||
## Notes on vector vs. matrix | ||
|
||
We choose to use 1D array for vector values like `t` and `C`. For example, `t` | ||
is of shape `(3, )` instead of `(3, 1)`. | ||
|
||
```python | ||
# The `@` operator can be directly used to dot a matrix and a vector | ||
# - If both arguments are 2-D they are multiplied like conventional matrices. | ||
# - If either argument is N-D, N > 2, it is treated as a stack of matrices | ||
# residing in the last two indexes and broadcast accordingly. | ||
# - If the first argument is 1-D, it is promoted to a matrix by prepending a 1 | ||
# to its dimensions. After matrix multiplication the prepended 1 is removed. | ||
# - If the second argument is 1-D, it is promoted to a matrix by appending a 1 | ||
# to its dimensions. After matrix multiplication the appended 1 is removed. | ||
|
||
# t is (3, ) and it is promoted to be (3, 1). | ||
C = - R.T @ t | ||
``` | ||
```bash | ||
# Option 1: install from pip. | ||
pip install camtools | ||
|
||
## Unit tests | ||
# Option 2: install from git. | ||
pip install git+https://github.com/yxlao/camtools.git | ||
|
||
```bash | ||
pytest . -s | ||
pytest camtools -s | ||
# Option 3: install from source. | ||
git clone https://github.com/yxlao/camtools.git | ||
cd camtools | ||
pip install -e . # Dev mode, if you want to modify camtools. | ||
pip install . # Install mode, if you want to use camtools only. | ||
``` | ||
|
||
## What can you do with CamTools? | ||
|
||
## TODO | ||
1. Plot cameras. Useful for debugging 3D reconstruction and NeRFs! | ||
|
||
- Full unit tests | ||
- PyTorch/Numpy wrapper (e.g. with `eagerpy`) | ||
```python | ||
import camtools as ct | ||
import open3d as o3d | ||
cameras = ct.camera.create_camera_ray_frames(Ks, Ts) | ||
o3d.visualization.draw_geometries([cameras]) | ||
``` | ||
|
||
<p align="center"> | ||
<img src="./camtools/assets/camera_frames.png" width="360" /> | ||
</p> | ||
|
||
2. Convert camera parameters. | ||
|
||
```python | ||
pose = ct.convert.T_to_pose(T) # Convert T to pose | ||
R, t = ct.convert.T_to_R_t(T) # Convert T to R and t | ||
C = ct.convert.pose_to_C(pose) # Convert pose to camera center | ||
K, T = ct.convert.P_to_K_T(P) # Decompose projection matrix to K and T | ||
# And more... | ||
``` | ||
|
||
3. Projection and ray casting. | ||
|
||
```python | ||
# Project 3D points to pixels. | ||
pixels = ct.project.points_to_pixel(points, K, T) | ||
|
||
# Back-project depth image ot 3D points. | ||
points = ct.project.im_depth_to_points(depth, K, T) | ||
|
||
# Ray cast a triangle mesh to depth image. | ||
im_depth = ct.raycast.mesh_to_depths(mesh, Ks, Ts, height, width) | ||
|
||
# And more... | ||
``` | ||
|
||
4. Image I/O and depth I/O with no surprises. | ||
|
||
```python | ||
ct.io.imread() | ||
ct.io.imwrite() | ||
|
||
ct.io.imread_detph() | ||
ct.io.imwrite_depth() | ||
``` | ||
|
||
Strict type checks and range checks are enforced. These APIs are specifically | ||
designed to solve the following pain points: | ||
|
||
- Is my image `float32` or `uint8`? | ||
- Does it has range `[0, 1]` or `[0, 255]`? | ||
- Is it RGB or BGR? | ||
- Do my image have alpha channel? | ||
- When saving depth image as integer-based `.png`, is it correctly scaled? | ||
|
||
5. Useful command-line tools (run in terminal). | ||
|
||
```bash | ||
# Crop image boarders. | ||
ct crop-boarders *.png --pad_pixel 10 --skip_cropped --same_crop | ||
|
||
# Draw synchronized bounding boxes interactively. | ||
ct draw-bboxes path/to/a.png path/to/b.png | ||
|
||
# For more help. | ||
ct --help | ||
``` | ||
|
||
<p align="center"> | ||
<img src="https://user-images.githubusercontent.com/1501945/241416210-e11ff3bf-22e6-46c0-8ba0-d177a0015323.png" width="400" /> | ||
</p> | ||
|
||
6. And more. | ||
- Solve line intersections. | ||
- COLMAP tools. | ||
- Points normalization. | ||
- ... | ||
|
||
## Camera conventions | ||
|
||
<p align="center"> | ||
<img src="./camtools/assets/camera_coordinates.svg" width="360" /> | ||
</p> | ||
|
||
We follow the standard pinhole camera model: | ||
|
||
- **Camera coordinate:** right-handed, with $Z$ pointing away from the camera | ||
towards the view direction and $Y$ axis pointing down. Note that this is | ||
different from the Blender convention, where $Z$ points towards the opposite | ||
view direction and the $Y$ axis points up. | ||
- **Image coordinate:** starts from the top-left corner of the image, with $x$ | ||
pointing right (corresponding to the image width) and $y$ pointing down | ||
(corresponding to the image height). This is also consistent with OpenCV, but | ||
pay attention that the 0-th dimension in the image array is the height (i.e., | ||
$y$) and the 1-th dimension is the width (i.e., $x$). That is: | ||
- $x$ <=> width <=> column <=> the 1-th dimension | ||
- $y$ <=> height <=> row <=> the 0-th dimension | ||
- `K`: `(3, 3)` camera intrinsic matrix. | ||
```python | ||
K = [[fx, s, cx], | ||
[ 0, fy, cy], | ||
[ 0, 0, 1]] | ||
``` | ||
- `T` or `W2C`: `(4, 4)` camera extrinsic matrix. | ||
```python | ||
T = [[R | t = [[R_01, R_02, R_03, t_0], | ||
0 | 1]] [R_11, R_12, R_13, t_1], | ||
[R_21, R_22, R_23, t_2], | ||
[ 0, 0, 0, 1]] | ||
``` | ||
- `T` is also known as the world-to-camera `W2C` matrix, which transforms a | ||
point in the world coordinate to the camera coordinate. | ||
- `T`'s shape is `(4, 4)`, not `(3, 4)`. | ||
- `T` must be invertible, where `np.linalg.inv(T) = pose`. | ||
- The camera center `C` in world coordinate is projected to `[0, 0, 0, 1]` in | ||
camera coordinate, i.e., | ||
```python | ||
T @ C = np.array([0, 0, 0, 1]).T | ||
``` | ||
- `R`: `(3, 3)` rotation matrix. | ||
```python | ||
R = T[:3, :3] | ||
``` | ||
- `R` is a rotation matrix. It is an orthogonal matrix with determinant 1, as | ||
rotations preserve volume and orientation. | ||
- `R.T == np.linalg.inv(R)` | ||
- `np.linalg.norm(R @ x) == np.linalg.norm(x)`, where `x` is a `(3, )` vector. | ||
- `t`: `(3,)` translation vector. | ||
```python | ||
t = T[:3, 3] | ||
``` | ||
- `t`'s shape is `(3,)`, not `(3, 1)`. | ||
- `pose` or `C2W`: `(4, 4)` camera pose matrix. It is the inverse of `T`. | ||
```python | ||
pose = T.inv() | ||
``` | ||
- `pose` is also known as the camera-to-world `C2W` matrix, which transforms a | ||
point in the camera coordinate to the world coordinate. | ||
- `pose` is the inverse of `T`, i.e., `pose == np.linalg.inv(T)`. | ||
- `C`: camera center. | ||
```python | ||
C = pose[:3, 3] | ||
``` | ||
- `C`'s shape is `(3,)`, not `(3, 1)`. | ||
- `C` is the camera center in world coordinate. It is also the translation | ||
vector of `pose`. | ||
- `P`: `(3, 4)` the camera projection matrix. | ||
- `P` is the world-to-pixel projection matrix, which projects a point in the | ||
homogeneous world coordinate to the homogeneous pixel coordinate. | ||
- `P` is the product of the intrinsic and extrinsic parameters. | ||
```python | ||
# P = K @ [R | t] | ||
P = K @ np.hstack([R, t[:, None]]) | ||
``` | ||
- `P`'s shape is `(3, 4)`, not `(4, 4)`. | ||
- It is possible to decompose `P` into intrinsic and extrinsic matrices by QR | ||
decomposition. | ||
- Don't confuse `P` with `pose`. | ||
- For more details, please refer to the following blog posts: | ||
[part 1](https://ksimek.github.io/2012/08/14/decompose/), | ||
[part 2](https://ksimek.github.io/2012/08/22/extrinsic/), | ||
and [part 3](https://ksimek.github.io/2013/08/13/intrinsic/). | ||
|
||
## Future works | ||
|
||
- Refined APIs. | ||
- Full PyTorch/Numpy compatibility. | ||
- Unit tests. |
This file was deleted.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.