Conversation
MET-33 Implement `Intersector`
Definition of done:
Look into mocking for pytest (if i need the data strucs to do something |
mugamma
left a comment
There was a problem hiding this comment.
I've added some comments to help with your guys's review. I recommend first merging MET-32: FMB #24 and then moving on to this one, as it is based off of that branch.
| /* | ||
| * Confidence module bindings | ||
| */ | ||
|
|
||
| nb::module_ confidence = m.def_submodule("confidence"); | ||
| nb::class_<ZeroParameterConfidence>(confidence, "ZeroParameterConfidence") | ||
| .def(nb::init<>()) | ||
| .def("get_confidence", &ZeroParameterConfidence::get_confidence); | ||
|
|
||
| nb::class_<TwoParameterConfidence>(confidence, "TwoParameterConfidence") | ||
| .def(nb::init<float, float>()) | ||
| .def("get_confidence", &TwoParameterConfidence::get_confidence); | ||
|
|
There was a problem hiding this comment.
Moved the confidence bindings up so the modules are defined in alphabetical order.
| .def("cov_inv_apply", &FMB::cov_inv_apply, | ||
| "apply the inverse covariance matrix to the given vector", nb::arg("vec")) |
There was a problem hiding this comment.
Added a helper method for applying the inverse covariance matrix. This simplifies the implementation of the intersector and the computation of the quadratic form.
| auto [t, d] = LinearIntersector::intersect(fmb, ray, cam_pose); | ||
| return std::make_tuple(t, d); |
There was a problem hiding this comment.
This is necessary since nanobind can't bind cuda::std::tuple to a python tuple.
| CUDA_CALLABLE __forceinline__ Vec3D vecdiv(const Vec3D u, const Vec3D v) { | ||
| return {u.x / v.x, u.y / v.y, u.z / v.z}; | ||
| } |
There was a problem hiding this comment.
helper function for this file specifically.
There was a problem hiding this comment.
maybe I should make this a lambda inside cov_inv_apply?
There was a problem hiding this comment.
I'm fine with leaving this here since we're not in the header file anyways.
| template <> | ||
| __host__ FMBScene<MemoryLocation::HOST>::FMBScene(size_t size) | ||
| : fmbs_{new FMB[size]}, log_weights_{new float[size]}, size_{size} {} | ||
|
|
||
| template <> | ||
| __host__ FMBScene<MemoryLocation::DEVICE>::FMBScene(size_t size) : size_{size} { | ||
| CUDA_CHECK(cudaMalloc(&fmbs_, size * sizeof(FMB))); | ||
| CUDA_CHECK(cudaMalloc(&log_weights_, size * sizeof(float))); | ||
| } | ||
|
|
||
| template <> | ||
| __host__ FMBScene<MemoryLocation::HOST>::~FMBScene() { | ||
| delete[] fmbs_; | ||
| delete[] log_weights_; | ||
| } | ||
|
|
||
| template <> | ||
| __host__ FMBScene<MemoryLocation::DEVICE>::~FMBScene() { | ||
| CUDA_CHECK(cudaFree(fmbs_)); | ||
| CUDA_CHECK(cudaFree(log_weights_)); | ||
| } |
There was a problem hiding this comment.
The enum type helps us specialize the memory management of the scene type using constructors that are specialized for each MemoryLocation value.
| @@ -1,20 +1,119 @@ | |||
| #pragma once | |||
|
|
|||
| #include <cuda/std/span> | |||
There was a problem hiding this comment.
I should get rid of this.
| if (x_extent <= 0 || y_extent <= 0 || z_extent <= 0) | ||
| throw std::domain_error("a metaball cannot have negative extent"); | ||
| extent_ = {x_extent, y_extent, z_extent}; |
There was a problem hiding this comment.
as we move towards the backward pass this will be unnecessary as the FMB will use an inverse-softplus parameterization
There was a problem hiding this comment.
Still good to have as a check anyways? In such a case we should never expect this error to trigger. Unless you meant this re: speedup
| float w0 = 0.0f, tf = 0.0f, sumexpd = 0.0f; | ||
| for (const auto& fmb : fmb_getter->get_metaballs(ray)) { | ||
| const auto& [t, d] = Intersector::intersect(fmb, ray); | ||
| const auto& [t, d] = Intersector::intersect(fmb, ray, extr); |
There was a problem hiding this comment.
Note that the intersector now accepts the camera pose.
There was a problem hiding this comment.
Nice, so we don't really need to keep the starting location within Ray anymore :)
There was a problem hiding this comment.
hooray to memory savings
| */ | ||
| CUDA_CALLABLE static cuda::std::tuple<float, float> intersect(const FMB& fmb, const Ray& ray, | ||
| const Pose& cam_pose) { | ||
| const auto v = cam_pose.get_rot().apply(ray.direction); |
There was a problem hiding this comment.
This v here is meant to match the in-progress math write-up. I should mention this and fix the rest to match it as well.
There was a problem hiding this comment.
Sounds good! We should end up following whatever notation you end up choosing, both in the code and the final write-up
| for (auto [fmb, w] : scene) { | ||
| _num_fmbs += 1; | ||
| } |
There was a problem hiding this comment.
I'm just testing here that we can indeed use a range-based for loop over an FMBScene inside a kernel.
horizon-blue
left a comment
There was a problem hiding this comment.
This LGTM! Similar to #24, it'd be great to have some documentations on the intersector :). Feel free to merge this in when you feel ready and I can rebase the camera stuff on top of yours
| CUDA_CALLABLE __forceinline__ Vec3D vecdiv(const Vec3D u, const Vec3D v) { | ||
| return {u.x / v.x, u.y / v.y, u.z / v.z}; | ||
| } |
There was a problem hiding this comment.
I'm fine with leaving this here since we're not in the header file anyways.
| float w0 = 0.0f, tf = 0.0f, sumexpd = 0.0f; | ||
| for (const auto& fmb : fmb_getter->get_metaballs(ray)) { | ||
| const auto& [t, d] = Intersector::intersect(fmb, ray); | ||
| const auto& [t, d] = Intersector::intersect(fmb, ray, extr); |
There was a problem hiding this comment.
Nice, so we don't really need to keep the starting location within Ray anymore :)
| cov_inv = fmb_rotmat.T @ np.diag(1 / fmb_extent) @ fmb_rotmat | ||
| t = ((fmb_mu - cam_tran) @ cov_inv @ v) / (v @ cov_inv @ v) | ||
| d = mahalanobis(fmb_mu, cam_tran + t * v, cov_inv) ** 2 | ||
| # cuda computation |
There was a problem hiding this comment.
You meant our C++ implementation? (I think this specific instance is actually not running on cuda haha)
|
Sick stuff! iirc the Linear Intersector, is the one used in the paper, so this is good |
5c46437 to
4b6e3a1
Compare
No description provided.