Support for bivariate convolution & normalization with offset (#126)

* Add bivariate-convolution
* Add pair-wise convolution + tests + benches
* Add take_cnv_pvec_[left/right] to Scratch & updated CHANGELOG.md
* cross-base2k normalization with positive offset
* clippy & fix CI doctest avx compile error
* more streamlined bounds derivation for normalization
* Working cross-base2k normalization with pos/neg offset
* Update normalization API & tests
* Add glwe tensoring test
* Add relinearization + preliminary test
* Fix GGLWEToGGSW key infos
* Add (X,Y) convolution by const (1, Y) poly
* Faster normalization test + add bench for cnv_by_const
* Update changelog
2026-02-10 05:06:44 +01:00 · 2025-12-21 16:56:42 +01:00
parent 76424d0ab5
commit 4e90e08a71
219 changed files with 6571 additions and 5041 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,10 +1,101 @@
 # CHANGELOG

-## [0.4.0] - 2025-10-27
+## [0.4.1] - 2025-11-26
+
+### Summary
+- Update convolution API to match spqlios-arithmetic & removed API for bivariate tensoring.
+
+## `poulpy-hal`
+- Removed `Backend` generic from `VecZnxBigAllocBytesImpl`.
+- Add `CnvPVecL` and `CnvPVecR` structs.
+- Add `CnvPVecBytesOf` and `CnvPVecAlloc` traits.
+- Add `Convolution` trait, which regroups the following methods:
+  - `cnv_prepare_left_tmp_bytes`
+  - `cnv_prepare_left`
+  - `cnv_prepare_right_tmp_bytes`
+  - `cnv_prepare_right`
+  - `cnv_by_const_apply`
+  - `cnv_by_const_apply_tmp_bytes`
+  - `cnv_apply_dft_tmp_bytes`
+  - `cnv_apply_dft`
+  - `cnv_pairwise_apply_dft_tmp_bytes`
+  - `cnv_pairwise_apply_dft`
+- Add the following Reim4 traits:
+  - `Reim4Convolution`
+  - `Reim4Convolution1Coeff`
+  - `Reim4Convolution2Coeffs`
+  - `Reim4Save1BlkContiguous`
+- Add the following traits:
+  - `i64Save1BlkContiguous`
+  - `i64Extract1BlkContiguous`
+  - `i64ConvolutionByConst1Coeff`
+  - `i64ConvolutionByConst2Coeffs`
+- Update signature `Reim4Extract1Blk` to `Reim4Extract1BlkContiguous`.
+- Add fft64 backend reference code for 
+  - `reim4_save_1blk_to_reim_contiguous_ref`
+  - `reim4_convolution_1coeff_ref`
+  - `reim4_convolution_2coeffs_ref`
+  - `convolution_prepare_left`
+  - `convolution_prepare_right`
+  - `convolution_apply_dft_tmp_bytes`
+  - `convolution_apply_dft`
+  - `convolution_pairwise_apply_dft_tmp_bytes`
+  - `convolution_pairwise_apply_dft`
+  - `convolution_by_const_apply_tmp_bytes`
+  - `convolution_by_const_apply`
+- Add `take_cnv_pvec_left` and `take_cnv_pvec_right` methods to `ScratchTakeBasic` trait.
+- Add the following test methods for convolution:
+  - `test_convolution`
+  - `test_convolution_by_const`
+  - `test_convolution_pairwise`
+- Add the following bench methods for convolution:
+  - `bench_cnv_prepare_left`
+  - `bench_cnv_prepare_right`
+  - `bench_cnv_apply_dft`
+  - `bench_cnv_pairwise_apply_dft`
+  - `bench_cnv_by_const`
+- Update normalization API and OEP to take `res_offset: i64`. This lets the user specify a positive or negative bit-shift for the normalization. The bit-shift is applied before the normalization (i.e. before the mod 1 reduction). Since this is an API break, the opportunity was also taken to re-order inputs for better consistency.
+  - `VecZnxNormalize` & `VecZnxNormalizeImpl`
+  - `VecZnxBigNormalize` & `VecZnxBigNormalizeImpl`
+  This change completes full support for cross-base2k normalization with arbitrary positive or negative offsets. The code is not guaranteed to be optimal, but correctness is ensured.
+
+## `poulpy-cpu-ref`
+- Implemented `ConvolutionImpl` OEP on `FFT64Ref` backend.
+- Add benchmark for convolution.
+- Add test for convolution.
+
+## `poulpy-cpu-avx`
+- Implemented `ConvolutionImpl` OEP on `FFT64Avx` backend.
+- Add benchmark for convolution.
+- Add test for convolution.
+- Add fft64 AVX code for
+  - `reim4_save_1blk_to_reim_contiguous_avx`
+  - `reim4_convolution_1coeff_avx`
+  - `reim4_convolution_2coeffs_avx`
+
+## `poulpy-core`
+- Renamed `size` to `limbs`.
+- Add `GLWEMulPlain` trait:
+  - `glwe_mul_plain_tmp_bytes`
+  - `glwe_mul_plain`
+  - `glwe_mul_plain_inplace`
+- Add `GLWEMulConst` trait:
+  - `glwe_mul_const_tmp_bytes`
+  - `glwe_mul_const`
+  - `glwe_mul_const_inplace`
+- Add `GLWETensoring` trait:
+  - `glwe_tensor_apply_tmp_bytes`
+  - `glwe_tensor_apply`
+  - `glwe_tensor_relinearize_tmp_bytes`
+  - `glwe_tensor_relinearize`
+- Add method tests:
+  - `test_glwe_tensoring`
+
+## [0.4.0] - 2025-11-20

 ### Summary
 - Full support for base2k operations.
- Many improvments to BDD arithmetic.
+- Many improvements to BDD arithmetic.
 - Removal of **poulpy-backend** & spqlios backend.
 - Addition of individual crates for each specific backend.
 - Some minor bug fixes.
@@ -28,7 +119,7 @@
 - Improved Cmux speed

 ### `poulpy-cpu-ref`
- A new crate that provides the refernce CPU implementation of **poulpy-hal**. This replaces the previous **poulpy-backend/cpu_ref**.
+- A new crate that provides the reference CPU implementation of **poulpy-hal**. This replaces the previous **poulpy-backend/cpu_ref**.

 ### `poulpy-cpu-avx`
 - A new crate that provides an AVX/FMA accelerated CPU implementation of **poulpy-hal**. This replaces the previous **poulpy-backend/cpu_avx**.
@@ -76,7 +167,7 @@
 - Added functionality-based traits, which removes the need to import the low-levels traits of `poulpy-hal` and makes backend agnostic code much cleaner. For example instead of having to import each individual traits required for the encryption of a GLWE, only the trait `GLWEEncryptSk` is needed.

 ### `poulpy-schemes`
- - Added basic framework for binary decicion circuit (BDD) arithmetic along with some operations.
+ - Added basic framework for binary decision circuit (BDD) arithmetic along with some operations.

 ## [0.2.0] - 2025-09-15