zig/lib/std/math
Marc Tiehuis 53e6c719ef std/math: optimize division with divisors less than a half-limb
This adds a new path which avoids the compiler_rt-generated udivmod
calls when the divisor is less than a half-limb, i.e. it fits in half
the bits of a usize. Two half-limb divisions are performed instead,
which ensures that native, non-emulated division instructions are
actually used. This does not improve the udivmod code itself, which
should still be reviewed independently of this issue.
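
The half-limb idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this commit: `Limb` is fixed to u64 here, and `divHalfLimb`, `half_bits`, and `half_mask` are hypothetical names chosen for the example.

```zig
const std = @import("std");

// Hypothetical names for illustration; not the identifiers used in std.math.big.
const Limb = u64;
const half_bits = 32;
const half_mask: Limb = (1 << half_bits) - 1;

/// Divides the little-endian limb array `a` by `b`, writes the quotient to `quo`
/// and returns the remainder. Assumes 0 < b < 2^32, so that every intermediate
/// value fits in a single limb and no double-width division is needed.
fn divHalfLimb(quo: []Limb, a: []const Limb, b: Limb) Limb {
    std.debug.assert(b != 0 and b <= half_mask);
    std.debug.assert(quo.len >= a.len);

    var rem: Limb = 0;
    var i: usize = a.len;
    while (i > 0) {
        i -= 1;

        // High half of the limb, with the running remainder carried in above it.
        // Since rem < b < 2^32, (rem << 32) cannot overflow a 64-bit limb.
        const hi = (rem << half_bits) | (a[i] >> half_bits);
        const q_hi = hi / b;
        rem = hi % b;

        // Low half of the limb, again with the remainder carried in.
        const lo = (rem << half_bits) | (a[i] & half_mask);
        const q_lo = lo / b;
        rem = lo % b;

        // Each partial quotient is below 2^32, so the halves recombine exactly.
        quo[i] = (q_hi << half_bits) | q_lo;
    }
    return rem;
}

pub fn main() void {
    const a = [_]Limb{ 5, 1 }; // 2^64 + 5, least significant limb first
    var quo: [2]Limb = undefined;
    const rem = divHalfLimb(&quo, &a, 10);
    std.debug.print("quo = {{ {}, {} }}, rem = {}\n", .{ quo[0], quo[1], rem });
}
```

Because the remainder is always smaller than the divisor, each dividend in the loop stays within one limb, which is what lets the compiler emit an ordinary hardware divide instead of a call into compiler_rt.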

Notably, this considerably improves the performance of the toString
implementation for non-power-of-two bases.

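Division performance improves by roughly 1000% in some coarse testing:
subtracting the no-print baseline from the timings below, the toString
step drops from about 12.2 secs to about 1.0 secs.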

The following test code is used to provide a rough comparison between
the old vs. new method.

```
const std = @import("std");
const Managed = std.math.big.int.Managed;

const allocator = std.heap.c_allocator;

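// Advances `a` through n Fibonacci steps; with a = 0 on entry, a = F(n) on return.
fn fib(a: *Managed, n: usize) !void {
    var b = try Managed.initSet(allocator, 1);
    defer b.deinit();
    var c = try Managed.init(allocator);
    defer c.deinit();

    var i: usize = 0;
    while (i < n) : (i += 1) {
        try c.add(a.toConst(), b.toConst());

        // Rotate: a takes the old b, b takes the new sum.
        a.swap(&b);
        b.swap(&c);
    }
}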

pub fn main() !void {
    var a = try Managed.initSet(allocator, 0);
    defer a.deinit();

    try fib(&a, 1_000_000);

    // Note: Next two lines (and printed digit count) omitted on no-print version.
    const as = try a.toString(allocator, 10, .lower);
    defer allocator.free(as);

    std.debug.print("fib: digit count: {}, limb count: {}\n", .{ as.len, a.limbs.len });
}
```

```
==> time.no-print <==
limb count: 10849

________________________________________________________
Executed in   10.60 secs    fish           external
   usr time   10.44 secs    0.00 millis   10.44 secs
   sys time    0.02 secs    1.12 millis    0.02 secs

==> time.old <==
fib: digit count: 208988, limb count: 10849

________________________________________________________
Executed in   22.78 secs    fish           external
   usr time   22.43 secs    1.01 millis   22.43 secs
   sys time    0.03 secs    0.13 millis    0.03 secs

==> time.optimized <==
fib: digit count: 208988, limb count: 10849

________________________________________________________
Executed in   11.59 secs    fish           external
   usr time   11.56 secs    1.03 millis   11.56 secs
   sys time    0.03 secs    0.12 millis    0.03 secs
```

Perf data for the non-optimized and optimized builds, verifying that no
udivmod call is generated by the new code.

```
$ perf report -i perf.data.old --stdio
- Total Lost Samples: 0
-
- Samples: 90K of event 'cycles:u'
- Event count (approx.): 71603695208
-
- Overhead  Command  Shared Object     Symbol
- ........  .......  ................  ...........................................
-
    52.97%  t        t                 [.] compiler_rt.udivmod.udivmod
    45.97%  t        t                 [.] std.math.big.int.Mutable.addCarry
     0.83%  t        t                 [.] main
     0.08%  t        libc-2.33.so      [.] __memmove_avx_unaligned_erms
     0.08%  t        t                 [.] __udivti3
     0.03%  t        [unknown]         [k] 0xffffffff9a0010a7
     0.02%  t        t                 [.] std.math.big.int.Managed.ensureCapacity
     0.01%  t        libc-2.33.so      [.] _int_malloc
     0.00%  t        libc-2.33.so      [.] __malloc_usable_size
     0.00%  t        libc-2.33.so      [.] _int_free
     0.00%  t        t                 [.] 0x0000000000004a80
     0.00%  t        t                 [.] std.heap.CAllocator.resize
     0.00%  t        libc-2.33.so      [.] _mid_memalign
     0.00%  t        libc-2.33.so      [.] sysmalloc
     0.00%  t        libc-2.33.so      [.] __posix_memalign
     0.00%  t        t                 [.] std.heap.CAllocator.alloc
     0.00%  t        ld-2.33.so        [.] do_lookup_x

$ perf report -i perf.data.optimized --stdio
- Total Lost Samples: 0
-
- Samples: 46K of event 'cycles:u'
- Event count (approx.): 36790112336
-
- Overhead  Command  Shared Object     Symbol
- ........  .......  ................  ...........................................
-
    79.98%  t        t                 [.] std.math.big.int.Mutable.addCarry
    15.14%  t        t                 [.] main
     4.58%  t        t                 [.] std.math.big.int.Managed.ensureCapacity
     0.21%  t        libc-2.33.so      [.] __memmove_avx_unaligned_erms
     0.05%  t        [unknown]         [k] 0xffffffff9a0010a7
     0.02%  t        libc-2.33.so      [.] _int_malloc
     0.01%  t        t                 [.] std.heap.CAllocator.alloc
     0.01%  t        libc-2.33.so      [.] __malloc_usable_size
     0.00%  t        libc-2.33.so      [.] systrim.constprop.0
     0.00%  t        libc-2.33.so      [.] _mid_memalign
     0.00%  t        t                 [.] 0x0000000000000c7d
     0.00%  t        libc-2.33.so      [.] malloc
     0.00%  t        ld-2.33.so        [.] check_match
```

Closes #10630.
2022-02-06 21:39:34 -05:00

| Name | Last commit | Date |
| --- | --- | --- |
| big | std/math: optimize division with divisors less than a half-limb | 2022-02-06 21:39:34 -05:00 |
| complex | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| __rem_pio2.zig | std/math: fix __rem_pio2 underflow | 2021-12-06 01:02:09 +13:00 |
| __rem_pio2_large.zig | std/math: replace golang sin/cos/tan with musl sin/cos/tan | 2021-12-06 01:02:09 +13:00 |
| __rem_pio2f.zig | std/math: replace golang sin/cos/tan with musl sin/cos/tan | 2021-12-06 01:02:09 +13:00 |
| __trig.zig | std/math: replace golang sin/cos/tan with musl sin/cos/tan | 2021-12-06 01:02:09 +13:00 |
| acos.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| acosh.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| asin.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| asinh.zig | Fix bug where std.math.asinh64 doesn't respect signedness for negative values (#9940) | 2021-10-15 13:55:40 -04:00 |
| atan.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| atan2.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| atanh.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| big.zig | std/math: optimize division with divisors less than a half-limb | 2022-02-06 21:39:34 -05:00 |
| cbrt.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| ceil.zig | zig libc: export floorl and ceill | 2021-10-22 10:48:45 -07:00 |
| complex.zig | std lib API deprecations for the upcoming 0.9.0 release | 2021-11-30 00:13:07 -07:00 |
| copysign.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| cos.zig | std/math: hide internal cos/tan functions | 2021-12-06 01:17:01 +13:00 |
| cosh.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| epsilon.zig | std: add f80 bits | 2022-01-28 11:45:04 -07:00 |
| exp.zig | Fix copy-paste error that results in incorrect results from exp64() | 2021-11-15 19:40:03 -05:00 |
| exp2.zig | Fix bug in exp2() (#9999) | 2021-10-26 18:57:58 -04:00 |
| expm1.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| expo2.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| fabs.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| floor.zig | zig libc: export floorl and ceill | 2021-10-22 10:48:45 -07:00 |
| fma.zig | freestanding libc: export fmal | 2021-10-05 16:56:46 -07:00 |
| frexp.zig | add support for f128 @mulAdd | 2021-10-05 12:32:26 -07:00 |
| hypot.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| ilogb.zig | add support for f128 @mulAdd | 2021-10-05 12:32:26 -07:00 |
| inf.zig | std: add f80 bits | 2022-01-28 11:45:04 -07:00 |
| isfinite.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| isinf.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| isnan.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| isnormal.zig | Fix overflow in std.math.isNormal when applied to -Inf or a negative NaN | 2022-01-29 18:11:49 +02:00 |
| ldexp.zig | std/math: add ldexp and make scalbn an alias | 2021-11-23 14:47:01 -05:00 |
| ln.zig | AstGen: use reachableExpr for return operand | 2021-11-24 14:47:33 -07:00 |
| log.zig | AstGen: use reachableExpr for return operand | 2021-11-24 14:47:33 -07:00 |
| log1p.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| log2.zig | AstGen: use reachableExpr for return operand | 2021-11-24 14:47:33 -07:00 |
| log10.zig | AstGen: use reachableExpr for return operand | 2021-11-24 14:47:33 -07:00 |
| modf.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| nan.zig | std: add f80 bits | 2022-01-28 11:45:04 -07:00 |
| pow.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| powi.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| round.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| scalbn.zig | std/math: add ldexp and make scalbn an alias | 2021-11-23 14:47:01 -05:00 |
| signbit.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| sin.zig | std/math: Add test cases for #9901 | 2021-12-06 01:02:09 +13:00 |
| sinh.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| sqrt.zig | AstGen: use reachableExpr for return operand | 2021-11-24 14:47:33 -07:00 |
| tan.zig | std/math: hide internal cos/tan functions | 2021-12-06 01:17:01 +13:00 |
| tanh.zig | remove redundant license headers from zig standard library | 2021-08-24 12:25:09 -07:00 |
| trunc.zig | libc: Export truncl | 2021-10-24 19:17:55 +02:00 |