I'm opening this ticket to discuss a potential change in Hyperfine's default behavior. For now, we always use an intermediate shell to run the benchmarked programs. This allows users to make use of the full shell syntax: compound commands, pipes, redirections, and so on, which would not be possible without a shell. The shell-spawn time is automatically subtracted by Hyperfine, but it still adds noise to the benchmark.

The potential new default would be the no-shell mode, which can already be used today. This would have several advantages:

- There is no additional noise in the measurement from launching the shell; we run the program directly.
- All performance metrics/counters apply directly to the benchmarked program.
- We do not have to perform any calibration or subtraction to account for the intermediate shell.
- Not having to perform calibration saves some time (~100 ms).

Disadvantages:

- You need to split compound commands yourself, expand environment variables yourself, etc., or explicitly opt back into a shell when needed. Note that quoting is supported in no-shell mode, though, so quoted arguments work fine.
- This might be confusing to users who expect shell syntax to be available (i.e., all Hyperfine 1.0 users). If we simply fail, that's probably okay, but there might be cases where it's harder to debug what's going on (e.g., an environment-variable reference passed to the program unexpanded).
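To make the "unexpanded variable" surprise concrete, here is a minimal sketch (not Hyperfine's actual code) of the two spawn strategies, assuming a Unix system where the shell mode is roughly equivalent to `sh -c <cmdline>`:

```rust
use std::process::Command;

/// Run a program directly, no shell involved, and return its trimmed stdout.
fn run_direct(program: &str, args: &[&str]) -> String {
    let out = Command::new(program)
        .args(args)
        .output()
        .expect("failed to spawn program");
    String::from_utf8_lossy(&out.stdout).trim().to_string()
}

/// Run a command line through `sh -c`, roughly what shell mode does on Unix.
fn run_via_shell(cmdline: &str) -> String {
    run_direct("sh", &["-c", cmdline])
}

fn main() {
    // Shell mode: $HOME is expanded by the shell before echo sees it.
    println!("shell:    {}", run_via_shell("echo $HOME"));
    // No-shell mode: "$HOME" is an ordinary argument and reaches echo
    // unexpanded -- exactly the kind of hard-to-debug surprise above.
    println!("no shell: {}", run_direct("echo", &["$HOME"]));
}
```

Running this prints the expanded home directory in the first line and the literal string `$HOME` in the second, which is the behavior difference users migrating from the shell default would hit.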
Repository: sharkdp/hyperfine — a command-line benchmarking tool written in Rust (Apache-2.0).
Hi @sharkdp. First, thanks for this great utility! It would be nice if, after a batch of runs is complete and statistical data is available, I could have the option to auto-discard outliers and re-compute the statistics. The threshold should be configurable, but I think 3 sigma is a reasonable "sane default" (debatable, though). We could also consider a limit on the number of runs that can be discarded, to prevent accidentally hiding a "double hump" histogram profile that would actually be good to know about. Something like "no more than 5% of runs can be discarded w/o warning". I don't want to overcomplicate it; just a simple filter would be a great start. Thanks again!
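The proposed filter could be sketched like this. All names here are assumptions for illustration, not Hyperfine's actual API; it drops runs more than `sigma` standard deviations from the mean, but refuses to filter at all if that would discard more than `max_frac` of the runs:

```rust
/// Hypothetical sketch of the proposed outlier filter (not hyperfine's API).
/// Drops runs more than `sigma` standard deviations from the mean, unless
/// doing so would discard more than `max_frac` of all runs -- in that case
/// the data is returned untouched, since a "double hump" distribution is
/// worth seeing rather than hiding.
fn discard_outliers(times: &[f64], sigma: f64, max_frac: f64) -> Vec<f64> {
    let n = times.len() as f64;
    let mean = times.iter().sum::<f64>() / n;
    let var = times.iter().map(|t| (t - mean).powi(2)).sum::<f64>() / n;
    let sd = var.sqrt();

    let kept: Vec<f64> = times
        .iter()
        .copied()
        .filter(|t| (t - mean).abs() <= sigma * sd)
        .collect();

    // Discard cap: too many "outliers" suggests a multi-modal profile.
    if (times.len() - kept.len()) as f64 > max_frac * n {
        times.to_vec()
    } else {
        kept
    }
}

fn main() {
    // 20 runs near 1.0 s plus one obvious 5.0 s outlier.
    let mut runs = vec![1.0_f64; 20];
    runs.push(5.0);
    let filtered = discard_outliers(&runs, 3.0, 0.05);
    println!("{} of {} runs kept", filtered.len(), runs.len());
}
```

One caveat worth noting for a real implementation: a single extreme outlier inflates the standard deviation itself, so with very few runs even a 3-sigma rule may fail to flag it; a robust statistic (e.g., median absolute deviation) might be a better basis for the threshold.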