mholt/archives

Cross-platform library to create & extract archives, compress & decompress files, and walk virtual file systems across various formats

by mholt
Topics: 7zip, archives, brotli, bzip2, compression, extract, fs, go

Go

427 stars · 40 forks · 18 contributors · Active · 1mo ago · Since 2024 · v0.1.5 · MIT

Meet the team

mholt · 31 contributions
M0Rf30 · 3 contributions
sephriot · 3 contributions
darkliquid · 2 contributions
dpgarrick · 2 contributions
dirkmueller · 2 contributions
Gusted · 1 contribution
solvingj · 1 contribution

Languages

Go 100%

Commit activity

Last 12 weeks · 4 commits


Community health

2 of 6 standards met

57%
✓ README · ✓ License · ○ Contributing · ○ Code of Conduct · ○ Issue Template · ○ PR Template

Recent PRs & issues

Active · 2 in progress · Last activity 1mo ago
Support parallel multi-core XZ decompression when input is seekable (io.ReaderAt) · Open Issue

Hello! I would like to propose an optimization for XZ decompression. Currently, XZ decompression is performed sequentially in a single thread.

We have implemented a concurrent parallel block decompressor in a fork of the underlying dependency, which functions as a drop-in replacement. By leveraging the XZ format's native index block boundaries, the parallel decompressor parses the index backwards in O(1) time and decompresses independent blocks concurrently using a worker pool. To keep memory utilization bounded and limit garbage collector overhead, we also introduced pooling for both decompressed block buffers and the large LZMA decoder dictionary slices.

Benchmarks (on a 2-core / 4-thread CPU with a 20 MB payload):
Vanilla decompressor: ~14 MB/s
Optimized sequential decompressor: ~30 MB/s
Optimized parallel decompressor: ~70 MB/s

On systems with 4, 8, or more physical cores, the throughput scales near-linearly, easily exceeding 120 MB/s. We can utilize this optimization when the input source supports random access and the total compressed stream size is known.

Integration Instruction: To utilize the parallel reader within your Go code when dealing with random-access inputs, you can check if the underlying stream supports seeking and pass it to the parallel decompressor.

unxed · 1w ago
Enable parallel XZ decompression for seekable input streams · Open PR

Summary: This pull request implements multi-threaded, concurrent XZ decompression for independent blocks when the input stream supports random access. This change is a direct response to the issue described in #75.

Performance Impact: By leveraging block-level parallelism, decompression performance scales near-linearly with available CPU cores on multi-block XZ files.
Sequential baseline: ~14–32 MB/s
Parallel decompression (multi-core): up to 70–80 MB/s (on modern quad-core CPUs)

Changes & Implementation Details:
1. Dependency Switch: Replaced the original dependency with the fork. The fork includes a performance-optimized sequential decoder (using register caching and manual inlining) and the new worker pool implementation.
2. Interface Check: The implementation now checks whether the incoming stream supports both seeking and read-at access.
3. Parallel Activation: When seek/read-at capabilities are verified, we calculate the stream size, account for potential starting offsets (by wrapping the stream), and launch the concurrent decompressor.
4. Resilient Fallback: If the stream is unseekable (standard sequential pipes) or initialization fails, the decoder safely falls back to standard sequential decompression without errors.
5. Testing: Added a comprehensive suite of unit tests covering happy paths, offset streams, sequential fallbacks, seek error recovery, and partial reads.

Fixes #75

unxed · 1w ago
Support skipping UID and GID preservation during extraction · Open Issue

What would you like to have changed? I think this mostly applies to […], but may apply to others as well. There's currently a way to skip preserving UID and GID during the archive creation process. I'd like to request the same for extraction as well.

Why is this feature a useful, necessary, and/or important addition to this project? I'm using it inside a project where I don't really care about the UID and GID because they're modified later anyway. My terminal is flooded with a lot of warnings which I'd rather not see.

What alternatives are there, or what are you doing in the meantime to work around the lack of this feature? I'm thinking about processing the output, and potentially filtering these warnings out.

Please link to any relevant issues, pull requests, or other discussions. N/A

SinTan1729 · 2w ago

Recent fixes

ArchiveFS.ReadDir does not list implicit directories in ZIP archives · Closed Issue

What version of the package or command are you using? github.com/mholt/archives v0.1.5. Regressed from: https://github.com/mholt/archiver/issues/338

What are you trying to do? […] with a zip file that does not have explicit directory entries. Archiver works fine when there are explicit directory entries like: […]

What steps did you take? This returns: […]

What did you expect to happen, and what actually happened instead? Should return: […]

How do you think this should be fixed? I'm guessing the […] call should be smarter to understand that if there are files listed with directories not represented in the dir list that they should have a […] entry stubbed in.

Please link to any related issues, pull requests, and/or discussion. Original Fix: https://github.com/mholt/archiver/pull/339 Original Bug: https://github.com/mholt/archiver/issues/338

Bonus: What do you use this package for, and do you have any other suggestions or feedback? […] to browse different archives and archive file nesting.

jeremyje · 1w ago
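
The fix suggested in the issue above, stubbing in directories that are implied by file paths but have no entry of their own, can be illustrated with the standard archive/zip package. This is a sketch of the general technique only, not the library's actual fix; implicitDirs is a hypothetical helper name.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"path"
	"sort"
	"strings"
)

// implicitDirs returns every directory implied by the entry names that has
// no explicit entry (a name ending in "/") of its own, sorted lexically.
func implicitDirs(names []string) []string {
	explicit := map[string]bool{}
	for _, n := range names {
		if strings.HasSuffix(n, "/") {
			explicit[strings.TrimSuffix(n, "/")] = true
		}
	}
	seen := map[string]bool{}
	for _, n := range names {
		// Walk up the parent chain until the archive root is reached.
		for d := path.Dir(strings.TrimSuffix(n, "/")); d != "." && d != "/"; d = path.Dir(d) {
			if !explicit[d] {
				seen[d] = true
			}
		}
	}
	dirs := make([]string, 0, len(seen))
	for d := range seen {
		dirs = append(dirs, d)
	}
	sort.Strings(dirs)
	return dirs
}

func main() {
	// Build a small in-memory zip that contains files but no directory entries.
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for _, name := range []string{"a/b/c.txt", "a/d.txt"} {
		if _, err := zw.Create(name); err != nil {
			panic(err)
		}
	}
	if err := zw.Close(); err != nil {
		panic(err)
	}

	zr, err := zip.NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()))
	if err != nil {
		panic(err)
	}
	var names []string
	for _, f := range zr.File {
		names = append(names, f.Name)
	}
	fmt.Println(implicitDirs(names)) // prints [a a/b]
}
```

A directory listing could merge these synthesized names with the explicit entries before returning results, so archives written without directory entries still appear navigable.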
Upgrade sevenzip to 1.6.2 to fix `sevenzip: unsupported compression algorithm` on 7z ARM64 archives · Closed Issue

What version of the package or command are you using? v0.1.5

What are you trying to do? Unpack ImageMagick release archive.

What steps did you take? Created a PR to add ImageMagick to the aqua registry in https://github.com/aquaproj/aqua-registry/pull/50281 and ran CI tests that failed to unpack the ARM64 archive.

What did you expect to happen, and what actually happened instead? Expected the archive to unpack OK, got error `sevenzip: unsupported compression algorithm`, see https://github.com/aquaproj/aqua/issues/4639

How do you think this should be fixed? According to @bodgit, author of sevenzip, it can be fixed by upgrading to v1.6.2, released 3 weeks ago with support for ARM64 executable compression, see https://github.com/bodgit/sevenzip/issues/449

Please link to any related issues, pull requests, and/or discussion. Original PR that hit the error: https://github.com/aquaproj/aqua-registry/pull/50281 Aqua issue: https://github.com/aquaproj/aqua/issues/4639 Sevenzip issue: https://github.com/bodgit/sevenzip/issues/449

Bonus: What do you use this package for, and do you have any other suggestions or feedback? Aqua is a package management tool and uses archives to unpack release archives.

iki · 4w ago
Structured data for AI agents

Repository: mholt/archives
Description: Cross-platform library to create & extract archives, compress & decompress files, and walk virtual file systems across various formats
Stars: 427, Forks: 40
Primary language: Go. Languages: Go (100%)
License: MIT
Homepage: https://pkg.go.dev/github.com/mholt/archives
Topics: 7zip, archives, brotli, bzip2, compression, extract, fs, go, golang, gzip, lz4, lzip, rar, snappy, streams, tar, xz, zip, zlib, zstandard
Latest release: v0.1.5 (8mo ago)
Open PRs: 2, open issues: 6
Last activity: 1mo ago
Community health: 57%
Top contributors: mholt, M0Rf30, sephriot, darkliquid, dpgarrick, dirkmueller, Gusted, solvingj, joonas, mikelolasagasti and others

