I recently tried using Claude to look at my ubi project
and tell me about any bugs it found. To avoid having to pay for Claude Code, I used the web API and
uploaded a tarball of the ubi crate, which
contains all the important bits of ubi.
I was curious to see how well it would do. AI tools for coding have improved a lot in the past year (and probably in the past month and week). Would it find a lot of bugs I’d never noticed, or would it be full of nonsense?
Here’s the full report it produced:
Bug Analysis: UBI Asset Selection Logic
Overview
This document contains a detailed analysis of potential bugs and issues in the ubi (Universal Binary
Installer) project’s asset selection logic, based on reviewing the source code in picker.rs,
installer.rs, archive.rs, os.rs, and arch.rs.
Click to expand …
picker.rs,
installer.rs, archive.rs, os.rs, and arch.rs.Critical Bugs
1. Android Detection Too Broad, Linux ARM64 Confusion
Location: os.rs, line 7 and picker.rs, lines 170-173
Issue: The Android regex is simply (?i:android) without word boundaries, and the special-case
filtering only applies when platform.target_os != OS::Android:
| |
Problems:
- The Android regex can match filenames containing “android” anywhere, even as part of another word
- The logic assumes Linux matches will come first, then filters Android, but both might match
- A file like
myapp-armandroid-linux.tar.gzmight be incorrectly filtered
Impact: This caused the bug mentioned in release notes where ubi incorrectly picked Android releases on Linux aarch64.
Fix: Use word boundaries for Android regex: regex!(r"(?i:(?:\b|_)android(?:\b|_))") and ensure
proper OS precedence.
2. Exact Match Doesn’t Check Executability 🔴 CRITICAL
Location: installer.rs, lines 368-370
Issue: When an exact filename match is found in an archive, the code immediately returns it WITHOUT checking if the file is executable:
| |
This is even documented in the README: “Note that if it finds an exact match, it does not check the file’s mode.”
Impact:
- On Unix systems, if an archive contains both a non-executable script and an executable binary with the project name, the non-executable might be selected
- A malicious or poorly structured archive could have a non-executable data file with the right name that gets selected over an actual executable
Example Failure:
| |
The non-executable myapp would be selected.
Fix: Always check executability, even for exact matches on Unix systems.
3. ARM Regex Too Permissive on macOS
Location: arch.rs, lines 40-62
Issue: The macos_aarch64_only_re() includes just arm as a match:
| |
Impact:
- Will match
armv7,armhf,arm-linux,armchair, etc. - Could incorrectly select 32-bit ARM binaries on ARM64 macOS
- Recent fix changed from “arm64” to “arm”, making it even broader
Example: A file named myapp-darwin-armv7.tar.gz would match on macOS ARM64.
Fix: Be more specific: require either 64 indicators or use negative lookahead to exclude 32-bit variants.
Significant Bugs
4. 64-bit Filter Uses Naive String Search ⚠️ HIGH SEVERITY
Location: picker.rs, lines 322-335
Issue: The 64-bit filtering uses a simple substring search for “64”:
| |
Problems:
- Matches version numbers:
myapp-v1.64.2-linux-i686.tar.gzwould be considered 64-bit even though it’s i686 (32-bit) - Matches years:
myapp-2064-edition-arm32.tar.gzwould be considered 64-bit - Matches hashes/IDs:
myapp-build-abc64def-armv7.tar.gzwould be considered 64-bit - Matches other contexts:
sha64,base64,sha256464, etc.
Impact: Can incorrectly include 32-bit binaries in the “64-bit” filter results. While x86_64 can run 32-bit x86 binaries, this breaks down for other architectures:
- On ARM64, should prefer
aarch64overarm32 - On x86_64, should prefer native 64-bit over 32-bit compatibility mode
- The wrong binary might be selected based on alphabetical ordering if both pass through
Example Failure Scenarios:
Scenario 1 - Version number interference:
| |
Scenario 2 - Build IDs:
| |
Proper Fix - Architecture-Specific 64-bit Detection:
Replace the naive string search with architecture-aware regex matching:
| |
Alternative Simpler Fix (if full regex solution is too complex):
Use a more restrictive substring search that requires word boundaries around “64”:
| |
Benefits of the Fix:
- Architecture-specific matching prevents false positives
- Explicitly excludes version numbers like “1.64.0”
- Requires “64” to be part of an architecture indicator (word boundaries)
- Falls back gracefully if no 64-bit indicators found
- More maintainable - uses existing architecture regexes
Testing Required:
myapp-v1.64.0-linux-i686.tar.gzshould NOT match on x86_64myapp-linux-x86_64.tar.gzSHOULD match on x86_64myapp-aarch64-darwin.tar.gzSHOULD match on ARM64myapp-arm64-v2.64.tar.gzSHOULD match on ARM64 (arm64 indicator present, version ignored)
5. MIPS Regex Ordering Issue
Location: arch.rs, lines 122-138 and 158-174
Issue: The mips_re() regex will match mips64 because it just looks for mips:
| |
Impact: On a 32-bit MIPS platform, it might match 64-bit assets. The code relies on checking
mips64 first, but if regexes are applied in wrong order, this fails.
Fix: Add negative lookahead: mips(?!64) or be more careful about ordering.
6. Musl Detection Relies on External Binary
Location: Mentioned in docs; detection happens before picker
Issue: The musl detection runs ldd /bin/ls to check for musl.
Problems:
/bin/lsmight not exist on minimal systems or non-FHS systemslddmight not be in PATH- The command might fail for other reasons
- Users can’t easily override false negatives
Impact: On musl systems without /bin/ls, detection fails and glibc binaries might be selected.
Fix:
- Check multiple common binaries
- Provide better fallback mechanisms
- Make the override mechanism more discoverable
7. Libc Filtering is Too Lenient
Location: picker.rs, lines 235-258
Issue: The libc matching allows assets without any libc indicator:
| |
Impact: On musl systems, if there’s an asset with no libc indicator and one with -gnu, the
code filters out -gnu but accepts the unmarked one, which might actually be glibc-linked.
Example:
- Platform: x86_64-unknown-linux-musl
- Assets:
app-linux-x86_64.tar.gz(actually glibc),app-linux-x86_64-gnu.tar.gz - Result: Selects the unmarked one, which might not work on musl
Fix: When on musl and multiple matches exist, prefer ones explicitly marked -musl.
8. Windows Extension Checking
Location: installer.rs, lines 376-384
Issue: On Windows, the code assumes files are executable without checking extensions properly:
| |
Problems:
- Trusts that partial matches on Windows are executable
- Might select data files like
myapp.txtif they start with the right name
Fix: On Windows, verify the file has .exe or .bat extension before accepting partial
matches.
9. Sorting Tiebreaker is Arbitrary
Location: picker.rs, lines 293-302
Issue: When multiple assets remain after all filtering, the code sorts alphabetically and picks the first:
| |
Problems:
- Completely arbitrary - depends on naming conventions
aaa-tool-gnu.tar.gzbeatszzz-tool-musl.tar.gzalphabetically- No way to make this deterministic across projects with different naming
- Can lead to selecting the wrong variant (GNU over musl, etc.)
Impact: Unpredictable behavior when multiple variants exist. The selected asset might not be optimal.
Fix: Add better heuristics:
- Prefer tarballs over zip
- Prefer explicit libc matches over unmarked
- Consider file size (larger = more likely to be feature-complete)
- Make the selection criteria more explicit
Minor Issues
10. Partial Match Case Sensitivity
Location: installer.rs, lines 410-420
Issue: The partial match checks case insensitively for the extension but case sensitively for the prefix:
| |
Impact: An archive with MyApp won’t match if looking for myapp, even though the extension
matching is case-insensitive.
Fix: Make prefix matching case-insensitive too, or document the case-sensitivity requirement.
11. No Validation of Extension Parsing
Location: picker.rs, lines 121-123
Issue: Extension parsing errors are silently ignored:
| |
Impact: If there’s a bug in extension parsing, valid assets might be silently skipped without clear errors to the user.
Fix: Accumulate skipped assets and report them in debug mode or if no assets match.
12. --matching Only Applied When Multiple Matches
Location: picker.rs, lines 338-355
Issue: The --matching filter is only applied when matches.len() > 1:
| |
Actually looking at the code flow in pick_asset_from_matches (line 270), this is called even with
1 match, but the documentation says it’s ignored with one match.
Impact: User confusion - they set --matching but it might not have the effect they expect if
there’s only one platform match.
Fix: Document this behavior more clearly or apply matching earlier in the pipeline.
Edge Cases & Robustness Issues
13. No Fallback When 64-bit Filter Removes All Assets
Location: picker.rs, lines 322-335
Issue: If all assets contain “64” but none are actually 64-bit, they all pass through. But if some contain “64” and some don’t, only “64” ones are kept.
Example:
- Platform: x86_64
- Assets:
app-v1.64-i686.tar.gz,app-v1.64-x86_64.tar.gz - Both contain “64”, both pass through
- Then alphabetical sort picks one
Impact: Can still select wrong architecture even after “64-bit filtering.”
14. Zip and 7z Archives Don’t Track Executability
Location: archive.rs, lines 100-103, 160-163
Issue: These archive formats don’t preserve Unix executable bits:
| |
Impact: On Unix, partial matches from zip files are always accepted (because None is treated
as executable), even if the file isn’t actually executable in the archive.
Fix: On Unix, after extracting from zip/7z, verify the file is executable or try to make it executable.
15. Race Condition in Tarball Iteration
Location: installer.rs, lines 129-138
Issue: The code iterates through the tarball twice - once to find the best match, then again to extract it:
| |
Impact:
- If the file is modified between iterations (unlikely but possible)
- If the archive is malformed, the second iteration might behave differently
- Performance cost of reading the archive twice
Fix: This is noted in comments as necessary due to API limitations. Could cache the path and then use tar’s by-name extraction if available.
Recommendations
High Priority Fixes
- Fix 64-bit filtering logic - Use architecture-specific regex matching instead of naive string search for “64”
- Add executable check for exact matches - Critical for security and correctness
- Improve ARM regex specificity - Prevent matching 32-bit ARM variants
- Fix Android detection - Use word boundaries
Medium Priority
- Better musl detection - Add fallbacks and try multiple binaries
- Smarter tiebreaker - Use better heuristics than alphabetical sort
- MIPS regex fix - Prevent mips_re from matching mips64
Low Priority
- Case-insensitive partial matching - More forgiving for users
- Better error messages - Show what was skipped and why
- Document limitations - Especially around zip/7z executability
Testing Recommendations
Unit Tests Needed
- Test 64-bit filter with version numbers -
myapp-v1.64.0-linux-i686.tar.gzshould NOT be selected on x86_64 - Test 64-bit filter with correct indicators -
myapp-linux-x86_64.tar.gzSHOULD be selected on x86_64 - Test Android filtering with various Linux ARM64 asset names
- Test exact match with non-executable file
- Test ARM regex with armv7, armhf variants
- Test musl filtering with unmarked binaries
- Test partial match case sensitivity
Integration Tests Needed
- Test full asset selection with real-world release patterns
- Test with archives containing multiple executables
- Test cross-platform with various naming conventions
- Test error cases (no matching assets, ambiguous matches)
Property Tests
- Asset selection should be deterministic given same inputs
- Selected asset should always be compatible with platform
- Filtering should monotonically decrease candidate set
Conclusion
The asset selection logic in ubi is generally well-structured but has several bugs that can lead to incorrect asset selection, particularly around:
- 64-bit architecture detection (using naive string search instead of architecture-aware matching)
- Architecture matching for ARM variants and Android/Linux disambiguation
- Executable detection in archives (exact matches and zip/7z formats)
- Tiebreaking when multiple valid options exist
The most critical fix needed is improving the 64-bit filtering logic to use architecture-specific regex patterns instead of a simple substring search for “64”, which can match version numbers and other non-architecture contexts.
Most of these issues can be fixed with more precise regex patterns and better filtering logic. The codebase is well-tested but would benefit from additional tests covering the edge cases identified here, particularly around version numbers interfering with architecture detection.
So how well did it do?
Well, on the plus side, it did spot a few real bugs. On the minus side, it was wrong a lot. Let’s go through each bug in detail.
1. Android Detection Too Broad, Linux ARM64 Confusion
Yes, this is a real bug. The OS matching regex of (?i:android), without word boundaries, is not
good. It would matching things like “mandroid” and “bandroid”. And in fact there are GitHub projects
with both of those words in their name!
However, it turns out that ubi simply wouldn’t work at all on Android, because I never checked
whether it was running on that OS elsewhere in the code.
But I consider this a hit in terms of bug detection, even though in practice it didn’t matter. I
fixed the regex and enabled ubi to work on Android.
2. Exact Match Doesn’t Check Executability
No, this is not a bug. This is in entirely intentional. If ubi looks through an archive and finds
an exact match for the executable name it expects (e.g. precious when it downloads from
https://github.com/houseabsolute/precious), then it will extract that.
Bonus points for calling out that “[a] malicious or poorly structured archive could have a
non-executable data file with the right name that gets selected over an actual executable”. LOL. If
you are dealing with a malicious project, executable name matching is the least of your problems.
You can use ubi to download malware quite easily!
But you can also use any tool that downloads tarballs to download malware.
3. ARM Regex Too Permissive on macOS
No, this is not a bug. Claude is assuming that when people create releases for their projects, they
name them very carefully. But trust me, they absolutely do not. People choose all sorts of nonsense
for naming. A release like some-project-Darwin-arm.tar.gz is fairly sane compared to some of the
things people do. The code intentionally matches arm without a 64 here.
There’s literally a comment in the code about doing this.
4. 64-bit Filter Uses Naive String Search
Yes, this is a bug! I fixed the 64-bit matching to use a regex that looks for CPU names instead of
just the string 64.
5. MIPS Regex Ordering Issue
No, not a bug. The code does not use both the mips_re and mip64_re regexes on one system. Ubi
looks at what CPU you are running on and picks one of these regexes based on that.
6. Musl Detection Relies on External Binary
No? I mean, this musl detection code is really dumb. But I don’t know of a better way to do this.
7. Libc Filtering is Too Lenient
Not a bug. In this case, ubi fails by downloading the wrong binary. If I made the filtering
stricter, it would sometimes fail by not downloading a binary that could work. People use ubi to
download tools written in languages that don’t link to the system’s libc, like shell scripts,
Python, Go binaries, etc.
8. Windows Extension Checking
This is the same issue as #2 above, but on Windows, it’s even more important to not check executability, because zip files, which are very common for Windows releases, don’t even include that information!
9. Sorting Tiebreaker is Arbitrary
Maybe? The suggestion to prefer an asset with a libc in the name over one without isn’t bad. I
wouldn’t call this a bug, since it works as designed, but this could be improved. That said, I’m
not sure how often this matters. I haven’t had any bug reports where changing this behavior would
improve ubi for someone.
10. Partial Match Case Sensitivity
Yeah, this is already called out in issue #83.
11. No Validation of Extension Parsing
Eh, I guess. Maybe? This would be a lot of work for very little gain. When the extension parsing is broken, I usually get a bug report ;)
12. --matching Only Applied When Multiple Matches
No, this is not a bug. This is literally exactly what the docs say it will do!
Also, it’s wrong in saying “this is called even with 1 match”. I don’t understand how it concluded this.
13. No Fallback When 64-bit Filter Removes All Assets
This is a bit weird. The title doesn’t match the bug description. But the title is wrong. Ubi does have a fallback if the 64-bit filter removes everything.
Then the actual bug description seems to be about issue 4, that the 64-bit filter is too permissive.
14. Zip and 7z Archives Don’t Track Executability
Not a bug. The behavior of ubi here is intentional, and Claude’s suggestion of “after extracting
from zip/7z, verify the file is executable or try to make it executable” is nonsense. A file
extracted from a zip/7z is never going to be executable! And ubi already makes it executable (and
will do this with any archive file type).
15. Race Condition in Tarball Iteration
Yes, but no. Nothing is going to modify the file while ubi is working with it. I guess it could
in theory be deleted by some sort of temp reaper, but whatever.
In Summary
Out of 15 issues reported, here’s my evaluation:
- It found 1 real and important bug, the 64-bit filtering issue, issue 4.
- It found 1 real bug that was irrelevant, the Android matching issue, issue 1. It did not notice that this particular code could never be executed.
- It asserted a bunch of things were bugs that are intentional behavior, issues 2, 3, 5, 7, 8, 12, 13, and 14.
- It found 1 thing I’d already noted in
ubi’s issues, that executable matching should probably be case-insensitive, issue 10. - It found some things that can be improved, but aren’t very important, issues 6, 9, 11, and 15.
It’s great that it found some legit bugs, and I’m glad to fix them. But this sure is a lot of noise. For many of these, it seems like maybe it wasn’t considering enough of the context (other code, docs, etc.) when reporting something as a bug.
Would I do this again with other code bases? Yes, I think so. I didn’t burn a ton of time on dealing with the false positives, and I am happy it found at least one real bug that no one had reported yet.