With `allowDots` + `encodeDotInKeys`, `stringify` percent-encodes the structural nesting-separator dots, not just the literal dots inside a key. That breaks the documented `stringify` -> `parse` round-trip for any nested object whose keys do not contain a literal dot.

Repro

Deeper nesting clobbers every separator except the last. The control case (`allowDots` alone) round-trips correctly, so the regression is specific to `encodeDotInKeys`.

What the option means

Per the README, `encodeDotInKeys` encodes the dots that appear in the keys of an object. In the README example, the dot in `name.obj` is part of the key and must be encoded; the dot before `first`/`last` is the separator that `allowDots` inserts and must stay literal. The bug encodes both.

Root cause

The prefix is the accumulated key path, so the prefix-wide dot replace runs again at every level of recursion and re-encodes separator dots that were inherited from parent levels. The literal dots inside an individual key are already handled per segment a few lines further down, so the prefix-wide replace is the source of the over-encoding.

Fix

Encode the dots once per key segment in the top-level driver (the same way nested child keys are already encoded), and let the prefix be the plain accumulated path. The change is two lines.

Relationship to #562

This is a different defect from the open #562, which fixes the opposite failure: a top-level key that contains a literal dot and has a primitive value was under-encoded. #562 never touches the prefix-wide replace, so applying it leaves this over-encoding bug present. The two fixes share one line (moving the per-segment encode into the driver), because removing the prefix-wide replace requires the top-level key's own literal dots to be encoded there instead; otherwise the README example regresses. The distinct defect addressed here is the prefix-wide re-encoding on the recursion path.

Verification

Added a test for the nested separators plus a round-trip assertion; it fails on current HEAD (red) and passes with the fix (green). The full suite is green. The readme check passes, so every documented example still produces its documented output (the literal-dot cases are unchanged). Lint is clean (no new warnings on the changed lines).
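To make the root cause concrete, here is a minimal toy model of the recursion. It is not the qs source: `encodeDots`, `stringifyBuggy`, `stringifyFixed`, and `run` are hypothetical names, and values are not percent-encoded. It only illustrates why a prefix-wide replace re-encodes inherited separators, and how encoding once per segment avoids that.

```javascript
// Toy model only; every name here is hypothetical, not the qs implementation.
const encodeDots = (s) => s.replace(/\./g, '%2E');

// Buggy shape: the accumulated prefix is re-encoded at every level,
// so separator dots inherited from parent levels get encoded as well.
function stringifyBuggy(value, prefix, out) {
  const encodedPrefix = encodeDots(prefix); // prefix-wide replace
  if (value === null || typeof value !== 'object') {
    out.push(`${prefix}=${value}`);
    return;
  }
  for (const key of Object.keys(value)) {
    stringifyBuggy(value[key], `${encodedPrefix}.${encodeDots(key)}`, out);
  }
}

// Fixed shape: each key segment is encoded exactly once when it is appended;
// the prefix itself is passed down untouched.
function stringifyFixed(value, prefix, out) {
  if (value === null || typeof value !== 'object') {
    out.push(`${prefix}=${value}`);
    return;
  }
  for (const key of Object.keys(value)) {
    stringifyFixed(value[key], `${prefix}.${encodeDots(key)}`, out);
  }
}

// Top-level driver; encodeTopLevel decides whether top-level keys are
// dot-encoded here (the fix) or passed through raw (the old behaviour).
function run(stringifyFn, obj, encodeTopLevel) {
  const out = [];
  for (const key of Object.keys(obj)) {
    stringifyFn(obj[key], encodeTopLevel ? encodeDots(key) : key, out);
  }
  return out.join('&');
}

const buggy = run(stringifyBuggy, { a: { b: { c: 'd' } } }, false);
const fixed = run(stringifyFixed, { a: { b: { c: 'd' } } }, true);
const literal = run(stringifyFixed, { 'name.obj': { first: 'John' } }, true);

console.log(buggy);   // a%2Eb.c=d
console.log(fixed);   // a.b.c=d
console.log(literal); // name%2Eobj.first=John
```

In this model the literal dot in `name.obj` is still encoded, while separator dots stay literal at any depth.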
Closes #558.

This changes parser behavior for malformed nested keys that start a bracket group after a parent key but never close it. Instead of treating the rest of the key as a literal segment, parsing now throws an error, which matches the maintainer direction in the issue discussion. Existing behavior is preserved for bracket-prefixed keys and for keys with unterminated inner bracket groups that are already handled as literal segments.

Verification:
With `charset: 'iso-8859-1'`, a literal `+` in a key or value is emitted unencoded, so it round-trips back as a space.

The iso-8859-1 path encodes via `escape`, which leaves `+` untouched, but a `+` in a query string decodes to a space, so the value is silently corrupted. The utf-8 path already percent-encodes it (`%2B`). I encode `+` as `%2B` in the iso-8859-1 branch too, matching utf-8, so values containing a `+` survive a round-trip.
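The corruption can be reproduced without the library. The sketch below assumes only the legacy `escape`/`unescape` globals; `encodeLatin1Buggy`, `encodeLatin1Fixed`, and `decodeForm` are hypothetical helper names standing in for the encode and decode paths, not qs functions.

```javascript
// Illustrative only; helper names are hypothetical.
// escape() is a legacy global that leaves "+" unencoded.
const encodeLatin1Buggy = (s) => escape(s);

// Assumed shape of the fix: percent-encode "+" after escape().
const encodeLatin1Fixed = (s) => escape(s).replace(/\+/g, '%2B');

// Query-string decoding turns "+" into a space before percent-decoding.
const decodeForm = (s) => unescape(s.replace(/\+/g, ' '));

const input = '1+1=2';

const buggyWire = encodeLatin1Buggy(input);  // "+" survives on the wire
const fixedWire = encodeLatin1Fixed(input);  // "+" becomes %2B

console.log(buggyWire, '->', decodeForm(buggyWire)); // 1+1%3D2 -> 1 1=2
console.log(fixedWire, '->', decodeForm(fixedWire)); // 1%2B1%3D2 -> 1+1=2
```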
With `encodeDotInKeys`, a key that contains a dot is supposed to survive a `stringify` → `parse` round-trip. That works when the value is an object, but breaks when the value is a primitive at the top level, so the dot in the key is silently treated as structure and the original key is lost.

The dot-encoding was only applied on the recursion path (when the key's value is another object) and on nested child keys, but a top-level key with a primitive value takes the leaf path and was passed through unencoded. I moved the encoding to the top-level driver so the key is encoded the same way regardless of its value's type, matching how nested keys are already handled.

The existing tests only ever used object values, which is why this slipped through; I added a case covering a primitive value (plus a round-trip assertion). Full test suite, lint, and readme checks pass locally.
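Why an unencoded dot loses the key is easiest to see from the parsing side. The following is a toy dot-notation parser, written for illustration under the assumption that keys are split on literal dots first and `%2E` is decoded per segment afterwards; `parseKey` is a hypothetical name, not a qs function.

```javascript
// Toy parser for a single key=value pair; hypothetical, not the qs implementation.
function parseKey(rawKey, value) {
  // Literal dots are structure; %2E is decoded only after splitting.
  const segments = rawKey.split('.').map((s) => s.replace(/%2E/gi, '.'));
  const root = {};
  let node = root;
  segments.forEach((segment, i) => {
    if (i === segments.length - 1) {
      node[segment] = value;
    } else {
      node[segment] = {};
      node = node[segment];
    }
  });
  return root;
}

const lost = parseKey('a.b', 'c');   // dot read as a nesting separator
const kept = parseKey('a%2Eb', 'c'); // dot preserved as part of the key

console.log(JSON.stringify(lost)); // {"a":{"b":"c"}}
console.log(JSON.stringify(kept)); // {"a.b":"c"}
```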
Bug

With `interpretNumericEntities` (in the `iso-8859-1` charset), a numeric character reference for an astral code point (anything above U+FFFF, i.e. emoji and many CJK-extension characters) is decoded into the wrong character.

Cause

The decoder uses `String.fromCharCode`, which operates on UTF-16 code units (0 – 0xFFFF) and truncates larger values to 16 bits. For `&#128512;`, 128512 & 0xFFFF = 0xF600, so it yields U+F600 (a lone Private-Use-Area char) rather than the surrogate pair for U+1F600. BMP references (e.g. the existing `&#9786;` → ☺, and the checkmark used by the charset sentinel) happen to be unaffected because they already fit in 16 bits.

Fix

Use `String.fromCodePoint`, which produces the correct surrogate pair across the full Unicode range. `String.fromCodePoint` throws a `RangeError` for values above the Unicode maximum (U+10FFFF), which `String.fromCharCode` never did, so guard against that and leave out-of-range entities as the literal text instead of throwing. For valid BMP references the output is byte-for-byte identical to before.

Tests

Added a case under the existing tests:
- `&#128512;` (U+1F600) round-trips to 😀.
- An out-of-range reference is left untouched and does not throw.

Verification

- 404 passing (was 402 passing + 2 failing without the fix; the two new assertions fail without the change and pass with it).
- 939 passing.
- 0 errors (pre-existing warnings only, none on the changed lines).

The WHATWG URL encoder and browsers all resolve `&#128512;` to U+1F600; this brings qs in line.
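The truncation and the guarded fix can both be checked in isolation. This is a sketch under stated assumptions: `decodeNumericEntities` is a hypothetical helper that handles only decimal references, and the regular expression is illustrative rather than taken from the library.

```javascript
// Hypothetical helper; handles decimal references such as "&#128512;" only.
function decodeNumericEntities(str) {
  return str.replace(/&#(\d+);/g, (match, digits) => {
    const codePoint = parseInt(digits, 10);
    // String.fromCodePoint throws RangeError above 0x10FFFF,
    // so out-of-range references are returned as literal text.
    return codePoint > 0x10FFFF ? match : String.fromCodePoint(codePoint);
  });
}

// Old behaviour: fromCharCode keeps only the low 16 bits (0x1F600 -> 0xF600).
const truncated = String.fromCharCode(128512);

const astral = decodeNumericEntities('&#128512;');      // U+1F600
const bmp = decodeNumericEntities('&#9786;');           // U+263A, unchanged by the fix
const outOfRange = decodeNumericEntities('&#1114112;'); // 0x110000, left as-is

console.log(truncated === '\uF600');       // true
console.log(astral === '\u{1F600}');       // true
console.log(bmp === '\u263A');             // true
console.log(outOfRange === '&#1114112;');  // true
```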
Repository: ljharb/qs. Description: A querystring parser and serializer with nesting support Stars: 8942, Forks: 888. Primary language: JavaScript. Languages: JavaScript (100%). License: BSD-3-Clause. Topics: browsers, encoding, javascript, node, nodejs, parse, querystrings, stringify, url-parsing. Open PRs: 21, open issues: 51. Last activity: 2d ago. Community health: 85%. Top contributors: ljharb, nlf, papandreou, dead-horse, Connormiha, mizozobu, geek, Jokero, tdzienniak, elidoran and others.